The average American is considered to read at the level of a 7th or 8th grader (12 to 14 years old).
Readability formulas are often used to assess the difficulty of text.
These formulas use mathematical calculations to produce a score. The score shows the relative difficulty on a continuum from easy (grade five or lower) to very hard (college level and above). Higher scores mean lower reading ease or lower readability.
This app calculates the score according to the Coleman–Liau Index:
The Coleman–Liau index is a readability test designed by Meri Coleman and T. L. Liau to gauge the understandability of a text. Like the Flesch–Kincaid Grade Level, Gunning fog index, SMOG index, and Automated Readability Index, its output approximates the U.S. grade level thought necessary to comprehend the text.
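As a minimal sketch, the Coleman–Liau formula is 0.0588 × L − 0.296 × S − 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. The function name and the counting rules below are assumptions made for illustration, not necessarily what this app does: letters are alphabetic characters, words are whitespace-separated tokens, and every ".", "!" or "?" ends a sentence.

```python
def coleman_liau_index(text: str) -> float:
    """Return the Coleman-Liau grade estimate for `text`.

    Hypothetical helper; the counting rules are simplifying assumptions.
    """
    # Letters: alphabetic characters only (digits and punctuation are skipped).
    letters = sum(ch.isalpha() for ch in text)
    # Words: tokens separated by whitespace.
    words = len(text.split())
    # Sentences: each '.', '!' or '?' is assumed to end one sentence.
    sentences = sum(text.count(mark) for mark in ".!?")

    if words == 0:
        raise ValueError("text must contain at least one word")

    letters_per_100 = letters / words * 100      # L in the formula
    sentences_per_100 = sentences / words * 100  # S in the formula
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


sample = ("Congratulations! Today is your day. You're off to Great Places! "
          "You're off and away!")
grade = round(coleman_liau_index(sample))
print(f"Grade {grade}")
```

The sample has 65 letters, 14 words and 4 sentences, so the index comes out just above 3 and rounds to grade 3. Abbreviations such as "Mr." would be miscounted as sentence ends under these simple rules.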
Measuring the similarity between texts is of key importance in several natural language processing applications, including information retrieval, book recommendation, news categorization and essay scoring.
Texts can be similar in a) meaning or b) surface closeness. The former is referred to as semantic similarity and the latter as lexical similarity.
This app evaluates the Lexical Similarity:
Lexical similarity measures how similar two pieces of text are at the surface level. For example, how similar are the phrases “the cat ate the mouse” and “the mouse ate the cat food” when you look only at the words? At the word level, the two phrases appear very similar: setting aside the article “the”, 3 of the 4 unique words (“cat”, “ate” and “mouse”) overlap exactly, and only “food” differs. This notion of similarity does not take into account the meaning of the words or of the whole phrase in context.
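One common way to turn this word-overlap idea into a number is Jaccard similarity: the count of shared unique words divided by the count of all unique words across both texts. The sketch below is an assumption about how lexical similarity could be measured, not a description of this app's actual method. The function name and the `ignore` parameter are hypothetical.

```python
def lexical_similarity(a: str, b: str, ignore: frozenset = frozenset()) -> float:
    """Jaccard similarity over the unique lowercase words of two texts.

    Hypothetical helper; words listed in `ignore` are dropped before comparing.
    """
    words_a = set(a.lower().split()) - ignore
    words_b = set(b.lower().split()) - ignore
    union = words_a | words_b
    if not union:
        # Both texts are empty (or fully ignored): nothing to compare.
        return 0.0
    return len(words_a & words_b) / len(union)


first = "the cat ate the mouse"
second = "the mouse ate the cat food"

print(lexical_similarity(first, second))
print(lexical_similarity(first, second, ignore=frozenset({"the"})))
```

Counting “the”, the two phrases share 4 of 5 unique words (0.8). Ignoring it leaves 3 of 4 (0.75), which matches the count in the example above. Either way the score is high even though the two phrases mean quite different things.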
The project was inspired by the "Readability" problem set from weeks 2 and 6 of CS50.
The application was built with the following technologies: