Sentences and Paragraphs

38 - Readability Scoring

Measuring the readability of a text by looking at the keyword density, syllable count and the average length of sentences and words in a document.

Readability describes how easily a reader can understand a written text. If the sentences are too long and complicated, few readers will follow them. Measuring readability is therefore a way of measuring text quality. It can be estimated from the keyword density, the syllable count, and the average length of sentences and words in a document. Checking for simpler synonyms, or for words with a higher word prevalence, can also help. Word prevalence captures how widely a word is known: it refers to the number of people who know the word.
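The surface statistics mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (`split_sentences`, `count_syllables`, `keyword_density`, `text_stats`) are made up for this example, sentence splitting is naive (abbreviations such as "e.g." will confuse it), and the syllable counter is a rough heuristic that only makes sense for English.

```python
import re

# Assumption: a "word" is a run of letters, optionally with an inner apostrophe.
WORD_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def split_sentences(text):
    """Naively split on '.', '!' or '?' followed by whitespace or end of text."""
    return [s for s in re.split(r"[.!?]+(?:\s+|$)", text) if s.strip()]


def split_words(text):
    return WORD_RE.findall(text)


def count_syllables(word):
    """Rough English heuristic: count vowel groups, ignore a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def keyword_density(text, keyword):
    """Share of all words in the text that equal the keyword (case-insensitive)."""
    words = [w.lower() for w in split_words(text)]
    return words.count(keyword.lower()) / len(words) if words else 0.0


def text_stats(text):
    sentences = split_sentences(text)
    words = split_words(text)
    n_sent = len(sentences) or 1  # avoid division by zero on empty input
    n_words = len(words) or 1
    return {
        "sentences": len(sentences),
        "words": len(words),
        "syllables": sum(count_syllables(w) for w in words),
        "avg_words_per_sentence": len(words) / n_sent,
        "avg_word_length": sum(len(w) for w in words) / n_words,
    }


if __name__ == "__main__":
    sample = "The cat sat on the mat. It was happy."
    print(text_stats(sample))
    print(keyword_density(sample, "the"))
```

Word prevalence is not covered here because it needs an external word list with prevalence scores, which this sketch does not assume.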

Well-known readability measures include the Flesch-Kincaid Grade Level and the Coleman-Liau Index. Both were developed for English; for other languages there may be specific variants. However, arguably the best language-agnostic linguistic proxy for readability is (not surprisingly) the average number of words per sentence.
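As a rough illustration of how these two scores are computed, the sketch below applies the published formulas to a plain string. The helper names (`_words`, `_sentence_count`, `_syllables`) are hypothetical, and the tokenizer and syllable counter are simple approximations, so the results will differ slightly from what a dedicated package reports.

```python
import re


def _words(text):
    # Assumption: words are runs of ASCII letters, optionally with an apostrophe.
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)


def _sentence_count(text):
    # Naive: every run of '.', '!' or '?' ends a sentence.
    return max(1, len([s for s in re.split(r"[.!?]+", text) if s.strip()]))


def _syllables(word):
    # Heuristic: count vowel groups, ignore a silent final 'e'.
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text):
    """0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    words = _words(text)
    if not words:
        return 0.0
    n_syllables = sum(_syllables(w) for w in words)
    return (0.39 * len(words) / _sentence_count(text)
            + 11.8 * n_syllables / len(words)
            - 15.59)


def coleman_liau_index(text):
    """0.0588 * L - 0.296 * S - 15.8, with L = letters and S = sentences
    per 100 words."""
    words = _words(text)
    if not words:
        return 0.0
    letters = sum(len(w.replace("'", "")) for w in words)
    L = letters / len(words) * 100
    S = _sentence_count(text) / len(words) * 100
    return 0.0588 * L - 0.296 * S - 15.8


if __name__ == "__main__":
    simple = "The cat sat on the mat. It was happy."
    print(flesch_kincaid_grade(simple), coleman_liau_index(simple))
```

Both scores approximate a US school grade level, so higher values indicate harder text. Very short, simple inputs can produce values below zero.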

You can try the English readability metrics with this Python package.

Figure: Flesch-Kincaid Grade Levels (source)

This article is part of the project Periodic Table of NLP Tasks. Click to read more about the making of the Periodic Table and the project to systemize NLP tasks.