BLEU Score [Meaning] - MasterTerms.com

BLEU Score

BLEU Score is a metric for evaluating the quality of machine-translated text by comparing it with one or more human reference translations.

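BLEU (Bilingual Evaluation Understudy) Score measures the correspondence between machine-generated translations and human reference translations by calculating n-gram overlap. An n-gram is a contiguous sequence of n items (usually words) from a text. The BLEU Score ranges from 0 to 1, with higher scores indicating closer agreement with the reference; it is often reported on a 0 to 100 scale. The score is built on modified n-gram precision: the fraction of n-grams in the machine translation that also appear in the reference, where each n-gram is counted at most as many times as it occurs in the reference, so repeating a correct word does not inflate the score. Because precision alone favors very short outputs, BLEU also applies a brevity penalty that lowers the score of translations shorter than the reference. Precision is calculated for several n-gram sizes, typically 1-grams through 4-grams, and the final score is the geometric mean of these precisions multiplied by the brevity penalty.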
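The description above can be written compactly as a formula. The symbols here are notational choices for this sketch, not taken from the text: p_n is the modified n-gram precision, w_n is the weight given to each n-gram size (assumed uniform, 1/N, with N = 4 in the typical setup), c is the length of the machine translation, and r is the length of the reference.

```latex
\mathrm{BLEU} = \mathrm{BP} \cdot \exp\!\left( \sum_{n=1}^{N} w_n \log p_n \right),
\qquad w_n = \frac{1}{N}

\mathrm{BP} =
\begin{cases}
1 & \text{if } c > r \\
e^{\,1 - r/c} & \text{if } c \le r
\end{cases}
```

With uniform weights, the exponential of the averaged logarithms is the geometric mean of p_1 through p_N, so a single precision of zero drives the whole score to zero.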

BLEU Score Example

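For example, if a machine translation produces the sentence “The cat is on the mat,” and the reference translation is “The cat sits on the mat,” then 5 of the 6 words in the machine translation appear in the reference (only “is” does not), giving a 1-gram precision of 5/6. Of its 5 two-word phrases, 3 appear in the reference (“the cat,” “on the,” “the mat”), giving a 2-gram precision of 3/5. Both sentences contain six words, so no brevity penalty applies. Only one 3-gram (“on the mat”) and no 4-gram matches, so an unsmoothed 4-gram BLEU Score for this single pair is 0; this is why BLEU is usually computed over a whole test set, or with smoothing, rather than on one sentence.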
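The overlap for this sentence pair can be checked with a short script. This is a minimal, unsmoothed, single-reference sketch in Python using only the standard library; the function names (`ngrams`, `modified_precision`, `bleu`) and the simple lowercase, whitespace tokenization are assumptions made for illustration, not part of any particular BLEU toolkit.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams in a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Return (clipped matches, total candidate n-grams) for one n-gram size."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    # Clip each candidate count to the number of times it occurs in the reference.
    matched = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return matched, sum(cand_counts.values())


def bleu(candidate, reference, max_n=4):
    """Unsmoothed BLEU against a single reference with uniform weights."""
    precisions = []
    for n in range(1, max_n + 1):
        matched, total = modified_precision(candidate, reference, n)
        if total == 0 or matched == 0:
            return 0.0  # a zero precision makes the geometric mean zero
        precisions.append(matched / total)
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(sum(math.log(p) for p in precisions) / max_n)


# Assumed tokenization: lowercase, split on whitespace, punctuation dropped.
candidate = "the cat is on the mat".split()
reference = "the cat sits on the mat".split()

for n in range(1, 5):
    print(n, modified_precision(candidate, reference, n))
# 1 (5, 6)
# 2 (3, 5)
# 3 (1, 4)
# 4 (0, 3)

print(round(bleu(candidate, reference, max_n=2), 4))  # 0.7071
print(bleu(candidate, reference))                     # 0.0
```

Restricting the score to 1-grams and 2-grams gives about 0.71, while the usual 4-gram setting gives 0 because no four-word phrase matches. Established toolkits handle this case with smoothing and corpus-level aggregation, which this sketch deliberately leaves out.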