NLP evaluators with Azure AI Evaluation SDK (preview)

November 10, 2025

Evaluate

All NLP (traditional lexical) evaluators share one concept: n-gram overlap. An “n-gram” is a sequence of n consecutive words. These evaluators compare the n-grams in the candidate text (what your AI generated) against the n-grams in one or more reference texts (the “ground truth” or human-written answers).…
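
To make the idea of n-gram overlap concrete, here is a minimal sketch in plain Python. It does not use the Azure AI Evaluation SDK; the helper names `ngrams` and `ngram_precision` are hypothetical, tokenization is assumed to be a simple lowercase whitespace split, and the score is a clipped n-gram precision of the kind that metrics such as BLEU build on, not a full implementation of any one metric.

```python
from collections import Counter


def ngrams(text, n):
    """Count the n-grams in a text (assumed tokenization: lowercase, split on whitespace)."""
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, references, n=1):
    """Fraction of candidate n-grams that also appear in at least one reference.

    Counts are clipped: a repeated candidate n-gram is credited at most as
    many times as it occurs in the single reference where it is most frequent.
    """
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0

    # For each n-gram, keep the highest count seen in any one reference.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)

    overlap = sum(min(count, max_ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


candidate = "the cat sat on the mat"
references = ["the cat is on the mat"]
print(ngram_precision(candidate, references, n=1))  # unigram overlap
print(ngram_precision(candidate, references, n=2))  # bigram overlap
```

In this example, five of the six candidate unigrams and three of the five candidate bigrams appear in the reference, so the score falls as n grows and word order starts to matter.
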
Simulate Evaluation (Test) Data

November 10, 2025

Evaluate

Relevant, robust evaluation data is essential for effective evaluations. This data can be generated manually, can include production data, or can be assembled with the help of AI. There are two main types of evaluation data: