The process of evaluating and understanding foundation models such as LLMs or SLMs is complex. It involves finding the benchmarks needed to establish performance thresholds and to accelerate model improvement. It is vital to keep responsible AI practices in mind throughout the analysis cycle.
✨ Join the #MarchResponsibly challenge by learning about the responsible AI tools and services available to you.
In this video, Besmira Nushi, an AI researcher at Microsoft, discusses critical factors to consider when understanding or evaluating foundation models. In her talk, she addresses risks such as training data that fails to represent the real world, and the need to ensure the model produces factual information and non-toxic content. In addition, she illustrates some performance issues that can arise from long, open-ended generative outputs. When building generative AI apps, different prompt variants are sometimes needed, and she shares how a model update can negatively affect the behavior of those variants; a sketch of that kind of check follows below.
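To make the prompt-variant concern concrete, here is a minimal sketch (not from the talk) of how one might re-run a set of prompt variants against an old and a new model version to catch regressions after an update. The `generate` helper, the variant templates, and the comparison logic are all hypothetical placeholders; a real check would call your model endpoint and use a proper quality or factuality metric.

```python
# Minimal sketch of prompt-variant regression testing across model versions.
# All names here (generate, PROMPT_VARIANTS) are illustrative assumptions,
# not an API from the video or any specific library.

PROMPT_VARIANTS = [
    "Summarize the following report in one sentence: {doc}",
    "In one sentence, what is the key takeaway of this report? {doc}",
    "TL;DR of the report below: {doc}",
]

def generate(model: str, prompt: str) -> str:
    """Placeholder: swap in a real call to your model endpoint."""
    return f"[{model} response to: {prompt[:30]}...]"

def compare_models(doc: str, old_model: str, new_model: str) -> None:
    """Re-run every prompt variant against both model versions so that
    regressions introduced by a model update are caught per variant."""
    for template in PROMPT_VARIANTS:
        prompt = template.format(doc=doc)
        old_out = generate(old_model, prompt)
        new_out = generate(new_model, prompt)
        # Naive check for illustration; replace with a real metric
        # (e.g., factuality or toxicity scoring) in practice.
        if new_out != old_out:
            print(f"Output changed for variant: {template[:40]}...")

compare_models("Quarterly results were strong...", "model-v1", "model-v2")
```

The point of the design is that each variant is evaluated independently, since an update can improve the model's behavior on one prompt phrasing while degrading it on another.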
💡 Learn about the responsible AI harms to consider when working with foundation models, and some practices AI researchers use to understand, evaluate, and improve them:
👉🏽 Check out Besmira Nushi's video: https://aka.ms/march-rai/evaluate-foundation-models
🎉Happy Learning :)