Evaluating Large Language Models

A course by Data Science Dojo

Nov 2024, 3 lessons, English

Description

In this course, you will learn how to assess the quality and effectiveness of large language models (LLMs). You will explore why LLM evaluation matters, the common challenges it raises, and key metrics such as BLEU, ROUGE, and BERTScore, and you will apply these metrics to real-world tasks like summarization and question answering. By the end of the course, you will be able to use frameworks like RAGAS to evaluate LLM performance across various applications and to check that model outputs are reliable and accurate.

What you'll learn

Understand the importance of evaluation in LLMs.
Analyze LLM performance using key metrics like BLEU, ROUGE, and BERTScore.
Implement human evaluation techniques such as Likert scales.
Apply the RAGAS framework for context precision, faithfulness, and relevancy.
Evaluate summarization and question-answering models through hands-on exercises.
Evaluate end-to-end pipelines using advanced tools like GPTEval.
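
To give a flavor of the overlap-based metrics named above, here is a minimal sketch of a ROUGE-1 style score (unigram precision, recall, and F1) in plain Python. It is an illustration under simplifying assumptions, not the course's own material or a reference implementation: the function names `tokenize` and `rouge1` are hypothetical, tokenization is plain lowercased whitespace splitting, and no stemming or stopword handling is applied.

```python
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Assumption: lowercase whitespace tokenization, no stemming.
    return text.lower().split()


def rouge1(reference: str, candidate: str) -> dict[str, float]:
    """Unigram-overlap precision, recall, and F1 between two strings."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))

    # Clipped overlap: each word counts at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())

    recall = overlap / sum(ref_counts.values()) if ref_counts else 0.0
    precision = overlap / sum(cand_counts.values()) if cand_counts else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

In this example, five of the six candidate words also appear in the reference, so precision, recall, and F1 all come out to 5/6. A score like this rewards word overlap only, which is one reason embedding-based metrics such as BERTScore and human ratings are used alongside it.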
