Tuesday, October 29, 2024
Building a LLM Judge with Weights & Biases
5:00 PM - 6:00 PM (UTC)
Evaluating LLM outputs accurately is critical to iterating quickly on an LLM system. Human annotations are slow and expensive, and using LLMs as judges instead promises to solve this. However, aligning an LLM judge with human judgments is often hard, with many implementation details to consider. In this workshop we will explore:
- Evaluating specialized LLMs using Weave
- Productionizing the latest LLM-as-a-judge research
- Improving on your existing judge
- Building annotation UIs
Topic: Intelligent Applications
Language: English
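The alignment problem the workshop tackles — checking how well an LLM judge agrees with human judgments — can be quantified with simple agreement metrics. Below is a minimal, self-contained sketch (not from the workshop materials; the labels are made up for illustration) that computes raw agreement and Cohen's kappa between hypothetical judge verdicts and human annotations:

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two lists of categorical labels."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items where the raters match
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if the two raters labeled independently
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: human annotations vs. LLM-judge verdicts
human = ["pass", "pass", "fail", "pass", "fail", "fail"]
judge = ["pass", "fail", "fail", "pass", "fail", "pass"]

raw_agreement = sum(h == j for h, j in zip(human, judge)) / len(human)
kappa = cohens_kappa(human, judge)
```

Raw agreement can look deceptively high when one label dominates, which is why a chance-corrected statistic like kappa is the usual starting point before iterating on a judge prompt.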