Evaluation of Agentic System // Aditya Gautam // Agent Hour

Uploaded: 2025-04-22T10:44:08.808Z

Posted Apr 22, 2025 | Views 91

# Agents

Aditya Gautam

Machine Learning Technical Lead @ Meta

Aditya is a seasoned AI expert and thought leader at the nexus of AI Integrity, recommendation systems, and LLM-powered agents, focusing on building trustworthy, efficient AI at scale. As a Machine Learning Technical Lead for Integrity at Meta, he architects large-scale AI systems to strengthen ranking algorithms, combat misinformation, and improve user engagement. He previously served as a founding engineer for a computer vision startup within Google’s Area 120 incubator.

Aditya is active in the Generative AI community as a sought-after speaker, panelist, and interviewee. He frequently shares insights on agentic system evaluation and LLM cost optimization on industry podcasts and at venues such as the Databricks Data + AI Summit 2025, Marktechpost, AI agent conference, Analytics Vidhya, MLOps Community, and others. His expertise, particularly on Generative AI and agent misinformation, has been featured in media articles, including the Daily Herald and Marktechpost. His recent research, presented at ICWSM 2025, offers a blueprint for a multi-agent system for the misinformation lifecycle. He serves as an Ethics Reviewer for NeurIPS 2025 and has reviewed papers for top-tier conferences including ICML, AAAI, and ACM, among others.


SUMMARY

As complex AI agents become common, standard evaluation isn't enough. This presentation provides a structured overview of agentic system evaluation. It briefly explores common single-agent and multi-agent patterns, explains why rigorous evaluation is necessary, and outlines core principles for conducting meaningful assessments. It then covers methods (benchmarks, simulation, human feedback) and metrics for evaluating agentic system performance, and highlights key challenges.


