
MEETING
Coding Agents Lunch & Learn, Session 6
Community Benchmarking for AI Coding Agents
In this session, we’ll explore ideas for building a community-driven benchmark for AI coding agents. The goal is to test how different LLMs and agent setups perform when solving the same tasks using shared prompts and tools.
We’ll discuss the concept of agent harnesses, how they enable consistent testing across frameworks, and how the community could contribute benchmark examples through a shared repository.
We’ll also begin drafting a few example benchmark tasks together during the session and discuss how this could evolve into a collaborative LLMOps benchmark dataset for evaluating coding agents.
Speakers
Rahul Parundekar
Founder @ A.I. Hero, inc.
Leo Walker
AI Engineer @ KaiCare.ai
Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community
March 20, 4:00 PM GMT
Online
Organized by

MLOps Community



