MLOps Community
Coding Agents Lunch & Learn, Session 6
MEETING

Community Benchmarking for AI Coding Agents


In this session, we’ll explore ideas for building a community-driven benchmark for AI coding agents. The goal is to test how different LLMs and agent setups perform when solving the same tasks using shared prompts and tools.


We’ll discuss the concept of agent harnesses, how they enable consistent testing across frameworks, and how the community could contribute benchmark examples through a shared repository.


We’ll also begin drafting a few example benchmark tasks together during the session and discuss how this could evolve into a collaborative LLMOps benchmark dataset for evaluating coding agents.


Speakers

Rahul Parundekar
Founder @ A.I. Hero, Inc.
Leo Walker
AI Engineer @ KaiCare.ai
Demetrios Brinkmann
Chief Happiness Engineer @ MLOps Community

Attendees

Bessie
member
Arlene
member
Cody
member
Colleen
member
Kathryn
member
March 20, 4:00 PM GMT
Online
Organized by
MLOps Community