MLOps Community
Agent Hour
# Code Completion
# Autonomous Agents

From Code Completion to Autonomous Software Engineering Agents // Kilian Lieret // Agent Hour

As language models have advanced, they have moved beyond code completion and are beginning to tackle software engineering tasks in a more autonomous, agentic way. However, evaluating agentic capabilities is challenging. To address this, we first introduce SWE-bench, a benchmark built from real GitHub issues that has become the standard for assessing AI’s ability to resolve complex software tasks in large codebases. We will discuss the current state of the field, the limitations of today’s models, and how far we still are from truly autonomous AI developers. Next, we will explore the fundamentals of agents based on SWE-agent, a simple yet powerful agent framework designed for software engineering but adaptable to a variety of domains. By the end of this talk, you will have an understanding of the current frontier of agentic AI in software engineering, the challenges ahead, and various tips and tricks on optimizing AI agents for tool use and iterative problem solving of reasoning-heavy tasks.
Kilian Lieret · Apr 2nd, 2025
Enrique Ferrao · Apr 2nd, 2025
The open-source AI ecosystem is evolving rapidly, with frequent releases of new models, architectures, and hardware accelerators. While this flexibility drives innovation, it also introduces significant hidden complexities when fine-tuning and deploying AI models. In this talk, we’ll explore the key challenges teams face when updating fine-tuned models, switching between inference engines, and deploying across different accelerators such as the NVIDIA A100, L40S, and H100 and the Intel Gaudi 2. We’ll share real-world examples, including tokenizer issues, multi-GPU fine-tuning hurdles, and API inconsistencies across AI components.
# Open Source AI
# GPU
# Fine Tuning
22:43
Xiaomei Song · Apr 2nd, 2025
This talk explores the cutting-edge development of a multi-agent AI system designed for open-ended game environments, with a focus on Minecraft. The presenter will discuss the creation of specialized AI agents, including a Vision Agent for spatial analysis, a Curriculum Agent for dynamic task generation, an Action Agent for behavior execution, a Critic Agent for performance evaluation, and a Skill Manager for knowledge retention. The presentation will highlight how these agents work together, leveraging advanced techniques like Reinforcement Learning, Chain of Thought, and Tree of Thought reasoning to enhance decision-making and adaptability. A key feature of this project is the multi-bot system, which enables multiple AI agents to operate concurrently in the same game world, fostering collaboration and skill sharing. By demonstrating the potential of AI-driven automation and adaptability in complex, dynamic environments, this research opens up exciting possibilities for applications beyond gaming.
# Minecraft
# spatial analysis
# curriculum agent
# gaming
# dynamic
11:01
As Large Language Models (LLMs) evolve, the challenge shifts from raw capability to structuring them into reliable, scalable systems. Many real-world AI products struggle with robustness, complexity management, and evaluation, especially in enterprise contexts. This talk explores how multi-agent systems can help overcome these obstacles by decomposing large monolithic agents into specialized subagents working together in structured architectures. We’ll cover:
- Why enterprises struggle to integrate LLM agents effectively.
- How multi-agent architectures (Assembly Line, Call Center, and Manager-Worker) improve scalability, modularity, and reliability.
- Practical trade-offs and implementation strategies from real-world applications.
(planning to adapt my post https://blog.sshh.io/p/building-multi-agent-systems)
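As an illustration of the Manager-Worker architecture named above, here is a minimal Python sketch. The `Manager` class, the `skill: task` request format, and both worker functions are hypothetical stand-ins for LLM-backed components; none of this is taken from the talk or the linked post.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A worker is any callable that maps a task string to a result string.
# In a real system each worker would wrap an LLM call with its own prompt and tools.
Worker = Callable[[str], str]


@dataclass
class Manager:
    """Routes each subtask to a specialized worker and collects the results."""

    workers: Dict[str, Worker]

    def plan(self, request: str) -> List[Tuple[str, str]]:
        # Hypothetical planner: a real manager would likely ask an LLM to
        # decompose the request. Here every "skill: task" clause, separated
        # by ';', becomes one subtask.
        steps = []
        for clause in request.split(";"):
            skill, _, task = clause.partition(":")
            steps.append((skill.strip(), task.strip()))
        return steps

    def run(self, request: str) -> List[str]:
        results = []
        for skill, task in self.plan(request):
            worker = self.workers.get(skill)
            if worker is None:
                # Surface routing failures instead of silently dropping the subtask.
                results.append(f"[no worker for '{skill}']")
            else:
                results.append(worker(task))
        return results


def search_worker(task: str) -> str:
    # Placeholder for a retrieval-focused subagent.
    return f"search results for: {task}"


def summary_worker(task: str) -> str:
    # Placeholder for a summarization-focused subagent.
    return f"summary of: {task}"


manager = Manager(workers={"search": search_worker, "summarize": summary_worker})
outputs = manager.run("search: quarterly report; summarize: quarterly report")
```

Because each worker sits behind the same small interface, it can be tested, swapped, or evaluated on its own, which is one plausible reading of the modularity benefit the abstract describes.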
# Multi-Agent
# AI Systems
# Security
17:30
Artificial Intelligence is transforming the way we interact with technology, and Agentic AI (systems that exhibit autonomy, adaptability, and decision-making capabilities) is at the forefront of this revolution. But what does this mean for the Arabic language, one of the richest and most complex languages in the world? As we advance AI-driven agents, ensuring they understand, process, and generate Arabic with the same fluency and nuance as English or other dominant languages is not just a technological challenge but a cultural imperative. In this speech, we will explore how Agentic AI can empower Arabic speakers, enhance accessibility, and preserve the linguistic heritage of over 400 million people while driving innovation across industries. The future of AI is agentic. The future of Arabic in AI depends on how we shape it today.
# Arabic
# Agents
# Linguistics
23:00
R1 > computer? In this talk, we will explore applying the ideas of R1-style RL fine-tuning to computer use.
12:28
We are in the midst of a paradigm shift from stateless LLM workflows to stateful LLM agents. Today, developers are responsible for managing state (e.g. message history across sessions) and memory (e.g. with RAG and a vector DB) themselves. Letta is an agent framework in which the agent service, rather than the client-side application, is responsible for state and memory management. This dramatically simplifies building stateful agentic applications: Letta uses memory management techniques (extending the ideas from MemGPT) to automatically pass the most relevant information into the LLM context window and to avoid context overflow errors. In this talk, we’ll cover Letta’s high-level architecture and explain the details of state and memory management. We’ll also go over how to use Letta to build stateful, reasoning agents with support for custom tools, secure tool environments, and personalized memory.
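A minimal Python sketch of the general idea of keeping a context window within a token budget, assuming a whitespace word count as the tokenizer and keyword matching for recall. `AgentState`, `add_message`, and `recall` are hypothetical names chosen for illustration; this is not Letta's API or MemGPT's actual algorithm.

```python
from dataclasses import dataclass, field
from typing import List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class AgentState:
    """Service-side state: a bounded in-context buffer plus an unbounded archive."""

    max_context_tokens: int
    context: List[str] = field(default_factory=list)
    archive: List[str] = field(default_factory=list)

    def context_tokens(self) -> int:
        return sum(count_tokens(m) for m in self.context)

    def add_message(self, message: str) -> None:
        self.context.append(message)
        # Evict the oldest messages into the archive until the buffer fits,
        # always keeping at least the newest message in context.
        while self.context_tokens() > self.max_context_tokens and len(self.context) > 1:
            self.archive.append(self.context.pop(0))

    def recall(self, query: str) -> List[str]:
        # Naive keyword lookup over the archive; a real system would likely
        # use embeddings or another retrieval method instead.
        words = set(query.lower().split())
        return [m for m in self.archive if words & set(m.lower().split())]


state = AgentState(max_context_tokens=8)
state.add_message("my name is Ada")
state.add_message("I like hiking")
state.add_message("what should I do today")
```

After the third message the oldest one no longer fits the budget, so it moves to the archive, where `state.recall("name")` can still find it. The client only ever calls `add_message`, which mirrors the abstract's point that the service, not the application, owns state and memory.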
19:35
Agents are transforming how we approach problem-solving, automation, and user interaction. In this talk, I will explore the practical applications of agents, focusing on how they can deliver value. We'll discuss when agents are the right tool for the job, when they are not, and strategies for deploying them to production with confidence and reliability. Whether you're new to agents or looking to refine your approach, this session offers actionable insights grounded in real-world experience.
# Agents
# real world
# AI agents in production
19:47
"AI Agents as Neuro-Symbolic Systems: Expanding the Boundaries of Intelligence"
The current discourse around AI agents often centers on LLM-based systems with tool-calling capabilities, like ReAct agents. While effective, this narrow definition limits the potential of agents to solve complex, real-world problems. In this talk, we explore a broader, more robust perspective: AI agents as neuro-symbolic systems. By combining neural networks' adaptability with the precision of symbolic reasoning, neuro-symbolic architectures bridge traditional AI approaches and modern advancements, enabling scalable and versatile workflows. This expanded definition accommodates not only LLMs but also embedding models, decision trees, and hybrid systems that integrate various modalities of intelligence. We will delve into:
1. The evolution of AI agents and the limitations of current models.
2. The core principles of neuro-symbolic systems and their practical applications.
3. A reimagined framework for building intelligent agents that operate flexibly across diverse tasks.
This session aims to redefine the way we think about AI agents, empowering developers and researchers to design systems that are more efficient, resilient, and capable of tackling dynamic challenges. Join us as we explore the future of agentic AI and its transformative potential.
# Agents
# neuro
# symbolic
# neuro-symbolic systems
15:55
//Abstract
AI has transformed industries, yet its true potential often lies untapped within core business processes. In this session, we’ll explore how AI agents differ from generative AI models, emphasizing their deterministic, hallucination-free approach to problem-solving. Using a live example of an AI agent in the logistics sector, we’ll detail the architectural foundations that enable AI agents to reason effectively, execute chain-of-thought workflows, and integrate seamlessly into human teams. We’ll discuss how these agents navigate complex, multimodal tasks, extracting structured insights from unstructured data and leveraging dynamic workflows for maximum flexibility. With customizable confidence thresholds, statefulness to track long-term cases, and advanced document understanding, these agents solve real business challenges with precision, such as autonomously processing claims through to resolution. Through a live case study, we’ll illustrate the measurable top- and bottom-line effects of deploying AI agents, highlighting significant efficiency gains, multilingual capabilities, and safe, scalable applications in mission-critical environments. By showcasing how AI agents mimic human decision-making at unparalleled speed, we’ll inspire senior management to rethink AI’s role in their organizations and harness its full potential for transformative impact.
//Bio
Passionate about connecting deep tech to end-users, Vanessa’s work is at the forefront of AI’s transformative potential. For over a decade, she has been transforming cutting-edge innovations into actionable solutions that drive industry change.
This is a bi-weekly "Agent Hour" event to continue the conversation about AI agents. Thanks to arcade-ai.com for the support! Join the next live event at home.mlops.community
# logistics
# europe
# AI Agents
20:52
//Abstract
Demonstrating agents embedded within websites that use real-time audio and structured outputs to dynamically update web pages through conversational interactions.
//Bio
Raised in Reykjavík, living in Berlin. Studied computational and data science, did R&D in NLP, and started making LLM apps as soon as GPT-4 changed the game.
# sales agent
# voice agent
22:22