
Subham Kundu · Feb 4th, 2026
This blog explains a systematic way to fix CUDA out-of-memory (OOM) errors during GRPO reinforcement learning training, instead of randomly lowering hyperparameters until something works.
Subham argues that most GPU memory issues come from three sources: vLLM reserving GPU memory upfront (often the biggest chunk), training activations (which scale with batch size, sequence length, number of generations, and model size), and model memory (usually the smallest contributor). By carefully reading the OOM error message and estimating how memory is distributed across these components, you can identify exactly what’s causing the crash.
The recommended approach is to calculate memory usage first, then adjust the highest-impact settings, such as GPU memory allocation for vLLM, number of generations, batch size, and sequence length. The guide also shows how to maintain training quality by using techniques like gradient accumulation instead of simply shrinking everything.
Overall, the key message is: treat OOM debugging as a measurable engineering problem, not trial-and-error, so you can fix memory issues faster while preserving training performance.
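As a rough illustration of the kind of estimate the post advocates, here is a minimal sketch (not from the original article; the 80 GB budget, the bf16 assumptions, and the activation formula are all illustrative) that splits a GPU's budget across the three buckets: vLLM's upfront reservation, model weights, and training activations.

```python
# Hypothetical back-of-envelope breakdown of GPU memory during GRPO training
# with a colocated vLLM engine. The formulas are rough approximations for
# illustration, not the calculation from the original post.

def estimate_grpo_memory_gb(
    total_gpu_gb: float = 80.0,                # e.g. one 80 GB A100/H100
    vllm_gpu_memory_utilization: float = 0.3,  # fraction vLLM reserves upfront
    n_params_billion: float = 7.0,             # policy model size
    bytes_per_param: int = 2,                  # bf16 weights
    batch_size: int = 4,
    num_generations: int = 8,                  # GRPO group size
    seq_len: int = 1024,
    hidden_size: int = 4096,
    num_layers: int = 32,
) -> dict:
    """Return a rough breakdown of the three memory buckets, in GB."""
    vllm_reserved = total_gpu_gb * vllm_gpu_memory_utilization
    model_weights = n_params_billion * bytes_per_param  # ~2 GB per B params in bf16
    # Activations scale with batch size, generations per prompt, sequence
    # length, hidden size, and depth (bf16, ignoring optimizer state).
    tokens_in_flight = batch_size * num_generations * seq_len
    activations = tokens_in_flight * hidden_size * num_layers * 2 / 1e9
    headroom = total_gpu_gb - (vllm_reserved + model_weights + activations)
    return {
        "vllm_reserved_gb": round(vllm_reserved, 1),
        "model_weights_gb": round(model_weights, 1),
        "activations_gb": round(activations, 1),
        "headroom_gb": round(headroom, 1),
    }

print(estimate_grpo_memory_gb())
```

Re-running such an estimate with a lower `num_generations` or `vllm_gpu_memory_utilization` makes the impact of each knob visible before you touch the real training config.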
# GRPO
# CUDA
# GPU Memory
# LLM Training


Kris Beevers & Demetrios Brinkmann · Feb 3rd, 2026
Hundreds of neocloud operators and "AI Factory" builders have emerged to serve the insatiable demand for AI infrastructure. These teams are compressing the design-build-deploy-operate-scale cycle of their infrastructure down to months, while managing massive footprints with lean teams. How? By applying modern intent-driven infrastructure automation principles to greenfield deployments. We'll explore how these teams carry design intent through to production, and how operating and automating around consistent infrastructure data is compressing "time to first train".
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design


Mike Oaten & Demetrios Brinkmann · Jan 27th, 2026
As AI models move into high-stakes environments like Defence and Financial Services, standard input/output testing, evals, and monitoring are becoming dangerously insufficient. To stay compliant with the EU AI Act, NIST AI RMF, and other requirements, MLOps teams need to access and analyse the internal reasoning of their models.
In this session, Mike introduces the company's AI assurance technology, which moves beyond statistical proxies. He breaks down the architecture of the Synapses Logger, a patent-pending technology that embeds directly into the neural activation flow to capture weights, activations, and activation paths in real time.
# EU AI Act
# Regulations Compliance
# Tikos



Valdimar Eggertsson, Lucas Pavanelli, Rohan Prasad & 2 more speakers · Jan 26th, 2026
AI agents aren’t “helping” devs code anymore—they’re starting to run the workflow. This panel pokes at the uncomfortable question: are engineers still in control, or just supervising very confident machines that are slowly replacing how we think, design, and build software?
# AI Agents
# Coding Agents
# LLMs
# Devs Code


Paulo Vasconcellos & Demetrios Brinkmann · Jan 23rd, 2026
“Agent as a product” sounds like hype, until Hotmart turns creators’ content into AI businesses that actually work.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design


Wilder Lopes & Demetrios Brinkmann · Jan 20th, 2026
Enterprise organizations face a critical paradox in AI deployment: while 52% struggle to access needed GPU resources with 6-12 month waitlists, 83% of existing CPU capacity sits idle. This talk introduces an approach to AI infrastructure optimization through universal resource management that reshapes applications to run efficiently on any available hardware—CPUs, GPUs, or accelerators.
We explore how code reshaping technology can unlock the untapped potential of enterprise computing infrastructure, enabling organizations to serve 2-3x more workloads while dramatically reducing dependency on scarce GPU resources. The presentation demonstrates why CPUs often outperform GPUs for memory-intensive AI workloads, offering superior cost-effectiveness and immediate availability without architectural complexity.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design

Subham Kundu · Jan 20th, 2026
A system with 70+ automated tools was sending all tool definitions with every query, wasting tokens and slowing responses. The solution was a semantic tool selection system using Redis as a vector database with intelligent embeddings to understand user intent. This approach cut token consumption by 91% while improving accuracy by matching each query to only the tools it actually needs.
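A simplified sketch of the idea follows; it is not the system from the post (which uses Redis as the vector store), and the tool list, embedding model, and top-k value are illustrative assumptions. Each tool description is embedded once, each incoming query is embedded at request time, and only the most similar tool definitions are passed to the model.

```python
# Simplified semantic tool selection: embed tool descriptions once, embed each
# query, and expose only the top-k most similar tools to the model.
# An in-memory numpy index stands in for the Redis vector database used in the
# system described above; the tools and embedding model are illustrative only.
import numpy as np
from sentence_transformers import SentenceTransformer

TOOLS = {
    "get_invoice_status": "Look up the payment status of a customer invoice.",
    "create_support_ticket": "Open a new support ticket for a customer issue.",
    "refund_order": "Issue a refund for a completed order.",
    # ...dozens more tool definitions in a real system
}

model = SentenceTransformer("all-MiniLM-L6-v2")
tool_names = list(TOOLS)
tool_vecs = model.encode(list(TOOLS.values()), normalize_embeddings=True)

def select_tools(query: str, top_k: int = 3) -> list[str]:
    """Return the names of the top_k tools most relevant to the query."""
    q = model.encode([query], normalize_embeddings=True)[0]
    scores = tool_vecs @ q  # cosine similarity, since vectors are normalized
    best = np.argsort(scores)[::-1][:top_k]
    return [tool_names[i] for i in best]

print(select_tools("the customer wants their money back for order #123"))
```

Only the selected tool definitions go into the prompt, which is where the token savings come from.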
# AI Cost Optimization
# LLM Tooling
# Semantic Search
# Vector Databases
# Redis
# Embeddings



Corey Zumar, Danny Chiao, Jules Damji & 1 more speaker · Jan 16th, 2026
MLflow isn’t just for data scientists anymore—and pretending otherwise is holding teams back.
Corey Zumar, Jules Damji, and Danny Chiao break down how MLflow is being rebuilt for GenAI, agents, and real production systems where evals are messy, memory is risky, and governance actually matters. The takeaway: if your AI stack treats agents like fancy chatbots or splits ML and software tooling, you’re already behind.
# Agents in Production
# Open Source
# MLflow
# Databricks

Médéric Hurier · Jan 13th, 2026
Generative AI has evolved at lightning speed from LLMs and RAG to autonomous AI agents capable of reasoning, planning, and acting. But creating a single agent is easy; managing thousands in an enterprise requires a full AI Agent Platform. This guide breaks down the architecture of a production-grade platform, covering layers like Interaction, Development, Core, Foundation, Information, Observability, and Trust. It shows how to build systems that are secure, scalable, and capable of delivering real business impact.
# AI Agents
# Artificial Intelligence
# Data Science
# Software Architecture
# Cloud Computing



Euro Beinat, Mert Öztekin & Demetrios Brinkmann · Jan 13th, 2026
Agents sound smart until millions of users show up. A real talk on tools, UX, and why autonomy is overrated.
# AI Leadership
# AI Agents
# Coding Agents
# Just Eat
# Prosus Group
