MLOps Community

Agents in Production - Prosus x MLOps
Ioana Apetrei, Igor Šušić & Adam Becker · Feb 19th, 2026
Serving LLMs in Production: Performance, Cost & Scale // CAST AI Roundtable
Experimenting with LLMs is easy. Running them reliably and cost-effectively in production is where things break. Most AI teams never make it past demos and proofs of concept. A smaller group is pushing real workloads to production—and running into very real challenges around infrastructure efficiency, runaway cloud costs, and reliability at scale. This session is for engineers and platform teams moving beyond experimentation and building AI systems that actually hold up in production.
# AI Applications
# GPU Orchestration
# Kubernetes Clusters
# CAST AI
Rahul Raja & Demetrios Brinkmann · Feb 17th, 2026
Information Retrieval is evolving from keyword matching to intelligent, vector-based understanding. In this talk, Rahul Raja explores how dense retrieval, vector databases, and hybrid search systems are redefining how modern AI retrieves, ranks, and reasons over information. He discusses how retrieval now powers large language models through Retrieval-Augmented Generation (RAG), and the new MLOps challenges that arise: embedding drift, continuous evaluation, and large-scale vector maintenance. Looking ahead, the session envisions a future of Cognitive Search, where retrieval systems move beyond recall to genuine reasoning, contextual understanding, and multimodal awareness. Listeners will gain insight into how the next generation of retrieval will bridge semantics, scalability, and intelligence, powering everything from search and recommendations to generative AI.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
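The hybrid search idea described above can be sketched as a weighted blend of a sparse keyword signal and a dense embedding signal. The function names, the toy scoring scheme, and the `alpha` weight below are illustrative assumptions, not any particular search engine's API:

```python
import math
from collections import Counter

def keyword_score(query: str, doc: str) -> float:
    """Sparse signal: fraction of query terms that appear in the document."""
    terms = query.lower().split()
    doc_terms = Counter(doc.lower().split())
    return sum(1 for t in terms if doc_terms[t] > 0) / len(terms)

def cosine(a: list[float], b: list[float]) -> float:
    """Dense signal: cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(query: str, doc: str,
                 q_vec: list[float], d_vec: list[float],
                 alpha: float = 0.5) -> float:
    """Blend the two signals; alpha is a tunable weight, set here arbitrarily."""
    return alpha * keyword_score(query, doc) + (1 - alpha) * cosine(q_vec, d_vec)
```

Production systems typically replace the keyword half with BM25 and the dense half with an ANN index, but the blending step looks much like this.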
This post details the practical application of the A2UI protocol, introducing the Agent-View-Controller (AVC) pattern to decouple agent logic from UI rendering. It highlights that while A2UI enables secure, adaptable interfaces, a hybrid architecture combining static and dynamic elements is often required to balance expressiveness with latency.
# Artificial Intelligence
# Generative UI
# Software Architecture
# LLMs
# Frontend Development
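The agent/renderer decoupling at the heart of the AVC pattern can be illustrated with a toy declarative payload: the agent emits a UI description as data, and the client decides how to render it. The payload shape and component names below are invented for illustration and are not the actual A2UI schema:

```python
# Toy illustration of agent/renderer decoupling: the agent produces a
# declarative UI tree; the client maps component types to widgets.
# This payload shape is hypothetical, not the real A2UI schema.
AGENT_OUTPUT = {
    "type": "column",
    "children": [
        {"type": "text", "value": "Found 3 flights"},
        {"type": "button", "label": "Book cheapest", "action": "book_flight"},
    ],
}

def render(node: dict, indent: int = 0) -> str:
    """A minimal text 'renderer' standing in for a real client."""
    pad = "  " * indent
    if node["type"] == "text":
        return f"{pad}[text] {node['value']}"
    if node["type"] == "button":
        return f"{pad}[button] {node['label']} -> {node['action']}"
    if node["type"] == "column":
        lines = [f"{pad}[column]"]
        lines += [render(child, indent + 1) for child in node["children"]]
        return "\n".join(lines)
    # Unknown component types are skipped rather than executed: the agent
    # ships data, never code, which is what keeps the interface secure.
    return f"{pad}[unsupported: {node['type']}]"
```

Because the agent only emits data, the client can swap renderers (web, mobile, terminal) without touching agent logic, which is the decoupling the post argues for.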
Vincent D. Warmerdam & Demetrios Brinkmann · Feb 13th, 2026
Vincent Warmerdam joins Demetrios fresh off marimo’s acquisition by Weights & Biases—and makes a bold claim: notebooks as we know them are outdated. They talk Molab (GPU-backed, cloud-hosted notebooks), LLMs that don’t just chat but actually fix your SQL and debug your code, and why most data folks are consuming tools instead of experimenting. Vincent argues we should stop treating notebooks like static scratchpads and start treating them like dynamic apps powered by AI. It’s a conversation about rethinking workflows, reclaiming creativity, and not outsourcing your brain to the model.
# Vincent D. Warmerdam
# Calmcode
# marimo
# wandb
# Jupyter Notebooks
# Data Science
A conversation on how AI coding agents are changing the way we build and operate production systems. We explore the practical boundaries between agentic and deterministic code, strategies for shared responsibility across models, engineering teams, and customers, and how to evaluate agent performance at scale. Topics include production quality gates, safety and cost tradeoffs, managing long-tail failures, and deployment patterns that let you ship agents with confidence.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
Addressing the challenge of AI agent exposition, this post evaluates various implementation paths, including full-stack frameworks and AI-generated code. It identifies A2UI as a promising declarative solution that enables dynamic, secure interfaces by decoupling the agent's logic from the client's rendering capabilities.
# Artificial Intelligence
# UI
# Generative AI Tools
# AI Agent
# Software Development
Nick Gillian & Demetrios Brinkmann · Feb 6th, 2026
As AI moves beyond the cloud and simulation, the next frontier is Physical AI: systems that can perceive, understand, and act within real-world environments in real time. In this conversation, Nick Gillian, Co-Founder and CTO of Archetype AI, explores what it actually takes to turn raw sensor and video data into reliable, deployable intelligence. Drawing on his experience building Google’s Soli and Jacquard and now leading development of Newton, a foundational model for Physical AI, Nick discusses how real-time physical understanding changes what’s possible across safety monitoring, infrastructure, and human–machine interaction. He shares lessons learned translating advanced research into products that operate safely in dynamic environments, and why many organizations underestimate the challenges and opportunities of AI in the physical world.
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
This blog explains a systematic way to fix CUDA out-of-memory (OOM) errors during GRPO reinforcement learning training, instead of randomly lowering hyperparameters until something works. Subham argues that most GPU memory issues come from three sources: vLLM reserving GPU memory upfront (often the biggest chunk), training activations (which scale with batch size, sequence length, number of generations, and model size), and model memory (usually the smallest contributor). By carefully reading the OOM error message and estimating how memory is distributed across these components, you can identify exactly what’s causing the crash. The recommended approach is to calculate memory usage first, then adjust the highest-impact settings, such as GPU memory allocation for vLLM, number of generations, batch size, and sequence length. The guide also shows how to maintain training quality by using techniques like gradient accumulation instead of simply shrinking everything. Overall, the key message is: treat OOM debugging as a measurable engineering problem, not trial-and-error, so you can fix memory issues faster while preserving training performance.
# GRPO
# CUDA
# GPU Memory
# LLM Training
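The component-by-component accounting the post recommends can be sketched as a back-of-the-envelope calculator. The function name, the bf16 assumption, and the simplified activation formula below are illustrative; real usage should be confirmed against profiler output such as `torch.cuda.memory_summary()`:

```python
def estimate_grpo_memory_gb(
    n_params_b: float,   # model size in billions of parameters
    gpu_total_gb: float, # total GPU memory on the device
    vllm_gpu_util: float,# vLLM's gpu_memory_utilization setting (fraction)
    batch_size: int,
    num_generations: int,
    seq_len: int,
    hidden_size: int,
    n_layers: int,
    bytes_per_elem: int = 2,  # bf16; assumption, not a measurement
) -> dict:
    """Rough per-component GPU memory estimate for GRPO training.

    The three buckets mirror the post's breakdown: vLLM's upfront
    reservation, training activations, and model weights. All formulas
    are simplified estimates for reasoning about which knob to turn.
    """
    # vLLM reserves a fixed fraction of the GPU up front.
    vllm_gb = gpu_total_gb * vllm_gpu_util
    # Model weights: parameters times bytes per parameter.
    model_gb = n_params_b * 1e9 * bytes_per_elem / 1e9
    # Activations scale with effective batch (batch * generations),
    # sequence length, hidden size, and depth.
    act_gb = (batch_size * num_generations * seq_len
              * hidden_size * n_layers * bytes_per_elem) / 1e9
    return {
        "vllm": vllm_gb,
        "model": model_gb,
        "activations": act_gb,
        "total": vllm_gb + model_gb + act_gb,
    }
```

Running this before launching a job shows which bucket dominates, so you can lower `vllm_gpu_util` or `num_generations` deliberately instead of shrinking every hyperparameter at once.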
Hundreds of neocloud operators and "AI Factory" builders have emerged to serve the insatiable demand for AI infrastructure. These teams are compressing the design, build, deploy, operate, scale cycle of their infrastructure down to months, while managing massive footprints with lean teams. How? By applying modern intent-driven infrastructure automation principles to greenfield deployments. We'll explore how these teams carry design intent through to production, and how operating and automating around consistent infrastructure data is compressing "time to first train".
# AI Agents
# AI Engineer
# AI agents in production
# AI Agents use case
# System Design
Mike Oaten & Demetrios Brinkmann · Jan 27th, 2026
As AI models move into high-stakes environments like Defence and Financial Services, standard input/output testing, evals, and monitoring are becoming dangerously insufficient. To comply with the EU AI Act, NIST AI RMF, and other requirements, MLOps teams need to access and analyse the internal reasoning of their models. In this session, Mike introduces Tikos's patent-pending AI assurance technology, which moves beyond statistical proxies. He breaks down the architecture of the Synapses Logger, which embeds directly into the neural activation flow to capture weights, activations, and activation paths in real time.
# EU AI Act
# Regulations Compliance
# Tikos