Axel Mendoza · Feb 24th, 2026
How To Get Started With Kubernetes: A Practical Guide
A hands-on beginner roadmap for learning Kubernetes that walks you through core concepts (clusters, pods, services, deployments, storage, RBAC, autoscaling, and more) with simple explanations, CLI examples, and practical exercises. By following it, you build real experience and are prepared to use Kubernetes locally or on cloud platforms like GKE or EKS.
# DevOps
# Kubernetes
# From Scratch


Médéric Hurier · Feb 17th, 2026
This post details the practical application of the A2UI protocol, introducing the Agent-View-Controller (AVC) pattern to decouple agent logic from UI rendering. It highlights that while A2UI enables secure, adaptable interfaces, a hybrid architecture combining static and dynamic elements is often required to balance expressiveness with latency.
# Artificial Intelligence
# Generative UI
# Software Architecture
# LLMs
# Frontend Development

Médéric Hurier · Feb 10th, 2026
Addressing the challenge of AI agent exposition, this post evaluates various implementation paths, including full-stack frameworks and AI-generated code. It identifies A2UI as a promising declarative solution that enables dynamic, secure interfaces by decoupling the agent's logic from the client's rendering capabilities.
# Artificial Intelligence
# UI
# Generative AI Tools
# AI Agent
# Software Development

Subham Kundu · Feb 4th, 2026
This blog explains a systematic way to fix CUDA out-of-memory (OOM) errors during GRPO reinforcement learning training, instead of randomly lowering hyperparameters until something works.
Subham argues that most GPU memory issues come from three sources: vLLM reserving GPU memory upfront (often the biggest chunk), training activations (which scale with batch size, sequence length, number of generations, and model size), and model memory (usually the smallest contributor). By carefully reading the OOM error message and estimating how memory is distributed across these components, you can identify exactly what’s causing the crash.
The recommended approach is to calculate memory usage first, then adjust the highest-impact settings, such as GPU memory allocation for vLLM, number of generations, batch size, and sequence length. The guide also shows how to maintain training quality by using techniques like gradient accumulation instead of simply shrinking everything.
Overall, the key message is: treat OOM debugging as a measurable engineering problem, not trial-and-error, so you can fix memory issues faster while preserving training performance.
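The estimate-first workflow described above can be sketched as a toy calculator. All constants and default numbers below are illustrative assumptions (bf16 weights, an 80 GB GPU, a crude activation formula), not measurements from the post; in practice you would calibrate them against your own OOM messages and profiler output.

```python
# Illustrative back-of-envelope GRPO memory estimate. The three buckets mirror
# the post's breakdown: vLLM reservation, training activations, model weights.

def estimate_grpo_memory_gb(
    model_params_b: float,           # model size in billions of parameters
    bytes_per_param: int = 2,        # assumed bf16 weights
    gpu_total_gb: float = 80.0,      # assumed A100/H100-class GPU
    vllm_gpu_fraction: float = 0.5,  # vLLM's upfront reservation fraction
    batch_size: int = 4,
    num_generations: int = 8,
    seq_len: int = 1024,
    hidden_size: int = 4096,
    num_layers: int = 32,
) -> dict:
    """Rough split of GPU memory across the three OOM sources."""
    model_gb = model_params_b * 1e9 * bytes_per_param / 1e9
    vllm_gb = gpu_total_gb * vllm_gpu_fraction
    # Activations scale with batch * generations * seq_len * hidden * layers.
    # The factor of 2 bytes is a crude placeholder; profile to calibrate it.
    activations_gb = (
        batch_size * num_generations * seq_len * hidden_size * num_layers * 2 / 1e9
    )
    total = model_gb + vllm_gb + activations_gb
    return {
        "model_gb": round(model_gb, 1),
        "vllm_reserved_gb": round(vllm_gb, 1),
        "activations_gb": round(activations_gb, 1),
        "total_gb": round(total, 1),
        "fits_on_gpu": total <= gpu_total_gb,
    }

report = estimate_grpo_memory_gb(model_params_b=7)
print(report)
```

Running the numbers before touching hyperparameters makes the trade-offs visible: here vLLM's reservation dwarfs the 7B model's weights, so lowering `vllm_gpu_fraction` or `num_generations` buys far more headroom than shrinking the batch.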
# GRPO
# CUDA
# GPU Memory
# LLM Training

Subham Kundu · Jan 20th, 2026
A system with 70+ automated tools was sending all tool definitions with every query, wasting tokens and slowing responses. The solution was a semantic tool selection system using Redis as a vector database with intelligent embeddings to understand user intent. This approach cut token consumption by 91% while improving accuracy by matching queries to only the relevant tools needed.
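The core idea (embed each tool description once, embed the incoming query, and send the model only the top-k matching tools) can be sketched in a few lines. This is an assumed design, not the post's code: the post uses Redis as the vector store and real embedding models, while here a deterministic hashed bag-of-words "embedding" and in-memory cosine similarity stand in so the sketch is self-contained.

```python
import hashlib
import numpy as np

def embed(text: str, dim: int = 512) -> np.ndarray:
    """Toy deterministic 'embedding': hashed bag-of-words. Swap in a real model."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Hypothetical tool registry: name -> one-line description, embedded once upfront.
TOOLS = {
    "get_weather": "fetch the current weather forecast for a city",
    "send_email": "send an email message to a recipient",
    "query_invoices": "look up billing invoices for a customer account",
}
TOOL_VECS = {name: embed(desc) for name, desc in TOOLS.items()}

def select_tools(query: str, top_k: int = 1) -> list[str]:
    """Return only the top-k tools by cosine similarity to the query."""
    q = embed(query)
    scored = {name: float(vec @ q) for name, vec in TOOL_VECS.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top_k]

print(select_tools("what is the weather forecast in Paris"))
```

With 70+ tools, the prompt then carries only a handful of tool definitions per query instead of all of them, which is where the token savings come from; a vector database like Redis makes the same top-k lookup fast at scale.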
# AI Cost Optimization
# LLM Tooling
# Semantic Search
# Vector Databases
# Redis
# Embeddings

Médéric Hurier · Jan 13th, 2026
Generative AI has evolved at lightning speed from LLMs and RAG to autonomous AI agents capable of reasoning, planning, and acting. But creating a single agent is easy; managing thousands in an enterprise requires a full AI Agent Platform. This guide breaks down the architecture of a production-grade platform, covering layers like Interaction, Development, Core, Foundation, Information, Observability, and Trust. It shows how to build systems that are secure, scalable, and capable of delivering real business impact.
# AI Agents
# Artificial Intelligence
# Data Science
# Software Architecture
# Cloud Computing

Haziqa Sajid · Jan 6th, 2026
Learn how a natively multimodal database like ApertureDB helps healthcare ads stay compliant by flagging missing material facts and improving transparency, supporting true multimodality alongside vector search.
Longer abstract: Technologies like RAG (retrieval-augmented generation), semantic search systems, and generative applications wouldn't be possible without vector databases. Only a few of these databases, such as ApertureDB, can natively handle more than just text: they also work with images, audio, and other data types, which opens up new possibilities across industries like healthcare, retail, and finance.
For this example, we pick healthcare advertising because it naturally blends multiple modalities. With strict rules around accuracy, disclosure, and patient privacy, it's critical to include all Material Facts in marketing content: details that could influence a patient's understanding or choices.
In this blog, we will discuss how a combination of ApertureDB, Unstructured, and OpenAI can help detect and flag missing material facts in healthcare advertisements.
# Multimodal/Generative AI
# RAG
# Vector/similarity/semantic search

Vishakha Gupta · Dec 23rd, 2025
As AI applications move beyond rows and columns into images, video, embeddings, and graphs, traditional query languages like SQL and Cypher start to crack. This post explains why ApertureDB chose to design a JSON-based query language from scratch—one built for multimodal search, data processing, and scale. By aligning with how modern AI systems already communicate (JSON, agents, workflows, and natural language), ApertureDB avoids brittle joins, performance tradeoffs, and DIY pipelines, while still offering SQL and SPARQL wrappers for familiarity. The result is a layered, future-proof way to query, process, and explore multimodal data without forcing old abstractions onto new problems.
# Multimodal/Generative AI
# Usability and Debugging

Médéric Hurier · Dec 16th, 2025
The traditional centralized data platform, characterized by rigid data warehouses and complex ETL pipelines, creates technical bottlenecks that severely slow down the delivery of business insights, forcing decision-makers to wait for overburdened data engineering teams. The open-source prototype Da2a proposes a radical new paradigm: a distributed, agentic ecosystem where specialized, autonomous agents (e.g., Marketing, E-commerce) manage their own domain data and collaborate via an Agent-to-Agent (A2A) protocol to answer complex, cross-domain queries. Instead of focusing on the engineering of data movement and storage, this approach is insight-focused: an orchestrator agent plans and delegates tasks, abstracting underlying complexity and enabling greater scalability, extensibility, and alignment with high-level business logic. For MLOps engineers looking to build more flexible and responsive data foundations, this is a critical evolution.
# Generative AI Tools
# Artificial Intelligence
# Machine Learning
# Data Science
# AI Agent

Kopal Garg · Dec 10th, 2025
Everyone obsesses over models, but NVIDIA’s stack makes it obvious: the real power move is owning everything around the model. NeMo trains it, RAPIDS cleans it, TensorRT speeds it up, Triton serves it, Operators manage it — and the hardware seals the deal.
It’s less a toolkit and more a gravity well for your entire GenAI pipeline. Once you’re in, good luck escaping.
# Generative AI
# AI Frameworks
# NVIDIA

Médéric Hurier · Dec 2nd, 2025
Overcome the friction of boilerplate code and infrastructure wrangling by adopting a declarative approach to AI agent development. This article introduces Ackgent, a production-ready template built on Google's Agent Development Kit (ADK) and Agent Config, which allows developers to define agent behaviors via structured YAML files while keeping implementation logic in Python. Learn how to leverage a modern stack, including uv, just, and the Model Context Protocol (MCP), to rapidly prototype, test, and deploy scalable multi-agent systems on Google Cloud Run.
# AI Agents
# Generative AI Agents
# Artificial Intelligence
# Google ADK
# Data Science
