Blog
Médéric Hurier · Mar 10th, 2026
mAIdAI: Building a Personal Assistant with Google Cloud and Vertex AI
mAIdAI is a lightweight personal AI assistant built with Google Chat, Cloud Run, and Vertex AI, designed to automate repetitive micro-tasks. By grounding the model with a local markdown context file, it provides highly personalized workflow assistance directly within your chat environment.
# Generative AI Tools
# Artificial Intelligence
# AI Agent
# Programming
# Automation
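The grounding step described above can be sketched in a few lines: read a local markdown file and prepend it to the user's question before calling the model. This is a minimal illustration, not mAIdAI's actual code; the function name and prompt layout are assumptions.

```python
from pathlib import Path

def build_grounded_prompt(question: str, context_path: str) -> str:
    """Prepend a local markdown context file to the user question,
    so the model answers with personal workflow details in scope."""
    context = Path(context_path).read_text(encoding="utf-8")
    return (
        "You are a personal assistant. Use ONLY the context below.\n\n"
        f"## Context\n{context}\n\n## Question\n{question}"
    )
```

The resulting string would then be sent to Vertex AI as the prompt; the context file stays local and private.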

Médéric Hurier · Mar 3rd, 2026
This article explores how to use "Agent Skills"—simple Markdown-based context modules—to ensure AI agents strictly adhere to your team's MLOps practices and tooling preferences. By providing explicit organizational rules upfront, developers can eliminate generic boilerplate and align AI-generated code with production-grade standards.
# MLOps
# AI Agent
# Software Engineering
# Generative AI Tools
# Coding
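The "Agent Skills" idea — markdown context modules loaded ahead of the task — can be sketched as follows. This is an illustrative toy, assuming skills live as `.md` files in one directory; the wrapper format and function name are made up.

```python
from pathlib import Path

def load_skills(skills_dir: str) -> str:
    """Concatenate every markdown 'skill' file into one context block
    that is injected ahead of the task, so generated code follows the
    team's MLOps rules instead of generic defaults."""
    parts = []
    for path in sorted(Path(skills_dir).glob("*.md")):
        parts.append(f"<skill name='{path.stem}'>\n{path.read_text()}\n</skill>")
    return "\n\n".join(parts)
```

Loading the rules deterministically (sorted, explicit) is the point: the agent sees the same organizational context on every run.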

Axel Mendoza · Feb 24th, 2026
A hands-on beginner roadmap for learning Kubernetes, designed to walk you through core concepts (like clusters, pods, services, deployments, storage, RBAC, autoscaling, etc.) with simple explanations, CLI examples, and practical exercises. By following it you build real experience and are prepared to use Kubernetes locally or on cloud platforms like GKE or EKS.
# DevOps
# Kubernetes
# From Scratch
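The core objects the roadmap covers (Deployments managing replica pods, scaling) can be previewed as plain Python dicts mirroring the YAML manifests that `kubectl apply -f` consumes. A minimal sketch, not taken from the article:

```python
def make_deployment(name: str, image: str, replicas: int = 1) -> dict:
    """Minimal Deployment manifest: the controller keeps `replicas`
    identical pods running from the given container image."""
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }

def scale(manifest: dict, replicas: int) -> dict:
    """Mimic `kubectl scale deployment` by patching the replica count."""
    manifest["spec"]["replicas"] = replicas
    return manifest
```

Dump `make_deployment("web", "nginx:1.27", 2)` to YAML and it is a valid manifest you can apply to a local cluster.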

Médéric Hurier · Feb 17th, 2026
This post details the practical application of the A2UI protocol, introducing the Agent-View-Controller (AVC) pattern to decouple agent logic from UI rendering. It highlights that while A2UI enables secure, adaptable interfaces, a hybrid architecture combining static and dynamic elements is often required to balance expressiveness with latency.
# Artificial Intelligence
# Generative UI
# Software Architecture
# LLMs
# Frontend Development
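The Agent-View-Controller split can be shown with a toy: the agent emits data only, the controller maps it to a declarative UI spec, and the view is a swappable renderer. The component names below are invented for illustration, not the real A2UI schema.

```python
def agent_logic(query: str) -> dict:
    """Agent: produce data only -- no UI concerns."""
    return {"title": f"Results for {query}", "items": ["alpha", "beta"]}

def controller(data: dict) -> list:
    """Controller: translate agent output into a declarative UI spec
    (component names here are made up, not the actual A2UI schema)."""
    spec = [{"component": "Heading", "text": data["title"]}]
    spec += [{"component": "ListItem", "text": i} for i in data["items"]]
    return spec

def view(spec: list) -> str:
    """View: client-side renderer; replacing it never touches the agent."""
    return "\n".join(f"[{n['component']}] {n['text']}" for n in spec)
```

Because the client only interprets a declarative spec, the agent can never inject executable code into the UI, which is what makes the pattern secure.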

Médéric Hurier · Feb 10th, 2026
Addressing the challenge of AI agent exposition, this post evaluates various implementation paths, including full-stack frameworks and AI-generated code. It identifies A2UI as a promising declarative solution that enables dynamic, secure interfaces by decoupling the agent's logic from the client's rendering capabilities.
# Artificial Intelligence
# UI
# Generative AI Tools
# AI Agent
# Software Development

Subham Kundu · Feb 4th, 2026
This blog explains a systematic way to fix CUDA out-of-memory (OOM) errors during GRPO reinforcement learning training, instead of randomly lowering hyperparameters until something works.
Subham argues that most GPU memory issues come from three sources: vLLM reserving GPU memory upfront (often the biggest chunk), training activations (which scale with batch size, sequence length, number of generations, and model size), and model memory (usually the smallest contributor). By carefully reading the OOM error message and estimating how memory is distributed across these components, you can identify exactly what’s causing the crash.
The recommended approach is to calculate memory usage first, then adjust the highest-impact settings, such as GPU memory allocation for vLLM, number of generations, batch size, and sequence length. The guide also shows how to maintain training quality by using techniques like gradient accumulation instead of simply shrinking everything.
Overall, the key message is: treat OOM debugging as a measurable engineering problem, not trial-and-error, so you can fix memory issues faster while preserving training performance.
# GRPO
# CUDA
# GPU Memory
# LLM Training
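The "calculate first, then tune" approach can be sketched as a rough per-component estimate. The constants below (bf16, a flat activation formula with no per-layer factor) are deliberately simplified assumptions, not the article's exact arithmetic; the point is to compare components and shrink the biggest one first.

```python
def estimate_gpu_memory_gb(
    params_b: float,            # model size in billions of parameters
    batch_size: int,
    seq_len: int,
    num_generations: int,
    hidden_size: int,
    gpu_total_gb: float,
    vllm_fraction: float = 0.5,  # fraction vLLM reserves up front
    bytes_per_value: int = 2,    # bf16
) -> dict:
    """Rough per-component breakdown (illustrative constants, not exact):
    read the OOM message, compare it against this, then adjust the
    highest-impact knob instead of guessing."""
    model = params_b * 1e9 * bytes_per_value / 1e9
    vllm = gpu_total_gb * vllm_fraction
    # activations scale with every GRPO knob at once
    activations = (batch_size * num_generations * seq_len
                   * hidden_size * bytes_per_value) / 1e9
    total = model + vllm + activations
    return {"model_gb": model, "vllm_gb": vllm,
            "activations_gb": activations, "total_gb": total,
            "fits": total <= gpu_total_gb}
```

Running this for a 1.5B model on an 80 GB GPU makes the article's ordering visible: the vLLM reservation dominates, the model weights are small, and activations grow multiplicatively with batch size, generations, and sequence length.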

Subham Kundu · Jan 20th, 2026
A system with 70+ automated tools was sending all tool definitions with every query, wasting tokens and slowing responses. The solution was a semantic tool selection system using Redis as a vector database with intelligent embeddings to understand user intent. This approach cut token consumption by 91% while improving accuracy by matching queries to only the relevant tools needed.
# AI Cost Optimization
# LLM Tooling
# Semantic Search
# Vector Databases
# Redis
# Embeddings
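The selection step can be sketched without the infrastructure: rank tool-description embeddings against the query embedding and forward only the top matches. Here an in-memory cosine similarity stands in for Redis vector search, and the toy 2-D vectors stand in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def select_tools(query_vec, tool_vecs: dict, top_k: int = 3) -> list:
    """Rank tool-description embeddings against the query embedding and
    send the model only the top_k matches, not all 70+ definitions."""
    ranked = sorted(tool_vecs,
                    key=lambda name: cosine(query_vec, tool_vecs[name]),
                    reverse=True)
    return ranked[:top_k]
```

In production the same ranking is delegated to Redis's vector index, so only the winning tool definitions ever reach the prompt, which is where the token savings come from.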

Médéric Hurier · Jan 13th, 2026
Generative AI has evolved at lightning speed from LLMs and RAG to autonomous AI agents capable of reasoning, planning, and acting. But creating a single agent is easy; managing thousands in an enterprise requires a full AI Agent Platform. This guide breaks down the architecture of a production-grade platform, covering layers like Interaction, Development, Core, Foundation, Information, Observability, and Trust. It shows how to build systems that are secure, scalable, and capable of delivering real business impact.
# AI Agents
# Artificial Intelligence
# Data Science
# Software Architecture
# Cloud Computing

Haziqa Sajid · Jan 6th, 2026
Learn how a natively multimodal database like ApertureDB helps healthcare ads stay compliant by flagging missing facts and improving transparency by supporting true multimodality alongside vector search.
Longer abstract: Technologies like RAG (retrieval-augmented generation), semantic search systems, and generative applications wouldn’t be possible without vector databases. Only a few of these databases, such as ApertureDB, can natively handle more than just text: they also work with images, audio, and other data types, which opens up new possibilities across industries like healthcare, retail, and finance.
For this example, we pick healthcare advertising because it offers a rich blend of modalities. With strict rules around accuracy, disclosure, and patient privacy, it is critical to include all material facts in marketing content: the details that could influence a patient’s understanding or choices.
In this blog, we will discuss how a combination of ApertureDB, Unstructured, and OpenAI can help detect and flag missing material facts in healthcare advertisements.
# Multimodal/Generative AI
# RAG
# Vector/similarity/semantic search
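The flagging step can be illustrated with a toy: check an ad against a checklist of required material-fact categories. In the article this check is driven by retrieval over ApertureDB plus an LLM; here a simple keyword lookup stands in, and both the category names and cue words are invented.

```python
# Hypothetical material-fact categories with cue phrases (illustrative only)
REQUIRED_FACTS = {
    "side_effects": ["side effect", "adverse"],
    "prescription_notice": ["prescription", "consult your doctor"],
}

def flag_missing_facts(ad_text: str) -> list:
    """Return the material-fact categories no cue phrase matched --
    a keyword toy standing in for the retrieval-based check."""
    text = ad_text.lower()
    return [fact for fact, cues in REQUIRED_FACTS.items()
            if not any(cue in text for cue in cues)]
```

Any category returned by this function would be surfaced to a reviewer before the ad ships.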

Vishakha Gupta · Dec 23rd, 2025
As AI applications move beyond rows and columns into images, video, embeddings, and graphs, traditional query languages like SQL and Cypher start to crack. This post explains why ApertureDB chose to design a JSON-based query language from scratch—one built for multimodal search, data processing, and scale. By aligning with how modern AI systems already communicate (JSON, agents, workflows, and natural language), ApertureDB avoids brittle joins, performance tradeoffs, and DIY pipelines, while still offering SQL and SPARQL wrappers for familiarity. The result is a layered, future-proof way to query, process, and explore multimodal data without forcing old abstractions onto new problems.
# Multimodal/Generative AI
# Usability and Debugging
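A JSON-based query reads roughly like the sketch below: a list of command objects with constraints and result options, built as ordinary Python data. The exact field names here are an approximation of ApertureDB's style and may differ from the real schema.

```python
import json

# Illustrative JSON query in roughly ApertureDB's style (exact fields
# may differ from the real schema): find images matching a metadata
# constraint and limit the results -- no SQL joins involved.
query = [
    {
        "FindImage": {
            "constraints": {"category": ["==", "xray"]},
            "results": {"limit": 5, "list": ["category"]},
        }
    }
]

payload = json.dumps(query)
```

Because the query is plain JSON, agents and workflows can generate, inspect, and compose it without a parser for a bespoke query grammar.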

Médéric Hurier · Dec 16th, 2025
The traditional centralized data platform, characterized by rigid data warehouses and complex ETL pipelines, creates technical bottlenecks that slow the delivery of business insights and force decision-makers to wait for overburdened data engineering teams. The open-source prototype Da2a proposes a radical new paradigm: a distributed, agentic ecosystem where specialized, autonomous agents (e.g., Marketing, E-commerce) manage their own domain data and collaborate via an Agent-to-Agent (A2A) protocol to answer complex, cross-domain queries. Instead of focusing on the engineering of data movement and storage, this approach is insight-focused: an orchestrator agent plans and delegates tasks, abstracting the underlying complexity and enabling greater scalability, extensibility, and alignment with high-level business logic. For MLOps engineers, this is a critical evolution toward more flexible and responsive data foundations.
# Generative AI Tools
# Artificial Intelligence
# Machine Learning
# Data Science
# AI Agent
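The orchestration idea can be reduced to a toy: an orchestrator holds a plan of (domain, sub-task) steps and delegates each to the agent that owns that domain's data. The agent names and return values below are invented, and real A2A communication happens over a network protocol rather than direct calls.

```python
def marketing_agent(task: str) -> str:
    """Domain agent: owns marketing data, answers only its slice."""
    return f"marketing metrics for '{task}'"

def ecommerce_agent(task: str) -> str:
    """Domain agent: owns e-commerce data."""
    return f"sales figures for '{task}'"

AGENTS = {"marketing": marketing_agent, "ecommerce": ecommerce_agent}

def orchestrate(plan: list) -> list:
    """Delegate each (domain, sub-task) step to the owning agent and
    collect the answers -- the caller never touches the domain data."""
    return [AGENTS[domain](task) for domain, task in plan]
```

A cross-domain question becomes a plan spanning several agents, and adding a new domain means registering a new agent rather than rebuilding a central pipeline.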
