Model Context Protocol
# MCP
# LLMs
# AI Applications

A deep dive into MCP architecture and building your first Hacker News integration

July 8, 2025
Soham Chatterjee

The Model Context Protocol (MCP) is an open standard designed to help LLMs and LLM applications interact with external systems. It was introduced by Anthropic in late 2024 to standardize the way applications provide context to, and use capabilities from, external data sources and tools.
Anthropic realised that while AI models are getting increasingly powerful and sophisticated, their training data cutoff and lack of access to external tools were limiting their capabilities. MCP's purpose is to bridge this gap and give LLMs access to the large, often siloed, data and tool ecosystems they need to operate in real-world environments.
You can think of MCP as a USB port for AI applications. Just as USB provides a standardized physical connection for different devices, MCP is a standardized communication protocol for connecting AI models to different external resources, such as databases, APIs, content repositories, business tools, and even development environments.

The Need for MCP


Before MCP, integrating an AI application with multiple external tools or data sources often involved building custom connectors for each system. If an organization had many AI applications and needed to connect them to multiple tools or systems (like databases, or APIs for Slack, GitHub, etc.), it could potentially require developing and maintaining dozens of unique integrations. This led to significant inefficiencies, with development teams solving similar integration challenges across different projects, and the lack of standardization produced fragmented, unreliable integrations.
Furthermore, LLMs are fundamentally limited by the context provided to them. To generate relevant, accurate, and personalized responses or perform meaningful actions, they need access to real-time, domain-specific context from external sources. Without a standardized way to access this context, models operate with incomplete information, hindering their effectiveness.
With MCP, tool creators build one MCP server for each external system, and application developers build MCP clients for each AI application. Both sides conform to the MCP standard, drastically reducing integration complexity and enabling various AI applications and tools to work together seamlessly, forming a modular AI ecosystem. Capabilities exposed through MCP can be reused across systems, and the protocol provides a standardized framework for accessing, sharing, and updating context, including mechanisms for user consent and control.

How MCP Works

MCP uses a client-host-server architecture to manage interactions between AI applications and external systems:
  • MCP Host: This is the primary application the user interacts with, such as an AI chat interface (like Claude Desktop), an IDE plugin (like Cursor), or a custom AI workflow tool. The Host acts as a coordinator for one or more MCP Clients. It is responsible for managing the lifecycle of these clients, enforcing security policies (like user consent and permissions), initiating connections, and handling the integration with the underlying LLM to interpret user requests and orchestrate interactions with MCP Servers.
  • MCP Client: Residing within the MCP Host, each MCP Client acts as an intermediary, managing a dedicated, stateful, one-to-one connection with an MCP Server. Clients handle capability negotiation with their respective servers, orchestrate the flow of requests and responses according to the MCP specification, and maintain security boundaries, ensuring one client cannot access resources intended for another.
  • MCP Server: These are services that act as the bridge between the MCP Host and external systems. Servers expose specific capabilities like tools and prompts from data sources or external services like APIs. Any MCP Client can potentially connect to and utilize the capabilities offered by an MCP Server.
This architecture is modular, allowing different components to be developed and updated independently while ensuring standardized communication pathways.

MCP relies on JSON-RPC 2.0 as the message format for all communication between Clients and Servers. The protocol defines specific request and response structures for various operations like initialization, capability discovery, and invoking features.
MCP supports stateful connections, meaning context and session information can be maintained throughout the interaction between a Client and a Server. Communication happens over a transport layer: standard input/output (stdio) for servers running locally, or HTTP for remote servers. Over HTTP, the server can push messages asynchronously to the client across a persistent connection (via Server-Sent Events), which allows servers to be hosted remotely and accessed by multiple clients.
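As a sketch of what this looks like on the wire, here is a hypothetical tools/call exchange. The tool name and arguments are illustrative (they match the Hacker News server built later in this post):

Client → Server (request):
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "get_top_stories",
    "arguments": { "limit": 5 }
  }
}

Server → Client (response):
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      { "type": "text", "text": "## Top 5 Stories\n\n..." }
    ]
  }
}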

Connection Lifecycle

Here is what the interaction between an MCP client and server looks like:
  1. Initialization: The Client sends an initialization request containing its supported protocol version and capabilities. The Server responds with its own version and capabilities. The Client confirms with an initialized notification (an example handshake is shown below).
  2. Message Exchange: After initialization, the Client and Server exchange messages to discover and utilize capabilities (e.g., listing tools, calling tools, reading resources).
  3. Termination: The connection ends when either side closes the underlying transport, or when an error condition occurs.
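Concretely, the initialization handshake in step 1 boils down to three JSON-RPC messages. Here is a sketch; the protocol version string and the capability contents are illustrative:

Client → Server:
{
  "jsonrpc": "2.0",
  "id": 0,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": { "name": "example-client", "version": "0.1.0" }
  }
}

Server → Client:
{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": { "tools": {} },
    "serverInfo": { "name": "hackernews-server", "version": "1.0.0" }
  }
}

Client → Server (notification, no id):
{ "jsonrpc": "2.0", "method": "notifications/initialized" }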


Fundamental MCP Concepts

MCP defines a few features or capabilities that servers can offer to clients, and some that clients can offer to servers. These capabilities are designed to cater to the specific needs of AI agents interacting with external systems, reflecting patterns observed in agent development.
  • Tools (Model-Controlled): These are functions or actions that the AI model (via the Client) can execute through the Server. Tools allow the AI to perform actions with potential side effects, such as creating a GitHub issue, sending a Slack message, querying an API, or updating a database entry. The AI model typically decides when to use a tool based on the user's request and the tool's description. Clients discover available tools and invoke them. User approval is typically required before a tool is executed.
  • Resources (Application/User-Controlled): Resources represent context and data (like files, database records, API responses, or knowledge base articles) that the Server makes available for reading by the Client, user, or AI model. Unlike tools, these are primarily for providing information. Servers can also notify clients about changes to resources, and clients can subscribe to updates for specific resources.
  • Prompts (User-Controlled): These are pre-defined, templated messages or workflows offered by the Server to guide user interactions or structure requests to the LLM for specific tasks. A user might select a prompt to perform a common action, like summarizing text using a specific format or generating a Git commit message based on provided changes. Prompts can accept dynamic arguments and incorporate context from resources.
  • Sampling (Server-Initiated): This capability reverses the usual interaction flow. It allows an MCP Server to request that the Client initiate an LLM completion or interaction. This enables more complex, server-driven agentic behaviors where the server might need the LLM to perform further reasoning, analysis, or generation based on intermediate results, without requiring explicit user input for each step. 
  • Roots (Client-Exposed): Roots are URIs, typically filesystem locations, that the Client exposes to the Server. They define the intended scope or boundaries within which the server should operate, informing the server about relevant directories, files, or project locations it has access to. For example, an IDE acting as a Host might expose the current project directory as a root to a file system server. Clients supporting this capability declare it during initialization and can notify servers if the list of roots changes. This helps manage context and security by clarifying the intended operational scope for servers interacting with local resources.
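For instance, when a client declares the roots capability, the server can ask for the current roots with a roots/list request and receive URIs in response. A sketch, with an illustrative path:

Server → Client:
{ "jsonrpc": "2.0", "id": 2, "method": "roots/list" }

Client → Server:
{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {
    "roots": [
      { "uri": "file:///home/user/my-project", "name": "my-project" }
    ]
  }
}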

An MCP Server for Hacker News

Our MCP server will provide six main tools:
  1. get_top_stories - Fetch trending stories
  2. get_new_stories - Get the latest submissions
  3. get_best_stories - Retrieve highest-quality content
  4. get_story_details - Deep dive into specific stories with comments
  5. get_user_info - Look up user profiles
  6. search_stories - Find stories by keyword
Imports
Let's start with the core structure:
import asyncio
import json
import logging
from typing import Any, Dict, List, Optional

import aiohttp
from mcp.server import Server
from mcp.server.models import InitializationOptions
from mcp.server.stdio import stdio_server
from mcp.types import CallToolRequest, CallToolResult, ListToolsResult, TextContent, Tool
We're using aiohttp for async HTTP requests and the official MCP Python library; the request and response types (Tool, TextContent, ListToolsResult, and so on) come from mcp.types, and InitializationOptions from mcp.server.models, since the handlers below use them. The stdio_server handles communication with Claude through standard input/output.
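The handlers in the rest of this post also call a HackerNewsAPI helper class that isn't shown here. A minimal sketch of what it could look like, assuming the public Hacker News Firebase API at https://hacker-news.firebaseio.com/v0 (the class and method names simply mirror how the handlers call it):

HN_API_BASE = "https://hacker-news.firebaseio.com/v0"

class HackerNewsAPI:
    """Thin async wrapper around the public Hacker News Firebase API."""

    async def __aenter__(self):
        # One shared HTTP session for all requests inside the `async with` block
        self.session = aiohttp.ClientSession()
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.session.close()

    async def _get_json(self, path: str) -> Any:
        async with self.session.get(f"{HN_API_BASE}/{path}") as resp:
            resp.raise_for_status()
            return await resp.json()

    async def get_top_stories(self, limit: int = 10) -> List[int]:
        return (await self._get_json("topstories.json"))[:limit]

    async def get_new_stories(self, limit: int = 10) -> List[int]:
        return (await self._get_json("newstories.json"))[:limit]

    async def get_best_stories(self, limit: int = 10) -> List[int]:
        return (await self._get_json("beststories.json"))[:limit]

    async def get_item(self, item_id: int) -> Optional[Dict[str, Any]]:
        # The API returns null (None) for items that don't exist
        return await self._get_json(f"item/{item_id}.json")

    async def get_user(self, username: str) -> Optional[Dict[str, Any]]:
        return await self._get_json(f"user/{username}.json")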
Setting Up the MCP Server
The heart of our server is the MCP Server instance:
server = Server("hackernews-server")

Registering Tools
MCP uses decorators to register tools. The @server.list_tools() decorator tells Claude what functions are available:

@server.list_tools()
async def handle_list_tools() -> ListToolsResult:
    return ListToolsResult(
        tools=[
            Tool(
                name="get_top_stories",
                description="Get top stories from Hacker News",
                inputSchema={
                    "type": "object",
                    "properties": {
                        "limit": {
                            "type": "integer",
                            "description": "Number of stories to fetch (default: 10, max: 30)",
                            "default": 10,
                            "minimum": 1,
                            "maximum": 30
                        }
                    }
                }
            ),
            # ... more tools
        ]
    )

Each tool definition includes:
  • name: The function identifier
  • description: What the tool does (helps Claude decide when to use it)
  • inputSchema: JSON Schema defining parameters, types, and validation rules
The schema is crucial - it ensures Claude sends valid parameters and provides autocomplete/validation in development.
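As a second illustration, the search_stories tool used later in this post could be declared alongside get_top_stories in the same tools=[...] list. This is a sketch, and the schema in the original server may differ slightly, but the parameters match how the handler reads them:

            Tool(
                name="search_stories",
                description="Search Hacker News stories by keyword in the title",
                inputSchema={
                    "type": "object",
                    "properties": {
                        "keyword": {
                            "type": "string",
                            "description": "Keyword to look for in story titles"
                        },
                        "story_type": {
                            "type": "string",
                            "description": "Which feed to search: top, new, or best",
                            "enum": ["top", "new", "best"],
                            "default": "top"
                        },
                        "limit": {
                            "type": "integer",
                            "description": "Maximum number of matching stories to return",
                            "default": 10,
                            "minimum": 1,
                            "maximum": 30
                        }
                    },
                    "required": ["keyword"]
                }
            ),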
Implementing Tool Logic
The @server.call_tool() decorator handles actual function calls:
@server.call_tool()
async def handle_call_tool(request: CallToolRequest) -> CallToolResult:
    async with HackerNewsAPI() as hn_api:
        if request.params.name == "get_top_stories":
            limit = request.params.arguments.get("limit", 10)
            story_ids = await hn_api.get_top_stories(limit)

            stories = []
            for story_id in story_ids:
                story = await hn_api.get_item(story_id)
                if story:
                    stories.append(format_story(story))

            result = f"## Top {len(stories)} Stories\n\n" + "\n---\n".join(stories)
            return CallToolResult(content=[TextContent(type="text", text=result)])
This pattern repeats for each tool:
  1. Extract parameters from the request
  2. Call the appropriate API methods
  3. Format the results for display
  4. Return a CallToolResult with the formatted text
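The same pattern applied to get_user_info might look like the following. This is a hedged sketch (the original post doesn't show this branch); it assumes the get_user method sketched earlier and the standard fields of a Hacker News user object (id, karma, created, about):

elif request.params.name == "get_user_info":
    username = request.params.arguments["username"]
    user = await hn_api.get_user(username)

    if not user:
        return CallToolResult(
            content=[TextContent(type="text", text=f"User '{username}' not found.")]
        )

    # "created" is a Unix timestamp; shown raw here for simplicity
    result = (
        f"**{user.get('id', username)}**\n"
        f"Karma: {user.get('karma', 0)}\n"
        f"Created: {user.get('created', 'Unknown')}\n"
        f"About: {user.get('about', 'No bio provided')}\n"
    )
    return CallToolResult(content=[TextContent(type="text", text=result)])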
Data Formatting
You can also format the API responses for human consumption. Hacker News returns raw data that needs cleaning:
def format_story(story: Dict[str, Any]) -> str:
    title = story.get("title", "No title")
    by = story.get("by", "Unknown")
    score = story.get("score", 0)
    url = story.get("url", "")
    hn_url = f"https://news.ycombinator.com/item?id={story.get('id', '')}"

    formatted = f"**{title}**\n"
    formatted += f"By: {by} | Score: {score}\n"
    if url:
        formatted += f"URL: {url}\n"
    formatted += f"HN Discussion: {hn_url}\n"

    return formatted

Advanced Features: Story Search
Your server can also implement more complex logic, like searching Hacker News stories by keyword:
elif request.params.name == "search_stories":
    keyword = request.params.arguments["keyword"].lower()
    story_type = request.params.arguments.get("story_type", "top")
    limit = request.params.arguments.get("limit", 10)

    # Get stories based on type
    if story_type == "new":
        story_ids = await hn_api.get_new_stories(30)
    elif story_type == "best":
        story_ids = await hn_api.get_best_stories(30)
    else:
        story_ids = await hn_api.get_top_stories(30)

    matching_stories = []
    for story_id in story_ids:
        story = await hn_api.get_item(story_id)
        if story and story.get("title"):
            if keyword in story["title"].lower():
                matching_stories.append(format_story(story))
                if len(matching_stories) >= limit:
                    break

Running the Server
The main function sets up the MCP server:
async def main():
    options = InitializationOptions(
        server_name="hackernews-server",
        server_version="1.0.0",
        capabilities=server.get_capabilities(
            notification_options=None,
            experimental_capabilities=None,
        ),
    )

    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream, options)
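One detail the snippet above leaves out is the entry point; a typical way to start the server when the script is run directly is:

if __name__ == "__main__":
    asyncio.run(main())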
The server communicates through stdio, which Claude Desktop (and other MCP clients) use to send JSON-RPC messages back and forth.
To use the server with Claude Desktop, add it to your configuration file:
{   "mcpServers": {     "hackernews": {       "command": "python",       "args": ["/path/to/hackernews_mcp_server.py"]     }   } }

Testing Your Server
Once configured, you can test with requests like:
  • "Show me the top 5 Hacker News stories"
  • "Get details about Hacker News story 38905019"
  • "Search for stories about 'Python' in new stories"
  • "Tell me about the Hacker News user 'pg'"


Conclusion: MCP's Role in Advancing Integrated AI Systems

MCP is a significant step towards standardizing the integration of LLMs with the external world. Its core value lies in addressing the integration problem by providing a common language for AI applications to communicate with diverse data sources and tools (Servers). By reducing integration complexity, improving interoperability and enabling access to real-time context, MCP tackles fundamental limitations that previously hindered the practical application of LLMs in complex environments.
As the protocol matures, we can expect a few things to happen:
  • Rise of Agentic Systems: MCP provides the necessary plumbing for AI agents to discover and utilize tools, access relevant data, and perform multi-step tasks, moving AI beyond simple chatbots towards systems that can actively assist users in complex workflows.
  • Enterprise Integration: MCP offers a standardized and potentially more secure pathway for enterprises to connect AI models to internal databases, proprietary APIs, and sensitive data sources, unlocking significant value within organizations.
  • Shift in Development Paradigms: Building applications may increasingly involve composing capabilities via MCP servers as well as standard APIs. This could lead to more modular and rapidly developed AI systems.
  • Democratization of AI Integration: By simplifying the process of connecting AI to external systems, MCP could lower the barrier for developers and potentially even end-users to create sophisticated AI-powered automations and workflows.

With our HN MCP server, Claude can actively browse, search, and analyze current content on HN. This same pattern works for any API: GitHub repositories, weather data, financial markets, internal company tools, or databases. Each MCP server extends Claude's capabilities in a specific domain while maintaining the natural conversational interface.
The future of AI assistants is going to be about agents seamlessly connecting LLMs to tools and data that matter to users. MCP provides the foundation for that connected future.
