
RAGnarok: Web Search MCP for Open-Source LLMs

Ritik Bompilwar March 2026

RAGnarok is a Docker-first Model Context Protocol (MCP) server that brings deterministic, real-time web search to open-source Large Language Models. No API keys for proprietary search providers. No closed-source dependencies. Just a fully self-hosted search pipeline, packaged as a single Docker container, that turns any local model into a web-connected assistant.

Open-Source Models Deserve Open-Source Search

Open-source LLMs are increasingly powerful, but they still cannot access live information. The moment you ask your local model about a library released last week, a breaking news story, or any fact that postdates its training data, it either hallucinates or says it doesn't know.

Without RAGnarok

A local LLM tries to access a GitHub repository and analyze an arXiv paper. Without web access, it fails at both tasks.

Fails to access GitHub repo link
Fails to analyze arXiv paper

RAGnarok closes this gap by giving any open-source model the same access to current information that users expect from proprietary alternatives.

With RAGnarok

The same prompts with RAGnarok enabled. The model reads the repo and analyzes the paper in real time.

Successfully reads GitHub repo
Successfully analyzes arXiv paper

Fully Self-Hosted, Zero External APIs

RAGnarok's search stack is completely self-contained inside a single Docker container. Here is how a query flows through the pipeline:

Discovery
Web queries are dispatched through SearXNG, a privacy-respecting metasearch engine that aggregates results from multiple public search backends. No API keys or sign-ups required.
Verification
Candidate URLs are checked against access policies including private IP filtering and robots.txt compliance.
Retrieval
Pages are rendered in a headless Playwright Chromium instance, handling JavaScript-heavy sites that simple HTTP fetchers miss.
Extraction
Main content is isolated with Mozilla's Readability and converted to clean Markdown, stripping ads, navbars, and boilerplate.
Ranking
Extracted text is chunked and ranked against the original query using bm25s for fast, deterministic retrieval.
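
As a rough illustration of the Verification step, the sketch below rejects URLs that point at non-public IP literals and checks a path against robots.txt rules, using only the Python standard library. It is not RAGnarok's code: the `verify` helper, the `example-bot` user agent, and the choice to pass robots.txt in as a string are all assumptions made for the example. A production check would also resolve hostnames and test every returned address.

```python
import ipaddress
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-bot"  # hypothetical user agent, not RAGnarok's


def is_public_host(host: str) -> bool:
    """Reject IP literals outside the public address space."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch only special-cases "localhost";
        # a real check would resolve the name and test each address.
        return host.lower() != "localhost"
    return ip.is_global


def allowed_by_robots(url: str, robots_txt: str) -> bool:
    """Apply already-downloaded robots.txt rules to a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(USER_AGENT, url)


def verify(url: str, robots_txt: str) -> bool:
    """Return True only for http(s) URLs on public hosts that robots.txt allows."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    return is_public_host(parts.hostname) and allowed_by_robots(url, robots_txt)
```

Running the cheap scheme and address checks first means a blocked URL is rejected before any robots.txt rules are consulted.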

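Everything runs inside Docker. The only traffic that leaves your machine is what SearXNG sends to its public search backends and the page fetches themselves; no proprietary search API sits in between.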
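
To make the Ranking step concrete, here is a small pure-Python BM25 scorer over text chunks. It is a stand-in written for illustration and does not use the bm25s API. The fixed-size word chunking, the tokenizer, and the `k1` and `b` defaults are assumptions, not RAGnarok's settings.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(text: str, size: int = 50) -> list[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def bm25_rank(query: str, chunks: list[str],
              k1: float = 1.5, b: float = 0.75) -> list[tuple[float, str]]:
    """Score every chunk against the query and return (score, chunk), best first."""
    docs = [tokenize(c) for c in chunks]
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n or 1.0
    # Document frequency: in how many chunks each term appears.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scored = []
    for chunk, doc in zip(chunks, docs):
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scored.append((score, chunk))
    # sorted() is stable, so ties keep their input order and the
    # ranking is identical on every run.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Because the score depends only on term counts, the same query and the same chunks always produce the same ordering, which is the sense in which lexical ranking is deterministic.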

Two Tools, Full Coverage

RAGnarok exposes two MCP tools:

web_search(query, top_k)
Runs the query through SearXNG, scrapes the pages that pass verification, and returns the top_k BM25-ranked results in Markdown.
read_source(url, max_pages)
Reads a public URL directly and returns the entire extracted content in Markdown. Optionally expands into nearby same-origin pages up to max_pages.
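
The same-origin expansion that read_source describes could look something like the sketch below. It treats two URLs as same-origin when scheme, host, and port match. The helper names are hypothetical. Counting the starting page toward max_pages is also an assumption, since the description above does not say how the limit is applied.

```python
from urllib.parse import urldefrag, urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def same_origin_pages(start_url: str, hrefs: list[str], max_pages: int) -> list[str]:
    """Return start_url plus distinct same-origin links, at most max_pages in total."""
    pages = [urldefrag(start_url).url]
    for href in hrefs:
        if len(pages) >= max_pages:
            break
        # Resolve relative links and drop #fragments so duplicates collapse.
        candidate = urldefrag(urljoin(start_url, href)).url
        if origin(candidate) == origin(start_url) and candidate not in pages:
            pages.append(candidate)
    return pages
```

For example, starting from `https://example.com/docs/` with links `intro`, `/api#top`, and `https://other.example/x`, the first two resolve to pages on the same origin and the third is skipped.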

Both tools return structured JSON with metadata like response time, page counts, and relevance scores.
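
For readers who have not seen MCP on the wire, a client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such requests for the two tools. The tool and parameter names come from the descriptions above, while the argument values are made up for illustration. The response is not shown because its exact field names are not given here, and a real client would complete the MCP initialization handshake first.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Illustrative argument values only.
search_request = tool_call("web_search", {"query": "bm25s release notes", "top_k": 5})
read_request = tool_call("read_source", {"url": "https://example.com/docs/", "max_pages": 3})
```

In practice an MCP client such as LM Studio or OpenCode builds these messages for you; the sketch only shows what travels between the client and the server.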

Built for Responsibility

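Giving models the ability to fetch live web content comes with responsibility. RAGnarok is intentionally conservative: candidate URLs are filtered against private IP ranges and checked for robots.txt compliance before anything is fetched.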

Get Started

Pull the image from Docker Hub:

docker pull ritik12/ragnarok-mcp:latest

Or build locally from source:

git clone https://github.com/RITIK-12/RAGnarok.git
cd RAGnarok
docker build -t ragnarok-mcp:local .

LM Studio

Add to LM Studio

OpenCode

For OpenCode users, an example configuration is provided at examples/opencode.json. Point it at the container and you are ready to go.

Visit the RAGnarok repository to get started and give your local models the search capabilities they have been missing.