rag-standard · mainstream

Standard RAG

Hybrid vector + BM25 retrieval pipeline. The production baseline.

Overview

Standard RAG is the most widely deployed RAG architecture. It combines dense vector search with BM25 sparse retrieval, merges the two result lists with reciprocal rank fusion, then reranks the merged candidates with a cross-encoder. It covers roughly 90% of production use cases with predictable latency and cost.

Architecture

Ingestion

Document Loader: PDF · web · txt · docx

Text Splitter: recursive · 512 tok

Embedder: text-embedding-3

Vector Store: Qdrant / Pinecone
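
The splitter stage can be sketched in plain Python. This is a minimal illustration, not the pipeline's actual splitter: the separator order is an assumption, and `count_tokens` is a hypothetical stand-in that counts whitespace-separated words rather than real model tokens.

```python
from typing import List

# Coarse-to-fine separators (an assumed ordering, not taken from the pipeline).
SEPARATORS = ["\n\n", "\n", ". ", " "]


def count_tokens(text: str) -> int:
    # Stand-in tokenizer: whitespace words, not a real BPE token count.
    return len(text.split())


def recursive_split(text: str, max_tokens: int = 512,
                    separators: List[str] = SEPARATORS) -> List[str]:
    """Split text into chunks of at most max_tokens, preferring coarse separators."""
    text = text.strip()
    if not text:
        return []
    if count_tokens(text) <= max_tokens:
        return [text]
    if not separators:
        # No separator left: hard-cut on word boundaries.
        words = text.split()
        return [" ".join(words[i:i + max_tokens])
                for i in range(0, len(words), max_tokens)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    buffer = ""
    for piece in text.split(sep):
        candidate = piece if not buffer else buffer + sep + piece
        if count_tokens(candidate) <= max_tokens:
            buffer = candidate  # keep packing pieces into the current chunk
            continue
        if buffer:
            chunks.append(buffer)
            buffer = ""
        if count_tokens(piece) <= max_tokens:
            buffer = piece
        else:
            # Piece alone is too large: retry with the next finer separator.
            chunks.extend(recursive_split(piece, max_tokens, finer))
    if buffer:
        chunks.append(buffer)
    return [c.strip() for c in chunks if c.strip()]
```

In a real deployment the word count would be swapped for the embedder's own tokenizer so that the 512 limit matches what the model sees.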

Retrieval

Dense Retriever: cosine top-k

BM25 Retriever: keyword match

Ensemble Merger: reciprocal rank fusion

Reranker: cross-encoder
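
The first three retrieval stages can be sketched with the standard library only. Everything here is illustrative: the function names are hypothetical, the BM25 variant uses the Lucene-style IDF with the common defaults `k1=1.5` and `b=0.75`, and the fusion constant `c=60` is a conventional choice rather than a value taken from this pipeline. The cross-encoder reranker is omitted because it needs a trained model.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def cosine_top_k(query_vec: Sequence[float],
                 doc_vecs: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Rank document ids by cosine similarity to the query vector."""
    def cosine(a: Sequence[float], b: Sequence[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]


def bm25_top_k(query: str, docs: Dict[str, str], k: int,
               k1: float = 1.5, b: float = 0.75) -> List[str]:
    """Rank document ids with BM25 over lowercase whitespace tokens."""
    tokenized = {d: text.lower().split() for d, text in docs.items()}
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized.values()) / n
    doc_freq = Counter(term for toks in tokenized.values() for term in set(toks))
    scores: Dict[str, float] = {}
    for d, toks in tokenized.items():
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores[d] = score
    # Documents with no matching term are dropped from the sparse ranking.
    ranked = sorted((d for d in scores if scores[d] > 0), key=scores.get, reverse=True)
    return ranked[:k]


def reciprocal_rank_fusion(rankings: List[List[str]], c: int = 60) -> List[str]:
    """Merge ranked lists: each doc scores sum(1 / (c + rank)) over its appearances."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (c + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Toy corpus with hand-made 2-d "embeddings" (made up for illustration).
docs = {"a": "qdrant vector store setup",
        "b": "bm25 keyword match basics",
        "c": "cross-encoder reranking guide"}
vecs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}

dense = cosine_top_k([1.0, 0.2], vecs, k=3)      # ['a', 'b', 'c']
sparse = bm25_top_k("bm25 keyword", docs, k=3)   # ['b']
merged = reciprocal_rank_fusion([dense, sparse])  # ['b', 'a', 'c']
```

Fusion works on ranks rather than raw scores, so the cosine and BM25 scales never need to be normalized against each other. Document `b` wins here because it appears in both lists.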

Generation

Prompt Template: context + citations

LLM Call: provider-agnostic

Response Formatter: inline [1] citations
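
A minimal sketch of the prompt template and response formatter, assuming passages are numbered from 1 and the model is asked to cite them as `[n]`. The template wording, function names, and output shape are hypothetical. The LLM call itself is left out because it depends on the provider.

```python
import re
from typing import Dict, List

# Hypothetical template text; the real pipeline's wording is not specified.
PROMPT_TEMPLATE = (
    "Answer the question using only the numbered context passages.\n"
    "Cite passages inline as [n].\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\nAnswer:"
)


def build_prompt(question: str, passages: List[str]) -> str:
    """Number the reranked passages and place them in the template."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return PROMPT_TEMPLATE.format(context=context, question=question)


def format_response(answer: str, sources: List[str]) -> Dict[str, object]:
    """Keep inline [n] markers and list only the sources actually cited."""
    cited = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    # Ignore markers that point outside the retrieved passages.
    valid = [n for n in cited if 1 <= n <= len(sources)]
    return {
        "answer": answer,
        "citations": [{"id": n, "source": sources[n - 1]} for n in valid],
    }
```

Dropping out-of-range markers is one possible policy. A stricter formatter could reject the answer or trigger a retry when the model cites a passage that was never supplied.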

When to use

Use when

corpus_size < 10M vectors
update_frequency == frequent
query_complexity == simple_to_moderate
team needs a proven, battle-tested baseline

Avoid when

multi-hop relational queries across entities
corpus_size > 1B vectors
team prefers zero infrastructure to manage
queries require knowledge graph traversal
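
The two checklists can be read as a rough decision function. The sketch below is illustrative only: the `Workload` fields and the `"review"` outcome are assumptions, and it ignores update frequency and query complexity, which the checklist names but does not quantify.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    # Hypothetical fields mirroring the checklist items above.
    corpus_vectors: int
    needs_multi_hop: bool = False
    needs_graph_traversal: bool = False
    wants_zero_infrastructure: bool = False


def standard_rag_fit(w: Workload) -> str:
    """Return 'avoid', 'use', or 'review' based on the checklist thresholds."""
    if (w.corpus_vectors > 1_000_000_000 or w.needs_multi_hop
            or w.needs_graph_traversal or w.wants_zero_infrastructure):
        return "avoid"
    if w.corpus_vectors < 10_000_000:
        return "use"
    # Between 10M and 1B vectors the checklist gives no guidance.
    return "review"
```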

Compatible vector databases

Pinecone · Qdrant · pgvector · Weaviate · Chroma

Compatible frameworks

langchain · llamaindex · raw python · typescript

#hybrid #bm25 #semantic-search #reranking #production-ready
