Question 1

What is the algorithm pattern for RAG?

Accepted Answer

Retrieve, then generate. Retrieval-augmented generation (RAG) reduces LLM hallucination by grounding generation in retrieved facts. The query is embedded and compared against the embeddings in a document vector store, and the top-k matches are injected into the prompt before generation.

Question 2

How does RAG work step by step?

Accepted Answer

1. Embed the query into a dense vector using an embedding model.
2. Compute cosine similarity between the query embedding and all document embeddings.
3. Retrieve the top-k most similar documents.
4. Concatenate: prompt = query + retrieved_context.
5. The LLM generates an answer conditioned on the augmented prompt.
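
The retrieval and prompt-assembly steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: `embed` is a toy bag-of-words stand-in for a real embedding model, the function names (`embed`, `cosine`, `retrieve`, `build_prompt`) and the prompt layout are hypothetical, and the final LLM call is left out because it depends on whichever model API is used.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    Assumption: a real system would call a dense embedding model here.
    """
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Score every document against the query and keep the top-k."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Combine the query with the retrieved context (layout is illustrative)."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


docs = [
    "the vector store holds document embeddings",
    "cosine similarity compares two embeddings",
    "bananas are rich in potassium",
]
query = "how does cosine similarity compare embeddings"
top = retrieve(query, docs, k=2)
prompt = build_prompt(query, top)
print(prompt)
```

Scoring every document, as done here, matches the brute-force description in the steps; the resulting `prompt` string is what would be passed to the LLM in the final step.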

Question 3

What are common mistakes when implementing RAG?

Accepted Answer

RAG requires an up-to-date vector store: stale documents produce stale answers. Retrieval quality is the bottleneck, so better embeddings generally lead to better answers. Chunking strategy matters: chunks that are too small lose the surrounding context, while chunks that are too large dilute the relevant passage with irrelevant text.
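
To make the chunk-size trade-off concrete, here is a minimal sketch of a fixed-size word chunker. The function name `chunk_words` and the `overlap` parameter are assumptions introduced for illustration; overlapping windows are one common way to keep some context across chunk boundaries, not something the answer above prescribes.

```python
def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of `size` words; consecutive chunks share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


text = "one two three four five six seven eight nine ten"
chunks = chunk_words(text, size=4, overlap=1)
for c in chunks:
    print(c)
```

A smaller `size` yields more, narrower chunks that may miss context; a larger `size` yields fewer chunks that each carry more unrelated text. Suitable values depend on the documents and the embedding model.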
