Step through Retrieval-Augmented Generation (RAG): watch a query get embedded, see the most relevant documents retrieved by similarity, then see the LLM prompt augmented with them for a grounded answer.
Retrieve Then Generate
RAG reduces LLM hallucination by grounding generation in retrieved facts. The query is embedded, compared against the document vectors in a vector store, and the top-k most similar documents are injected into the prompt before generation.
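The embed, compare, and inject steps can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline. It assumes a toy bag-of-words embedding in place of a real embedding model and an in-memory list in place of a vector store, and it stops at the augmented prompt without calling an LLM. The names `embed`, `cosine`, `retrieve`, `build_prompt`, and `DOCS` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): lowercase word counts, standing in for a real model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(query, retrieved):
    """Inject the retrieved documents into the prompt ahead of the question."""
    context = "\n".join(f"- {d}" for d in retrieved)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical in-memory "vector store": a plain list of documents.
DOCS = [
    "The Eiffel Tower is in Paris",
    "Python is a programming language",
    "Paris is the capital of France",
]

query = "What is the capital of France"
top_docs = retrieve(query, DOCS, k=2)
prompt = build_prompt(query, top_docs)
print(prompt)  # this string would be sent to the LLM for generation
```

With these three documents, the sentence about the capital of France shares the most words with the query and so ranks first. A real system would replace `embed` with a learned embedding model, so that matches reflect meaning and not just shared words.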