RAG

RAG (Retrieval-Augmented Generation) is an architectural pattern for LLM applications in which, before answering a question, the system fetches relevant passages from a corpus of documents in order to ground the generation in trustworthy sources.

A typical RAG pipeline has three stages: (1) ingestion, in which documents are split into chunks and an embedding is computed for each chunk and stored in a [vector database](/ressources/glossaire-de-la-tech/vector-database); (2) retrieval, in which the chunks semantically closest to each incoming question are fetched; (3) generation, in which the LLM answers the question using the retrieved chunks supplied in its context.
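The three stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: every function name here (`chunk`, `embed`, `build_index`, `retrieve`, `build_prompt`) is hypothetical, a bag-of-words count vector stands in for a real embedding model, and an in-memory list stands in for a vector database. The generation stage stops at assembling the prompt, since the actual LLM call depends on the provider.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 12) -> list[str]:
    """Ingestion: split a document into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents: list[str]) -> list[tuple[str, Counter]]:
    """Ingestion: store each chunk with its embedding (stand-in for a vector database)."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index: list[tuple[str, Counter]], question: str, k: int = 2) -> list[str]:
    """Retrieval: return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Generation (prompt assembly only): place the retrieved chunks in the LLM context."""
    context = "\n".join(f"- {c}" for c in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Made-up sample corpus for illustration.
documents = [
    "The refund policy allows returns within 30 days of purchase.",
    "Support is available by email from Monday to Friday.",
]
index = build_index(documents)
question = "What is the refund policy?"
top_chunks = retrieve(index, question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real system, `embed` would call an embedding model and `retrieve` would query the vector database's nearest-neighbor search; the overall shape of the pipeline stays the same.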

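RAG is a widely used answer to two LLM weaknesses: hallucinations and knowledge that stops at the training cutoff. It is also simpler to keep up to date than fine-tuning, because refreshing the knowledge only requires updating and re-indexing the document corpus, not retraining the model.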

Related terms

LLM

Vector Database

CTO

Data

Embeddings

Fine-tuning
