AImpact
Inference · Beginner
Also known as: Retrieval-Augmented Generation · Generazione aumentata da recupero

RAG

/rag/

A technique that fetches relevant text from an external data source and inserts it into the model's prompt before generating the response.


In practice

It lets an LLM answer from company documents, internal knowledge bases, or up-to-date articles without fine-tuning the model. Because answers are grounded in the retrieved text, it reduces hallucinations on domain-specific data, and knowledge is refreshed by updating the data source rather than re-training. It is usually the first architecture to consider for an enterprise chatbot.
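
The retrieve-then-insert flow in the definition can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: the function names (`retrieve`, `build_prompt`), the sample documents, and the prompt wording are all hypothetical, and retrieval uses simple word-count cosine similarity where a real system would typically use embeddings and a vector database. The model call itself is left out; the sketch stops at the prompt that would be sent to the LLM.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return up to k documents most similar to the query (toy retriever)."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(d))), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the query.
    return [doc for score, doc in scored[:k] if score > 0]


def build_prompt(query, passages):
    """Insert the retrieved passages into the prompt ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Hypothetical knowledge base standing in for company documents.
DOCS = [
    "Employees may work remotely up to three days per week.",
    "Expense reports must be submitted within 30 days.",
    "The office cafeteria is open from 8am to 3pm.",
]

query = "How many days per week can employees work remotely?"
passages = retrieve(query, DOCS, k=1)
prompt = build_prompt(query, passages)  # this string would be sent to the LLM
```

Changing an entry in `DOCS` changes what the model sees on the next query, which is why knowledge can be refreshed without touching the model's weights.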

Seen in the wild

20 entries mentioning it
  1. Cohere Command A: the foundation model that runs on-prem on 2 GPUs
    Medium
  2. KoboldCpp v1.84: native RAG with embedded ChromaDB, no separate servers
    Medium
  3. Oracle OCI Generative AI: Llama 3.1, dedicated clusters, and RAG with Oracle Database 23ai
    Medium
  4. AnythingLLM 1.0: the complete local RAG stack for enterprise use
    High
  5. Dify 0.7: visual agentic workflows with integrated RAG and 10+ LLMs
    Medium
  6. TabbyML: open-source GitHub Copilot alternative with self-hosted codebase RAG
    Medium
  7. KoboldCpp adds integrated RAG: offline all-in-one LLM with documents and character AI
    Medium
  8. Copilot+ PC and Recall: Microsoft tries 'infinite PC memory', privacy backlash erupts
    High
  9. Notion AI Q&A: answers across the entire enterprise workspace with source citation
    Medium
  10. Cohere Command R+: an enterprise-focused model built for RAG and tool use
    Medium
  11. Automatic Prefix Caching in vLLM: Shared KV Cache Across Requests for Near-Zero TTFT
    High
  12. Box AI: questions and summaries on enterprise documents with page citations
    Medium
  13. Indirect Prompt Injection: the attack vector in RAG systems and AI agents
    High
  14. Open WebUI: ChatGPT-style web interface for Ollama with multi-user and history
    High
  15. LlamaIndex 0.10 stable: the standard RAG framework for local LLMs
    Medium
  16. AnythingLLM: full local RAG with web UI and embedded vector DB
    Medium
  17. SuperAGI: the first open-source autonomous agent platform with a GUI
    Medium
  18. privateGPT: chat with your documents, completely offline
    High
  19. RETRO: DeepMind foreshadows RAG with retrieval over 2 trillion tokens
    High
  20. RAG: Retrieval-Augmented Generation enters the literature
    Landmark