Models Beginner Also known as: Large Language Model · Modello linguistico di grandi dimensioni

LLM

/el-el-em/

An AI model trained on huge amounts of text to predict the next token (roughly, the next word or word fragment) and, by repeating that prediction, generate natural-language responses.
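To make "predict the next word" concrete, here is a toy sketch. It is not how a real LLM works internally: real models use neural networks trained on vastly more text, while this example only counts which word follows which in a tiny made-up corpus. The function names and the corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.lower().split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follows.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows, start, max_words=5):
    """Generate text by repeatedly appending the predicted next word."""
    out = [start.lower()]
    for _ in range(max_words):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Tiny made-up corpus, purely for illustration.
corpus = "the cat sat on the mat . the cat ate the fish ."
model = train_bigram(corpus)

print(predict_next(model, "the"))          # cat
print(generate(model, "on", max_words=2))  # on the cat
```

The loop in `generate` mirrors the basic idea of text generation: predict one step, append it, and predict again from the extended text.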

In practice

It is the engine behind ChatGPT, Claude, and Gemini. When you embed an LLM into your product through a hosted API, you typically pay per token, for both the text you send and the text the model writes back, and get a service that reads and writes text. Quality depends heavily on the model you choose and the prompt you give it.
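As a rough sketch of what per-token pricing means for a budget, the arithmetic looks like this. The prices are hypothetical placeholders, not any provider's real rates, and the "about 4 characters per token" rule is only a common approximation for English text; actual token counts depend on the model's tokenizer.

```python
def estimate_tokens(text, chars_per_token=4.0):
    """Rough token count; real tokenizers vary by model and language."""
    return max(1, round(len(text) / chars_per_token))


def estimate_cost(input_tokens, output_tokens,
                  usd_per_million_input, usd_per_million_output):
    """Cost in USD when input and output tokens are billed at separate rates."""
    total = (input_tokens * usd_per_million_input
             + output_tokens * usd_per_million_output)
    return total / 1_000_000


# Hypothetical prices, for illustration only.
PRICE_IN = 2.0   # USD per million input tokens (assumed)
PRICE_OUT = 8.0  # USD per million output tokens (assumed)

prompt = "Summarize this support ticket in two sentences. " * 20
prompt_tokens = estimate_tokens(prompt)
cost = estimate_cost(prompt_tokens, 500, PRICE_IN, PRICE_OUT)
print(f"~{prompt_tokens} input tokens, estimated cost ${cost:.6f} per request")
```

Multiplying the per-request figure by expected daily traffic gives a first-order budget before committing to a model.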

Seen in the wild

62 entries mentioning it

May 13, 2026

Mistral releases Devstral Small: 7B coding model for agentic tasks on consumer GPU

Medium
April 20, 2026

Figure AI releases Figure 02 autonomy update: fully autonomous warehouse picking without human teleoperation

High
March 5, 2026

Ollama 0.9: concurrent model serving, multi-GPU split, and REST API v2 for local AI

Medium
August 14, 2025

Local AI 2025: Ollama, MLX LM, Apple Foundation Models triple the speed

Medium
July 8, 2025

Private LLM: models up to 7B directly on iPhone and Mac, fully offline

Medium
July 2, 2025

vLLM v0.7: chunked prefill by default and a redesigned V1 engine

Medium
May 1, 2025

NVIDIA NIM 1.0: Containerized LLM Inference with OpenAI-Compatible API

High
April 14, 2025

WebLLM and LLM in WASM: browser-based LLM inference via WebGPU, no server needed

Medium
April 8, 2025

Continuous Batching for LLM Serving: survey and state of the art of Orca, vLLM, SGLang, TGI

Medium
March 20, 2025

DeepMind: 60+ cases of Specification Gaming in LLMs documented

High
January 22, 2025

FlashInfer 0.2: attention library for LLM serving with paged KV cache and RoPE fusion

Medium
January 8, 2025

Prefill/decode disaggregation: separate GPUs for low TTFT and high throughput

High
September 10, 2024

KV Cache Quantization FP8/INT8: Double User Density per GPU

High
September 1, 2024

AnythingLLM 1.0: the complete local RAG stack for enterprise use

High
August 5, 2024

LLM Compressor: unified toolkit for quantization and sparsity with native vLLM integration

Medium
July 18, 2024

CyberSecEval 2: Meta's LLM cybersecurity benchmark

Medium
July 15, 2024

Dify 0.7: visual agentic workflows with integrated RAG and 10+ LLMs

Medium
July 15, 2024

DrEureka: LLM automates simulation-to-real transfer without manual tuning

Medium
July 1, 2024

NeMo Guardrails 0.8: NVIDIA's framework for adding safety rails to any LLM

Medium
May 14, 2024

Microsoft RoboGen: generating robot tasks, skills and environments from text

Medium
