2021

31 entries

December 20, 2021 High

GLIDE: OpenAI shifts from autoregressive to text-guided diffusion

OpenAI publishes GLIDE, a text-to-image diffusion model with classifier-free guidance — technical foundation for DALL·E 2 and the models that follow.

Image & Video Gen | OpenAI, GLIDE, Diffusion

December 16, 2021 High

WebGPT: OpenAI teaches GPT-3 to browse the web

OpenAI publishes WebGPT, a GPT-3 fine-tune that learns to use a text browser to search the web for answers with source citations, trained via imitation learning + RLHF.

Agents | OpenAI, WebGPT, Browsing

December 8, 2021 High

Gopher 280B: DeepMind officially enters the LLM race

DeepMind releases Gopher, a 280B dense model, alongside a systematic 152-task study and a companion paper on the ethical and social risks of language models.

Foundation Models | DeepMind, Gopher, Scaling

December 8, 2021 High

RETRO: DeepMind scales retrieval augmentation to a 2-trillion-token database

DeepMind publishes RETRO, a 7.5B-parameter model that retrieves relevant passages from a 2T-token database at inference, with performance comparable to models with 25x more parameters.

Foundation Models | DeepMind, RETRO, Retrieval

November 18, 2021 High

OpenAI drops the waitlist: GPT-3 API available to all

Eighteen months after the GPT-3 paper, OpenAI removes the API access waitlist and lets any developer sign up, accelerating mainstream adoption of foundation models.

Enterprise AI | OpenAI, API, GPT-3

October 29, 2021 Medium

Replit Ghostwriter: AI coding in the browser, zero setup

First AI coding tool integrated into a browser IDE: intelligent code completion for students and developers with no local configuration required.

AI Coding | Code Completion, Browser IDE, AI Assistant

October 28, 2021 Medium

Pathways: Google sketches a next-generation AI architecture

Jeff Dean outlines Pathways, Google's unified architecture for sparse, multitask, multimodal models — the infrastructure foundation that will power PaLM and Gemini.

AI Infrastructure | Google, Pathways, Multitask

October 21, 2021 High

FLAN: instruction tuning that teaches models to follow directions

Google shows that training a model on 60+ tasks framed as instructions dramatically improves zero-shot performance on unseen tasks.

Foundation Models | FLAN, instruction tuning, zero-shot

October 21, 2021 Medium

PyTorch 1.10: CUDA Graphs, FX, and the maturing of the dominant framework

Meta releases PyTorch 1.10 with CUDA Graphs integration, FX-based quantization, and TorchScript improvements, consolidating the framework's lead in AI research and production.

AI Infrastructure | PyTorch, Framework, CUDA Graphs

October 11, 2021 High

Megatron-Turing NLG 530B: Microsoft and NVIDIA scale dense past GPT-3

Microsoft and NVIDIA announce MT-NLG, a 530B-parameter dense model trained with DeepSpeed and Megatron-LM, at the time the largest dense LM ever produced.

Foundation Models | Microsoft, NVIDIA, Megatron

September 29, 2021 Low

Copilot Labs: GitHub opens a sandbox for experimental features

GitHub introduces Copilot Labs, a VS Code extension hosting experimental features beyond simple autocomplete: code explanation, language translation, test generation.

AI Coding | GitHub, Copilot Labs, Code Explain

September 9, 2021 Medium

HuBERT: Meta brings self-supervised learning to speech, foreshadows Whisper

Meta AI publishes HuBERT, a self-supervised audio model based on masked prediction of discrete clusters — conceptual base for Whisper, w2v-BERT and audio-multimodal models.

Voice & Audio | Facebook, Meta, AV-HuBERT

August 31, 2021 Medium

Copilot lands on JetBrains and Neovim

GitHub extends the Copilot technical preview to the main JetBrains IDEs (IntelliJ, PyCharm, GoLand, WebStorm) and to Neovim, taking AI coding outside the VS Code ecosystem.

AI Coding | GitHub, Copilot, JetBrains

August 16, 2021 High

On the Opportunities and Risks of Foundation Models: Stanford coins the term

Stanford's Center for Research on Foundation Models publishes a 200+ page report coining the term foundation models, now standard in technical, academic and regulatory discourse.

Foundation Models | Stanford, CRFM, Foundation Models

August 10, 2021 High

Codex API: OpenAI opens access to the model behind Copilot

OpenAI releases the Codex API in private beta, giving developers direct access to the code generation model behind GitHub Copilot, free during the beta.

AI Coding | OpenAI, Codex, API

July 28, 2021 Medium

OpenAI Triton: writing GPU kernels in Python becomes practical

OpenAI releases Triton, a Python-like language and compiler for writing custom GPU kernels at performance close to hand-written CUDA — dramatically lowering the barrier for model optimization.

AI Infrastructure | OpenAI, Triton, GPU

July 15, 2021 High

AlphaFold 2: open code and database, biology accelerates

DeepMind publishes AlphaFold 2 code and weights on GitHub and, with EMBL-EBI, releases a database with predicted structures for 350,000 human and model-organism proteins.

AI Infrastructure | DeepMind, AlphaFold, Protein Folding

July 12, 2021 High

Megatron-LM v2: 3D parallelism for 530-billion-parameter models

NVIDIA adds interleaved pipeline scheduling to Megatron-LM and combines it with tensor and data parallelism, enabling training of the 530B-parameter MT-NLG on 2240 A100 GPUs with Microsoft.

AI Infrastructure | Megatron-LM, 3D parallelism, pipeline parallelism

July 7, 2021 High

Codex paper: OpenAI publishes HumanEval and the model behind Copilot

OpenAI releases "Evaluating Large Language Models Trained on Code", describing Codex (the model powering GitHub Copilot) and introducing HumanEval, a set of 164 hand-written programming problems that became a standard benchmark for code generation.

AI Coding | OpenAI, Codex, HumanEval

June 29, 2021 High

GitHub Copilot: autocomplete grows up

GitHub and OpenAI launch a technical preview of an assistant that suggests entire lines and functions right in the editor, based on a GPT-3-derived model trained on public code.

AI Coding | GitHub, Copilot, Codex

June 15, 2021 High

VITS: end-to-end TTS with a variational autoencoder

VITS unifies the acoustic model and vocoder into a single end-to-end model, achieving quality surpassing Tacotron 2 with faster inference.

Voice & Audio | VITS, TTS, end-to-end

June 4, 2021 High

GPT-J 6B: the open source model that matches GPT-3 Curie on many benchmarks

EleutherAI releases GPT-J, a 6B-parameter model trained in JAX on TPUs, with performance comparable to GPT-3 Curie and shipped under Apache 2.0.

Open Source Models | EleutherAI, GPT-J, Open Source

June 1, 2021 High

The Pile: the 825 GB open dataset that fuels the open LLM era

EleutherAI publishes The Pile, an 825 GB dataset built from 22 diverse sub-datasets — the base for GPT-Neo, GPT-J, Pythia and much of the early open source ecosystem.

Open Source Models | EleutherAI, The Pile, Dataset

June 1, 2021 Medium

Wu Dao 2.0: China announces a 1.75T-parameter model

BAAI (Beijing Academy of Artificial Intelligence) introduces Wu Dao 2.0, a 1.75 trillion-parameter multimodal Mixture of Experts model — China's response to GPT-3 and Switch Transformer.

Foundation Models | BAAI, Wu Dao, China

May 28, 2021 Landmark

Anthropic: an AI safety-focused lab is born

Dario and Daniela Amodei, former VP of Research and VP of Safety at OpenAI, co-found Anthropic with a group of researchers, explicitly focused on AI safety and interpretability.

AI Security | Anthropic, AI Safety, Founding

May 18, 2021 Medium

MUM: Google unveils the multitask model for Search

At Google I/O, Google announces MUM (Multitask Unified Model), a T5-based model that it claims is 1000x more powerful than BERT and that handles 75 languages and multimodal content.

Multimodal AI | Google, MUM, Search

May 18, 2021 High

LaMDA: Google unveils its dialogue model

At Google I/O, Sundar Pichai introduces LaMDA (Language Model for Dialogue Applications), a 137B-parameter model fine-tuned for dialogue and the direct ancestor of Bard.

Foundation Models | Google, LaMDA, Dialogue

April 15, 2021 Medium

OpenAI Content Filter: first integrated AI-side moderation infrastructure

OpenAI ships the content filter endpoint to classify GPT-3 outputs as safe/sensitive/unsafe — the first integrated moderation tool inside a commercial foundation-model API.

AI Security | OpenAI, Content Filter, Safety

March 22, 2021 High

GPT-Neo: the first open source clone of GPT-3

EleutherAI releases GPT-Neo 1.3B and 2.7B, open source language models trained on The Pile — the first serious attempt to replicate the GPT-3 architecture with public weights.

Open Source Models | EleutherAI, GPT-Neo, Open Source

January 12, 2021 High

Switch Transformer: Google scales to 1.6T parameters with Mixture of Experts

Google Brain publishes Switch Transformer, a sparse model with 1.6 trillion parameters that activates only one expert per token, proving sparse routing can scale beyond dense models.

Foundation Models | Google, MoE, Sparse

January 5, 2021 High

DALL·E and CLIP: text and images finally talk

OpenAI announces DALL·E (generates images from text) and CLIP (aligns images and text in the same semantic space) side by side. Two pieces of the multimodal puzzle.

Multimodal AI | OpenAI, DALL-E, CLIP