Reading path
Cloud & solution architect: deploying AI on cloud infrastructure
AWS Bedrock, Azure Copilot, GPU scalability and LLM costs: the milestones that reshape architecture.
You are a cloud or solution architect responsible for bringing language models into production on managed cloud infrastructure. This path connects the hardware foundations (GPUs, dedicated chips) with the cloud services that expose them, up to the latest architectural standards for agents and integration, with close attention to costs and regulatory governance.
- 01
Why it matters to you
The GPU that made large-scale cloud training and inference economically viable: understanding its architecture explains why per-token prices from major cloud providers dropped so rapidly.
Landmark · AI Infrastructure
NVIDIA A100: Ampere arrives and the GPU that trains GPT-3
At GTC 2020, Jensen Huang announces the A100 GPU, built on the Ampere architecture: 54 billion transistors, 40 GB of HBM2 memory (an 80 GB HBM2e variant follows later), TF32, 2:4 structured sparsity, and MIG (Multi-Instance GPU) support.
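To make "2:4 structured sparsity" concrete: in every group of four consecutive weights, at most two are non-zero, which lets sparse-capable Tensor Cores skip the zeros. The sketch below is only an illustration in plain Python. The function name `prune_2_of_4` is hypothetical, and keeping the two largest-magnitude values is an assumed selection rule (a common choice, not something the hardware requires).

```python
def prune_2_of_4(weights):
    """Return a copy of `weights` that satisfies the 2:4 pattern.

    Assumption: within each group of four, the two largest-magnitude
    values are kept and the other two are set to zero.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = prune_2_of_4(row)
print(sparse_row)
```

Every group of four in `sparse_row` now holds exactly two zeros, so only half of the values (plus small position metadata) need to be stored and multiplied. Whether accuracy survives such pruning depends on the model and usually requires fine-tuning.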
- 02
Why it matters to you
The generational leap that brought the Transformer Engine into cloud data centers: essential for evaluating high-end GPU instances on AWS, Azure and GCP and knowing when they justify the cost.
Landmark · AI Infrastructure
NVIDIA H100 and Hopper architecture: the foundation-model GPU
At GTC 2022 NVIDIA unveils the Hopper architecture and the H100 GPU, with FP8 Transformer Engine and NVLink 4. It will become the hardware substrate for nearly every large LLM of the following years.
- 03
Why it matters to you
The moment when deploying open-weight models becomes a standard cloud operation with managed SLAs: the reference point for comparing build-vs-buy on any provider.
Medium · AI Infrastructure
Hugging Face Inference Endpoints: deploy LLMs in two clicks
Hugging Face launches Inference Endpoints, a managed service to deploy Hub models on AWS, Azure or GCP with autoscaling, on-demand GPUs and private endpoints.
- 04
Why it matters to you
AWS opens managed access to frontier models via API without infrastructure management: the paradigm shift that turns LLM deployment from an operational problem into an architectural decision.
High · AI Infrastructure
AWS Bedrock: managed multi-model AI on Amazon cloud
AWS announces Bedrock, a managed service that exposes Claude (Anthropic), Jurassic-2 (AI21 Labs), Stable Diffusion (Stability AI), and Amazon's own Titan models through a single API. It is AWS's answer to Azure OpenAI Service.
- 05
Why it matters to you
Microsoft integrates Copilot across the entire Azure and M365 stack: for a solution architect working on the Microsoft ecosystem this is the moment AI moves from pilot project to enterprise architecture standard.
High · Enterprise AI
Microsoft Build 2023: Copilot everywhere, a shared plugin standard
At Build 2023, Microsoft announces Windows Copilot and Copilot in Edge and Microsoft 365, and adopts OpenAI's plugin standard. The strategy: the 'AI co-pilot' as the primary user interface.
- 06
Why it matters to you
The European regulatory framework that classifies AI systems by risk and imposes transparency and governance obligations: every cloud architecture handling EU data must treat it as a hard project constraint.
Landmark · AI Security
EU AI Act: European Parliament adopts the first comprehensive AI law
The European Parliament formally adopts the AI Act, the world's first comprehensive AI law, with a risk-based approach and specific obligations for foundation models.
- 07
Why it matters to you
The open standard defining how agents and tools connect regardless of cloud provider: building it into your architecture prevents vendor lock-in on orchestration and simplifies multi-cloud integration.
High · AI Infrastructure
Model Context Protocol: the open standard to connect LLMs and data
Anthropic open-sources the Model Context Protocol (MCP), a JSON-RPC-based standard that lets AI assistants talk to tools, file systems, databases, and SaaS applications without ad-hoc integrations for each model.
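Because MCP messages are JSON-RPC 2.0, the wire format is easy to picture. The sketch below builds two client requests in Python: one to discover a server's tools and one to invoke a tool. The helper `jsonrpc_request`, the tool name `query_database`, and its `sql` argument are hypothetical; the method names `tools/list` and `tools/call` should be checked against the revision of the MCP specification you target. The initialization handshake and the transport (stdio or HTTP) are deliberately left out.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC only requires that each id is unique per session.
_ids = count(1)


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


# Ask the server which tools it exposes.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; "query_database" and its argument are illustrative only.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "query_database", "arguments": {"sql": "SELECT 1"}},
)

print(json.dumps(list_tools))
print(json.dumps(call_tool, indent=2))
```

The architectural point is that the request shape does not depend on which model or cloud provider sits behind the client, which is what makes the protocol useful against orchestration lock-in.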