Cloud & solution architect: deploying AI on cloud infrastructure

AWS Bedrock, Azure Copilot, GPU scalability and LLM costs: the milestones that reshape architecture.

You are a cloud or solution architect responsible for bringing language models into production on managed cloud infrastructure. This path connects the hardware foundations (GPUs, dedicated chips) with the cloud services that expose them, up to the latest architectural standards for agents and integration, with close attention to costs and regulatory governance.

01

Why it matters to you

The GPU that made large-scale cloud training and inference economically viable: understanding its architecture helps explain why per-token prices from the major cloud providers fell so quickly.

May 14, 2020 Landmark AI Infrastructure

NVIDIA A100: Ampere arrives, the GPU of the GPT-3 era

At GTC 2020 Jensen Huang announces the A100 GPU built on the Ampere architecture: 54 billion transistors, 40 GB of HBM2 at launch (an 80 GB HBM2e variant follows in November 2020), TF32, 2:4 structured sparsity, and MIG (Multi-Instance GPU) support.
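
Of the features listed, 2:4 structured sparsity is the easiest to illustrate in a few lines: in every group of four consecutive weights, at most two may be non-zero, which lets the sparse tensor cores skip the zeros. The sketch below is a minimal illustration, assuming magnitude-based selection (keep the two largest absolute values in each group). The function name `prune_2_of_4` and the sample weights are hypothetical, and real pruning is normally done by framework tooling rather than a hand-written loop.

```python
from typing import List


def prune_2_of_4(weights: List[float]) -> List[float]:
    """Zero the two smallest-magnitude values in every group of four.

    Assumption: magnitude-based selection. The 2:4 pattern itself only
    requires that at most two of every four consecutive values are non-zero.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned: List[float] = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Positions of the two largest magnitudes within this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


# Hypothetical weight row, two groups of four.
row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = prune_2_of_4(row)
print(sparse_row)
```

Half of the values in each group are zeroed, which is why the pattern is often described as 50% sparsity with a fixed, hardware-friendly layout.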
02

Why it matters to you

The generational leap that brought the Transformer Engine into cloud data centers: essential for evaluating high-end GPU instances on AWS, Azure and GCP and knowing when they justify the cost.

March 22, 2022 Landmark AI Infrastructure

NVIDIA H100 and Hopper architecture: the foundation-model GPU

At GTC 2022 NVIDIA unveils the Hopper architecture and the H100 GPU, with an FP8-capable Transformer Engine and fourth-generation NVLink. It goes on to become the hardware substrate for most large language models of the following years.
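
Whether a higher-end instance justifies its price usually comes down to simple arithmetic: cost per token is the hourly price divided by the tokens served in an hour. The sketch below shows that calculation. The function name `cost_per_million_tokens` and every price and throughput figure are hypothetical placeholders, not provider quotes or benchmarks; substitute measured throughput for your own model and batch size.

```python
def cost_per_million_tokens(hourly_price_usd: float, tokens_per_second: float) -> float:
    """Serving cost in USD per one million generated tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000


# Hypothetical figures only: an older instance versus a newer one that
# costs twice as much per hour but sustains three times the throughput.
older = cost_per_million_tokens(hourly_price_usd=4.0, tokens_per_second=1000.0)
newer = cost_per_million_tokens(hourly_price_usd=8.0, tokens_per_second=3000.0)

print(f"older instance: ${older:.2f} per 1M tokens")
print(f"newer instance: ${newer:.2f} per 1M tokens")
print("newer instance is cheaper per token" if newer < older else "older instance is cheaper per token")
```

Under these assumed numbers the more expensive instance wins per token, but only if the workload keeps it busy; at low utilization the hourly price dominates and the comparison can reverse.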
03

Why it matters to you

The moment when deploying open-weight models becomes a standard cloud operation with managed SLAs: the reference point for comparing build-vs-buy on any provider.

September 27, 2022 Medium AI Infrastructure

Hugging Face Inference Endpoints: deploy LLMs in two clicks

Hugging Face launches Inference Endpoints, a managed service to deploy Hub models on AWS, Azure or GCP with autoscaling, on-demand GPUs and private endpoints.
04

Why it matters to you

AWS opens managed access to frontier models via API without infrastructure management: the paradigm shift that turns LLM deployment from an operational problem into an architectural decision.

April 13, 2023 High AI Infrastructure

AWS Bedrock: managed multi-model AI on Amazon cloud

AWS announces Bedrock, a managed service exposing Claude (Anthropic), Jurassic-2 (AI21 Labs), Stable Diffusion (Stability AI), and its own Titan models through a single API. It is AWS's answer to Azure OpenAI Service.
05

Why it matters to you

Microsoft integrates Copilot across the entire Azure and M365 stack: for a solution architect working on the Microsoft ecosystem this is the moment AI moves from pilot project to enterprise architecture standard.

May 23, 2023 High Enterprise AI

Microsoft Build 2023: Copilot everywhere, a shared plugin standard

At Build 2023 Microsoft announces Windows Copilot, Copilot in Edge and Microsoft 365, and adopts OpenAI's plugin standard. The strategy: the 'AI copilot' as the primary user interface.
06

Why it matters to you

The European regulatory framework that classifies AI systems by risk and imposes transparency and governance obligations: every cloud architecture that serves AI systems to the EU market must treat it as a hard project constraint.

March 13, 2024 Landmark AI Security

EU AI Act: European Parliament adopts the first comprehensive AI law

The European Parliament formally adopts the AI Act, the world's first comprehensive AI law, with a risk-based approach and specific obligations for general-purpose AI models, the category that covers foundation models.
07

Why it matters to you

The open standard defining how agents and tools connect regardless of cloud provider: building it into your architecture prevents vendor lock-in on orchestration and simplifies multi-cloud integration.

November 25, 2024 High AI Infrastructure

Model Context Protocol: the open standard to connect LLMs and data

Anthropic open-sources the Model Context Protocol (MCP), a JSON-RPC standard that lets AI assistants talk to tools, file systems, databases, and SaaS without per-model ad-hoc integrations.
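
Because MCP messages are JSON-RPC 2.0, a tool invocation is an ordinary JSON envelope that any cloud runtime can produce and parse. The sketch below builds a `tools/call` request and unwraps a response using only the standard library. The helper names, the tool name `query_inventory`, its arguments, and the simulated response are hypothetical; the method name and result shape follow the published MCP specification at the time of writing, so check them against the current spec revision before relying on them.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # JSON-RPC requests carry a unique id per call


def make_request(method: str, params: Dict[str, Any]) -> str:
    """Serialize a JSON-RPC 2.0 request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params,
    })


def parse_response(raw: str) -> Any:
    """Return the result of a JSON-RPC 2.0 response, or raise on an error."""
    message = json.loads(raw)
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in message:
        err = message["error"]
        raise RuntimeError(f"{err['code']}: {err['message']}")
    return message["result"]


# Hypothetical tool and arguments, for illustration only.
request = make_request(
    "tools/call",
    {"name": "query_inventory", "arguments": {"sku": "A-100"}},
)
print(request)

# Simulated server reply; a real one would arrive over stdio or HTTP.
simulated_reply = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "12 units"}]},
})
result = parse_response(simulated_reply)
print(result["content"][0]["text"])
```

Keeping the envelope provider-neutral like this is what allows the same tool server to sit behind different model hosts without a per-model adapter.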
