AImpact

Reading path

Cloud & solution architect: deploying AI on cloud infrastructure

AWS Bedrock, Azure Copilot, GPU scalability and LLM costs: the milestones that reshape architecture.

You are a cloud or solution architect responsible for bringing language models into production on managed cloud infrastructure. This path connects the hardware foundations (GPUs, dedicated chips) with the cloud services that expose them, up to the latest architectural standards for agents and integration, with close attention to costs and regulatory governance.

  1. 01

    Why it matters to you

    The GPU that made large-scale cloud training and inference economically viable: understanding its architecture explains why per-token prices from major cloud providers dropped so rapidly.

    Landmark AI Infrastructure

    NVIDIA A100: Ampere arrives and the GPU that trains GPT-3

    At GTC 2020 Jensen Huang announces the A100 GPU built on the Ampere architecture: 54 billion transistors, 40 GB of HBM2 memory (an 80 GB HBM2e variant follows later), TF32, 2:4 structured sparsity, and MIG support.

  2. 02

    Why it matters to you

    The generational leap that brought the Transformer Engine into cloud data centers: essential for evaluating high-end GPU instances on AWS, Azure and GCP and knowing when they justify the cost.

    Landmark AI Infrastructure

    NVIDIA H100 and Hopper architecture: the foundation-model GPU

    At GTC 2022 NVIDIA unveils the Hopper architecture and the H100 GPU, with an FP8 Transformer Engine and NVLink 4. It becomes the hardware substrate for most large language models trained in the following years.

  3. 03

    Why it matters to you

    The moment when deploying open-weight models becomes a standard cloud operation with managed SLAs: the reference point for comparing build-vs-buy on any provider.

    Medium AI Infrastructure

    Hugging Face Inference Endpoints: deploy LLMs in two clicks

    Hugging Face launches Inference Endpoints, a managed service to deploy Hub models on AWS, Azure or GCP with autoscaling, on-demand GPUs and private endpoints.

  4. 04

    Why it matters to you

    AWS opens managed access to frontier models via API without infrastructure management: the paradigm shift that turns LLM deployment from an operational problem into an architectural decision.

    High AI Infrastructure

    AWS Bedrock: managed multi-model AI on Amazon cloud

    AWS announces Bedrock, a managed service exposing Claude (Anthropic), Jurassic-2 (AI21), Stable Diffusion (Stability AI), and its own Titan models through a single API. It is AWS's answer to Azure OpenAI.

  5. 05

    Why it matters to you

    Microsoft integrates Copilot across the entire Azure and M365 stack: for a solution architect working in the Microsoft ecosystem, this is the moment AI moves from pilot project to enterprise architecture standard.

    High Enterprise AI

    Microsoft Build 2023: Copilot everywhere, a shared plugin standard

    At Build 2023 Microsoft announces Windows Copilot, Copilot in Edge and Microsoft 365, and adopts OpenAI's plugin standard. The strategy: the 'AI copilot' as the primary user interface.

  6. 06

    Why it matters to you

    The European regulatory framework that classifies AI systems by risk and imposes transparency and governance obligations: every cloud architecture that serves AI systems to users in the EU must treat it as a hard project constraint.

    Landmark AI Security

    EU AI Act: European Parliament adopts the first comprehensive AI law

    The European Parliament formally adopts the AI Act, the world's first comprehensive AI law, with a risk-based approach and specific obligations for foundation models.

  7. 07

    Why it matters to you

    The open standard defining how agents and tools connect regardless of cloud provider: building it into your architecture reduces vendor lock-in on orchestration and simplifies multi-cloud integration.

    High AI Infrastructure

    Model Context Protocol: the open standard to connect LLMs and data

    Anthropic open-sources the Model Context Protocol (MCP), a JSON-RPC standard that lets AI assistants talk to tools, file systems, databases, and SaaS without per-model ad-hoc integrations.
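
To make the "JSON-RPC standard" in the MCP entry concrete, here is a minimal sketch of what a tool invocation might look like on the wire. It assumes the JSON-RPC 2.0 envelope and the `tools/call` method name from the public MCP specification; the tool name `query_database`, its arguments, and the helper functions are hypothetical and only for illustration. It builds and checks messages locally and does not talk to a real server.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC pairs each response with its request by id.
_ids = count(1)


def make_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request shaped like an MCP tool invocation.

    The "tools/call" method and the {"name", "arguments"} params follow the
    MCP specification as assumed here; the tool itself is hypothetical.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def is_success(response, request):
    """Return True if the response answers the request and carries a result."""
    return (
        response.get("jsonrpc") == "2.0"
        and response.get("id") == request["id"]
        and "result" in response
        and "error" not in response
    )


# Hypothetical tool and arguments, for illustration only.
request = make_tool_call("query_database", {"sql": "SELECT 1"})
wire = json.dumps(request)  # what would be sent over the transport
print(wire)
```

Because the envelope is the same for every tool, the orchestration layer can stay provider-neutral: swapping the model or the cloud behind it does not change how tools are called.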