MLX: Apple Research brings native machine learning to Apple Silicon

In one sentence: Apple Research has released MLX, an open-source ML framework optimized for M1/M2/M3 that uses unified CPU-GPU memory to run LLM inference at near-discrete-GPU performance.

Apple Silicon chips (M1, M2, M3) have a distinctive feature: unified memory, in which the CPU and GPU share the same physical memory. A model loaded into RAM is therefore already available to the GPU, with no expensive copies between separate memory pools. MLX is the framework Apple Research built to fully exploit this architecture.

Before MLX, LLM inference on the Mac relied on llama.cpp with its Metal backend. That worked well, but llama.cpp is a C/C++ project and was not designed for writing models in Python. MLX offers the same efficiency through an API that resembles NumPy and PyTorch, which makes it easy to write and experiment with new models.

The result: on an M2 Pro MacBook with 32 GB of RAM, 13B- or 34B-parameter models run at practical speeds, with no external GPU and no excessive power draw.
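
To see why models of this size can fit in 32 GB at all, here is a rough back-of-the-envelope sketch of weight memory at different precisions. The bit widths (16, 8, 4) are assumptions chosen for illustration, since the text above does not say which precision is used, and the helper name weight_memory_gb is hypothetical. The estimate covers weights only: it ignores the KV cache, activations, and the memory the operating system keeps for itself.

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


RAM_GB = 32  # the machine described above

for n_params in (13e9, 34e9):
    for bits in (16, 8, 4):  # illustrative precisions, not taken from the source
        gb = weight_memory_gb(n_params, bits)
        verdict = "under" if gb < RAM_GB else "over"
        print(f"{n_params / 1e9:.0f}B @ {bits}-bit: {gb:.1f} GB ({verdict} {RAM_GB} GB)")
```

Under these assumptions, a 13B model at 16 bits needs about 26 GB for its weights, while a 34B model needs about 68 GB at 16 bits and about 17 GB at 4 bits, so the larger model would have to be quantized to fit on this machine.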