The substance of GPT is the model itself: stacked layers of attention and feed-forward blocks. The architecture is what makes arbitrary patterns expressible at all.
GPT
Generative. Pre-trained. Transformer. Three letters that rewrote a decade.
GPT is the three-stage assembly line that built the era. First the Transformer — an architecture that can mix any context into any output. Then pre-training — staring at the world's text long enough to absorb its statistical shape. Then fine-tuning — being polished into a useful, aligned assistant by a much smaller, carefully chosen second pass. Skip any of the three and you do not get the system everyone is using. Together they are AI's industrial trinity.
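To make the first stage concrete, here is a minimal sketch of one GPT-style decoder block, assuming PyTorch; the sizes, names, and pre-norm layout are illustrative defaults, not the configuration of any particular GPT.

```python
# A minimal sketch of one GPT-style decoder block (PyTorch).
# Dimensions and layout are illustrative, not any specific GPT's config.
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model: int = 768, n_heads: int = 12):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(               # position-wise feed-forward
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Causal mask: each position may attend only to itself and the past.
        T = x.size(1)
        causal = torch.triu(
            torch.ones(T, T, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        attn_out, _ = self.attn(h, h, h, attn_mask=causal)
        x = x + attn_out                        # residual: mix context into x
        x = x + self.ffn(self.ln2(x))           # residual: transform per position
        return x
```

Stack a few dozen of these blocks between token-plus-position embeddings and a final projection onto the vocabulary, and that is essentially the whole "T".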
Months of compute spent predicting the next token across the open web, books, code, and conversation. Pre-training is the knowledge axis: it gives the model a vast common-sense substrate to draw on.
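Concretely, "predicting the next token" is a single objective applied at every position: cross-entropy between the model's output at step t and the token that actually appears at step t+1. A minimal sketch, assuming PyTorch and a batch of already-tokenized text:

```python
# The pre-training objective in one function: score the model's output at
# position t against the token that actually appears at position t+1.
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq, vocab) model outputs; tokens: (batch, seq) int IDs."""
    pred = logits[:, :-1, :].reshape(-1, logits.size(-1))  # drop last position
    target = tokens[:, 1:].reshape(-1)                     # drop first token
    return F.cross_entropy(pred, target)

# Stand-in tensors; in real pre-training, logits come from the model
# and tokens from the corpus.
logits = torch.randn(2, 16, 50257)
tokens = torch.randint(0, 50257, (2, 16))
loss = next_token_loss(logits, tokens)
```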
A far smaller pass of carefully labeled examples and preference comparisons, via methods like RLHF, RLAIF, and DPO, turns a general statistical machine into something useful, polite, and aligned with human preferences.
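Of the methods named above, DPO is the easiest to state in full: raise the policy's preference for human-chosen answers over rejected ones, measured against a frozen reference copy of the pre-trained model. A sketch under that reading, assuming PyTorch; the argument names and the beta value are illustrative:

```python
# Direct Preference Optimization (DPO) loss on one batch of preference pairs.
import torch.nn.functional as F

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Each argument: (batch,) tensor of summed log-probs for a full response.
    pol_* come from the model being tuned, ref_* from a frozen reference."""
    # Implicit reward: how far the policy has moved away from the reference.
    margin = (pol_chosen - ref_chosen) - (pol_rejected - ref_rejected)
    # Maximize the probability that the chosen answer outranks the rejected one.
    return -F.logsigmoid(beta * margin).mean()
```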
How this trinity came to be.
- 2018, GPT-1: OpenAI ships a 117M-parameter Transformer pre-trained on books. The pattern is announced.
- 2020, GPT-3: Scale meets emergence. 175B parameters. The world realizes pre-training is the binding axis.
- 2022, ChatGPT: Fine-tuning makes the system humans actually want to talk to. The third axis closes the loop.
How to use this lens today.
- Every modern frontier model — Claude, Gemini, Grok, DeepSeek — uses the same three-stage assembly. The competition is on the margins of each.
- Domain models (medicine, law, code) reuse stages 1 and 2 and replace stage 3; see the sketch after this list.
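A hedged sketch of that swap, in PyTorch: stages 1 and 2 arrive as pre-trained weights, and stage 3 is just more next-token training on a domain corpus at a small learning rate. The toy two-layer model, the checkpoint path, and the random batches standing in for domain text are all hypothetical placeholders:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab, d_model = 50257, 768
# Toy stand-in for a real GPT; in practice you would load the full model.
model = nn.Sequential(nn.Embedding(vocab, d_model), nn.Linear(d_model, vocab))
# model.load_state_dict(torch.load("pretrained.pt"))  # stages 1 + 2, reused as-is

# Fake "domain corpus": random token IDs standing in for, say, medical Q&A.
domain_batches = [torch.randint(0, vocab, (4, 128)) for _ in range(2)]

opt = torch.optim.AdamW(model.parameters(), lr=1e-5)  # small LR: polish, not relearn
for tokens in domain_batches:
    logits = model(tokens)                             # (batch, seq, vocab)
    loss = F.cross_entropy(                            # same next-token objective
        logits[:, :-1, :].reshape(-1, vocab),
        tokens[:, 1:].reshape(-1),
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
```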
Where this trinity is heading.
- Continual learning will erase the line between pre-training and fine-tuning. The trinity collapses into a continuous flow.
- Personal pre-training — models that ingest the lifetime of a single person — becomes a new product category.