How do machines actually learn?

A machine learns by adjusting itself in response to many examples. In the most common setup the examples are labelled: each time it guesses, it checks the answer and nudges its internal settings to be a little more right next time. Repeat that across millions of examples and a reliable pattern forms.

What are the three main types of machine learning?

Supervised learning trains on labelled examples (this is a cat, this is a dog). Unsupervised learning finds structure in unlabelled data, like grouping similar items. Reinforcement learning learns by trial and error, earning rewards for good outcomes.

Why does AI need so much data?

More examples let a model see more variety and average out noise, so the pattern it learns is more reliable. But quality matters as much as quantity: biased or messy data teaches biased or messy patterns, a problem summed up as garbage in, garbage out.

Step 02 — How Machines Learn

▰ THE GIST · 30-SECOND VERSION

Machines learn by a simple loop: guess, check the answer, adjust — repeated at huge scale.
Three styles: supervised (labelled), unsupervised (find structure), reinforcement (trial & reward).
Clean, representative data matters as much as quantity — garbage in, garbage out.

01 Learning from examples

→ Machines learn by one loop: guess, check the answer, adjust, repeated millions of times.

THE LOOP

Show the machine an example. Let it guess. Tell it the right answer. It nudges its settings to be a little less wrong — then does it again, millions of times.

Try it: tell this learner what each fruit is, and watch its accuracy climb.

[Interactive demo: TINY LEARNER v0.1, starting at 0 examples seen and 0% learned accuracy. Shown a 🍎, it says: "I have no idea what this is yet. You tell me…"]

Tip: teach it the same fruit a few times and watch what happens.

What just happened

No single guess matters — the magic is the repetition. After enough rounds, those tiny nudges add up to settings that get the answer right most of the time, without anyone writing the rule. That's training.
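The guess, check, adjust loop can be sketched in a few lines of Python. This is a minimal illustration, assuming a learner with a single adjustable setting (the hypothetical `weight`) that has to discover the hidden rule "answer = 3 × input". The names `train`, `weight`, and `learning_rate` are made up for this sketch and do not come from any particular library.

```python
def train(examples, rounds=200, learning_rate=0.01):
    """Guess, check, adjust: learn one setting from (input, answer) pairs."""
    weight = 0.0  # the learner's single internal setting; it starts out knowing nothing
    for _ in range(rounds):
        for x, answer in examples:
            guess = weight * x                    # guess
            error = guess - answer                # check against the right answer
            weight -= learning_rate * error * x   # adjust: nudge to be a little less wrong
    return weight


# Labelled examples that secretly follow the rule answer = 3 * input.
examples = [(1, 3), (2, 6), (3, 9), (4, 12)]
learned = train(examples)
print(round(learned, 3))  # prints 3.0
```

Nobody wrote "multiply by 3" into the code. The setting drifts there because each nudge shrinks the error a little, which is the point of the loop.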

The first time the learner sees a fruit it's just guessing. Once you've labelled it, it remembers — and accuracy climbs. That's supervised learning in miniature: learning from examples with the right answer attached.

02 The three ways machines learn

→ There are three styles, and the only difference is what feedback the machine gets.

TYPE 01

Supervised

Learns from examples that already have the right answer labelled.

LIKE a student with flashcards — question on front, answer on back.

TYPE 02

Unsupervised

Gets data with no labels and finds the hidden structure or groups by itself.

LIKE sorting a pile of laundry into groups without being told the categories.

TYPE 03

Reinforcement

Learns by trial and error, earning rewards for good moves and penalties for bad ones.

LIKE training a dog with treats — or mastering a video game by dying a lot.
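To make the difference in feedback concrete, here is a rough Python sketch of all three styles on toy data. Everything in it is an assumption for illustration: the fruit weights in grams, the function names, and the methods chosen. The supervised learner averages weights per label, the unsupervised one splits numbers at the widest gap (a stand-in for real clustering methods), and the reinforcement learner is a simple "mostly pick the best move so far, sometimes explore" strategy.

```python
import random


# Supervised: every example arrives with the right answer attached.
def learn_supervised(labelled):
    """Return the average weight seen for each label."""
    totals = {}
    for weight, label in labelled:
        total, count = totals.get(label, (0.0, 0))
        totals[label] = (total + weight, count + 1)
    return {label: total / count for label, (total, count) in totals.items()}


def predict(centres, weight):
    """Guess the label whose learned average is closest to this weight."""
    return min(centres, key=lambda label: abs(centres[label] - weight))


# Unsupervised: no labels at all; split the numbers into two groups at the widest gap.
def find_two_groups(values):
    ordered = sorted(values)
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


# Reinforcement: no answers, only a reward (1 or 0) after each move.
def learn_by_reward(payout_chance, rounds=2000, explore=0.1, seed=0):
    rng = random.Random(seed)
    value = [0.0] * len(payout_chance)  # estimated reward of each move
    tries = [0] * len(payout_chance)
    for _ in range(rounds):
        if rng.random() < explore:
            move = rng.randrange(len(value))   # trial: try a move at random
        else:
            move = value.index(max(value))     # otherwise use the best move so far
        reward = 1.0 if rng.random() < payout_chance[move] else 0.0
        tries[move] += 1
        value[move] += (reward - value[move]) / tries[move]  # running average of rewards
    return value.index(max(value))


labelled = [(4, "grape"), (6, "grape"), (1400, "melon"), (1600, "melon")]
centres = learn_supervised(labelled)
print(predict(centres, 7))                   # grape
print(find_two_groups([6, 1400, 4, 1600]))   # ([4, 6], [1400, 1600])
print(learn_by_reward([0.1, 0.9]))           # index of the move it ended up preferring
```

The data is the same kind of thing in all three cases. What changes is the feedback: answers up front, no answers, or only a score after acting.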

[Image: a luminous lattice of glowing data points forming an ordered structure. Caption: Raw examples, organised into patterns.]

How real AI blends them

Modern AI often mixes these. A chatbot is first trained on mountains of text (a self-supervised twist on the first type), then polished with human feedback (a cousin of the third). You'll meet that again in Step 04.

03 Why more, and cleaner, data means smarter AI

→ More data helps, but clean, representative data matters just as much.

▲ GOOD DATA

Varied, accurate, and representative of the real world the model will face. Leads to patterns that hold up on new examples.

▼ BAD DATA

Narrow, outdated, or skewed toward one group. The model faithfully learns those flaws — and repeats them confidently.

Quantity vs quality — and "garbage in, garbage out"

Quantity helps because more examples show more variety and average out flukes. See ten dogs and you might think all dogs are brown; see ten million and you learn what "dog" really covers.
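The "averaging out flukes" part can be checked with a small experiment. The sketch below assumes a made-up setup: noisy measurements whose true average is 0, drawn with Python's standard `random` module. The function name `average_miss` is hypothetical. It measures how far a sample's average lands from the truth, for small samples and for large ones.

```python
import random


def average_miss(sample_size, trials=200, seed=0):
    """How far, on average, a sample's mean lands from the true mean (0.0)."""
    rng = random.Random(seed)
    total_miss = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total_miss += abs(sum(sample) / sample_size)
    return total_miss / trials


small = average_miss(10)     # estimates built from 10 examples each
large = average_miss(1000)   # estimates built from 1,000 examples each
print(small, large)          # the second number should be much closer to 0
```

With only ten examples, one odd value can drag the average around. With a thousand, the oddities largely cancel each other out.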

Quality matters just as much. Feed a model skewed or messy examples and it learns skewed, messy patterns.

The golden rule of data

Garbage in, garbage out. A model is only as fair and accurate as the examples it learned from. Clean, representative data beats simply more data.
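A tiny Python sketch of garbage in, garbage out, under invented assumptions: a learner (the hypothetical `learn` function) that simply remembers which label each colour was given most often, trained once on clean examples and once on examples where some yellow fruit was mislabelled.

```python
from collections import Counter, defaultdict


def learn(examples):
    """For each colour, remember the label it was most often given."""
    votes = defaultdict(Counter)
    for colour, label in examples:
        votes[colour][label] += 1
    return {colour: counts.most_common(1)[0][0] for colour, counts in votes.items()}


clean = [("red", "apple"), ("red", "apple"),
         ("yellow", "banana"), ("yellow", "banana")]

# Same learner, but two yellow fruits were wrongly labelled "apple".
messy = [("red", "apple"), ("red", "apple"),
         ("yellow", "apple"), ("yellow", "apple"), ("yellow", "banana")]

print(learn(clean))   # {'red': 'apple', 'yellow': 'banana'}
print(learn(messy))   # {'red': 'apple', 'yellow': 'apple'}
```

The learner did nothing wrong in the second run. It faithfully learned what it was shown, and what it was shown was wrong.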

04 Where learning goes wrong

→ Even good learning can misfire: memorising instead of understanding, and inheriting our biases.

The two failure modes to know

Memorising instead of understanding

If a model trains on the same limited set of examples too many times, it can overfit: it memorises those particular answers instead of the general idea. It aces the practice questions and flunks the real exam.

Everyday analogy

Overfitting is the student who memorises last year's exam word-for-word, then panics when this year's questions are phrased differently. Real learning generalises.
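The exam analogy translates almost directly into code. In this sketch, assuming toy data where numbers are labelled "small" or "large", the hypothetical `memoriser` stores every practice answer and nothing else, while the hypothetical `generaliser` learns a single cut-off. The memoriser is a deliberately extreme caricature: real overfit models usually generalise a little, just not well.

```python
def memoriser(training):
    """Overfit learner: stores every training answer and knows nothing else."""
    table = dict(training)
    return lambda x: table.get(x)  # returns None for anything it has not seen


def generaliser(training):
    """Learns one rule: a cut-off midway between the two groups."""
    lows = [x for x, label in training if label == "small"]
    highs = [x for x, label in training if label == "large"]
    cut = (max(lows) + min(highs)) / 2
    return lambda x: "large" if x >= cut else "small"


def score(model, examples):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == label for x, label in examples) / len(examples)


practice = [(1, "small"), (2, "small"), (3, "small"),
            (7, "large"), (8, "large"), (9, "large")]
exam = [(0, "small"), (4, "small"), (6, "large"), (10, "large")]  # all unseen

rote = memoriser(practice)
rule = generaliser(practice)
print(score(rote, practice), score(rote, exam))  # 1.0 0.0
print(score(rule, practice), score(rule, exam))  # 1.0 1.0
```

Both learners look perfect on the practice set. Only the unseen exam questions reveal which one actually learned the idea, which is why models are judged on data they were not trained on.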

Learning our blind spots

Because models learn from human-made data, they also absorb human bias. If the examples under-represent some group, the model will too — and won't know it. We dig into fairness in Step 06.
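Here is a rough Python sketch of how under-representation plays out. The setup is entirely invented: two groups, A and B, where the correct cut-off for a "yes" answer differs (10 for A, 20 for B), and a training set with eight examples from A but only one from B. The hypothetical `fit_threshold` picks the single cut-off that gets the most training examples right.

```python
def fit_threshold(examples):
    """Pick the cut-off that gets the most training examples right."""
    candidates = sorted({x for x, _ in examples})

    def correct(cut):
        return sum((x >= cut) == label for x, label in examples)

    return max(candidates, key=correct)


def accuracy(cut, examples):
    """Fraction of examples where 'x >= cut' matches the true label."""
    return sum((x >= cut) == label for x, label in examples) / len(examples)


# Group A: the true answer is "yes" from 10 upward. Plenty of examples.
group_a_train = [(2, False), (5, False), (8, False), (11, True),
                 (14, True), (17, True), (23, True), (26, True)]
# Group B: the true answer is "yes" only from 20 upward. One lonely example.
group_b_train = [(25, True)]

training = group_a_train + group_b_train
cut = fit_threshold(training)

group_a_test = [(3, False), (9, False), (12, True), (18, True)]
group_b_test = [(5, False), (13, False), (16, False), (24, True)]

print(accuracy(cut, training))      # 1.0: looks flawless on its own training data
print(accuracy(cut, group_a_test))  # 1.0
print(accuracy(cut, group_b_test))  # 0.5
```

The model scores perfectly on everything it was trained on, so nothing in training signals a problem. The gap only shows up when accuracy is measured separately for each group.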

RECAP · WHAT YOU NOW KNOW

Machines learn by a loop: guess, check, adjust — repeated at huge scale. That's training.
Three styles: supervised (labelled), unsupervised (find structure), reinforcement (trial and reward).
More data helps, but clean, representative data matters just as much — garbage in, garbage out.
Learning can misfire: overfitting (memorising) and bias (inheriting flaws in the data).

QUICK CHECK

Two questions. Then you're done.

1. What is the basic loop a machine uses to learn?

2. Why can a model trained on biased data be a problem?
