How Uncanny Atlas works
It is difficult to know what on the internet is 'real' and what is generated by AI. People generally point to specific things when declaring “that's AI” - the hands, the
light, a plastic sheen they can't quite name. On r/isthisAI and r/RealOrAI, thousands of people argue about exactly
that in the comments.
This tool is an attempt to turn that pile of arguments - currently 912,187 comments - into a map of the indicators people actually
use. This page explains how, from first principles. No prior knowledge of the project is assumed;
just scroll, and play with the figures.
The whole job sounds simple: find every comment that names an indicator, work out which indicator, and count them. The trouble is hiding in the middle. Let's build up to it.
1 · A thousand ways to say one thing
People don't speak in tidy labels. The single idea “the hands are wrong” shows up as “six fingers”, “melty hands”, “too many knuckles”, “fused fingers” - and a hundred more.
The obvious approach is to search for matching words. But meaning and words come apart fast. Here are two comments that plainly mean the same thing. Which words do they actually share?
We need a way for the computer to see meaning, not spelling.
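To make the gap concrete, here is a small sketch of the word-matching approach failing. The two comments are invented for illustration (they are not taken from the dataset), and the helper names are hypothetical.

```python
import re

def words(text: str) -> set[str]:
    """Lowercase a comment and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def shared_words(a: str, b: str) -> set[str]:
    """Words that literally appear in both comments."""
    return words(a) & words(b)

# Two made-up comments that describe the same indicator.
comment_a = "she has six fingers on her left hand"
comment_b = "count the digits, there's an extra thumb"

overlap = shared_words(comment_a, comment_b)
print(overlap)  # set() - not a single word in common
```

Both comments are about the hands, yet a keyword search for either one would never find the other.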
2 · A model that pulls out the indicators
First, someone (or something) has to read a comment and say what indicator it names. This project used a small language model
(gemma3:4b, running locally). It read one comment and returned the indicators as short
phrases:
This works, but it has a hard limit: it's slow and costly - one comment per call - so the project only runs it on a sample of a few thousand comments.
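The read-one-comment, return-phrases step could be sketched roughly like this. The prompt wording, the JSON reply format, and the function names are all assumptions made for illustration; the source doesn't say how gemma3:4b is actually invoked, so the model call is left as a plug-in function and a stand-in is used so the sketch runs on its own.

```python
import json
from typing import Callable

# Hypothetical prompt; the project's real prompt is not shown in the source.
PROMPT = (
    "List the visual indicators this comment gives for an image being "
    "AI-generated. Reply with a JSON array of short phrases, or [] if none.\n"
    "Comment: {comment}"
)

def parse_indicators(raw: str) -> list[str]:
    """Parse the model reply; anything that is not a JSON list counts as 'no indicators'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [p.strip().lower() for p in data if isinstance(p, str) and p.strip()]

def extract_indicators(comment: str, generate: Callable[[str], str]) -> list[str]:
    """One comment per call: build the prompt, ask the model, parse the reply."""
    return parse_indicators(generate(PROMPT.format(comment=comment)))

# Stand-in for the local model, so the sketch runs without one.
def fake_generate(prompt: str) -> str:
    return '["Six Fingers", "plastic skin"]'

print(extract_indicators("six fingers and that plastic skin again", fake_generate))
```

Parsing defensively matters here: a small model will sometimes reply with prose instead of a list, and treating that as "nothing found" keeps one bad reply from breaking the run.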
How we choose that sample matters. We don't read random comments - only ones likely to be discussing authenticity at all, picked by a deliberately broad keyword net: “AI”, “real”, “fake”, “generated”, “obvious”, “look”. These are topical words on purpose - not visual-indicator words. Filtering the sample for visual-indicator words (finger, shadow, lighting) would be circular: you cannot discover which indicators people use if you only read comments that already mention the indicators you guessed. So the net catches “is this real or AI?” talk, and lets the model find whatever indicator is actually there (and occasionally a couple that aren't).
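A keyword net like that can be sketched in a few lines. The six keywords come from the paragraph above; matching on whole words and ignoring case are assumptions, and the function name is hypothetical.

```python
import re

# The broad, topical keyword net named above.
KEYWORDS = ["AI", "real", "fake", "generated", "obvious", "look"]

# Assumption: whole-word, case-insensitive matching,
# so "real" does not fire inside "really".
NET = re.compile(
    r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b",
    re.IGNORECASE,
)

def in_sample_pool(comment: str) -> bool:
    """True if the comment is likely to be about authenticity at all."""
    return NET.search(comment) is not None

print(in_sample_pool("Is this real or AI?"))            # True
print(in_sample_pool("her fingers are fused together"))  # False
```

The second comment names a perfectly good indicator but is not caught by the net. That is the price of keeping the filter topical rather than indicator-shaped; such comments are picked up later, when the indicators reach out across the whole map.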
So after this step we have a few thousand comments, each with real indicator-phrases - but those phrases are still just words, with the paraphrase problem from §1. Time to fix meaning.
3 · Turning meaning into a place on a map
Here's the key trick. Imagine a giant map where every phrase gets an address, and
phrases that mean similar things live close together - same street - while unrelated ones
live in different cities. That address (a long list of numbers; really 768 of them) is called an embedding. A second little AI - an embedding model
(nomic-embed-text, also running locally) - learned to place text on this map by
reading a mountain of writing.
Below is a flattened, 2-D sketch of such a map for some real comment-phrases. Notice the clumps: all the hand complaints sit together, all the shadow ones together - even though they share no words. The big ◆ markers are our known indicators; the small dots are individual comments.
Drag the crosshair anywhere (or hit a button to jump it to an indicator). Everything inside the circle is “close enough to count as the same thing.” Drag the slider to change how close close has to be.
To check if two phrases mean the same thing, the computer just measures how close
their addresses are. Close = same idea. The cutoff is the 0.73 you saw on the slider - a similarity score, where higher means closer. Crank it up and you demand near-identical meaning (you
miss loose paraphrases); loosen it and you sweep in more, including the occasional wrong one.
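Here is a minimal sketch of that closeness check. It assumes the score is cosine similarity, a common choice for text embeddings that the source does not confirm, and it uses made-up three-number addresses in place of the real 768-number ones.

```python
import math

THRESHOLD = 0.73  # the slider value quoted above

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """1.0 means pointing the same way; near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def same_idea(a: list[float], b: list[float], threshold: float = THRESHOLD) -> bool:
    """Two phrases count as the same idea when they clear the threshold."""
    return cosine_similarity(a, b) >= threshold

# Toy addresses, invented for illustration.
fused_fingers = [0.9, 0.1, 0.0]
six_fingers = [0.8, 0.3, 0.1]
odd_shadow = [0.0, 0.2, 0.9]

print(same_idea(fused_fingers, six_fingers))  # True
print(same_idea(fused_fingers, odd_shadow))   # False
```

Raising `threshold` toward 1.0 makes `same_idea` stricter; lowering it lets looser paraphrases through.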
4 · Letting the indicators find their own comments
Now the payoff. The first model only saw a sample. But every comment can be placed on the map cheaply. So for each known indicator (the ◆ markers), we simply ask: which comment-dots are nearby? - and tag them all, no re-reading required. Flip the figure above to “Tag near every indicator.”
This is semantic expansion. A comment that says “her fingers are all fused together” never typed the word “hands,” but it lives right next door to the ◆ hands indicator - so it gets counted. That's how a tiny sample grows into broad coverage, and why the counts reflect what people actually said rather than just the few comments the model had time for.
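In code, the expansion step is little more than a loop over seeds and comments. This is a sketch under assumptions: cosine similarity as the closeness score, toy two-number addresses, and invented names and ids throughout.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def expand(
    seeds: dict[str, list[float]],
    comments: dict[str, list[float]],
    threshold: float = 0.73,
) -> dict[str, list[str]]:
    """For each seed indicator, collect the ids of comments close enough to it."""
    return {
        name: [cid for cid, vec in comments.items() if cosine(seed_vec, vec) >= threshold]
        for name, seed_vec in seeds.items()
    }

# Toy addresses, invented for illustration.
seeds = {"hands": [1.0, 0.0], "shadows": [0.0, 1.0]}
comments = {
    "c1": [0.9, 0.2],   # e.g. "her fingers are all fused together"
    "c2": [0.1, 0.95],  # e.g. "the shadow falls the wrong way"
    "c3": [0.6, 0.6],   # sits between the two seeds, too far from either
}

print(expand(seeds, comments))  # {'hands': ['c1'], 'shadows': ['c2']}
```

No language model is called anywhere in this loop, which is why it can run over every comment rather than just the sample.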
One guard rail: expansion only looks at comments long enough to be describing something - at least 20 characters, and not written by a bot. A lone 👍 or lol sits near plenty of indicators on the map but isn't evidence of anything, so it's skipped. Without that floor a vague seed like “AI voice” would hoover up thousands of one-word reactions and drown out the real signal.
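The guard rail itself is a one-line filter. The 20-character floor is from the text above; how bots are recognised is not stated, so the author list below is purely a hypothetical stand-in.

```python
MIN_CHARS = 20  # the floor stated above

# Hypothetical: the source does not say how bots are identified.
BOT_AUTHORS = {"AutoModerator"}

def eligible(text: str, author: str) -> bool:
    """Only comments long enough to describe something, and not from a bot, are expanded over."""
    return len(text.strip()) >= MIN_CHARS and author not in BOT_AUTHORS

print(eligible("lol", "someone"))                                      # False
print(eligible("her fingers are all fused together", "someone"))       # True
print(eligible("her fingers are all fused together", "AutoModerator")) # False
```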
Where the ◆ indicators come from: seeds
Those ◆ markers have a name: seeds. A seed is a known indicator that semantic expansion reaches out from - each one gathers the comment-dots in its neighbourhood. Seeds aren't hand-listed up front: the pipeline builds them automatically by taking the ~200 most frequently extracted phrases and sorting each into a category - that's the taxonomy. A curator (me, in this case) can also seed a phrase the model never surfaced, or un-seed a bad one. Toggle the seeds below and watch coverage shrink and grow - the grey non-indicator cluster is never seeded, so it stays dark:
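Picking seeds by frequency, then applying curator overrides, might look like the sketch below. The function and parameter names are hypothetical, the phrases are invented, and the sorting into taxonomy categories is left out.

```python
from collections import Counter

def build_seeds(
    extracted_phrases: list[str],
    top_n: int = 200,
    added: tuple[str, ...] = (),
    removed: tuple[str, ...] = (),
) -> list[str]:
    """Take the most frequent extracted phrases, then apply curator additions and removals."""
    counts = Counter(p.strip().lower() for p in extracted_phrases)
    seeds = [phrase for phrase, _ in counts.most_common(top_n)]
    seeds += [p for p in added if p not in seeds]  # phrases the model never surfaced
    dropped = set(removed)                         # phrases the curator un-seeded
    return [p for p in seeds if p not in dropped]

# Invented phrases, for illustration only.
phrases = ["six fingers", "six fingers", "odd shadows",
           "looks fake", "six fingers", "odd shadows"]

print(build_seeds(phrases, top_n=2))
# ['six fingers', 'odd shadows']
print(build_seeds(phrases, top_n=2, added=("warped text",), removed=("odd shadows",)))
# ['six fingers', 'warped text']
```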
5 · From a teaspoon to the whole ocean
The scale gap is what makes this worthwhile. Reading a comment with the language model is slow, so only about 15,990 were ever read that way. Placing a comment on the map is cheap, so every comment can get an address - all 912,187 of them. The bar shows how much of the corpus is mapped:
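As a quick check of the arithmetic, using only the two figures quoted above:

```python
read_by_model = 15_990   # comments read by the language model
total_comments = 912_187  # comments placed on the map

share = read_by_model / total_comments
ratio = total_comments / read_by_model

print(f"Read by the model: {share:.2%} of the corpus")    # 1.75%
print(f"Mapped comments per model-read comment: {ratio:.0f}")  # 57
```

So the language model saw under 2% of the corpus, and the map covers roughly 57 comments for every one it read.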
6 · Tidying the map
Two messes remain, and both are human-in-the-loop.
Merging synonyms into one canonical group
First, the same indicator is scattered across synonyms, so its count is split several ways. We merge the variants into one canonical indicator and re-point every comment to it. They collapse into a single entry everywhere on the Explore side - Top indicators, Inspect, and Semantic matches - and their counts combine, so the tally reflects the real concept instead of splitting across spellings. (Each raw phrase is still curated on its own - and a merged group is itself just a tidied seed.) Merging is done on the Merge page. Press the button:
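The effect of a merge on the tallies can be sketched as a lookup from each variant to its canonical name. The counts and phrase names below are made up for illustration and are not figures from the dataset; the function name is hypothetical.

```python
from collections import Counter

def merge_counts(counts: dict[str, int], canonical: dict[str, str]) -> Counter:
    """Re-point every phrase to its canonical indicator and add the counts together."""
    merged: Counter = Counter()
    for phrase, n in counts.items():
        merged[canonical.get(phrase, phrase)] += n  # unmerged phrases keep their own name
    return merged

# Invented counts, for illustration only.
raw_counts = {"six fingers": 40, "extra fingers": 25, "fused fingers": 10, "odd shadows": 30}
canonical = {"six fingers": "Fingers", "extra fingers": "Fingers", "fused fingers": "Fingers"}

print(merge_counts(raw_counts, canonical).most_common())
# [('Fingers', 75), ('odd shadows', 30)]
```

Before the merge, “odd shadows” would have ranked second behind “six fingers” alone; after it, the combined indicator shows its real weight.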
Second, the model looking for indicators sometimes invents a fake indicator - a vague verdict like “looks obviously fake,” which isn't a visual indicator at all. Left on the map, semantic expansion would drag hundreds of reaction comments to it. So a curator (again, in this case me) marks it Noise - the equivalent of pulling that house off the map. Expansion then skips it forever, and (because the decision is written into the master map) it stays gone even after future runs. That grey “not an indicator” blob in the big figure is exactly what Noise looks like: present, but never matched against comments.
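The "stays gone even after future runs" part comes down to saving the decision somewhere the next run will read it. Here is a sketch that assumes the master map is a JSON file of phrase-to-status entries; the file name, format, and function names are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

def _load(master_path: Path) -> dict[str, str]:
    return json.loads(master_path.read_text()) if master_path.exists() else {}

def mark_noise(master_path: Path, phrase: str) -> None:
    """Record a curator's Noise decision in the master map file."""
    master = _load(master_path)
    master[phrase] = "noise"
    master_path.write_text(json.dumps(master, indent=2))

def active_seeds(candidates: list[str], master_path: Path) -> list[str]:
    """Seeds that expansion may use: every candidate not marked Noise."""
    master = _load(master_path)
    return [p for p in candidates if master.get(p) != "noise"]

with tempfile.TemporaryDirectory() as tmp:
    master_path = Path(tmp) / "master_map.json"  # hypothetical file name
    mark_noise(master_path, "looks obviously fake")
    # A later run proposes the same phrase again; the saved decision still wins.
    seeds = active_seeds(["six fingers", "looks obviously fake"], master_path)

print(seeds)  # ['six fingers']
```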
What you end up with
Put it together - read a sample, place everything on the map of meaning, let indicators gather their own
comments, then manually merge and de-noise - and the pile of 912,187 arguments becomes a
perspective on which indicators people actually rely on.
In this dataset, from my perspective, the most-cited indicator is Fingers with 1,909 associated comments.