The menu · pick something you can do

Find a gap you can fill.

Lacuna helps anyone — no PhD required — use AI to make a small, real, verifiable contribution to science. Science databases are full of "gaps": facts published in papers but never entered into the canonical database. We find these gaps, fill them from the original sources, double-check them, and submit through official channels. You bring curiosity and judgment; the AI does the heavy lifting.

Everything below is a real gap in a database scientists rely on. We sort them by one thing: how ready each is for you to jump in today.

Open to you right now (2)

Self-serve, no approval needed — you could start today.

Cantonese pronunciation

Wiktionary · Lingua Libre

Open now

If Cantonese is your native tongue, record the pronunciation of a Cantonese word that has no audio yet. Your voice is the one piece of data AI can't fake: Cantonese has six or more tones, and text-to-speech is genuinely weak at them.

You can start today: recording is self-serve, with no gatekeeper. Right now about 183,000 Cantonese words have no audio at all. The one piece we're still building is verification. "Does this sound right?" can only be confirmed by other native speakers, so we're creating a way for the community to check each other's recordings.

Scale · About 183,000 Cantonese words with no audio yet — only 1.6% of 185,933 have a recording (measured June 2026)

Heritage building photos

Wikidata · Wikimedia Commons

Open now

Find a building registered as a heritage monument but with no photo on Wikidata, go photograph it, and upload — your camera and your presence are the data AI can't fabricate.

You can start today — find a listed building with no picture yet, photograph it, and upload; no approval needed. France alone has 161,462 such heritage buildings. Checking is easy here too: a photo's location can be matched against the building's known address.

Scale · 161,462 listed heritage buildings in France alone with no photo yet (more worldwide; measured June 2026)
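The location check mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not our actual pipeline: it assumes the photo's GPS position has already been read from its metadata and that the building's address has already been turned into coordinates (for example, a Wikidata coordinate location). The function names and the 150-metre tolerance are hypothetical choices made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def photo_matches_building(photo_latlon, building_latlon, max_m=150):
    """True if the photo was taken within max_m metres of the building.

    max_m is an assumed tolerance: large enough for someone standing across
    the street, small enough to rule out the wrong neighbourhood.
    """
    return haversine_m(*photo_latlon, *building_latlon) <= max_m
```

A photo taken a few metres from the recorded position passes, while one taken several kilometres away is flagged for a human to look at. A real check would also need to handle photos that carry no location data at all.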

Mostly an AI job: you can watch (2)

The answer already sits in the data, so AI can find and fix it end to end. There's little here that only a human can add.

General knowledge graph

Wikidata

AI does most of this

Pick a type of fact missing across thousands of entries (e.g. scientists with no birth date), find the answer in a reliable source, and add it.

Mostly an AI job. The missing facts usually already exist somewhere online, so AI can look them up and fill them in without a person. We run this as a script rather than asking you to do it by hand.

Scale · Massive (more than 100 million items), sliceable by any property
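To show what "a type of fact missing across thousands of entries" looks like in practice, here is a hedged Python sketch that builds a query for the public Wikidata Query Service. It is an illustration, not the script we run. The helper names are hypothetical; the identifiers used in the example (P31 "instance of", Q5 "human", P106 "occupation", Q901 "scientist", P569 "date of birth") are standard Wikidata IDs.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def missing_property_query(occupation_qid, property_pid, limit=50):
    """Build a SPARQL query for humans with a given occupation
    who have no value at all for the given property."""
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        "  ?item wdt:P31 wd:Q5 ;\n"
        f"        wdt:P106 wd:{occupation_qid} .\n"
        f"  FILTER NOT EXISTS {{ ?item wdt:{property_pid} ?value }}\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(sparql):
    """URL that asks the query service for JSON results (no request is sent here)."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': sparql, 'format': 'json'})}"


# Example: scientists with no date of birth recorded.
example = missing_property_query("Q901", "P569", limit=10)
```

The sketch only finds the gaps. Filling one still means locating the date in a reliable source and citing that source when the value is added.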

Classics

Perseus Digital Library

AI does most of this

Scan classic Greek/Latin texts for markup or metadata errors (e.g. deprecated language codes, transcription slips) and submit a fix.

Mostly an AI job. The fixes — like a wrong language tag — sit right in the text files, so AI can find and fix them end to end. A real Perseus maintainer already merged our first one. You can follow the trail rather than do the work.

Scale · Many legacy texts; the markup-error subset is scannable, exact count TBD
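A scan for suspect language tags, as described above, can be as simple as the Python sketch below. It is a minimal illustration under stated assumptions: the mapping from legacy tags to replacements is made up for the example and is not Perseus's official list, and the function name is hypothetical. A regular expression is used instead of an XML parser so that each finding keeps its line number.

```python
import re

# Assumed, illustrative mapping: legacy tag -> suggested replacement.
# The real list of deprecated codes would come from the project's own guidelines.
LEGACY_LANG_TAGS = {"greek": "grc", "latin": "la"}

# Matches lang="..." and xml:lang="..." attributes.
LANG_ATTR = re.compile(r'\b(?:xml:)?lang="([^"]+)"')


def find_legacy_lang_tags(text, mapping=LEGACY_LANG_TAGS):
    """Return (line_number, found_tag, suggested_tag) for each legacy tag."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in LANG_ATTR.finditer(line):
            tag = match.group(1)
            if tag in mapping:
                findings.append((line_no, tag, mapping[tag]))
    return findings
```

Each finding is a candidate, not a fix: a person or a second pass should confirm the suggested code before anything is submitted to a maintainer.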

We're still checking these out (5)

Real potential, but we haven't yet pinned down a submission channel or a clear task definition.

Paleontology

Paleobiology Database (PBDB)

Still checking

Find a fossil species that scientists named in a paper but never added to the global fossil database, then pull its dig site, rock layer, and age from the original paper and complete the record.

Real and proven — we've already prepared the work for nine missing fossil species. But only credentialed researchers can enter records here, so we've asked the database for access and are waiting to hear back.

Scale · 1,046 birds, 1,277 dinosaurs, 4,651 mammals with no records yet; likely tens of thousands database-wide

Astronomy

SIMBAD + ADS

Still checking

Find a star or galaxy whose recent papers were never linked to its database entry, or whose aliases are incomplete, and connect them.

Promising — the largest scale and the cleanest signal of any domain here. We're still confirming the submission channel is open enough to be worth your time.

Scale · SIMBAD: >15 million astronomical objects

Ecology / Biodiversity

GBIF + IUCN

Still checking

Find species whose names changed (split/merged) in recent papers while the biodiversity database still uses the old classification, and flag it.

Large, but we're still figuring out how to define one clear, checkable task an ordinary person could do here.

Scale · Huge, but what counts as a single gap is still fuzzily defined

Bioinformatics

UniProtKB

Still checking

Find a protein whose function was newly characterized in a paper but whose database entry wasn't updated, and report the evidence.

Interesting, but spotting what's missing means reading research papers closely — we're still weighing whether that's a good fit for non-specialists.

Scale · Large, but gap detection is costly

Chemistry / Crystallography

PubChem · ChEMBL · COD

Still checking

Find a compound's measured property or bioactivity published in a paper but missing from the chemistry database, and add it.

Similar in shape to the fossil database, but doing and checking the work takes more chemistry background. Under evaluation.

Scale · Large, scattered

The hard frontier: a later phase (2)

Real unknowns and from-scratch documentation. The highest human value, and the hardest game.

Endangered language documentation

Wu / Shanghainese · no Wiktionary category yet

Frontier · someday

(Phase 3 · fieldwork) If Wu/Shanghainese is your native tongue, document its vocabulary, pronunciation, and usage. This is not "add audio to a known word": it means recording, from scratch, a mostly spoken endangered variety that has no established written form.

The hardest and most valuable kind: recording a spoken language from scratch, where the words aren't even written down yet — so there's no list to work from and no way to machine-check it. A later phase.

Scale · The full vocabulary of an endangered variety (largely undigitized)

Mathematics

Erdős problems · OEIS

Frontier · someday

(Phase 3 · real unknowns) Pick a recognized open problem (e.g. an Erdős conjecture), make a rigorous AI-assisted attempt, and honestly log the process and outcome — solved / partial / dead-end / disproved. What's verified is the attempt's rigor and reproducibility, not acceptance of an answer.

The frontier: take a famous unsolved problem, make a careful AI-assisted attempt, and log it honestly — solved, partial, or dead end. What counts is the rigor of the attempt, not whether an answer gets accepted. A later phase.

Scale · The set of recognized open problems (Erdős / Millennium / gaps in OEIS…)

Don't see your thing yet? This list grows. Every gap we add is one more real, checkable way for an ordinary person to help science — and the verification that keeps it honest is built in from the start.