The main paper sample contains 14,599,178 works published from 1960 to 2000. It is designed to keep venue-published research works that have a usable field label and references, without imposing an English-language or curated-journal restriction.
Filters Applied
A work enters the main sample if it satisfies these paper-level rules:
- publication year is between 1960 and 2000;
- OpenAlex source type is journal or conference;
- the work has at least one reference;
- the work is not retracted;
- the work is not paratext;
- the work has at least one author;
- the work has a non-missing OpenAlex primary topic that maps to a field.
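The rules above can be sketched as a single predicate over an OpenAlex work record. This is an illustrative sketch, not the pipeline's actual code: the function name `in_main_sample` is hypothetical, and it assumes that the source type is read from the work's primary location and that the record is a dict shaped like the OpenAlex API work object (`publication_year`, `primary_location.source.type`, `referenced_works_count`, `is_retracted`, `is_paratext`, `authorships`, `primary_topic.field`).

```python
from typing import Any, Mapping

# Source types accepted by the main sample (from the rules above).
ALLOWED_SOURCE_TYPES = {"journal", "conference"}


def in_main_sample(work: Mapping[str, Any]) -> bool:
    """Return True if a work passes every paper-level rule.

    Hypothetical helper: assumes an OpenAlex-style work dict and that the
    source type comes from the primary location.
    """
    year = work.get("publication_year")
    source = (work.get("primary_location") or {}).get("source") or {}
    topic = work.get("primary_topic") or {}
    return (
        year is not None
        and 1960 <= year <= 2000                              # publication year
        and source.get("type") in ALLOWED_SOURCE_TYPES        # journal or conference
        and (work.get("referenced_works_count") or 0) >= 1    # at least one reference
        and not work.get("is_retracted", False)               # not retracted
        and not work.get("is_paratext", False)                # not paratext
        and len(work.get("authorships") or []) >= 1           # at least one author
        and bool(topic.get("field"))                          # topic maps to a field
    )


# Made-up record for illustration only.
example = {
    "publication_year": 1985,
    "primary_location": {"source": {"type": "journal"}},
    "referenced_works_count": 12,
    "is_retracted": False,
    "is_paratext": False,
    "authorships": [{"author": {"id": "A1"}}],
    "primary_topic": {"field": {"display_name": "Economics"}},
}
print(in_main_sample(example))  # True
```

Note that the predicate never inspects language or core-source status, so those attributes cannot affect membership.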
Filters Not Applied
The default sample is not English-only and is not restricted to OpenAlex core sources. Among the 14,599,178 works, 13,633,892 are English and 11,795,499 are in core sources, but neither condition is required.
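To put those counts in proportion, the shares can be computed directly from the three figures quoted above. The variable and function names are illustrative only.

```python
# Counts quoted in the text above.
TOTAL_WORKS = 14_599_178
ENGLISH_WORKS = 13_633_892
CORE_SOURCE_WORKS = 11_795_499


def share_pct(count: int, total: int) -> float:
    """Percentage of `total` represented by `count`."""
    return 100 * count / total


english_share = share_pct(ENGLISH_WORKS, TOTAL_WORKS)
core_share = share_pct(CORE_SOURCE_WORKS, TOTAL_WORKS)

print(f"English: {english_share:.1f}%")       # English: 93.4%
print(f"Core sources: {core_share:.1f}%")     # Core sources: 80.8%
```

In other words, an English-only build would drop roughly 6.6% of the default sample, and a core-source restriction would drop roughly 19.2%.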
How It Is Used
The field classifier, paper explorer, distinct institution-field-year outcome panel, and exposure/pre-period publication weights all use this sample definition by default. Alternative builds can add an English-only or source-quality restriction, but those variants are written separately and are not the default website data.
Institution concentration charts apply additional filters on top of this: institutions must clear the publication threshold in 1960-1969, and displayed author counts keep only authors with at least 10 OpenAlex works.
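A rough sketch of those two chart-level filters follows. It rests on assumptions the text does not settle: the publication threshold value is not stated here, so it is left as a required `min_pubs` parameter, and the threshold is assumed to apply to an institution's total publications over 1960-1969 rather than per year. The function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Mapping, Set, Tuple


def eligible_institutions(
    records: Iterable[Tuple[str, int]], min_pubs: int
) -> Set[str]:
    """Institutions with at least `min_pubs` works in 1960-1969.

    `records` holds one (institution_id, publication_year) pair per work.
    Assumption: the threshold is a decade total; its value is not given
    in the text, so the caller must supply it.
    """
    counts = Counter(inst for inst, year in records if 1960 <= year <= 1969)
    return {inst for inst, n in counts.items() if n >= min_pubs}


def displayed_authors(
    works_count: Mapping[str, int], min_works: int = 10
) -> Set[str]:
    """Authors with at least `min_works` OpenAlex works (10 per the text)."""
    return {author for author, n in works_count.items() if n >= min_works}


# Made-up data for illustration only.
records = [("I1", 1961), ("I1", 1965), ("I1", 1972), ("I2", 1968)]
print(eligible_institutions(records, min_pubs=2))   # {'I1'}
print(displayed_authors({"A1": 10, "A2": 9}))       # {'A1'}
```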