Work sample filters

The main paper sample contains 14,599,178 works from 1960-2000. It is designed to keep venue-published research works with a usable field label and references, without imposing an English-language or curated-journal restriction.

Filters Applied

A work enters the main sample if it satisfies these paper-level rules:

Filters Not Applied

The default sample is not restricted to English-language works or to OpenAlex core sources. Of the 14,599,178 works, 13,633,892 (about 93.4%) are in English and 11,795,499 (about 80.8%) are in core sources, but neither condition is required for inclusion.
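The shares implied by these counts can be checked with a few lines of arithmetic. The sketch below uses only the three counts quoted in this section. The variable and function names are illustrative and are not taken from the project code.

```python
# Counts quoted in this section.
TOTAL_WORKS = 14_599_178
ENGLISH_WORKS = 13_633_892
CORE_SOURCE_WORKS = 11_795_499


def share(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to one decimal place."""
    return round(100 * part / total, 1)


english_share = share(ENGLISH_WORKS, TOTAL_WORKS)
core_share = share(CORE_SOURCE_WORKS, TOTAL_WORKS)

# Works that each optional restriction would drop if applied on its own.
non_english = TOTAL_WORKS - ENGLISH_WORKS
non_core = TOTAL_WORKS - CORE_SOURCE_WORKS

print(f"English: {english_share}% ({non_english:,} works are not English)")
print(f"Core sources: {core_share}% ({non_core:,} works are outside core sources)")
```

The two restrictions are counted separately here. The section does not report how many works satisfy both at once, so the overlap cannot be derived from these figures.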

How It Is Used

The field classifier, paper explorer, distinct institution-field-year outcome panel, and exposure/pre-period publication weights all use this sample definition by default. Alternative builds can add an English-only or source-quality restriction, but those variants are written separately and are not the default website data.

Institution concentration charts apply additional filters on top of this sample: institutions must clear the publication threshold in 1960-1969, and displayed author counts keep only authors with at least 10 OpenAlex works.
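As a rough illustration of how these two chart-level filters could be layered on the main sample, the sketch below uses plain Python on toy data. Everything in it is an assumption made for illustration: the record layout, the function names, the reading of "clear the threshold" as "at least `min_works`", and the value `min_works=2` used in the example. The actual publication threshold is not stated in this section and is deliberately left as a required parameter.

```python
from collections import Counter


def institutions_clearing_threshold(work_rows, min_works, start=1960, end=1969):
    """Keep institutions with at least `min_works` works in the pre-period.

    `work_rows` is a hypothetical iterable of (institution_id, publication_year)
    pairs. `min_works` has no default because the real threshold is not given here.
    """
    counts = Counter(inst for inst, year in work_rows if start <= year <= end)
    return {inst for inst, n in counts.items() if n >= min_works}


def displayed_authors(author_work_counts, min_author_works=10):
    """Keep authors with at least `min_author_works` OpenAlex works.

    `author_work_counts` is a hypothetical mapping of author_id to works count.
    """
    return {a for a, n in author_work_counts.items() if n >= min_author_works}


# Toy data, not drawn from the real sample.
rows = [
    ("inst_a", 1961),
    ("inst_a", 1965),
    ("inst_b", 1968),
    ("inst_b", 1975),  # outside 1960-1969, so not counted toward the threshold
    ("inst_c", 1990),
]
kept_institutions = institutions_clearing_threshold(rows, min_works=2)  # illustrative threshold

authors = {"author_x": 12, "author_y": 10, "author_z": 9}
kept_authors = displayed_authors(authors)

print(sorted(kept_institutions))
print(sorted(kept_authors))
```

Keeping the threshold as an explicit argument makes it clear that the charts depend on a choice made outside the main sample definition.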