PlainPatent

Data Sources — USPTO PatentsView

Detailed documentation of where PlainPatent data comes from, how it is processed, and what limitations users should understand.

USPTO PatentsView

All patent data on PlainPatent comes from PatentsView, a research-grade patent data platform maintained by the US Patent and Trademark Office in partnership with the American Institutes for Research. PatentsView provides disambiguated patent data covering all US utility patents, including inventor and assignee information processed through machine-learning name disambiguation models trained on USPTO records.

PatentsView is widely used in academic research, government policy analysis, and corporate competitive intelligence. Its disambiguation methodology has been peer-reviewed and published in academic journals, which makes it one of the more reliable sources of structured patent data available.

What the Data Includes

PlainPatent processes patent grants from PatentsView covering the period 2015 through 2025. For each patent, we extract: the grant date and filing date; the disambiguated assignee (company) name and identifier; Cooperative Patent Classification (CPC) codes at the group and subgroup level; inventor count and names; number of claims; patent title and abstract; and citation references to other patents.
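To illustrate what "group and subgroup level" means, here is a minimal sketch that splits a CPC symbol such as "H04L 9/32" into its hierarchy levels. The names parse_cpc and CpcCode are hypothetical and are not part of PatentsView or of our pipeline; the sketch assumes the standard symbol layout (section letter, two-digit class, subclass letter, main group, subgroup).

```python
import re
from typing import NamedTuple

# A CPC symbol such as "H04L 9/32" reads: section H, class H04,
# subclass H04L, main group 9, subgroup 32. The space is optional here.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


class CpcCode(NamedTuple):
    """Hypothetical container for the levels of one CPC symbol."""
    section: str
    cpc_class: str
    subclass: str
    main_group: str
    subgroup: str


def parse_cpc(symbol: str) -> CpcCode:
    """Split a CPC symbol into its hierarchy levels (illustrative only)."""
    match = CPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable CPC symbol: {symbol!r}")
    section, digits, letter, group, sub = match.groups()
    subclass = f"{section}{digits}{letter}"
    return CpcCode(
        section=section,
        cpc_class=f"{section}{digits}",
        subclass=subclass,
        main_group=f"{subclass} {group}/00",  # main groups end in /00
        subgroup=f"{subclass} {group}/{sub}",
    )
```

Calling parse_cpc("H04L 9/32") yields the class H04, the subclass H04L, the main group "H04L 9/00", and the subgroup "H04L 9/32". Rolling a patent up to a coarser level then amounts to picking the corresponding field.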

This data is aggregated into company-level profiles showing total patent counts, technology distribution across CPC classes, filing velocity trends, and innovation scores. Technology-level pages show which companies lead in each CPC category and how patent activity trends over time.
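The company-level aggregation described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not our production code: each record is a hypothetical (assignee_id, grant_year, cpc_class) tuple with one CPC class per patent, and the function names are invented for the example. Innovation scores are left out because their formula is not covered in this document.

```python
from collections import Counter, defaultdict

# Hypothetical flattened records: (assignee_id, grant_year, cpc_class).
# Assumes one CPC class per patent, which real data does not guarantee.
records = [
    ("a1", 2020, "H04"),
    ("a1", 2021, "H04"),
    ("a1", 2021, "G06"),
    ("a2", 2021, "A61"),
]


def build_profiles(rows):
    """Aggregate patent records into per-assignee profile dictionaries."""
    profiles = defaultdict(
        lambda: {"total": 0, "by_class": Counter(), "by_year": Counter()}
    )
    for assignee_id, grant_year, cpc_class in rows:
        profile = profiles[assignee_id]
        profile["total"] += 1
        profile["by_class"][cpc_class] += 1   # technology distribution
        profile["by_year"][grant_year] += 1   # basis for velocity trends
    return dict(profiles)


def class_shares(profile):
    """Return each CPC class's share of the company's patents."""
    total = profile["total"]
    return {cls: count / total for cls, count in profile["by_class"].items()}


profiles = build_profiles(records)
```

Here profiles["a1"]["by_year"] holds the yearly grant counts from which a velocity trend could be derived, and class_shares(profiles["a1"]) gives the fraction of that company's patents in each CPC class.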

Data Processing Pipeline

We download bulk TSV files from PatentsView's public Amazon S3 repository. Our ETL pipeline parses these files into a structured SQLite database. We construct company profiles by aggregating individual patent records by disambiguated assignee identifier. CPC codes are normalized to the current CPC scheme so that technology categories stay consistent across the full 2015-2025 period.
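As a rough sketch of the load-and-aggregate step, the example below reads a tab-separated file into SQLite and counts patents per assignee. The table layout and the column names patent_id, assignee_id, and grant_date are assumptions made for illustration; the actual PatentsView files and our schema may differ. An in-memory sample stands in for a downloaded file.

```python
import csv
import io
import sqlite3


def load_tsv(conn, tsv_file):
    """Load tab-separated patent rows into a SQLite table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS patent ("
        "patent_id TEXT PRIMARY KEY, assignee_id TEXT, grant_date TEXT)"
    )
    reader = csv.DictReader(tsv_file, delimiter="\t")
    rows = ((r["patent_id"], r["assignee_id"], r["grant_date"]) for r in reader)
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT OR REPLACE INTO patent VALUES (?, ?, ?)", rows)


def patents_per_assignee(conn):
    """Count patents for each disambiguated assignee identifier."""
    cursor = conn.execute(
        "SELECT assignee_id, COUNT(*) FROM patent "
        "GROUP BY assignee_id ORDER BY COUNT(*) DESC, assignee_id"
    )
    return cursor.fetchall()


# Made-up sample data standing in for a bulk TSV file.
sample = io.StringIO(
    "patent_id\tassignee_id\tgrant_date\n"
    "10000001\ta1\t2018-06-19\n"
    "10000002\ta1\t2018-06-19\n"
    "10000003\ta2\t2018-06-26\n"
)
conn = sqlite3.connect(":memory:")
load_tsv(conn, sample)
```

Grouping on the assignee identifier rather than the raw company name is what lets spelling variants of one company count toward a single profile, to the extent the upstream disambiguation is correct.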

Update Schedule

PatentsView typically releases updated bulk data on a quarterly basis. We update PlainPatent within 30 days of each new release. There is an inherent lag of several months between patent grant dates and their appearance in the PatentsView dataset, as USPTO data must pass through disambiguation and quality assurance processes before release.

Known Limitations

Users should keep the following limitations in mind when using PlainPatent data. We track only US utility patents; design patents, plant patents, and foreign patents are excluded. Because a patent is typically granted 2-3 years after its application is filed, the data reflects past R&D decisions rather than current activity. Disambiguation is imperfect: a single company may be split across several assignee records, or distinct companies may be merged into one. For detailed analysis of these limitations and guidance on interpretation, see our editorial guides.