About PatentWorld
By Saerom (Ronnie) Lee, The Wharton School, University of Pennsylvania
PatentWorld is an interactive data exploration project that visualizes 50 years of global innovation through US patent data. Our goal is to make the rich, complex world of patents accessible and engaging through data-driven storytelling.
About the Author
Saerom (Ronnie) Lee is an Assistant Professor of Management at The Wharton School, University of Pennsylvania, where he studies organizational design, human capital, and high-growth entrepreneurship. He built PatentWorld to make half a century of US patent data accessible and interactive for students, researchers, policymakers, and the public.
Reach him at saeroms@upenn.edu. Feedback, collaboration ideas, and feature requests are welcome.
Explore the Chapters
PatentWorld is organized into 14 chapters, each exploring a different dimension of the US patent system:
- The Innovation Landscape — How has the pace and nature of patenting changed since 1976?
- The Technology Revolution — Which technologies are rising, and which are fading?
- Who Innovates? — From IBM to Samsung: who holds the patents, and how has that changed?
- The Inventors — Team sizes, gender trends, and the most prolific inventors.
- The Geography of Innovation — Innovation hubs from Silicon Valley to Shenzhen.
- Collaboration Networks — How do firms and inventors collaborate? Network analysis reveals the hidden structure of innovation.
- The Knowledge Network — Citations, government funding, and the flow of knowledge.
- Innovation Dynamics — Grant lag, cross-domain convergence, global collaboration, and the velocity of innovation.
- Patent Quality — Forward citations, originality, generality, and other dimensions of patent quality over 50 years.
- Patent Law & Policy — Major legislation, Supreme Court decisions, and policy changes that have shaped the US patent system.
- The Green Innovation Race — Green patents grew from a trickle to a torrent. Who leads the race to decarbonize, and where is AI meeting climate?
- Artificial Intelligence — How AI patenting has evolved from early expert systems to the deep learning and generative AI era.
- The Language of Innovation — Topic modeling and semantic analysis of 9.36 million patent abstracts reveal emerging themes, technology convergence, and the novelty of invention.
- Company Innovation Profiles — A deep dive into the innovation trajectories, portfolio strategies, and competitive positioning of the world's most prolific patent filers.
Data Source
All data comes from PatentsView, a patent data platform supported by the United States Patent and Trademark Office (USPTO). PatentsView provides disambiguated and linked patent data covering:
- 9.36 million granted patents (1976–2025)
- Disambiguated inventor and assignee identities
- Cooperative Patent Classification (CPC) technology categories
- WIPO technology field classifications
- Geographic location data for inventors
- Patent citation networks
- Government interest statements
Key Metrics Defined
- Forward Citations: The number of times a patent is cited by later patents. Widely used as a proxy for patent impact and technological importance.
- Originality: Measures how broadly a patent draws on prior art across different technology classes. Based on a Herfindahl index of backward citation classes, conventionally reported as one minus the index so that higher values indicate more diverse knowledge sources.
- Generality: Measures how broadly a patent is cited across different technology classes. Based on a Herfindahl index of forward citation classes, conventionally reported as one minus the index so that higher values indicate wider downstream influence.
- Grant Lag (Pendency): The time between a patent application's filing date and the date the patent is granted, measured in days or years.
- Herfindahl-Hirschman Index (HHI): A measure of market concentration calculated as the sum of squared market shares, with shares expressed as percentages. Ranges from near 0 (perfectly fragmented) to 10,000 (monopoly).
- CPC (Cooperative Patent Classification): A hierarchical classification system jointly managed by the USPTO and EPO. Sections include A (Human Necessities) through H (Electricity), plus Y (Cross-Sectional).
- WIPO Technology Fields: A classification of patents into 35 technology fields grouped into 5 sectors: Electrical engineering, Instruments, Chemistry, Mechanical engineering, and Other fields.
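The Herfindahl-based metrics and grant lag defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not necessarily the exact computation behind the site's charts: it assumes originality and generality are reported as one minus the Herfindahl index, and the function names and sample inputs are hypothetical.

```python
from collections import Counter
from datetime import date


def herfindahl(counts):
    """Sum of squared shares, with shares expressed as fractions (0 to 1)."""
    counts = list(counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one observation is required")
    return sum((c / total) ** 2 for c in counts)


def diversity_index(citation_classes):
    """One minus the Herfindahl index of the classes in a citation list.

    Pass backward-citation classes for originality, or
    forward-citation classes for generality (assumed convention).
    """
    return 1.0 - herfindahl(Counter(citation_classes).values())


def hhi(counts):
    """HHI on the 0 to 10,000 scale (shares expressed as percentages)."""
    return 10_000 * herfindahl(counts)


def grant_lag_days(filing_date, grant_date):
    """Days between application filing and patent grant."""
    return (grant_date - filing_date).days


# Hypothetical patent citing four earlier patents in three CPC sections.
originality = diversity_index(["G", "G", "H", "A"])

# Hypothetical field in which four assignees hold 40, 30, 20, and 10 patents.
concentration = hhi([40, 30, 20, 10])

# Hypothetical filing and grant dates.
lag = grant_lag_days(date(2018, 3, 1), date(2020, 9, 15))
```

In this example the citation shares are 0.5, 0.25, and 0.25, so originality comes out at 0.625, and the assignee shares of 40%, 30%, 20%, and 10% give an HHI of about 3,000.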
Methodology
We downloaded the raw data as tab-separated values (TSV) files from PatentsView's bulk data downloads. We then processed these files using DuckDB, an analytical SQL database engine, to compute aggregated statistics for each visualization.
Key processing steps include:
- Joining patent records with inventor, assignee, location, and classification data
- Aggregating by year, technology category, geography, and organization
- Computing derived metrics: citation counts, team sizes, concentration ratios, diversity indices
- Filtering to primary classifications (sequence = 0) to avoid double-counting
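The join, filter, and aggregate steps above can be illustrated with a small SQL query. This sketch uses Python's built-in sqlite3 module as a stand-in for DuckDB so that it runs without extra dependencies. The table names, column names, and sample rows are hypothetical and do not reflect the actual PatentsView schema.

```python
import sqlite3

# In-memory database with two toy tables (hypothetical schema).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE patent (patent_id TEXT, grant_year INTEGER);
    CREATE TABLE cpc (patent_id TEXT, cpc_section TEXT, sequence INTEGER);
    INSERT INTO patent VALUES ('P1', 2020), ('P2', 2020), ('P3', 2021);
    INSERT INTO cpc VALUES
        ('P1', 'G', 0), ('P1', 'H', 1),
        ('P2', 'H', 0),
        ('P3', 'G', 0), ('P3', 'A', 1), ('P3', 'H', 2);
""")

# Join patents to their classifications, keep only the primary one
# (sequence = 0), and count patents per year and CPC section.
rows = con.execute("""
    SELECT p.grant_year, c.cpc_section, COUNT(*) AS patent_count
    FROM patent AS p
    JOIN cpc AS c ON c.patent_id = p.patent_id
    WHERE c.sequence = 0
    GROUP BY p.grant_year, c.cpc_section
    ORDER BY p.grant_year, c.cpc_section
""").fetchall()

# Each patent contributes exactly once, so the counts sum to the
# number of patents rather than the number of classification rows.
total = sum(count for _, _, count in rows)
```

Without the `sequence = 0` filter, the three toy patents would be counted six times, once per classification row, which is the double-counting the filter is meant to prevent.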
Data Limitations
- Granted patents only: The dataset includes only granted patents, not applications that were abandoned or rejected. This introduces survivorship bias.
- US patents only: The analysis covers patents granted by the USPTO. It does not include patents filed only at foreign patent offices (EPO, JPO, CNIPA, etc.).
- Inventor disambiguation: PatentsView uses algorithmic disambiguation to link inventor records across patents. Some errors in matching or splitting inventor identities may exist.
- Citation truncation: Recently granted patents have had less time to accumulate forward citations, creating a right-truncation bias in citation-based metrics.
- Classification changes: The CPC system was introduced in 2013, replacing the earlier USPC system. Historical patents were retrospectively reclassified, but some inconsistencies may remain.
- Gender inference: Inventor gender is inferred from first names and may not reflect actual gender identity. Non-binary identities are not captured.
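One common way to reduce the citation truncation described above is to count forward citations only within a fixed window after grant, so that older and newer patents are compared over the same horizon. The sketch below illustrates that general idea. It is not necessarily the approach used in PatentWorld, and the function name, the five-year default, and the sample dates are assumptions.

```python
from datetime import date


def citations_within_window(grant_date, citing_dates, years=5):
    """Count forward citations received within `years` of the grant date."""
    try:
        cutoff = grant_date.replace(year=grant_date.year + years)
    except ValueError:
        # Grant date is Feb 29 and the cutoff year is not a leap year.
        cutoff = grant_date.replace(year=grant_date.year + years, day=28)
    return sum(1 for d in citing_dates if grant_date <= d <= cutoff)


# Hypothetical patent granted in 2010 and cited four times afterwards.
citing = [
    date(2011, 1, 10),
    date(2014, 12, 31),
    date(2015, 6, 1),
    date(2019, 3, 3),
]
five_year_citations = citations_within_window(date(2010, 6, 1), citing)
```

A fixed window only helps for patents whose full window falls inside the data. Patents granted fewer than `years` before the end of the dataset still have incomplete counts.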
Technology
The website is built with Next.js 14 and uses Recharts for interactive visualizations. All data is pre-computed and served as static JSON files, requiring no backend server. The design uses Tailwind CSS with dark/light theme support.
AI-Assisted Development
The data analyses, visualizations, and website development for PatentWorld were conducted with the assistance of Claude AI (Anthropic). Claude was used for data pipeline development, statistical computations, analytical writing, and front-end implementation. All analytical insights and interpretations were reviewed for accuracy and scholarly rigor.
Attribution
Data attribution: PatentsView (www.patentsview.org), USPTO.
PatentsView is a tool built to increase the usability and transparency of US patent data. The database is derived from the USPTO examination and granting of patents.