Publication | Closed Access
Big Data from Pharmaceutical Patents: A Computational Analysis of Medicinal Chemists’ Bread and Butter
Citations: 431
References: 36
Year: 2016
Recent studies aim to map the medicinal chemist’s toolbox. This study investigates chemical reactions and molecules from U.S. patents spanning 1976–2015. A text‑mining pipeline extracted 1.15 million unique reaction schemes from patents and classified them into known types using an expert system.
Multiple recent studies have focused on unraveling the content of the medicinal chemist's toolbox. Here, we present an investigation of chemical reactions and molecules retrieved from U.S. patents over the past 40 years (1976–2015). We used a sophisticated text-mining pipeline to extract 1.15 million unique whole reaction schemes, including reaction roles and yields, from pharmaceutical patents. The reactions were assigned to well-known reaction types such as Wittig olefination or Buchwald–Hartwig amination using an expert system. Analyzing the evolution of reaction types over time, we observe the previously reported bias toward reaction classes like amide bond formations or Suzuki couplings. Our study also shows a steady increase in the number of different reaction types used in pharmaceutical patents but a trend toward lower median yield for some of the reaction classes. Finally, we found that today's typical product molecule is larger, more hydrophobic, and more rigid than 40 years ago.
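The time-resolved analysis described in the abstract (number of distinct reaction types per year, median yield per reaction class) can be illustrated with a minimal sketch. This is not the authors' pipeline: the record layout `(year, reaction_class, yield_percent)`, the function names, and the toy values are all assumptions made up for illustration, and none of the numbers come from the study.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records: (year, reaction_class, yield_percent).
# These values are invented for illustration and are NOT data from the study.
records = [
    (1980, "amide bond formation", 85.0),
    (1980, "amide bond formation", 75.0),
    (2010, "amide bond formation", 60.0),
    (2010, "amide bond formation", 70.0),
    (2010, "Suzuki coupling", 55.0),
]


def median_yield_by_class_and_year(rows):
    """Group yields by (reaction class, year) and return each group's median."""
    groups = defaultdict(list)
    for year, reaction_class, yield_percent in rows:
        groups[(reaction_class, year)].append(yield_percent)
    return {key: median(values) for key, values in groups.items()}


def distinct_classes_per_year(rows):
    """Count how many different reaction classes appear in each year."""
    seen = defaultdict(set)
    for year, reaction_class, _ in rows:
        seen[year].add(reaction_class)
    return {year: len(classes) for year, classes in seen.items()}


medians = median_yield_by_class_and_year(records)
diversity = distinct_classes_per_year(records)
```

With real extracted reactions in place of the toy records, `medians` would expose per-class yield trends over time and `diversity` would track how many reaction types are in use each year.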