The dataset
An always-on knowledge graph that ingests trial-registry updates, press releases, regulatory filings, conference abstracts, and publications, then structures them into drugs, companies, trials, deals, and biological targets.
Figures are approximate and grow continuously as new evidence is published.
We extract safety and efficacy results from every trial that has disclosed them, whether in a paper, a poster, a press release, or an investor deck, rather than only the results posted to registries. That is coverage no other platform matches.
Every source is parsed into detailed ontologies the moment it appears, so drugs, trials, endpoints, and biomarkers stay linked and instantly findable, with a citation behind each data point.
The graph updates in real time, so competitive landscapes and risk assessments reflect what happened this week, not last quarter.
Our trial-outcome models are evaluated out-of-sample against trials whose actual outcomes are now known. On completion-date forecasting, model predictions land a median of 47 days closer to the true completion date than the sponsor's own posted estimate, across the held-out evaluation set. Probability-of-success estimates are reported with calibration curves so you can see how predicted probabilities track observed outcomes, not just a single score.
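For readers who want to see what those two metrics look like in practice, here is a minimal sketch in Python. It is illustrative only and is not our production evaluation code. It assumes "days closer" is computed per trial as the sponsor's absolute error minus the model's absolute error, then summarized with a median, and it assumes calibration uses equal-width probability bins. The function names (median_days_closer, calibration_curve) and the input layout are hypothetical.

```python
from datetime import date
from statistics import median


def median_days_closer(trials):
    """Median per-trial improvement of the model over the sponsor estimate.

    `trials` is an iterable of (sponsor_estimate, model_prediction, actual)
    tuples of datetime.date. A positive result means the model was closer.
    Assumption: improvement is measured per trial, then the median is taken.
    """
    gains = []
    for sponsor, model, actual in trials:
        sponsor_err = abs((sponsor - actual).days)
        model_err = abs((model - actual).days)
        gains.append(sponsor_err - model_err)
    return median(gains)


def calibration_curve(probs, outcomes, n_bins=10):
    """Group predictions into equal-width bins and compare with outcomes.

    Returns one (mean_predicted_probability, observed_success_rate, count)
    tuple per non-empty bin. `outcomes` holds 1 for success and 0 for failure.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    curve = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        observed = sum(y for _, y in members) / len(members)
        curve.append((mean_p, observed, len(members)))
    return curve
```

A well-calibrated model produces bins where the mean predicted probability and the observed success rate sit close together. That comparison is what a calibration curve plots.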
We'll walk through the methodology, the evaluation set, and the limitations in detail on a call. We'd rather show our work than ask you to take the number on faith.