The dataset

The most comprehensive clinical-evidence dataset in biotech

An always-on knowledge graph that ingests trial-registry updates, press releases, regulatory filings, conference abstracts, and publications, then structures them into drugs, companies, trials, deals, and biological targets.

1M+ Clinical trials
400k+ Trial results
100k+ Drugs & compounds
120k+ Companies
23k+ Biological targets
100k+ Deals

Figures are approximate and grow continuously as new evidence is published.

Results no one else has

We extract safety and efficacy results from every trial that has disclosed them, whether in a paper, a poster, a press release, or an investor deck, not just what registries report. That is coverage no other platform matches.

Structured, not scraped

Every source is parsed into detailed ontologies the moment it appears, so drugs, trials, endpoints, and biomarkers stay linked and instantly findable, with a citation behind each data point.

Always current

The graph updates in real time as new evidence is published, so competitive landscapes and risk assessments reflect what happened this week, not last quarter.

How we measure forecasting accuracy

Our trial-outcome models are evaluated out-of-sample against trials whose actual outcomes are now known. On completion-date forecasting, model predictions land a median of 47 days closer to the true completion date than the sponsor's own posted estimate, across the held-out evaluation set. Probability-of-success estimates are reported with calibration curves so you can see how predicted probabilities track observed outcomes, not just a single score.
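
For readers who want to see what those two measurements mean in practice, here is a minimal sketch of how a "days closer" figure and a calibration table could be computed. It is an illustration under stated assumptions, not our production evaluation code: the function names, the absolute-error definition of "closer", the equal-width probability bins, and the toy data are all hypothetical.

```python
from datetime import date
from statistics import median


def days_closer(actual, sponsor_estimate, model_estimate):
    """Days by which the model's date beats the sponsor's date.

    Assumption: "closer" is the difference in absolute error, in days.
    Positive means the model was closer; negative means the sponsor was.
    """
    sponsor_error = abs((sponsor_estimate - actual).days)
    model_error = abs((model_estimate - actual).days)
    return sponsor_error - model_error


def median_days_closer(trials):
    """Median improvement over (actual, sponsor_estimate, model_estimate) tuples."""
    return median(days_closer(*trial) for trial in trials)


def calibration_table(predictions, n_bins=5):
    """Group (predicted_probability, succeeded) pairs into equal-width bins.

    Returns one row per non-empty bin:
    (bin_low, bin_high, mean_predicted, observed_success_rate, count).
    A well-calibrated model has mean_predicted close to observed_success_rate.
    """
    bins = [[] for _ in range(n_bins)]
    for probability, succeeded in predictions:
        # A probability of exactly 1.0 belongs in the last bin.
        index = min(int(probability * n_bins), n_bins - 1)
        bins[index].append((probability, succeeded))

    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue
        mean_predicted = sum(p for p, _ in members) / len(members)
        observed_rate = sum(1 for _, ok in members if ok) / len(members)
        rows.append((i / n_bins, (i + 1) / n_bins,
                     mean_predicted, observed_rate, len(members)))
    return rows


# Toy, made-up data for illustration only (not real trials or real results).
toy_trials = [
    (date(2023, 6, 1), date(2023, 3, 1), date(2023, 5, 15)),
    (date(2023, 9, 30), date(2023, 9, 1), date(2023, 10, 10)),
    (date(2024, 1, 15), date(2024, 1, 10), date(2024, 1, 25)),
]
toy_predictions = [(0.10, False), (0.15, False),
                   (0.90, True), (0.85, True), (0.80, False)]

if __name__ == "__main__":
    print(median_days_closer(toy_trials))
    for row in calibration_table(toy_predictions):
        print(row)
```

In this sketch the third toy trial is one where the sponsor's estimate was closer, which is why a median over the whole held-out set is more informative than any single trial. The calibration table is the tabular form of a calibration curve: plotting mean predicted probability against observed success rate for each bin gives the curve itself.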

We'll walk through the methodology, the evaluation set, and the limitations in detail on a call. We'd rather show our work than ask you to take the number on faith.

See it on your own pipeline