At a time when LLMs can and do hallucinate (including making up data), using accurate, scientifically valid, and properly sourced academic and government data is crucial to an organization’s success.
On the one hand, there are many medical device and pharma companies located outside the US that want to enter or expand into the US market. These companies need to quickly understand the US BioTech market and:
On the other hand, there are massive amounts of Open Data available across many federal government agencies that can answer those companies’ questions. The challenge is that this data is spread across the websites of multiple agencies, such as CMS (“Centers for Medicare & Medicaid Services”), FDA (“Food and Drug Administration”), NLM (“National Library of Medicine”), and many others.
Our solution is to build an integrated repository of 30+ years of Open Data sourced from US agencies. Our goal is to help users gain a 360-degree view from compound to compliance to commerce. This view helps users make better decisions faster, accelerating their progress in the US market.
Our solution adds unique analytics, customized alerting and dashboards to deliver actionable regulatory foresight to users. Our proprietary data-enrichment pipeline enhances the free APIs and raw data downloads available from the agencies’ websites.
This is a done-for-you service that does not require support from an organization’s IT resources.
The repository can be deployed in a dedicated instance (either AWS or Azure) for organizations that need their users to use this data with total privacy and under tight access control rules.
Every pharmaceutical product starts with an active ingredient, in the form of a chemical compound.
We download and integrate the National Library of Medicine’s PubChem collection:
At the time of writing, PubChem includes:
This chemical compound data allows users to enhance their Computer-Aided Drug Design (“CADD”) pipelines.
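As a rough illustration of how compound data might feed a CADD workflow, the sketch below builds a lookup URL for PubChem’s public PUG REST interface and extracts the property records from a response. It is not our ingestion pipeline, which works from bulk downloads. The helper names `compound_property_url` and `extract_properties` are hypothetical, and the response layout (`PropertyTable` / `Properties`) is assumed from PubChem’s documented JSON format.

```python
from urllib.parse import quote

# Base address of PubChem's PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(name, properties):
    """Build a PUG REST URL that looks up properties of a compound by name.

    Hypothetical helper: it only assembles the URL and performs no request.
    """
    encoded_name = quote(name, safe="")  # escape spaces, slashes, etc.
    property_list = ",".join(properties)
    return (
        f"{PUG_REST_BASE}/compound/name/{encoded_name}"
        f"/property/{property_list}/JSON"
    )


def extract_properties(payload):
    """Return the list of property records from a decoded JSON response.

    Assumes the records sit under PropertyTable -> Properties; returns an
    empty list if that structure is missing.
    """
    return payload.get("PropertyTable", {}).get("Properties", [])
```

A caller would fetch the URL with any HTTP client, decode the JSON body, and pass it to `extract_properties` before handing the records to downstream modeling code.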
There are millions of BioTech-related articles published in academic journals worldwide.
We download and integrate the National Library of Medicine’s PubMed collection:
Users can search across PubMed using a richer, simpler interface than what NLM currently provides.
Most new pharmaceutical products and many medical devices need to go through a clinical trial.
ClinicalTrials.gov contains information on over 585,000 clinical trials worldwide, of which over 191,000 have at least one site in the US.
This information can be useful to users searching for either sites or Principal Investigators (“PIs”) with experience in particular disease states.
ClinicalTrials.gov also provides documents such as the protocol, Statistical Analysis Plan (“SAP”), and Informed Consent Form (“ICF”) for many studies, totaling over 60,000 individual PDFs.
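To illustrate the site search mentioned above, here is a minimal sketch that ranks US facilities by how many studies of a given condition they appear in. It assumes study records shaped like the JSON returned by the ClinicalTrials.gov API (`protocolSection`, `conditionsModule`, `contactsLocationsModule`). Those field names are assumptions that should be checked against the current API documentation, and `rank_us_sites` is a hypothetical helper rather than part of our product.

```python
from collections import Counter


def rank_us_sites(studies, condition):
    """Count how many matching studies list each US facility.

    `studies` is a list of dicts in the assumed ClinicalTrials.gov layout.
    A study matches when `condition` appears (case-insensitively) in any of
    its listed conditions. Returns (facility, study_count) pairs, most
    frequent first.
    """
    needle = condition.lower()
    counts = Counter()
    for study in studies:
        protocol = study.get("protocolSection", {})
        conditions = protocol.get("conditionsModule", {}).get("conditions", [])
        if not any(needle in c.lower() for c in conditions):
            continue
        locations = protocol.get("contactsLocationsModule", {}).get("locations", [])
        # A set ensures a facility listed twice in one study counts once.
        facilities = {
            loc.get("facility")
            for loc in locations
            if loc.get("country") == "United States" and loc.get("facility")
        }
        counts.update(facilities)
    return counts.most_common()
```

The same counting approach could be applied to investigator names instead of facilities, provided the records carry them.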
The European Union has a similar site, available at https://www.clinicaltrialsregister.eu
The site provides the protocol, the trial results, or both for many studies.
The FDA releases large amounts of regulatory information to the public.
For medical devices:
For pharmaceutical products:
This information helps users to review competitors’ regulatory documents for insights and guidance.
Sources:
https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/pmn.cfm
The FDA regularly releases datasets with all adverse events reported to the agency.
These datasets cover both medical devices (“MAUDE”) and pharmaceutical products.
Users can navigate through the adverse events data and compile a dossier for each of their competitors. These dossiers can help guide the test plans for new drugs and medical devices.
Sources:
European Union:
https://www.adrreports.eu/en/search.html
US:
https://fis.fda.gov/extensions/FPD-QDE-FAERS/FPD-QDE-FAERS.html
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfMAUDE/search.CFM
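As a simple illustration of the per-competitor dossier idea described above, the sketch below groups adverse event reports by manufacturer and tallies the reported events. The flat record layout (`manufacturer` and `event` keys) is a hypothetical simplification for the example. The real FDA and EU datasets have richer, nested structures that would need to be flattened first.

```python
from collections import Counter, defaultdict


def build_dossiers(reports):
    """Summarize adverse event reports per manufacturer.

    `reports` is an iterable of dicts with hypothetical keys
    'manufacturer' and 'event'. Returns a dict mapping each manufacturer
    to a list of (event, count) pairs, most frequent first.
    """
    tallies = defaultdict(Counter)
    for report in reports:
        tallies[report["manufacturer"]][report["event"]] += 1
    return {
        manufacturer: events.most_common()
        for manufacturer, events in tallies.items()
    }
```

A summary like this could serve as a starting point for deciding which failure modes or side effects deserve attention in a new product’s test plan.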
Medicare regularly discloses all payments made under its several programs.
Users can analyze this data to extract Medicare reimbursement patterns and to create lists of Key Opinion Leaders (“KOLs”) to run studies in the US.
The individual datasets include:
Medicare publishes yearly reports on payments received by providers from device and drug manufacturers.
This information can help users identify specific providers that receive payments from the users’ competitors.
The US federal government is probably the single largest buyer of goods and services in the world. Most of its purchasing information is publicly available.
This dataset covers the broad US economy, not just BioTech. Still, having access to information about government grants and purchases can be very useful to users in understanding the flow of federal money through specific companies, providers, and individuals.
The area where we’re spending the most time is building friendly, easy-to-use dashboards that summarize all the information relevant to the user’s query.
We’re also building an alerting mechanism that allows users to create highly targeted alerts to monitor specific topics, words, drug or device names, and manufacturer names.
This is where we’ll benefit the most from users’ feedback.
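As a rough illustration of the alerting idea described above, the sketch below checks incoming text against user-defined rules and reports which monitored terms appear. `AlertRule`, `matched_terms`, and `triggered_alerts` are hypothetical names for this example only. The sketch assumes simple case-insensitive whole-word matching and does not describe how our alerting is actually implemented.

```python
import re
from dataclasses import dataclass


@dataclass
class AlertRule:
    """A named alert with the terms (words or phrases) it monitors."""
    name: str
    terms: list


def matched_terms(rule, text):
    """Return the rule's terms that occur in `text` as whole words or phrases."""
    hits = []
    for term in rule.terms:
        # Lookarounds keep 'stent' from matching inside 'stents'.
        pattern = rf"(?<!\w){re.escape(term)}(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):
            hits.append(term)
    return hits


def triggered_alerts(rules, text):
    """Map each triggered rule's name to the terms that matched."""
    results = {}
    for rule in rules:
        hits = matched_terms(rule, text)
        if hits:
            results[rule.name] = hits
    return results
```

Exact matching keeps alerts narrow. Whether plural forms, synonyms, or brand and generic name pairs should also trigger a rule is the kind of choice where user feedback matters.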
Please contact me if you’d like to enroll in our alpha user program.
Please contact us if your organization has a data management challenge you need assistance with.