
© 2020-2026 Data Santander, SL :: Privacy Policy :: Hand-crafted HTML code. :: v. 26 May 2026 JCL

Understand the US BioTech Market through Open Data

At a time when LLMs can and do hallucinate (including making data up), using accurate, scientifically valid, and properly sourced academic and government data is crucial to an organization’s success.

On the one hand, there are many medical devices and pharma companies located outside the US that want to enter or expand into the US market. These companies need to quickly understand the US BioTech market and:

On the other hand, massive amounts of Open Data are available across many federal government agencies, and this data can answer those companies’ questions. The challenge is that it is spread across multiple agencies’ websites, such as CMS (“Centers for Medicare & Medicaid Services”), FDA (“Food and Drug Administration”), NLM (“National Library of Medicine”), and many others.

Our solution is to build an integrated repository of 30+ years of Open Data sourced from US agencies. Our goal is to help users gain a 360-degree view from compound to compliance to commerce. This view helps users make better decisions faster, accelerating their progress in the US market.

Our solution adds unique analytics, customized alerting and dashboards to deliver actionable regulatory foresight to users. Our proprietary data-enrichment pipeline enhances the free APIs and raw data downloads available from the agencies’ websites.

This is a done-for-you service that does not require support from an organization’s IT resources.

The repository can be deployed in a dedicated instance (either AWS or Azure) for organizations that need their users to use this data with total privacy and under tight access control rules.

Stage 01: Chemical Compounds

Every pharmaceutical product starts with an active ingredient, in the form of a chemical compound.

We download and integrate the National Library of Medicine’s PubChem collection:

PubChem® is the world's largest collection of freely accessible chemical information. Search chemicals by name, molecular formula, structure, and other identifiers. Find chemical and physical properties, biological activities, safety and toxicity information, patents, literature citations and more.

Source

At the time of writing, PubChem includes:

This chemical compound data allows users to enhance their Computer-Aided Drug Design (“CADD”) pipelines.
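For readers who want to see what raw access to PubChem looks like before any integration, here is a minimal sketch that builds a lookup URL for NLM's public PUG REST interface. It only constructs the URL and makes no network call. The function name `compound_property_url` and the default property list are illustrative assumptions, not part of our pipeline.

```python
from urllib.parse import quote

# Base address of PubChem's public PUG REST service.
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def compound_property_url(name, properties=("MolecularFormula", "MolecularWeight")):
    """Build a PUG REST URL that requests properties of a compound by name.

    `name` is percent-encoded so that names containing spaces stay valid.
    The function name and defaults are hypothetical, chosen for illustration.
    """
    encoded_name = quote(name, safe="")
    property_list = ",".join(properties)
    return f"{PUG_REST_BASE}/compound/name/{encoded_name}/property/{property_list}/JSON"


url = compound_property_url("aspirin")
print(url)
```

Fetching that URL (for example with `urllib.request.urlopen`) returns a JSON document with the requested properties. An integrated repository removes the need to issue such calls one compound at a time.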

Stage 02: Biomedical Literature

There are millions of BioTech-related articles published in academic journals worldwide.

We download and integrate the National Library of Medicine’s PubMed collection:

PubMed® comprises more than 40 million citations for biomedical literature from MEDLINE, life science journals, and online books. Citations may include links to full text content from PubMed Central and publisher web sites.

Source

Users can search across PubMed using a richer, simpler interface than what NLM currently provides.

Stage 03: Clinical Trials

Most new pharmaceutical products and many medical devices need to go through a clinical trial.

ClinicalTrials.gov contains information on over 585,000 clinical trials worldwide, and over 191,000 of those had at least one site in the US.

ClinicalTrials.gov is a website and online database of clinical research studies and information about their results. The purpose of ClinicalTrials.gov is to provide information about clinical research studies to the public, researchers, and health care professionals. The U.S. government does not review or approve the safety and science of all studies listed on this website.

Source

This information can be useful to users searching for either sites or Principal Investigators (“PIs”) with experience in particular disease states.

ClinicalTrials.gov also provides documents such as the protocol, Statistical Analysis Plan (“SAP”), and Informed Consent Form (“ICF”) for many studies, totaling over 60,000 individual PDFs.

The European Union has a similar site, available at https://www.clinicaltrialsregister.eu

The site provides the protocol, the trial results, or both for many studies.

Stage 04: Regulatory Approvals

The FDA releases large amounts of regulatory information to the public.

For medical devices:

For pharmaceutical products:

This information helps users to review competitors’ regulatory documents for insights and guidance.

Sources:
https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/pmn.cfm

Stage 05: Adverse Events

The FDA regularly releases datasets with all adverse events reported to the agency.

These datasets cover both medical devices (“MAUDE”) and pharmaceutical products (“FAERS”).

Users can navigate through the adverse events data and compile a dossier for each of their competitors. These dossiers can help guide the test plans for new drugs and medical devices.
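As a rough illustration of what compiling such a dossier involves, the sketch below groups a handful of adverse event records by manufacturer and counts each product and event pair. The records and field names (`manufacturer`, `product`, `event`) are made up for this example. Real FAERS and MAUDE exports use their own schemas.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made records for illustration only.
# Real FAERS / MAUDE files use different field names and many more columns.
events = [
    {"manufacturer": "Acme Pharma", "product": "DrugA", "event": "nausea"},
    {"manufacturer": "Acme Pharma", "product": "DrugA", "event": "nausea"},
    {"manufacturer": "Acme Pharma", "product": "DrugA", "event": "rash"},
    {"manufacturer": "Beta Devices", "product": "PumpX", "event": "battery failure"},
]


def build_dossiers(records):
    """Group adverse events by manufacturer, counting each (product, event) pair."""
    dossiers = defaultdict(Counter)
    for record in records:
        key = (record["product"], record["event"])
        dossiers[record["manufacturer"]][key] += 1
    return dossiers


dossiers = build_dossiers(events)
for manufacturer in sorted(dossiers):
    print(manufacturer)
    # most_common() lists the most frequently reported pairs first.
    for (product, event), count in dossiers[manufacturer].most_common():
        print(f"  {product}: {event} (reports: {count})")
```

The most frequently reported events per competitor product are natural candidates to cover in a test plan.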

Sources:
European Union:
https://www.adrreports.eu/en/search.html
US:
https://fis.fda.gov/extensions/FPD-QDE-FAERS/FPD-QDE-FAERS.html
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfMAUDE/search.CFM

Stage 06: Medicare Payments

Medicare regularly discloses all payments made under its several programs.

Summary Statistics on Use and Payments
This series of public data files summarizes use and payments for Medicare and Medicaid. Included are Medicare reports by type of service, Medicare premium reports, Medicare geographic comparisons, spending for Medicare and Medicaid by drug, Medicare and Medicaid opioid prescribing rates, and program integrity market saturation by type of service.

Source

Users can analyze this data to extract Medicare reimbursement patterns and to create lists of Key Opinion Leaders (“KOLs”) to run studies in the US.

The individual datasets include:

Stage 07: Vendor Payments

Through its Open Payments program, CMS publishes yearly reports on payments received by providers from device and drug manufacturers.

The mission of the program is to provide the public with a more transparent health care system.
Open Payments collects and publishes information about financial relationships between drug and medical device companies (referred to as "reporting entities") and certain health care providers (referred to as "covered recipients"). These relationships may involve payments to providers for things including but not limited to research, meals, travel, gifts or speaking fees.
All information available on the Open Payments database is open to personal interpretation and if there are questions about the data, patients and their advocates should speak directly to the health care provider for a better understanding.

Source

This information can help users to identify specific providers that are financially supported by users’ competitors.
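A minimal sketch of that kind of lookup, assuming payment records have already been loaded into simple dictionaries: it totals payments per provider for a chosen set of competitor companies. The field names (`provider`, `company`, `amount_usd`) and the sample values are hypothetical and do not match the Open Payments file layout.

```python
from collections import defaultdict

# Hypothetical sample records; the real Open Payments files use other column names.
payments = [
    {"provider": "Dr. Lee", "company": "CompetitorCo", "amount_usd": 1200.0},
    {"provider": "Dr. Lee", "company": "CompetitorCo", "amount_usd": 300.0},
    {"provider": "Dr. Kim", "company": "OtherCo", "amount_usd": 500.0},
    {"provider": "Dr. Kim", "company": "CompetitorCo", "amount_usd": 250.0},
]


def providers_paid_by(records, companies):
    """Total payments per provider from the given companies, largest total first."""
    totals = defaultdict(float)
    for record in records:
        if record["company"] in companies:
            totals[record["provider"]] += record["amount_usd"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = providers_paid_by(payments, {"CompetitorCo"})
for provider, total in ranking:
    print(f"{provider}: ${total:,.2f}")
```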

Stage 08: Federal Payments

The US federal government is probably the single largest buyer of goods and services in the world. Most of its purchasing information is publicly available.

USAspending.gov links data from many government systems, including agency financial systems and government-wide award systems.

Source

This dataset covers the broad US economy, not just BioTech. Still, having access to information about government grants and purchases can be very useful to users in understanding the flow of federal money through specific companies, providers, and individuals.

Analytics, Alerts and Dashboards

The area where we’re spending the most time is building friendly, easy-to-use dashboards that summarize all the information relevant to the user’s query.

We’re also building an alerting mechanism that lets users create highly targeted alerts to monitor specific topics, words, drug / device names, and/or manufacturer names.

This is where we’ll benefit the most from users’ feedback.
Please contact me if you’d like to enroll in our alpha user program.
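To make the alerting idea above concrete, here is a small sketch of keyword-based alert matching. It is an illustration under simplifying assumptions (whole-word, case-insensitive matching on plain text), not a description of our production mechanism. The `Alert` class and `triggered_alerts` function are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Alert:
    """A named alert that fires when any of its terms appears in a text."""
    name: str
    terms: list

    def matches(self, text):
        # Whole-word, case-insensitive match; re.escape keeps terms literal.
        return any(
            re.search(rf"\b{re.escape(term)}\b", text, flags=re.IGNORECASE)
            for term in self.terms
        )


def triggered_alerts(alerts, text):
    """Return the names of all alerts whose terms occur in the text."""
    return [alert.name for alert in alerts if alert.matches(text)]


alerts = [
    Alert("coronary devices", ["stent", "Acme Medical"]),
    Alert("insulin delivery", ["insulin pump"]),
]

notice = "Recall notice: ACME Medical coronary stent, model 12"
print(triggered_alerts(alerts, notice))
```

Whole-word matching avoids firing on substrings of longer words, at the cost of missing plurals and spelling variants. That is one of the trade-offs where user feedback is most useful.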

Contact us

Please contact us if your organization has a data management challenge you need assistance with.