BioAtlas
490+ million rows integrating genetics, expression, perturbations, drugs, and disease into one harmonized, queryable PostgreSQL database. This is the same core data layer we use internally at VivaMed.
Research Use Only: BioAtlas is not intended for diagnosis, treatment decisions, or clinical care. Always consult qualified healthcare professionals for medical advice.
Solving the Biomedical Data Integration Crisis
Modern biomedical research produces rich datasets, but each one sits in its own silo:
- GWAS Catalog has genetics, but no gene expression
- LINCS has drug perturbations, but no genetic evidence
- ChEMBL has drugs, but no patient data or genomics
- Open Targets connects some pieces, but is missing perturbation data
The result? Researchers spend months manually linking databases just to ask: "Which drugs target genetically validated disease genes AND have good safety profiles?"
BioAtlas Solves This. One PostgreSQL database. 40+ sources integrated. 490M+ rows. Unified IDs. Harmonized coordinates. Cross-platform normalized.
Complete Coverage: Drugs, Diseases, Pathways, Mechanisms
BioAtlas covers drugs, diseases, pathways, and mechanisms in a single database, where each of the alternatives compared below covers only part of the picture.
Drug Coverage (COMPLETE)
| What You Need | BioAtlas | Clarivate | STRING | Open Targets |
|---|---|---|---|---|
| Drug Information | 10,380 drugs | Limited | None | Limited |
| Drug-Target | 24,987 (11 sources) | Yes | None | Yes |
| Binding Affinities | 1.35M (pChEMBL unified) | Limited | None | Some |
| Adverse Events | 1.45M (FDA FAERS) | Yes | None | Limited |
| Drug Perturbations | 720K LINCS + 1.5M Tahoe | Limited | None | None |
| Drug Screening | 8.6M combo + 3.8M PRISM | None | None | None |
| Activity Scores | 204M TF/pathway | None | None | None |
Disease Coverage (COMPLETE)
| What You Need | BioAtlas | Clarivate | STRING | Open Targets |
|---|---|---|---|---|
| Disease Ontology | 30,259 (MONDO) | Yes | None | Yes |
| Gene-Disease | 1.14M associations | Yes | None | Yes |
| Disease-Phenotype | 272,246 (HPO) | Limited | None | Yes |
| Colocalization | 80M causal genes | None | None | Some |
Pathway & Mechanism Coverage (COMPLETE)
| What You Need | BioAtlas | Clarivate | STRING | Open Targets |
|---|---|---|---|---|
| Pathway Definitions | 2,781 (Reactome) | Yes | None | Yes |
| TF Regulation | 31,953 edges (DoRothEA) | Limited | None | Limited |
| Ligand-Receptor | 20.9M pairs | Limited | None | Some |
| TF/Pathway Activities | 204M scores | None | None | None |
Integration & Normalization (UNIQUE)
| Feature | BioAtlas | Everyone Else |
|---|---|---|
| All-in-One Database | SQL joins across 40+ sources | Separate downloads |
| ID Harmonization | Ensembl↔ChEMBL↔MONDO | Manual mapping |
| Colocalization | 80M precomputed | Compute yourself |
| Local SQL Access | Full access | APIs/Web only |
| Price | FREE | $100K-$200K/year |
The Bottom Line
For DRUGS
Targets (25K), affinities (1.35M), indications (47K), safety (1.45M), perturbations (2.2M), screening (12M): ALL in one place
For DISEASES
Ontology (30K), genetics (1.14M associations), phenotypes (272K), variants (299K), colocalization (80M): COMPLETE coverage
For PATHWAYS
Definitions (2.8K), members (137K), footprints (253K), TF regulation (32K), activities (204M): FULL mechanism knowledge
For INTEGRATION
BioAtlas is the ONLY one that connects all of these in one queryable database with advanced normalization
Competitors have 1-2 of these. BioAtlas has ALL.
Quick Start: Installation Guide
Prerequisites
- PostgreSQL 14+ installed
- ~30 GB free disk space
- ~8 GB RAM minimum
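The prerequisites above can be checked from a shell before downloading anything. The following is a minimal sketch, not an official BioAtlas script. The function names (`pg_major`, `free_gib`, `at_least`, `preflight`) are illustrative. It reads the `psql` client version as a proxy for the installed PostgreSQL version (the server itself could differ), and it skips the RAM check because that is platform-specific.

```shell
# Preflight sketch: thresholds are taken from the prerequisites list.
# All function names here are illustrative, not part of BioAtlas.

# Major version from `psql --version` output,
# e.g. "psql (PostgreSQL) 14.10 (Ubuntu ...)" -> 14
pg_major() {
  local v
  v=$(printf '%s\n' "$1" | awk '{print $3}')
  printf '%s\n' "${v%%[!0-9]*}"   # drop everything from the first non-digit
}

# Free space, in whole GiB, on the filesystem that holds directory $1.
free_gib() {
  df -Pk "$1" | awk 'NR==2 {print int($4 / 1024 / 1024)}'
}

# Succeeds when integer $1 >= integer $2.
at_least() { [ "$1" -ge "$2" ]; }

# Report any unmet prerequisite; returns non-zero if something is missing.
preflight() {
  local dir=${1:-.} ok=0 major free
  if command -v psql >/dev/null 2>&1; then
    major=$(pg_major "$(psql --version)")
    at_least "${major:-0}" 14 || { echo "psql client $major found, need 14+"; ok=1; }
  else
    echo "psql not found on PATH"; ok=1
  fi
  free=$(free_gib "$dir")
  at_least "${free:-0}" 30 || { echo "only ${free} GiB free in $dir, need ~30"; ok=1; }
  return "$ok"
}

# Usage:
#   preflight . && echo "prerequisites look fine"
```

Pass the directory where the dumps and the PostgreSQL data will live, since free space is measured per filesystem.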
1. Download Files (Total ~26 GB)
    # Core Knowledge Graph (14.2 GB)
    huggingface-cli download vivamed/Bio-Atlas bioatlas_public_v1.0.dump

    # LINCS Activity Scores (5.1 GB)
    huggingface-cli download vivamed/Bio-Atlas bio_kg_v1.0.dump

    # Colocalization Data (6.5 GB)
    huggingface-cli download vivamed/Bio-Atlas coloc_bayesian.dump
2. Load Database
    # Create database
    createdb bioatlas

    # Load Core Tables (~15 mins)
    psql -d bioatlas -f bioatlas_public_v1.0.dump

    # Load LINCS Activities (~10 mins)
    pg_restore -d bioatlas bio_kg_v1.0.dump

    # Load Colocalization (~10 mins)
    pg_restore -d bioatlas coloc_bayesian.dump
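The commands above load the first file with `psql -f` and the other two with `pg_restore`, which suggests the dumps are not all in the same format. If a load fails because the tool does not match the file, one way to choose is to look at the file header: custom-format archives written by `pg_dump -Fc` begin with the ASCII bytes `PGDMP`, while plain SQL dumps are ordinary text. The sketch below assumes every file is one of those two formats (tar- and directory-format dumps are not handled), and `restore_tool` and `load_dump` are hypothetical helper names.

```shell
# Print the tool that matches a dump file's format.
# Custom-format archives start with the magic string "PGDMP";
# anything else is treated here as a plain SQL script.
restore_tool() {
  if [ "$(head -c 5 "$1" | tr -d '\0')" = "PGDMP" ]; then
    echo pg_restore
  else
    echo psql
  fi
}

# Load dump file $2 into database $1 with whichever tool fits.
load_dump() {
  case "$(restore_tool "$2")" in
    pg_restore) pg_restore -d "$1" "$2" ;;
    psql)       psql -d "$1" -f "$2" ;;
  esac
}

# Usage (after `createdb bioatlas`):
#   for f in bioatlas_public_v1.0.dump bio_kg_v1.0.dump coloc_bayesian.dump; do
#     load_dump bioatlas "$f"
#   done
```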
3. Verify Installation
    psql -d bioatlas -c "\dt"                                   # Should list 79 tables
    psql -d bioatlas -c "SELECT COUNT(*) FROM drug;"            # 10,380
    psql -d bioatlas -c "SELECT COUNT(*) FROM l1000_activity;"  # 202,282,258
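The spot checks above can be wrapped in a small helper so that a mismatch is reported explicitly instead of being read off by eye. This is a sketch under a few assumptions: the helper names (`counts_match`, `check_table`, `list_columns`) are illustrative, the expected counts are the ones quoted in this guide, and the column listing assumes the tables live in the default `public` schema.

```shell
# Compare two row counts, ignoring thousands separators and spaces.
counts_match() {
  [ "$(printf '%s' "$1" | tr -d ', ')" = "$(printf '%s' "$2" | tr -d ', ')" ]
}

# Count the rows in table $2 of database $1 and compare with $3.
# -A (unaligned) and -t (tuples only) make psql print just the number.
check_table() {
  local db=$1 table=$2 expected=$3 actual
  actual=$(psql -d "$db" -At -c "SELECT COUNT(*) FROM $table;") || return 1
  if counts_match "$expected" "$actual"; then
    echo "ok    $table: $actual rows"
  else
    echo "DIFF  $table: expected $expected, got $actual"
    return 1
  fi
}

# List every column in the public schema (an assumption about where the
# tables are created), which helps when working out join keys.
list_columns() {
  psql -d "$1" -At -F ' | ' -c "
    SELECT table_name, column_name, data_type
    FROM information_schema.columns
    WHERE table_schema = 'public'
    ORDER BY table_name, ordinal_position;"
}

# Usage:
#   check_table bioatlas drug 10,380
#   check_table bioatlas l1000_activity 202,282,258
#   list_columns bioatlas | less
```

Listing the columns first avoids guessing table and key names when writing joins across sources.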
Ready to Transform Your Research?
Download BioAtlas today and query across genetics, drugs, diseases, and pathways in seconds — not months.