Open Targets (GraphQL)

Query the Open Targets API for target–disease associations and evidence.

Overview

Problem: How strong is the gene–disease link, and is the target druggable?

Use when: Target–disease links, druggability
Avoid when: Concluding from a single evidence line

Learning goals

Figures

[Figures not reproduced: Open Targets Overview; Entities and IDs; Query Workflow; Association Scoring; Common Issues]

Tutorial

Programmatic access to Open Targets target–disease associations, evidence, and annotations via a single GraphQL endpoint.

When to Use This Skill

Use Open Targets when the user wants:

Don't use Open Targets for:

  • ❌ Bulk/systematic extraction across many entities → use the FTP downloads, BigQuery (open-targets-prod), or AWS Open Data buckets instead
  • ❌ Non-human biology, general literature search, EHR/clinical-trial-recruitment data, or proprietary datasets

Quick Start

Test this skill in ~30 seconds — no API key required:

import requests

URL = "https://api.platform.opentargets.org/api/v4/graphql"
query = """
query { disease(efoId: "MONDO_0004975") {
  name
  associatedTargets(page: { index: 0, size: 5 }) {
    rows { target { approvedSymbol } score }
  }
} }
"""
print(requests.post(URL, json={"query": query}).json())

Expected: Top 5 targets associated with Alzheimer's disease (MONDO_0004975) with overall association scores (0–1).

Installation

Required:

pip install requests

Optional (for tabular handling):

pip install pandas

No API key or authentication is required, and the public docs list no rate-limit headers. The maintainers ask you not to loop over entities one at a time; use the bulk downloads for that.

License: Open Targets data is released under CC0 1.0; the API is free to use.

Inputs

Required for most queries — one of the following standardised IDs:

| Entity  | ID format         | Example         |
|---------|-------------------|-----------------|
| Target  | Ensembl gene      | ENSG00000169083 |
| Disease | EFO (or imported) | MONDO_0004975   |
| Drug    | ChEMBL            | CHEMBL1201583   |
| Variant | chrom_pos_ref_alt | 19_44908822_C_T |
| Study   | GWAS Catalog      | GCST005194      |

If the user provides a free-text name or non-primary identifier (gene symbol, disease name, drug brand, HGNC ID), resolve it first with the search query before any other call.
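
One way to decide whether an input needs the search step is to test it against the ID shapes listed above. The sketch below is a minimal illustration: the regular expressions are assumptions inferred from the example IDs in this section, not an official grammar, and `classify_identifier` is a hypothetical helper name.

```python
import re
from typing import Optional

# Patterns inferred from the example IDs above (assumptions, not an official
# grammar). Anything that matches none of them should go through `search`.
ID_PATTERNS = {
    "target": re.compile(r"^ENSG\d{11}$"),
    "drug": re.compile(r"^CHEMBL\d+$"),
    "study": re.compile(r"^GCST\d+$"),
    "variant": re.compile(r"^(?:\d{1,2}|X|Y|MT)_\d+_[ACGT]+_[ACGT]+$"),
    "disease": re.compile(r"^(?:EFO|MONDO|Orphanet|HP)_\d+$"),
}


def classify_identifier(text: str) -> Optional[str]:
    """Return the entity type if `text` looks like a standard ID, else None."""
    candidate = text.strip()
    for entity, pattern in ID_PATTERNS.items():
        if pattern.match(candidate):
            return entity
    return None
```

A `None` result (for example for a gene symbol such as `BRCA1`) signals that the input should be resolved with `search` first. The disease pattern only covers a few ontology prefixes and would need extending for other imported ontologies.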

Outputs

GraphQL returns JSON shaped exactly like your query. Typical deliverables for the user:

CSV/TSV export from the JSON is straightforward with pandas.json_normalize.
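
To make the row shape explicit, here is a dependency-free sketch of the same flattening using only the standard library; `pandas.json_normalize(rows)` would instead produce dotted column names such as `target.id`. The function name `rows_to_csv` is hypothetical and the sample rows are made-up placeholders shaped like the Quick Start response, not real API output.

```python
import csv
import io


def rows_to_csv(rows: list) -> str:
    """Flatten associatedTargets-style rows into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["target_id", "symbol", "score"])
    for row in rows:
        target = row.get("target") or {}  # tolerate a missing nested object
        writer.writerow(
            [target.get("id"), target.get("approvedSymbol"), row.get("score")]
        )
    return buffer.getvalue()


# Placeholder rows; IDs, symbols, and scores are invented for illustration.
sample_rows = [
    {"target": {"id": "ENSG_PLACEHOLDER_1", "approvedSymbol": "GENE_A"}, "score": 0.9},
    {"target": {"id": "ENSG_PLACEHOLDER_2", "approvedSymbol": "GENE_B"}, "score": 0.5},
]
print(rows_to_csv(sample_rows))
```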

Clarification Questions

Ask only for missing information. If the user already gave a standard ID and a clear goal, proceed directly.

1. Entity & ID:

2. Goal: Annotation lookup, target–disease associations, supporting evidence, or genetics (variant/GWAS/L2G)?

3. Scope:

4. Filters / weighting (associations only): Default scoring, or custom datasource weights (e.g. "genetics-only", "downweight literature")? Roll up evidence through disease ontology descendants (enableIndirect: true)?

5. Output: Print summary, return JSON, or save to CSV/TSV?

Standard Workflow

Endpoint: https://api.platform.opentargets.org/api/v4/graphql
Playground (with built-in schema docs): https://api.platform.opentargets.org/api/v4/graphql/browser

Step 1 — Helper

import requests

URL = "https://api.platform.opentargets.org/api/v4/graphql"

def ot_query(query: str, variables: dict | None = None) -> dict:
    # A timeout keeps the call from hanging indefinitely on a stalled connection.
    r = requests.post(URL, json={"query": query, "variables": variables or {}}, timeout=30)
    r.raise_for_status()
    payload = r.json()
    if "errors" in payload:
        raise RuntimeError(payload["errors"])
    return payload["data"]

Step 2 — Resolve names → IDs (only if needed)

QUERY = """
query Search($q: String!) {
  search(queryString: $q, entityNames: ["target","disease","drug"]) {
    hits { id name entity }
  }
}
"""
ot_query(QUERY, {"q": "BRCA1"})
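
The search response mixes entity types, so a small selector is usually needed before the ID can be reused. The sketch below assumes the response shape requested in the query above (`hits { id name entity }`) and that hits arrive in relevance order, which is an assumption; `first_hit` is a hypothetical helper and the sample data is hand-written with placeholder IDs.

```python
from typing import Optional


def first_hit(search_data: dict, entity: str) -> Optional[dict]:
    """Return the first hit of the requested entity type, or None."""
    hits = (search_data.get("search") or {}).get("hits") or []
    for hit in hits:
        if hit.get("entity") == entity:
            return hit
    return None


# Hand-written sample in the shape of the search response; IDs are placeholders.
sample = {
    "search": {
        "hits": [
            {"id": "DISEASE_PLACEHOLDER", "name": "example disease", "entity": "disease"},
            {"id": "TARGET_PLACEHOLDER", "name": "EXAMPLE1", "entity": "target"},
        ]
    }
}
print(first_hit(sample, "target"))
```

If several hits of the same type look plausible, it is safer to show them to the user than to pick the first silently.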

Step 3 — Run the actual query

Target annotation:

QUERY = """
query Target($ensemblId: String!) {
  target(ensemblId: $ensemblId) {
    id approvedSymbol biotype
    geneticConstraint { constraintType score oe oeLower oeUpper }
    tractability { label modality value }
  }
}
"""
ot_query(QUERY, {"ensemblId": "ENSG00000169083"})  # AR
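
The `tractability` field comes back as a flat list of `{label, modality, value}` items, which is hard to read as is. A minimal sketch of one way to summarise it follows; `tractable_labels` is a hypothetical helper, and the sample list is hand-written for illustration, so the labels and modality codes in it are assumptions rather than real output for any target.

```python
from collections import defaultdict


def tractable_labels(tractability: list) -> dict:
    """Group the labels whose `value` is true by modality."""
    grouped = defaultdict(list)
    for item in tractability or []:
        if item.get("value"):
            grouped[item["modality"]].append(item["label"])
    return dict(grouped)


# Hand-written sample; labels and modality codes are illustrative only.
sample = [
    {"label": "Approved Drug", "modality": "SM", "value": True},
    {"label": "Approved Drug", "modality": "AB", "value": False},
    {"label": "Structure with Ligand", "modality": "SM", "value": True},
]
print(tractable_labels(sample))
```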

Disease → known drugs + top associated targets:

QUERY = """
query Disease($efoId: String!) {
  disease(efoId: $efoId) {
    id name
    knownDrugs { uniqueDrugs rows { drug { id name isApproved } } }
    associatedTargets(page: { index: 0, size: 25 }) {
      rows {
        target { id approvedSymbol }
        score
        datatypeScores { id score }
      }
    }
  }
}
"""
ot_query(QUERY, {"efoId": "MONDO_0004975"})  # Alzheimer's

Target–disease evidence (filter to specific datasources):

QUERY = """
query Evidence($ensemblId: String!, $efoId: String!) {
  disease(efoId: $efoId) {
    evidences(ensemblIds: [$ensemblId],
              datasourceIds: ["europepmc","ot_genetics_portal"]) {
      count
      rows { datasourceId score literature }
    }
  }
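}
"""
# Illustrative pair: APOE (ENSG00000130203) and Alzheimer's disease.
ot_query(QUERY, {"ensemblId": "ENSG00000130203", "efoId": "MONDO_0004975"})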

Custom-weighted association scoring (e.g. "genetics-only"):

associatedTargets(
  datasources: [
    { id: "ot_genetics_portal", weight: 1.0, propagate: true, required: true }
    { id: "europepmc",          weight: 0.2, propagate: true, required: false }
  ]
) { rows { target { approvedSymbol } score } }
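
The fragment above can also be driven from Python by passing the datasource list as a GraphQL variable. This is a sketch under assumptions: the input type name `DatasourceSettingsInput` is not stated in this document and should be confirmed in the playground schema docs, and `datasource_settings` is a hypothetical helper.

```python
def datasource_settings(weights: dict, required: frozenset = frozenset(),
                        propagate: bool = True) -> list:
    """Turn {datasource_id: weight} into the list shape used in the fragment above."""
    return [
        {"id": ds_id, "weight": weight, "propagate": propagate,
         "required": ds_id in required}
        for ds_id, weight in weights.items()
    ]


# NOTE: `DatasourceSettingsInput` is an assumed type name; verify it in the
# playground before relying on this query.
QUERY = """
query Weighted($efoId: String!, $datasources: [DatasourceSettingsInput!]) {
  disease(efoId: $efoId) {
    associatedTargets(datasources: $datasources, page: { index: 0, size: 25 }) {
      rows { target { approvedSymbol } score }
    }
  }
}
"""

variables = {
    "efoId": "MONDO_0004975",
    "datasources": datasource_settings(
        {"ot_genetics_portal": 1.0, "europepmc": 0.2},
        required=frozenset({"ot_genetics_portal"}),
    ),
}
```

The `QUERY` and `variables` pair can then be passed to the `ot_query` helper from Step 1.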

Drug profile (mechanism, indications, FAERS adverse events):

QUERY = """
query Drug($chemblId: String!) {
  drug(chemblId: $chemblId) {
    id name drugType maximumClinicalStage
    mechanismsOfAction { rows { mechanismOfAction targetName actionType } }
    indications { count rows { disease { id name } maxClinicalStage } }
    adverseEvents(page: { index: 0, size: 10 }) {
      count
      rows { name count logLR }
    }
  }
}
"""
ot_query(QUERY, {"chemblId": "CHEMBL1201583"})  # bevacizumab

Variant annotation (consequence, allele frequencies, credible-set membership):

QUERY = """
query Variant($variantId: String!) {
  variant(variantId: $variantId) {
    id chromosome position referenceAllele alternateAllele rsIds
    mostSevereConsequence { id label }
    alleleFrequencies { populationName alleleFrequency }
    transcriptConsequences {
      target { id approvedSymbol }
      variantConsequences { id label }
      isEnsemblCanonical
    }
  }
}
"""
ot_query(QUERY, {"variantId": "19_44908822_C_T"})  # APOE rs7412

GWAS study metadata (root field study for one ID, studies for batch):

QUERY = """
query Study($studyId: String!) {
  study(studyId: $studyId) {
    id studyType traitFromSource pubmedId publicationFirstAuthor
    nSamples nCases nControls
    diseases { id name }
    credibleSets(page: { index: 0, size: 10 }) {
      count
      rows { studyLocusId region pValueMantissa pValueExponent }
    }
  }
}
"""
ot_query(QUERY, {"studyId": "GCST005194"})  # CAD GWAS

Credible sets + L2G + colocalisation (the former "Genetics Portal" core query):

QUERY = """
query CredibleSets($studyIds: [String!]!) {
  credibleSets(page: { index: 0, size: 25 }, studyIds: $studyIds) {
    count
    rows {
      studyLocusId region
      pValueMantissa pValueExponent
      variant { id rsIds mostSevereConsequence { label } }
      l2GPredictions { rows { target { id approvedSymbol } score } }
      colocalisation { rows { otherStudyLocus { studyId } h4 clpp } }
    }
  }
}
"""
ot_query(QUERY, {"studyIds": ["GCST005194"]})
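
Each credible-set row carries its p-value as a mantissa and exponent pair and its L2G predictions as a nested list, so some post-processing is usually needed. The sketch below assumes the row shape requested in the query above; `p_value_text`, `top_l2g`, and the sample row are hypothetical. The p-value is kept as text because very small exponents (roughly below -308) underflow to 0.0 as a Python float.

```python
from typing import Optional, Tuple


def p_value_text(mantissa: float, exponent: int) -> str:
    """Render mantissa and exponent as scientific notation without float underflow."""
    return f"{mantissa}e{exponent}"


def top_l2g(row: dict) -> Optional[Tuple[str, float]]:
    """Return (symbol, score) of the highest-scoring L2G prediction, or None."""
    predictions = (row.get("l2GPredictions") or {}).get("rows") or []
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["score"])
    return best["target"]["approvedSymbol"], best["score"]


# Hand-written sample row; symbols and numbers are placeholders.
sample_row = {
    "pValueMantissa": 3.2,
    "pValueExponent": -12,
    "l2GPredictions": {"rows": [
        {"target": {"approvedSymbol": "GENE_A"}, "score": 0.2},
        {"target": {"approvedSymbol": "GENE_B"}, "score": 0.8},
    ]},
}
print(p_value_text(sample_row["pValueMantissa"], sample_row["pValueExponent"]))
print(top_l2g(sample_row))
```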

Step 4 — Iterate / paginate

List fields take page: { index, size }. Don't fetch thousands of rows in one call; if the user wants more, paginate or switch to bulk downloads.
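
A bounded pagination loop might look like the sketch below. It is written against a caller-supplied `fetch_page(index, size)` function so the stopping logic can be checked offline; `iter_rows`, `fetch_page`, and the `max_rows` cap are hypothetical names, and the fake fetcher stands in for a real API call.

```python
from typing import Callable, Iterator


def iter_rows(fetch_page: Callable[[int, int], dict],
              size: int = 50, max_rows: int = 500) -> Iterator[dict]:
    """Yield rows page by page until the list ends or max_rows is reached."""
    index = 0
    yielded = 0
    while yielded < max_rows:
        rows = fetch_page(index, size).get("rows") or []
        for row in rows:
            if yielded >= max_rows:
                return
            yield row
            yielded += 1
        if len(rows) < size:  # a short or empty page means nothing is left
            return
        index += 1


# Fake fetcher over seven in-memory rows, standing in for a real API call.
_FAKE = [{"n": i} for i in range(7)]


def fake_fetch(index: int, size: int) -> dict:
    return {"count": len(_FAKE), "rows": _FAKE[index * size:(index + 1) * size]}


print(len(list(iter_rows(fake_fetch, size=3))))
```

In real use, `fetch_page` would call `ot_query` with `$index` and `$size` variables and return the paginated object (for example `data["disease"]["associatedTargets"]`). The `max_rows` cap is a deliberate guard: once it is hit, switching to bulk downloads is the better option.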

Common Issues

| Issue | Solution |
|-------|----------|
| HTTP 200 but errors in response | GraphQL errors come back in the body, not as 4xx; always check payload["errors"] |
| Cannot query field "X" on type "Y" | Schema field name has changed; check the playground or run an introspection query |
| Empty associatedTargets for a broad disease | Add enableIndirect: true to roll up evidence from descendant ontology terms |
| Symbol/name not recognised | Run a search query first; the API only accepts standardised IDs (Ensembl/EFO/ChEMBL/GCST) |
| Truncated results | List fields are paginated; pass page: { index, size } and iterate |
| Slow or timing out across many IDs | Stop and switch to bulk downloads (FTP, BigQuery open-targets-prod, AWS) |
| Looking for old api.genetics.opentargets.org endpoint | Genetics data is now part of the main Platform API; use variant, study, credibleSet fields here |
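
For the "Cannot query field" case, a standard GraphQL introspection query lists the fields a type currently exposes. The sketch below uses the `__type` introspection field from the GraphQL specification; `field_names` is a hypothetical helper and the sample response is hand-written, not real schema output.

```python
# Standard GraphQL introspection: list the field names of one type.
INTROSPECTION = """
query Fields($typeName: String!) {
  __type(name: $typeName) { fields { name } }
}
"""


def field_names(data: dict) -> list:
    """Extract sorted field names from an introspection response's data."""
    type_info = data.get("__type") or {}  # __type is null for unknown types
    return sorted(f["name"] for f in type_info.get("fields") or [])


# Hand-written sample in the shape of an introspection response.
sample = {"__type": {"fields": [{"name": "id"}, {"name": "approvedSymbol"}]}}
print(field_names(sample))
```

With the Step 1 helper this would be called as `field_names(ot_query(INTROSPECTION, {"typeName": "Target"}))`, assuming the type is named `Target` in the schema.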

Best Practices

  1. Resolve names → IDs once with search, then cache the IDs
  2. Request only the fields you need — GraphQL gives you exactly what you ask for
  3. Traverse the graph in a single query instead of chaining requests (e.g. disease → associatedTargets → target { tractability })
  4. Always check errors in the response body — not just the HTTP status
  5. Use enableIndirect: true for broad disease terms so descendant evidence is included
  6. Paginate lists with page: { index, size }
  7. ⚠️ Hand off to bulk downloads when the user needs thousands of entities — the docs explicitly discourage looping the API
  8. Cite the data release version in any report (meta { dataVersion { year month } })
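
Practices 3 and 8 can be combined in one request, sketched below: a single query walks disease → associatedTargets → target { tractability } and also asks for the data release. `release_label` is a hypothetical helper, and the `year.month` output format with a zero-padded month is an assumption about how one might cite the release; the values in the usage line are made up.

```python
# One round trip: release metadata plus a nested traversal.
QUERY = """
query DiseaseTractability($efoId: String!) {
  meta { dataVersion { year month } }
  disease(efoId: $efoId) {
    associatedTargets(page: { index: 0, size: 10 }) {
      rows {
        score
        target { approvedSymbol tractability { label modality value } }
      }
    }
  }
}
"""


def release_label(data: dict) -> str:
    """Format meta.dataVersion as 'year.month' (format is an assumption)."""
    version = data["meta"]["dataVersion"]
    return f"{version['year']}.{str(version['month']).zfill(2)}"


# Made-up version values, only to show the formatting.
print(release_label({"meta": {"dataVersion": {"year": "24", "month": "9"}}}))
```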

Related Skills

References

Official documentation:

  • API landing page: https://platform.opentargets.org/api

  • API docs: https://platform-docs.opentargets.org/data-access/graphql-api
  • GraphQL playground (with schema): https://api.platform.opentargets.org/api/v4/graphql/browser
  • Schema dump: https://api.platform.opentargets.org/api/v4/graphql/schema
  • Bulk data downloads: https://platform-docs.opentargets.org/data-access/datasets
  • Community / example queries: https://community.opentargets.org/

Citation:

  • Open Targets Platform: Ochoa et al., Nucleic Acids Research (most recent release paper)

License: Data CC0 1.0; API free to use.
