Retrieving Target Classifications from ChEMBL

There are a number of scenarios where it’s useful to be able to classify protein targets: high-level summaries, enrichment calculations and so on. There are a variety of protein classification schemes out there, such as PANTHER, SCOP and InterPro. These schemes are based on domains and other structural features. ChEMBL provides its own hierarchical classification. Since I use this from time to time, it’s useful to pull all the classifications for a given species in one go via the SQL below (tested with v17; the tax_id 9606 is the NCBI taxonomy ID for human):

SELECT
    td.pref_name, cs.description, cs.accession, pfc.*
FROM
    target_dictionary td,
    target_components tc,
    component_sequences cs,
    component_class cc,
    protein_family_classification pfc
WHERE
    td.tax_id = 9606
        AND td.tid = tc.tid
        AND tc.component_id = cs.component_id
        AND cc.component_id = cs.component_id
        AND pfc.protein_class_id = cc.protein_class_id;
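
For the high-level summary use case, the same join can be grouped by the top level of the hierarchy. The sketch below is a minimal illustration, not a tested ChEMBL workflow: it builds a toy in-memory SQLite database with made-up rows, keeps only the columns the join needs, and assumes the top-level class is stored in a column named `l1` (check the schema of your release). The helper names `build_toy_db` and `class_counts` are hypothetical.

```python
import sqlite3


def build_toy_db():
    """Create an in-memory SQLite database with made-up rows (not ChEMBL data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE target_dictionary (tid INTEGER, pref_name TEXT, tax_id INTEGER);
        CREATE TABLE target_components (tid INTEGER, component_id INTEGER);
        CREATE TABLE component_sequences (component_id INTEGER, accession TEXT, description TEXT);
        CREATE TABLE component_class (component_id INTEGER, protein_class_id INTEGER);
        -- Assumption: l1 holds the top-level class, l2 the next level down.
        CREATE TABLE protein_family_classification (protein_class_id INTEGER, l1 TEXT, l2 TEXT);

        INSERT INTO target_dictionary VALUES
            (1, 'Toy kinase A', 9606),
            (2, 'Toy kinase B', 9606),
            (3, 'Toy channel', 9606),
            (4, 'Toy rat kinase', 10116);
        INSERT INTO target_components VALUES (1, 11), (2, 12), (3, 13), (4, 14);
        INSERT INTO component_sequences VALUES
            (11, 'X00001', 'toy sequence 1'),
            (12, 'X00002', 'toy sequence 2'),
            (13, 'X00003', 'toy sequence 3'),
            (14, 'X00004', 'toy sequence 4');
        INSERT INTO component_class VALUES (11, 100), (12, 100), (13, 200), (14, 100);
        INSERT INTO protein_family_classification VALUES
            (100, 'Enzyme', 'Kinase'),
            (200, 'Ion channel', 'Voltage-gated');
    """)
    return conn


def class_counts(conn, tax_id):
    """Count distinct targets per top-level class (l1) for one species."""
    query = """
        SELECT pfc.l1, COUNT(DISTINCT td.tid) AS n_targets
        FROM target_dictionary td
        JOIN target_components tc ON td.tid = tc.tid
        JOIN component_sequences cs ON tc.component_id = cs.component_id
        JOIN component_class cc ON cc.component_id = cs.component_id
        JOIN protein_family_classification pfc
            ON pfc.protein_class_id = cc.protein_class_id
        WHERE td.tax_id = ?
        GROUP BY pfc.l1
        ORDER BY n_targets DESC, pfc.l1
    """
    return conn.execute(query, (tax_id,)).fetchall()


conn = build_toy_db()
for l1, n_targets in class_counts(conn, 9606):
    print(l1, n_targets)
```

Passing the taxonomy ID as a bound parameter makes it easy to reuse the query for other species. `COUNT(DISTINCT td.tid)` is used so that a target with several components in the same class is counted only once.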
