[bioontology-support] extraction of subtree of ontology

Eugene eugene at
Tue Mar 16 17:32:51 PDT 2010

Dear Sir/Madam,

Is there a way to download the whole disease or compound category by just one command?

The example in the email requires specifying a concept ID. However, the disease category, for example, has more than 10 top-level concepts, so I would need to make more than 10 calls and then merge those 10+ files into one file. That is not very straightforward.
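For readers facing the same chore, the merge step can be scripted. The following is a minimal sketch, assuming each per-concept download is an RDF/XML document with an `rdf:RDF` root (the thread only says the service returns an OWL file, so that serialization is an assumption). The names `merge_rdf_xml` and `merge_subtrees` are hypothetical, and `fetch` is a caller-supplied function that downloads one subtree, because the service URL is not shown in this archive.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ROOT = f"{{{RDF_NS}}}RDF"


def merge_rdf_xml(documents):
    """Combine several RDF/XML documents (strings) under one rdf:RDF root.

    Assumption: every document has rdf:RDF as its root element.
    Repeated descriptions are kept as-is; no deduplication is attempted.
    """
    ET.register_namespace("rdf", RDF_NS)
    merged = ET.Element(RDF_ROOT)
    for doc in documents:
        root = ET.fromstring(doc)
        if root.tag != RDF_ROOT:
            raise ValueError(f"expected an rdf:RDF root, got {root.tag}")
        # Move every top-level description into the merged document.
        for child in root:
            merged.append(child)
    return merged


def merge_subtrees(concept_ids, fetch):
    """Fetch one subtree per concept ID and merge the results.

    `fetch` is hypothetical: it takes a concept ID and returns the
    RDF/XML text of that concept's subtree.
    """
    return merge_rdf_xml([fetch(cid) for cid in concept_ids])
```

The merged element can then be written out with `ET.ElementTree(merged).write("merged.owl", encoding="utf-8", xml_declaration=True)`, where the output filename is only an illustration.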



From: drnigam at [mailto:drnigam at] On Behalf Of Nigam Shah
Sent: Tuesday, March 16, 2010 3:46 PM
To: Eugene; Satnam Alag
Subject: Re: Introductions

BTW, for questions like this, emailing support at is best. It will go to people who respond to users regularly. I merely end up reading the documentation and replying to you.

On Tue, Mar 16, 2010 at 3:36 PM, Nigam Shah <nigam at> wrote:
Yes it is. You are not using a conceptid ("Peptides" is not a concept id in MSH). See: to find a valid ID for "Peptides" .. use the search box above the tree view.

For example, for "Melanoma", the ID is D008545. And if I use that, I will get an OWL file from: I just tested it. So the URL is correct, you need to use the right parameters.


On Tue, Mar 16, 2010 at 3:29 PM, Eugene <eugene at> wrote:

Is the URL right?

Should the ontology name have the prefix "http"? I used the URL above, but I cannot get it to work.

42142 is the MeSH ID (according to the link ).

I have followed the instruction:

Any idea?



From: drnigam at [mailto:drnigam at] On Behalf Of Nigam Shah
Sent: Tuesday, March 16, 2010 2:31 PM
To: Eugene
Subject: Fwd: Introductions

Might be useful for you to know the options too .. sorry for the short one-liners .. trying to multitask at a meeting.


---------- Forwarded message ----------
From: Nigam Shah <nigam at>
Date: Tue, Mar 16, 2010 at 2:27 PM
Subject: Re: Introductions
To: Satnam Alag <satnam at>

I am in Seattle right now at a meeting .. so can't get on the phone. Overall, here are the options:

1) The simple extraction of a sub-tree of an ontology (say the 'disease branch' of MeSH .. or the subtree under 'Melanoma'). That can be done using our production services .. for example the one at:

2) A somewhat more detailed extraction that gives more "knobs" beyond just the sub-tree .. i.e. allows you to 'exclude' terms that are not the right semantic type, that have a high frequency in MEDLINE, or that have the wrong syntactic type on average (say, not a noun phrase). That can be done using our prototype Lexicon Builder .. the one at:

3) Getting the compound and diseases sections of multiple (or all) UMLS ontologies; the 13% actually used in MEDLINE (per Rong Xu's paper). This data is in MySQL tables that are not open to the public. Someone (i.e. me or my student) would have to work with you to figure out the exact query you want to run and then run it.

My guess is that option (2) WILL get you what you are after. However, in order to do that, you (or probably Eugene) need to:

- identify what UMLS ontologies you want (MSH, SNOMEDCT, NCIT, ICD9 .. what else?). You can't do all "UMLS" in the Lexicon Builder.
- make a few trial runs (say with MSH) ... trying different parameters for the semantic types, the syntactic types and the frequency cutoffs; try including or excluding synonyms, and using or not using "mappings" to terms from other ontologies. Basically, read the Lexicon Builder paper and try out different parameter combinations.
- if it gets you what you need, great .. if not, we can iterate to define the right parameter set for you.
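One way to keep such trial runs organized is to enumerate the parameter combinations up front. The sketch below is only an illustration: the parameter names and values in `grid` are hypothetical placeholders, not the Lexicon Builder's actual options, and `trial_runs` is a made-up helper that just lists the combinations to try.

```python
from itertools import product


def trial_runs(grid):
    """Return one settings dict per combination of the values in `grid`."""
    names = list(grid)
    return [
        dict(zip(names, combo))
        for combo in product(*(grid[name] for name in names))
    ]


# Hypothetical parameter names and values, for illustration only.
grid = {
    "ontology": ["MSH"],
    "include_synonyms": [True, False],
    "use_mappings": [True, False],
    "max_medline_frequency": [1000, 10000],
}

for settings in trial_runs(grid):
    # Each dict describes one trial run to carry out and compare by hand.
    print(settings)
```

Writing the combinations down this way makes it easier to record which settings produced a usable lexicon before iterating on the parameter set.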

If you are convinced that (2) will not work, or you don't want to (or can't) spend time digging into this, we can explore option (3). But that requires my time, which I don't have much of right now.


On Tue, Mar 16, 2010 at 1:51 PM, Satnam Alag <satnam at> wrote:
Is there an easy way for us to get the compound and diseases sections of the UMLS ontology that you have processed? We are particularly interested in the 13% of the ontology that is actually used in MEDLINE. Is there a good phone number I can call you at? Thx

Satnam Alag
V.P. of Engineering
Ph: 408 582 4160 (C)

