[bioontology-support] term mapping experience in BioPortal appliance
markampa at pennmedicine.upenn.edu
Thu Dec 19 10:59:39 PST 2019
Thanks, John. That’s a great response.
I guess my two follow-up questions at this point are:
1. How can I programmatically/remotely issue a SPARQL query to 4store? I posted a second email on the 19th that describes an error I’m experiencing.
2. What general diagnostic steps should I take when an ontology appears as uploaded but not parsed, indexed, etc., even after several days (with 0% CPU utilization for most of that time)? Similarly, what should I do when I get the "there was a problem and we have been notified" message (roughly speaking)? Are there any temp directories or logs I should turn to first?
Thanks again, and happy holidays.
From: John Graybeal [mailto:jgraybeal at stanford.edu]
Sent: Thursday, December 19, 2019 1:58 AM
To: Miller, Mark <markampa at pennmedicine.upenn.edu>
Cc: support at bioontology.org
Subject: [External] Re: [bioontology-support] term mapping experience in BioPortal appliance
Thanks for this question. I believe this represents a complete answer, but our team might refine it once they see it.
The mapping process involves a number of intermediate steps, and the mapping information is preserved in the triple store as intermediate data. So you cannot get "the mappings" in a directly downloadable form.
In BioPortal and the Virtual Appliance, the 'intermediate data' for mappings is generated as soon as the ontology is received and parsed. You can find the .ttl files containing the intermediate triples in the ontology submission folder; these triples are loaded into 4store, which is then queried to satisfy user requests for mappings.
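[Editor's note: a minimal sketch of issuing a SPARQL query to the appliance's 4store HTTP server from Python. The endpoint URL and port are assumptions (check how 4s-httpd is started on your appliance), and the query is purely illustrative; the real graph and predicate names for the mapping triples should be read off the .ttl files mentioned above.]

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint -- adjust host/port/path to match how 4s-httpd
# is actually running inside your Virtual Appliance.
FOURSTORE_ENDPOINT = "http://localhost:8080/sparql/"

def build_sparql_request(endpoint, query):
    """Form-encode a SPARQL query as the POST body a 4store
    HTTP server expects, asking for JSON results."""
    data = urllib.parse.urlencode({"query": query, "output": "json"}).encode()
    return urllib.request.Request(endpoint, data=data, method="POST")

def run_query(endpoint, query):
    """Send the query and decode the JSON result (requires a live 4store)."""
    with urllib.request.urlopen(build_sparql_request(endpoint, query)) as resp:
        return json.load(resp)

# Illustrative query only -- substitute the graph/predicate names
# you find in the submission folder's .ttl files.
EXAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"
```

Calling `run_query(FOURSTORE_ENDPOINT, EXAMPLE_QUERY)` on the appliance itself should return the first few raw triples, which is a quick way to confirm the endpoint is reachable before writing mapping-specific queries.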
So you can likewise access the mappings with the right queries to 4store, but before you do so, be aware of the reason for storing the data in an intermediate format. In BioPortal, there is a lot of re-use of ontologies and terms (particularly acute in Views, which are ontologies that are subsets of other ontologies, and in reference ontologies like NCIT). So there is a very large number of possible mapping combinations.
For example, imagine there are 100 terms labelled "Heart Disease" (actually there are 174). Every term is mapped to the 99 other terms (and, for obscure reasons, to itself), so there would be 10,000 mappings for that one label if all of them were explicitly stated. Instead we map all 100 terms to a string representation ("heartdisease"), and then instantiate the necessary mappings for users when they are requested. Considering BioPortal has 10 million terms, the opportunity for these to multiply similarly is significant.
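[Editor's note: a toy sketch of the string-key scheme described above. The term IDs, labels, and the exact normalization rule are invented for illustration; they are not BioPortal's actual implementation.]

```python
from collections import defaultdict

def label_key(label):
    """Collapse a label to a canonical string key, mirroring the
    'Heart Disease' -> 'heartdisease' example (assumed normalization)."""
    return "".join(ch for ch in label.lower() if ch.isalnum())

def mappings_for(term_id, term_labels):
    """Instantiate the mappings for one term on demand: every term sharing
    its string key is a mapping target (including the term itself)."""
    groups = defaultdict(set)
    for tid, label in term_labels.items():
        groups[label_key(label)].add(tid)
    return groups[label_key(term_labels[term_id])]

terms = {  # hypothetical term IDs and labels
    "ONT1:001": "Heart Disease",
    "ONT2:042": "heart disease",
    "ONT3:777": "HEART-DISEASE",
    "ONT4:900": "Liver Disease",
}
print(sorted(mappings_for("ONT1:001", terms)))
# → ['ONT1:001', 'ONT2:042', 'ONT3:777']
```

Storing one key per term instead of every pairwise link is what keeps the stored data linear in the number of terms, while the quadratic set of mappings is generated only when requested.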
I'll leave it there for now; we can provide more specific details if you need them after you've looked at the TTL files.
On Dec 17, 2019, at 11:07 AM, Miller, Mark <markampa at pennmedicine.upenn.edu> wrote:
I really like the BioPortal in general and especially the term mapper. I'd like to retrieve a large number of mappings on a frequent basis, so I was thinking of doing that in the AWS appliance.
I see that mappings were generated for two ontologies I loaded into an AWS instance some time ago, but I couldn't find the crontab entry that triggered the process. I had poked around in /etc/crontab and /etc/anacrontab.
Once the mappings are created, can I retrieve them through any route other than the REST API, since I'll be logged directly into the BioPortal Appliance? For example, via Solr, Redis, or 4store?
bioontology-support mailing list
bioontology-support at lists.stanford.edu
Technical Program Manager
Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
Stanford Center for Biomedical Informatics Research
650-736-1632 | ORCID 0000-0001-6875-5360