[bioontology-support] HIV ontology parsing issue [WAS: Ontology Upload]
Jennifer Leigh Vendetti
vendetti at stanford.edu
Mon May 22 14:23:29 PDT 2017
I fixed the problem in the BioPortal code that resulted in missing class data in the RDF / XML serialization, and performed a new release of our software today. The HIV ontology is now fully parsed and available to browse in BioPortal.
Thanks for your patience, and apologies that it took us some time to track this one down.
On Apr 12, 2017, at 4:54 PM, Jennifer Leigh Vendetti <vendetti at stanford.edu> wrote:
I’m writing with an update on the status of the HIV ontology in BioPortal. We’ve made some progress, but aren’t all the way there yet.
We upgraded the BioPortal software to the latest release in the OWL API 4 series (version 4.3.1). With this newer version of the API, our software loads your ontology into memory without errors (this is the first step in our parsing process). The second step involves serializing your ontology into triples that can be stored in BioPortal’s triplestore. This second step initially failed due to some errors in your ontology file: some OBO tags are missing underscores. On line 296, the “preceded_by” tag is missing an underscore, i.e.:
preceded by: HIV:36 ! viral maturation
Also, on lines 944, 963, 972, and 981, the “property_value” tag is missing underscores, e.g.:
property value: host_range human
We fixed these errors in the version of the ontology file on our server by adding the necessary underscores. If you submit new versions to BioPortal in the future, you’ll need to add these underscores in your copy.
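A check like the following could catch this class of typo before upload. It is only a minimal sketch in Python, not part of BioPortal or the OWL API; the function names and the KNOWN_TAGS list are hypothetical and cover only the two tags mentioned above.

```python
# Hypothetical helper: flags and repairs OBO tags whose underscores
# were typed as spaces. KNOWN_TAGS is an assumption and lists only
# the two tags named in this thread; extend it as needed.
KNOWN_TAGS = ("preceded_by", "property_value")


def find_missing_underscores(lines):
    """Return (line_number, expected_tag, line_text) for each bad tag line."""
    problems = []
    for number, line in enumerate(lines, start=1):
        for tag in KNOWN_TAGS:
            spaced = tag.replace("_", " ") + ":"
            if line.startswith(spaced):
                problems.append((number, tag, line.rstrip("\n")))
    return problems


def fix_missing_underscores(lines):
    """Return a copy of lines with the spaced tag names repaired."""
    fixed = []
    for line in lines:
        for tag in KNOWN_TAGS:
            spaced = tag.replace("_", " ") + ":"
            if line.startswith(spaced):
                # Replace only the tag at the start of the line.
                line = tag + ":" + line[len(spaced):]
                break
        fixed.append(line)
    return fixed


if __name__ == "__main__":
    sample = [
        "preceded by: HIV:36 ! viral maturation\n",
        "property value: host_range human\n",
    ]
    for number, tag, text in find_missing_underscores(sample):
        print(f"line {number}: expected tag '{tag}' in: {text}")
```

To check a real file, the same functions could be applied to the result of `open(path).readlines()`.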
The second step in the parsing process now completes without errors. Unfortunately though, we’ve encountered yet another issue where the serialization is missing the class data. So, we’ll need to look into this further.
On Apr 6, 2017, at 4:57 PM, Jennifer Leigh Vendetti <vendetti at stanford.edu> wrote:
On Apr 6, 2017, at 2:50 PM, Sabrina Falcon <falcos2 at unlv.nevada.edu> wrote:
Yes, I think that was what we meant to do, thank you. I changed the version tags to correctly reflect that. From now on we will be using OBO format 1.4.
OK, thanks for the clarification.
After several exchanges with the OWL API developers, the conclusion is that we either need to include an OWL API compatibility module, or upgrade to a more recent version of the OWL API in order to parse your ontology. I’m going to try the route of upgrading to a newer version of the API, and I’ve entered an issue in our tracker for this:
We’ll try to get this completed as soon as we’re able.
In case anyone is interested in the exchange with the OWL API developers, their mailing list archives are public:
Also, I noticed in the link you sent me that in sections 8.1.2 and 8.2.2 the “part_of” tag is used not as a stand-alone tag, but as something to be referenced:
8.1.2 File Comments
* If a frame references an identifier, and that identifier is opaque (i.e. it conforms to the Canonical-Prefixed-ID production rule), then the generator should add comments, adding a label for every opaque identifier. For example:
relationship: part_of ABC:1234567 ! hand
relationship: R:9999999 ABC:1234567 ! part_of hand
* All file comments should be preceded by a tag-value pair, and there should be exactly one space character on either side of the '!' character
8.2.2 Relation (Property) Identifiers
Relation identifiers should follow the same guidelines as class identifiers. Note however that the use of symbolic identifiers such as 'part_of' is common in almost all OBO format ontologies, and has a precedent stretching back over ten years. A large body of software now expects symbolic identifiers for relations, and ontology maintainers are understandably reluctant to change these to numeric identifiers.
This specification provides a means of using numeric identifiers globally whilst retaining symbolic identifiers within the context of a single file. Refer to section 5.9.3 (http://owlcollab.github.io/oboformat/doc/obo-syntax.html#5.9.3) for details.
* Every symbolic relation identifier (e.g. 'part_of') should have an xref tag to a formal relation identifier. E.g.
* This xref should refer to either BFO (http://purl.obolibrary.org/obo/bfo.owl) or to RO (http://purl.obolibrary.org/obo/ro.owl). Xrefs to the old RO can be provided for historic purposes, but are otherwise discouraged.
* According to the rules in section 5.9.3 (http://owlcollab.github.io/oboformat/doc/obo-syntax.html#5.9.3), the symbolic relation identifier can be used as a shorthand for the formal relation identifier.
* When roundtripping the OBO file, the symbolic identifiers should be preserved.
It was also not listed in the OBO2OWL mappings https://docs.google.com/spreadsheets/d/1VaERPs9EubExHRlU37fcBDzGRHgCrnoS28ZhDKRrX34/edit#gid=6
Although it is on the Relations Ontology http://obofoundry.org/ontology/ro.html (id: BFO:0000050)
Is there a possibility that the tag is obsolete, or is there a problem with the way we integrated it?
I’m not an expert in OBO syntax. I would suggest posting your question about the use of part_of tags on the “OBO Discuss” support list. You can subscribe to that list here: https://lists.sourceforge.net/lists/listinfo/obo-discuss.