[bioontology-support] Bad URIs used in Mappings for PR

Darren Natale dan5 at georgetown.edu
Fri Jun 11 14:36:16 PDT 2021


Hi Jennifer,

Responses inline.

UPDATE: I do now have editing access. Thanks!

On 6/11/2021 5:19 PM, Jennifer Leigh Vendetti wrote:
> Hi Darren,
>
> Just wanted to make sure I’m on the same page with you. I issued a 
> REST call to get the mappings between PR and CIDO:
>
> https://data.bioontology.org/mappings?ontologies=PR,CIDO&display_context=false&display_links=false
>
> In the results, I assume you’re referring to the following LOOM 
> mapping between the classes in PR and CIDO that have a human readable 
> label of “PGR (human)”:

Yes, that is correct.

> {
>   "id": null,
>   "source": "LOOM",
>   "classes": [
>     {
>       "@id": "http://purl.obolibrary.org/obo/HGNC_8910",
>       "@type": "http://www.w3.org/2002/07/owl#Class"
>     },
>     {
>       "@id": "http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=8910",
>       "@type": "http://www.w3.org/2002/07/owl#Class"
>     }
>   ],
>   "process": null,
>   "@id": "",
>   "@type": "http://data.bioontology.org/metadata/Mapping"
> }
>
> See the rest of my comments inline below.
>
>> On Jun 8, 2021, at 3:16 PM, Darren Natale <dan5 at georgetown.edu> wrote:
>>
>> In looking at mappings between the PRotein Ontology and 
>> others--specifically those that were found by LOOM--I noticed that 
>> the URLs indicated for PR-defined entities are sometimes incorrect. 
>> For example, between PR and CIDO (Coronavirus Infectious Disease 
>> Ontology) there is a LOOM-based mapping for "PGR (human)" that, after 
>> removing the BioPortal additions to the URL, becomes:
>>
>> in PR: http://purl.obolibrary.org/obo/HGNC_8910
>> in CIDO: http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=8910
>>
>> What's interesting about this is that CIDO actually imports this very 
>> same class from PR. The only difference is that CIDO is getting the 
>> URI from the OWL version of PR, while the URI supplied by PR appears 
>> to be coming from a conversion of the OBO-file CURIE to a URI. 
>> Unfortunately, that CURIE-to-URI conversion fails to account for the 
>> idspace declaration in the OBO header:
>>
>> idspace: HGNC http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=
>>
>> I know that the conversion failure is not a BioPortal issue because 
>> we encounter the very same problem when we do the OBO-to-OWL 
>> conversion ourselves (I believe it is actually an OWLAPI issue). The 
>> difference is that we run a post-processing script to fix them.
>
>
> So if I understand you correctly, you feel that the class ID for the 
> “PGR (human)” class in the PRotein Ontology in BioPortal is recorded as:
>
> http://purl.obolibrary.org/obo/HGNC_8910
>
> … when instead it should be recorded as:
>
> http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=8910
>
> ?

Yes, that too is correct.

> And, the underlying reason is an issue in the OWL API where it ignores 
> the idspace declarations in the OBO file?

Yes, that's my suspicion.
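
To make the suspected failure concrete, here is a minimal Python sketch of a CURIE-to-URI expansion that honors idspace declarations and otherwise falls back to the OBO PURL pattern. It is an illustration only: it is not the OWL API's code, and the function names `parse_idspaces` and `expand_curie` are hypothetical. It assumes the header declaration has the form `idspace: PREFIX URI`, as in the PR header quoted above.

```python
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def parse_idspaces(header_lines):
    """Collect 'idspace: PREFIX URI' declarations from OBO header lines."""
    idspaces = {}
    for line in header_lines:
        if line.startswith("idspace:"):
            # Anything after the URI (e.g. a quoted description) is ignored.
            parts = line[len("idspace:"):].split()
            if len(parts) >= 2:
                idspaces[parts[0]] = parts[1]
    return idspaces


def expand_curie(curie, idspaces):
    """Expand an OBO CURIE such as 'HGNC:8910' into a full URI."""
    prefix, _, local_id = curie.partition(":")
    base = idspaces.get(prefix)
    if base is not None:
        # A declared idspace takes precedence over the default pattern.
        return base + local_id
    # Default OBO PURL pattern: PREFIX_LOCALID under purl.obolibrary.org.
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"
```

With the HGNC declaration present, `expand_curie("HGNC:8910", idspaces)` yields the genenames.org URI that CIDO uses; with an empty mapping it yields the OBO PURL form that currently appears for PR in BioPortal.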

> I looked at the PGR (human) term declaration in pro_reasoned.obo on 
> line 413,549:
>
> [Term]
> id: HGNC:8910
> name: PGR (human)
> namespace: gene
> def: "A protein coding gene PGR in human." [PRO:DNx]
> comment: Category=external.
> is_a: SO:0001217 ! protein_coding_gene
> relationship: only_in_taxon NCBITaxon:9606 ! Homo sapiens
>
> I think I see what you mean, in that there’s an HGNC idspace prefix 
> that is prepended to the numeric portion of the ID.
>
> I’m curious - did anyone enter an issue about this in the OWL API’s 
> issue tracker in GitHub? If so, I’d be interested in having a look. 
> We’ve submitted issues against their API before, and they tend to be 
> responsive with implementing fixes.

I'm unaware of a submitted issue on this. But it's such an obvious error, 
and has persisted for so many years, that I presume it is 
intentional--well, perhaps that's the wrong word. I mean that I suspect 
dealing with the peculiarities of OBO format is out of scope for the API.

>>
>> I see two ways around the problem:
>>
>> 1) Use the OWL file http://purl.obolibrary.org/obo/pr.owl instead of 
>> the OBO file currently downloaded by BioPortal, or
>> 2) Run the same converter we do (which I can supply if desired) to 
>> post-process the RDF/XML after BioPortal conversion.
>>
>> I suspect method #1 will be much easier to implement, but I'm not 
>> able to make the change myself anymore (I'm not granted editing 
>> ability even after signing in and even though I'm the contact person 
>> for PR).
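
For illustration, option 2 could look roughly like the Python sketch below. This is not the actual PR post-processing script, which is not shown in this thread; `rewrite_purls` is a hypothetical name, and the sketch assumes the converted RDF/XML spells out full OBO PURL IRIs (it would not catch IRIs abbreviated through XML namespace prefixes or entities).

```python
import re


def rewrite_purls(rdf_xml, idspaces):
    """Replace OBO PURL IRIs with the URI base declared for their idspace.

    idspaces maps a prefix (e.g. 'HGNC') to the URI base from the OBO header.
    """
    if not idspaces:
        return rdf_xml
    prefixes = "|".join(re.escape(p) for p in idspaces)
    # Match e.g. http://purl.obolibrary.org/obo/HGNC_8910, but only for
    # prefixes that have an idspace declaration.
    pattern = re.compile(
        r"http://purl\.obolibrary\.org/obo/(" + prefixes + r")_([A-Za-z0-9]+)"
    )
    return pattern.sub(lambda m: idspaces[m.group(1)] + m.group(2), rdf_xml)
```

A base URI containing `&` would need XML escaping before substitution; the HGNC base quoted above does not contain one.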
>
>
> We’re somewhat resource constrained at the moment, so I think option 
> #1 would be the more feasible approach. I looked at the list of 
> administrators for the PR - there's only one account listed where the 
> account name actually matches your email address. Perhaps that was an 
> older account of yours? At any rate, if you give me your account name 
> in BioPortal, I can add it to the list of administrators for PR.

No, that's my current account. I am able to log in without trouble. It's 
just that the 'edit' button is nowhere to be found. In the past I was 
able to edit with no issue.

>> I'm also not sure if this is best for users, who up to now have been 
>> downloading the reasoned OBO from BioPortal. Would they be forced to 
>> use the OWL version, or can BioPortal download both formats?
>
> I’m not sure I know how to answer what’s best for the user community. 
> I think as the ontology maintainer, it’s up to you which version you’d 
> like us to serve. There’s no way in BioPortal currently for an 
> ontology entry to have more than one ontology source file association. 
> In other words, if you changed the PR entry to use the OWL version, 
> there would be no way for an end user to get the reasoned OBO version.

That's what I thought. I'll have to weigh the pros and cons of making 
such a change once I have access.

> Kind regards,
> Jennifer
>
