[bioontology-support] Bad URIs used in Mappings for PR

Jennifer Leigh Vendetti vendetti at stanford.edu
Fri Jun 11 14:19:24 PDT 2021


Hi Darren,

Just wanted to make sure I’m on the same page with you. I issued a REST call to get the mappings between PR and CIDO:

https://data.bioontology.org/mappings?ontologies=PR,CIDO&display_context=false&display_links=false

In the results, I assume you’re referring to the following LOOM mapping between the classes in PR and CIDO that have a human readable label of “PGR (human)”:

{
  "id": null,
  "source": "LOOM",
  "classes": [
    {
        "@id": "http://purl.obolibrary.org/obo/HGNC_8910",
        "@type": "http://www.w3.org/2002/07/owl#Class"
    },
    {
        "@id": "http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=8910",
        "@type": "http://www.w3.org/2002/07/owl#Class"
    }
  ],
  "process": null,
  "@id": "",
  "@type": "http://data.bioontology.org/metadata/Mapping"
}
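For anyone who wants to reproduce this lookup, here is a minimal sketch in Python. The helper names `mappings_url` and `mappings_for_class` are made up for illustration, and the sketch does no network access: a real request to data.bioontology.org would also need your own API key, and the exact envelope of the response is not assumed here. The filter simply takes a list of mapping objects shaped like the one above.

```python
from urllib.parse import urlencode

API_ROOT = "https://data.bioontology.org"


def mappings_url(ontologies, api_root=API_ROOT):
    """Build the /mappings URL for a list of ontology acronyms (hypothetical helper)."""
    query = urlencode(
        {
            "ontologies": ",".join(ontologies),
            "display_context": "false",
            "display_links": "false",
        },
        safe=",",  # keep the comma between acronyms unescaped, as in the URL above
    )
    return f"{api_root}/mappings?{query}"


def mappings_for_class(mappings, class_id):
    """Return the mapping objects in which class_id appears among the mapped classes."""
    return [
        m
        for m in mappings
        if any(c.get("@id") == class_id for c in m.get("classes", []))
    ]
```

Calling `mappings_url(["PR", "CIDO"])` yields the same URL as the REST call quoted earlier, and `mappings_for_class(parsed, "http://purl.obolibrary.org/obo/HGNC_8910")` picks out mappings such as the LOOM one shown.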

See the rest of my comments inline below.


On Jun 8, 2021, at 3:16 PM, Darren Natale <dan5 at georgetown.edu> wrote:

In looking at mappings between the PRotein Ontology and others--specifically those that were found by LOOM--I noticed that the URLs indicated for PR-defined entities are sometimes incorrect. For example, between PR and CIDO (Coronavirus Infectious Disease Ontology) there is a LOOM-based mapping for "PGR (human)" that, after removing the BioPortal additions to the URL, becomes:

in PR: http://purl.obolibrary.org/obo/HGNC_8910
in CIDO: http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=8910

What's interesting about this is that CIDO actually imports this very same class from PR. The only difference is that CIDO is getting the URI from the OWL version of PR, while the URI supplied by PR appears to be coming from a conversion of the OBO-file CURIE to a URI. Unfortunately, that CURIE-to-URI conversion fails to account for the idspace declaration in the OBO header:

idspace: HGNC http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=

I know that the conversion failure is not a BioPortal issue because we encounter the very same problem when we do the OBO-to-OWL conversion ourselves (I believe it is actually an OWLAPI issue). The difference is that we run a post-processing script to fix them.
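To make the difference concrete, here is a rough Python sketch of a CURIE-to-URI expansion that honors idspace declarations. It is not the OWL API's code or PR's post-processing script; `parse_idspaces` and `expand_curie` are hypothetical names, and the fallback rule (OBO PURL base plus PREFIX_LOCALID) is assumed from the two URIs quoted above.

```python
OBO_PURL = "http://purl.obolibrary.org/obo/"


def parse_idspaces(header_lines):
    """Collect 'idspace: PREFIX URI' declarations from OBO header lines."""
    idspaces = {}
    for line in header_lines:
        if line.startswith("idspace:"):
            parts = line[len("idspace:"):].split()
            if len(parts) >= 2:
                # parts[0] is the prefix, parts[1] the URI base; any trailing
                # description is ignored in this sketch.
                idspaces[parts[0]] = parts[1]
    return idspaces


def expand_curie(curie, idspaces):
    """Expand PREFIX:LOCAL using a declared idspace, else the default OBO PURL form."""
    prefix, _, local = curie.partition(":")
    if prefix in idspaces:
        return idspaces[prefix] + local
    return f"{OBO_PURL}{prefix}_{local}"
```

With the HGNC declaration parsed, `expand_curie("HGNC:8910", idspaces)` gives the genenames.org URI that CIDO uses; with an empty mapping it gives the purl.obolibrary.org form that BioPortal currently records for PR.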


So if I understand you correctly, you feel that the class ID for the “PGR (human)” class in the PRotein Ontology in BioPortal is recorded as:

http://purl.obolibrary.org/obo/HGNC_8910

… when instead it should be recorded as:

http://www.genenames.org/cgi-bin/gene_symbol_report?hgnc_id=8910

?

And, the underlying reason is an issue in the OWL API where it ignores the idspace declarations in the OBO file? I looked at the PGR (human) term declaration in pro_reasoned.obo on line 413,549:

[Term]
id: HGNC:8910
name: PGR (human)
namespace: gene
def: "A protein coding gene PGR in human." [PRO:DNx]
comment: Category=external.
is_a: SO:0001217 ! protein_coding_gene
relationship: only_in_taxon NCBITaxon:9606 ! Homo sapiens

I think I see what you mean, in that the HGNC idspace declaration supplies the URL prefix that should be prepended to the numeric portion of the ID.

I’m curious - did anyone enter an issue about this in the OWL API’s issue tracker in GitHub? If so, I’d be interested in having a look. We’ve submitted issues against their API before, and they tend to be responsive with implementing fixes.



I see two ways around the problem:

1) Use the OWL file http://purl.obolibrary.org/obo/pr.owl instead of the OBO file currently downloaded by BioPortal, or
2) Run the same converter we do (which I can supply if desired) to post-process the RDF/XML after BioPortal conversion.

I suspect method #1 will be much easier to implement, but I'm not able to make the change myself anymore (I'm not granted editing ability even after signing in and even though I'm the contact person for PR).
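For illustration only, option 2 could look something like the Python sketch below. This is not the actual PR converter mentioned above; `rewrite_purls` is a hypothetical name, and the sketch assumes the converted RDF/XML contains default OBO PURLs of the form PREFIX_LOCALID for each prefix that has an idspace declaration.

```python
import re


def rewrite_purls(rdf_text, idspaces):
    """Replace default OBO PURLs with idspace-based URIs for the declared prefixes.

    idspaces maps a prefix (e.g. "HGNC") to its declared URI base.
    Caveat: a URI base containing '&' would need XML escaping; not handled here.
    """
    if not idspaces:
        return rdf_text
    prefixes = "|".join(re.escape(p) for p in idspaces)
    pattern = re.compile(
        r"http://purl\.obolibrary\.org/obo/(" + prefixes + r")_([A-Za-z0-9]+)"
    )
    # group(1) is the prefix, group(2) the local part of the identifier.
    return pattern.sub(lambda m: idspaces[m.group(1)] + m.group(2), rdf_text)
```

Only prefixes present in `idspaces` are rewritten, so classes such as SO_0001217 keep their OBO PURLs.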


We’re somewhat resource constrained at the moment, so I think option #1 would be the more feasible approach. I looked at the list of administrators for the PR - there's only one account listed where the account name actually matches your email address. Perhaps that was an older account of yours? At any rate, if you give me your account name in BioPortal, I can add it to the list of administrators for PR.


I'm also not sure if this is best for users, who up to now have been downloading the reasoned OBO from BioPortal. Would they be forced to use the OWL version, or can BioPortal download both formats?


I’m not sure I can say what’s best for the user community. I think as the ontology maintainer, it’s up to you which version you’d like us to serve. There’s currently no way in BioPortal for an ontology entry to be associated with more than one source file. In other words, if you changed the PR entry to use the OWL version, there would be no way for an end user to get the reasoned OBO version from BioPortal.

Kind regards,
Jennifer
