
[bioontology-support] [ORDO] bioportal issue

Jennifer Leigh Vendetti vendetti at stanford.edu
Wed Aug 23 16:28:51 PDT 2017


Hello Marc,

Apologies that it took me some time to sort this one out. I had to correspond with the developers of the OWL API. They confirmed that there was a bug in their API that generated invalid blank node identifiers when saving ontologies in RDF/XML format. They released a new version of their API, which I have incorporated into BioPortal. I reprocessed ORDO, which is now fully parsed and available.


On Jul 27, 2017, at 8:49 AM, Marc Hanauer <marc.hanauer at inserm.fr> wrote:


Hello Jennifer,

The structure was exactly the same in ORDO V2.3 (in particular for this example of an embedded annotation).

Yes, I had noticed that. I think the issue stemmed from the fact that we were using a newer version of the OWL API (v4.3.1) when you submitted ORDO 2.4. The issue didn’t appear to be present in the version of the OWL API that we were using (v4.2.5) when you submitted ORDO 2.3 in December of last year.



The OWL file is generated by code and then checked with Protege.

Thanks for that information.



We have tried to reproduce your issue using the OWL API, but we never managed to obtain this kind of axiom.



My correspondence with the OWL API developers included example code for how this problem arose in BioPortal. If anyone is interested in the particulars, the conversation is still available in their public list archive:

https://sourceforge.net/p/owlapi/mailman/owlapi-developer/?viewmonth=201707

Kind regards,
Jennifer




Please feel free to contact our developer Samuel Demarest (in cc), who is in charge of the generating code, if you have any other questions. I'll be back in September.

Best regards,

www.orpha.net<https://www.orpha.net/> | Twitter @Orphanet<https://twitter.com/Orphanet>
Marc HANAUER
Directeur Adjoint / Deputy Director
Directeur technique / Chief technology officer
Stratégie et Innovation
marc.hanauer at inserm.fr

ORPHANET - INSERM US14
Plateforme Maladies Rares / Rare Disease Platform
96 rue Didot
75014 PARIS
FRANCE
On 14/07/2017 at 18:41, Jennifer Leigh Vendetti wrote:
Hello Marc,

I don’t have a solution yet for parsing ORDO 2.4 in BioPortal, but I’ll provide a summary below of what I’ve discovered so far.

The system is having difficulty handling classes in your ontology that have annotations nested 3 levels deep. Here’s a screenshot of your ontology in the Protege ontology editor with class “48,XXYY syndrome” highlighted:


<Screenshot 2017-07-14 09.18.32.png>


Note the second occurrence of hasDbXref in the screenshot where there’s an annotation on an annotation on an annotation.

I think I mentioned in a previous correspondence that we use the OWL API internally for parsing. When the OWL API loads this ontology, it inserts a blank node for these nested annotations with a blank node identifier in the form of “_:genid”. Using the example in the screenshot above, this is a snippet of the ontology source, as loaded by the OWL API:

<Annotation>
  <annotatedSource>
    <Axiom rdf:nodeID="_:genid25">
      <annotatedSource rdf:resource="http://www.orpha.net/ORDO/Orphanet_10"/>
      <annotatedProperty rdf:resource="http://www.geneontology.org/formats/oboInOwl#hasDbXref"/>
      <annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ICD-10:Q98.8</annotatedTarget>
      <obo1:ECO_0000218 rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Attributed</obo1:ECO_0000218>
    </Axiom>
  </annotatedSource>
  <annotatedProperty rdf:resource="http://purl.obolibrary.org/obo/ECO_0000218"/>
  <annotatedTarget rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Attributed</annotatedTarget>
  <obo1:ECO_0000218 rdf:datatype="http://www.w3.org/2001/XMLSchema#string">NTBT (narrower term maps to a broader term)</obo1:ECO_0000218>
</Annotation>

Although we can load the ontology without errors, the second step in the parsing process fails when we attempt to serialize the data to ntriples format to load it into our RDF store:

Illegal rdf:nodeID value '_:genid25' rapper: Failed to parse file

I went looking for a specification for what is considered valid syntax for blank node identifiers, but so far I haven’t been able to find anything. It’s unclear to me if the OWL API is generating invalid RDF, or if the Raptor RDF parser isn’t handling blank nodes that it should be. I will likely need to correspond with the developers of one or both projects.
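
For reference, the W3C RDF/XML Syntax Specification requires an rdf:nodeID value to match the XML NCName production, which does not allow a colon, so '_:genid25' is not a legal value there. Below is a minimal Python sketch of that check. It is an illustration only: the regular expression covers just the ASCII subset of NCName, and the function names (is_valid_node_id, repair_node_id) are hypothetical and not part of the OWL API or Raptor.

```python
import re

# Simplified, ASCII-only approximation of the XML NCName production:
# a letter or underscore, followed by letters, digits, '.', '-' or '_'.
# The full production also admits many non-ASCII characters, but never ':'.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def is_valid_node_id(value):
    """Return True if value is usable as an rdf:nodeID (ASCII subset only)."""
    return NCNAME_ASCII.fullmatch(value) is not None


def repair_node_id(value):
    """Hypothetical repair: drop a leading '_:' prefix if one is present."""
    return value[2:] if value.startswith("_:") else value


if __name__ == "__main__":
    for candidate in ("_:genid25", repair_node_id("_:genid25")):
        print(candidate, is_valid_node_id(candidate))
```

Under this check, the identifier from the rapper error message is rejected because of the colon, while the same identifier without the '_:' prefix is accepted.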

Apologies that it hasn’t been straightforward so far to track this down.

On another note, I’m wondering if you could give me some basic information about how your OWL file is generated. Is it done programmatically? Or, do you use an ontology editing environment to maintain the ontology?

Kind regards,
Jennifer



On Jul 10, 2017, at 6:26 PM, Jennifer Leigh Vendetti <vendetti at stanford.edu> wrote:

Hello Marc,

I’ve entered an issue in our tracker [1], and will have a look.

Kind regards,
Jennifer

[1] https://github.com/ncbo/bioportal-project/issues/31


On Jul 10, 2017, at 2:11 PM, marc.hanauer at inserm.fr wrote:


Hello Jennifer,

Sorry about that, but once again the new version of ORDO (2.4) seems to have issues: 2.3 (Uploaded, Error Rdf)     07/05/2017     07/05/2017
https://bioportal.bioontology.org/ontologies/ORDO


This version was produced the same way as the previous one.

Any ideas?

Best regards,

---

Marc HANAUER
Directeur Adjoint / Deputy Director
Directeur technique / Chief technology officer
Stratégie et Innovation
marc.hanauer at inserm.fr

ORPHANET - INSERM US14
Plateforme Maladies Rares / Rare Disease Platform
96 rue Didot
75014 PARIS
FRANCE


On 11-08-2016 at 19:22, Jennifer Leigh Vendetti wrote:

Hello Marc,

We've addressed the issue I mentioned below.  Version 2.2 of ORDO is parsed and available in BioPortal:

http://bioportal.bioontology.org/ontologies/ORDO

Apologies again that this one took us some time to address.

Best,
Jennifer


On Jul 13, 2016, at 4:21 PM, Jennifer Leigh Vendetti <vendetti at stanford.edu> wrote:

Hi Marc,

I am writing with a status report regarding BioPortal's failure to handle ORDO 2.2.

As you may already know, BioPortal uses the OWL API [1] internally for ontology parsing.  I wrote some Java code, independent of the BioPortal application, to make sure the OWL API handles your ontology properly.  The parsing succeeds, which indicates to me that your ontology file is valid.  When the OWL API parses your ontology file, it emits a very large number of "unparsed triple" messages (roughly 15.5K of them), e.g.:

[OWLRDFConsumer.java:1294] Unparsed triple: _:genid120260 -> http://www.w3.org/2002/07/owl#annotatedProperty -> http://purl.obolibrary.org/obo/ECO_0000218
[OWLRDFConsumer.java:1294] Unparsed triple: _:genid120260 -> http://www.w3.org/2002/07/owl#annotatedSource -> _:genid120261

The size of the output from the OWL API is causing a buffer overflow / deadlock in BioPortal's source code.  I am currently working toward fixing the issue, and hope to have something available soon.
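
One common way a very chatty child process can deadlock its caller is an undrained pipe: the child blocks once the operating system's pipe buffer fills, while the parent waits for the child to exit. Whether this is exactly what happened in BioPortal is an assumption; the message above does not say. The Python sketch below illustrates the general pattern and the usual remedy of reading both streams until end-of-file. The helper name run_and_capture and the NOISY_CMD stand-in for a noisy parser are hypothetical.

```python
import subprocess
import sys

# Stand-in for a parser that writes a large number of warnings to stderr
# (hypothetical; it only imitates the shape of the "Unparsed triple" output).
NOISY_CMD = [
    sys.executable,
    "-c",
    "import sys\n"
    "for i in range(20000): sys.stderr.write('Unparsed triple: _:genid%d\\n' % i)",
]


def run_and_capture(cmd):
    """Run cmd and return (exit code, stdout text, stderr text)."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    # communicate() drains both pipes until EOF before waiting for exit,
    # so the child cannot stall on a full pipe buffer. Calling proc.wait()
    # first and reading afterwards can hang once the output is large enough.
    out, err = proc.communicate()
    return proc.returncode, out, err


if __name__ == "__main__":
    code, out, err = run_and_capture(NOISY_CMD)
    print(code, len(err.splitlines()))
```

If the output is not needed at all, redirecting it to subprocess.DEVNULL or to a file avoids holding it in memory.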

Apologies that this one is taking some time for us to track down / fix.

Best,
Jennifer

[1] http://owlcs.github.io/owlapi/



On Jun 27, 2016, at 3:18 AM, Marc Hanauer <marc.hanauer at inserm.fr> wrote:


Hello Jennifer,

I updated our ontology (ORDO) last week as usual by putting the file on our own server. This time, the BioPortal website seems to have attempted the upload, but it failed with an error.

The version on our own server is now "2.2", but it appears with this line on BioPortal:

2.1 (Uploaded, Error Rdf)       06/22/2016      06/22/2016      OWL<http://data.bioontology.org/ontologies/ORDO/submissions/9/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb> | Diff<http://data.bioontology.org/ontologies/ORDO/submissions/9/download_diff?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb>

Any idea?

Best regards,

Marc HANAUER
Directeur Adjoint / Deputy Director
Directeur technique / Chief technology officer
Stratégie et Innovation
marc.hanauer at inserm.fr

ORPHANET - INSERM US14
Plateforme Maladies Rares / Rare Disease Platform
96 rue Didot
75014 PARIS
FRANCE
On 25/02/2016 at 00:41, Jennifer Leigh Vendetti wrote:
Hello Marc,

On Feb 24, 2016, at 6:45 AM, Marc Hanauer <marc.hanauer at inserm.fr> wrote:

Dear support,
We manage ORDO (the Orphanet Rare Disease Ontology) and released an update recently (2 days ago).
Usually (for previous versions), we just update the file on our own server (keeping the same URL: http://www.orphadata.org/data/ORDO/ordo_orphanet.owl ) and the BioPortal website then automatically uploads it.

This no longer seems to be the case (the version in BioPortal remains v2.0 instead of the new 2.1).

Do we need to perform any manual action in order to update it on BioPortal?


You don't need to do anything manual.  We were having some performance issues with our site and temporarily disabled the cron jobs in our production environment that pull new versions of ontologies.  The cron job has been reenabled today, so the new version of your ontology should be pulled and processed overnight.



By the way, with our account, we still have trouble accessing the BioPortal interface: when we sign in and try to reach the ORDO page, we get an error message, which does not happen when we are not logged in (see my previous message below about this).


Sorry you're having difficulties.  I tried reproducing this behavior here and so far haven't had any luck.  I've viewed your ontology in BioPortal, both when logged in and logged out (screen shot below).  I also verified that your user account of "ORDO_Orphanet" is valid, and I don't see any issues there.  Are you still seeing this behavior?

Jennifer



<Mail Attachment.png>


_______________________________________________
bioontology-support mailing list
bioontology-support at lists.stanford.edu
https://mailman.stanford.edu/mailman/listinfo/bioontology-support





