[bioontology-support] Annotator issues

Michael Dorf mdorf at stanford.edu
Tue Nov 12 11:34:16 PST 2019


Hi Vipina,

Thank you for this report and for giving me the opportunity to investigate it. Indeed, what you are describing appears contrary to my expectations of this functionality. BioPortal uses http://www.w3.org/2004/02/skos/core#altLabel as its default synonym property, so I expected this to work correctly. Yet, after looking at a number of terms defined across different ontologies, it is clear that BioPortal does not handle this property properly. Here are some additional examples from several prominent ontologies that demonstrate the issue:

http://data.bioontology.org/ontologies/SNOMEDCT/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FSNOMEDCT%2F272379006?display=all
http://data.bioontology.org/ontologies/MESH/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FMESH%2FD009422?display=all
http://data.bioontology.org/ontologies/NCBITAXON/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FNCBITAXON%2F2157?display=all

In all these cases, BioPortal has chosen the very first defined value as the synonym. Sadly, as with some of your other excellent reports, there is no immediate fix. I’ve logged the issue below and hope to attend to it in the near future:

https://github.com/ncbo/ontologies_linked_data/issues/97

There IS a workaround, if you are OK with making additional modifications to your ontology. It appears that when defining synonyms using either of the properties below, BioPortal correctly handles the multiple values:

http://www.geneontology.org/formats/oboInOwl#hasExactSynonym
http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym

Examples:

http://data.bioontology.org/ontologies/CHEBI/classes/http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FCHEBI_33250?display=all
http://data.bioontology.org/ontologies/ENVO/classes/http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FPO_0007033?display=all

If you are able to revise your synonym definitions using one of the above properties, the multiple values should be recognized properly. Hope this helps!
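For an ontology with many classes, the property swap described above could be scripted rather than done by hand. The following is a minimal sketch using only the Python standard library; the function name `alt_labels_to_exact_synonyms` and the sample document are illustrative assumptions, not part of BioPortal or of the CC_SNOMED file.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
OBO_IN_OWL = "http://www.geneontology.org/formats/oboInOwl#"

# Register prefixes so the serialized output keeps readable names.
ET.register_namespace("skos", SKOS)
ET.register_namespace("oboInOwl", OBO_IN_OWL)
ET.register_namespace("owl", "http://www.w3.org/2002/07/owl#")
ET.register_namespace("rdf", "http://www.w3.org/1999/02/22-rdf-syntax-ns#")
ET.register_namespace("rdfs", "http://www.w3.org/2000/01/rdf-schema#")


def alt_labels_to_exact_synonyms(root):
    """Rename every skos:altLabel element to oboInOwl:hasExactSynonym in place.

    Hypothetical helper; returns the number of elements renamed.
    """
    renamed = 0
    # Materialize the matches first so renaming cannot disturb the traversal.
    for element in list(root.iter(f"{{{SKOS}}}altLabel")):
        element.tag = f"{{{OBO_IN_OWL}}}hasExactSynonym"
        renamed += 1
    return renamed


# Illustrative input, trimmed from the class definition quoted in this thread.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <owl:Class rdf:about="http://snomed.info/id/56265001">
    <rdfs:label>Heart disease</rdfs:label>
    <skos:altLabel xml:lang="en">Morbus cordis</skos:altLabel>
    <skos:altLabel xml:lang="en">Cardiopathy</skos:altLabel>
  </owl:Class>
</rdf:RDF>"""

root = ET.fromstring(SAMPLE)
count = alt_labels_to_exact_synonyms(root)
converted = ET.tostring(root, encoding="unicode")
print(converted)
```

For a real file, `ET.parse(path)` followed by `tree.write(path, encoding="utf-8", xml_declaration=True)` would take the place of the in-memory sample. Note that ElementTree's default parser discards XML comments, so a dedicated OWL tool may be preferable if those need to survive.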

Thanks again!

Michael



----------------------------------------------------
Michael Dorf
Chief Software Architect
The National Center for Biomedical Ontology
Stanford Biomedical Informatics Research
mdorf at stanford.edu
O: 650-723-0357
M: 650-995-4374
----------------------------------------------------

On Nov 11, 2019, at 9:57 AM, Kuttichi Keloth, Vipina <vk396 at njit.edu> wrote:

Hi Michael,

Thank you for the detailed response. I tried to resolve the first issue that you mentioned - not properly defining the synonyms.
Now, for the concept with id 56265001 (Heart disease) that we discussed previously, the attached figure shows the modified details.

I have used http://www.w3.org/2004/02/skos/core#altLabel to define synonyms and below is the term definition now.

    <!-- http://snomed.info/id/56265001 -->

    <owl:Class rdf:about="http://snomed.info/id/56265001">
        <rdfs:subClassOf rdf:resource="http://snomed.info/id/301095005"/>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Heart disease</rdfs:label>
        <skos:altLabel xml:lang="en">Morbus cordis</skos:altLabel>
        <skos:altLabel xml:lang="en">Cardiopathy</skos:altLabel>
        <skos:altLabel xml:lang="en">Disorder of heart</skos:altLabel>
        <skos:altLabel xml:lang="en">Cardiac disorder</skos:altLabel>
        <skos:altLabel xml:lang="en">Heart disease (disorder)</skos:altLabel>
    </owl:Class>

Is it not possible to give multiple values to altLabel? I need to have more than one synonym. What should I do to achieve this?

In the attached figure Synonyms property shows only Cardiopathy and not the other values like Morbus Cordis, Disorder of heart, etc. So I guess Synonyms is taking only the first value of altLabel.

Thank you.
Vipina

On Wed, Oct 23, 2019 at 3:05 PM Michael Dorf <mdorf at stanford.edu> wrote:
Hi Vipina,

There are two issues at play here. The first is that the ontology does not properly define synonyms. The second is the earlier mentioned issue that prevents regular Annotator refreshes (https://github.com/ncbo/ncbo_annotator/issues/8).

The term “Cardiac disorder” contains multiple entries of “http://www.w3.org/2000/01/rdf-schema#label”, which does not translate into synonyms. By default, BioPortal uses the properties listed below to define its own reserved property called “synonym”. A custom synonym property is allowed, but the ontology submitter must specify it on the metadata form when uploading the ontology.

http://www.w3.org/2004/02/skos/core#altLabel
http://www.geneontology.org/formats/oboInOwl#hasBroadSynonym
http://purl.obolibrary.org/obo/synonym
http://www.geneontology.org/formats/oboInOwl#hasExactSynonym
http://www.geneontology.org/formats/oboInOwl#hasNarrowSynonym
http://www.geneontology.org/formats/oboInOwl#hasRelatedSynonym

The default “prefLabel” property is set using one of these properties (unless the user specified a custom one):

http://www.w3.org/2004/02/skos/core#prefLabel
http://www.w3.org/2000/01/rdf-schema#label
http://data.bioontology.org/metadata/def/prefLabel

As you can see, the “http://www.w3.org/2000/01/rdf-schema#label” property is used to determine a prefLabel for the term. When there are multiple entries, BioPortal either picks the English one (if multiple languages are used), or the very first one as the de facto prefLabel.
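The selection rule described above can be illustrated with a small sketch. This is not BioPortal's actual code; `pick_pref_label` is a hypothetical function, and falling back to the first entry when several languages are present but none is English is an assumption.

```python
def pick_pref_label(labels):
    """Pick a de facto prefLabel from (text, lang) pairs given in document order.

    lang is a language tag such as "en", or None when the label has no tag.
    """
    if not labels:
        return None
    languages = {lang for _, lang in labels}
    # Prefer English only when more than one language is in play.
    if len(languages) > 1:
        for text, lang in labels:
            if lang == "en":
                return text
    # Otherwise (or if no English entry exists) take the very first entry.
    return labels[0][0]


# Untagged rdfs:label values, in the order they appear in the latest version.
latest = [
    ("Cardiac disorder", None),
    ("Cardiopathy", None),
    ("Heart disease", None),
]
multilingual = [("Herzkrankheit", "de"), ("Heart disease", "en")]

print(pick_pref_label(latest))
print(pick_pref_label(multilingual))
```

With the untagged labels, the first entry wins, which is consistent with “Cardiac disorder” becoming the new prefLabel for this term.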

In your older version of the ontology (v6), the term “Cardiac disorder” is defined as:

    <!-- http://snomed.info/id/56265001 -->

    <owl:Class rdf:about="http://snomed.info/id/56265001">
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Heart disease</rdfs:label>
    </owl:Class>

In the latest version, the same term is defined as:

    <!-- http://snomed.info/id/56265001 -->

    <owl:Class rdf:about="http://snomed.info/id/56265001">
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Cardiac disorder</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Cardiopathy</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Cardiopathy, NOS</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Disorder of heart</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Heart disease</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Heart disease (disorder)</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Heart disease, NOS</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Morbus cordis</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Morbus cordis, NOS</rdfs:label>
    </owl:Class>

Because of the Annotator refresh issue, both the original prefLabel, “Heart disease”, and the new prefLabel, “Cardiac disorder”, are being stored in the Annotator. That’s why you're seeing two annotations in your results. However, there are NO matches on synonyms, since none are defined. Note the result of this API call:

http://data.bioontology.org/ontologies/CC_SNOMED/classes/http%3A%2F%2Fsnomed.info%2Fid%2F56265001?display=all

As you can see, the “synonym” property contains an empty array.
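To repeat this check for other classes, the URL can be built programmatically. The sketch below only constructs the URL and reads the “synonym” field from an already-decoded JSON record; `class_url` and `synonyms_of` are hypothetical helper names, and the actual HTTP request is left out because it typically requires a BioPortal API key.

```python
from urllib.parse import quote

API_ROOT = "http://data.bioontology.org"


def class_url(ontology, class_iri):
    """Build the class lookup URL, percent-encoding the class IRI."""
    encoded = quote(class_iri, safe="")  # safe="" so ':' and '/' are encoded too
    return f"{API_ROOT}/ontologies/{ontology}/classes/{encoded}?display=all"


def synonyms_of(class_record):
    """Return the synonym list from a decoded class record (a dict)."""
    return list(class_record.get("synonym") or [])


url = class_url("CC_SNOMED", "http://snomed.info/id/56265001")
print(url)

# Stand-in for a decoded response in which no synonyms are defined.
print(synonyms_of({"prefLabel": "Cardiac disorder", "synonym": []}))
```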

I am going to try to refresh our Annotator repository in the upcoming days, so that at least for the time being it is synchronized with the latest version of CC_SNOMED.

Thanks again for your inquiry.

Michael




On Oct 23, 2019, at 9:40 AM, Kuttichikeloth, Vipina <vk396 at njit.edu> wrote:

I am sorry about not explaining it properly.
I am talking about "Cardiac disorder" and sorry that "24 hour ....." was highlighted.
From the synonyms of Cardiac disorder in the attached figure, you can see that "Morbus cordis" and "Cardiopathy" and "Heart disease" are synonyms.
Now when I give the text "Cardiac disorder, Heart disease, Morbus cordis, Cardiopathy" to the Annotator, "Cardiac disorder" and "Heart disease" are annotated to "Cardiac disorder" (the preferred label in this case), as you can see in the second screenshot I attached, while "Morbus cordis" and "Cardiopathy" are not annotated.

On Wed, Oct 23, 2019 at 12:28 PM Michael Dorf <mdorf at stanford.edu> wrote:
Hi Vipina,

Thank you for reaching out. I’ve looked at the term you are referring to, but I am seeing a different picture on my end. None of the synonyms from your screenshot are listed under the term “24 hour diastolic blood pressure (observable entity)”:

<Screen Shot 2019-10-23 at 9.17.50 AM.png>

These synonyms ARE, however, listed under the term “Cardiac disorder” (see below), so your Annotator results appear to be consistent. I’ve flushed our caches, but it hasn’t changed the structure of the “24 hour….” term.

<Screen Shot 2019-10-23 at 9.20.10 AM.png>


The REST endpoint results are also consistent with my screenshot above:

http://data.bioontology.org/ontologies/CC_SNOMED/classes/http%3A%2F%2Fsnomed.info%2Fid%2F314465004?display=all

Is it possible that that page is somehow browser-cached on your end?

Thanks,

Michael




On Oct 23, 2019, at 8:55 AM, Kuttichikeloth, Vipina <vk396 at njit.edu> wrote:

Hi,

I am having an issue trying to annotate synonyms. Some synonyms are getting annotated while some others are not.
For example, for the concept "Cardiac disorder" the first image attached shows all the alternate labels for the concept in CC_SNOMED.
The second image is the result of the Annotator.
I read about sort of a similar issue here https://github.com/ncbo/ncbo_annotator/issues/6 and it was mentioned that it has something to do with the way the ontology is designed.

Is this also an issue with the design of the ontology? But then why are some synonyms getting annotated?

Thank you,
Vipina

On Thu, Oct 17, 2019 at 5:21 PM Kuttichikeloth, Vipina <vk396 at njit.edu> wrote:
Thank you so much for your help and explanation.

~Vipina



On Thu, Oct 17, 2019 at 5:14 PM Michael Dorf <mdorf at stanford.edu> wrote:
Hi Vipina,

I was finally able to identify the issue that was causing the behavior you reported. The culprit is an earlier version of CC_SNOMED ontology, in which the terms in question contained different labels:

Earlier version (cc_v6.owl):

    <!-- http://snomed.info/id/312523009 -->

    <owl:Class rdf:about="http://snomed.info/id/312523009">
        <rdfs:subClassOf rdf:resource="http://snomed.info/id/279316009"/>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Left</rdfs:label>
    </owl:Class>

Latest version:

    <!-- http://snomed.info/id/312523009 -->

    <owl:Class rdf:about="http://snomed.info/id/312523009">
        <rdfs:subClassOf rdf:resource="http://snomed.info/id/279316009"/>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Left (non-mitral) atrioventricular valve structure</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Left (non-mitral) atrioventricular valve structure (body structure)</rdfs:label>
    </owl:Class>

The same applies to the other false match, the term with the id “http://snomed.info/id/312524003”:

Earlier version (cc_v6.owl):

    <!-- http://snomed.info/id/312524003 -->

    <owl:Class rdf:about="http://snomed.info/id/312524003">
        <rdfs:subClassOf rdf:resource="http://snomed.info/id/279316009"/>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Right</rdfs:label>
    </owl:Class>

Latest version:

    <!-- http://snomed.info/id/312524003 -->

    <owl:Class rdf:about="http://snomed.info/id/312524003">
        <rdfs:subClassOf rdf:resource="http://snomed.info/id/279316009"/>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Right (non-tricuspid) atrioventricular valve structure</rdfs:label>
        <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Right (non-tricuspid) atrioventricular valve structure (body structure)</rdfs:label>
    </owl:Class>

We currently have a limitation in the Annotator that prevents us from removing older records when an ontology is re-processed. So when the pref labels changed, both the old and the new values ended up being stored in the Annotator. As a workaround for this specific case, I have removed the old values from the Annotator datastore. Your text “left right left” no longer returns annotations. I’ve logged this issue in our GitHub repository:
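The effect of this limitation can be shown with a toy model. The sketch below is not the Annotator's real datastore; `LabelIndex` and its methods are hypothetical names used only to contrast re-processing that appends labels with re-processing that replaces them.

```python
class LabelIndex:
    """Toy label store keyed by class IRI (illustrative only)."""

    def __init__(self):
        self._labels = {}  # class IRI -> set of labels

    def reprocess_append_only(self, class_iri, labels):
        # Models the limitation: old labels are never removed.
        self._labels.setdefault(class_iri, set()).update(labels)

    def reprocess_replacing(self, class_iri, labels):
        # Models the desired behavior: old labels are dropped first.
        self._labels[class_iri] = set(labels)

    def match(self, phrase):
        """Return IRIs of classes having a label equal to phrase (case-insensitive)."""
        wanted = phrase.lower()
        return sorted(
            iri
            for iri, labels in self._labels.items()
            if any(label.lower() == wanted for label in labels)
        )


IRI = "http://snomed.info/id/312523009"
V6 = ["Left"]
LATEST = [
    "Left (non-mitral) atrioventricular valve structure",
    "Left (non-mitral) atrioventricular valve structure (body structure)",
]

stale = LabelIndex()
stale.reprocess_append_only(IRI, V6)
stale.reprocess_append_only(IRI, LATEST)

fresh = LabelIndex()
fresh.reprocess_replacing(IRI, V6)
fresh.reprocess_replacing(IRI, LATEST)

stale_hits = stale.match("left")  # the v6 label is still present
fresh_hits = fresh.match("left")  # the v6 label was replaced
print(stale_hits, fresh_hits)
```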

https://github.com/ncbo/ncbo_annotator/issues/8

Thank you again for your patience.

Michael



On Oct 15, 2019, at 2:01 PM, Michael Dorf <mdorf at stanford.edu> wrote:

Hi Vipina,

Thanks for your report. I need a bit of time to investigate this. Indeed, this does not appear to be expected behavior. I’ll let you know as soon as I get to the bottom of it.

Thank you for your patience.

Michael




On Oct 15, 2019, at 9:12 AM, Kuttichikeloth, Vipina <vk396 at njit.edu> wrote:

Thank you for the response. I played a little bit more with the ontology, adding multiple labels (synonyms) to each concept and then annotating some text.
I noticed that every mention of "left" or "right" in the text is getting annotated with a concept, as shown in the attached image.
The text I gave to be annotated was "left right left".
I thought I understood what you said in your response, but in that case (and if I understood it correctly), it shouldn't be matching "left", which is only a part of the entire label "Left (non-mitral) atrioventricular valve structure". Am I missing something else here?

Thank you.
Vipina

On Sat, Oct 5, 2019 at 9:37 PM John Graybeal <jgraybeal at stanford.edu> wrote:
Hi Vipina,

Our team discussed your experience Friday and I am offering our latest thoughts for the list, and some of my own observations. (I have the specific example you sent Jennifer but have not copied it here.)

When I remove '(finding)' from Jennifer's example, I still see an annotation for CC-SNOMED. It does not matter what boxes are or are not checked, this still works.

Now, I see that your class (https://bioportal.bioontology.org/ontologies/CC_SNOMED?p=classes&conceptid=http%3A%2F%2Fsnomed.info%2Fid%2F248714006) is currently called simply 'abdominal aortic bruit', and so it seems likely that you have removed the parenthetical expressions from your latest submission. Based on some further examples (using 'abdominal' and 'abdominal aortic'), here is the likely explanation.

When Annotator is looking for matches, it is only looking for class names that exactly match a string in the source text. (If Annotator matched any class that *contained* a string in your text, there would be an impossibly large number of matches for any significant text, and the vast majority of them would be incorrect.)

So for example, Annotator will not match 'abdominal' to the class 'abdominal aortic bruit', because text describing 'abdominal pain' should not result in a match to that detailed class.

Now reconsider your class name that contains '(finding)'. If you remove '(finding)' from the text that you are trying to annotate, the Annotator will no longer consider it a match, because it will not assume that the more general phrase 'abdominal aortic bruit' matches the more specific phrase 'abdominal aortic bruit (finding)'. This seems appropriately conservative, considering the above example of matching 'abdominal' to the longer phrase.
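The whole-label matching described above can be sketched as follows. This is a toy model of the behavior, not the Annotator's actual algorithm; `tokenize` and `annotate` are hypothetical helpers that treat a label as matched only when its complete word sequence appears in the text.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def annotate(text, labels):
    """Return the labels whose full token sequence occurs contiguously in text."""
    text_tokens = tokenize(text)
    hits = []
    for label in labels:
        label_tokens = tokenize(label)
        n = len(label_tokens)
        if n == 0:
            continue
        # Slide a window of the label's length across the text tokens.
        for start in range(len(text_tokens) - n + 1):
            if text_tokens[start:start + n] == label_tokens:
                hits.append(label)
                break
    return hits


# A partial phrase does not match the longer, more detailed label.
print(annotate("abdominal pain", ["abdominal aortic bruit"]))

# Text without '(finding)' does not match a label that includes it.
print(annotate("abdominal aortic bruit", ["abdominal aortic bruit (finding)"]))

# The full label text does match.
print(annotate("abdominal aortic bruit (finding) noted",
               ["abdominal aortic bruit (finding)"]))
```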

Checking 'Match partial words' does not have any impact on this behavior; it is simply used to enable a short word to match a longer word, for example, to match suffixed words.

On another point, if you compare your results with the matches found in SNOMED-CT, there you will find matches to the shorter strings—but only because SNOMED-CT has those exact shorter strings as classes.

This matching behavior is different from searching, which will match any appearance of the string in a class name, but will prioritize more complete matches. This is also appropriate, to let people find classes that may contain only part of the entered search string.

I hope this explains your experience so far. Please feel free to follow up with further questions or concerns.

John


On Wed, Oct 2, 2019 at 2:31 PM Jennifer Leigh Vendetti <vendetti at stanford.edu> wrote:
Hello Vipina,

Could you give me a specific example of the text you are entering where you don’t see results that you think should be present? I looked at your ontology with the Annotator just now. I entered the text of one of your class names in the Annotator and selected CC_SNOMED under advanced options. The Annotator returned results as I would have expected:


<Screenshot 2019-10-02 11.27.49.png>

Kind regards,
Jennifer



On Oct 2, 2019, at 8:57 AM, Kuttichikeloth, Vipina <vk396 at njit.edu> wrote:

Hi,

Recently I uploaded an ontology named Cardiology Component of SNOMED CT (CC_SNOMED). The visibility is set to private. I can see and browse the classes on BioPortal.

I want to annotate some medical text using this ontology that I created. In the advanced options under "select ontologies", I am able to select CC_SNOMED, but when I click "Get annotations" no results appear. I have tried putting concepts that exist in CC_SNOMED in the medical text, but still get no results. Also, if I select SNOMED CT, the text gets annotated and all these concepts are shown. Is there anything I am missing? Please let me know why this is not working.

Thank you.
Vipina





_______________________________________________
bioontology-support mailing list
bioontology-support at lists.stanford.edu<mailto:bioontology-support at lists.stanford.edu>
https://mailman.stanford.edu/mailman/listinfo/bioontology-support


========================
John Graybeal
Technical Program Manager
Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
Stanford Center for Biomedical Informatics Research
650-736-1632


<annotator.png>


<cardiac disorder.png><cardiac_annotated.png>


<heart_disease.png>
