[bioontology-support] NCBO Annotator

Clement Jonquet jonquet at lirmm.fr
Sun Nov 22 23:07:07 PST 2020


Dear Pranhav, 

>> The NCBO annotator sometimes annotates the same word multiple times, and I was wondering if there would be any setting to have every word only be annotated once?

I can give you an explanation, but you will need to specify which situation you are referring to:

1- The same word is identified as the label of different concepts in different ontologies, so several annotations are produced. This is the expected behavior, as the Annotator is a service that annotates text data with all the ontologies in AgroPortal. To reduce the number of duplicates, reduce the number of target ontologies. The only consequence is that you may miss a unique annotation that one of the excluded ontologies would have produced.

2- Several occurrences of the same word in your text will produce several matches at different positions in the text, so several annotations are produced. This is also the expected behavior. The idea of the Annotator is to produce an annotation for every possible fragment of your text (modulo the longestOnly and partialWords parameters). The annotations may look the same, but they are not: they differ by their character offsets. If you do not care about the full list of annotations but would like a global view of your text (i.e., which concepts annotate your text overall), you can use the scoring parameter included in the AnnotatorPlus. It scores each annotation independently and then sums the scores of all annotations made with the same concept into one unique score. You can then ignore the duplicates when dealing with the results.

3- If you see other types of duplicate results, this would be a bug on the service side, often caused by a dictionary that has not been refreshed.
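
For cases 1 and 2, the duplicates can also be collapsed on the client side once the results are parsed. Below is a minimal Python sketch, not the Annotator's own logic. It assumes each result is a dict with an "annotatedClass" entry carrying an "@id" and an "annotations" list whose items have "from" and "to" offsets; please check these field names against the actual response you receive. The function names are hypothetical.

```python
def concepts_overview(results):
    """Case 2: map each concept ID to its sorted, distinct (from, to) offsets.

    Assumes each result looks like
    {"annotatedClass": {"@id": ...}, "annotations": [{"from": ..., "to": ...}]}.
    """
    overview = {}
    for result in results:
        concept_id = result["annotatedClass"]["@id"]
        spans = overview.setdefault(concept_id, set())
        for annotation in result.get("annotations", []):
            spans.add((annotation["from"], annotation["to"]))
    # One entry per concept, however many times it occurs in the text.
    return {concept_id: sorted(spans) for concept_id, spans in overview.items()}


def first_concept_per_span(results):
    """Case 1: keep only the first concept seen for each (from, to) span.

    "First" means first in the order of the results list, which is an
    arbitrary choice made for this sketch.
    """
    chosen = {}
    for result in results:
        concept_id = result["annotatedClass"]["@id"]
        for annotation in result.get("annotations", []):
            span = (annotation["from"], annotation["to"])
            chosen.setdefault(span, concept_id)
    return chosen
```

The first function gives the "global view" (which concepts annotate the text overall); the second gives one annotation per text fragment, at the cost of discarding the concepts that other ontologies proposed for the same fragment.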

Regards
Clement


-------------------------------------------------------------------------------------------
Dr. Clement JONQUET  –  PhD in Informatics
Associate Research Scientist – INRAE (MISTEA)
Associate Professor – University of Montpellier (LIRMM)
-------------------------------------------------------------------------------------------

> On 21 Nov 2020, at 06:03, John Graybeal <jgraybeal at stanford.edu> wrote:
> 
> Hello,
> 
> No, with apologies, there is no option like this in the UI. In the API you could choose to only 'look at' the first annotation for each term or phrase, and I believe choosing 'match longest' will eliminate matches of terms within other terms. But in the UI all annotations are returned.
> 
> I created ticket https://github.com/ncbo/bioportal-project/issues/192 to capture this issue. Thank you for your request!
> 
> John
> 
>> On Nov 19, 2020, at 2:37 PM, Pranhav Sundararajan <pranhav16 at gmail.com> wrote:
>> 
>> Hello,
>> 
>> The NCBO annotator sometimes annotates the same word multiple times, and I was wondering if there would be any setting to have every word only be annotated once?
>> 
>> Thank you
>> 
>> On Tue, Nov 17, 2020 at 6:14 PM Michael Dorf <mdorf at stanford.edu> wrote:
>> Hi Pranhav,
>> 
>> Thank you for contacting us. Please check out our REST sample code repository here:
>> 
>> https://github.com/ncbo/ncbo_rest_sample_code
>> 
>> The Annotator Java example can be referenced from here:
>> 
>> https://github.com/ncbo/ncbo_rest_sample_code/blob/master/java/src/AnnotateText.java
>> 
>> Doing a quick GitHub search for “ncbo annotator” also yields a number of code sources that reference our Annotator REST API and that you could adopt in your application:
>> 
>> https://github.com/search?q=ncbo+annotator
>> 
>> Hope this helps and you find the code you are looking for!
>> 
>> Michael
>> 
>> 
>>> On Nov 17, 2020, at 9:25 AM, Pranhav Sundararajan <pranhav16 at gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I'm trying to use the NCBO Annotator in a program that will allow for multiple datasets and files to be annotated. Where can I find the code that will allow me to use the Annotator programmatically?
>>> 
>>> Thank you
>>> _______________________________________________
>>> bioontology-support mailing list
>>> bioontology-support at lists.stanford.edu
>>> https://mailman.stanford.edu/mailman/listinfo/bioontology-support
>> 
> 
> ========================
> John Graybeal
> Technical Program Manager
> Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
> Stanford Center for Biomedical Informatics Research
> 650-736-1632  | ORCID  0000-0001-6875-5360
> 
> 
> 
