[bioontology-support] About the issue of annotating

John Graybeal jgraybeal at stanford.edu
Wed Mar 17 19:13:11 PDT 2021


Hello,

I confess to not having a complete understanding of all the nuances of the annotator, so I'm afraid I can't fully satisfy your request for a detailed explanation.  If you have not done so already, I suggest you search the previous questions about the annotator in our Nabble repo of support emails (http://ncbo-support.2288202.n4.nabble.com/).

It seems to me that '6-10', '11-15', and '1-5' all pass as tokens >= 3 characters; '6' and '> 15' do not ('> 15' is 2 tokens separated by a space, and each token is 1 or 2 characters).
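The reading above can be checked with a small sketch. It assumes, as a guess rather than a confirmed description of the Annotator's internals, that text is split on whitespace and that a token needs at least 3 characters to be indexed. The names `MIN_TOKEN_LENGTH` and `indexable_tokens` are hypothetical and exist only for this illustration.

```python
# Assumed threshold, taken from the discussion in this thread; not a
# confirmed Annotator setting.
MIN_TOKEN_LENGTH = 3


def indexable_tokens(label, min_length=MIN_TOKEN_LENGTH):
    """Return the whitespace-separated tokens of `label` that are at
    least `min_length` characters long."""
    return [tok for tok in label.split() if len(tok) >= min_length]


# Labels from the question: the first three keep one token each,
# the last two keep none under the assumed rule.
for label in ["6-10", "11-15", "1-5", "6", "> 15"]:
    print(repr(label), "->", indexable_tokens(label))
```

Under these assumptions, a label that keeps no tokens would have nothing to match on, which would be consistent with '6' and '> 15' not being annotated.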

I can't speak to the details of the second set of strings. We'll see if anyone else on the team or on this list can speak to them.

You may wish to read the top publications on the Annotator, so that you can understand its detailed operations more thoroughly. I found a useful list at https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=BioPortal+annotator (most of these are openly available). In particular, the Annotator+ documented in https://academic.oup.com/bioinformatics/article/34/11/1962/4802221?login=true may have additional features that are useful to you.

I'm very sorry that I can't give you an in-depth and precise response right away; the tools were developed before my time. But I have contacted a primary author of the annotation software to see if he can add to this response.

John

On Mar 17, 2021, at 6:42 PM, Zhou, Shuxin <sz23 at njit.edu> wrote:

Hi Mr. Graybeal,

Thank you for the quick response! I do appreciate it very much!
In addition to that issue, we tried some other concepts that are also included in the RCIT_A1 ontology.
The problems are:
(1). For the classes ending with "L/min", only 3 out of 6 are annotated. Why can half of them be annotated while the rest cannot?
(2). Concepts like "3 days ago", "3 days later", and "three days ago" (a synonym of "3 days ago") are not recognized; instead, they are wrongly annotated with another class, "3 days".
      Since they are at least 3 characters long, they should satisfy the indexing rule you mentioned.

I apologize for disturbing you again, but our team needs to know the reason for these annotation results so that we can continue our research on that basis.
Again, thank you very much for the help!
<Screen Shot 2021-03-17 at 9.14.18 PM.png>
<Screen Shot 2021-03-17 at 9.18.30 PM.png>


On Wed, Mar 17, 2021 at 8:25 PM John Graybeal <jgraybeal at stanford.edu> wrote:
Hello Erica,

I am not 100% sure of the source of the problem, but I have written up the ticket https://github.com/ncbo/bioportal-project/issues/206 about it.

As the ticket suggests, I suspect this is a function of BioPortal not indexing words shorter than 3 characters, including numbers. (If this is true, then Segment 6 works only by accident: it's the first term beginning with 'segment' *and* it matches.) BioPortal won't index those shorter words because they are so likely to be matched elsewhere. Still, the full label should certainly be indexed; that's something we could verify if we had resources.
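To make the suspicion above concrete, here is a minimal sketch of what could happen if an index dropped every token shorter than 3 characters. This is an illustration of the hypothesis only, not BioPortal's actual indexing code; `index_key` and `build_index` are hypothetical helpers, and the lowercasing and whitespace splitting are assumptions.

```python
from collections import defaultdict


def index_key(label, min_length=3):
    """Reduce a label to the lowercase tokens that meet the assumed
    minimum length; shorter tokens such as '7' or '4A' are dropped."""
    return tuple(tok.lower() for tok in label.split() if len(tok) >= min_length)


def build_index(labels):
    """Group labels by their reduced key, preserving insertion order."""
    index = defaultdict(list)
    for label in labels:
        index[index_key(label)].append(label)
    return dict(index)


# Class labels mentioned in this thread.
labels = ["Segment", "Segment 6", "Segment 7", "Segment 4A"]
index = build_index(labels)
print(index)
```

Under this assumption all four labels share the single key `('segment',)`, so the index alone could not tell "Segment 7" from "Segment 4A". Which class a match then resolves to would depend on details this sketch does not model.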

Unfortunately we are unlikely to be able to fix it right now, as we have extraordinarily limited staff time. I am sorry I can't provide a more helpful answer at this point.

John

On Mar 16, 2021, at 5:00 PM, Zhou, Shuxin <sz23 at njit.edu> wrote:

Dear BioPortal Technical Team,
This is Erica Zhou, and I ran into some problems using the BioPortal annotator page.
I uploaded a private ontology, RCIT_A1, a few days ago, and some concepts (classes) were not successfully matched when I tried to annotate certain text.

For example, I have a text containing "segment 7", "segment 6", and "segment 4A", and they are all included in RCIT_A1.
<Screen Shot 2021-03-16 at 7.52.28 PM.png>

However, only "segment 6" is recognized, as shown below; "segment 7" and "segment 4A" are not correctly annotated and are instead partially annotated as "Segment".
<Screen Shot 2021-03-16 at 7.55.42 PM.png>

So, could you help me fix this issue? I would appreciate it very much!

Best regards
Shuxin (Erica) Zhou

--
Shuxin Zhou
PhD Student of SABOC Lab

sz23 at njit.edu
_______________________________________________
bioontology-support mailing list
bioontology-support at lists.stanford.edu
https://mailman.stanford.edu/mailman/listinfo/bioontology-support

========================
John Graybeal
Technical Program Manager
Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
Stanford Center for Biomedical Informatics Research
650-736-1632  | ORCID  0000-0001-6875-5360






