[bioontology-support] [BioPortal] Feedback from Jason Alan Fries

Jennifer Leigh Vendetti vendetti at stanford.edu
Mon Oct 17 15:42:50 PDT 2016


Hi Jason,

On Oct 7, 2016, at 1:53 PM, support at bioontology.org wrote:

Name: Jason Alan Fries

Email: jason-fries at cs.stanford.edu

Location: http://bioportal.bioontology.org/


Feedback:

Quick question about the statistics box on the front page -- what exactly do some of the rows correspond to? How do I find out the difference between "Direct Plus Expanded Annotations" and "Direct Annotations"? I would love to cite BioPortal for some current work, but I can't figure out the correct interpretation of some of the statistics.



The first two rows are the number of ontologies in BioPortal and the total number of classes across all ontologies.

The last 4 rows in the statistics box refer to BioPortal’s Resource Index [1], which essentially indexes biomedical data by ontology concepts. Here’s a brief explanation of each:

Resources Indexed: this is the total number of publicly available biomedical resources from which we’ve processed textual metadata, e.g., PubMed, ArrayExpress, etc. A full listing of the available resources is on the BioPortal Resource Index page [1]. The code used to access and process data from these resources is open source and available in our GitHub repository [2].

Indexed Records: this is the total number of elements from publicly available biomedical resources that were indexed.  You can see the number of elements each individual resource contains in the “Total Records” column on the BioPortal Resource Index page [1].

Direct Annotations: this is the total number of annotations generated by using the concept recognition tool mgrep (developed by Univ. of Michigan) to tag resource elements with terms from a dictionary.  The dictionary is constructed from all of the concept names and synonyms in ontologies across BioPortal.  This is essentially the same functionality provided by BioPortal’s Annotator [3].  You might be interested in navigating to the Annotator page and playing around with the sample text to get a sense for what the annotator does.
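To give a rough sense of what dictionary-based tagging means, here is a minimal sketch in Python. It is not mgrep and not the Annotator's code; the dictionary entries and concept IDs are made up for illustration, and a real dictionary would be built from the class names and synonyms of the ontologies.

```python
import re

# Hypothetical toy dictionary: term (name or synonym) -> concept ID.
# The IDs are invented placeholders, not real ontology identifiers.
DICTIONARY = {
    "pheochromocytoma": "EX:0001",
    "melanoma": "EX:0002",
    "skin cancer": "EX:0003",
}


def direct_annotations(text, dictionary=DICTIONARY):
    """Return (concept_id, start, end) for each dictionary term found in text."""
    annotations = []
    lowered = text.lower()  # case-insensitive matching
    for term, concept_id in dictionary.items():
        # Match the term only on word boundaries.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            annotations.append((concept_id, match.start(), match.end()))
    # Order annotations by their position in the text.
    return sorted(annotations, key=lambda a: a[1])


if __name__ == "__main__":
    for annotation in direct_annotations("Melanoma is a skin cancer."):
        print(annotation)
```

Each tuple that comes back would count as one direct annotation of the resource element the text came from.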

Direct Plus Expanded Annotations: to explain this one, I’m pasting some text from a paper by some of the folks originally involved in the Resource Index design:

"The system also uses relations provided at the ontology level to expand the annotations. This is the first step of the semantic expansion. For example, using the is_a ontology relation, for each annotation, we create additional transitive closure annotations according to the parent–child relationships subsumed by the original concept. For instance, if a resource element such as a GEO protein expression study is annotated with a concept from the ontology National Cancer Institute Thesaurus (NCIT), e.g., pheochromocytoma, then a researcher can query for retroperitoneal neoplasms and find data sets related to pheochromocytoma. The NCIT provides the knowledge that pheochromocytoma is_a retroperitoneal neoplasms. This first step is done offline because, processing the transitive closure is very time consuming – even if we use a pre-computed hierarchy – and will result in prolonged response times for the users. This use case is similar, in principle, to query expansion done by search engine like Entrez; however, Entrez does not use ontologies, therefore, there exists pheochromocytoma related GEO data sets, but none show up on searching for retroperitoneal neoplasms in Entrez. In our system, however, a researcher could search for retroperitoneal neoplasms and find the relevant samples…"

The paper is available online if you’re interested in reading more [4].
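The is_a expansion described in that passage can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Resource Index implementation: the record ID is hypothetical, and of the two is_a edges only the first comes from the quoted text (the second is invented to show transitivity).

```python
# Hypothetical is_a edges: child concept -> list of direct parents.
# Only the first edge is taken from the quoted passage; the second is
# an assumption added to show a multi-step closure.
IS_A = {
    "pheochromocytoma": ["retroperitoneal neoplasm"],
    "retroperitoneal neoplasm": ["neoplasm"],
}


def ancestors(concept, is_a=IS_A):
    """Return every concept reachable from `concept` by following is_a edges."""
    seen = set()
    stack = list(is_a.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def expand(direct, is_a=IS_A):
    """Given direct (record_id, concept) pairs, add one pair per ancestor."""
    expanded = set(direct)
    for record_id, concept in direct:
        for ancestor in ancestors(concept, is_a):
            expanded.add((record_id, ancestor))
    return expanded


if __name__ == "__main__":
    direct = {("record-1", "pheochromocytoma")}  # hypothetical record ID
    for pair in sorted(expand(direct)):
        print(pair)
```

In this sketch, the size of `direct` plays the role of the "Direct Annotations" count and the size of the set returned by `expand` plays the role of "Direct Plus Expanded Annotations".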



Is there a summary page explaining how these are computed? Could you provide such information to me? Many thanks!




I’m not aware of a summary page, either on the BioPortal website or on the NCBO wiki [5]. If you need to look at how these are computed at the source code level, let us know and we can point you to the relevant code.


Kind regards,
Jennifer

[1] http://bioportal.bioontology.org/resource_index
[2] https://github.com/ncbo/resource_access_tools
[3] http://bioportal.bioontology.org/annotator
[4] http://www.lirmm.fr/~jonquet/publications/documents/Article-DILS08_Jonquet_Musen_Shah_published.pdf
[5] https://www.bioontology.org/wiki/index.php/Main_Page
