[bioontology-support] Ontology not in the ranked list?
john.zobolas at ntnu.no
Wed Mar 6 16:19:45 PST 2019
Hi Michael, and thanks for the answer, that clarifies it.
Something else: in the answers you gave me last time there was an example (one of many, I am sure) of requesting a specific @id for a class that appears in multiple ontologies. Is there a way to find the 'best ID' for such a class, for instance by reverse engineering the ontology of origin (`OntoOrigin`) from the URI and adding `ontologies=OntoOrigin` to the search query? I don't know if this can always be done, but the aim is to get back a single result for that id if possible. If you have another way of doing something similar, please share it with me :)
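To make the idea concrete, here is a rough sketch of what I have in mind. It guesses the ontology from two URI layouts only (OBO-style PURLs and purl.bioontology.org PURLs), and everything else in it is an assumption: the function names are made up, passing the URI as `q` is just a guess at how to look up an @id, the guessed prefix may not match the BioPortal acronym exactly, and the API key is left out.

```python
from urllib.parse import urlencode, urlparse


def guess_ontology_acronym(class_uri):
    """Heuristically guess the ontology of origin from a class URI.

    Only two URI layouts are handled; anything else returns None.
    """
    parsed = urlparse(class_uri)
    segments = [s for s in parsed.path.split("/") if s]

    # OBO-style PURL, e.g. http://purl.obolibrary.org/obo/GO_0008150
    if parsed.netloc == "purl.obolibrary.org" and len(segments) == 2 and segments[0] == "obo":
        prefix, sep, _local_id = segments[1].rpartition("_")
        return prefix if sep else None

    # BioPortal-style PURL, e.g. http://purl.bioontology.org/ontology/MESH/D011506
    if parsed.netloc == "purl.bioontology.org" and len(segments) >= 3 and segments[0] == "ontology":
        return segments[1]

    return None


def build_search_url(class_uri, base="http://data.bioontology.org/search"):
    """Build a search URL, restricted to the guessed ontology when there is one."""
    params = {"q": class_uri}  # assumption: the URI itself is used as the query
    acronym = guess_ontology_acronym(class_uri)
    if acronym:
        params["ontologies"] = acronym
    return base + "?" + urlencode(params)
```

When no ontology can be guessed, the URL is built without the `ontologies` restriction, so the search would still return matches from every ontology.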
From: Michael Dorf <mdorf at stanford.edu>
Sent: Thursday, March 7, 2019 12:58 AM
To: John Zobolas
Cc: support at bioontology.org
Subject: Re: [bioontology-support] Ontology not in the ranked list?
The ontology rank list isn't static. The Gist below simply contains a snapshot of the list at the time of the original writing. The ontology rankings get re-calculated internally on a weekly basis. I've just updated the Gist with the current snapshot of the list. It now includes the CHEAR ontology.
In terms of the results ordering, it only makes sense in the context of a search query. For example:
In this case, when there are two results with an identical prefLabel (protein), whose search ranking scores would obviously be identical, the ontology ranking ordering would take effect, making CHEAR "protein" results appear before the MCCL results.
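As a rough illustration of that tie-breaking (a sketch only, not our actual implementation), the ordering behaves like a two-level sort: search score first, ontology rank second. The field names, the scores, and the CHEAR rank value below are made up for the example; only the MCCL value of 0.409 comes from the ranking list quoted in this thread.

```python
def order_results(results, ontology_rank):
    """Sort by search score (descending), then by ontology rank (descending)."""
    return sorted(
        results,
        key=lambda r: (-r["score"], -ontology_rank.get(r["ontology"], 0.0)),
    )


# Hypothetical rank for CHEAR; MCCL's 0.409 is the value quoted in this thread.
ontology_rank = {"CHEAR": 0.5, "MCCL": 0.409}

# Hypothetical search hits: two exact "protein" matches tie on score.
results = [
    {"ontology": "MCCL", "prefLabel": "protein", "score": 1.0},
    {"ontology": "CHEAR", "prefLabel": "protein", "score": 1.0},
    {"ontology": "MCCL", "prefLabel": "protein binding", "score": 0.7},
]

ordered = order_results(results, ontology_rank)
```

The two "protein" hits tie on score, so the ontology rank puts CHEAR ahead of MCCL, while the lower-scoring "protein binding" hit stays last regardless of its ontology.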
What you are executing is a "queryless" search, where all results are ranked equally by the search engine. The initial ordering of the documents is random (or at least at the mercy of the search engine - Solr in our case). Since we don't return all 55,201 documents that matched your search but only the top 50, the ontology ranking ordering happens only for those top 50 (random) documents, which, as in the case with the search engine rankings, makes little sense. The bottom line is if you are using the search API to retrieve ALL documents, don't expect any meaningful ordering to be in place.
Hope this clarifies it.
On Mar 6, 2019, at 5:38 AM, John Zobolas <john.zobolas at ntnu.no> wrote:
I remember Michael mentioned to me that there is this list: https://gist.github.com/mdorf/cea96433cf4bf7dd94d109c8e06e29c0 of BioPortal's ranked ontologies.
So, when I execute this kind of query (to get the classes from all 3 specified ontologies, iterating through the pages): http://data.bioontology.org/search?ontologies=RH-MESH,MCCL,CHEAR&ontology_types=ONTOLOGY&display_context=false, I expect the returned results to be in the order of the ontologies as in the file, if I understood correctly.
* The CHEAR ontology is not in the file (why?) and its classes appear at the end
* RH-MESH is ranked higher than MCCL (score 0.561 vs 0.409), but the first results come from the MCCL ontology (why?)