[bioontology-support] [BioPortal] Feedback from Andrey Fedorov

David Clunie dclunie at dclunie.com
Tue Feb 13 05:08:28 PST 2018


Hi Jennifer

Thanks for explaining why the DCM "ontology" loads more
slowly initially. In this day and age, loading a flat
list of just under 4,000 entries shouldn't take noticeable
time, IMHO, but I understand that your implementation may not
be optimized for this pattern, and 7 seconds is not that
bad for the first response if it is cached afterwards.

However, through the web browser, if I "jump to" a term
after having loaded the top level, say "Segmentation", it
takes quite a few seconds to actually load the class and
quite a few seconds more to "jump to" it in the list.
Again I understand that this may be lack of optimization
for the "large flat list" case (choice of data structure,
size of nodes, indexing/insertion mechanism, re-sending the
list for the user interface frame rather than moving around
in what was already sent if content unchanged, etc.).

On the other hand, if I search for a term across all ontologies,
e.g., "Segmentation", it is pretty fast and finds the DCM class(es).
I assume this is because a different index is used.

One of these days we may make the DCM ontology into
a proper ontology, but mostly it is used as a grab bag
of independent concepts that are used when we can't find
concepts in a "real" ontology like SNOMED, LOINC, FMA or
NCIt, so it doesn't have a meaningful class "hierarchy" (yet),
and building one hasn't been a priority for us (in DICOM). I
have considered various mechanical ways to do this (e.g., using
the context group (value set) labels as pseudo-classes to
produce at least a two level hierarchy), but I haven't got
very far with that yet.
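
As a rough illustration of that mechanical approach (a sketch only: the
group labels and codes below are made-up placeholders, and
build_two_level_hierarchy is a hypothetical helper, not part of any
DICOM tooling), context group labels could serve as pseudo-classes
like this:

```python
from collections import defaultdict


def build_two_level_hierarchy(memberships):
    """Group concept codes under pseudo-classes named after context groups.

    memberships: iterable of (context_group_label, concept_code) pairs.
    A concept that appears in several context groups gets several parents.
    """
    hierarchy = defaultdict(list)
    for group_label, code in memberships:
        # Skip repeated (group, code) pairs so each child is listed once.
        if code not in hierarchy[group_label]:
            hierarchy[group_label].append(code)
    return dict(hierarchy)


# Hypothetical sample rows; labels and codes are placeholders, not real DICOM content.
sample = [
    ("Group A", "0001"),
    ("Group A", "0002"),
    ("Group B", "0002"),
    ("Group A", "0001"),
]
hierarchy = build_two_level_hierarchy(sample)
```

Concepts that belong to no context group would still need a catch-all
parent, which this sketch does not attempt.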

I haven't used the REST API myself.

The only other data point for speed is the RSNA's RadLex
term browser (http://www.radlex.org/), which is sometimes glacial,
and they attribute that to poor performance of BioPortal,
but I have no direct evidence of that myself, and the alleged
slowness does not manifest when using the BioPortal web
browser directly.

E.g., search for "spiculation" from the top level in the RadLex
ontology in BioPortal and compare that with doing the same at
www.radlex.org ... I just tried it again and the latter just
says "Loading" forever, which is not uncommon.

David

On 2/12/18 6:33 PM, Jennifer Leigh Vendetti wrote:
> Hello Andrey,
> 
> It’s unclear to me from your initial post if you’re specifically referring just to the DCM ontology. I downloaded / examined this ontology and it’s worth noting that it’s an unusual one due to the almost complete lack of any hierarchical structure in the class tree. There are 3,866 classes with only one SubClassOf axiom. In other words, this ontology appears to be a flat list of classes, with only one class appearing as a subclass of another. 
> 
> It’s not particularly surprising that the initial page load time for this ontology is somewhat slow. In order to construct a class tree, BioPortal has to retrieve class data for 3,865 root level classes. However, after the initial page load, the class data is cached and subsequent navigation to other classes in the tree is fast. I clicked on several other classes in the tree and the Chrome developer console shows load times less than a second.
> 
> I compared the load times of the underlying REST call for constructing the initial class tree between DCM and a much larger ontology - SNOMEDCT at 327K classes. Chrome developer console shows a load time for SNOMEDCT of 456ms compared to 7.5s for DCM. This is again due to the lack of any hierarchy / very large number of root level classes in DCM.
> 
> If, on the other hand, you’re referring to programmatic access to class data, the more expedient method is via the REST API. Issuing a REST call for the class information you listed below:
> 
> http://data.bioontology.org/ontologies/DCM/classes/http%3A%2F%2Fdicom.nema.org%2Fresources%2Fontology%2FDCM%2F113206?apikey=32688b66-537e-45b7-badc-ce525bebca4d
> 
> ...completes in roughly 743ms on the first uncached attempt, then roughly 75ms on subsequent cached retrievals. You can also retrieve all classes for DCM with the classes endpoint, which implements paging:
> 
> http://data.bioontology.org/ontologies/DCM/classes
> 
> Kind regards,
> Jennifer
> 
> 
> 
>> On Feb 12, 2018, at 1:32 PM, Andrey Fedorov <andrey.fedorov at gmail.com> wrote:
>>
>> John, thank you for the reply. I will see if I can do more
>> comprehensive profiling.
>>
>> I am also cc'ing David Clunie, who has repeatedly mentioned how slow
>> BioPortal is. Perhaps he can chime in.
>>
>> On Sun, Feb 11, 2018 at 5:53 PM, John Graybeal <jgraybeal at stanford.edu> wrote:
>>> Andrey,
>>>
>>> To answer this question in a useful way (not just "yes"), we'd need to hear
>>> more details about your use case, including the exact queries you're trying
>>> to answer, and when and for how long it has seemed slow.
>>>
>>> Many of the queries can be expected to take seconds to tens of seconds,
>>> because they have to perform a lot of computations and/or transmit a lot of
>>> data. (When I look at the class list in DICOM, it seems to take a while, but
>>> it's sending information about more than 3000 terms over my slow DSL...)
>>>
>>> So if you can post some example queries (or indicate what web page you're
>>> trying to access), and tell us your expectations in terms of response time,
>>> we can tell whether we need to troubleshoot something, and if so, what.
>>> Thanks!
>>>
>>> John
>>>
>>> On Feb 10, 2018, at 9:43 AM, support at bioontology.org wrote:
>>>
>>> Name: Andrey Fedorov
>>>
>>> Email: andrey.fedorov at gmail.com
>>>
>>> Location:
>>> http%3A%2F%2Fbioportal.bioontology.org%2Fontologies%2FDCM%3Fp%3Dclasses%26conceptid%3Dhttp%253A%252F%252Fdicom.nema.org%252Fresources%252Fontology%252FDCM%252F113206
>>>
>>>
>>> Feedback:
>>>
>>> Can anything be done to make BioPortal queries faster? It literally takes
>>> seconds to get a response.
>>>
>>>
>>> _______________________________________________
>>> bioontology-support mailing list
>>> bioontology-support at lists.stanford.edu
>>> https://mailman.stanford.edu/mailman/listinfo/bioontology-support
>>>
>>>
>>> ========================
>>> John Graybeal
>>> Technical Program Manager
>>> Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
>>> Stanford Center for Biomedical Informatics Research
>>> 650-736-1632
>>>
>>>
> 
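
The single-class REST call quoted in Jennifer's message places the full
class IRI, percent-encoded, in the URL path. A minimal sketch of building
such a URL follows. The host, path layout, and apikey parameter are taken
from the quoted URLs; class_url is a hypothetical helper name, and the key
is a placeholder. No request is sent.

```python
from urllib.parse import quote, urlencode

API_BASE = "http://data.bioontology.org"  # host taken from the URLs quoted in the thread


def class_url(ontology: str, class_iri: str, api_key: str) -> str:
    """Build the REST URL for one class, percent-encoding the whole IRI."""
    # safe="" forces "/" and ":" to be encoded too, so the IRI stays a single path segment.
    encoded_iri = quote(class_iri, safe="")
    query = urlencode({"apikey": api_key})
    return f"{API_BASE}/ontologies/{ontology}/classes/{encoded_iri}?{query}"


url = class_url(
    "DCM",
    "http://dicom.nema.org/resources/ontology/DCM/113206",
    "YOUR-API-KEY",  # placeholder; substitute your own key
)
print(url)
```

Passing safe="" matters here: the default for quote leaves "/" unencoded,
which would split the IRI across several path segments.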


