[bioontology-support] search by id in a subtree -> sparql service in VA similar to BioPortal

John Graybeal jgraybeal at stanford.edu
Thu Feb 15 18:29:33 PST 2018


Hi Sina,

I'm replying to give you a heads-up: I'm afraid we will not be able to get to this problem very quickly.

At the moment, we have a number of pretty critical and time-consuming tasks we are trying to complete. UMLS is high on our list, but not high enough to happen in the next few weeks at least. And it is not clear how long it may take for us to succeed in tracking down that problem.

That said, here's one observation. I notice that the first thing in your log says:
I, [2018-01-28T11:19:17.338663 #3910]  INFO -- : ["Using UMLS turtle file found, skipping OWLAPI parse"]
E, [2018-01-28T11:19:17.338923 #3910] ERROR -- : ["ArgumentError: invalid byte sequence in UTF-8
Invalid byte sequence in UTF-8 strongly suggests that there is a mis-coded character or an unsupported character, or that binary data contaminated the file. I don't know the best way to check this, but I use BBEdit for text files and it gives pretty good warnings for bad characters. I expect there are web services that can test a file as well, though maybe not for really big files.
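For a file too large for an editor, one way to run this check is a short script. The following is a minimal sketch, not part of umls2rdf or the appliance: the function name find_invalid_utf8 and the max_reports parameter are made up for illustration. It reads the file in binary mode one line at a time, so memory use stays low, and reports where decoding as UTF-8 fails.

```python
def find_invalid_utf8(path, max_reports=20):
    """Return (line_number, byte_offset_in_line, offending_bytes) tuples
    for lines in the file that are not valid UTF-8.

    Hypothetical helper for illustration; it only reads the file.
    """
    problems = []
    # Binary mode: Python performs no decoding until we ask for it.
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError as err:
                # err.start and err.end delimit the bytes that failed to decode.
                problems.append((line_number, err.start, raw[err.start:err.end]))
                if len(problems) >= max_reports:
                    break
    return problems
```

Calling find_invalid_utf8 on the generated ttl file returns an empty list if every line decodes cleanly; otherwise each entry points at a line to inspect in the file (and possibly in the source UMLS data).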

Perhaps if anyone else on the list has successfully parsed the latest(?) SNOMED using umls2rdf they can let us know.

Sorry we can't be more help sooner.

John



On Feb 2, 2018, at 5:25 AM, Madani, Sina <Sina.Madani at vumc.org> wrote:

Thank you, John, for the response.
I will contact AgroPortal regarding the SPARQL endpoint. For my problem with loading SNOMED, I am using the same process (umls2rdf) that I use to generate other UMLS terminologies (like RxNorm, LOINC, ICDx, etc.), per the BioPortal instructions. The default output of the umls2rdf script is ttl (load on code). The ICD9/10, RxNorm, and LOINC ttl files were successfully loaded and parsed in BioPortal, though.

Thanks again for looking into this

Sina

From: John Graybeal <jgraybeal at stanford.edu>
Date: Friday, February 2, 2018 at 2:28 AM
To: "Madani, Sina" <Sina.Madani at vumc.org>
Cc: support at bioontology.org
Subject: Re: [bioontology-support] search by id in a subtree -> sparql service in VA similar to BioPortal

Hi Sina,

Yes, the 4store backend essentially is a SPARQL endpoint, used by the rest of the Virtual Appliance. But it could also be configured to accept queries from other clients. I suggest you contact the AgroPortal folks to learn about their process and experience.

I think we haven't managed to answer your question, though I may have missed it. (We are a little time-constrained this week, sorry!)

But at a quick guess (only partially informed, apologies in advance if I get something wrong here), I would not expect loading the TTL file to work that way, and I would not expect the size to be an issue, since we have loaded many large ontologies without any size-related problems. Naively perhaps, I would consider converting your ontologies of interest to OWL and loading them through the normal process, unless you want to execute the whole UMLS load software (not recommended, very complex!).

I think we'll be able to come back to you with more thoughts soon, hopefully by the weekend.

John


On Feb 1, 2018, at 11:51 AM, Madani, Sina <Sina.Madani at vumc.org> wrote:

Hi John,

Thank you for getting back to me. Yes, I meant functionality similar to the SPARQL endpoint at the Stanford instance.
I understand that many (if not all) of the queries can perhaps be done via the APIs, but I was curious to see whether the appliance can be configured as a SPARQL endpoint for incoming SPARQL queries, for example for extracting and validating the mappings that the appliance generates automatically, or for regular queries.
At this point, we are evaluating your appliance for search/browse and visualization purposes for our order sets catalogue, with mappings to standard terminologies.

Thanks!
Sina

From: John Graybeal [mailto:jgraybeal at stanford.edu]
Sent: Wednesday, January 31, 2018 6:50 PM
To: Michael Dorf
Cc: support at bioontology.org; Madani, Sina
Subject: Re: [bioontology-support] search by id in a subtree -> sparql service in VA similar to BioPortal

Sina,

(Changing the topic line to focus on the SPARQL question that I'm addressing here.)

The exact answer to your last question may depend on what you mean by "similar to the BioPortal web site".

On BioPortal we have the 'front-end beta SPARQL query UI' set up at http://sparql.bioontology.org. There are a few complexities before the queries get to the backend service, which is a separate 4store service from the one used to serve BioPortal itself. (Which is why the backend data is not the same.) We could tell you about all those details and provide code to make it all work, but you may not need things to be configured that way on your system.

It is certainly possible to configure the primary backend store (4store) that your Virtual Appliance uses, so that it can accept queries from anywhere. If you are running a public service, you might not want to do that, because it is difficult to protect your back end from queries that can take the system down. (That's why we don't do it that way on BioPortal.)
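Once a backend accepts outside queries, a client request is just a standard SPARQL Protocol GET with the query text in a "query" parameter. The following is a minimal sketch of building such a request, not the appliance's own configuration or code: the endpoint URL http://localhost:8080/sparql/ is an assumption (the host, port, and path of a local 4store instance may differ), and build_sparql_request is a made-up helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint of a local 4store instance; adjust host, port, and path.
ENDPOINT = "http://localhost:8080/sparql/"


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request (hypothetical helper).

    The query text is URL-encoded into the standard 'query' parameter,
    and JSON results are requested through the Accept header.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


# A deliberately small query: the LIMIT keeps the load on the store low.
request = build_sparql_request(
    ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5"
)
```

Passing the request to urllib.request.urlopen would send it. The explicit LIMIT is a reminder of the concern above: an open endpoint also accepts queries without one.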

I think that Clement Jonquet may have found a way to do it that doesn't have many problems, for his AgroPortal installation. He reads this list and will likely weigh in, but if not we can make sure he gets this question too.

I know the technical folks are thinking about your other questions; I'm not going to try to guess at those answers!

John




On Jan 31, 2018, at 3:05 PM, Michael Dorf <mdorf at stanford.edu> wrote:

Hi,

I am using a local version of Appliance 2.5 RC3 as well as the UMLS2RDF script against the UMLS2017AB database to generate and load the SNOMED.ttl file (1.29 GB).
However, a few seconds after submitting a new ontology, I get this error message in the web UI: “something went wrong”. Also, the http://ontotoportal.admin report shows “ontology has no submission” under the issues section. No directory was created for SNOMED under the /srv/ncbo/repository path. Manually creating a “1” submission directory and copying the SNOMED ttl file into it doesn’t seem to have any effect, even with manual parsing per the instructions. Is it possible to manually load large ttl files and create submissions?
Scheduler.log and appliance.log don’t show any errors either.

I also retried the submission with a compressed tar file (72 MB). This time, http://ontotoportal.admin showed ERROR_RDF in the Error Status field. Also, on the admin page, the log link under the URL field shows the messages below. Manually unzipping the file and/or reprocessing it doesn’t seem to have any effect either.
Is there a workaround for loading/parsing SNOMED (or similar large ontologies) into OntoPortal? Also, is it possible to access 4store directly in our local instance through a SPARQL endpoint, similar to the BioPortal website?

Thanks!

Sina

I, [2018-01-28T11:19:17.181176 #3910]  INFO -- : ["Starting to process http://data.bioontology.org/ontologies/SNOMED/submissions/1"]
I, [2018-01-28T11:19:17.219473 #3910]  INFO -- : ["Starting to process SNOMED/submissions/1"]
I, [2018-01-28T11:19:17.338663 #3910]  INFO -- : ["Using UMLS turtle file found, skipping OWLAPI parse"]
E, [2018-01-28T11:19:17.338923 #3910] ERROR -- : ["ArgumentError: invalid byte sequence in UTF-8\n/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/bundler/gems/ontologies_linked_data-ff5920539091/lib/ontologies_linked_data/models/ontology_submission.rb:398:in `block in generate_umls_metrics_file'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/bundler/gems/ontologies_linked_data-ff5920539091/lib/ontologies_linked_data/models/ontology_submission.rb:397:in `foreach'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/bundler/gems/ontologies_linked_data-ff5920539091/lib/ontologies_linked_data/models/ontology_submission.rb:397:in `generate_umls_metrics_file'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/bundler/gems/ontologies_linked_data-ff5920539091/lib/ontologies_linked_data/models/ontology_submission.rb:414:in `generate_rdf'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/bundler/gems/ontologies_linked_data-ff5920539091/lib/ontologies_linked_data/models/ontology_submission.rb:903:in `process_submission'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:177:in `process_submission'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:47:in `block in process_queue_submissions'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `each'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/ontology_submission_parser.rb:41:in `process_queue_submissions'\n\t/srv/ncbo/ncbo_cron/bin/ncbo_cron:228:in `block (3 levels) in <main>'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:65:in `block (3 levels) in scheduled_locking_job'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `fork'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:51:in `block (2 levels) in scheduled_locking_job'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:43:in `lock'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/gems/mlanett-redis-lock-0.2.7/lib/redis-lock.rb:234:in `lock'\n\t/srv/ncbo/ncbo_cron/lib/ncbo_cron/scheduler.rb:50:in `block in scheduled_locking_job'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:230:in `trigger_block'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/jobs.rb:204:in `block in trigger'\n\t/srv/ncbo/ncbo_cron/vendor/bundle/ruby/2.3.0/gems/rufus-scheduler-2.0.24/lib/rufus/sc/scheduler.rb:430:in `block in trigger_job'"]



========================
John Graybeal
Technical Program Manager
Center for Expanded Data Annotation and Retrieval /+/ NCBO BioPortal
Stanford Center for Biomedical Informatics Research
650-736-1632