[bioontology-support] problem doing large queries

Ray Fergerson ray.fergerson at stanford.edu
Thu Oct 24 14:54:43 PDT 2013


Xin,

 

I am a bit unclear about what you are doing. Are you (#1) making 55,000
separate calls to the annotator in a big loop, with a short piece of text
in each? If so, then the system should work: this is not one long query but
55,000 short ones. If instead you are (#2) concatenating 55,000
rows from a spreadsheet and sending them in one call to the annotator, then
you will probably have trouble and run into timeouts.
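
For illustration, scenario #1 might look like the following minimal sketch. It is written in Python rather than the Perl that the error traces suggest, and `annotate` is a hypothetical callable standing in for whatever function sends one piece of text to the annotator; it is not a real client API. The point is only the shape: one short call per row, with failing rows recorded instead of silently skipped.

```python
from typing import Callable, Iterable, List, Tuple


def annotate_rows(
    rows: Iterable[str],
    annotate: Callable[[str], object],  # hypothetical: sends ONE text to the annotator
) -> Tuple[List[Tuple[int, object]], List[Tuple[int, str, str]]]:
    """Scenario #1: one short annotator call per row, never one concatenated call."""
    results = []   # (row index, annotator response)
    failures = []  # (row index, original text, error description)
    for index, text in enumerate(rows):
        try:
            results.append((index, annotate(text)))
        except Exception as exc:
            # Keep the failing text so it can be inspected or retried later.
            failures.append((index, text, repr(exc)))
    return results, failures
```

Keeping the failed rows together with their text makes it possible to look for a pattern afterwards, or to resubmit only those rows.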

 

What might be happening (in scenario #1 above) is that occasionally the
text you pass to the annotator is long and you hit a timeout. The easy way for
you to test this would be to examine the calls that fail and see if there
is a consistent pattern (length or words) in the text that fails. We have
occasionally run across text containing particular character sequences
that cause a failure. We are not aware of any text at the moment that
does so, but that doesn't mean that there isn't any.
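
One way to look for such a pattern is sketched below, assuming the texts of successful and failed calls have been collected into two lists. The function name and the returned keys are illustrative only. It compares text lengths and lists the words that appear only in failing texts.

```python
from collections import Counter
from statistics import median


def summarize_failures(ok_texts, failed_texts, top=5):
    """Compare lengths and vocabulary of failed texts against successful ones."""
    ok_lengths = [len(t) for t in ok_texts]
    failed_lengths = [len(t) for t in failed_texts]

    # Words seen in at least one successful call are unlikely culprits.
    ok_words = {w for t in ok_texts for w in t.lower().split()}

    # Count, per failed text, each word that never appears in a successful text.
    suspicious = Counter(
        w
        for t in failed_texts
        for w in set(t.lower().split())
        if w not in ok_words
    )
    return {
        "median_ok_length": median(ok_lengths) if ok_lengths else None,
        "median_failed_length": median(failed_lengths) if failed_lengths else None,
        "max_ok_length": max(ok_lengths, default=None),
        "min_failed_length": min(failed_lengths, default=None),
        "words_only_in_failures": suspicious.most_common(top),
    }
```

If the shortest failing text is longer than the longest successful one, length (and hence a timeout) is the likely cause. If the lengths overlap, the words found only in failures are the first place to look for a problematic character sequence.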

 

Ray

 

From: bioontology-support-bounces at lists.stanford.edu
[mailto:bioontology-support-bounces at lists.stanford.edu] On Behalf Of Xin.H
Sent: Thursday, October 24, 2013 9:14 AM
To: support at bioontology.org
Subject: [bioontology-support] problem doing large queries

 

Hi,

 

I was using the Annotator to map some text to the Human Disease Ontology.
There are about 55,000 rows of text and I need to do it for every row, so it
takes a long time. During the query I noticed some errors that appear
occasionally and are hard to reproduce. The script is
still running, so I assume that it just ignores the problem rows?

 

Here are the different errors I got:

 

<h1>Internal Server Error</h1> at Thu Oct 24 00:04:27 2013

 

Status read failed: Connection reset by peer at
/usr/share/perl5/Net/HTTP/Methods.pm line 269.

 at Thu Oct 24 01:36:17 2013

 

Server closed connection without sending any data back at
/usr/share/perl5/Net/HTTP/Methods.pm line 345.

 at Thu Oct 24 13:14:25 2013

 

Can't connect to data.bioontology.org:80 (No route to host)

LWP::Protocol::http::Socket: connect: No route to host at
/usr/share/perl5/LWP/Protocol/http.pm line 51.

 at Thu Oct 24 16:17:54 2013

 

These seem to be connection problems, so I was wondering if they are caused
by losing the connection during the long-running query? I once used the ENSEMBL
API to do large queries, and it provides a method that allows reconnecting
after a timeout without breaking the script. I was wondering if the Annotator
provides a similar API? Or is there another way to do big queries?
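
Whatever the service itself offers, reconnect-style behaviour can be approximated on the client side by retrying a failed call after a pause. The sketch below is a generic retry with exponential backoff, not an Annotator feature; `call_with_retry`, its parameters, and the delay values are assumptions chosen for illustration. It catches `OSError`, which in Python covers connection resets, timeouts, and the errors raised by `urllib`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 5,          # assumed limit; tune for the workload
    base_delay: float = 1.0,    # seconds before the first retry (assumed)
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(); on a connection-type error wait and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except OSError:
            # ConnectionError, TimeoutError and urllib.error.URLError
            # are all subclasses of OSError.
            if attempt == attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")
```

A row's request would be wrapped as `call_with_retry(lambda: send_row(text))`, where `send_row` is again a hypothetical function that performs the single HTTP call. Errors that persist after every attempt still surface, so they can be logged with the offending text.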

 

Many thanks,

Xin
