[java-nlp-user] processing stops after 30-50 sentences

John Bauer horatio at gmail.com
Sun May 12 14:41:24 PDT 2013


I honestly don't know anything about the Python wrapper, so it's
difficult to suggest how to fix it.  However, I do know that reparsing
the CoreNLP XML would be a rather unpleasant chore.  What might be a
lot easier for you is to use the Java serialized objects.  CoreNLP can
write its output directly in serialized format if you use the option
-outputFormat serialized

The code is included in the sources jar, part of the zip file we distribute.
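
For reference, a batch run with that option could be launched from Python roughly as follows. This is only a sketch under assumptions: the jar names and annotator list are copied from the command quoted further down in this thread, the classpath is passed explicitly with -cp, and the helper names build_corenlp_command and run_corenlp are made up for illustration. It runs CoreNLP as a one-shot process on a file rather than typing text into the interactive shell.

```python
import os
import subprocess

# Jar names as quoted later in this thread; adjust them to your installation.
CORENLP_JARS = [
    "stanford-corenlp-1.3.5.jar",
    "stanford-corenlp-1.3.5-models.jar",
    "xom.jar",
    "joda-time.jar",
    "jollyday.jar",
]


def build_corenlp_command(input_file, jars=CORENLP_JARS,
                          output_format="serialized", heap="3g"):
    """Return the argument list for a batch (non-interactive) CoreNLP run.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # Join the jars with the platform's classpath separator (":" or ";").
    classpath = os.pathsep.join(jars)
    return [
        "java", "-Xmx" + heap,
        "-cp", classpath,
        "edu.stanford.nlp.pipeline.StanfordCoreNLP",
        "-annotators", "tokenize,ssplit,pos,lemma,ner,parse,dcoref",
        "-file", input_file,
        "-outputFormat", output_format,
    ]


def run_corenlp(input_file, **kwargs):
    """Run CoreNLP on one file; a non-zero exit raises CalledProcessError."""
    subprocess.run(build_corenlp_command(input_file, **kwargs), check=True)
```

Because the whole file is handed over with -file, nothing depends on how much text the interactive shell accepts, and check=True means a failed run raises an error instead of returning quietly.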

John

On Sun, May 12, 2013 at 1:18 PM, Johannes Castner <jac2130 at gmail.com> wrote:
> Dear John,
>
> Actually, I now know more about the problem: the Python wrapper drives the
> interactive shell of the Stanford tools, and there is a limit on how much
> text can be typed into that shell.  Is there a way to get rid of this
> limit?  I really need the output in the form of the Python objects that the
> wrapper creates, so it would help me a great deal if I could just use the
> wrapper and didn't have to parse the XML files.  Also, are the uncompiled
> Java files available somewhere so that I could look directly at the code?
>
>
> Johannes
>
>
> On Sat, May 11, 2013 at 1:02 AM, John Bauer <horatio at gmail.com> wrote:
>>
>> Heh, oops.
>>
>> I was just about to suggest trying it without the python wrapper,
>> since it seemed to work fine run directly from java.
>>
>> John
>>
>> On Fri, May 10, 2013 at 5:39 PM, Johannes Castner <jac2130 at gmail.com> wrote:
>> > I am really sorry for having bothered you at all about this problem; it
>> > turns out to be entirely on the python side of things (the cleanup function
>> > in Dustin Smith's python wrapper exits silently if it times out)!
>> >
>> > Johannes
>> >
>> >
>> > On Fri, May 10, 2013 at 7:06 PM, John Bauer <horatio at gmail.com> wrote:
>> >>
>> >> It is very strange that you are having problems.  I am confused by
>> >> that manner of specifying the classpath, which I wouldn't expect to
>> >> work, but I guess you would have noticed if it was not doing anything
>> >> at all.  However, when I ran corenlp on this file, it ran to
>> >> completion on the whole file and output the entire thing.
>> >>
>> >> I would note that our tokenizer doesn't know how to handle words
>> >> wrapping from one line to the next, which is causing quite a bit of
>> >> trouble for some of the tools.  I suppose that would be a useful
>> >> feature to add sometime in the future, but in the meantime you
>> >> probably want to remove that.
>> >>
>> >> If you're still having trouble, please let me know anything you can
>> >> about what it's printing out, where the output ends, etc.
>> >>
>> >> John
>> >>
>> >> On Fri, May 10, 2013 at 1:10 PM, Johannes Castner <jac2130 at gmail.com> wrote:
>> >> > Thank you very much! The input file is attached and the command is as
>> >> > follows:
>> >> >
>> >> > java -Xmx3g stanford-corenlp-1.3.5.jar
>> >> > stanford-corenlp-1.3.5-models.jar
>> >> > xom.jar joda-time.jar jollyday.jar
>> >> > edu.stanford.nlp.pipeline.StanfordCoreNLP
>> >> > -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file
>> >> > ~/Documents/causal-belief-catcher/Semaphore-master/semafor-semantic-parser/CR/new_sample.txt
>> >> >
>> >> > Johannes
>> >> >
>> >> >
>> >> > On Fri, May 10, 2013 at 3:13 PM, John Bauer <horatio at gmail.com> wrote:
>> >> >>
>> >> >> You could send me the input file and the command line you are running
>> >> >> if you want.
>> >> >>
>> >> >> What version are you using and what OS?
>> >> >>
>> >> >> On Fri, May 10, 2013 at 12:04 PM, Johannes Castner <jac2130 at gmail.com> wrote:
>> >> >> > I wish there was an explicit error message, but the biggest problem is
>> >> >> > that it returns as if it had completely processed everything.
>> >> >> >
>> >> >> > Johannes
>> >> >> >
>> >> >> >
>> >> >> > On Fri, May 10, 2013 at 2:58 PM, John Bauer <horatio at gmail.com> wrote:
>> >> >> >>
>> >> >> >> We certainly would have debugged such an error if we had witnessed it
>> >> >> >> happening.  There isn't enough information here to try to debug it
>> >> >> >> for you, though.
>> >> >> >>
>> >> >> >> John
>> >> >> >>
>> >> >> >> On Fri, May 10, 2013 at 11:36 AM, Johannes Castner <jac2130 at gmail.com> wrote:
>> >> >> >> > To whom it may concern,
>> >> >> >> >
>> >> >> >> > I am trying to use the Stanford core tools, but they seem to stop
>> >> >> >> > processing my text after about 30-50 sentences, without any errors,
>> >> >> >> > as if the whole text, which has thousands of sentences, had been
>> >> >> >> > processed.  What does that mean and is there something I can do?  It
>> >> >> >> > is particularly important for noun phrase resolution that the whole
>> >> >> >> > text is parsed at once, to preserve global coreference relations.
>> >> >> >> >
>> >> >> >> > Johannes
>> >> >> >> >
>> >> >> >> > _______________________________________________
>> >> >> >> > java-nlp-user mailing list
>> >> >> >> > java-nlp-user at lists.stanford.edu
>> >> >> >> > https://mailman.stanford.edu/mailman/listinfo/java-nlp-user
>> >> >> >> >
>
>
>
>
> --
> Johannes
>
> --------------------------------------------------------------------------------------------------------
> "I can calculate the motions of the heavenly bodies, but not the madness of
> people."
> - Isaac Newton
>
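
On the point made earlier in the thread about words wrapping from one line to the next: one way to remove them before running CoreNLP is a small preprocessing pass. The sketch below assumes the wrapped words are marked with a hyphen at the end of the line (the thread does not say exactly how the input file wraps them), and the function name join_wrapped_words is made up for illustration.

```python
import re

# A word character, a hyphen at the end of a line, then the rest of the word
# at the start of the next line (optionally indented).
_WRAPPED_WORD = re.compile(r"(\w)-[ \t]*\r?\n[ \t]*(\w)")


def join_wrapped_words(text):
    """Rejoin words that were hyphenated across a line break.

    Hypothetical helper: assumes a trailing hyphen marks a wrapped word.
    """
    return _WRAPPED_WORD.sub(r"\1\2", text)
```

For example, join_wrapped_words("the corefer-\nence chain") gives "the coreference chain". Note that a genuinely hyphenated compound that happens to break at the hyphen (such as "well-" followed by "known") would also be merged, so it may be worth spot-checking the result.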

