[java-nlp-user] Phrase Extract fails due to OutOfMemoryError (and ant fails)

Hwidong Na leona at postech.ac.kr
Thu May 26 18:46:27 PDT 2011


Hi,

I ran extract-phrase with gaps on 1M sentence pairs (Chinese-English) of
the training corpus. It aborts at a certain point, and phrases.log
gives the following stack trace.

....
        at java.util.LinkedList.listIterator(LinkedList.java:667)
        at java.util.AbstractList.listIterator(AbstractList.java:284)
        at java.util.AbstractSequentialList.iterator(AbstractSequentialList.java:222)
        at edu.stanford.nlp.mt.train.DTUPhraseExtractor.growDTUs(DTUPhraseExtractor.java:591)
        at edu.stanford.nlp.mt.train.DTUPhraseExtractor.extractPhrases(DTUPhraseExtractor.java:734)
        at edu.stanford.nlp.mt.train.PhraseExtract.featurizePhrases(PhraseExtract.java:636)
        at edu.stanford.nlp.mt.train.PhraseExtract.processLine(PhraseExtract.java:628)
        at edu.stanford.nlp.mt.train.PhraseExtract.extractFromAlignedData(PhraseExtract.java:583)
        at edu.stanford.nlp.mt.train.PhraseExtract.extractAll(PhraseExtract.java:748)
        at edu.stanford.nlp.mt.train.PhraseExtract.main(PhraseExtract.java:814)
...

Can I skip the specific sentences that cause the OutOfMemoryError? I tried
to modify the source code and recompile it, but the build seems to require
additional libraries such as TER.
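
To make the kind of skipping asked about here concrete, a minimal sketch
follows. It is not Phrasal code: the class SkipOnOom, the method processAll,
and the stand-in extractor are hypothetical names used only for illustration,
and the real per-line loop (PhraseExtract.extractFromAlignedData, according to
the stack trace) may be organized differently. Recovering from an
OutOfMemoryError is also not guaranteed to work in general, so this is
best-effort at most.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class SkipOnOom {

    /**
     * Runs the extractor on every line and returns the 1-based numbers of
     * the lines that were skipped because they raised OutOfMemoryError.
     * Hypothetical helper, not part of Phrasal.
     */
    public static List<Integer> processAll(List<String> lines, Consumer<String> extractor) {
        List<Integer> skipped = new ArrayList<>();
        int lineNo = 0;
        for (String line : lines) {
            lineNo++;
            try {
                extractor.accept(line);
            } catch (OutOfMemoryError e) {
                // Objects allocated for this line become unreachable once the
                // stack unwinds, so later lines may still succeed.
                skipped.add(lineNo);
                System.err.println("Skipping sentence pair " + lineNo + ": " + e);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("short pair", "pathological pair", "another pair");
        // Stand-in for the real extractor: simulates running out of memory on one line.
        Consumer<String> fakeExtractor = line -> {
            if (line.startsWith("pathological")) {
                throw new OutOfMemoryError("simulated");
            }
        };
        System.out.println("Skipped lines: " + processAll(lines, fakeExtractor));
    }
}
```

The sketch records which lines were skipped so they can be inspected or
filtered out of the corpus afterwards.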

(I already set $SRILM, $PHRASAL, and $CORENLP properly.)
$ cd $PHRASAL
$ ant
Buildfile: build.xml

init:
    [mkdir] Created dir: /home/leona/phrasal.beta1c/classes

compile:
     [echo] /home/leona/stanford-corenlp-2011-05-21//xom.jar:/home/leona/stanford-corenlp-2011-05-21//xom-src-1.2.6.jar:/home/leona/stanford-corenlp-2011-05-21//stanford-corenlp-models-2011-04-11.jar:/home/leona/stanford-corenlp-2011-05-21//stanford-corenlp-2011-05-21.jar:/home/leona/stanford-corenlp-2011-05-21//stanford-corenlp-2011-05-21-sources.jar:/home/leona/stanford-corenlp-2011-05-21//jgrapht.jar:/home/leona/stanford-corenlp-2011-05-21//jgrapht-src-0.7.3.jar:/home/leona/stanford-corenlp-2011-05-21//jgraph.jar:/home/leona/stanford-corenlp-2011-05-21//fastutil.jar:/home/leona/phrasal.beta1c/phrasal.beta1c.jar:
    [javac] Compiling 312 source files to /home/leona/phrasal.beta1c/classes
    [javac] /home/leona/phrasal.beta1c/src/edu/stanford/nlp/mt/metrics/TERMetric.java:19: package com.bbn.mt.ter does not exist
    [javac] import com.bbn.mt.ter.TERcalc;
    [javac]                      ^
    [javac] /home/leona/phrasal.beta1c/src/edu/stanford/nlp/mt/metrics/TERMetric.java:20: package com.bbn.mt.ter does not exist
    [javac] import com.bbn.mt.ter.TERalignment;
...
-- 
Hwidong Na <leona at postech.ac.kr>
KLE lab, POSTECH, KOREA







