[java-nlp-user] Training Stanford NER on CoNLL Dutch corpus causes exception

John Bauer horatio at gmail.com
Fri Apr 8 18:23:53 PDT 2011


The original CoNLL submissions were for a CMM or MEMM model, not a
CRF.  The flags that goodCoNLL sets don't work for CRF classifiers...

Here is an example .prop file to try instead,

John

On Fri, Nov 12, 2010 at 3:27 AM, Andrei Vishneuski <vish at gravitysoft.org> wrote:
>
> Hi,
>
> I am trying to train Stanford NER (version Jan-2009) on the Dutch CoNLL corpus. I have set the magic meta flag "goodCoNLL" to true in the properties file.
> Unfortunately, Stanford NER throws an exception (shown below).
>
> Do you have any thoughts on what triggers the exception and how I can suppress it?
>
> Regards
> Andrei
>
>
> goodCoNLL=true
> serializeTo=conll.nl.ser.gz
> trainFile=/Users/avishneu/projects/tae/lib/ilps/tae/corpus/conll.nl.train
> prop=./mac-test.properties
> numClasses: 5 [0=O,1=I-ORG,2=I-MISC,3=I-PER,4=I-LOC]
> numDocuments: 22
> numDatums: 218736
> numFeatures: 968883
> numWeights: 8225607
> QNMinimizer called on double function of 8225607 variables, using M = 15.
>          An explanation of the output:
> Iter           The number of iterations
> evals          The number of function evaluations
> SCALING        <D> Diagonal scaling was used
>               <I> Scaled Identity
> LINESEARCH     [## M steplength]  Minpack linesearch
>                   1-Function value was too high
>                   2-Value ok, gradient positive, positive curvature
>                   3-Value ok, gradient negative, positive curvature
>                   4-Value ok, gradient negative, negative curvature
>               [.. B]  Backtracking
> VALUE          The current function value
> TIME           Total elapsed time
> |GNORM|        The current norm of the gradient
> {RELNORM}      The ratio of the current to initial gradient norms
> AVEIMPROVE     The average improvement / current value
>
> Iter ## evals ## <SCALING> [LINESEARCH] VALUE TIME |GNORM| {RELNORM} AVEIMPROVE
>
> Iter 1 evals 1 <D> [1113M 3.919E-6]
> Exception in thread "main" java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
>        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>        at java.util.ArrayList.remove(ArrayList.java:387)
>        at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:988)
>        at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:797)
>        at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:792)
>        at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:84)
>        at edu.stanford.nlp.ie.crf.CRFClassifier.train(CRFClassifier.java:1088)
>        at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:648)
>        at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:641)
>        at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:1725)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: conll.2006.prop
Type: application/octet-stream
Size: 1304 bytes
Desc: not available
URL: <http://mailman.stanford.edu/pipermail/java-nlp-user/attachments/20110408/22f3e245/attachment.prop>
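
The attachment is not reproduced inline in this archive. As a stand-in, here is a minimal sketch of a CRF-oriented .prop file. It is not the content of conll.2006.prop: the feature flags follow the example properties file in the Stanford NER FAQ, trainFile and serializeTo are copied from the original post, and the column map is an assumption that the training file has word, POS tag, and NE label columns (as in the CoNLL-2002 Dutch data). Flag availability may differ in the Jan-2009 release.

```properties
# Sketch only: NOT the scrubbed attachment. Adjust to your data and NER version.

# Paths copied from the original post.
trainFile = /Users/avishneu/projects/tae/lib/ilps/tae/corpus/conll.nl.train
serializeTo = conll.nl.ser.gz

# Assumed column layout: word, POS tag, gold NE label.
# Use "word=0,answer=1" if the file has no POS column.
map = word=0,tag=1,answer=2

# Feature flags taken from the Stanford NER FAQ example; no goodCoNLL meta flag.
useClassFeature = true
useWord = true
useNGrams = true
noMidNGrams = true
maxNGramLeng = 6
usePrev = true
useNext = true
useSequences = true
usePrevSequences = true
maxLeft = 1
useTypeSeqs = true
useTypeSeqs2 = true
useTypeySequences = true
wordShape = chris2useLC
useDisjunctive = true
```

Assuming the file is saved as conll.nl.prop (a hypothetical name) and stanford-ner.jar is on the classpath, training would be started with something like:

    java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop conll.nl.prop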

