Closed
Description
OS: Linux Mint 20.1 Ulyssa (base: Ubuntu 20.04 focal)
Java: openjdk version "11.0.9.1"
CoreNLP: 4.2.0 (also 4.1.0)
Command line:
java -Xmx10g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,depparse,coref,quote -file input.txt -outputFormat text
Contents of input.txt:
John Doe:"foo"
Note that the problem occurs whenever the first line of a document has the above form: a name (first [middle] last), followed by punctuation (tested with colon and comma), and then a quote.
Exception thrown:
Exception in thread "main" java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 6
at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64)
at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70)
at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248)
at java.base/java.util.Objects.checkIndex(Objects.java:372)
at java.base/java.util.ArrayList.get(ArrayList.java:459)
at edu.stanford.nlp.quoteattribution.Sieves.QMSieves.TrigramSieve.trigramPatterns(TrigramSieve.java:71)
at edu.stanford.nlp.quoteattribution.Sieves.QMSieves.TrigramSieve.doQuoteToMention(TrigramSieve.java:27)
at edu.stanford.nlp.pipeline.QuoteAttributionAnnotator.annotate(QuoteAttributionAnnotator.java:251)
at edu.stanford.nlp.pipeline.QuoteAnnotator.annotate(QuoteAnnotator.java:292)
at edu.stanford.nlp.pipeline.AnnotationPipeline.annotate(AnnotationPipeline.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:653)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.annotate(StanfordCoreNLP.java:663)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1261)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.processFiles(StanfordCoreNLP.java:1095)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.run(StanfordCoreNLP.java:1361)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.main(StanfordCoreNLP.java:1430)
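For anyone triaging this: the trace ends in `ArrayList.get` with index -1 on a list of length 6, which is consistent with a look-behind from the very first token of the document (the input plausibly tokenizes to six tokens, though that count is an assumption). The sketch below is not the actual `TrigramSieve` code, which is not shown in this report. `LookbehindSketch` and `tokenBefore` are hypothetical names used only to illustrate the failure mode and one possible bounds guard.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; this is NOT the CoreNLP TrigramSieve source.
class LookbehindSketch {

    // Returns the token immediately before `index`, or an empty Optional
    // when there is no valid predecessor (e.g. `index` is the first token).
    static Optional<String> tokenBefore(List<String> tokens, int index) {
        int prev = index - 1;
        if (prev < 0 || prev >= tokens.size()) {
            return Optional.empty();
        }
        return Optional.of(tokens.get(prev));
    }

    public static void main(String[] args) {
        // Assumed tokenization of the input line: John Doe:"foo"
        List<String> tokens =
                new ArrayList<>(List.of("John", "Doe", ":", "\"", "foo", "\""));

        // Unguarded look-behind from token 0 asks the list for index -1.
        try {
            tokens.get(0 - 1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded: " + e.getMessage());
        }

        // Guarded look-behind reports "no previous token" instead of throwing.
        System.out.println("guarded: " + tokenBefore(tokens, 0).isPresent());
    }
}
```

If the real cause is a look-behind of this kind, a check like the one in `tokenBefore` before the `get` call would avoid the exception when the name opens the document.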