Class MultiWordChunker
java.lang.Object
org.languagetool.tagging.disambiguation.AbstractDisambiguator
org.languagetool.tagging.disambiguation.MultiWordChunker
- All Implemented Interfaces:
Disambiguator
Multiword tagger-chunker.
-
Field Summary
Fields
filename
allowFirstCapitalized
mStartSpace
mStartNoSpace
mFull
-
Constructor Summary
Constructors
MultiWordChunker(String filename)
MultiWordChunker(String filename, boolean allowFirstCapitalized)
-
Method Summary
Modifier and Type, Method, Description
final AnalyzedSentence disambiguate(AnalyzedSentence input)
Implements multiword POS tags, e.g., <ELLIPSIS> for ellipsis (...) start, and </ELLIPSIS> for ellipsis end.
private void lazyInit()
loadWords(InputStream stream)
private AnalyzedTokenReadings prepareNewReading(String tokens, String tok, AnalyzedTokenReadings token, boolean isLast)
private AnalyzedTokenReadings setAndAnnotate(AnalyzedTokenReadings oldReading, AnalyzedToken newReading)
Methods inherited from class org.languagetool.tagging.disambiguation.AbstractDisambiguator
preDisambiguate
-
Field Details
-
filename
-
allowFirstCapitalized
private final boolean allowFirstCapitalized
-
mStartSpace
-
mStartNoSpace
-
mFull
-
-
Constructor Details
-
MultiWordChunker
- Parameters:
filename
- text file with multiwords and tags
-
MultiWordChunker
- Parameters:
filename
- text file with multiwords and tags
allowFirstCapitalized
- if set to true, the first word of the multiword can be capitalized
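The effect of allowFirstCapitalized can be pictured with a small standalone sketch. It does not use LanguageTool and is not the library's implementation; the class name FirstCapitalizedSketch and the helper acceptedForms are hypothetical, and the sketch simply assumes that the flag makes one extra form acceptable, namely the entry with its first letter upper-cased.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration only; not part of LanguageTool.
public class FirstCapitalizedSketch {

  // Returns the surface forms that would be accepted for one multiword entry.
  // Assumption: with the flag set, the entry is also accepted when its
  // first letter is upper-cased (e.g. at the start of a sentence).
  static Set<String> acceptedForms(String multiword, boolean allowFirstCapitalized) {
    Set<String> forms = new LinkedHashSet<>();
    forms.add(multiword);
    if (allowFirstCapitalized && !multiword.isEmpty()) {
      forms.add(Character.toUpperCase(multiword.charAt(0)) + multiword.substring(1));
    }
    return forms;
  }

  public static void main(String[] args) {
    System.out.println(acceptedForms("a priori", false));
    System.out.println(acceptedForms("a priori", true));
  }
}
```

With the flag off, only the form listed in the file is accepted; with it on, "A priori" is accepted as well.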
-
-
Method Details
-
lazyInit
private void lazyInit()
-
disambiguate
Implements multiword POS tags, e.g., <ELLIPSIS> for ellipsis (...) start, and </ELLIPSIS> for ellipsis end.
- Parameters:
input
- The tokens to be chunked.
- Returns:
- AnalyzedSentence with additional markers.
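The start and end markers described above can be illustrated without LanguageTool. The following is a minimal sketch under simplifying assumptions: tokens are plain strings, multiwords are space-separated token sequences, and the class MultiwordTaggingSketch and method tagMultiwords are hypothetical names. The real method works on AnalyzedSentence and AnalyzedTokenReadings and may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of LanguageTool.
public class MultiwordTaggingSketch {

  // Returns one marker per token: "<TAG>" on the first token of a matched
  // multiword, "</TAG>" on its last token, and "" for every other token.
  static List<String> tagMultiwords(List<String> tokens, Map<String, String> multiwords) {
    List<String> markers = new ArrayList<>();
    for (int i = 0; i < tokens.size(); i++) {
      markers.add("");
    }
    for (Map.Entry<String, String> entry : multiwords.entrySet()) {
      List<String> parts = Arrays.asList(entry.getKey().split(" "));
      if (parts.size() < 2) {
        continue; // a single token is not a multiword
      }
      for (int i = 0; i + parts.size() <= tokens.size(); i++) {
        if (tokens.subList(i, i + parts.size()).equals(parts)) {
          markers.set(i, "<" + entry.getValue() + ">");
          markers.set(i + parts.size() - 1, "</" + entry.getValue() + ">");
        }
      }
    }
    return markers;
  }

  public static void main(String[] args) {
    Map<String, String> multiwords = new LinkedHashMap<>();
    multiwords.put(". . .", "ELLIPSIS"); // three dot tokens form an ellipsis
    List<String> tokens = Arrays.asList("Wait", ".", ".", ".", "what");
    System.out.println(tagMultiwords(tokens, multiwords));
  }
}
```

In this example the first dot receives <ELLIPSIS>, the last dot receives </ELLIPSIS>, and the middle dot and the surrounding words stay unmarked.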
-
prepareNewReading
private AnalyzedTokenReadings prepareNewReading(String tokens, String tok, AnalyzedTokenReadings token, boolean isLast)
-
setAndAnnotate
private AnalyzedTokenReadings setAndAnnotate(AnalyzedTokenReadings oldReading, AnalyzedToken newReading)
-
loadWords
-
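This page describes the input only as a text file with multiwords and tags and does not state the line format. The sketch below shows how such a stream could be read under an assumed format: one entry per line, the multiword and its tag separated by a tab, and lines starting with # treated as comments. The format, the class name LoadWordsSketch, and the method parse are all assumptions for illustration, not the behaviour of loadWords itself.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not part of LanguageTool.
public class LoadWordsSketch {

  // Parses "multiword<TAB>tag" lines (assumed format) into a map.
  // Blank lines, '#' comment lines, and malformed lines are skipped.
  static Map<String, String> parse(InputStream stream) {
    Map<String, String> result = new LinkedHashMap<>();
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(stream, StandardCharsets.UTF_8))) {
      String line;
      while ((line = reader.readLine()) != null) {
        line = line.trim();
        if (line.isEmpty() || line.startsWith("#")) {
          continue;
        }
        String[] parts = line.split("\t");
        if (parts.length != 2) {
          continue;
        }
        result.put(parts[0], parts[1]);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return result;
  }

  public static void main(String[] args) {
    String sample = "# sample entries\n. . .\tELLIPSIS\na priori\tRB\n";
    InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
    System.out.println(parse(in));
  }
}
```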