Package edu.berkeley.nlp.lm.io
Class: Description
ArpaLmReader<W>: A parser for ARPA LM files.
Callback that is called for each n-gram in the collection.
Computes the log probability of a list of files.
FirstPassCallback<V extends LongRepresentable<V>>: Reader callback which adds n-grams to an NgramMap.
Reads in n-gram count collections in the format that the Google n-grams Web1T corpus comes in.
Some IO utility functions.
Class for producing a Kneser-Ney language model in ARPA format from raw text.
Class for producing a Kneser-Ney language model in ARPA format from raw text.
LmReader<V, C extends LmReaderCallback<V>>
Callback that is called for each n-gram in the collection.
This class contains a number of static methods for reading/writing/estimating n-gram language models.
Estimates a Kneser-Ney language model from raw text, and writes the language model out in ARPA format.
Given a language model in ARPA format, builds a binary representation of the language model and writes it to disk.
Given a directory in Google n-grams format, builds a binary representation of a stupid-backoff language model and writes it to disk.
Like MakeLmBinaryFromGoogle, except it only writes the NgramMap portion of the LM, meaning the binary does not contain the vocabulary.
Reader callback which adds n-grams to an NgramMap.
Callback that is called for each n-gram in the collection.
TextReader<W>: Class for reading raw text files.
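
The reader-plus-callback pattern that recurs in the descriptions above (a reader walks a file and invokes a callback once per n-gram) can be illustrated with a small self-contained sketch. It does not use or reproduce this package's API: the names ArpaSketch, NgramCallback, parse and collect are hypothetical, and the sketch assumes the common ARPA layout in which each entry line is "logProb<TAB>words<TAB>backoff", with the backoff field optional.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of a streaming ARPA reader; not the berkeleylm API. */
final class ArpaSketch {

    /** Hypothetical callback, invoked once for every n-gram entry in the file. */
    interface NgramCallback {
        void call(int order, List<String> words, double logProb, double backoff);
    }

    /** Streams n-gram entries from ARPA-formatted text to the callback. */
    static void parse(BufferedReader in, NgramCallback callback) throws IOException {
        int order = 0; // stays 0 until the first "\N-grams:" section header
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.equals("\\data\\") || line.startsWith("ngram ")) {
                continue; // header and count lines carry no n-gram entries
            }
            if (line.equals("\\end\\")) {
                break; // end-of-model marker
            }
            if (line.startsWith("\\") && line.endsWith("-grams:")) {
                // Section header such as "\2-grams:" sets the current order.
                order = Integer.parseInt(line.substring(1, line.indexOf('-')));
                continue;
            }
            if (order == 0) {
                continue; // ignore anything before the first section
            }
            // Assumed entry layout: logProb <TAB> w1 ... wN [<TAB> backoff]
            String[] fields = line.split("\t");
            double logProb = Double.parseDouble(fields[0]);
            List<String> words = Arrays.asList(fields[1].split(" "));
            // A missing backoff field is treated as log10 weight 0.0.
            double backoff = fields.length > 2 ? Double.parseDouble(fields[2]) : 0.0;
            callback.call(order, words, logProb, backoff);
        }
    }

    /** Convenience wrapper: collects each entry as "order:words:logProb:backoff". */
    static List<String> collect(String arpaText) {
        List<String> out = new ArrayList<>();
        try {
            parse(new BufferedReader(new StringReader(arpaText)),
                  (order, words, logProb, backoff) ->
                      out.add(order + ":" + String.join(" ", words) + ":" + logProb + ":" + backoff));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

Keeping the parser ignorant of what happens to each n-gram is the point of the design: the same pass over the file could feed a counter, a map builder, or a writer simply by swapping the callback.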