java.lang.Object
opennlp.tools.ngram.NGramModel
public class NGramModel
The NGramModel can be used to create ngrams and character ngrams.
| Field Summary | |
|---|---|
protected static java.lang.String |
COUNT
|
| Constructor Summary | |
|---|---|
NGramModel()
Initializes an empty instance. |
|
NGramModel(java.io.InputStream in)
Initializes the current instance. |
|
| Method Summary | |
|---|---|
void |
add(java.lang.String chars,
int minLength,
int maxLength)
Adds character NGrams to the current instance. |
void |
add(TokenList ngram)
Adds one NGram; if it already exists, its count increases by one. |
void |
add(TokenList ngram,
int minLength,
int maxLength)
Adds NGrams up to the specified length to the current instance. |
boolean |
contains(TokenList tokens)
Checks if the given tokens are contained by the current instance. |
void |
cutoff(int cutoffUnder,
int cutoffOver)
Deletes all ngrams which appear less often than the cutoffUnder value or more often than the cutoffOver value. |
boolean |
equals(java.lang.Object obj)
|
int |
getCount(TokenList ngram)
Retrieves the count of the given ngram. |
int |
hashCode()
|
java.util.Iterator |
iterator()
Retrieves an Iterator over all TokenList entries. |
int |
numberOfGrams()
Retrieves the total count of all Ngrams. |
void |
remove(TokenList tokens)
Removes the specified tokens from the NGram model; they are just dropped. |
void |
serialize(java.io.OutputStream out)
Writes the ngram instance to the given OutputStream. |
void |
setCount(TokenList ngram,
int count)
Sets the count of an existing ngram. |
int |
size()
Retrieves the number of TokenList entries in the current instance. |
Dictionary |
toDictionary()
|
Dictionary |
toDictionary(boolean caseSensitive)
Creates a dictionary which contains all TokenLists which
are in the current NGramModel. |
java.lang.String |
toString()
|
| Methods inherited from class java.lang.Object |
|---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
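The counting behaviour listed in the Method Summary (add increments a count, size counts distinct entries, numberOfGrams totals all counts, cutoff drops entries outside a count range) can be illustrated with a minimal sketch. The class `NGramCountSketch` below is a hypothetical stand-in built on a plain `HashMap`, with n-grams represented as `List<String>` instead of `TokenList`. It is not the OpenNLP implementation, and returning 0 for an unknown n-gram is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the OpenNLP NGramModel.
class NGramCountSketch {
    // Maps each n-gram (a list of tokens) to the number of times it was added.
    private final Map<List<String>, Integer> counts = new HashMap<>();

    // Adds one n-gram; adding it again increases its count by one.
    void add(List<String> ngram) {
        counts.merge(new ArrayList<>(ngram), 1, Integer::sum);
    }

    // Assumption: an n-gram that was never added has a count of 0.
    int getCount(List<String> ngram) {
        return counts.getOrDefault(ngram, 0);
    }

    // Number of distinct n-gram entries.
    int size() {
        return counts.size();
    }

    // Sum of the counts of all entries.
    int numberOfGrams() {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return total;
    }

    // Drops entries seen fewer than cutoffUnder or more than cutoffOver times.
    void cutoff(int cutoffUnder, int cutoffOver) {
        counts.values().removeIf(count -> count < cutoffUnder || count > cutoffOver);
    }

    public static void main(String[] args) {
        NGramCountSketch model = new NGramCountSketch();
        model.add(Arrays.asList("new", "york"));
        model.add(Arrays.asList("new", "york"));
        model.add(Arrays.asList("san", "jose"));
        System.out.println("entries: " + model.size());
        System.out.println("total count: " + model.numberOfGrams());
        model.cutoff(2, Integer.MAX_VALUE);
        System.out.println("entries after cutoff: " + model.size());
    }
}
```

Passing `Integer.MAX_VALUE` as the upper bound in this sketch leaves only the lower cutoff active.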
| Field Detail |
|---|
protected static final java.lang.String COUNT
| Constructor Detail |
|---|
public NGramModel()
public NGramModel(java.io.InputStream in)
throws java.io.IOException,
InvalidFormatException
in -
java.io.IOException
InvalidFormatException
| Method Detail |
|---|
public int getCount(TokenList ngram)
ngram -
public void setCount(TokenList ngram,
int count)
ngram -
count -
public void add(TokenList ngram)
ngram -
public void add(TokenList ngram,
int minLength,
int maxLength)
ngram - the tokens to build the uni-grams, bi-grams, tri-grams, ... from.
minLength - minimal length
maxLength - maximal length
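As a rough illustration of what "NGrams up to the specified length" means for a token sequence, the sketch below enumerates every contiguous run of tokens whose length lies between minLength and maxLength. `TokenNGramSketch` and its `ngrams` method are hypothetical names, tokens are plain strings rather than a `TokenList`, and the argument check is an assumption rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenNLP.
class TokenNGramSketch {
    // Returns every contiguous token run with minLength <= length <= maxLength,
    // shortest runs first, each group ordered by start position.
    static List<List<String>> ngrams(List<String> tokens, int minLength, int maxLength) {
        // Assumption: lengths must be positive and correctly ordered.
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("require 1 <= minLength <= maxLength");
        }
        List<List<String>> result = new ArrayList<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= tokens.size(); start++) {
                // Copy the window so the result does not alias the input list.
                result.add(new ArrayList<>(tokens.subList(start, start + length)));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "fox");
        for (List<String> ngram : ngrams(tokens, 1, 2)) {
            System.out.println(ngram);
        }
    }
}
```

In a counting model, each generated run would then be added individually, so a run that occurs twice in the input ends up with a count of two.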
public void add(java.lang.String chars,
int minLength,
int maxLength)
chars -
minLength -
maxLength -
public void remove(TokenList tokens)
tokens -
public boolean contains(TokenList tokens)
tokens -
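The character variant add(java.lang.String chars, int minLength, int maxLength) described above works on substrings rather than token runs. A minimal sketch of that idea follows. `CharNGramSketch` is a hypothetical name, and details such as case normalization are left out because this page does not specify them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenNLP.
class CharNGramSketch {
    // Returns every substring with minLength <= length <= maxLength,
    // shortest first. Lengths are measured in UTF-16 code units.
    static List<String> charNGrams(String chars, int minLength, int maxLength) {
        // Assumption: lengths must be positive and correctly ordered.
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("require 1 <= minLength <= maxLength");
        }
        List<String> result = new ArrayList<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= chars.length(); start++) {
                result.add(chars.substring(start, start + length));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (String gram : charNGrams("abc", 2, 3)) {
            System.out.println(gram);
        }
    }
}
```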
public int size()
Retrieves the number of TokenList entries in the current instance.
public java.util.Iterator iterator()
Retrieves an Iterator over all TokenList entries.
public int numberOfGrams()
public void cutoff(int cutoffUnder,
int cutoffOver)
cutoffUnder -
cutoffOver -
public Dictionary toDictionary()
public Dictionary toDictionary(boolean caseSensitive)
Creates a dictionary which contains all TokenLists which
are in the current NGramModel.
caseSensitive - Specifies whether case distinctions should be kept in the creation of the dictionary.
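One way to picture the effect of the caseSensitive flag is to collect the model's entries into a set and fold case when the flag is false. The sketch below does this with plain collections. `DictionarySketch` and `toEntries` are hypothetical names, and lower-casing with `Locale.ROOT` is an assumption about how case-insensitivity could be realized, not a description of how Dictionary actually compares entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not the OpenNLP Dictionary.
class DictionarySketch {
    // Collects distinct entries; when caseSensitive is false, tokens are
    // lower-cased first so entries differing only in case collapse into one.
    static Set<List<String>> toEntries(Iterable<List<String>> ngrams, boolean caseSensitive) {
        Set<List<String>> entries = new LinkedHashSet<>();
        for (List<String> ngram : ngrams) {
            List<String> entry = new ArrayList<>();
            for (String token : ngram) {
                entry.add(caseSensitive ? token : token.toLowerCase(Locale.ROOT));
            }
            entries.add(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        List<List<String>> ngrams = Arrays.asList(
                Arrays.asList("New", "York"),
                Arrays.asList("new", "york"));
        System.out.println(toEntries(ngrams, true));
        System.out.println(toEntries(ngrams, false));
    }
}
```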
public void serialize(java.io.OutputStream out)
throws java.io.IOException
Writes the ngram instance to the given OutputStream.
out -
java.io.IOException - if an I/O error occurs during writing
public boolean equals(java.lang.Object obj)
equals in class java.lang.Object
public java.lang.String toString()
toString in class java.lang.Object
public int hashCode()
hashCode in class java.lang.Object