opennlp.tools.util
Class BeamSearch

java.lang.Object
  extended by opennlp.tools.util.BeamSearch

public class BeamSearch
extends java.lang.Object

Performs k-best search over a sequence. This is based on the description in Ratnaparkhi (1998), PhD diss, Univ. of Pennsylvania.


Field Summary
protected  BeamSearchContextGenerator cg
           
protected  opennlp.maxent.MaxentModel model
           
protected  int size
           
 
Constructor Summary
BeamSearch(int size, BeamSearchContextGenerator cg, opennlp.maxent.MaxentModel model)
          Creates a new search object.
BeamSearch(int size, BeamSearchContextGenerator cg, opennlp.maxent.MaxentModel model, int cacheSize)
           
 
Method Summary
 Sequence bestSequence(java.util.List sequence, java.lang.Object[] additionalContext)
          Returns the best sequence of outcomes based on the model for this object.
 Sequence bestSequence(java.lang.Object[] sequence, java.lang.Object[] additionalContext)
          Returns the best sequence of outcomes based on the model for this object.
 Sequence[] bestSequences(int numSequences, java.lang.Object[] sequence, java.lang.Object[] additionalContext)
           
 Sequence[] bestSequences(int numSequences, java.lang.Object[] sequence, java.lang.Object[] additionalContext, double minSequenceScore)
          Returns the best sequences of outcomes based on the model for this object.
protected  boolean validSequence(int i, java.lang.Object[] inputSequence, java.lang.String[] outcomesSequence, java.lang.String outcome)
          Determines whether a particular continuation of a sequence is valid.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

model

protected opennlp.maxent.MaxentModel model

cg

protected BeamSearchContextGenerator cg

size

protected int size
Constructor Detail

BeamSearch

public BeamSearch(int size,
                  BeamSearchContextGenerator cg,
                  opennlp.maxent.MaxentModel model)
Creates a new search object.

Parameters:
size - The size of the beam (k).
cg - the context generator for the model.
model - the model for assigning probabilities to the sequence outcomes.

BeamSearch

public BeamSearch(int size,
                  BeamSearchContextGenerator cg,
                  opennlp.maxent.MaxentModel model,
                  int cacheSize)
Method Detail

bestSequences

public Sequence[] bestSequences(int numSequences,
                                java.lang.Object[] sequence,
                                java.lang.Object[] additionalContext)

bestSequences

public Sequence[] bestSequences(int numSequences,
                                java.lang.Object[] sequence,
                                java.lang.Object[] additionalContext,
                                double minSequenceScore)
Returns the best sequences of outcomes based on the model for this object.

Parameters:
numSequences - The maximum number of sequences to be returned.
sequence - The input sequence.
additionalContext - An Object[] of additional context. This is passed to the context generator unchecked, on the assumption that the context is appropriate.
minSequenceScore - A lower bound on the score of a returned sequence.
Returns:
An array of the top ranked sequences of outcomes.
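
As an illustration of how numSequences, the beam size, and minSequenceScore interact, here is a minimal, self-contained sketch of k-best beam search. It is not the OpenNLP implementation: the class BeamSearchSketch, the Scorer interface, and the Hypothesis type are hypothetical names introduced for this example, and the sketch assumes scores are summed log probabilities, so a partial sequence that falls below minSequenceScore can never recover and may be pruned early.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of opennlp.tools.util.
final class BeamSearchSketch {

    /** Hypothetical stand-in for a model plus context generator. */
    public interface Scorer {
        /** Log probability of proposing outcome at position i, given the outcomes so far. */
        double logProb(int i, String[] input, List<String> previous, String outcome);
    }

    /** A partial or complete outcome sequence with its accumulated log score. */
    public static final class Hypothesis {
        public final List<String> outcomes;
        public final double score;

        public Hypothesis(List<String> outcomes, double score) {
            this.outcomes = outcomes;
            this.score = score;
        }
    }

    public static List<Hypothesis> bestSequences(int numSequences, int beamSize,
                                                 String[] input, String[] outcomes,
                                                 Scorer scorer, double minSequenceScore) {
        List<Hypothesis> beam = new ArrayList<>();
        beam.add(new Hypothesis(new ArrayList<>(), 0.0)); // start from the empty sequence

        for (int i = 0; i < input.length; i++) {
            List<Hypothesis> next = new ArrayList<>();
            for (Hypothesis h : beam) {
                for (String outcome : outcomes) {
                    double score = h.score + scorer.logProb(i, input, h.outcomes, outcome);
                    // Log probabilities are <= 0, so the score can only decrease from here.
                    if (score < minSequenceScore) {
                        continue;
                    }
                    List<String> extended = new ArrayList<>(h.outcomes);
                    extended.add(outcome);
                    next.add(new Hypothesis(extended, score));
                }
            }
            // Keep only the beamSize highest-scoring hypotheses for the next position.
            next.sort(Comparator.comparingDouble((Hypothesis h) -> h.score).reversed());
            beam = new ArrayList<>(next.subList(0, Math.min(beamSize, next.size())));
        }
        // The beam is already sorted best-first; return at most numSequences entries.
        return new ArrayList<>(beam.subList(0, Math.min(numSequences, beam.size())));
    }
}
```

In this sketch the result can hold fewer than numSequences entries, or none at all, when minSequenceScore prunes every candidate. Whether the real class behaves the same way in that case is not stated on this page.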

bestSequence

public Sequence bestSequence(java.util.List sequence,
                             java.lang.Object[] additionalContext)
Returns the best sequence of outcomes based on the model for this object.

Parameters:
sequence - The input sequence.
additionalContext - An Object[] of additional context. This is passed to the context generator unchecked, on the assumption that the context is appropriate.
Returns:
The top ranked sequence of outcomes.

bestSequence

public Sequence bestSequence(java.lang.Object[] sequence,
                             java.lang.Object[] additionalContext)
Returns the best sequence of outcomes based on the model for this object.

Parameters:
sequence - The input sequence.
additionalContext - An Object[] of additional context. This is passed to the context generator unchecked, on the assumption that the context is appropriate.
Returns:
The top ranked sequence of outcomes.

validSequence

protected boolean validSequence(int i,
                                java.lang.Object[] inputSequence,
                                java.lang.String[] outcomesSequence,
                                java.lang.String outcome)
Determines whether a particular continuation of a sequence is valid. This is used to rule out invalid sequences, such as those that can arise in start/continue tag-based chunking, and could also be used to implement tag dictionary restrictions.

Parameters:
i - The index in the input sequence for which the new outcome is being proposed.
inputSequence - The input sequence.
outcomesSequence - The outcomes so far in this sequence.
outcome - The next proposed outcome for the outcomes sequence.
Returns:
true if the sequence would still be valid with the new outcome, false otherwise.
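
The start/continue restriction mentioned above could be sketched as follows. This is a standalone illustration, not code from this class or its subclasses. It assumes a hypothetical tag scheme in which "B-X" starts a chunk of type X, "I-X" continues one, and any other tag lies outside a chunk. It also assumes outcomesSequence holds exactly the outcomes for positions 0 through i - 1. The class name ChunkSequenceValidator is invented for the example.

```java
// Hypothetical illustration only; the "B-"/"I-" tag scheme is an assumption.
final class ChunkSequenceValidator {

    /**
     * Returns true if outcome may follow the outcomes chosen so far.
     * A continue tag "I-X" is allowed only directly after "B-X" or "I-X".
     */
    static boolean validSequence(int i, Object[] inputSequence,
                                 String[] outcomesSequence, String outcome) {
        if (!outcome.startsWith("I-")) {
            return true; // start tags and outside tags are always allowed
        }
        if (i == 0) {
            return false; // nothing to continue at the first position
        }
        String previous = outcomesSequence[i - 1];
        String type = outcome.substring(2);
        return previous.equals("B-" + type) || previous.equals("I-" + type);
    }
}
```

A subclass of BeamSearch would place logic of this kind in its own override of validSequence, so that invalid continuations are never added to the beam.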


Copyright 2008 Jason Baldridge, Gann Bierner, and Thomas Morton. All Rights Reserved.