opennlp.tools.coref.resolver
Class MaxentResolver

java.lang.Object
  extended by opennlp.tools.coref.resolver.AbstractResolver
      extended by opennlp.tools.coref.resolver.MaxentResolver
All Implemented Interfaces:
Resolver
Direct Known Subclasses:
CommonNounResolver, DefiniteNounResolver, IsAResolver, PluralNounResolver, PluralPronounResolver, ProperNounResolver, SingularPronounResolver, SpeechPronounResolver

public abstract class MaxentResolver
extends AbstractResolver

Provides common functionality used by classes which implement the Resolver interface and use maximum entropy models to make resolution decisions.


Field Summary
static java.lang.String DEFAULT
          Default feature value.
static java.lang.String DIFF
          Outcome when two mentions are not coreferent.
protected  NonReferentialResolver nonReferentialResolver
          The model for computing non-referential probabilities.
protected  boolean pairedSampleSelection
          When true, this designates that training should consist of a single positive and a single negative example (when possible) for each mention.
protected  boolean preferFirstReferent
          When true, this designates that the resolver should use the first referent encountered which is preferable to non-reference.
static java.lang.String SAME
          Outcome when two mentions are coreferent.
protected  boolean useSameModelForNonRef
          When true, this designates that the same maximum entropy model should be used for non-reference events (the pairing of a mention and the "null" reference) as is used for potentially referential pairs.
 
Fields inherited from class opennlp.tools.coref.resolver.AbstractResolver
distances, numEntitiesBack, numSentencesBack, showExclusions
 
Constructor Summary
protected MaxentResolver(int numberOfEntitiesBack, boolean preferFirstReferent)
          Creates a maximum-entropy-based resolver which will look the specified number of entities back for a referent.
  MaxentResolver(java.lang.String modelDirectory, java.lang.String modelName, ResolverMode mode, int numberEntitiesBack)
          Creates a maximum-entropy-based resolver with the specified model name, using the specified mode, which will look the specified number of entities back for a referent.
  MaxentResolver(java.lang.String modelDirectory, java.lang.String modelName, ResolverMode mode, int numberEntitiesBack, boolean preferFirstReferent)
           
  MaxentResolver(java.lang.String modelDirectory, java.lang.String modelName, ResolverMode mode, int numberEntitiesBack, boolean preferFirstReferent, double nonReferentialProbability)
           
  MaxentResolver(java.lang.String modelDirectory, java.lang.String name, ResolverMode mode, int numberOfEntitiesBack, boolean preferFirstReferent, NonReferentialResolver nonReferentialResolver)
          Creates a maximum-entropy-based resolver with the specified model name, using the specified mode, which will look the specified number of entities back for a referent and prefer the first referent if specified.
  MaxentResolver(java.lang.String modelDirectory, java.lang.String modelName, ResolverMode mode, int numberEntitiesBack, NonReferentialResolver nonReferentialResolver)
           
 
Method Summary
protected  boolean defaultReferent(DiscourseEntity de)
          Returns whether the specified entity satisfies the criteria for being a default referent.
protected  boolean definiteArticle(java.lang.String tok, java.lang.String tag)
          Returns whether the specified token is a definite article.
protected  boolean excluded(MentionContext ec, DiscourseEntity de)
          Excludes entities which are not compatible with the entity under consideration.
static java.util.List getContextFeatures(MentionContext mention)
          Returns a list of features based on the surrounding context of the specified mention.
protected  java.util.List getDistanceFeatures(MentionContext mention, DiscourseEntity entity)
          Returns distance features for the specified mention and entity.
protected  java.util.List getFeatures(MentionContext mention, DiscourseEntity entity)
          Returns a list of features for deciding whether the specified mention refers to the specified discourse entity.
protected  java.lang.String getMentionCountFeature(DiscourseEntity de)
           
protected  java.util.List getPronounMatchFeatures(MentionContext mention, DiscourseEntity entity)
          Returns features indicating whether the specified mention is compatible with the pronouns of the specified entity.
protected  java.util.List getStringMatchFeatures(MentionContext mention, DiscourseEntity entity)
          Returns string-match features for the specified mention and entity.
static java.util.List getWordFeatures(Parse token)
          Returns a list of word features for the specified token.
static boolean loadAsResource()
          Returns whether the models should be loaded from a file or from a resource.
static void loadAsResource(boolean lar)
          Specifies whether the models should be loaded from a resource.
 DiscourseEntity resolve(MentionContext ec, DiscourseModel dm)
          Resolves this referring expression to a discourse entity in the discourse model.
 DiscourseEntity retain(MentionContext mention, DiscourseModel dm)
          Uses the specified mention and discourse model to train this resolver.
static void setSimilarityModel(TestSimilarityModel sm)
           
 void train()
          Retrains model on examples for which retain was called.
 
Methods inherited from class opennlp.tools.coref.resolver.AbstractResolver
featureString, getHead, getHeadIndex, getHeadString, getNumEntities, getNumEntities, getPronounGender, outOfRange, setNumberSentencesBack, stripNp
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface opennlp.tools.coref.resolver.Resolver
canResolve
 

Field Detail

SAME

public static final java.lang.String SAME
Outcome when two mentions are coreferent.

See Also:
Constant Field Values

DIFF

public static final java.lang.String DIFF
Outcome when two mentions are not coreferent.

See Also:
Constant Field Values

DEFAULT

public static final java.lang.String DEFAULT
Default feature value.

See Also:
Constant Field Values

preferFirstReferent

protected boolean preferFirstReferent
When true, this designates that the resolver should use the first referent encountered which is preferable to non-reference. When false, all non-excluded referents within this resolver's range are considered.
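
The difference between the two settings can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the class PreferFirstSketch, the method choose, the probability arrays, and the nearest-first candidate ordering are all assumptions made for illustration.

```java
// Hypothetical sketch of the preferFirstReferent behaviour described above.
// None of these names come from OpenNLP.
final class PreferFirstSketch {

    /**
     * Picks a referent among candidates.
     *
     * @param candidateProbs assumed coreference probability for each
     *                       non-excluded candidate, in the order the
     *                       resolver would encounter them (assumption)
     * @param nonRefProb     assumed probability that the mention is
     *                       non-referential
     * @param preferFirstReferent mirrors the field documented above
     * @return index of the chosen candidate, or -1 for non-reference
     */
    static int choose(double[] candidateProbs, double nonRefProb,
                      boolean preferFirstReferent) {
        int best = -1;
        double bestProb = nonRefProb;
        for (int i = 0; i < candidateProbs.length; i++) {
            if (candidateProbs[i] > bestProb) {
                best = i;
                bestProb = candidateProbs[i];
                if (preferFirstReferent) {
                    // Stop at the first candidate that beats non-reference.
                    break;
                }
            }
        }
        return best;
    }
}
```

With candidate probabilities {0.2, 0.6, 0.9} and a non-reference probability of 0.5, this sketch returns index 1 when preferFirstReferent is true and index 2 when it is false.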


pairedSampleSelection

protected boolean pairedSampleSelection
When true, this designates that training should consist of a single positive and a single negative example (when possible) for each mention.
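
As a rough illustration of paired sample selection, the sketch below labels the candidates for one mention. It is an assumption-laden toy, not the library's code: the class PairedSampleSketch, the integer entity ids, the "SAME"/"DIFF" label strings (named after the fields above, whose actual constant values are not shown on this page), and the rule of keeping the first candidate of each kind are all hypothetical.

```java
// Hypothetical sketch of paired sample selection; names and the
// "first of each kind" rule are assumptions, not OpenNLP behaviour.
final class PairedSampleSketch {

    /**
     * Labels each candidate for a single mention.
     *
     * @param candidateEntityIds ids of the candidate entities, in the
     *                           order they are considered (assumption)
     * @param goldEntityId       id of the entity the mention truly refers to
     * @param pairedSampleSelection mirrors the field documented above
     * @return one label per candidate: "SAME", "DIFF", or null when the
     *         candidate is skipped and yields no training example
     */
    static String[] label(int[] candidateEntityIds, int goldEntityId,
                          boolean pairedSampleSelection) {
        String[] labels = new String[candidateEntityIds.length];
        boolean havePositive = false;
        boolean haveNegative = false;
        for (int i = 0; i < candidateEntityIds.length; i++) {
            boolean coreferent = candidateEntityIds[i] == goldEntityId;
            if (!pairedSampleSelection) {
                // Every candidate becomes a training example.
                labels[i] = coreferent ? "SAME" : "DIFF";
            } else if (coreferent && !havePositive) {
                labels[i] = "SAME";   // the single positive example
                havePositive = true;
            } else if (!coreferent && !haveNegative) {
                labels[i] = "DIFF";   // the single negative example
                haveNegative = true;
            }
            // Otherwise the candidate is skipped (label stays null).
        }
        return labels;
    }
}
```

For candidates {3, 5, 3, 7} and gold entity 5, the paired setting keeps one DIFF (the first 3) and one SAME (the 5) and skips the rest, while the unpaired setting labels all four.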


useSameModelForNonRef

protected boolean useSameModelForNonRef
When true, this designates that the same maximum entropy model should be used for non-reference events (the pairing of a mention and the "null" reference) as is used for potentially referential pairs. When false, a separate model is created for these events.


nonReferentialResolver

protected NonReferentialResolver nonReferentialResolver
The model for computing non-referential probabilities.

Constructor Detail

MaxentResolver

protected MaxentResolver(int numberOfEntitiesBack,
                         boolean preferFirstReferent)
Creates a maximum-entropy-based resolver which will look the specified number of entities back for a referent. This constructor is only used for unit testing.

Parameters:
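numberOfEntitiesBack - The number of entities back in the text that this resolver will look for a referent.
preferFirstReferent - Set to true if the resolver should prefer the first referent which is more likely than non-reference.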

MaxentResolver

public MaxentResolver(java.lang.String modelDirectory,
                      java.lang.String name,
                      ResolverMode mode,
                      int numberOfEntitiesBack,
                      boolean preferFirstReferent,
                      NonReferentialResolver nonReferentialResolver)
               throws java.io.IOException
Creates a maximum-entropy-based resolver with the specified model name, using the specified mode, which will look the specified number of entities back for a referent and prefer the first referent if specified.

Parameters:
modelDirectory - The name of the directory where the resolver models are stored.
name - The name of the file where this model will be read or written.
mode - The mode this resolver is being used in (training, testing).
numberOfEntitiesBack - The number of entities back in the text that this resolver will look for a referent.
preferFirstReferent - Set to true if the resolver should prefer the first referent which is more likely than non-reference. This only affects testing.
nonReferentialResolver - Determines how likely it is that this entity is non-referential.
Throws:
java.io.IOException - If the model file is not found or cannot be written to.

MaxentResolver

public MaxentResolver(java.lang.String modelDirectory,
                      java.lang.String modelName,
                      ResolverMode mode,
                      int numberEntitiesBack)
               throws java.io.IOException
Creates a maximum-entropy-based resolver with the specified model name, using the specified mode, which will look the specified number of entities back for a referent.

Parameters:
modelDirectory - The name of the directory where the resolver models are stored.
modelName - The name of the file where this model will be read or written.
mode - The mode this resolver is being used in (training, testing).
numberEntitiesBack - The number of entities back in the text that this resolver will look for a referent.
Throws:
java.io.IOException - If the model file is not found or cannot be written to.

MaxentResolver

public MaxentResolver(java.lang.String modelDirectory,
                      java.lang.String modelName,
                      ResolverMode mode,
                      int numberEntitiesBack,
                      NonReferentialResolver nonReferentialResolver)
               throws java.io.IOException
Throws:
java.io.IOException

MaxentResolver

public MaxentResolver(java.lang.String modelDirectory,
                      java.lang.String modelName,
                      ResolverMode mode,
                      int numberEntitiesBack,
                      boolean preferFirstReferent)
               throws java.io.IOException
Throws:
java.io.IOException

MaxentResolver

public MaxentResolver(java.lang.String modelDirectory,
                      java.lang.String modelName,
                      ResolverMode mode,
                      int numberEntitiesBack,
                      boolean preferFirstReferent,
                      double nonReferentialProbability)
               throws java.io.IOException
Throws:
java.io.IOException
Method Detail

loadAsResource

public static void loadAsResource(boolean lar)
Specifies whether the models should be loaded from a resource.

Parameters:
lar - boolean which if true indicates that the model should be loaded as a resource.

loadAsResource

public static boolean loadAsResource()
Returns whether the models should be loaded from a file or from a resource.

Returns:
whether the models should be loaded from a file or from a resource.

resolve

public DiscourseEntity resolve(MentionContext ec,
                               DiscourseModel dm)
Description copied from interface: Resolver
Resolves this referring expression to a discourse entity in the discourse model.

Parameters:
ec - the referring expression.
dm - the discourse model.
Returns:
the discourse entity which the resolver believes this referring expression refers to, or null if no discourse entity is coreferent with the referring expression.

defaultReferent

protected boolean defaultReferent(DiscourseEntity de)
Returns whether the specified entity satisfies the criteria for being a default referent. These criteria are used to perform sample selection on the training data and to select a single non-referent entity. Typically the criteria are a heuristic for a likely referent.

Parameters:
de - The discourse entity being considered for non-reference.
Returns:
True if the entity should be used as a default referent, false otherwise.

retain

public DiscourseEntity retain(MentionContext mention,
                              DiscourseModel dm)
Description copied from interface: Resolver
Uses the specified mention and discourse model to train this resolver. All mentions sent to this method need to have their id fields set to indicate coreference relationships.

Specified by:
retain in interface Resolver
Overrides:
retain in class AbstractResolver
Parameters:
mention - The mention which is being used for training.
dm - the discourse model.
Returns:
the discourse entity which is referred to by the referring expression, or null if no discourse entity is referenced.

getMentionCountFeature

protected java.lang.String getMentionCountFeature(DiscourseEntity de)

getFeatures

protected java.util.List getFeatures(MentionContext mention,
                                     DiscourseEntity entity)
Returns a list of features for deciding whether the specified mention refers to the specified discourse entity.

Parameters:
mention - The mention being considered as possibly referential.
entity - The discourse entity with which the mention is being considered referential.
Returns:
a list of features used to predict reference between the specified mention and entity.

train

public void train()
           throws java.io.IOException
Description copied from interface: Resolver
Retrains model on examples for which retain was called.

Specified by:
train in interface Resolver
Overrides:
train in class AbstractResolver
Throws:
java.io.IOException

setSimilarityModel

public static void setSimilarityModel(TestSimilarityModel sm)

getContextFeatures

public static java.util.List getContextFeatures(MentionContext mention)
Returns a list of features based on the surrounding context of the specified mention.

Parameters:
mention - The mention whose surrounding context the features model.
Returns:
a list of features based on the surrounding context of the specified mention

definiteArticle

protected boolean definiteArticle(java.lang.String tok,
                                  java.lang.String tag)
Returns whether the specified token is a definite article.

Parameters:
tok - The token.
tag - The pos-tag for the specified token.
Returns:
whether the specified token is a definite article.
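
The set of tokens and tags the library actually treats as definite articles is not listed on this page. The sketch below shows one plausible heuristic only, assuming Penn Treebank part-of-speech tags (where DT marks determiners); the class DefiniteArticleSketch is hypothetical and is not part of OpenNLP.

```java
// Hypothetical heuristic, not the OpenNLP implementation: treats a token
// as a definite article when it is "the" (any case) tagged as a determiner.
final class DefiniteArticleSketch {

    /**
     * @param tok the token text
     * @param tag the part-of-speech tag (Penn Treebank assumed)
     * @return true if the token looks like a definite article
     */
    static boolean definiteArticle(String tok, String tag) {
        return "DT".equals(tag) && "the".equalsIgnoreCase(tok);
    }
}
```

A subclass could broaden or narrow such a check, for example to handle demonstratives, which is a design decision this sketch leaves open.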

excluded

protected boolean excluded(MentionContext ec,
                           DiscourseEntity de)
Description copied from class: AbstractResolver
Excludes entities which are not compatible with the entity under consideration. The default implementation excludes entities whose last extent contains the extent under consideration. This prevents possessive pronouns from referring to the noun phrases they modify, among other undesirable outcomes.

Overrides:
excluded in class AbstractResolver
Parameters:
ec - The mention which is being considered as referential.
de - The entity to which the mention is to be resolved.
Returns:
true if the entity should be excluded, false otherwise.
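
The containment rule described for the default implementation can be sketched with plain character offsets. This is an illustration under stated assumptions, not the library's code: the class ExcludedSketch and the half-open [start, end) offsets are hypothetical, whereas the real method works on MentionContext and DiscourseEntity objects.

```java
// Hypothetical sketch of the containment-based exclusion rule.
// Extents are modelled as half-open character ranges [start, end).
final class ExcludedSketch {

    /** Returns true if [outerStart, outerEnd) contains [innerStart, innerEnd). */
    static boolean contains(int outerStart, int outerEnd,
                            int innerStart, int innerEnd) {
        return outerStart <= innerStart && innerEnd <= outerEnd;
    }

    /**
     * Excludes an entity when its last extent contains the mention's extent,
     * e.g. the entity "his hat" must not be the referent of "his".
     */
    static boolean excluded(int entityLastStart, int entityLastEnd,
                            int mentionStart, int mentionEnd) {
        return contains(entityLastStart, entityLastEnd, mentionStart, mentionEnd);
    }
}
```

For the text "his hat", the noun phrase spans offsets 0 to 7 and the pronoun "his" spans 0 to 3, so the entity for "his hat" is excluded as a referent of "his".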

getDistanceFeatures

protected java.util.List getDistanceFeatures(MentionContext mention,
                                             DiscourseEntity entity)
Returns distance features for the specified mention and entity.

Parameters:
mention - The mention.
entity - The entity.
Returns:
list of distance features for the specified mention and entity.

getPronounMatchFeatures

protected java.util.List getPronounMatchFeatures(MentionContext mention,
                                                 DiscourseEntity entity)
Returns features indicating whether the specified mention is compatible with the pronouns of the specified entity.

Parameters:
mention - The mention.
entity - The entity.
Returns:
list of features indicating whether the specified mention is compatible with the pronouns of the specified entity.

getStringMatchFeatures

protected java.util.List getStringMatchFeatures(MentionContext mention,
                                                DiscourseEntity entity)
Returns string-match features for the specified mention and entity.

Parameters:
mention - The mention.
entity - The entity.
Returns:
list of string-match features for the specified mention and entity.

getWordFeatures

public static java.util.List getWordFeatures(Parse token)
Returns a list of word features for the specified token.

Parameters:
token - The token for which features are to be computed.
Returns:
a list of word features for the specified token.


Copyright 2008 Jason Baldridge, Gann Bierner, and Thomas Morton. All Rights Reserved.