edu.iastate.jrelm.rl.rotherev
Class RELearner<I,A extends Action<I>>

java.lang.Object
  extended by edu.iastate.jrelm.rl.AbstractStatlessLearner<REParameters,I,A,Feedback<java.lang.Double>,REPolicy<I,A>>
      extended by edu.iastate.jrelm.rl.rotherev.RELearner<I,A>
All Implemented Interfaces:
ReinforcementLearner<REParameters,I,A,Feedback<java.lang.Double>,REPolicy<I,A>>
Direct Known Subclasses:
VRELearner

public class RELearner<I,A extends Action<I>>
extends AbstractStatlessLearner<REParameters,I,A,Feedback<java.lang.Double>,REPolicy<I,A>>

Roth-Erev Learner

The original Roth-Erev reinforcement learning algorithm was presented by A. Roth and I. Erev in
"Learning in Extensive-Form Games: Experimental Data and Simple Dynamic
Models in the Intermediate Term," Games and Economic Behavior,
Special Issue: Nobel Symposium, vol. 8, January 1995, 164-212
and
"Predicting How People Play Games with Unique Mixed-Strategy Equilibria,"
American Economic Review, vol. 88, 1998, 848-881.

This ReinforcementLearner implements the later of these two versions of the algorithm. The implementation is adapted, in part, from the RothErevLearner in the Java Auction Simulator API (JASA) by Steve Phelps, Department of Computer Science, University of Liverpool.
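The core of the modified Roth-Erev update can be sketched in plain Java. This is a minimal standalone illustration, not JReLM source: the parameter names (recency, i.e. forgetting, and experimentation) follow the Roth-Erev papers, and the actual REParameters accessors may differ.

```java
// Standalone sketch of the modified Roth-Erev update rule (not JReLM code).
public class RothErevSketch {

    /** Spreads reward across actions: the chosen action receives most of it,
     *  while the remaining actions share a small "experimentation" fraction. */
    static double experience(int index, int chosenIndex, double reward,
                             double experimentation, int domainSize) {
        if (index == chosenIndex) {
            return reward * (1.0 - experimentation);
        }
        return reward * (experimentation / (domainSize - 1));
    }

    /** One learning step: decay each propensity by the recency (forgetting)
     *  rate, then add that action's share of the reward. */
    static void updatePropensities(double[] propensities, int chosenIndex,
                                   double reward, double recency,
                                   double experimentation) {
        int n = propensities.length;
        for (int i = 0; i < n; i++) {
            propensities[i] = (1.0 - recency) * propensities[i]
                    + experience(i, chosenIndex, reward, experimentation, n);
        }
    }

    /** Proportional choice probabilities: p_i = q_i / sum_j q_j. */
    static double[] proportionalProbabilities(double[] propensities) {
        double sum = 0.0;
        for (double q : propensities) sum += q;
        double[] probs = new double[propensities.length];
        for (int i = 0; i < propensities.length; i++) {
            probs[i] = propensities[i] / sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 1.0, 1.0};             // uniform initial propensities
        updatePropensities(q, 0, 10.0, 0.1, 0.2); // action 0 earned reward 10
        double[] p = proportionalProbabilities(q);
        System.out.printf("p = [%.3f, %.3f, %.3f]%n", p[0], p[1], p[2]);
        // Action 0 is now the most likely choice.
    }
}
```

The decay-then-reinforce step corresponds to what `update(Feedback)` does through `updatePropensities` and `updateProbabilities` below.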

Author:
Charles Gieseler

Field Summary
protected  java.util.ArrayList<I> actionIDList
           
protected  int domainSize
           
 
Constructor Summary
RELearner(REParameters learningParams, ActionDomain<I,A> aDomain)
          Construct a RothErev learning component with parameters specified in an REParameters object.
RELearner(REParameters learningParams, REPolicy<I,A> aPolicy)
          Construct a RothErev learning component with parameters specified in an REParameters object and the given policy.
 
Method Summary
protected  double experience(int actionIndex, double reward)
           
protected  void generateBoltzmanProbs()
           
 double getInitialPropensity()
          Retrieve the initial propensity value.
 java.lang.String getName()
          Retrieves the name of the learning algorithm this learner implements.
 REParameters getParameters()
          Returns the parameters currently being used by this RELearner.
 REPolicy<I,A> getPolicy()
          Retrieve the StatelessPolicy being used to represent learned knowledge.
protected  void init()
           
 REParameters makeParameters()
          Create a default set of parameters that can be used with this learner.
 void reset()
          Clear all learned knowledge.
 void setInitialPropensityValue(double initProp)
          Set the initial propensity value.
 void setPolicy(REPolicy<I,A> newPolicy)
          Set the StatelessPolicy to be used to represent learned knowledge.
 void update(double feedback)
          Convenience version of the update(Feedback<java.lang.Double>) method.
 void update(Feedback<java.lang.Double> feedback)
          This activates the learning process according to the modified Roth-Erev learning algorithm.
protected  void updateProbabilities()
          Updates the probability for each action to be chosen in the policy.
protected  void updatePropensities(double reward)
           
 
Methods inherited from class edu.iastate.jrelm.rl.AbstractStatlessLearner
chooseAction, getLastRandSeed, getLastSelectedAction, getUpdateCount, incrementUpdateCount, resetUpdateCount, setLastRandSeed, setLastSelectedAction, setParameters, setUpdateCount
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

domainSize

protected int domainSize

actionIDList

protected java.util.ArrayList<I> actionIDList
Constructor Detail

RELearner

public RELearner(REParameters learningParams,
                 ActionDomain<I,A> aDomain)
Construct a RothErev learning component with parameters specified in an REParameters object.

Parameters:
learningParams - the collection of parameter settings for the RELearner
aDomain - the ActionDomain to learn over

RELearner

public RELearner(REParameters learningParams,
                 REPolicy<I,A> aPolicy)
Construct a RothErev learning component with parameters specified in an REParameters object and the given policy. Note: any random seed already set in the given policy will be overwritten with the seed in the given parameters.

Parameters:
learningParams - the collection of parameter settings for the RELearner
aPolicy - a Roth-Erev specific policy type (REPolicy).
Method Detail

init

protected void init()
Overrides:
init in class AbstractStatlessLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>

updatePropensities

protected void updatePropensities(double reward)

experience

protected double experience(int actionIndex,
                            double reward)

updateProbabilities

protected void updateProbabilities()
Updates the probability for each action to be chosen in the policy. Uses proportional probabilities unless the given parameters specify a Gibbs-Boltzmann distribution.
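A Gibbs-Boltzmann distribution maps propensities q_i to probabilities p_i = exp(q_i / T) / Σ_j exp(q_j / T), where T is a temperature (cooling) parameter. The following is a standalone sketch of that calculation, not the actual generateBoltzmanProbs implementation; the temperature parameter name is illustrative.

```java
// Standalone sketch of a Gibbs-Boltzmann action distribution (hypothetical
// helper, not JReLM code). High temperature flattens the distribution;
// low temperature concentrates probability on the highest propensity.
public class BoltzmannSketch {

    static double[] boltzmannProbabilities(double[] propensities, double temperature) {
        double[] probs = new double[propensities.length];
        double sum = 0.0;
        for (int i = 0; i < propensities.length; i++) {
            probs[i] = Math.exp(propensities[i] / temperature);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum; // normalize so the probabilities sum to 1
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] q = {2.0, 1.0, 1.0};
        double[] hot = boltzmannProbabilities(q, 10.0);  // nearly uniform
        double[] cold = boltzmannProbabilities(q, 0.1);  // nearly greedy
        System.out.printf("hot:  %.3f %.3f %.3f%n", hot[0], hot[1], hot[2]);
        System.out.printf("cold: %.3f %.3f %.3f%n", cold[0], cold[1], cold[2]);
    }
}
```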


generateBoltzmanProbs

protected void generateBoltzmanProbs()

update

public void update(Feedback<java.lang.Double> feedback)
This activates the learning process according to the modified Roth-Erev learning algorithm. Feedback is interpreted as reward for the last Action chosen by this engine. Entries in the policy associated with this Action are updated accordingly.

This algorithm expects feedback to be given as a Double.

Parameters:
feedback - reward for the specified action
See Also:
ReinforcementLearner

update

public void update(double feedback)
Convenience version of the update(Feedback<java.lang.Double>) method.

Parameters:
feedback - reward for the specified action, given as a primitive double.
See Also:
ReinforcementLearner

reset

public void reset()
Clear all learned knowledge. The Action propensities are reset to the current initial propensity value, and the probability values in the policy are recalculated accordingly.


getInitialPropensity

public double getInitialPropensity()
Retrieve the initial propensity value.

Returns:
the initial propensity value

setInitialPropensityValue

public void setInitialPropensityValue(double initProp)
Set the initial propensity value.

Parameters:
initProp -

getName

public java.lang.String getName()
Description copied from interface: ReinforcementLearner
Retrieves the name of the learning algorithm this learner implements.

Returns:
- the algorithm name

getParameters

public REParameters getParameters()
Returns the parameters currently being used by this RELearner. The returned parameters are of type REParameters.

Specified by:
getParameters in interface ReinforcementLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Overrides:
getParameters in class AbstractStatlessLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Returns:
the REParameters for this learner
See Also:
ReinforcementLearner.getParameters()

makeParameters

public REParameters makeParameters()
Description copied from interface: ReinforcementLearner
Create a default set of parameters that can be used with this learner.

Returns:
learning parameters compatible with this learner, initialized to default settings.
See Also:
ReinforcementLearner.makeParameters()

getPolicy

public REPolicy<I,A> getPolicy()
Description copied from interface: ReinforcementLearner
Retrieve the StatelessPolicy being used to represent learned knowledge.

Specified by:
getPolicy in interface ReinforcementLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Overrides:
getPolicy in class AbstractStatlessLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Returns:
the Policy being used by this ReinforcementLearner. The policy can be any object implementing the StatelessPolicy interface.
See Also:
ReinforcementLearner.getPolicy()

setPolicy

public void setPolicy(REPolicy<I,A> newPolicy)
Description copied from interface: ReinforcementLearner
Set the StatelessPolicy to be used to represent learned knowledge.

Specified by:
setPolicy in interface ReinforcementLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Overrides:
setPolicy in class AbstractStatlessLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
See Also:
edu.iastate.jrelm.rl.ReinforcementLearner#setPolicy(edu.iastate.jrelm.rl.StatelessPolicy)