edu.iastate.jrelm.rl.rotherev.variant
Class VRELearner<I,A extends Action<I>>

java.lang.Object
  extended by edu.iastate.jrelm.rl.AbstractStatlessLearner<REParameters,I,A,Feedback<java.lang.Double>,REPolicy<I,A>>
      extended by edu.iastate.jrelm.rl.rotherev.RELearner<I,A>
          extended by edu.iastate.jrelm.rl.rotherev.variant.VRELearner<I,A>
All Implemented Interfaces:
ReinforcementLearner<REParameters,I,A,Feedback<java.lang.Double>,REPolicy<I,A>>

public class VRELearner<I,A extends Action<I>>
extends RELearner<I,A>

Variant Roth-Erev Learner

This ReinforcementLearner implements a variation of the Roth-Erev algorithm as presented in
John Nicolaisen, Valentin Petrov, and Leigh Tesfatsion, "Market Power and
Efficiency in a Computational Electricity Market with Discriminatory
Double-Auction Pricing," IEEE Transactions on Evolutionary Computation,
Volume 5, Number 5, 2001, 504-523.

See VRELearner.experience(int, double) for more details.

In addition, this variation allows updating with given reward-action pairs. See VRELearner.update(Feedback, Action).

See RELearner for details on the original Roth-Erev algorithm.
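
A typical interaction with this learner alternates between choosing an action and feeding back the resulting reward. The sketch below illustrates that cycle; IntegerActionDomain and payoffFor are hypothetical stand-ins for an application-specific domain and reward signal, the VREParameters no-argument constructor is an assumption, and the enclosing method is assumed to declare throws Exception. Only the VRELearner calls mirror the API documented here.

    VREParameters params = new VREParameters();                // assumed default constructor
    ActionDomain<Integer, Action<Integer>> domain =
            new IntegerActionDomain(10);                       // hypothetical 10-action domain
    VRELearner<Integer, Action<Integer>> learner =
            new VRELearner<Integer, Action<Integer>>(params, domain);

    for (int round = 0; round < 100; round++) {
        Action<Integer> choice = learner.chooseAction();       // sample from the current policy
        double reward = payoffFor(choice);                     // application-specific payoff
        learner.update(reward, choice);                        // reinforce the chosen action
    }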

Author:
Charles Gieseler

Field Summary
 
Fields inherited from class edu.iastate.jrelm.rl.rotherev.RELearner
actionIDList, domainSize
 
Constructor Summary
VRELearner(VREParameters learningParams, ActionDomain<I,A> aDomain)
          Construct a learner using Variant Roth-Erev with parameters specified in a VREParameters object.
VRELearner(VREParameters learningParams, REPolicy<I,A> aPolicy)
          Construct a Variant Roth-Erev learning component with parameters specified in a VREParameters object and the given policy.
 
Method Summary
protected  double experience(int actionIndex, double reward)
          Calculates the response (experience) value for the given action under the Variant Roth-Erev algorithm.
protected  void generateBoltzmanProbs()
          Generates action choice probabilities from the current propensities using a Gibbs-Boltzmann distribution.
 java.lang.String getName()
          Retrieves the name of the learning algorithm this learner implements.
 VREParameters getParameters()
          Returns the parameters currently being used by this RELearner.
 VREParameters makeParameters()
          Create a default set of parameters that can be used with this learner.
 void update(double feedback, A actionToReinforce)
          Convenience version of the update(Feedback, Action) method.
 void update(Feedback<java.lang.Double> feedback, A actionToReinforce)
          Update the Policy according to the Variant Roth-Erev algorithm, but associate the given feedback with the given Action.
protected  void updateProbabilities()
          Updates the probability for each action to be chosen in the policy.
 
Methods inherited from class edu.iastate.jrelm.rl.rotherev.RELearner
getInitialPropensity, getPolicy, init, reset, setInitialPropensityValue, setPolicy, update, update, updatePropensities
 
Methods inherited from class edu.iastate.jrelm.rl.AbstractStatlessLearner
chooseAction, getLastRandSeed, getLastSelectedAction, getUpdateCount, incrementUpdateCount, resetUpdateCount, setLastRandSeed, setLastSelectedAction, setParameters, setUpdateCount
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

VRELearner

public VRELearner(VREParameters learningParams,
                  ActionDomain<I,A> aDomain)
Construct a learner using Variant Roth-Erev with parameters specified in a VREParameters object.

Parameters:
learningParams - the collection of parameter settings for the VRELearner
aDomain - the ActionDomain to learn over

VRELearner

public VRELearner(VREParameters learningParams,
                  REPolicy<I,A> aPolicy)
Construct a Variant Roth-Erev learning component with parameters specified in a VREParameters object and the given policy. Note: any random seed that is already set in the given policy will be overwritten with the seed in the given parameters.

Parameters:
learningParams - the collection of parameter settings for the VRELearner
aPolicy - the REPolicy for this learner to use
Method Detail

update

public void update(Feedback<java.lang.Double> feedback,
                   A actionToReinforce)
            throws java.lang.Exception
Update the Policy according to the Variant Roth-Erev algorithm, but associate the given feedback with the given Action. Learning proceeds in accordance with the Variant Roth-Erev algorithm; however, the given Action is treated as if it had been selected by the last call to chooseAction().

Note that the actual record of the last Action chosen is not changed. Thus a call to getLastSelectedAction() will yield the same Action before and after a call to this update method.

This update method will throw an Exception if this learner does not know about the given Action, that is, if the given Action is not in the ActionDomain that this learner is using.

Parameters:
feedback - reward for the specified action, given as a Feedback object wrapping a Double
actionToReinforce - the update will proceed by associating the given feedback with this Action
Throws:
java.lang.Exception

update

public void update(double feedback,
                   A actionToReinforce)
            throws java.lang.Exception
Convenience version of the update(Feedback, Action) method.

Parameters:
feedback - reward for the specified action, given as a primitive double
actionToReinforce - the update will proceed by associating the given feedback with this Action
Throws:
java.lang.Exception
See Also:
update(Feedback, Action)
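
For example, this overload can credit a reward observed for an action other than the learner's own last choice. In the sketch below, the getAction lookup is hypothetical, the reward value is illustrative, and the enclosing method is assumed to declare throws Exception:

    Action<Integer> observed = domain.getAction(3);   // hypothetical lookup by action ID
    learner.update(0.75, observed);                   // throws if the Action is not in the domain
    // getLastSelectedAction() still returns the learner's own last choice.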

updateProbabilities

protected void updateProbabilities()
Description copied from class: RELearner
Updates the probability for each action to be chosen in the policy. Uses proportional probabilities unless the given parameters specify a Gibbs-Boltzmann distribution.

Overrides:
updateProbabilities in class RELearner<I,A extends Action<I>>
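
In the proportional case, each action's choice probability is its propensity's share of the propensity total. The standalone sketch below shows the math only, not the class's internal implementation:

    // Proportional rule: p_j = q_j / (q_1 + ... + q_n),
    // where q_j is the current propensity of action j.
    static double[] proportionalProbs(double[] propensity) {
        double total = 0.0;
        for (double q : propensity) {
            total += q;
        }
        double[] prob = new double[propensity.length];
        for (int j = 0; j < propensity.length; j++) {
            prob[j] = propensity[j] / total;
        }
        return prob;
    }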

generateBoltzmanProbs

protected void generateBoltzmanProbs()
Overrides:
generateBoltzmanProbs in class RELearner<I,A extends Action<I>>
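
Under the Gibbs-Boltzmann alternative, probabilities are weighted exponentially by propensity, scaled by a temperature parameter. Again, a standalone sketch of the math rather than the internal implementation:

    // Gibbs-Boltzmann rule: p_j = exp(q_j / T) / sum_k exp(q_k / T),
    // where T is the Boltzmann temperature (cooling) parameter.
    static double[] boltzmannProbs(double[] propensity, double temperature) {
        double total = 0.0;
        double[] weight = new double[propensity.length];
        for (int j = 0; j < propensity.length; j++) {
            weight[j] = Math.exp(propensity[j] / temperature);
            total += weight[j];
        }
        for (int j = 0; j < propensity.length; j++) {
            weight[j] /= total;   // normalize weights into probabilities
        }
        return weight;
    }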

experience

protected double experience(int actionIndex,
                            double reward)
Overrides:
experience in class RELearner<I,A extends Action<I>>
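
Per the Nicolaisen, Petrov, and Tesfatsion paper cited above, the variant keeps the original (1 - e) weighting for the chosen action but weights every other action by its current propensity rather than by the reward, so propensities keep evolving even when rewards are zero or negative. The standalone sketch below illustrates that response function; the parameter names are illustrative:

    // Variant Roth-Erev response, for experimentation parameter e and n actions:
    //   chosen action k:      E = reward * (1 - e)
    //   other actions j != k: E = q_j * e / (n - 1)
    static double vreExperience(int actionIndex, int chosenIndex,
                                double reward, double e, double[] propensity) {
        int n = propensity.length;
        if (actionIndex == chosenIndex) {
            return reward * (1.0 - e);
        }
        return propensity[actionIndex] * (e / (n - 1));
    }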

getName

public java.lang.String getName()
Description copied from interface: ReinforcementLearner
Retrieves the name of the learning algorithm this learner implements.

Specified by:
getName in interface ReinforcementLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Overrides:
getName in class RELearner<I,A extends Action<I>>
Returns:
the algorithm name

getParameters

public VREParameters getParameters()
Returns the parameters currently being used by this learner. The returned parameters are of type VREParameters.

Specified by:
getParameters in interface ReinforcementLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Overrides:
getParameters in class RELearner<I,A extends Action<I>>
Returns:
the VREParameters for this learner
See Also:
ReinforcementLearner.getParameters()

makeParameters

public VREParameters makeParameters()
Description copied from interface: ReinforcementLearner
Create a default set of parameters that can be used with this learner.

Specified by:
makeParameters in interface ReinforcementLearner<REParameters,I,A extends Action<I>,Feedback<java.lang.Double>,REPolicy<I,A extends Action<I>>>
Overrides:
makeParameters in class RELearner<I,A extends Action<I>>
Returns:
learning parameters compatible with this learner, initialized to default settings.
See Also:
ReinforcementLearner.makeParameters()
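
For instance, the defaults could seed a second learner over the same domain (a sketch; the learner and domain variables are assumed to exist already):

    VREParameters defaults = learner.makeParameters();   // defaults compatible with this learner
    VRELearner<Integer, Action<Integer>> sibling =
            new VRELearner<Integer, Action<Integer>>(defaults, domain);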