edu.iastate.jrelm.rl
Class SimpleStatelessLearner<O>

java.lang.Object
  extended by edu.iastate.jrelm.rl.SimpleStatelessLearner<O>
Type Parameters:
O - The type of object used to specify actions. This defaults to the most general Object type.
All Implemented Interfaces:
ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>

public class SimpleStatelessLearner<O>
extends java.lang.Object
implements ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>

The SimpleStatelessLearner packages together all the core learning components and a few pre-implemented reinforcement learning algorithms. It is meant to provide an easy way to drop reinforcement learning capabilities into an agent.

SimpleStatelessLearner is initialized with a collection of objects that represent or specify actions that an agent can take. It makes no difference what specific types are in the collection. The SimpleStatelessLearner just chooses objects from the collection and learns which ones are best to pick based on given feedback.

The desired reinforcement learning algorithm is specified by the type of parameters SimpleStatelessLearner is constructed with. For example, passing VREParameters to the constructor will build a SimpleStatelessLearner that uses a modified version of the Roth-Erev reinforcement learning algorithm. Once the learning algorithm is set through the constructor, it may not be changed for the life of the SimpleStatelessLearner object, though new parameters or policies may be given.

The chooseActionRaw() method is used to elicit a choice of action from SimpleStatelessLearner. It returns an object from the original collection given to the learner. chooseAction() returns the same choice wrapped in a SimpleAction. Alternatively, chooseActionIndex() elicits a choice in the form of an int, which is the index of the chosen action in the original collection.
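
The choose-then-feedback cycle described above can be sketched without the library. The following is a minimal, self-contained illustration, not JReLM code: StandInLearner, its seed argument, propensityOf(), and the simple "add the reward to the chosen action's weight" rule are all assumptions made for this example. Only the method names chooseActionIndex(), chooseActionRaw() and update() mirror this page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a stateless learner; NOT the JReLM class. */
class StandInLearner<O> {
    private final List<O> actions;      // fixed copy of the action collection
    private final double[] propensity;  // one non-negative weight per action
    private final Random rng;           // seeded so runs are repeatable
    private int lastIndex = -1;         // index of the most recent choice

    StandInLearner(Collection<O> actionList, long seed) {
        this.actions = new ArrayList<>(actionList);
        this.propensity = new double[actions.size()];
        Arrays.fill(propensity, 1.0);
        this.rng = new Random(seed);
    }

    /** Picks an index with probability proportional to its weight. */
    int chooseActionIndex() {
        double total = 0.0;
        for (double p : propensity) total += p;
        double draw = rng.nextDouble() * total;
        int chosen = propensity.length - 1; // fallback guards against rounding
        double running = 0.0;
        for (int i = 0; i < propensity.length; i++) {
            running += propensity[i];
            if (draw < running) { chosen = i; break; }
        }
        lastIndex = chosen;
        return chosen;
    }

    /** Returns the user's own object for the chosen index. */
    O chooseActionRaw() { return actions.get(chooseActionIndex()); }

    /** Assumed rule for this sketch: reinforce only the last chosen action. */
    void update(double reward) {
        if (lastIndex < 0) throw new IllegalStateException("no action chosen yet");
        propensity[lastIndex] += reward;
    }

    double propensityOf(int index) { return propensity[index]; }
}

class StatelessLoopSketch {
    /** Runs the choose/update loop; only "hold" is ever rewarded. */
    static StandInLearner<String> train(int rounds) {
        StandInLearner<String> learner =
            new StandInLearner<>(Arrays.asList("buy", "sell", "hold"), 42L);
        for (int t = 0; t < rounds; t++) {
            String action = learner.chooseActionRaw();
            double reward = action.equals("hold") ? 1.0 : 0.0;
            learner.update(reward);
        }
        return learner;
    }

    public static void main(String[] args) {
        StandInLearner<String> learner = train(200);
        for (int i = 0; i < 3; i++) {
            System.out.println(i + " -> " + learner.propensityOf(i));
        }
    }
}
```

With the real class, the same loop shape would presumably apply: construct the learner once, then alternate a choose call with a single update call carrying the resulting reward.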

Author:
Charles Gieseler

Constructor Summary
SimpleStatelessLearner(REParameters params, java.util.Collection<O> actionList)
          Build a SimpleStatelessLearner that uses the Roth-Erev algorithm.
SimpleStatelessLearner(REParameters params, REPolicy<java.lang.Integer,SimpleAction<O>> aPolicy)
          Build a SimpleStatelessLearner that uses the Roth-Erev algorithm with the given REPolicy.
SimpleStatelessLearner(REParameters params, SimpleActionDomain<O> aDomain)
          Build a SimpleStatelessLearner that uses the Roth-Erev algorithm with the given SimpleActionDomain.
SimpleStatelessLearner(VREParameters params, java.util.Collection<O> actionList)
          Build a SimpleStatelessLearner that uses a modified version of the Roth-Erev algorithm.
SimpleStatelessLearner(VREParameters params, REPolicy<java.lang.Integer,SimpleAction<O>> aPolicy)
          Build a SimpleStatelessLearner that uses a modified version of the Roth-Erev algorithm with the given REPolicy.
SimpleStatelessLearner(VREParameters params, SimpleActionDomain<O> aDomain)
          Build a SimpleStatelessLearner that uses a modified version of the Roth-Erev algorithm with the given SimpleActionDomain.
 
Method Summary
 SimpleAction<O> chooseAction()
          Elicits a choice of action.
 int chooseActionIndex()
          Elicits a choice of action.
 O chooseActionRaw()
          Elicits a choice of action.
 ReinforcementLearner getEngine()
           
 java.lang.String getName()
          Retrieves the name of the learning algorithm this learner implements.
 RLParameters getParameters()
          Returns the learning parameters currently in use.
 StatelessPolicy<java.lang.Integer,SimpleAction<O>> getPolicy()
          Retrieve the StatelessPolicy being used to represent learned knowledge.
 RLParameters makeParameters()
          Creates an RLParameters object of the same type as given to the SimpleStatelessLearner constructor.
 void setParameters(RLParameters learnParams)
          Set the learning parameters to use.
 void setPolicy(StatelessPolicy<java.lang.Integer,SimpleAction<O>> newPolicy)
          Set the policy for this SimpleStatelessLearner.
 void update(Feedback<java.lang.Double> reward)
          Give the learner feedback resulting from its last choice of action.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SimpleStatelessLearner

public SimpleStatelessLearner(REParameters params,
                              java.util.Collection<O> actionList)
Build a SimpleStatelessLearner that uses the Roth-Erev algorithm.

Parameters:
params - parameters for the Roth-Erev learning algorithm
actionList - collection of action choices
See Also:
RELearner, REParameters

SimpleStatelessLearner

public SimpleStatelessLearner(REParameters params,
                              SimpleActionDomain<O> aDomain)
Build a SimpleStatelessLearner that uses the Roth-Erev algorithm with the given SimpleActionDomain.

Parameters:
params - parameters for the Roth-Erev learning algorithm
aDomain - the SimpleActionDomain containing the action choices

SimpleStatelessLearner

public SimpleStatelessLearner(REParameters params,
                              REPolicy<java.lang.Integer,SimpleAction<O>> aPolicy)
Build a SimpleStatelessLearner that uses the Roth-Erev algorithm with the given REPolicy. Learning will proceed over the domain included in the REPolicy. This can be used to start a learner with known or previously learned knowledge.

Parameters:
params - parameters for the Roth-Erev learning algorithm
aPolicy - an existing policy containing an ActionDomain and possibly previously learned knowledge

SimpleStatelessLearner

public SimpleStatelessLearner(VREParameters params,
                              java.util.Collection<O> actionList)
Build a SimpleStatelessLearner that uses a modified version of the Roth-Erev algorithm.

Parameters:
params - parameters for the modified Roth-Erev learning algorithm
actionList - collection of action choices
See Also:
VRELearner, VREParameters

SimpleStatelessLearner

public SimpleStatelessLearner(VREParameters params,
                              SimpleActionDomain<O> aDomain)
Build a SimpleStatelessLearner that uses a modified version of the Roth-Erev algorithm with the given SimpleActionDomain.

Parameters:
params - parameters for the modified Roth-Erev learning algorithm
aDomain - the SimpleActionDomain containing the action choices
See Also:
VRELearner, VREParameters

SimpleStatelessLearner

public SimpleStatelessLearner(VREParameters params,
                              REPolicy<java.lang.Integer,SimpleAction<O>> aPolicy)
Build a SimpleStatelessLearner that uses a modified version of the Roth-Erev algorithm with the given REPolicy. Learning will proceed over the domain included in the REPolicy. This can be used to start a learner with known or previously learned knowledge.

Parameters:
params - parameters for the modified Roth-Erev learning algorithm
aPolicy - an existing policy containing an ActionDomain and possibly previously learned knowledge
See Also:
VRELearner, VREParameters
Method Detail

chooseAction

public SimpleAction<O> chooseAction()
Elicits a choice of action. The returned choice is a SimpleAction object which is wrapped around an original Object from the Collection given to the learner during its construction. This choice is made according to the reinforcement learning algorithm specified in the constructor.

Specified by:
chooseAction in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Returns:
a SimpleAction wrapping an object from the original action collection
See Also:
Action

chooseActionRaw

public O chooseActionRaw()
Elicits a choice of action. The returned choice is an Object from the original Collection given to the learner during its construction. The action choice is referred to as "Raw" since it is not wrapped in a SimpleAction. This choice is made according to the reinforcement learning algorithm specified in the constructor.

Returns:
an object from the original action collection

chooseActionIndex

public int chooseActionIndex()
Elicits a choice of action. The returned choice is the index of an Object from the original collection given to the learner during its construction. This choice is made according to the reinforcement learning algorithm specified in the constructor.

Returns:
the index of an object from the original action collection
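
The three choose methods return the same decision in three forms: a wrapped action, the raw object, and an index. A minimal sketch of how those views can relate is given below. WrappedChoice and wrapAll() are hypothetical names for this illustration and are not part of JReLM; the sketch also assumes that an action's index is its position in the collection's iteration order, which this page does not state explicitly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical wrapper pairing an index with the user's object; not JReLM's SimpleAction. */
class WrappedChoice<O> {
    final int index; // position in the original collection (assumed: iteration order)
    final O raw;     // the object the caller supplied

    WrappedChoice(int index, O raw) {
        this.index = index;
        this.raw = raw;
    }
}

class ChoiceViewsSketch {
    /** Walks the collection once so that each action receives a stable index. */
    static <O> List<WrappedChoice<O>> wrapAll(Collection<O> actionList) {
        List<WrappedChoice<O>> wrapped = new ArrayList<>();
        int i = 0;
        for (O item : actionList) {
            wrapped.add(new WrappedChoice<>(i++, item));
        }
        return wrapped;
    }

    public static void main(String[] args) {
        List<WrappedChoice<String>> domain =
            wrapAll(Arrays.asList("low", "medium", "high"));
        int chosenIndex = 1;                                   // the "index" view
        WrappedChoice<String> wrappedView = domain.get(chosenIndex); // the "wrapped" view
        String rawView = wrappedView.raw;                      // the "raw" view
        System.out.println(chosenIndex + " / " + rawView);
    }
}
```

If the action collection has no defined iteration order (for example a HashSet), an index is only meaningful relative to whatever ordering the learner fixed at construction, so an ordered collection is the safer input when indices are used.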

update

public void update(Feedback<java.lang.Double> reward)
Give the learner feedback resulting from its last choice of action. This is the "reward" value for a reinforcement learning algorithm. SimpleStatelessLearner requires feedback in the form of a Double value.

Specified by:
update in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Parameters:
reward - "reward" resulting from the learner's last choice of action.
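
For orientation, the commonly cited basic Roth-Erev rule updates a propensity q[j] for each of the N actions after action k receives reward r: q[j] becomes (1 - recency) * q[j] plus an experience term, which is r * (1 - experimentation) for the chosen action and r * experimentation / (N - 1) for every other action. The sketch below implements that textbook form only. The class name, method name, and parameter names are assumptions for this example, and it is not taken from the JReLM source, so the library's RELearner and the modified variant selected by VREParameters may differ in detail.

```java
/** Sketch of the textbook basic Roth-Erev propensity update; not JReLM code. */
class RothErevUpdateSketch {
    /**
     * Returns new propensities after the action at index {@code chosen}
     * received {@code reward}. The input array is left unchanged.
     */
    static double[] update(double[] q, int chosen, double reward,
                           double recency, double experimentation) {
        int n = q.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two actions");
        }
        double[] next = new double[n];
        for (int j = 0; j < n; j++) {
            // The chosen action keeps most of the reward; the rest is spread evenly.
            double experience = (j == chosen)
                ? reward * (1.0 - experimentation)
                : reward * experimentation / (n - 1);
            // Old propensities decay by the recency (forgetting) factor.
            next[j] = (1.0 - recency) * q[j] + experience;
        }
        return next;
    }

    public static void main(String[] args) {
        double[] q = {1.0, 1.0, 1.0};
        double[] next = update(q, 0, 10.0, 0.1, 0.2);
        for (double v : next) {
            System.out.println(v);
        }
    }
}
```

In the example call, the chosen action ends near 0.9 + 8.0 and each other action near 0.9 + 1.0, which shows how a single reward shifts probability mass toward the action that earned it.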

getEngine

public ReinforcementLearner getEngine()

getPolicy

public StatelessPolicy<java.lang.Integer,SimpleAction<O>> getPolicy()
Description copied from interface: ReinforcementLearner
Retrieve the StatelessPolicy being used to represent learned knowledge.

Specified by:
getPolicy in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Returns:
the Policy being used by this ReinforcementLearner. The policy can be any object implementing the StatelessPolicy interface.
See Also:
StatelessPolicy

setPolicy

public void setPolicy(StatelessPolicy<java.lang.Integer,SimpleAction<O>> newPolicy)
Set the policy for this SimpleStatelessLearner. Note: the given policy must be compatible with the ReinforcementLearner set through the constructor. For example, if SimpleStatelessLearner was given VREParameters through its constructor, then the internal learner is a VRELearner and the given policy must be of type MREPolicy.

Specified by:
setPolicy in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Parameters:
newPolicy - new policy to use
See Also:
StatelessPolicy

getParameters

public RLParameters getParameters()
Returns the learning parameters currently in use. Note: These will be of the same type as those given to SimpleStatelessLearner during construction.

Specified by:
getParameters in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Returns:
learning algorithm settings as RLParameters
See Also:
ReinforcementLearner.getParameters()

makeParameters

public RLParameters makeParameters()
Creates an RLParameters object of the same type as given to the SimpleStatelessLearner constructor. It is initialized with default values.

Specified by:
makeParameters in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Returns:
learning parameters compatible with this learner, initialized to default settings.
See Also:
ReinforcementLearner.makeParameters()

setParameters

public void setParameters(RLParameters learnParams)
Set the learning parameters to use. Note: These must be compatible with the type of ReinforcementLearner being used internally, as chosen when the SimpleStatelessLearner was created. In other words, the RLParameters given here must be of the same type as those given to the SimpleStatelessLearner constructor.

Specified by:
setParameters in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
See Also:
ReinforcementLearner.setParameters(RLParameters)
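
The rule that the algorithm is fixed by the constructor's parameter type, and that later parameters must match it, can be sketched as follows. All types here (ParamsSketch, BasicParamsSketch, VariantParamsSketch, FixedAlgorithmLearnerSketch) are hypothetical stand-ins rather than JReLM classes. Throwing IllegalArgumentException on a mismatch is an assumption of this sketch, since this page does not say how the real class reacts to incompatible parameters.

```java
/** Hypothetical parameter types; stand-ins for REParameters and VREParameters. */
interface ParamsSketch {}
class BasicParamsSketch implements ParamsSketch {}
class VariantParamsSketch implements ParamsSketch {}

/** Sketch of a learner whose algorithm is fixed by the constructor overload used. */
class FixedAlgorithmLearnerSketch {
    private final Class<? extends ParamsSketch> requiredType; // never changes
    private ParamsSketch params;

    FixedAlgorithmLearnerSketch(BasicParamsSketch p) {
        this.requiredType = BasicParamsSketch.class;
        this.params = p;
    }

    FixedAlgorithmLearnerSketch(VariantParamsSketch p) {
        this.requiredType = VariantParamsSketch.class;
        this.params = p;
    }

    /** Accepts only parameters of the type chosen at construction (assumed behavior). */
    void setParameters(ParamsSketch newParams) {
        if (!requiredType.isInstance(newParams)) {
            throw new IllegalArgumentException(
                "expected " + requiredType.getSimpleName());
        }
        this.params = newParams;
    }

    ParamsSketch getParameters() { return params; }

    public static void main(String[] args) {
        FixedAlgorithmLearnerSketch learner =
            new FixedAlgorithmLearnerSketch(new BasicParamsSketch());
        learner.setParameters(new BasicParamsSketch()); // same type: accepted
        try {
            learner.setParameters(new VariantParamsSketch()); // other type: rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

When working with the real class, obtaining a fresh parameters object from makeParameters() and adjusting it is one way to stay within the compatible type.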

getName

public java.lang.String getName()
Description copied from interface: ReinforcementLearner
Retrieves the name of the learning algorithm this learner implements.

Specified by:
getName in interface ReinforcementLearner<RLParameters,java.lang.Integer,SimpleAction<O>,Feedback<java.lang.Double>,StatelessPolicy<java.lang.Integer,SimpleAction<O>>>
Returns:
the algorithm name
See Also:
ReinforcementLearner.getName()