edu.iastate.jrelm.rl
Interface ReinforcementLearner<PA extends RLParameters,I,A extends Action,F extends Feedback,PO extends Policy>

Type Parameters:
PA - the type of ReinforcementLearner parameters (RLParameters) this learner accepts. This will usually be a specific set of parameters needed for a particular learning algorithm.
I - the type of identifier being used to distinguish Actions
A - the type of Actions this learner is working with
F - the type of reinforcement object (Feedback) that this learner accepts
PO - the type of Policy that this learner updates and uses to make new Action selections
All Known Implementing Classes:
AbstractStatlessLearner, GBMLearner, RELearner, SimpleStatelessLearner, VRELearner

public interface ReinforcementLearner<PA extends RLParameters,I,A extends Action,F extends Feedback,PO extends Policy>

For classes that implement reinforcement learning algorithms. Classes implementing this interface are responsible for driving the learning process of specific algorithms. Reinforcement learning algorithms make use of a policy to represent learned knowledge. Policies themselves require access to the space of possible actions, represented by ActionDomains. As such, a ReinforcementLearner will make use of a StatelessPolicy and an ActionDomain.

Feedback is parameterized since required input will vary depending on the specific reinforcement learning algorithm and the particular simulation environment.
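
To make the five type parameters concrete, the following is a minimal sketch of how they might be bound by an implementing class. The empty RLParameters, Action, Feedback and Policy interfaces are stand-ins declared only so the sample compiles on its own; the real JReLM types presumably declare more. CoinAction, ScalarFeedback, StepParams, TablePolicy and AveragingLearner are hypothetical names, the running-average update is an illustrative rule rather than one of the library's algorithms, and binding I to String is an assumption.

```java
import java.util.EnumMap;
import java.util.Map;

final class TypeBindingSketch {
    // Empty stand-ins for the JReLM types (assumption: the real ones declare more).
    interface RLParameters {}
    interface Action {}
    interface Feedback {}
    interface Policy {}

    // Mirrors the method list documented on this page.
    interface ReinforcementLearner<PA extends RLParameters, I, A extends Action,
            F extends Feedback, PO extends Policy> {
        void update(F reward);
        A chooseAction();
        PA getParameters();
        PA makeParameters();
        void setParameters(PA newParams);
        PO getPolicy();
        void setPolicy(PO newPolicy);
        String getName();
    }

    // Hypothetical concrete types bound to A, F, PA and PO.
    enum CoinAction implements Action { HEADS, TAILS }

    static final class ScalarFeedback implements Feedback {
        final double value;
        ScalarFeedback(double value) { this.value = value; }
    }

    static final class StepParams implements RLParameters {
        double stepSize = 0.5;
    }

    static final class TablePolicy implements Policy {
        final Map<CoinAction, Double> estimates = new EnumMap<>(CoinAction.class);
        TablePolicy() {
            for (CoinAction a : CoinAction.values()) estimates.put(a, 0.0);
        }
        // Greedy selection rule: highest estimate wins, ties go to the first action.
        CoinAction best() {
            CoinAction best = CoinAction.HEADS;
            for (CoinAction a : CoinAction.values()) {
                if (estimates.get(a) > estimates.get(best)) best = a;
            }
            return best;
        }
    }

    // I is bound to String here: actions are identified by name (an assumption).
    static final class AveragingLearner implements
            ReinforcementLearner<StepParams, String, CoinAction, ScalarFeedback, TablePolicy> {
        private StepParams params = makeParameters();
        private TablePolicy policy = new TablePolicy();
        private CoinAction last = CoinAction.HEADS;

        // Moves the estimate of the last chosen action toward the feedback value.
        public void update(ScalarFeedback reward) {
            double old = policy.estimates.get(last);
            policy.estimates.put(last, old + params.stepSize * (reward.value - old));
        }
        public CoinAction chooseAction() { last = policy.best(); return last; }
        public StepParams getParameters() { return params; }
        public StepParams makeParameters() { return new StepParams(); }
        public void setParameters(StepParams newParams) { params = newParams; }
        public TablePolicy getPolicy() { return policy; }
        public void setPolicy(TablePolicy newPolicy) { policy = newPolicy; }
        public String getName() { return "Averaging (sketch)"; }
    }

    public static void main(String[] args) {
        AveragingLearner learner = new AveragingLearner();
        CoinAction first = learner.chooseAction();
        learner.update(new ScalarFeedback(-1.0)); // penalize the first choice
        System.out.println(first + " then " + learner.chooseAction()); // prints "HEADS then TAILS"
    }
}
```

Because F is a type parameter, a different algorithm or simulation could substitute a richer feedback class without changing the interface.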

Author:
Charles Giseler

Method Summary
 A chooseAction()
          Elicits a new choice of action.
 java.lang.String getName()
          Retrieves the name of the learning algorithm this learner implements.
 PA getParameters()
          Retrieve the RLParameters that contain settings for this learning algorithm.
 PO getPolicy()
          Retrieve the StatelessPolicy being used to represent learned knowledge.
 PA makeParameters()
          Create a default set of parameters that can be used with this learner.
 void setParameters(PA newParams)
          Sets the current settings for this learning algorithm.
 void setPolicy(PO newPolicy)
          Set the StatelessPolicy to be used to represent learned knowledge.
 void update(F reward)
          Initiate the learning process using given feedback.
 

Method Detail

update

void update(F reward)
Initiate the learning process using the given feedback. Feedback is associated with the last Action chosen by this learner and is treated as a reinforcement value for that Action. Feedback is used to update the probability of choosing Actions in the learner's Policy according to the specific learning algorithm implemented.

Note: Feedback is most often for the last Action chosen, so a given ActionID will usually point to that Action. As such, many RLEnigine implementations may also provide update() methods that simply accept feedback and associate it with the last Action chosen.

Parameters:
reward - feedback for the last Action chosen
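
The choose-then-update cycle described above can be sketched as follows. This is an illustration under stated assumptions, not a JReLM algorithm: PropensityLearner is a hypothetical class, actions are plain integer indices, feedback is a non-negative double, and the additive propensity rule is chosen only because it makes the change in selection probability easy to see.

```java
import java.util.Arrays;
import java.util.Random;

final class UpdateLoopSketch {
    // Hypothetical stateless learner over actions 0..n-1; not a JReLM class.
    static final class PropensityLearner {
        private final double[] propensity;
        private final Random rng;
        private int lastAction = -1;

        PropensityLearner(int actionCount, long seed) {
            propensity = new double[actionCount];
            Arrays.fill(propensity, 1.0); // uniform starting propensities
            rng = new Random(seed);
        }

        // Probability of an action = its propensity / sum of all propensities.
        double probability(int action) {
            double total = 0.0;
            for (double p : propensity) total += p;
            return propensity[action] / total;
        }

        // Samples an action in proportion to its probability and remembers it.
        int chooseAction() {
            double draw = rng.nextDouble();
            double cumulative = 0.0;
            lastAction = propensity.length - 1; // fallback guards against rounding
            for (int a = 0; a < propensity.length; a++) {
                cumulative += probability(a);
                if (draw < cumulative) { lastAction = a; break; }
            }
            return lastAction;
        }

        // Credits the feedback to the last action chosen (assumes reward >= 0).
        void update(double reward) {
            if (lastAction < 0) throw new IllegalStateException("no action chosen yet");
            propensity[lastAction] += reward;
        }
    }

    public static void main(String[] args) {
        PropensityLearner learner = new PropensityLearner(3, 42L);
        for (int step = 0; step < 200; step++) {
            int action = learner.chooseAction();
            double reward = (action == 2) ? 1.0 : 0.0; // toy environment: only action 2 pays
            learner.update(reward);
        }
        System.out.println("P(action 2) = " + learner.probability(2));
    }
}
```

Calling update() before any chooseAction() has nothing to credit, so the sketch throws; how a real implementation handles that case is not stated on this page.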

chooseAction

A chooseAction()
Elicits a new choice of action. The action will be chosen according to the selection rule of the SimpleStatelessPolicy. Actions are chosen from a DiscreteFiniteDomain.

Returns:
the next action chosen. Selected actions can be any object implementing the Action interface.
See Also:
Action

getParameters

PA getParameters()
Retrieve the RLParameters that contain settings for this learning algorithm.

Returns:
learning algorithm settings as RLParameters

makeParameters

PA makeParameters()
Create a default set of parameters that can be used with this learner.

Returns:
learning parameters compatible with this learner, initialized to default settings.

setParameters

void setParameters(PA newParams)
Sets the current settings for this learning algorithm.

Parameters:
newParams - new settings for this algorithm as RLParameters
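
A typical way to combine makeParameters(), setParameters() and getParameters() is to start from defaults, adjust them, and then apply them. The sketch below assumes makeParameters() returns a fresh object independent of the learner's current settings; DemoParameters, DemoLearner and the learningRate field are hypothetical and stand in for an algorithm-specific RLParameters implementation.

```java
final class ParameterSketch {
    // Hypothetical parameter object; real RLParameters implementations differ.
    static final class DemoParameters {
        private double learningRate = 0.1;
        double getLearningRate() { return learningRate; }
        void setLearningRate(double rate) {
            if (rate <= 0.0 || rate > 1.0) {
                throw new IllegalArgumentException("rate must be in (0, 1]");
            }
            learningRate = rate;
        }
    }

    // Hypothetical learner exposing the three parameter methods from this page.
    static final class DemoLearner {
        private DemoParameters params = makeParameters();

        // Returns a new object holding default settings (assumed behavior).
        DemoParameters makeParameters() { return new DemoParameters(); }

        DemoParameters getParameters() { return params; }

        void setParameters(DemoParameters newParams) {
            if (newParams == null) {
                throw new IllegalArgumentException("newParams must not be null");
            }
            params = newParams;
        }
    }

    public static void main(String[] args) {
        DemoLearner learner = new DemoLearner();
        DemoParameters tuned = learner.makeParameters(); // fresh defaults
        tuned.setLearningRate(0.5);                      // adjust before applying
        learner.setParameters(tuned);
        System.out.println(learner.getParameters().getLearningRate()); // prints 0.5
    }
}
```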

getPolicy

PO getPolicy()
Retrieve the StatelessPolicy being used to represent learned knowledge.

Returns:
the Policy being used by this ReinforcementLearner. The policy can be any object implementing the StatelessPolicy interface.
See Also:
StatelessPolicy

setPolicy

void setPolicy(PO newPolicy)
Set the StatelessPolicy to be used to represent learned knowledge.

Parameters:
newPolicy - the policy to use. The policy can be any object implementing the StatelessPolicy interface.
See Also:
StatelessPolicy
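
One use of the getPolicy()/setPolicy() pair is to hand knowledge learned by one learner to another. The sketch below assumes getPolicy() returns the live policy object rather than a copy, which this page does not specify, so it copies before sharing. WeightPolicy, its copy() method, DemoLearner and reinforce() are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

final class PolicySwapSketch {
    // Hypothetical policy: one selection weight per action index.
    static final class WeightPolicy {
        final double[] weights;
        WeightPolicy(double... weights) { this.weights = weights.clone(); }
        WeightPolicy copy() { return new WeightPolicy(weights); }
    }

    // Hypothetical learner holding a replaceable policy.
    static final class DemoLearner {
        private WeightPolicy policy = new WeightPolicy(1.0, 1.0);
        WeightPolicy getPolicy() { return policy; }
        void setPolicy(WeightPolicy newPolicy) { policy = newPolicy; }
        void reinforce(int action, double amount) { policy.weights[action] += amount; }
    }

    public static void main(String[] args) {
        DemoLearner trained = new DemoLearner();
        trained.reinforce(1, 3.0);

        DemoLearner fresh = new DemoLearner();
        fresh.setPolicy(trained.getPolicy().copy()); // copy so the learners stay independent
        trained.reinforce(0, 5.0);                   // does not affect the copy

        System.out.println(Arrays.toString(fresh.getPolicy().weights)); // prints [1.0, 4.0]
    }
}
```

Passing the same policy object to both learners without copying would instead make them share every later update.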

getName

java.lang.String getName()
Retrieves the name of the learning algorithm this learner implements.

Returns:
the algorithm name