edu.iastate.jrelm.rl
Class SimplePolicy<AI,A extends Action,SI,S extends State>

java.lang.Object
  extended by edu.iastate.jrelm.rl.SimplePolicy<AI,A,SI,S>
Type Parameters:
AI - the type of Action identifier used.
A - the type of Action contained in the given ActionDomain.
SI - the type of State identifier used.
S - the type of State managed by the given StateDomain.
All Implemented Interfaces:
Policy<AI,A,SI,S>

public class SimplePolicy<AI,A extends Action,SI,S extends State>
extends java.lang.Object
implements Policy<AI,A,SI,S>

A simple implementation of the Policy interface. This class essentially manages discrete probability distributions governing the choice of action based on the current state of the world. An action choice is represented as one Action object from a given ActionDomain. Similarly, the state of the world is represented as one State object from a given StateDomain. This class is intended for use with a ReinforcementLearner, which is responsible for updating the policy according to the reinforcement learning algorithm it implements.
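Internally, SimplePolicy delegates sampling to a ModifiedEmpiricalWalker. Conceptually, drawing an action index from a discrete pdf can be sketched in plain Java as follows; this is an illustration of the idea only, not JRELM code:

```java
import java.util.Random;

public class PdfSamplingSketch {
    // Sample an index from a discrete pdf by walking the cumulative sums.
    static int sample(double[] pdf, Random rng) {
        double r = rng.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        for (int i = 0; i < pdf.length; i++) {
            cumulative += pdf[i];
            if (r < cumulative) {
                return i;
            }
        }
        return pdf.length - 1; // guard against floating-point rounding
    }

    public static void main(String[] args) {
        double[] pdf = {0.7, 0.2, 0.1};
        Random rng = new Random(42);
        // Draw a few action indices; 0..2 appear roughly in proportion to pdf.
        for (int i = 0; i < 5; i++) {
            System.out.println(sample(pdf, rng));
        }
    }
}
```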

Author:
Charles Gieseler

Field Summary
protected  ActionDomain<AI,A> actionDomain
           
protected  java.util.ArrayList<AI> actionIDList
           
protected  ModifiedEmpiricalWalker eventGenerator
           
protected  A lastAction
           
protected  double[][] pdfs
           
protected  cern.jet.random.engine.RandomEngine randomEngine
           
protected  int randSeed
           
protected  StateDomain<SI,S> stateDomain
           
protected  java.util.ArrayList<SI> stateIDList
           
 
Constructor Summary
SimplePolicy(ActionDomain<AI,A> aDomain, StateDomain<SI,S> sDomain)
          Construct a SimplePolicy using a given ActionDomain and StateDomain.
SimplePolicy(ActionDomain<AI,A> aDomain, StateDomain<SI,S> sDomain, double[][] initPDFs)
          Construct a SimplePolicy using the given ActionDomain, StateDomain, and initial probability distribution functions.
SimplePolicy(ActionDomain<AI,A> aDomain, StateDomain<SI,S> sDomain, double[][] initPDFs, int randSeed)
           
SimplePolicy(ActionDomain<AI,A> aDomain, StateDomain<SI,S> sDomain, double[][] initPDFs, cern.jet.random.engine.RandomEngine randomGen)
           
SimplePolicy(ActionDomain<AI,A> aDomain, StateDomain<SI,S> sDomain, int randSeed)
          Construct a SimplePolicy using a given ActionDomain, StateDomain and pseudo-random generator seed.
SimplePolicy(ActionDomain<AI,A> aDomain, StateDomain<SI,S> sDomain, cern.jet.random.engine.RandomEngine randomGen)
          Construct a SimplePolicy using a given ActionDomain, StateDomain, and RandomEngine.
 
Method Summary
 A generateAction(SI stateID)
          Given the identifier of the current State, choose an Action according to the current probability distribution function.
 A generateAction(State<SI> currentState)
          Given the current State, choose an Action according to the current probability distribution function.
 ActionDomain<AI,A> getActionDomain()
          Get the domain of actions used by this policy.
 double[][] getDistribution()
          Retrieve the collection of probability distribution functions used in selecting Actions from the ActionDomain for all States in the StateDomain.
 double[] getDistribution(State<SI> aState)
          Retrieve the probability distribution function used in selecting Actions from the ActionDomain in the given State.
 A getLastAction()
          Get the last action chosen by this policy.
 int getNumActions()
          Retrieve the number of Actions in this policy's ActionDomain.
 int getNumStates()
          Retrieve the number of States in this policy's StateDomain.
 double getProbability(SI stateID, AI actionID)
          Look up the probability for a State-Action pair.
 int getRandomSeed()
           
 StateDomain<SI,S> getStateDomain()
          Get the domain of world states used by this policy.
protected  void init()
           
 void reset()
          Reset this policy.
 void setDistribution(State<SI> aState, double[] pdf)
          Set the probability distribution function used in selecting Actions from the ActionDomain for the given State.
 void setProbability(SI stateID, AI actionID, double newValue)
          Set a State-Action pair probability value.
 void setRandomEngine(cern.jet.random.engine.RandomEngine engine)
          Sets the RandomEngine to be used by this policy.
 void setRandomSeed(int seed)
          Resets the RandomEngine, initializing it with the given seed.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

pdfs

protected double[][] pdfs

randomEngine

protected cern.jet.random.engine.RandomEngine randomEngine

eventGenerator

protected ModifiedEmpiricalWalker eventGenerator

actionDomain

protected ActionDomain<AI,A> actionDomain

stateDomain

protected StateDomain<SI,S> stateDomain

actionIDList

protected java.util.ArrayList<AI> actionIDList

stateIDList

protected java.util.ArrayList<SI> stateIDList

lastAction

protected A lastAction

randSeed

protected int randSeed
Constructor Detail

SimplePolicy

public SimplePolicy(ActionDomain<AI,A> aDomain,
                    StateDomain<SI,S> sDomain)
Construct a SimplePolicy using a given ActionDomain and StateDomain. Note: this policy requires finite, discrete action and state domains. A new MersenneTwister seeded with the current time ((int)System.currentTimeMillis()) is created as the RandomEngine for this policy. Note: if creating multiple SimplePolicies and MersenneTwister is the desired RandomEngine, it is more efficient to create a single MersenneTwister and pass it to each new policy as it is constructed.

Parameters:
aDomain - the collection of possible Actions
sDomain - the collection of possible States
See Also:
MersenneTwister

SimplePolicy

public SimplePolicy(ActionDomain<AI,A> aDomain,
                    StateDomain<SI,S> sDomain,
                    int randSeed)
Construct a SimplePolicy using a given ActionDomain, StateDomain and pseudo-random generator seed. A new MersenneTwister seeded with the given seed value is created as the RandomEngine for this policy. Note: if creating multiple SimplePolicies and MersenneTwister is the desired RandomEngine, it is more efficient to create a single MersenneTwister and pass it to each new policy as it is constructed.

Parameters:
aDomain - the collection of possible Actions
sDomain - the collection of possible States
randSeed - seed value for the random generator used in this policy
See Also:
MersenneTwister

SimplePolicy

public SimplePolicy(ActionDomain<AI,A> aDomain,
                    StateDomain<SI,S> sDomain,
                    cern.jet.random.engine.RandomEngine randomGen)
Construct a SimplePolicy using a given ActionDomain, StateDomain, and RandomEngine. Note: this policy requires finite, discrete action and state domains. The given RandomEngine is used for action selection in this policy.

Parameters:
aDomain - the collection of possible Actions
sDomain - the collection of possible States
randomGen - the RandomEngine to use. This policy employs a ModifiedEmpiricalWalker in selecting Actions, which in turn uses a RandomEngine.
See Also:
RandomEngine, ModifiedEmpiricalWalker

SimplePolicy

public SimplePolicy(ActionDomain<AI,A> aDomain,
                    StateDomain<SI,S> sDomain,
                    double[][] initPDFs)
Construct a SimplePolicy using the given ActionDomain, StateDomain, and initial probability distribution functions. This policy requires a discrete, finite ActionDomain and StateDomain. The pdfs should be organized as State-Action pairs. The first dimension should be the index of a State's ID in the ID list maintained by the StateDomain. The second dimension should be the index of an Action's ID in the ID list maintained by the ActionDomain. For example, probValue = initPDFs[4][2]; retrieves the State-Action probability value for the fifth State and third Action (zero-based indices) in the respective State ID and Action ID lists.

Parameters:
aDomain - the collection of possible Actions
sDomain - the collection of possible States
initPDFs - an initial collection of pdfs used in selecting Actions for each State in the StateDomain.
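A common starting point for initPDFs is a uniform distribution over the actions for every state. A minimal plain-Java sketch of building such an array, laid out as described above (the domain sizes in main are arbitrary for illustration):

```java
public class UniformPdfs {
    // Build a numStates x numActions array of uniform action-choice pdfs,
    // indexed [state ID index][action ID index].
    static double[][] uniformPdfs(int numStates, int numActions) {
        double[][] pdfs = new double[numStates][numActions];
        for (int s = 0; s < numStates; s++) {
            for (int a = 0; a < numActions; a++) {
                pdfs[s][a] = 1.0 / numActions; // each row sums to 1
            }
        }
        return pdfs;
    }

    public static void main(String[] args) {
        double[][] initPDFs = uniformPdfs(3, 4);
        System.out.println(initPDFs[2][3]); // probability for third state, fourth action
    }
}
```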

SimplePolicy

public SimplePolicy(ActionDomain<AI,A> aDomain,
                    StateDomain<SI,S> sDomain,
                    double[][] initPDFs,
                    int randSeed)
Construct a SimplePolicy using the given ActionDomain, StateDomain, initial probability distribution functions, and pseudo-random generator seed.

SimplePolicy

public SimplePolicy(ActionDomain<AI,A> aDomain,
                    StateDomain<SI,S> sDomain,
                    double[][] initPDFs,
                    cern.jet.random.engine.RandomEngine randomGen)
             throws java.lang.IllegalArgumentException
Construct a SimplePolicy using the given ActionDomain, StateDomain, initial probability distribution functions, and RandomEngine.
Throws:
java.lang.IllegalArgumentException
Method Detail

init

protected void init()

generateAction

public A generateAction(State<SI> currentState)
Given the current State, choose an Action according to the current probability distribution function.

Returns:
a new Action

generateAction

public A generateAction(SI stateID)
Given the identifier of the current State, choose an Action according to the current probability distribution function.

Specified by:
generateAction in interface Policy<AI,A extends Action,SI,S extends State>
Returns:
a new Action

reset

public void reset()
Reset this policy. Reverts to a uniform probability distribution over the domain of actions. This only modifies the probability distribution; it does not reset the RandomEngine. WARNING: this will effectively erase all learned Action probabilities in this policy.


getDistribution

public double[] getDistribution(State<SI> aState)
Retrieve the probability distribution function used in selecting Actions from the ActionDomain in the given State.

Parameters:
aState - the State for which to retrieve an action-choice pdf.
Returns:
the current pdf used in choosing Actions for the given world State.

getDistribution

public double[][] getDistribution()
Retrieve the collection of probability distribution functions used in selecting Actions from the ActionDomain for all States in the StateDomain.

Returns:
collection of all the pdfs used in selecting Actions for each State.

setDistribution

public void setDistribution(State<SI> aState,
                            double[] pdf)
                     throws java.lang.IllegalArgumentException
Set the probability distribution function used in selecting Actions from the ActionDomain for the given State. The distribution is given as an array of doubles. Note: there should be one value for each Action in this policy's ActionDomain, associated with the Action at the same index in the ActionDomain's ID list.

Parameters:
aState - the State whose action-choice pdf is being set
pdf - the new collection of Action choice probabilities
Throws:
java.lang.IllegalArgumentException
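Since setDistribution throws IllegalArgumentException, it can help to validate a candidate pdf before passing it in. A minimal sketch of such a check in plain Java; the tolerance value is an assumption for illustration, not part of JRELM:

```java
public class PdfCheck {
    // Return true if all values are non-negative and sum to 1 within a tolerance.
    static boolean isValidPdf(double[] pdf, double tolerance) {
        double sum = 0.0;
        for (double p : pdf) {
            if (p < 0.0) {
                return false; // probabilities cannot be negative
            }
            sum += p;
        }
        return Math.abs(sum - 1.0) <= tolerance;
    }

    public static void main(String[] args) {
        System.out.println(isValidPdf(new double[] {0.5, 0.25, 0.25}, 1e-9)); // true
        System.out.println(isValidPdf(new double[] {0.5, 0.6}, 1e-9));        // false
    }
}
```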

getActionDomain

public ActionDomain<AI,A> getActionDomain()
Get the domain of actions used by this policy.

Specified by:
getActionDomain in interface Policy<AI,A extends Action,SI,S extends State>
Returns:
the associated ActionDomain
See Also:
Policy.getActionDomain()

getStateDomain

public StateDomain<SI,S> getStateDomain()
Get the domain of world states used by this policy.

Specified by:
getStateDomain in interface Policy<AI,A extends Action,SI,S extends State>
Returns:
the associated StateDomain
See Also:
Policy.getStateDomain()

getNumActions

public int getNumActions()
Retrieve the number of Actions in this policy's ActionDomain.

Returns:
size of the ActionDomain

getNumStates

public int getNumStates()
Retrieve the number of States in this policy's StateDomain.

Returns:
size of the StateDomain

getLastAction

public A getLastAction()
Get the last action chosen by this policy. Note: this will be null if called before any Actions have been chosen.

Specified by:
getLastAction in interface Policy<AI,A extends Action,SI,S extends State>
See Also:
Policy.getLastAction()

getProbability

public double getProbability(SI stateID,
                             AI actionID)
Look up the probability for a State-Action pair. Gets the current probability of choosing an Action given a world State. The desired State and Action are indicated by State and Action identifiers respectively.

Specified by:
getProbability in interface Policy<AI,A extends Action,SI,S extends State>
Parameters:
stateID - the identifier indicating the State for which to look up an Action probability
actionID - the identifier indicating the Action to look up a probability for
Returns:
the probability of choosing the specified Action from the indicated State of the world. Double.NaN if either the Action ID or State ID does not belong to this policy's ActionDomain or StateDomain.
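Because getProbability signals an unrecognized ID by returning Double.NaN, callers should test the result with Double.isNaN rather than ==, since NaN is never equal to anything, including itself:

```java
public class NanCheck {
    public static void main(String[] args) {
        double prob = Double.NaN; // e.g. result for an unrecognized State or Action ID
        System.out.println(prob == Double.NaN); // false: NaN != NaN, so == never works
        System.out.println(Double.isNaN(prob)); // true: the correct test
    }
}
```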

setProbability

public void setProbability(SI stateID,
                           AI actionID,
                           double newValue)
Set a State-Action pair probability value. Updates the probability of choosing the indicated Action given the indicated State of the world.

Specified by:
setProbability in interface Policy<AI,A extends Action,SI,S extends State>
Parameters:
stateID - indicator of the desired State in this policy's StateDomain.
actionID - indicator of the desired Action in this policy's ActionDomain.
newValue - new choice probability value to associate with this State-Action pair.

setRandomEngine

public void setRandomEngine(cern.jet.random.engine.RandomEngine engine)
Sets the RandomEngine to be used by this policy.

Parameters:
engine - the RandomEngine to use

getRandomSeed

public int getRandomSeed()

setRandomSeed

public void setRandomSeed(int seed)
Resets the RandomEngine, initializing it with the given seed. The RandomEngine will be set to a MersenneTwister. If you wish to use a different RandomEngine with this seed, use setRandomEngine(RandomEngine). Note: calling this method will create a new MersenneTwister. Repeated calls can lead to performance issues.

Specified by:
setRandomSeed in interface Policy<AI,A extends Action,SI,S extends State>
Parameters:
seed - seed value
See Also:
cern.jet.random.engine.MersenneTwister