MachineIntelligenceCore:ReinforcementLearning
mic::application::GridworldValueIteration Class Reference

Class responsible for solving the gridworld problem by applying the reinforcement learning value iteration method.

#include <GridworldValueIteration.hpp>

Inheritance diagram for mic::application::GridworldValueIteration:
Collaboration diagram for mic::application::GridworldValueIteration:

Public Member Functions

 GridworldValueIteration (std::string node_name_="application")
 
virtual ~GridworldValueIteration ()
 

Protected Member Functions

virtual void initializePropertyDependentVariables ()
 
virtual void initialize (int argc, char *argv[])
 
virtual bool performSingleStep ()
 

Private Member Functions

std::string streamStateActionTable ()
 
float computeQValueFromValues (mic::types::Position2D pos_, mic::types::NESWAction ac_)
 
float computeBestValue (mic::types::Position2D pos_)
 

Private Attributes

mic::environments::Gridworld grid_env
 The gridworld object.
 
mic::types::MatrixXf state_value_table
 Matrix storing the values of all states (gridworld width * height), in row-major order.
 
mic::configuration::Property<float> step_reward
 
mic::configuration::Property<float> discount_rate
 
mic::configuration::Property<float> move_noise
 
mic::configuration::Property<std::string> statistics_filename
 Property: name of the file to which the statistics will be exported.
 
float running_delta
 

Detailed Description

Class responsible for solving the gridworld problem by applying the reinforcement learning value iteration method.

Author
tkornuta

Definition at line 45 of file GridworldValueIteration.hpp.
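
For orientation, the value iteration update named in the description is commonly written as below. This is standard textbook notation, not taken from the source; reading R as step_reward, \gamma as discount_rate, and P as the transition model shaped by move_noise is an assumption based on the property descriptions on this page.

```latex
V_{k+1}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\bigl[ R(s, a, s') + \gamma\, V_k(s') \bigr]
```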

Constructor & Destructor Documentation

mic::application::GridworldValueIteration::GridworldValueIteration ( std::string  node_name_ = "application")

Default constructor. Sets the application/node name and the default values of variables, initializes the classifier, etc.

Parameters
node_name_  Name of the application/node (in configuration file).

Definition at line 40 of file GridworldValueIteration.cpp.

References discount_rate, move_noise, statistics_filename, and step_reward.

mic::application::GridworldValueIteration::~GridworldValueIteration ( )
virtual

Destructor.

Definition at line 57 of file GridworldValueIteration.cpp.

Member Function Documentation

float mic::application::GridworldValueIteration::computeBestValue ( mic::types::Position2D  pos_)
private

Calculates the best value for the given state by finding the action with the maximal expected value.

Parameters
pos_  Starting state (position).
Returns
Value of the given state.

Definition at line 147 of file GridworldValueIteration.cpp.

References computeQValueFromValues(), grid_env, mic::environments::Environment::isActionAllowed(), and mic::environments::Gridworld::isStateAllowed().

Referenced by performSingleStep().
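
The selection described above (take the maximum Q-value over the actions) can be sketched as follows. This is a minimal illustration, not the code in GridworldValueIteration.cpp: the Action enum, the bestValue function, and the two callables standing in for computeQValueFromValues() and isActionAllowed() are all hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <functional>
#include <limits>

// Hypothetical stand-in for mic::types::NESWAction.
enum class Action { North, East, South, West };

// Returns the maximum of q(a) over the actions for which allowed(a) is true.
// If no action is allowed, the result is negative infinity (a choice made
// for this sketch only).
float bestValue(const std::function<float(Action)>& q,
                const std::function<bool(Action)>& allowed) {
    const std::array<Action, 4> actions = {
        Action::North, Action::East, Action::South, Action::West};
    float best = -std::numeric_limits<float>::infinity();
    for (Action a : actions) {
        if (allowed(a)) {
            best = std::max(best, q(a));
        }
    }
    return best;
}
```

Passing the Q-function as a callable keeps the sketch independent of how the Q-values are actually computed.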

float mic::application::GridworldValueIteration::computeQValueFromValues ( mic::types::Position2D  pos_,
mic::types::NESWAction  ac_ 
)
private

Calculates the Q-value, taking into account the probabilistic transitions between states (e.g. a north action can end up going east or west).

Parameters
pos_  Starting state (position).
ac_  Action to be performed.
Returns
Value of the Q-function for the given state and action.

Definition at line 99 of file GridworldValueIteration.cpp.

References discount_rate, grid_env, mic::environments::Environment::isActionAllowed(), move_noise, state_value_table, and step_reward.

Referenced by computeBestValue().

void mic::application::GridworldValueIteration::initialize ( int  argc,
char *  argv[] 
)
protected virtual

Empty method (not used).

Parameters
argc  Number of application parameters.
argv  Array of application parameters.

Definition at line 62 of file GridworldValueIteration.cpp.

void mic::application::GridworldValueIteration::initializePropertyDependentVariables ( )
protected virtual

std::string mic::application::GridworldValueIteration::streamStateActionTable ( )
private

Streams the current state of the state-action values.

Returns
String with the description of the state-action table.

Definition at line 81 of file GridworldValueIteration.cpp.

References mic::environments::Environment::getEnvironmentHeight(), mic::environments::Environment::getEnvironmentWidth(), grid_env, and state_value_table.

Referenced by initializePropertyDependentVariables(), and performSingleStep().

Member Data Documentation

mic::configuration::Property<float> mic::application::GridworldValueIteration::discount_rate
private

Property: future discount factor (should be in range 0.0-1.0).

Definition at line 92 of file GridworldValueIteration.hpp.

Referenced by computeQValueFromValues(), and GridworldValueIteration().

mic::environments::Gridworld mic::application::GridworldValueIteration::grid_env
private

The gridworld object.

mic::configuration::Property<float> mic::application::GridworldValueIteration::move_noise
private

Property: move noise, determining how often an action results in a move in an unintended direction.

Definition at line 97 of file GridworldValueIteration.hpp.

Referenced by computeQValueFromValues(), and GridworldValueIteration().

float mic::application::GridworldValueIteration::running_delta
private

Running delta: the sum of the increments of the value table.

Definition at line 105 of file GridworldValueIteration.hpp.

Referenced by initializePropertyDependentVariables(), and performSingleStep().
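
A delta of this kind is often used to judge convergence of value iteration. The sketch below accumulates it while applying one sweep of updated values. The names applyUpdate and hasConverged, the use of absolute increments, and the threshold test are assumptions for illustration; the page does not say how performSingleStep() uses running_delta.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Overwrites table with updated and returns the sum of the absolute
// per-state changes (assumption: increments are summed by magnitude).
float applyUpdate(std::vector<float>& table, const std::vector<float>& updated) {
    assert(table.size() == updated.size());
    float delta = 0.0f;
    for (std::size_t i = 0; i < table.size(); ++i) {
        delta += std::fabs(updated[i] - table[i]);
        table[i] = updated[i];
    }
    return delta;
}

// True when the accumulated change of a sweep falls below eps
// (hypothetical stopping rule).
bool hasConverged(float delta, float eps) { return delta < eps; }
```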

mic::types::MatrixXf mic::application::GridworldValueIteration::state_value_table
private

Matrix storing the values of all states (gridworld width * height), in row-major order.

Definition at line 82 of file GridworldValueIteration.hpp.

Referenced by computeQValueFromValues(), initializePropertyDependentVariables(), performSingleStep(), and streamStateActionTable().
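
Row-major storage means that the cells of one row are adjacent in memory. A minimal sketch of the resulting index arithmetic follows; rowMajorIndex is a hypothetical helper, and treating x as the column and y as the row is an assumption about how positions map onto the table.

```cpp
#include <cassert>
#include <cstddef>

// Flat index of cell (x, y) in a row-major table with the given width.
// x is the column, y is the row.
std::size_t rowMajorIndex(std::size_t x, std::size_t y, std::size_t width) {
    assert(x < width);
    return y * width + x;
}
```

For a grid of width 4, the cell at column 2 of row 1 lands at flat index 1 * 4 + 2.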

mic::configuration::Property<std::string> mic::application::GridworldValueIteration::statistics_filename
private

Property: name of the file to which the statistics will be exported.

Definition at line 100 of file GridworldValueIteration.hpp.

Referenced by GridworldValueIteration().

mic::configuration::Property<float> mic::application::GridworldValueIteration::step_reward
private

Property: the "expected intermediate reward", i.e. the reward received for performing each step (typically negative, but can be positive as well).

Definition at line 87 of file GridworldValueIteration.hpp.

Referenced by computeQValueFromValues(), and GridworldValueIteration().


The documentation for this class was generated from the following files: