MachineIntelligenceCore:ReinforcementLearning
Class responsible for solving the gridworld problem by applying the reinforcement learning value iteration method.
#include <GridworldValueIteration.hpp>
Public Member Functions
GridworldValueIteration (std::string node_name_="application")
virtual ~GridworldValueIteration ()
Protected Member Functions
virtual void initializePropertyDependentVariables ()
virtual void initialize (int argc, char *argv[])
virtual bool performSingleStep ()
Private Member Functions
std::string streamStateActionTable ()
float computeQValueFromValues (mic::types::Position2D pos_, mic::types::NESWAction ac_)
float computeBestValue (mic::types::Position2D pos_)
Private Attributes
mic::environments::Gridworld grid_env
The gridworld object.
mic::types::MatrixXf state_value_table
Matrix storing values for all states (gridworld w * h). ROW MAJOR(!).
mic::configuration::Property<float> step_reward
mic::configuration::Property<float> discount_rate
mic::configuration::Property<float> move_noise
mic::configuration::Property<std::string> statistics_filename
Property: name of the file to which the statistics will be exported.
float running_delta
Class responsible for solving the gridworld problem by applying the reinforcement learning value iteration method.
Definition at line 45 of file GridworldValueIteration.hpp.
mic::application::GridworldValueIteration::GridworldValueIteration (std::string node_name_ = "application")
Default Constructor. Sets the application/node name, default values of variables, initializes classifier etc.
node_name_ | Name of the application/node (in configuration file).
Definition at line 40 of file GridworldValueIteration.cpp.
References discount_rate, move_noise, statistics_filename, and step_reward.
mic::application::GridworldValueIteration::~GridworldValueIteration () [virtual]
Destructor.
Definition at line 57 of file GridworldValueIteration.cpp.
float mic::application::GridworldValueIteration::computeBestValue (mic::types::Position2D pos_) [private]
Calculates the best value for a given state by finding the action with the maximal expected value.
pos_ | Starting state (position).
Definition at line 147 of file GridworldValueIteration.cpp.
References computeQValueFromValues(), grid_env, mic::environments::Environment::isActionAllowed(), and mic::environments::Gridworld::isStateAllowed().
Referenced by performSingleStep().
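The maximisation described above can be sketched as follows. This is a minimal illustration, not the code from GridworldValueIteration.cpp: the function name `bestValue` and the two input arrays are hypothetical, standing in for the per-action results of computeQValueFromValues() and isActionAllowed().

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper: q[a] is the Q-value of action a (N, E, S, W),
// allowed[a] tells whether that action may be taken in the current state.
// Returns the maximum Q-value over the allowed actions, or negative
// infinity when no action is allowed.
float bestValue(const std::array<float, 4>& q,
                const std::array<bool, 4>& allowed) {
    float best = -std::numeric_limits<float>::infinity();
    for (std::size_t a = 0; a < q.size(); ++a) {
        if (allowed[a]) {
            best = std::max(best, q[a]);
        }
    }
    return best;
}
```

Disallowed actions are skipped rather than given a penalty value, so they can never win the maximisation.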
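float mic::application::GridworldValueIteration::computeQValueFromValues (mic::types::Position2D pos_, mic::types::NESWAction ac_) [private]
Calculates the Q-value of an action, taking into account probabilistic transitions between states (e.g. the north action can end up moving east or west).
pos_ | Starting state (position).
ac_ | Action to be performed.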
Definition at line 99 of file GridworldValueIteration.cpp.
References discount_rate, grid_env, mic::environments::Environment::isActionAllowed(), move_noise, state_value_table, and step_reward.
Referenced by computeBestValue().
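A noisy-transition Q-value of this kind is commonly computed as below. The sketch rests on assumptions that this page does not confirm: the intended move succeeds with probability 1 - noise, each perpendicular move occurs with probability noise / 2, and a move that would leave the grid keeps the agent in place. `Grid`, `valueAfterMove` and `qValue` are hypothetical names, not part of the MachineIntelligenceCore API.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the state value table, stored row-major.
struct Grid {
    int width, height;
    std::vector<float> values;
    float at(int x, int y) const { return values[y * width + x]; }
    bool inside(int x, int y) const {
        return x >= 0 && x < width && y >= 0 && y < height;
    }
};

// Actions in clockwise order, so the perpendicular neighbours of
// action a are (a + 1) % 4 and (a + 3) % 4.
enum Action { North = 0, East = 1, South = 2, West = 3 };

// Value of the cell reached by a move; the agent stays in place
// when the move would leave the grid (assumption).
float valueAfterMove(const Grid& g, int x, int y, int action) {
    static const int dx[4] = {0, 1, 0, -1};
    static const int dy[4] = {-1, 0, 1, 0};
    const int nx = x + dx[action];
    const int ny = y + dy[action];
    return g.inside(nx, ny) ? g.at(nx, ny) : g.at(x, y);
}

// Q(s, a) = stepReward + discount * E[V(s')] under the assumed noise model.
float qValue(const Grid& g, int x, int y, Action a,
             float stepReward, float discount, float noise) {
    assert(noise >= 0.f && noise <= 1.f);
    const float expected =
        (1.f - noise) * valueAfterMove(g, x, y, a) +
        0.5f * noise * valueAfterMove(g, x, y, (a + 1) % 4) +
        0.5f * noise * valueAfterMove(g, x, y, (a + 3) % 4);
    return stepReward + discount * expected;
}
```

With noise set to 0 the expression reduces to the deterministic case, where only the value of the intended neighbour contributes.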
void mic::application::GridworldValueIteration::initialize (int argc, char *argv[]) [protected, virtual]
Empty method (not used).
argc | Number of application parameters.
argv | Array of application parameters.
Definition at line 62 of file GridworldValueIteration.cpp.
void mic::application::GridworldValueIteration::initializePropertyDependentVariables () [protected, virtual]
Initializes all variables that are property-dependent.
Definition at line 66 of file GridworldValueIteration.cpp.
References mic::environments::Environment::getEnvironmentHeight(), mic::environments::Environment::getEnvironmentWidth(), grid_env, mic::environments::Gridworld::initializeEnvironment(), running_delta, state_value_table, and streamStateActionTable().
bool mic::application::GridworldValueIteration::performSingleStep () [protected, virtual]
Performs a single step of computations.
Definition at line 173 of file GridworldValueIteration.cpp.
References computeBestValue(), mic::environments::Gridworld::environmentToString(), mic::environments::Environment::getEnvironmentHeight(), mic::environments::Environment::getEnvironmentWidth(), mic::environments::Gridworld::getStateReward(), grid_env, mic::environments::Gridworld::isStateAllowed(), mic::environments::Gridworld::isStateTerminal(), running_delta, state_value_table, and streamStateActionTable().
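One step of value iteration over a grid can be sketched as a single sweep that also accumulates a running delta. This is an illustration under stated assumptions, not the body of performSingleStep(): moves are deterministic (no move noise), the update is synchronous (new values are computed from the previous table), terminal cells take their own reward, and the delta is the sum of absolute changes. `World` and `sweep` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical environment description, all tables row-major.
struct World {
    int width, height;
    std::vector<float> reward;   // reward of terminal cells
    std::vector<bool> terminal;  // true for terminal cells
};

// Performs one synchronous sweep over all cells and returns the sum of
// absolute changes of the value table (the "running delta").
float sweep(const World& w, std::vector<float>& values,
            float stepReward, float discount) {
    static const int dx[4] = {0, 1, 0, -1};
    static const int dy[4] = {-1, 0, 1, 0};
    std::vector<float> next(values.size());
    float delta = 0.f;
    for (int y = 0; y < w.height; ++y) {
        for (int x = 0; x < w.width; ++x) {
            const int i = y * w.width + x;
            if (w.terminal[i]) {
                next[i] = w.reward[i];
            } else {
                float best = -std::numeric_limits<float>::infinity();
                for (int a = 0; a < 4; ++a) {
                    const int nx = x + dx[a];
                    const int ny = y + dy[a];
                    const bool inside =
                        nx >= 0 && nx < w.width && ny >= 0 && ny < w.height;
                    // Moves leaving the grid keep the agent in place.
                    const int j = inside ? ny * w.width + nx : i;
                    best = std::max(best, stepReward + discount * values[j]);
                }
                next[i] = best;
            }
            delta += std::fabs(next[i] - values[i]);
        }
    }
    values = next;
    return delta;
}
```

Calling `sweep` repeatedly until the returned delta falls below a small threshold is one common stopping rule; whether the class uses its running delta this way is not stated on this page.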
std::string mic::application::GridworldValueIteration::streamStateActionTable () [private]
Streams the current state of the state-action values.
Definition at line 81 of file GridworldValueIteration.cpp.
References mic::environments::Environment::getEnvironmentHeight(), mic::environments::Environment::getEnvironmentWidth(), grid_env, and state_value_table.
Referenced by initializePropertyDependentVariables(), and performSingleStep().
mic::configuration::Property<float> mic::application::GridworldValueIteration::discount_rate [private]
Property: future discount factor (should be in range 0.0-1.0).
Definition at line 92 of file GridworldValueIteration.hpp.
Referenced by computeQValueFromValues(), and GridworldValueIteration().
mic::environments::Gridworld mic::application::GridworldValueIteration::grid_env [private]
The gridworld object.
Definition at line 79 of file GridworldValueIteration.hpp.
Referenced by computeBestValue(), computeQValueFromValues(), initializePropertyDependentVariables(), performSingleStep(), and streamStateActionTable().
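mic::configuration::Property<float> mic::application::GridworldValueIteration::move_noise [private]
Property: move noise, determining how often an action results in a move in an unintended direction.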
Definition at line 97 of file GridworldValueIteration.hpp.
Referenced by computeQValueFromValues(), and GridworldValueIteration().
float mic::application::GridworldValueIteration::running_delta [private]
Running delta being the sum of increments of the value table.
Definition at line 105 of file GridworldValueIteration.hpp.
Referenced by initializePropertyDependentVariables(), and performSingleStep().
mic::types::MatrixXf mic::application::GridworldValueIteration::state_value_table [private]
Matrix storing values for all states (gridworld w * h). ROW MAJOR(!).
Definition at line 82 of file GridworldValueIteration.hpp.
Referenced by computeQValueFromValues(), initializePropertyDependentVariables(), performSingleStep(), and streamStateActionTable().
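mic::configuration::Property<std::string> mic::application::GridworldValueIteration::statistics_filename [private]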
Property: name of the file to which the statistics will be exported.
Definition at line 100 of file GridworldValueIteration.hpp.
Referenced by GridworldValueIteration().
mic::configuration::Property<float> mic::application::GridworldValueIteration::step_reward [private]
Property: the "expected intermediate reward", i.e. the reward received for performing each step (typically negative, but can be positive as well).
Definition at line 87 of file GridworldValueIteration.hpp.
Referenced by computeQValueFromValues(), and GridworldValueIteration().