MachineIntelligenceCore:ReinforcementLearning
mic::application::GridworldQLearning Class Reference

Class responsible for solving the gridworld problem with Q-learning.

#include <GridworldQLearning.hpp>

Inheritance diagram for mic::application::GridworldQLearning:
Collaboration diagram for mic::application::GridworldQLearning:

Public Member Functions

 GridworldQLearning (std::string node_name_="application")
 
virtual ~GridworldQLearning ()
 

Protected Member Functions

virtual void initialize (int argc, char *argv[])
 
virtual void initializePropertyDependentVariables ()
 
virtual bool performSingleStep ()
 
virtual void startNewEpisode ()
 
virtual void finishCurrentEpisode ()
 

Private Member Functions

std::string streamQStateTable ()
 
float computeBestValue (mic::types::Position2D pos_)
 
mic::types::NESWAction selectBestAction (mic::types::Position2D pos_)
 

Private Attributes

WindowCollectorChart< float > * w_chart
 Window for displaying ???.

mic::utils::DataCollectorPtr< std::string, float > collector_ptr
 Data collector.

mic::environments::Gridworld grid_env
 The gridworld object.

mic::types::TensorXf qstate_table
 Tensor storing the values of all state-action pairs (gridworld width * height * 4 actions). Column-major layout.

mic::configuration::Property< float > step_reward

mic::configuration::Property< float > discount_rate

mic::configuration::Property< float > learning_rate

mic::configuration::Property< float > move_noise

mic::configuration::Property< double > epsilon

mic::configuration::Property< std::string > statistics_filename
 Property: name of the file to which the statistics will be exported.

long long sum_of_iterations

long long sum_of_rewards
 

Detailed Description

Class responsible for solving the gridworld problem with Q-learning.

Author
tkornuta

Definition at line 44 of file GridworldQLearning.hpp.

Constructor & Destructor Documentation

mic::application::GridworldQLearning::GridworldQLearning ( std::string  node_name_ = "application")

Default Constructor. Sets the application/node name, default values of variables, initializes classifier etc.

Parameters
node_name_  Name of the application/node (in configuration file).

Definition at line 41 of file GridworldQLearning.cpp.

References discount_rate, epsilon, learning_rate, move_noise, statistics_filename, and step_reward.

mic::application::GridworldQLearning::~GridworldQLearning ( )
virtual

Destructor.

Definition at line 62 of file GridworldQLearning.cpp.

References w_chart.

Member Function Documentation

float mic::application::GridworldQLearning::computeBestValue ( mic::types::Position2D  pos_)
private

Calculates the best value for the given state by finding the action with the maximal expected value.

Parameters
pos_  Starting state (position).
Returns
Value for given state.

Definition at line 181 of file GridworldQLearning.cpp.

References grid_env, mic::environments::Environment::isActionAllowed(), mic::environments::Gridworld::isStateAllowed(), and qstate_table.

Referenced by performSingleStep().

void mic::application::GridworldQLearning::finishCurrentEpisode ( )
protected virtual

Method called when the current episode ends (goal: export the collected statistics to a file, etc.). Declared abstract in the base class and overridden here.

Definition at line 112 of file GridworldQLearning.cpp.

References collector_ptr, mic::environments::Gridworld::getAgentPosition(), mic::environments::Gridworld::getStateReward(), grid_env, statistics_filename, sum_of_iterations, and sum_of_rewards.

void mic::application::GridworldQLearning::initialize ( int  argc,
char *  argv[] 
)
protected virtual

Method initializes GLUT and OpenGL windows.

Parameters
argc  Number of application parameters.
argv  Array of application parameters.

Definition at line 67 of file GridworldQLearning.cpp.

References collector_ptr, sum_of_iterations, sum_of_rewards, and w_chart.

void mic::application::GridworldQLearning::initializePropertyDependentVariables ( )
protected virtual

mic::types::NESWAction mic::application::GridworldQLearning::selectBestAction ( mic::types::Position2D  pos_)
private

Finds the best action.

Returns
The best action found.

Definition at line 206 of file GridworldQLearning.cpp.

References grid_env, mic::environments::Environment::isActionAllowed(), and qstate_table.

Referenced by performSingleStep().
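
Going by their descriptions, computeBestValue() and selectBestAction() both scan the four action values of one state and differ only in what they return. The sketch below illustrates that idea under stated assumptions: ActionValues, ActionMask, bestActionIndex and bestValue are hypothetical names, the mask stands in for Environment::isActionAllowed(), and the handling of a state with no allowed action is not taken from the source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins: the four Q-values of one state and a mask that
// plays the role of Environment::isActionAllowed() (assumed semantics).
using ActionValues = std::array<float, 4>;
using ActionMask = std::array<bool, 4>;

// Index of the allowed action with the highest value, or -1 if none is allowed.
// Ties are resolved in favour of the lowest index.
int bestActionIndex(const ActionValues& q, const ActionMask& allowed) {
    int best = -1;
    for (std::size_t a = 0; a < q.size(); ++a) {
        if (!allowed[a]) continue;
        if (best < 0 || q[a] > q[static_cast<std::size_t>(best)]) {
            best = static_cast<int>(a);
        }
    }
    return best;
}

// Value of the best allowed action. Precondition (an assumption of this
// sketch): at least one action is allowed in the state.
float bestValue(const ActionValues& q, const ActionMask& allowed) {
    const int a = bestActionIndex(q, allowed);
    assert(a >= 0 && "bestValue needs at least one allowed action");
    return q[static_cast<std::size_t>(a)];
}
```

Sharing the scan keeps the value and the action consistent with each other; whether the actual class shares code this way is not stated in this reference.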

void mic::application::GridworldQLearning::startNewEpisode ( )
protected virtual

Method called at the beginning of a new episode (goal: reset the statistics, etc.). Declared abstract in the base class and overridden here.

Definition at line 100 of file GridworldQLearning.cpp.

References mic::environments::Gridworld::environmentToString(), grid_env, mic::environments::Gridworld::initializeEnvironment(), and streamQStateTable().

std::string mic::application::GridworldQLearning::streamQStateTable ( )
private

Member Data Documentation

mic::utils::DataCollectorPtr<std::string, float> mic::application::GridworldQLearning::collector_ptr
private

Data collector.

Definition at line 93 of file GridworldQLearning.hpp.

Referenced by finishCurrentEpisode(), and initialize().

mic::configuration::Property<float> mic::application::GridworldQLearning::discount_rate
private

Property: future discount (should be in range 0.0-1.0).

Definition at line 109 of file GridworldQLearning.hpp.

Referenced by GridworldQLearning(), and performSingleStep().

mic::configuration::Property<double> mic::application::GridworldQLearning::epsilon
private

Property: epsilon used in action selection, i.e. the probability threshold "below" which a random action is selected. If epsilon < 0, it is set to 1/episode and therefore changes dynamically with the episode number.

Definition at line 125 of file GridworldQLearning.hpp.

Referenced by GridworldQLearning(), and performSingleStep().
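
The epsilon rule described above can be sketched as follows. This is an illustration, not the code of performSingleStep(): effectiveEpsilon and selectAction are hypothetical names, the episode counter is assumed to start at 1, and actions are represented as plain integers 0..3 instead of mic::types::NESWAction.

```cpp
#include <cassert>
#include <random>

// Effective exploration probability. A negative configured epsilon means
// "use 1/episode", as the property description states. `episode` is assumed
// to be 1-based so that the division is always defined.
double effectiveEpsilon(double epsilon, long episode) {
    assert(episode >= 1);
    return epsilon < 0.0 ? 1.0 / static_cast<double>(episode) : epsilon;
}

// Epsilon-greedy choice among four actions (0..3). `greedy_action` is the
// action the greedy policy would pick; with probability `eps` it is replaced
// by a uniformly random action.
template <typename Rng>
int selectAction(int greedy_action, double eps, Rng& rng) {
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    if (coin(rng) < eps) {
        std::uniform_int_distribution<int> any_action(0, 3);
        return any_action(rng);
    }
    return greedy_action;
}
```

With a negative epsilon the agent explores on every step of episode 1 and only about 1% of the time by episode 100.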

mic::environments::Gridworld mic::application::GridworldQLearning::grid_env
private

mic::configuration::Property<float> mic::application::GridworldQLearning::learning_rate
private

Property: learning rate (should be in range 0.0-1.0).

Definition at line 114 of file GridworldQLearning.hpp.

Referenced by GridworldQLearning(), and performSingleStep().
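
For orientation, the textbook tabular Q-learning update combines learning_rate and discount_rate as Q(s,a) += learning_rate * (reward + discount_rate * max Q(s',a') - Q(s,a)). The sketch below implements that standard rule; it is assumed, not confirmed by this reference, that performSingleStep() applies it in this form. The function name qLearningUpdate and its parameters are hypothetical.

```cpp
#include <cassert>

// One tabular Q-learning update for a single state-action pair.
//   q_sa            current estimate Q(s, a)
//   reward          reward received for the step
//   best_next_value max over a' of Q(s', a'), e.g. what computeBestValue()
//                   is described as returning for the next state
// Returns the new estimate for Q(s, a).
float qLearningUpdate(float q_sa, float reward, float best_next_value,
                      float learning_rate, float discount_rate) {
    // Both properties are documented as lying in the range 0.0-1.0.
    assert(learning_rate >= 0.0f && learning_rate <= 1.0f);
    assert(discount_rate >= 0.0f && discount_rate <= 1.0f);
    const float target = reward + discount_rate * best_next_value;
    return q_sa + learning_rate * (target - q_sa);
}
```

A learning rate of 0 leaves the estimate unchanged, while a learning rate of 1 replaces it with the new target.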

mic::configuration::Property<float> mic::application::GridworldQLearning::move_noise
private

Property: move noise, determining how often an action results in a move in an unintended direction.

Definition at line 119 of file GridworldQLearning.hpp.

Referenced by GridworldQLearning().
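
One way such noise could be modeled is shown below. The slip model is purely an assumption made for illustration: with probability move_noise the intended action is swapped for one of the other three, chosen uniformly. This reference does not say how the environment actually picks the unintended direction, and applyMoveNoise is a hypothetical name.

```cpp
#include <cassert>
#include <random>

// Returns the action that is actually executed (0..3). With probability
// `move_noise` the intended action is replaced by a different one; the
// uniform choice among the other three actions is an assumed slip model.
template <typename Rng>
int applyMoveNoise(int intended, double move_noise, Rng& rng) {
    assert(intended >= 0 && intended < 4);
    std::uniform_real_distribution<double> coin(0.0, 1.0);
    if (coin(rng) >= move_noise) {
        return intended;  // no slip
    }
    std::uniform_int_distribution<int> shift(1, 3);
    return (intended + shift(rng)) % 4;  // always differs from `intended`
}
```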

mic::types::TensorXf mic::application::GridworldQLearning::qstate_table
private

Tensor storing the values of all state-action pairs (gridworld width * height * 4 actions). Column-major layout.

Definition at line 99 of file GridworldQLearning.hpp.

Referenced by computeBestValue(), initializePropertyDependentVariables(), performSingleStep(), selectBestAction(), and streamQStateTable().
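
To make the column-major note concrete, the sketch below maps an (x, y, action) triple to an offset in a flat buffer of width * height * 4 floats. FlatQTable is a hypothetical stand-in and does not use mic::types::TensorXf; the (x, y, action) index order is an assumption read off the "w * h * 4" description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flat stand-in for the Q-table, with dimensions
// width x height x 4 (one slot per action).
struct FlatQTable {
    std::size_t width;
    std::size_t height;
    std::vector<float> data;

    FlatQTable(std::size_t w, std::size_t h)
        : width(w), height(h), data(w * h * 4, 0.0f) {}

    // Column-major offset: the first index (x) varies fastest and the last
    // index (action) varies slowest.
    std::size_t offset(std::size_t x, std::size_t y, std::size_t action) const {
        assert(x < width && y < height && action < 4);
        return x + width * (y + height * action);
    }

    float& at(std::size_t x, std::size_t y, std::size_t action) {
        return data[offset(x, y, action)];
    }
};
```

In this layout the four values of one state are not adjacent in memory: they sit width * height elements apart, which matters when iterating over the actions of a state.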

mic::configuration::Property<std::string> mic::application::GridworldQLearning::statistics_filename
private

Property: name of the file to which the statistics will be exported.

Definition at line 128 of file GridworldQLearning.hpp.

Referenced by finishCurrentEpisode(), and GridworldQLearning().

mic::configuration::Property<float> mic::application::GridworldQLearning::step_reward
private

Property: the "expected intermediate reward", i.e. the reward received for performing each step (typically negative, but can be positive as well).

Definition at line 104 of file GridworldQLearning.hpp.

Referenced by GridworldQLearning(), and performSingleStep().

long long mic::application::GridworldQLearning::sum_of_iterations
private

Sum of all iterations made till now - used in statistics.

Definition at line 155 of file GridworldQLearning.hpp.

Referenced by finishCurrentEpisode(), and initialize().

long long mic::application::GridworldQLearning::sum_of_rewards
private

Sum of all rewards collected till now - used in statistics.

Definition at line 160 of file GridworldQLearning.hpp.

Referenced by finishCurrentEpisode(), and initialize().

WindowCollectorChart<float>* mic::application::GridworldQLearning::w_chart
private

Window for displaying ???.

Definition at line 90 of file GridworldQLearning.hpp.

Referenced by initialize(), and ~GridworldQLearning().


The documentation for this class was generated from the following files: