MachineIntelligenceCore:ReinforcementLearning
Class responsible for solving the gridworld problem with Q-learning.
#include <GridworldQLearning.hpp>
Public Member Functions

GridworldQLearning (std::string node_name_ = "application")
virtual ~GridworldQLearning ()

Protected Member Functions

virtual void initialize (int argc, char *argv[])
virtual void initializePropertyDependentVariables ()
virtual bool performSingleStep ()
virtual void startNewEpisode ()
virtual void finishCurrentEpisode ()

Private Member Functions

std::string streamQStateTable ()
float computeBestValue (mic::types::Position2D pos_)
mic::types::NESWAction selectBestAction (mic::types::Position2D pos_)

Private Attributes
WindowCollectorChart< float > * w_chart
    Window for displaying the chart of collected statistics.
mic::utils::DataCollectorPtr< std::string, float > collector_ptr
    Data collector.
mic::environments::Gridworld grid_env
    The gridworld object.
mic::types::TensorXf qstate_table
    Tensor storing the values of all state-action pairs (gridworld width * height * 4 actions). Column-major(!).
mic::configuration::Property< float > step_reward
mic::configuration::Property< float > discount_rate
mic::configuration::Property< float > learning_rate
mic::configuration::Property< float > move_noise
mic::configuration::Property< double > epsilon
mic::configuration::Property< std::string > statistics_filename
    Property: name of the file to which the statistics will be exported.
long long sum_of_iterations
long long sum_of_rewards
Class responsible for solving the gridworld problem with Q-learning.
Definition at line 44 of file GridworldQLearning.hpp.
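For orientation, the value update performed by tabular Q-learning (the algorithm this application implements; the exact variant used here should be verified against performSingleStep() in GridworldQLearning.cpp) can be written as:

    Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]

Here \alpha corresponds to the learning_rate property, \gamma to the discount_rate property, r_{t+1} to the received reward (cf. step_reward and getStateReward()), and the max term to the value returned by computeBestValue().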
mic::application::GridworldQLearning::GridworldQLearning ( std::string node_name_ = "application" )

Default constructor. Sets the application/node name, the default values of variables, registers the configuration properties, etc.

Parameters:
    node_name_  Name of the application/node (in the configuration file).
Definition at line 41 of file GridworldQLearning.cpp.
References discount_rate, epsilon, learning_rate, move_noise, statistics_filename, and step_reward.
mic::application::GridworldQLearning::~GridworldQLearning ( )  [virtual]
float mic::application::GridworldQLearning::computeBestValue ( mic::types::Position2D pos_ )  [private]
Calculates the best value for a given state by finding the action with the maximal expected value.

Parameters:
    pos_  Starting state (position).
Definition at line 181 of file GridworldQLearning.cpp.
References grid_env, mic::environments::Environment::isActionAllowed(), mic::environments::Gridworld::isStateAllowed(), and qstate_table.
Referenced by performSingleStep().
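A minimal sketch of the idea behind computeBestValue(), assuming the Q-values of one state's four NESW actions are available in a small vector and that an isActionAllowed()-style predicate is supplied; the names bestValue and allowed are hypothetical and not part of the library:

    // Sketch: the best value of a state is the maximum Q-value over allowed actions.
    #include <algorithm>
    #include <functional>
    #include <limits>
    #include <vector>

    float bestValue(const std::vector<float>& q,                // Q-values of one state, 4 actions
                    const std::function<bool(int)>& allowed) {  // stands in for isActionAllowed()
        float best = -std::numeric_limits<float>::infinity();
        for (int a = 0; a < 4; ++a)
            if (allowed(a))
                best = std::max(best, q[a]);
        return best;                                            // -inf if no action is allowed
    }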
void mic::application::GridworldQLearning::finishCurrentEpisode ( )  [protected], [virtual]

Method called when the given episode ends (goal: export the collected statistics to a file etc.). Overrides the abstract base-class method.
Definition at line 112 of file GridworldQLearning.cpp.
References collector_ptr, mic::environments::Gridworld::getAgentPosition(), mic::environments::Gridworld::getStateReward(), grid_env, statistics_filename, sum_of_iterations, and sum_of_rewards.
void mic::application::GridworldQLearning::initialize ( int argc, char *argv[] )  [protected], [virtual]

Method initializes GLUT and OpenGL windows.

Parameters:
    argc  Number of application parameters.
    argv  Array of application parameters.
Definition at line 67 of file GridworldQLearning.cpp.
References collector_ptr, sum_of_iterations, sum_of_rewards, and w_chart.
void mic::application::GridworldQLearning::initializePropertyDependentVariables ( )  [protected], [virtual]
Initializes all variables that are property-dependent.
Definition at line 87 of file GridworldQLearning.cpp.
References mic::environments::Environment::getEnvironmentHeight(), mic::environments::Environment::getEnvironmentWidth(), grid_env, mic::environments::Gridworld::initializeEnvironment(), qstate_table, and streamQStateTable().
bool mic::application::GridworldQLearning::performSingleStep ( )  [protected], [virtual]

Performs a single step of computations.
Definition at line 236 of file GridworldQLearning.cpp.
References computeBestValue(), discount_rate, mic::environments::Gridworld::environmentToString(), epsilon, mic::environments::Gridworld::getAgentPosition(), mic::environments::Gridworld::getStateReward(), grid_env, mic::environments::Gridworld::isStateTerminal(), learning_rate, mic::environments::Environment::moveAgent(), qstate_table, selectBestAction(), step_reward, and streamQStateTable().
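A hedged sketch of what a single tabular Q-learning step typically looks like on a width x height x 4 table stored column-major; all names below (Step, qLearningStep, idx) are hypothetical, and the actual control flow of performSingleStep() (epsilon-greedy selection, moveAgent(), terminal-state handling) is defined in GridworldQLearning.cpp:

    // Sketch: temporal-difference update for one executed transition (x,y,a) -> (nx,ny).
    #include <algorithm>
    #include <vector>

    struct Step { int x, y, a, nx, ny; float reward; };

    void qLearningStep(std::vector<float>& q, int w, int h, const Step& s,
                       float learning_rate, float discount_rate) {
        auto idx = [&](int x, int y, int a) { return x + w * y + w * h * a; };  // column-major
        // Best value achievable from the successor state (cf. computeBestValue()).
        float best_next = q[idx(s.nx, s.ny, 0)];
        for (int a = 1; a < 4; ++a)
            best_next = std::max(best_next, q[idx(s.nx, s.ny, a)]);
        // Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        float& qsa = q[idx(s.x, s.y, s.a)];
        qsa += learning_rate * (s.reward + discount_rate * best_next - qsa);
    }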
mic::types::NESWAction mic::application::GridworldQLearning::selectBestAction ( mic::types::Position2D pos_ )  [private]

Finds the best action for a given state (position).
Definition at line 206 of file GridworldQLearning.cpp.
References grid_env, mic::environments::Environment::isActionAllowed(), and qstate_table.
Referenced by performSingleStep().
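Analogously to the sketch under computeBestValue(), selecting the best action amounts to an argmax over the allowed actions; the ordering 0..3 = N, E, S, W and the names below are assumptions for illustration only:

    // Sketch: argmax over allowed actions; returns -1 if no action is allowed.
    #include <functional>
    #include <limits>
    #include <vector>

    int bestAction(const std::vector<float>& q,
                   const std::function<bool(int)>& allowed) {
        int best_a = -1;
        float best_q = -std::numeric_limits<float>::infinity();
        for (int a = 0; a < 4; ++a) {
            if (allowed(a) && q[a] > best_q) {
                best_q = q[a];
                best_a = a;
            }
        }
        return best_a;
    }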
void mic::application::GridworldQLearning::startNewEpisode ( )  [protected], [virtual]

Method called at the beginning of a new episode (goal: reset the statistics etc.). Overrides the abstract base-class method.
Definition at line 100 of file GridworldQLearning.cpp.
References mic::environments::Gridworld::environmentToString(), grid_env, mic::environments::Gridworld::initializeEnvironment(), and streamQStateTable().
std::string mic::application::GridworldQLearning::streamQStateTable ( )  [private]

Streams the current state of the state-action value table.
Definition at line 132 of file GridworldQLearning.cpp.
References mic::environments::Environment::getEnvironmentHeight(), mic::environments::Environment::getEnvironmentWidth(), grid_env, mic::environments::Environment::isActionAllowed(), mic::environments::Gridworld::isStateAllowed(), mic::environments::Gridworld::isStateTerminal(), and qstate_table.
Referenced by initializePropertyDependentVariables(), performSingleStep(), and startNewEpisode().
mic::utils::DataCollectorPtr< std::string, float > mic::application::GridworldQLearning::collector_ptr  [private]
Data collector.
Definition at line 93 of file GridworldQLearning.hpp.
Referenced by finishCurrentEpisode(), and initialize().
mic::configuration::Property< float > mic::application::GridworldQLearning::discount_rate  [private]

Property: future discount factor (should be in the range 0.0-1.0).
Definition at line 109 of file GridworldQLearning.hpp.
Referenced by GridworldQLearning(), and performSingleStep().
mic::configuration::Property< double > mic::application::GridworldQLearning::epsilon  [private]

Property: variable denoting epsilon in action selection (the probability "below" which a random action will be selected). If epsilon < 0, it will be set to 1/episode, hence changing dynamically depending on the episode number.
Definition at line 125 of file GridworldQLearning.hpp.
Referenced by GridworldQLearning(), and performSingleStep().
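A small sketch of the epsilon-greedy selection described above, including the dynamic case epsilon < 0 (interpreted as 1/episode); the helper name chooseAction() and its parameters are hypothetical:

    // Sketch: with probability eps pick a random action, otherwise the greedy one.
    #include <algorithm>
    #include <random>

    int chooseAction(double epsilon, long episode, int greedy_action, std::mt19937& rng) {
        // epsilon < 0 means: decay exploration as 1/episode (episode assumed >= 1).
        const double eps = (epsilon < 0.0) ? 1.0 / std::max<long>(episode, 1) : epsilon;
        std::uniform_real_distribution<double> coin(0.0, 1.0);
        if (coin(rng) < eps) {
            std::uniform_int_distribution<int> random_action(0, 3);
            return random_action(rng);   // explore
        }
        return greedy_action;            // exploit (cf. selectBestAction())
    }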
mic::environments::Gridworld mic::application::GridworldQLearning::grid_env  [private]
The gridworld object.
Definition at line 96 of file GridworldQLearning.hpp.
Referenced by computeBestValue(), finishCurrentEpisode(), initializePropertyDependentVariables(), performSingleStep(), selectBestAction(), startNewEpisode(), and streamQStateTable().
mic::configuration::Property< float > mic::application::GridworldQLearning::learning_rate  [private]

Property: learning rate (should be in the range 0.0-1.0).
Definition at line 114 of file GridworldQLearning.hpp.
Referenced by GridworldQLearning(), and performSingleStep().
mic::configuration::Property< float > mic::application::GridworldQLearning::move_noise  [private]

Property: move noise, determining how often an action results in an unintended direction.
Definition at line 119 of file GridworldQLearning.hpp.
Referenced by GridworldQLearning().
mic::types::TensorXf mic::application::GridworldQLearning::qstate_table  [private]

Tensor storing the values of all state-action pairs (gridworld width * height * 4 actions). Column-major(!).
Definition at line 99 of file GridworldQLearning.hpp.
Referenced by computeBestValue(), initializePropertyDependentVariables(), performSingleStep(), selectBestAction(), and streamQStateTable().
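Assuming the usual column-major convention for a width x height x 4 tensor (the x coordinate varies fastest), the linear offset of an entry would look like the sketch below; verify against how qstate_table is actually indexed in GridworldQLearning.cpp:

    // Sketch of column-major addressing for a (w x h x 4) Q-value tensor.
    inline long qIndex(long x, long y, long action, long w, long h) {
        return x + w * y + (w * h) * action;
    }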
mic::configuration::Property< std::string > mic::application::GridworldQLearning::statistics_filename  [private]
Property: name of the file to which the statistics will be exported.
Definition at line 128 of file GridworldQLearning.hpp.
Referenced by finishCurrentEpisode(), and GridworldQLearning().
mic::configuration::Property< float > mic::application::GridworldQLearning::step_reward  [private]

Property: the "expected intermediate reward", i.e. the reward received for performing each step (typically negative, but it can be positive as well).
Definition at line 104 of file GridworldQLearning.hpp.
Referenced by GridworldQLearning(), and performSingleStep().
long long mic::application::GridworldQLearning::sum_of_iterations  [private]

Sum of all iterations made so far; used in statistics.
Definition at line 155 of file GridworldQLearning.hpp.
Referenced by finishCurrentEpisode(), and initialize().
long long mic::application::GridworldQLearning::sum_of_rewards  [private]

Sum of all rewards collected so far; used in statistics.
Definition at line 160 of file GridworldQLearning.hpp.
Referenced by finishCurrentEpisode(), and initialize().
WindowCollectorChart< float > * mic::application::GridworldQLearning::w_chart  [private]

Window for displaying the chart of collected statistics.
Definition at line 90 of file GridworldQLearning.hpp.
Referenced by initialize(), and ~GridworldQLearning().