MachineIntelligenceCore:ReinforcementLearning
mic::application::nArmedBanditsUnlimitedHistory Class Reference

Class solving the n-armed bandits problem with action selection based on unlimited history (all action-value pairs are stored).

#include <nArmedBanditsUnlimitedHistory.hpp>


Public Member Functions

 nArmedBanditsUnlimitedHistory (std::string node_name_="application")
 
virtual ~nArmedBanditsUnlimitedHistory ()
 

Protected Member Functions

virtual void initializePropertyDependentVariables ()
 
virtual void initialize (int argc, char *argv[])
 
virtual bool performSingleStep ()
 

Private Member Functions

short calculateReward (float prob_)
 
size_t selectBestArm ()
 

Private Attributes

WindowCollectorChart< float > * w_reward
 Window for displaying average reward.
 
mic::utils::DataCollectorPtr< std::string, float > reward_collector_ptr
 Reward collector.
 
mic::types::VectorXf arms
 n Bandit arms.
 
std::vector< std::pair< size_t, size_t > > action_values
 Action values - pairs of <arm_number, reward>.
 
mic::configuration::Property< size_t > number_of_bandits
 Property: number of bandits.
 
mic::configuration::Property< double > epsilon
 Property: variable denoting epsilon in action selection (the probability "below" which a random action will be selected).
 
mic::configuration::Property< std::string > statistics_filename
 Property: name of the file to which the statistics will be exported.
 
size_t best_arm
 
float best_arm_prob
 

Detailed Description

Class solving the n-armed bandits problem with action selection based on unlimited history (all action-value pairs are stored).

Author
tkornuta

Definition at line 41 of file nArmedBanditsUnlimitedHistory.hpp.

Constructor & Destructor Documentation

mic::application::nArmedBanditsUnlimitedHistory::nArmedBanditsUnlimitedHistory ( std::string  node_name_ = "application")

Default constructor. Sets the application/node name and the default values of variables, initializes the classifier, etc.

Parameters
node_name_    Name of the application/node (in configuration file).

Definition at line 38 of file nArmedBanditsUnlimitedHistory.cpp.

References epsilon, number_of_bandits, and statistics_filename.

mic::application::nArmedBanditsUnlimitedHistory::~nArmedBanditsUnlimitedHistory ( )
virtual

Destructor.

Definition at line 53 of file nArmedBanditsUnlimitedHistory.cpp.

References w_reward.

Member Function Documentation

short mic::application::nArmedBanditsUnlimitedHistory::calculateReward ( float  prob_)
private

Calculates the reward.

Parameters
prob_    Probability.

Definition at line 96 of file nArmedBanditsUnlimitedHistory.cpp.

References number_of_bandits.

Referenced by performSingleStep().
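
The page does not show how the reward is computed. A minimal sketch of one common formulation, assuming the reward is the number of successes in a fixed number of Bernoulli trials with success probability prob_ (the reference to number_of_bandits above suggests, but does not confirm, that it is used as the trial count). The function name calculateRewardSketch and the explicit rng parameter are hypothetical and not part of the class:

```cpp
#include <cstddef>
#include <random>

// Hypothetical stand-in for calculateReward(); NOT the library's code.
// Assumption: the reward counts how many of `number_of_trials` uniform
// draws from [0, 1) fall below `prob`.
short calculateRewardSketch(float prob, std::size_t number_of_trials,
                            std::mt19937& rng) {
  std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
  short reward = 0;
  for (std::size_t i = 0; i < number_of_trials; ++i) {
    if (uniform(rng) < prob) {
      ++reward;  // one successful trial adds one unit of reward
    }
  }
  return reward;
}
```

Under this assumption the reward always lies between 0 and the number of trials, and an arm with a higher probability yields a higher reward on average.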

void mic::application::nArmedBanditsUnlimitedHistory::initialize ( int  argc,
char *  argv[] 
)
protected virtual

Method initializes GLUT and OpenGL windows.

Parameters
argc    Number of application parameters.
argv    Array of application parameters.

Definition at line 58 of file nArmedBanditsUnlimitedHistory.cpp.

References reward_collector_ptr, and w_reward.

void mic::application::nArmedBanditsUnlimitedHistory::initializePropertyDependentVariables ( )
protected virtual

Initializes all variables that are property-dependent.

Definition at line 74 of file nArmedBanditsUnlimitedHistory.cpp.

References action_values, arms, best_arm, best_arm_prob, and number_of_bandits.

bool mic::application::nArmedBanditsUnlimitedHistory::performSingleStep ( )
protected virtual

size_t mic::application::nArmedBanditsUnlimitedHistory::selectBestArm ( )
private

Greedy method that selects best arm based on historical action-value pairs.

Definition at line 106 of file nArmedBanditsUnlimitedHistory.cpp.

References action_values, and number_of_bandits.

Referenced by performSingleStep().
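
A greedy selection over the stored history can be sketched as follows. This is an illustration under stated assumptions, not the code from nArmedBanditsUnlimitedHistory.cpp: it assumes "best" means the highest mean reward per arm, that ties go to the lowest arm index, and that arm 0 is returned for an empty history. The free function selectBestArmSketch and its parameters are hypothetical; only the <arm_number, reward> pair layout comes from this page.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical free-function version of a greedy arm selection.
// `action_values` holds <arm_number, reward> pairs, one per past pull.
std::size_t selectBestArmSketch(
    const std::vector<std::pair<std::size_t, std::size_t>>& action_values,
    std::size_t number_of_bandits) {
  std::vector<double> sums(number_of_bandits, 0.0);
  std::vector<std::size_t> counts(number_of_bandits, 0);

  // Accumulate the total reward and the number of pulls for every arm.
  for (const auto& av : action_values) {
    if (av.first >= number_of_bandits) continue;  // skip out-of-range arms
    sums[av.first] += static_cast<double>(av.second);
    ++counts[av.first];
  }

  // Pick the arm with the highest mean reward; arms never pulled are skipped.
  std::size_t best = 0;
  double best_mean = -1.0;
  for (std::size_t arm = 0; arm < number_of_bandits; ++arm) {
    if (counts[arm] == 0) continue;
    const double mean = sums[arm] / static_cast<double>(counts[arm]);
    if (mean > best_mean) {
      best_mean = mean;
      best = arm;
    }
  }
  return best;
}
```

Because the whole history is rescanned on each call, the cost of one selection grows linearly with the number of stored pairs, which is the trade-off of keeping unlimited history.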

Member Data Documentation

std::vector< std::pair<size_t, size_t> > mic::application::nArmedBanditsUnlimitedHistory::action_values
private

Action values - pairs of <arm_number, reward>.

Definition at line 84 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by initializePropertyDependentVariables(), performSingleStep(), and selectBestArm().

mic::types::VectorXf mic::application::nArmedBanditsUnlimitedHistory::arms
private

n Bandit arms.

Definition at line 81 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by initializePropertyDependentVariables(), and performSingleStep().

size_t mic::application::nArmedBanditsUnlimitedHistory::best_arm
private

The best arm (hidden state).

Definition at line 98 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by initializePropertyDependentVariables(), and performSingleStep().

float mic::application::nArmedBanditsUnlimitedHistory::best_arm_prob
private

The best arm probability/"reward" (hidden state).

Definition at line 103 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by initializePropertyDependentVariables(), and performSingleStep().

mic::configuration::Property<double> mic::application::nArmedBanditsUnlimitedHistory::epsilon
private

Property: variable denoting epsilon in action selection (the probability "below" which a random action will be selected).

Definition at line 90 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by nArmedBanditsUnlimitedHistory(), and performSingleStep().
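
The role of epsilon described above matches the usual epsilon-greedy rule, which can be sketched as below. The function chooseArmSketch, its parameters, and the use of std::mt19937 are assumptions made for illustration; the page does not show how performSingleStep() actually draws its random numbers.

```cpp
#include <cstddef>
#include <random>

// Hypothetical epsilon-greedy choice; NOT the library's code.
// With probability `epsilon` a uniformly random arm is explored,
// otherwise the current greedy choice `greedy_arm` is exploited.
// Assumes number_of_bandits > 0.
std::size_t chooseArmSketch(double epsilon, std::size_t greedy_arm,
                            std::size_t number_of_bandits,
                            std::mt19937& rng) {
  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  if (uniform(rng) < epsilon) {
    // Draw fell "below" epsilon: select a random arm.
    std::uniform_int_distribution<std::size_t> any_arm(0, number_of_bandits - 1);
    return any_arm(rng);
  }
  return greedy_arm;
}
```

With epsilon set to 0 the choice is purely greedy; larger values trade exploitation for exploration.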

mic::configuration::Property<size_t> mic::application::nArmedBanditsUnlimitedHistory::number_of_bandits
private

Property: number of bandits.

mic::utils::DataCollectorPtr<std::string, float> mic::application::nArmedBanditsUnlimitedHistory::reward_collector_ptr
private

Reward collector.

Definition at line 78 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by initialize(), and performSingleStep().

mic::configuration::Property<std::string> mic::application::nArmedBanditsUnlimitedHistory::statistics_filename
private

Property: name of the file to which the statistics will be exported.

Definition at line 93 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by nArmedBanditsUnlimitedHistory(), and performSingleStep().

WindowCollectorChart<float>* mic::application::nArmedBanditsUnlimitedHistory::w_reward
private

Window for displaying average reward.

Definition at line 75 of file nArmedBanditsUnlimitedHistory.hpp.

Referenced by initialize(), and ~nArmedBanditsUnlimitedHistory().


The documentation for this class was generated from the following files: