 mic | |
  algorithms | |
   MazeHistogramFilter | Class implementing a histogram-filter-based solution of the maze-of-digits localization problem |
  application | |
   EpisodicHistogramFilterMazeLocalization | Application for episodic testing of the convergence of histogram-filter-based maze-of-digits localization |
   GridworldDeepQLearning | Class responsible for solving the gridworld problem with Q-learning and (not so deep) neural networks |
   GridworldDRLExperienceReplay | Class responsible for solving the gridworld problem with Q-learning, using a neural network to approximate the rewards and experience replay for (batch) training of that network |
   GridworldDRLExperienceReplayPOMDP | Class responsible for solving the gridworld problem with Q-learning, using a neural network to approximate the rewards and experience replay for (batch) training of that network. In this case the agent is assumed to observe only part of the environment (POMDP) |
   GridworldQLearning | Class responsible for solving the gridworld problem with Q-learning |
   GridworldValueIteration | Class responsible for solving the gridworld problem by applying the reinforcement learning value iteration method |
   HistogramFilterMazeLocalization | Class implementing a histogram-filter-based solution of the maze-of-digits problem |
   MazeOfDigitsDLRERPOMPD | Application of Partially Observable Deep Q-learning with Experience Replay to the maze of digits problem. The agent is assumed to observe only part of the environment (POMDP) |
   MNISTDigitDLRERPOMDP | Application of Partially Observable Deep Q-learning with Experience Replay to the MNIST digits problem. The agent is assumed to observe only part of the environment - a patch of the whole image (POMDP) |
   TestApp | Class solving the n-armed bandits problem using a simple Q-learning rule |
   nArmedBanditsSofmax | Class solving the n-armed bandits problem using Softmax Action Selection |
   nArmedBanditsUnlimitedHistory | Class solving the n-armed bandits problem using action selection based on unlimited history (storing all action-value pairs) |
  environments | |
   Environment | Abstract class representing an environment |
   Gridworld | Class emulating the gridworld environment |
   MazeOfDigits | Class emulating the maze of digits environment |
   MNISTDigit | Class emulating the MNIST digit environment |
  importers | |
   MazeMatrixImporter | |
  types | |
   SpatialExperience | Structure storing a spatial experience - a triplet of position at time t, executed action and position at time t+1 |
   SpatialExperienceMemory | Class representing the spatial experience memory - used in memory replay. Derived from the Batch class |
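
The last two entries describe a (position, action, next position) triplet and a memory that stores such triplets for replay. The following is only a minimal sketch of that idea, not the library's actual declarations: every name in it (`Position`, `Action`, `SpatialExperienceSketch`, `SpatialExperienceMemorySketch`, `add`, `sample`) is hypothetical. The fixed capacity, the overwrite-oldest policy and uniform sampling with replacement are assumptions, and the sketch does not derive from the Batch class mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// All names below are hypothetical, chosen for illustration only.
using Position = std::pair<int, int>;  // (x, y) cell in a grid

enum class Action { North, East, South, West };

// Triplet: position at time t, executed action, position at time t+1.
struct SpatialExperienceSketch {
    Position s_t;
    Action a_t;
    Position s_t_prime;
};

// Assumed policy: fixed capacity, oldest entry overwritten once full.
class SpatialExperienceMemorySketch {
public:
    explicit SpatialExperienceMemorySketch(std::size_t capacity)
        : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Store one experience, replacing the oldest one when the memory is full.
    void add(const SpatialExperienceSketch& e) {
        if (items_.size() < capacity_) {
            items_.push_back(e);
        } else {
            items_[next_] = e;
        }
        next_ = (next_ + 1) % capacity_;
    }

    std::size_t size() const { return items_.size(); }

    // Draw n experiences uniformly at random, with replacement.
    std::vector<SpatialExperienceSketch> sample(std::size_t n,
                                                std::mt19937& rng) const {
        assert(!items_.empty());
        std::uniform_int_distribution<std::size_t> pick(0, items_.size() - 1);
        std::vector<SpatialExperienceSketch> batch;
        batch.reserve(n);
        for (std::size_t i = 0; i < n; ++i) {
            batch.push_back(items_[pick(rng)]);
        }
        return batch;
    }

private:
    std::size_t capacity_;
    std::size_t next_ = 0;  // slot that the next add() writes to
    std::vector<SpatialExperienceSketch> items_;
};
```

In this sketch an agent would call `add` after every step and periodically call `sample` to obtain a training batch; sampling at random rather than replaying experiences in order reduces the correlation between consecutive training examples.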