# Neural Network Heuristic Functions for Classical Planning: Reinforcement Learning and Comparison to Other Methods


## /Code

This project builds upon the code published with the paper
"Neural Network Heuristics for Classical Planning: A Study of
Hyperparameter Space" (Ferber et al. 2020). As a result, it still contains
the scripts from that earlier project. Some of these scripts were adapted
for our reinforcement learning approach; others were reused unchanged
because we used the earlier approach as a baseline and trained some
models with it. The most important locations are:

- **requirements.txt:** Use this file to create a Python 3 virtual
  environment for running the training scripts.
- **fast-deepcube.py:** For a given PDDL instance file, this script
  starts the reinforcement learning process.
- **src/training/learners/keras_networks/:** This directory contains the
  code for constructing and training our networks. Everything else in
  **src/training/** is code from Ferber et al. (2020) that we used to
  train our baselines.
- **src/search/:** This directory contains our modifications of the
  Fast Downward search component (Helmert 2006). Notable files are:
  - **src/search/search_engines/sampling_:** These files contain the
    search engines that we used to generate the training data (states
    and heuristic estimates). *sampling_search.{cc,h}* contains the code
    to generate data for our two bootstrapping approaches.
    *sampling_q.{cc,h}* (which should be named *sampling_v*) performs the
    Bellman estimations for our approximate value iteration approach.
  - **src/search/task_utils/sampling_techinque.{cc,h}:** These files
    contain our sampling techniques. A sampling technique receives a
    planning task and samples a new task (in our case, via random
    walks from either the initial state or the goal).
  - **src/search/neural_networks/:** This directory contains the code
    to evaluate Protobuf networks (trained with Keras/TensorFlow 1.x).
  
  **If you want to reproduce our experiments exactly, use this code.
  Otherwise, I suggest using
  http://github.com/PatrickFerber/neuralfastdownward.
  That repository contains a refactored version of the sampling code as well
  as support for PyTorch networks. Furthermore, unused options have been removed.**

- **misc/reinforcement_learning/experiments/:** This directory contains the
  code (in **srce**) and scripts for the reinforcement learning experiments.
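
To make the roles of a sampling technique and a Bellman estimation more concrete, here is a minimal Python sketch on a hypothetical three-location toy task. It is not the code in **src/search/** (which is C++): the `Action` type, all helper names, and the toy task are assumptions made for illustration. The sketch only covers forward random walks from the initial state (not walks from the goal), and the backup is the textbook one-step Bellman update rather than a transcription of *sampling_q.{cc,h}*.

```python
import random
from typing import Callable, FrozenSet, List, NamedTuple

State = FrozenSet[str]


class Action(NamedTuple):
    """Hypothetical STRIPS-style action (not a Fast Downward type)."""
    name: str
    pre: FrozenSet[str]
    add: FrozenSet[str]
    delete: FrozenSet[str]
    cost: int = 1


def applicable(state: State, actions: List[Action]) -> List[Action]:
    """Return the actions whose preconditions hold in `state`."""
    return [a for a in actions if a.pre <= state]


def apply_action(state: State, action: Action) -> State:
    """Return the successor state after applying `action`."""
    return (state - action.delete) | action.add


def random_walk(start: State, actions: List[Action],
                max_steps: int, rng: random.Random) -> State:
    """Sample a state by taking up to `max_steps` random applicable actions."""
    state = start
    for _ in range(max_steps):
        options = applicable(state, actions)
        if not options:  # dead end: stop the walk early
            break
        state = apply_action(state, rng.choice(options))
    return state


def bellman_backup(state: State, goal: FrozenSet[str], actions: List[Action],
                   h: Callable[[State], float]) -> float:
    """One-step backup: 0 for goal states, else min over cost + h(successor)."""
    if goal <= state:
        return 0.0
    estimates = [a.cost + h(apply_action(state, a))
                 for a in applicable(state, actions)]
    return min(estimates) if estimates else float("inf")


def move(src: str, dst: str) -> Action:
    """Build a unit-cost move action for the toy task."""
    return Action(f"move-{src}-{dst}", frozenset({f"at-{src}"}),
                  frozenset({f"at-{dst}"}), frozenset({f"at-{src}"}))


# Toy task (assumption): locations a - b - c in a line, goal is to be at c.
ACTIONS = [move("a", "b"), move("b", "a"), move("b", "c"), move("c", "b")]
INIT: State = frozenset({"at-a"})
GOAL = frozenset({"at-c"})

rng = random.Random(0)
# Sample states via random walks of random length from the initial state.
samples = [random_walk(INIT, ACTIONS, rng.randint(1, 5), rng) for _ in range(10)]
# Pair each sample with a backed-up estimate, here using the zero heuristic.
targets = [(s, bellman_backup(s, GOAL, ACTIONS, lambda _: 0.0)) for s in samples]
```

In a training loop, pairs like those in `targets` would serve as regression targets, with the zero heuristic replaced by the current network's prediction.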
  

## /Benchmarks

There are two benchmark directories:
1. **training_states/:** This directory contains the IPC tasks for our 10 domains.
   We use these files to train our neural networks.
2. **test_states/:** This directory contains the test tasks for our domains. For
   every domain, we have 50 test states. Where available, we used those published
   by Ferber et al. (2020); otherwise, we generated our own test states using
   their approach. The scripts to generate new test states are also provided.
   
## /Results

This directory contains the output files of Lab (Seipp et al. 2017) and the
scripts to extract data for Table 1 and to generate the plots.

- **properties/:** This directory contains the result files produced by Lab
  for our three approaches as well as for supervised learning
  (Ferber et al. 2020), LAMA (Richter and Westphal 2010), and STRIPS-HGN
  (Shen et al. 2020).
- **plots/:** Directory to store the generated plots (all plots are already
  generated).
- **plot_coverage_x_time.sh:** Generates the plots for Figure 1.
- **plot_boxplot_expansions.sh:** Generates the boxplot for Figure 2.
- **extract_tables.sh:** Extracts the values for Table 1.

## /strips_hgn

This directory contains the code to run our STRIPS-HGN comparison. It
contains its own README files.

## References
- Ferber et al. 2020: Patrick Ferber, Malte Helmert, Jörg Hoffmann. 2020.
  Neural Network Heuristics for Classical Planning: A Study of Hyperparameter
  Space. 24th European Conference on Artificial Intelligence (ECAI 2020):
  2346-2353. https://zenodo.org/record/3671553
- Helmert 2006: Malte Helmert. 2006. The Fast Downward Planning System.
  Journal of Artificial Intelligence Research 26: 191-246. http://fast-downward.org
- Richter and Westphal 2010: Silvia Richter, Matthias Westphal. 2010.
  The LAMA Planner: Guiding Cost-Based Anytime Planning with Landmarks.
  Journal of Artificial Intelligence Research 39: 127-177.
- Seipp et al. 2017: Jendrik Seipp, Florian Pommerening, Silvan Sievers,
  Malte Helmert. 2017. Downward Lab. Zenodo. https://doi.org/10.5281/zenodo.790461
- Shen et al. 2020: William Shen, Felipe Trevizan, Sylvie Thiébaux. 2020.
  Learning Domain-Independent Planning Heuristics with Hypergraph Networks.
  In Proc. ICAPS 2020: 574-584.
