Configuration

The configuration of this project differentiates training and evaluation. Both are described by the training_config and evaluation_config in configs/base_config.

The training configuration contains the hyperparameters and the learning environment. Every converged agent is contained in the experiments.pkl which must be specified in the evaluation configuration.

Training Configuration

NAME:: Name of the experiment.
HYPERPARAMETERS:: Dictionary with the algorithm configuration.
ALGORITHM_NAME:: Name of the algorithm.
ENV_CONFIG:: Dictionary of the environment parameters.
TRAINING_ITERATIONS:: Number of training iterations.
LOCAL_DIR:: Dictionary to save the models and result files.
NUM_GPUS:: Number of GPU’s to use for training.
NUM_WORKERS:: Number of Rllib workers.
NUM_SAMPLES:: Number of configuration trials. Only relevant for hyperparameter optimization.
SEEDS:: List of seeds to train.
VERBOSE:: Log information level.
NUM_DISCRETIZATION_BINS:: Number of discretized actions or bins. Only relevant for PAM and the discretization wrapper.
HPO_RESULTS_PATH:: Path to store a dataframe with the results for each trial during the hyperparameter optimization.
TRAINING_RESULTS_PATH:: Path to store the final models.
CONFIGURATION_PATH:: Path to store a dictionary of the training configuration for reproducability.
EXPERIMENTS_PATH:: Path to store a summary of all trained models describing the location of the corresponding config and model.

Evaluation Configuration

ALGORITHM_NAME:: Name of the algorithm to evaluate. The agent will be looked up in a experiments file and the corresponding config and model loaded.
ENV_CONFIG:: Dictionary of the environment parameters.
SEEDS:: List of seeds to evaluate.
OBSTACLES:: List with the numbers of obstacles to evaluate.
EXPERIMENTS_PATH:: Path to an existing experiments file which contains the algorithm to evaluate.
EPISODE_RESULTS_PATH:: Path to store the results summarizing entire episodes.
STEP_DATA_PATH:: Path to store the results for describing every time step of each episode.
RECORD:: Path to store videos of the rollouts. None if videos should not be saved.