Configuration
The configuration of this project distinguishes between training and evaluation. The two are described by training_config and evaluation_config, respectively, in configs/base_config.
The training configuration contains the hyperparameters and the learning environment. Every converged agent is recorded in the experiments.pkl file, whose path must be specified in the evaluation configuration.
Training Configuration
- NAME:
Name of the experiment.
- HYPERPARAMETERS:
Dictionary with the algorithm configuration.
- ALGORITHM_NAME:
Name of the algorithm.
- ENV_CONFIG:
Dictionary of the environment parameters.
- TRAINING_ITERATIONS:
Number of training iterations.
- LOCAL_DIR:
Directory in which to save the models and result files.
- NUM_GPUS:
Number of GPUs to use for training.
- NUM_WORKERS:
Number of RLlib workers.
- NUM_SAMPLES:
Number of configuration trials. Only relevant for hyperparameter optimization.
- SEEDS:
List of seeds to train.
- VERBOSE:
Verbosity level of the log output.
- NUM_DISCRETIZATION_BINS:
Number of discretized actions or bins. Only relevant for PAM and the discretization wrapper.
- HPO_RESULTS_PATH:
Path to store a dataframe with the results for each trial during the hyperparameter optimization.
- TRAINING_RESULTS_PATH:
Path to store the final models.
- CONFIGURATION_PATH:
Path to store a dictionary of the training configuration for reproducibility.
- EXPERIMENTS_PATH:
Path to store a summary of all trained models describing the location of the corresponding config and model.
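As an illustration of the keys listed above, a training configuration could look like the following minimal sketch. All values are placeholders chosen for the example: the algorithm name "PPO", the hyperparameter names, and every path are assumptions and are not taken from configs/base_config. The helper missing_keys is likewise hypothetical and only shows one way to check a configuration for completeness.

```python
# Hypothetical training configuration. The keys follow the list above;
# every value is a placeholder, not a default of this project.
training_config = {
    "NAME": "example_experiment",
    "ALGORITHM_NAME": "PPO",                         # assumed algorithm name
    "HYPERPARAMETERS": {"lr": 3e-4, "gamma": 0.99},  # assumed hyperparameters
    "ENV_CONFIG": {},                                # environment parameters
    "TRAINING_ITERATIONS": 100,
    "LOCAL_DIR": "results/",
    "NUM_GPUS": 0,
    "NUM_WORKERS": 2,
    "NUM_SAMPLES": 1,                # only relevant for hyperparameter optimization
    "SEEDS": [0, 1, 2],
    "VERBOSE": 1,
    "NUM_DISCRETIZATION_BINS": 11,   # only relevant for PAM / discretization wrapper
    "HPO_RESULTS_PATH": "results/hpo_results.pkl",
    "TRAINING_RESULTS_PATH": "results/models/",
    "CONFIGURATION_PATH": "results/configs/",
    "EXPERIMENTS_PATH": "results/experiments.pkl",
}

# The set of keys documented for the training configuration.
REQUIRED_TRAINING_KEYS = {
    "NAME", "HYPERPARAMETERS", "ALGORITHM_NAME", "ENV_CONFIG",
    "TRAINING_ITERATIONS", "LOCAL_DIR", "NUM_GPUS", "NUM_WORKERS",
    "NUM_SAMPLES", "SEEDS", "VERBOSE", "NUM_DISCRETIZATION_BINS",
    "HPO_RESULTS_PATH", "TRAINING_RESULTS_PATH", "CONFIGURATION_PATH",
    "EXPERIMENTS_PATH",
}


def missing_keys(config, required):
    """Return the documented keys that are absent from config, sorted."""
    return sorted(required - config.keys())


missing = missing_keys(training_config, REQUIRED_TRAINING_KEYS)
```

Checking for missing keys before starting a run is optional, but it surfaces typos in key names early rather than partway through training.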
Evaluation Configuration
- ALGORITHM_NAME:
Name of the algorithm to evaluate. The agent is looked up in an experiments file, and the corresponding config and model are loaded.
- ENV_CONFIG:
Dictionary of the environment parameters.
- SEEDS:
List of seeds to evaluate.
- OBSTACLES:
List of obstacle counts to evaluate.
- EXPERIMENTS_PATH:
Path to an existing experiments file which contains the algorithm to evaluate.
- EPISODE_RESULTS_PATH:
Path to store the results summarizing entire episodes.
- STEP_DATA_PATH:
Path to store the results describing every time step of each episode.
- RECORD:
Path to store videos of the rollouts. None if videos should not be saved.
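The evaluation keys above can be sketched in the same way. The values below are placeholders, and the file names are assumptions. The helper evaluation_runs is hypothetical: it assumes that every seed is evaluated with every obstacle count, which the list above suggests but does not state explicitly.

```python
from itertools import product

# Hypothetical evaluation configuration. The keys follow the list above;
# every value is a placeholder, not a default of this project.
evaluation_config = {
    "ALGORITHM_NAME": "PPO",          # assumed; must exist in the experiments file
    "ENV_CONFIG": {},                 # environment parameters
    "SEEDS": [0, 1],
    "OBSTACLES": [0, 2, 4],
    "EXPERIMENTS_PATH": "results/experiments.pkl",
    "EPISODE_RESULTS_PATH": "results/episode_results.pkl",
    "STEP_DATA_PATH": "results/step_data.pkl",
    "RECORD": None,                   # None: do not save rollout videos
}


def evaluation_runs(config):
    """Enumerate (seed, number of obstacles) pairs, assuming a full grid."""
    return list(product(config["SEEDS"], config["OBSTACLES"]))


runs = evaluation_runs(evaluation_config)
```

Under the full-grid assumption, two seeds and three obstacle counts yield six evaluation runs.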