Adaptive Estimation of the Number of Algorithm Runs in Stochastic Optimization
Description
This repository contains the code and data accompanying the paper "Adaptive Estimation of the Number of Algorithm Runs in Stochastic Optimization," accepted at GECCO 2025. To facilitate reproducibility, the repository is organized into multiple folders, each containing the materials needed for a specific part of the study. This README describes these folders in the order in which they appear in the paper.
# Part 1: Performance data
The data for this paper was collected by running a set of 104 DE configurations (hyperparameters provided in the DE_configurations.csv file) on the COCO benchmark suite, consisting of 24 problems × 15 instances × 3 problem dimensions (10, 20, and 40), and by running 11 Nevergrad algorithms with their default hyperparameters on COCO, consisting of 24 problems × 10 instances × a single dimension (20). For each combination of problem, instance, dimension, and algorithm, 50 repeated runs were performed, resulting in a total of 5,748,000 runs.
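As a quick sanity check, the total number of runs follows directly from the experimental design described above. The small sketch below uses only the counts listed in this paragraph:

```r
# Sanity check: total number of runs implied by the experimental design.
# DE configurations on COCO: 104 configs x 24 problems x 15 instances x 3 dimensions x 50 runs
de_runs <- 104 * 24 * 15 * 3 * 50   # 5,616,000
# Nevergrad algorithms on COCO: 11 algorithms x 24 problems x 10 instances x 1 dimension x 50 runs
ng_runs <- 11 * 24 * 10 * 1 * 50    # 132,000
total_runs <- de_runs + ng_runs     # 5,748,000
cat("DE runs:", de_runs, "| Nevergrad runs:", ng_runs, "| total:", total_runs, "\n")
```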
The results for DE configurations on the COCO benchmark suite are stored in the COCO folder, while the results for the Nevergrad algorithms on COCO are available in the Nevergrad folder. Performance data for each algorithm is stored separately.
# Part 2: Estimating the number of runs
The run_parser.R script calculates, for each algorithm/configuration separately, the percentage of triplets for which the proposed methodology estimates performance accurately, along with results from post-hoc evaluations. The script must be executed individually for each algorithm/configuration (specified via a command-line parameter), in combination with each empirical threshold used to check distribution symmetry. Each execution produces three output files named {i}_{name_of_the_input_file}.csv, where i indicates the outlier detection method used (1 – IQR, 2 – Percentile, 3 – MAD).
Each output CSV file contains the percentage of correctly estimated triplets and the corresponding post-hoc evaluation results across 10 different sampling seeds (resulting in 10 rows). The final row in each file provides the average across these 10 seeds, representing the overall percentage of correctly estimated triplets across all problem instances and problem dimensionalities for the selected benchmark suite and algorithm/configuration.
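For reference, the index i refers to the classical IQR, percentile, and MAD outlier-detection rules. The sketch below only illustrates these rules on a vector of performance values; the cutoff constants are common textbook defaults and are not taken from the parser script itself:

```r
# Illustrative sketch of the three outlier-detection rules the indices 1-3
# refer to (IQR, Percentile, MAD). The cutoff constants below are common
# defaults and may differ from those used in the repository's scripts.
flag_outliers <- function(x, method = c("iqr", "percentile", "mad")) {
  method <- match.arg(method)
  if (method == "iqr") {
    q <- quantile(x, c(0.25, 0.75))
    iqr <- q[2] - q[1]
    x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
  } else if (method == "percentile") {
    q <- quantile(x, c(0.05, 0.95))
    x < q[1] | x > q[2]
  } else {
    m <- median(x)
    x < m - 3 * mad(x) | x > m + 3 * mad(x)
  }
}

# Example: flag outlying performance values from 50 repeated runs
# flag_outliers(rnorm(50), method = "mad")
```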
Example:

In general, each R script can be executed using the following format:

```bash
Rscript [path_to_script]{script_name}.R [script_parameters]
```

Running the command for a single input file (algorithm/configuration) would look like this:

```bash
Rscript runs_parser.R -p FILENAME -c 1 -s SKEW -o OUTPUT_PATH
```
The above example generates the following output files in the out/ folder:

- out/1_FILENAME.csv
- out/2_FILENAME.csv
- out/3_FILENAME.csv
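Each of these files has the structure described above: one row per sampling seed plus a final averaged row. A minimal sketch of inspecting one of them (the file name is taken from the example; column names depend on the actual output format):

```r
# Hypothetical inspection of one output file from the example above.
res <- read.csv("out/1_FILENAME.csv")
per_seed <- head(res, 10)  # rows 1-10: one row per sampling seed
overall  <- tail(res, 1)   # final row: average across the 10 seeds
print(overall)
```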
Executing the code on performance data stored in different folders produces corresponding results:

- For DE configurations (COCO benchmark suite): stored in the COCO_results/ folder.
- For Nevergrad algorithms (COCO benchmark suite): stored in the Nevergrad_COCO/ folder.
# Part 3: Analysis of incorrect estimations per dimension and problem
The COCO_parser.R script generates three output folders:

- COCO_results_dim: Contains CSV files named {i}_{name_of_the_input_file}.csv, where i identifies the outlier detection method (1 – IQR, 2 – Percentile, 3 – MAD). Each file lists the number of problem instances per dimension (10, 20, and 40) that were inaccurately estimated for each algorithm/configuration, as evaluated using bootstrapped confidence intervals (CIs). Each CSV file has 10 rows corresponding to different sampling seeds, plus a final row with the average across these seeds, summarizing the overall number of inaccurate estimations.
- COCO_results_problems: Contains CSV files named {i}_{j}_{name_of_the_input_file}.csv, where i identifies the outlier detection method (1 – IQR, 2 – Percentile, 3 – MAD) and j denotes the sampling seed. Each file specifies the individual problems, their instances, and dimensions for which the proposed methodology produced inaccurate estimations, using a single sampling seed.
- COCO_results_problems_dim: Contains CSV files named {i}_{j}_{name_of_the_input_file}.csv, where i identifies the outlier detection method (1 – IQR, 2 – Percentile, 3 – MAD) and j denotes the sampling seed. Each file summarizes, per dimension (10, 20, and 40), the number of inaccurately estimated problem instances for each algorithm/configuration, based on a post-hoc evaluation. Each CSV file has six rows corresponding to the different percentages used in the post-hoc analysis.
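These per-folder outputs can be combined programmatically, for example to compare the averaged per-dimension counts in COCO_results_dim across the three outlier-detection methods for a single algorithm/configuration. A minimal sketch, assuming the file-name pattern above (FILENAME is a placeholder and the column layout depends on the actual output):

```r
# Hypothetical comparison of the averaged per-dimension error counts across
# the three outlier-detection methods for one algorithm/configuration.
# File names follow the {i}_{name_of_the_input_file}.csv pattern; the column
# layout is taken as-is from the CSV files.
methods <- c("1" = "IQR", "2" = "Percentile", "3" = "MAD")
summary_rows <- lapply(names(methods), function(i) {
  res <- read.csv(file.path("COCO_results_dim", paste0(i, "_FILENAME.csv")))
  cbind(method = methods[[i]], tail(res, 1))  # last row = average over seeds
})
print(do.call(rbind, summary_rows))
```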
These outputs enable summarization and visualization of results, as presented in the paper. Pre-processing scripts for generating the visualizations are available upon request from the authors.

---

The COCO_results folder contains the outputs obtained by running the two scripts on the complete performance data described in the first part. Results are structured according to the different empirical thresholds (0.05, 0.1, 0.15, 0.2) used to evaluate distribution symmetry. Each threshold has three subfolders:

- run: Contains results generated by executing the script run_parser_2024.R.
- COCO: Contains results generated by executing the script COCO_parser_2024.R.
- green: Contains results used for calculating metrics related to green benchmarking, specifically the estimated number of algorithm runs required per problem instance.
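A minimal sketch of traversing this layout, assuming the threshold folders are named after the threshold values themselves (the actual directory names may differ):

```r
# Hypothetical traversal of the COCO_results layout described above.
# Assumes threshold folders are literally named "0.05", "0.1", "0.15", "0.2";
# adjust the paths if the repository uses a different naming scheme.
thresholds <- c("0.05", "0.1", "0.15", "0.2")
subfolders <- c("run", "COCO", "green")
for (t in thresholds) {
  for (s in subfolders) {
    files <- list.files(file.path("COCO_results", t, s), pattern = "\\.csv$")
    cat("threshold", t, "/", s, ":", length(files), "CSV files\n")
  }
}
```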
The Nevergrad_results folder contains the outputs generated by running the script run_parser_2024.R for the Nevergrad algorithms.
Files
The record includes COCO.zip together with the remaining data and result files (approximately 1.5 GB in total).
Additional details
Funding
- European Commission: AutoLearn-SI - Leveraging Benchmarking Data for Automated Machine Learning and Optimization (grant no. 101187010)
- The Slovenian Research and Innovation Agency: Auto-OPT - Automated selection and configuration of single-objective continuous optimization algorithms (grant no. J2-4460)
- The Slovenian Research and Innovation Agency: Artificial Intelligence for Science (AI4sci) (grant no. GC-0001)
- The Slovenian Research and Innovation Agency: Computer structures and systems (grant no. P2-0098)