Published March 28, 2025 | Version 1.0.0
Conference paper | Open Access

Adaptive Estimation of the Number of Algorithm Runs in Stochastic Optimization

  • 1. Jožef Stefan Institute
  • 2. University of Primorska Faculty of Mathematics, Natural Sciences and Information Technologies

Description

This repository contains the code and data accompanying the paper "Adaptive Estimation of the Number of Algorithm Runs in Stochastic Optimization," accepted at GECCO 2025. To facilitate reproducibility, the repository is organized into multiple folders, each containing the materials for a specific part of the study. This README describes the folders in the order in which they appear in the paper.

# Part 1: Performance data

The data for this paper was collected by running a set of 104 DE configurations (hyperparameters provided in the DE_configurations.csv file) on the COCO benchmark suite (24 problems × 15 instances × 3 problem dimensions: 10, 20, and 40), and by running 11 Nevergrad algorithms with their default hyperparameters on COCO (24 problems × 10 instances × one dimension: 20). For each combination of problem, instance, dimension, and algorithm, 50 repeated runs were performed, resulting in a total of 5,748,000 runs.
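As a quick sanity check, the total run count follows directly from these figures:

```r
# Reproduce the total run count from the experimental design above.
de_runs        <- 104 * 24 * 15 * 3 * 50  # configurations x problems x instances x dimensions x repetitions
nevergrad_runs <- 11 * 24 * 10 * 1 * 50   # algorithms x problems x instances x dimensions x repetitions
de_runs + nevergrad_runs                  # 5616000 + 132000 = 5748000
```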

The results for DE configurations on the COCO benchmark suite are stored in the COCO folder, while the results for the Nevergrad algorithms on COCO are available in the Nevergrad folder. Performance data for each algorithm is stored separately.
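Assuming the per-algorithm data in these folders is stored as one CSV file per algorithm (the layout below is an illustration, not the repository's documented structure), a folder can be loaded in bulk like this:

```r
# Hypothetical loading sketch; adjust the pattern to the actual file names
# inside COCO/ or Nevergrad/.
files <- list.files("COCO", pattern = "\\.csv$", full.names = TRUE)
performance <- lapply(files, read.csv)
names(performance) <- tools::file_path_sans_ext(basename(files))
```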

# Part 2: Estimating the number of runs 

The run_parser.R script calculates the percentage of correctly estimated triplets for each algorithm/configuration separately, indicating how accurately the proposed methodology estimates performance, along with results from post-hoc evaluations. The script must be executed individually for each algorithm/configuration (specified via a command-line parameter) in combination with each empirical threshold for checking symmetry, producing three output files named {i}_{name_of_the_input_file}.csv, where i identifies the outlier detection method used (1 – IQR, 2 – Percentile, 3 – MAD).

Each output CSV file contains the percentage of correctly estimated triplets and the corresponding post-hoc evaluation results across 10 different sampling seeds (resulting in 10 rows). The final row in each file provides the average across these 10 seeds, representing the overall percentage of correctly estimated triplets across all problem instances and problem dimensionalities for the selected benchmark suite and algorithm/configuration.
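For reference, the three outlier detection methods behind the file prefixes are usually formulated as in the sketch below; the cutoff constants are common defaults and assumptions here, not necessarily the exact settings used in the paper.

```r
# Typical formulations of the three outlier rules; k = 1.5, the 2.5/97.5
# percentiles, and k = 3 are assumed defaults, not the paper's settings.
iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75))
  x < q[1] - k * diff(q) | x > q[2] + k * diff(q)
}
percentile_outliers <- function(x, lower = 0.025, upper = 0.975) {
  q <- quantile(x, c(lower, upper))
  x < q[1] | x > q[2]
}
mad_outliers <- function(x, k = 3) {
  abs(x - median(x)) > k * mad(x)  # R's mad() is scaled for consistency with sd
}
```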

Example:

In general, each R script can be executed using the following format:

```bash
Rscript [path_to_script]{script_name}.R [script_parameters]
```

Running the command for a single input file (algorithm/configuration) would look like this:

```bash
Rscript run_parser.R -p FILENAME -c 1 -s SKEW -o OUTPUT_PATH
```

The above example generates the following output files in the out/ folder:

  • out/1_FILENAME.csv

  • out/2_FILENAME.csv

  • out/3_FILENAME.csv
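The averaged summary row can be pulled from any of these outputs with a few lines of R (a minimal sketch relying only on the row layout described above):

```r
# Read one output file; the last row holds the average over the 10 seeds.
res <- read.csv("out/1_FILENAME.csv")
overall <- tail(res, 1)
overall
```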

Running the script on the performance data in each folder produces the corresponding results:

  • For DE configurations (COCO benchmark suite): stored in the COCO_results/ folder.

  • For Nevergrad algorithms (COCO benchmark suite): stored in the Nevergrad_COCO/ folder.

# Part 3: Analysis of inaccurate estimations per dimension and problem

The COCO_parser.R script generates three output folders:

  1. COCO_results_dim:
    Contains CSV files named {i_}{name_of_the_input_file}.csv, where i identifies the outlier detection method (1 – IQR, 2 – Percentile, 3 – MAD). Each file lists the number of problem instances per dimension (10, 20, and 40) that were inaccurately estimated for each algorithm/configuration, as evaluated using bootstrapped confidence intervals (CIs). Each CSV file has 10 rows corresponding to different sampling seeds, plus a final row with the average across these seeds, summarizing the overall number of inaccurate estimations.

  2. COCO_results_problems:
    Contains CSV files named {i}_{j}_{name_of_the_input_file}.csv, where i identifies the outlier detection method (1 – IQR, 2 – Percentile, 3 – MAD), and j denotes the sampling seed. Each file specifies the individual problems, their instances, and dimensions for which the proposed methodology produced inaccurate estimations, using a single sampling seed.

  3. COCO_results_problems_dim:
    Contains CSV files named {i}_{j}_{name_of_the_input_file}.csv, where i identifies the outlier detection method (1 – IQR, 2 – Percentile, 3 – MAD), and j denotes the sampling seed. Each file summarizes, per dimension (10, 20, and 40), the number of inaccurately estimated problem instances for each algorithm/configuration, based on a post-hoc evaluation. Each CSV file has six rows corresponding to different percentages used in the post-hoc analysis.
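As noted in item 1, inaccurate estimations are identified with bootstrapped CIs. A minimal sketch of such a check, assuming a 95% confidence level and 1,000 resamples (both assumptions, not the paper's settings):

```r
# Percentile bootstrap CI for the mean of a sample of performance values;
# the 95% level and 1000 resamples are assumptions.
bootstrap_ci <- function(x, n_boot = 1000, level = 0.95) {
  means <- replicate(n_boot, mean(sample(x, length(x), replace = TRUE)))
  quantile(means, c((1 - level) / 2, (1 + level) / 2))
}
# An estimate would then be flagged as inaccurate if it falls outside the CI.
```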

These outputs enable summarization and visualization of results, as presented in the paper. Pre-processing scripts for generating the visualizations are available upon request from the authors.

---

The COCO_results folder contains the outputs obtained by running the two scripts using the complete performance data described in the first part. Results are structured according to different empirical thresholds (0.05, 0.1, 0.15, 0.2) used to evaluate distribution symmetry. Each threshold has three subfolders:

  • run: Contains results generated by executing the run_parser.R script.

  • COCO: Contains results generated by executing the COCO_parser.R script.

  • green: Contains results used for calculating metrics related to green benchmarking, specifically the estimated number of algorithm runs required per problem instance.

The Nevergrad_results folder contains the outputs generated by running the run_parser.R script for the Nevergrad algorithms.

Files

COCO.zip (1.4 GB, md5: e1cf7c3d793498c946369443bdfca225) plus seven additional files ranging from 2.5 kB to 43 MB (1.5 GB in total).

Additional details

Funding

European Commission
  • AutoLearn-SI - Leveraging Benchmarking Data for Automated Machine Learning and Optimization (grant 101187010)

The Slovenian Research and Innovation Agency
  • Auto-OPT: Automated selection and configuration of single-objective continuous optimization algorithms (grant J2-4460)
  • Artificial Intelligence for Science (AI4sci) (grant GC-0001)
  • Computer structures and systems (grant P2-0098)