Convergence of distributed symbolic regression using metaheuristics

Cardoen, Ben

doi:10.5281/zenodo.11549519

Published June 15, 2017 | Version v1

Thesis Open

Contributors

Contact person:

Cardoen, Ben¹

1. University of Antwerp

Symbolic regression (SR) fits a symbolic expression to a set of expected values. Among its advantages over other techniques is that a practitioner can interpret the resulting expression, determine important features by their usage in the expression, and gain insight into the behavior of the resulting model, such as continuity, derivatives and extrema. SR combines a discrete combinatorial problem, combining base functions, with the continuous optimization problem of selecting and mutating real-valued constants. One of the main algorithms used in SR is Genetic Programming (GP). The convergence characteristics of SR using GP are still an open issue. The continuous aspect of the problem has traditionally been an issue in GP-based symbolic regression. This paper studies the convergence of a GP-SR implementation on selected benchmarks known for poor convergence characteristics. We introduce a cooling schedule on the mutation operator and observe the computational savings. The constant optimization problem is studied using a two-phase approach. We apply a variation on constant folding and evaluate its effects. The hybridization of GP with three metaheuristics (Differential Evolution, Artificial Bee Colony, Particle Swarm Optimization) is evaluated. We use a distributed GP-SR implementation to evaluate the effect of topologies on the convergence characteristics of the algorithm and the difference in communication overhead and speedup. We introduce and evaluate a topology with the aim of finding a new balance between diffusion on the one hand and communication and synchronization overhead on the other. We introduce a variation of k-fold cross validation to estimate how accurately a generated solution predicts unknown data points. This validation technique is implemented in parallel in the algorithm, combining the advantages of cross validation with increased coverage of the search space. Our tool offers a wide array of statistics describing the convergence characteristics of the algorithm over time, offering practitioners nuanced insights into the algorithm as it approximates the symbolic regression problem. We combine our incremental support with a design of experiment technique applied on a simulator and evaluate the impact on the convergence characteristics in combination with our constant optimization approach on the one hand and the distributed algorithm on the other hand.
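
As an illustration of the constant folding mentioned in the description, the sketch below shows the textbook form of the technique on a small expression tree: any subtree whose arguments are all constants is replaced by a single constant. This is not the thesis's variation and not code from the CSRM repository. The node classes (Const, Var, Op), the FUNCTIONS table and the fold function are hypothetical names chosen for this example.

```python
import math
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical expression-tree node types, assumed for this sketch only.
@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Op:
    name: str
    args: Tuple["Node", ...]


Node = Union[Const, Var, Op]

# Assumed set of base functions; a real SR tool would have many more.
FUNCTIONS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "sin": math.sin,
}


def fold(node: Node) -> Node:
    """Replace every constant-only subtree by a single Const node."""
    if not isinstance(node, Op):
        return node  # leaves (Const, Var) are already minimal
    # Fold children first so constants propagate bottom-up.
    args = tuple(fold(a) for a in node.args)
    if all(isinstance(a, Const) for a in args):
        return Const(FUNCTIONS[node.name](*(a.value for a in args)))
    return Op(node.name, args)


# x + (2 * 3): the product is constant and can be evaluated once.
expr = Op("add", (Var("x"), Op("mul", (Const(2.0), Const(3.0)))))
folded = fold(expr)
```

Here folded is the tree for x + 6. Folding shrinks the tree without changing its value, so repeated evaluations over the training data cost less and the continuous optimizer sees fewer redundant constants.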

Other

Master Thesis, University of Antwerp, Computer Science.

Files

MsCThesisBenCardoen.pdf

Files (2.7 MB)

Name	Size	Checksum
MsCThesisBenCardoen.pdf	2.7 MB	md5:c33d28866d24c86966b881106e028150

Additional details

Repository URL: https://github.com/bencardoen/CSRM.git
Programming language: Python
Development Status: Active

	All versions	This version
Views	53	53
Downloads	31	31
Data volume	101.5 MB	101.5 MB
