Published November 26, 2022 | Version v1
Software Open

Symbolic expression generation via Variational Auto-Encoder

  • 1. Department of Computer Science, LAMBDA, National Research University Higher School of Economics (HSE), Russia

Description

There are many problems in physics, biology, and other natural sciences in which symbolic regression
can provide valuable insights and discover new laws of nature. Widely used deep neural networks do
not provide interpretable solutions. Meanwhile, symbolic expressions give us a clear relation between
observations and the target variable. However, at the moment, there is no dominant solution for the
symbolic regression task, and we aim to reduce this gap with our algorithm. In this work, we propose a
novel deep learning framework for symbolic expression generation via variational autoencoder (VAE).
In a nutshell, we suggest using a VAE to generate mathematical expressions, and our training strategy
forces generated formulas to fit a given dataset. Our framework allows encoding a priori knowledge of the
formulas into fast-check predicates that speed up the optimization process. We compare our method to
modern symbolic regression benchmarks and show that our method outperforms the competitors under
noisy conditions. The recovery rate of SEGVAE is 65% on the Nguyen dataset with a noise level of 10%,
which is better than the previously reported SOTA by 20%. We demonstrate that this value depends on
the dataset and can be even higher.
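
The filtering idea described above can be illustrated with a minimal sketch. It is not the SEGVAE code: the prefix-notation token vocabulary, the two predicates, and the helper names (`is_valid_prefix`, `uses_variable`, `select_best`) are all assumptions made for illustration, and a fixed candidate list stands in for samples drawn from a trained VAE decoder. The sketch only shows how cheap predicates could discard candidates before the more expensive step of scoring formulas against the data.

```python
import math

# Hypothetical vocabulary: arity of each token in prefix (Polish) notation.
ARITY = {"add": 2, "mul": 2, "sin": 1, "cos": 1, "x": 0, "1": 0}


def is_valid_prefix(tokens):
    """Fast check: tokens form exactly one complete prefix expression."""
    needed = 1  # number of sub-expressions still expected
    for tok in tokens:
        if needed == 0 or tok not in ARITY:
            return False
        needed += ARITY[tok] - 1
    return needed == 0


def uses_variable(tokens):
    """Fast check encoding assumed prior knowledge: the formula depends on x."""
    return "x" in tokens


def evaluate(tokens, x):
    """Evaluate a valid prefix expression at the point x."""
    def walk(pos):
        tok = tokens[pos]
        if tok == "x":
            return x, pos + 1
        if tok == "1":
            return 1.0, pos + 1
        if tok in ("sin", "cos"):
            val, nxt = walk(pos + 1)
            return (math.sin(val) if tok == "sin" else math.cos(val)), nxt
        left, nxt = walk(pos + 1)
        right, nxt = walk(nxt)
        return (left + right if tok == "add" else left * right), nxt

    value, _ = walk(0)
    return value


def mse(tokens, xs, ys):
    """Mean squared error of the formula on the dataset."""
    return sum((evaluate(tokens, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def select_best(candidates, xs, ys, predicates, keep=2):
    """Drop candidates failing any predicate, then rank the rest by MSE."""
    passed = [c for c in candidates if all(p(c) for p in predicates)]
    return sorted(passed, key=lambda c: mse(c, xs, ys))[:keep]


# Toy dataset generated from y = x + x^2 (no noise, for clarity).
xs = [0.5, 1.0, 1.5, 2.0]
ys = [x + x * x for x in xs]

# Stand-in for formulas sampled from a generative model.
candidates = [
    ["add", "x", "mul", "x", "x"],  # x + x*x
    ["add", "x"],                   # incomplete expression
    ["sin", "1"],                   # valid, but ignores x
    ["mul", "x", "x"],              # x*x
]

best = select_best(candidates, xs, ys, [is_valid_prefix, uses_variable])
```

In a training loop of the kind the description outlines, the top-ranked formulas would presumably be fed back to refit the generator; that step is omitted here because the record does not specify it.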

Files

segvae-main.zip

Files (32.6 kB)

Name: segvae-main.zip
Size: 32.6 kB
md5:87aaa556a81235b51b393334c3b4a02b