# Simsala

This should grow into a small collection of tools useful for working with
SLURM.

## submit.pl

A submission script that parameterizes an executable and submits it as a SLURM
task with many different instances, run over a list of problems.

## generate\_sqlite.pl

Generates a SQLite database from benchmark results using Perl's regex support
and the SQLite3 DBI module. Use it like this:

    ./generate_sqlite.pl <path or glob expression of folders containing .log files from submit.pl> [more paths] <output database>
    
Example:

    ./path/to/generate_sqlite.pl "./benchmark_name/*/" benchmark_name.db
    
The above example generates a `benchmark_name.db` SQLite3 file in the current
directory. https://sqlitebrowser.org/ is very useful for browsing this database.
To get a better overview of the benchmarks, the database can also be queried
directly or processed with other automated tooling.
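
Since the schema written by `generate_sqlite.pl` is not described here, a first
step when querying the database directly could be to list its tables and
columns. The following is a minimal sketch using Python's standard `sqlite3`
module. The function name `describe_database` is hypothetical and not part of
this repository, and nothing is assumed about the actual table names.

```python
import sqlite3


def describe_database(path):
    """Return {table name: [column names]} for every table in the database."""
    connection = sqlite3.connect(path)
    try:
        tables = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA arguments cannot be bound parameters, so quote the name by hand.
            quoted = '"' + table.replace('"', '""') + '"'
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = connection.execute(f"PRAGMA table_info({quoted})").fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        connection.close()
```

Calling `describe_database("benchmark_name.db")` returns a dictionary that can
be printed to see which tables and columns are available for further queries.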

The dependency should be available in your system packages. Install a package similar to one of these:

  - Arch Linux: `perl-dbd-sqlite`
  - Ubuntu and Debian: `libdbd-sqlite3-perl`
  - OpenSUSE: `perl-DBD-SQLite`

## generate\_statistics\_from\_sqlite.py

This tool generates summaries and plots of the sqlite databases generated using
`generate_sqlite.pl`. Use it for plotting like this:

    ./path/to/generate_statistics_from_sqlite.py <db> [output-name] \
        [-f <family to plot, matched using string.contains on the problem>] \
        [-s <suite to plot, also matched using string.contains on the problem>] \
        [--plot] \ # Plots the overall summary by printing to stdout
        [--plot-families] \ # Plot previously defined families
        [--plot-suites] \ # Plot the previously defined suites
        | gnuplot # Renders the printed output as plots in the local directory.

Families and suites provide two additional "views" into subsets of the data.
For example, if one benchmark contains multiple families, the family names can
be put into the filenames of the benchmark problems and then be distinguished
using `-f`. Families and suites work basically the same way; they are just named
differently, following the naming scheme of QBFEval.
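
The substring matching behind `-f` and `-s` can be illustrated with a small
sketch. This is not the tool's actual code: the function name
`select_families` and the problem names are made up for illustration, and only
the "string contains" rule is taken from the description above.

```python
def select_families(problems, families):
    """Map each family to the problems whose name contains it as a substring."""
    return {
        family: [problem for problem in problems if family in problem]
        for family in families
    }


# Hypothetical problem names; real names come from the benchmark filenames.
problems = ["b1__instance_01", "b1__instance_02", "b2__instance_01", "other_instance"]
selected = select_families(problems, ["b1", "b2"])
print(selected)
```

Because matching is by substring, a problem whose name contains several family
names would show up in each of those families under this sketch.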

Another useful feature is `--summary`, which prints a summary of the results
with some overall statistics for every benchmark. It groups results by `__` in
the benchmark filenames, but it also works without such a separator, since
results are simply listed alphabetically. The summary is printed to `stderr`, so
it can be combined with plotting and piping into `gnuplot`.
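
One way to picture the grouping is the sketch below. It assumes that the part
of a filename before the first `__` acts as the group key, which may differ
from what the tool actually does. The function name `group_by_prefix` and the
example names are hypothetical.

```python
from collections import defaultdict


def group_by_prefix(names, separator="__"):
    """Group names by the text before the first separator, in alphabetical order."""
    groups = defaultdict(list)
    for name in sorted(names):
        prefix, found, _rest = name.partition(separator)
        # Names without the separator are collected under the empty key.
        key = prefix if found else ""
        groups[key].append(name)
    return dict(groups)


# Hypothetical benchmark names, purely for illustration.
grouped = group_by_prefix(["b2__x", "b1__y", "plain", "b1__a"])
print(grouped)
```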

Furthermore, benchmarks can be ignored and filtered to drill down on results.
This is done using `-i` for _ignore_ and `-o` for _only_. Typically, you would
first ignore unwanted results and then restrict the output to only some of the
remaining sets. This makes the results easier to analyze.
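
The "ignore first, then only" order can be sketched as follows. The sketch
assumes that both options match by substring, which is an assumption made here
for illustration and not confirmed for `-i` and `-o`. The function name
`filter_names` and the solver names are likewise hypothetical.

```python
def filter_names(names, ignore=(), only=()):
    """Drop names matching any ignore pattern, then keep those matching an only pattern."""
    # Step 1: remove everything that contains an ignored pattern.
    kept = [name for name in names if not any(pattern in name for pattern in ignore)]
    # Step 2: if any "only" patterns are given, keep just the names matching one of them.
    if only:
        kept = [name for name in kept if any(pattern in name for pattern in only)]
    return kept


# Hypothetical solver names, purely for illustration.
solvers = ["solver1", "solver2", "solver3", "solver4"]
remaining = filter_names(solvers, ignore=["solver3"], only=["solver1", "solver2"])
print(remaining)
```

With an empty `only` list, the sketch keeps everything that was not ignored.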

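One example (the trailing `#` comments are annotations and must be removed
before running, because a shell does not allow a comment after a
line-continuation backslash):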

    ./path/to/generate_statistics_from_sqlite.py benchmark_name.db name \
        -f b1 -f b2 \ # Two families in this benchmark: files containing either b1 or b2
        -i solver3 \ # solver3 results are bad, ignore it
        -o solver1 -o solver2 \ # Only give us solver1 and solver2 results
        --plot-families \ # Plot the b1 and b2 families on their own
        --plot \ # Overall summary plots
        --summary \ # Summary of results
        | gnuplot # Of course, also pipe the stuff into gnuplot!
