Benchmarking Octave, R and Python platforms for code prototyping in Data Analytics and Machine Learning applications programming

doi:10.5281/zenodo.1432789

Published September 22, 2018 | Version v1

Conference paper Open

Harris Georgiou¹

1. University of Piraeus

Abstract

Identical Octave, R and Python codes are tested in terms of end-user execution speed, using a very low-end "embedded" hardware system and a standard office workstation. The codes include algorithmic primitives common in Data Analytics and Machine Learning, i.e., matrix manipulation (inversion, product), linear algebra, linear regression, Singular Value Decomposition (SVD), Fast Fourier Transform (FFT) and a baseline Bubblesort implementation for testing flow control structures.

Description

In Data Analytics and Machine Learning, code prototyping is an integral part of the Research & Development (R&D) process, especially in data exploration and algorithm design. The programming tools and platforms used for these tasks are selected for their rich API/library base, high-level expression syntax, very compact code, interactive on-the-fly code input, abstract data management and best-possible execution speed. Thus, traditional programming languages are usually inappropriate for such heavily iterative and exploratory coding cycles.

Today, by far the three most popular and appropriate choices are Octave, R and Python. In this work, these three programming environments are assessed in terms of end-user execution speed. More specifically, some common algorithmic primitives are implemented and tested in each language separately, including matrix manipulation (inversion, product), linear algebra, linear regression and Singular Value Decomposition (SVD), as well as the Fast Fourier Transform (FFT) as a standard procedure in a signal processing pipeline. Additionally, a baseline implementation of the Bubblesort algorithm is employed for testing the efficiency of flow control structures and execution performance in code branching.
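As an illustration of the kind of primitives listed above, the following is a minimal Python sketch that times each one once. It is not the benchmark.py file attached to this record: the use of NumPy, the problem sizes, the random test data and the names bubble_sort, make_tasks and time_tasks are all assumptions made for this example.

```python
import time

import numpy as np


def bubble_sort(values):
    """Return a sorted copy using plain nested loops (exercises flow control)."""
    data = list(values)
    n = len(data)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        if not swapped:  # already sorted, stop early
            break
    return data


def make_tasks(n=200, seed=0):
    """Build one callable per primitive; sizes and data are illustrative only."""
    rng = np.random.default_rng(seed)
    # The diagonal shift keeps the matrix comfortably invertible.
    a = rng.standard_normal((n, n)) + n * np.eye(n)
    b = rng.standard_normal((n, n))
    # Synthetic regression data with known coefficients plus small noise.
    x = rng.standard_normal((n, 3))
    y = x @ np.array([1.0, -2.0, 0.5]) + 0.01 * rng.standard_normal(n)
    signal = rng.standard_normal(4096)
    unsorted = rng.integers(0, 10_000, size=n).tolist()
    return {
        "matrix product": lambda: a @ b,
        "matrix inversion": lambda: np.linalg.inv(a),
        "linear regression": lambda: np.linalg.lstsq(x, y, rcond=None)[0],
        "SVD": lambda: np.linalg.svd(b, compute_uv=False),
        "FFT": lambda: np.fft.fft(signal),
        "bubble sort": lambda: bubble_sort(unsorted),
    }


def time_tasks(tasks):
    """Run each task once and return its elapsed wall-clock time in seconds."""
    timings = {}
    for name, task in tasks.items():
        start = time.perf_counter()
        task()
        timings[name] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    for name, seconds in time_tasks(make_tasks()).items():
        print(f"{name:18s} {seconds:.6f} s")
```

The numeric primitives delegate to compiled library routines, while the Bubblesort runs entirely in interpreted loops, which is why a task of that kind is useful for probing flow control and branching rather than library performance.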

The results present the performance of the three identical source codes in terms of end-user execution speed (elapsed time) on three different hardware platforms, namely: (1) a very low-end machine with limited processing power and resources, simulating an embedded system (Linux, 2GB RAM, N20 Atom single-core CPU), (2) a standard/enhanced office workstation (Win10, 16GB RAM, dual-core i7 CPU) and (3) a high-end workstation or small office server (Win10, 32GB RAM, quad-core i7 CPU).
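Since elapsed time is only meaningful relative to the machine it was measured on, a run can record a short host description next to its timings. The sketch below shows one possible way to do this with the Python standard library; the function names describe_host and elapsed_seconds, the repeat count and the placeholder workload are assumptions for this example and do not come from the attached benchmark files.

```python
import os
import platform
import statistics
import time


def describe_host():
    """Collect basic facts about the machine the timings were taken on."""
    return {
        "system": platform.system(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "logical_cpus": os.cpu_count(),
    }


def elapsed_seconds(task, repeats=5):
    """Time a callable several times and summarize the wall-clock samples."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task()
        samples.append(time.perf_counter() - start)
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


if __name__ == "__main__":
    # Placeholder workload: sort a reversed list of integers.
    workload = lambda: sorted(range(100_000, 0, -1))
    print(describe_host())
    print(elapsed_seconds(workload))
```

Reporting the median over a few repeats, rather than a single run, reduces the influence of background activity on a shared office machine.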

Notes

Conference: FossComm 2018, 13-14 October, Heraklion, Greece.

Files

Files (10.3 MB)

Name	MD5	Size
benchmark.m	75379a716cd3c263a779b2631361a8f9	1.8 kB
benchmark.py	1e779f5abf5d3a3db8b87dc34db17042	2.3 kB
benchmark.R	1d0933f743fad402839f39118840d093	2.4 kB
HG-fosscomm2018-pres.pdf	1570688a4212ab62b7bc43e38c45ac63	2.6 MB
HG-fosscomm2018-pres.pptx	670f32285f6a58b12edec17e591e2bfa	7.6 MB
submission summary.pdf	67259a3f53613b431a55f83e44f028ad	78.7 kB

	All versions	This version
Views	185	184
Downloads	223	222
Data volume	478.3 MB	475.7 MB
