Published August 17, 2018 | Version v2
Dataset Open

Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams

  • 1. City University of New York (CUNY) Hunter College
  • 2. City University of New York (CUNY) Graduate Center
  • 3. Oakland University

Description

Streaming APIs are becoming more pervasive in mainstream object-oriented programming languages. For example, the Stream API introduced in Java 8 allows for functional-like, MapReduce-style operations in processing both finite and infinite data structures. However, using this API efficiently involves subtle considerations, such as determining when it is best for stream operations to run in parallel, when running operations in parallel can be less efficient, and whether it is safe to run in parallel given possible lambda expression side effects. In this paper, we present an automated refactoring approach that assists developers in writing efficient stream code in a semantics-preserving fashion. The approach, based on a novel data ordering and typestate analysis, consists of preconditions for automatically determining when it is safe and possibly advantageous to convert sequential streams to parallel and to unorder or de-parallelize already parallel streams. The approach was implemented as a plug-in to the Eclipse IDE, uses the WALA and SAFE analysis frameworks, and was evaluated on 11 Java projects consisting of ~642 thousand lines of code. We found that 36.31% of candidate streams were refactorable, and an average speedup of 3.49 on performance tests was observed. The results indicate that the approach is useful in optimizing stream code to its full potential.
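
To make the kinds of transformations described above concrete, the following is a minimal illustrative sketch in Java. It is not output from the plug-in and not taken from the evaluated subjects; the class and method names are hypothetical. It shows a sequential pipeline with side-effect-free lambdas, its parallel counterpart, a pipeline where dropping the ordering constraint does not change the result, and a pipeline whose lambda mutates shared state and is therefore left sequential.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical example class; names are assumptions made for illustration only.
class StreamRefactoringSketch {

    // Before: a sequential pipeline whose lambda is stateless and has no side effects.
    static long sumOfSquaresSequential(List<Integer> values) {
        return values.stream()
                .mapToLong(v -> (long) v * v)
                .sum();
    }

    // After: the same pipeline run in parallel. Because the lambda has no side
    // effects and summation is associative, the result is unchanged.
    static long sumOfSquaresParallel(List<Integer> values) {
        return values.parallelStream()
                .mapToLong(v -> (long) v * v)
                .sum();
    }

    // Unordering: the result is collected into a Set, so encounter order does not
    // affect it. Calling unordered() lets distinct() skip order-preserving work.
    static Set<Integer> distinctEvens(List<Integer> values) {
        return values.parallelStream()
                .unordered()
                .filter(v -> v % 2 == 0)
                .distinct()
                .collect(Collectors.toSet());
    }

    // Not safe to parallelize as written: the lambda mutates a shared,
    // non-thread-safe list, so this pipeline is kept sequential.
    static List<Integer> doubledWithSideEffect(List<Integer> values) {
        List<Integer> out = new ArrayList<>();
        values.stream().forEach(v -> out.add(v * 2));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 2, 3, 4, 4, 6);
        System.out.println(sumOfSquaresSequential(values));
        System.out.println(sumOfSquaresParallel(values));
        System.out.println(distinctEvens(values));
        System.out.println(doubledWithSideEffect(values));
    }
}
```

Whether the parallel version is actually faster depends on data size and the cost of each operation, which is why the safety question and the profitability question are separate.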

Notes

htm_java.patch and java-design-patterns.patch contain the changes made to the htm.java and java-design-patterns subjects, respectively, to increase the data size of the performance test, as discussed with the developers. refactoring.patch contains the transformations applied to the subjects as a result of the program analysis.

Files

candidate_streams.csv

Files (345.2 kB)

Checksum                               Size
md5:7f241632e7732f6cc1b9dbca898d344c   19.9 kB
md5:9b4ca68e5cd6b8d6e090b522d3e0fdf5   90.9 kB
md5:a4de3b9f494c600a6dc7fb4160e0deb9   23.4 kB
md5:e7d8398abc58d02a1920dbaef3393efb   5.4 kB
md5:994530c1f91d81b52a39279d19d49a94   56.6 kB
md5:97f0e83255a17827db2b1521e6cbcbe3   7.5 kB
md5:8999e0b8dd00ace5c2dffe634ecae3cb   2.2 kB
md5:689ab40ff03c1e963294a3ee892f8ba4   2.2 kB
md5:4430621a798f80493f987d1ffd02274c   28.4 kB
md5:7dc7a9ba5ca4ff9eaf78b0c7aa5675d2   14.7 kB
md5:3c1a76fd227cfd052de8069282b359bc   8.7 kB
md5:8232d38af633b401bfa23bbce19d0a27   72.9 kB
md5:28b3b486c8d2187866cbdd5a90141f76   12.5 kB

Additional details