Published February 25, 2021 | Version 3.0
Software Open

Scalable FSM Parallelization via Path Fusion and Higher-Order Speculation

  • 1. Michigan Technological University
  • 2. University of California, Riverside

Description

The finite-state machine (FSM) is a fundamental computation model used by many applications. However, FSM execution is known to be "embarrassingly sequential" due to the state dependences among transitions. Existing solutions leverage enumerative or speculative parallelization to break the state dependences. Despite their promising results, the efficiency of both parallelization schemes depends heavily on the properties of the FSM and its inputs. For FSMs exhibiting unfavorable properties, the former suffers from the overhead of maintaining multiple execution paths, while the latter is bottlenecked by the serial reprocessing of misspeculated cases. Either way, the scalability of FSM parallelization is seriously compromised.

This work addresses the above scalability limitations with two novel techniques. First, for enumerative parallelization, it proposes path fusion, a technique inspired by the classic NFA-to-DFA conversion: both an NFA and state enumeration need to maintain multiple current states. By mapping a vector of states in the original FSM to a new (fused) state, path fusion can reduce multiple FSM execution paths to a single path, thus minimizing the overhead of path maintenance. Second, for speculative parallelization, to overcome the bottleneck of serial reprocessing during validation, this work introduces higher-order speculation, a generalized speculation model that allows a speculated state to be validated against the result of another speculative execution, thus enabling parallel reprocessing. Finally, this work integrates the different FSM parallelization schemes into a framework, BoostFSM, which automatically selects the best scheme based on the relevant properties of the FSM. Evaluation on real-world FSMs with diverse characteristics confirms that BoostFSM raises the average speedup from 3.1× and 15.4× for the existing speculative and enumerative parallelization schemes, respectively, to 25.8× on a 64-core machine.
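To make the path-fusion idea concrete, the following is a minimal sketch (not code from the artifact; the toy FSM, input chunk, and function names are illustrative). In enumerative parallelization, a worker must run its input chunk from every possible start state; path fusion treats the vector of per-path current states as a single fused state and builds fused transitions lazily, much like subset construction, so each input symbol costs one table lookup instead of one lookup per path:

```python
def run_chunk_fused(delta, start_states, chunk):
    """Process `chunk` from every state in `start_states` at once.

    Instead of advancing each enumerated path separately, the vector of
    current states is treated as one fused state; `fused_delta` memoizes
    fused transitions as they are discovered (cf. NFA-to-DFA conversion),
    so repeated (fused state, symbol) pairs cost a single lookup.
    """
    fused = tuple(start_states)          # fused state = vector of per-path states
    fused_delta = {}                     # lazily built fused transition table
    for sym in chunk:
        key = (fused, sym)
        if key not in fused_delta:       # construct the fused transition once
            fused_delta[key] = tuple(delta[s][sym] for s in fused)
        fused = fused_delta[key]         # one lookup advances all paths
    return fused

# Toy 3-state FSM over the alphabet {'a', 'b'} (illustrative only).
delta = {
    0: {'a': 1, 'b': 0},
    1: {'a': 2, 'b': 0},
    2: {'a': 2, 'b': 2},
}

# Enumerate from every possible start state for this chunk.
result = run_chunk_fused(delta, [0, 1, 2], "abaa")  # → (2, 2, 2)
```

Note that once the per-path states converge (here, all paths reach state 2), the fused path carries no redundant work at all, which is the source of the overhead reduction relative to maintaining each enumerated path separately.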

Files (7.6 GB)

Name             Size      md5
ASPLOS21_AE.zip  7.6 GB    556792d985cfe2525ae4d3f7f97eddcb
(unnamed)        123.0 kB  794c56e4fdc17f04d01009a23fbdc2f6