A spark-based parallel distributed posterior decoding algorithm for big data hidden Markov models decoding problem
Description
Hidden Markov models (HMMs) are machine learning models that have been widely used and have proven efficient in many conventional applications. This paper proposes a modified posterior decoding algorithm for the HMM decoding problem, based on the MapReduce paradigm and Spark's resilient distributed datasets (RDDs), for large-scale data processing. The objective of this work is to improve the performance of HMMs on big data. The proposed algorithm reduces time complexity and gives good results in terms of running time, speedup, and parallelization efficiency for large amounts of data, i.e., a large number of states and a large number of sequences.
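For readers unfamiliar with posterior decoding, the following is a minimal single-machine sketch in plain Python, not the paper's algorithm. It runs a scaled forward-backward pass and picks, at each position, the state with the highest posterior probability. The names `posterior_decode` and `decode_batch`, and the parameter layout (`pi`, `A`, `B` as nested lists), are assumptions made for illustration. The batch step uses Python's built-in `map`; in a Spark setting one might distribute the sequences as an RDD and apply the same per-sequence function with `RDD.map`, but how the paper actually partitions the work (for example across states as well as sequences) is not stated in this abstract.

```python
from typing import List, Sequence


def posterior_decode(obs: Sequence[int],
                     pi: Sequence[float],
                     A: Sequence[Sequence[float]],
                     B: Sequence[Sequence[float]]) -> List[int]:
    """Return the most probable hidden state at each position.

    obs: observation symbols as integer indices
    pi:  initial state probabilities, pi[i]
    A:   transition probabilities, A[i][j] = P(state j | state i)
    B:   emission probabilities, B[i][k] = P(symbol k | state i)
    """
    n, T = len(pi), len(obs)
    if T == 0:
        return []

    # Forward pass, rescaled at every step to avoid numerical underflow.
    alpha = [[0.0] * n for _ in range(T)]
    scale = [0.0] * T
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    scale[0] = sum(alpha[0])
    alpha[0] = [a / scale[0] for a in alpha[0]]
    for t in range(1, T):
        for j in range(n):
            incoming = sum(alpha[t - 1][i] * A[i][j] for i in range(n))
            alpha[t][j] = incoming * B[j][obs[t]]
        scale[t] = sum(alpha[t])
        alpha[t] = [a / scale[t] for a in alpha[t]]

    # Backward pass, using the same scaling factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n)
            ) / scale[t + 1]

    # The posterior of state i at time t is proportional to alpha * beta.
    path = []
    for t in range(T):
        gamma = [alpha[t][i] * beta[t][i] for i in range(n)]
        path.append(max(range(n), key=gamma.__getitem__))
    return path


def decode_batch(sequences, pi, A, B) -> List[List[int]]:
    """Decode many sequences independently (the 'map' step of the sketch)."""
    return list(map(lambda seq: posterior_decode(seq, pi, A, B), sequences))


if __name__ == "__main__":
    # Toy two-state model (illustrative values only).
    pi = [0.5, 0.5]
    A = [[0.9, 0.1], [0.1, 0.9]]
    B = [[0.9, 0.1], [0.1, 0.9]]
    print(decode_batch([[0, 0, 1, 1], [1, 1, 1]], pi, A, B))
```

Because each sequence is decoded independently, the per-sequence function has no shared state, which is what makes a map-style distribution over sequences straightforward in this sketch.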
Files
Name | Size | Checksum |
---|---|---|
30 20576.pdf | 583.2 kB | md5:2ca5da76fca04fea988dea0e6ff68f49 |