Comparative Study of Spark MLlib vs. TensorFlow on Distributed Big Data Sets
Description
ABSTRACT
Big data analytics typically involves analyses in which the volume of data exceeds the computational resources of a single machine. Machine learning (ML) often plays a central role in the analytical pipeline, but many existing ML techniques do not scale well with dataset size, and many ML implementations are not compatible with analytics systems such as Hadoop or Spark. AOps and C-Systems offer big data analytics solutions built on Apache Spark to compute task relatedness and perform classification; however, Spark's core engine does not itself provide ML algorithms, which are supplied by separate libraries such as MLlib. Several big data analytics and management solutions also rely on rule-based or domain-knowledge-based heuristics rather than statistical or ML methods.
Keywords: Spark MLlib, TensorFlow, big data, machine learning (ML)
Files
13.pdf (301.0 kB)
md5:9f82bae7f80b654068e54cf2594b3fb7