puzzlef/pagerank-cuda-optimization-skip-chains: Performance benefit of skipping chain vertices for CUDA based PageRank

Subhajit Sahu

doi:10.5281/zenodo.5777892

Published December 13, 2021 | Version 1.0.0

Software Open

puzzlef/pagerank-cuda-optimization-skip-chains: Performance benefit of skipping chain vertices for CUDA based PageRank

Subhajit Sahu

Part of the report STIC-D based Algorithmic Optimizations for Monolithic PageRank.

--

Performance benefit of skipping chain vertices for CUDA based PageRank (pull, CSR).

This experiment was for comparing performance between:

Find PageRank without optimization.
Find PageRank skipping rank calculation of chain vertices.

Each approach was attempted on a number of graphs, running each approach 5 times to get a good time measure. The minimum size of chains is varied from 2-256, with 2 being the default (skip2), and the best among them is also picked (skipbest). A chain here means a set of uni-directional links connecting one vertex to the next, without any additional edges. Bi-directional links are not considered as chains. Note that most graphs don't have enough chains to provide an advantage. Road networks do have chains, but they are bi-directional, and thus not considered here. It seems skipping all chains (skip2) decreases execution time by -11 to 6%, while skipping chains with best group size or more (skipbest) decreases execution time by -9 to 6%, when compared to no optimization. Please note that some other these small speedups can be attributed to randomness (execution time has small variations). With respect to GM-RATIO, skipping chains (skip2) and skipping best chains (skipbest) completes in 4% more time (0.96x) than using default approach. With respect to AM-RATIO, skipping chains (skip2) completes in 4% more time (0.96x), and skipping best chains (skipbest) completes in 3% more time (0.97x), than using default approach. It therefore makes sense to apply this optimization only when the graph has a large number of chains.

All outputs are saved in out and a small part of the output is listed here. Some charts are to be included below, generated from sheets. The input data used for this experiment is available at "graphs" (for small ones), and the SuiteSparse Matrix Collection. This experiment was done with guidance from Prof. Dip Sankar Banerjee and Prof. Kishore Kothapalli.

$ nvcc -std=c++17 -Xcompiler -lnvgraph -O3 main.cu
$ ./a.out ~/data/min-1DeadEnd.mtx
$ ./a.out ~/data/min-2SCC.mtx
$ ...

# ...
#
# Loading graph /home/subhajit/data/web-Stanford.mtx ...
# order: 281903 size: 2312497 {}
# order: 281903 size: 2312497 {} (transposeWithDegree)
# [00011.371 ms; 063 iters.] [0.0000e+00 err.] pagerankCuda
# [00011.726 ms; 063 iters.] [2.5968e-09 err.] pagerankCuda [skip-chains=002; chain-vertices=00001265; chains=00000575]
# [00011.673 ms; 063 iters.] [9.5497e-12 err.] pagerankCuda [skip-chains=004; chain-vertices=00000051; chains=00000002]
# [00011.697 ms; 063 iters.] [9.5497e-12 err.] pagerankCuda [skip-chains=008; chain-vertices=00000051; chains=00000002]
# [00011.992 ms; 063 iters.] [9.5497e-12 err.] pagerankCuda [skip-chains=016; chain-vertices=00000051; chains=00000002]
# [00011.948 ms; 063 iters.] [7.5033e-12 err.] pagerankCuda [skip-chains=032; chain-vertices=00000034; chains=00000001]
# [00011.873 ms; 063 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=064; chain-vertices=00000000; chains=00000000]
# [00011.860 ms; 063 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=128; chain-vertices=00000000; chains=00000000]
# [00011.853 ms; 063 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=256; chain-vertices=00000000; chains=00000000]
#
# ...
#
# Loading graph /home/subhajit/data/soc-LiveJournal1.mtx ...
# order: 4847571 size: 68993773 {}
# order: 4847571 size: 68993773 {} (transposeWithDegree)
# [00157.950 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda
# [00160.251 ms; 051 iters.] [3.9981e-09 err.] pagerankCuda [skip-chains=002; chain-vertices=00015182; chains=00007435]
# [00158.980 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=004; chain-vertices=00000008; chains=00000002]
# [00159.280 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=008; chain-vertices=00000000; chains=00000000]
# [00158.992 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=016; chain-vertices=00000000; chains=00000000]
# [00158.958 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=032; chain-vertices=00000000; chains=00000000]
# [00159.059 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=064; chain-vertices=00000000; chains=00000000]
# [00159.376 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=128; chain-vertices=00000000; chains=00000000]
# [00158.963 ms; 051 iters.] [0.0000e+00 err.] pagerankCuda [skip-chains=256; chain-vertices=00000000; chains=00000000]
#
# ...

References

Files

puzzlef/pagerank-cuda-optimization-skip-chains-1.0.0.zip

Files (49.9 kB)

Name	Size	Download all
puzzlef/pagerank-cuda-optimization-skip-chains-1.0.0.zip md5:248d2abd4df57791f533dfd80bd6994e	49.9 kB	Preview Download

Additional details

Is supplement to: https://github.com/puzzlef/pagerank-cuda-optimization-skip-chains/tree/1.0.0 (URL)

	All versions	This version
Views	43	43
Downloads	9	9
Data volume	449.4 kB	449.4 kB

puzzlef/pagerank-cuda-optimization-skip-chains: Performance benefit of skipping chain vertices for CUDA based PageRank

Authors/Creators

Description

Files

puzzlef/pagerank-cuda-optimization-skip-chains-1.0.0.zip

Files (49.9 kB)

Additional details

Related works