Publication
Title
Cost model for Pregel on GraphX
Author
Abstract
The graph partitioning strategy plays a vital role in the overall execution of an algorithm in a distributed graph processing system. Choosing the best strategy is very challenging, as no single strategy is the best fit for all kinds of graphs or algorithms. In this paper, we help users choose a suitable partitioning strategy for algorithms based on the Pregel model by providing a cost model for the Pregel implementation in Spark-GraphX. The cost model shows the relationship between four major parameters: (1) input graph, (2) cluster configuration, (3) algorithm properties, and (4) partitioning strategy. We validate the accuracy of the cost model on 17 different combinations of input graph, algorithm, and partitioning strategy. As such, the cost model can serve as a basis for yet-to-be-developed optimizers for Pregel.
Language
English
Source (journal)
Lecture notes in computer science. - Berlin, 1973, currens
Source (book)
Advances in Databases and Information Systems : 21st European Conference, ADBIS 2017, September 24-27, 2017, Nicosia, Cyprus / Kirikova, Mārīte [edit.]; et al.
Source (series)
Information systems and applications, incl. internet/web, and HC (LNISA) ; 10509
Publication
Cham : Springer, 2017
ISBN
978-3-319-66917-5 [online]
978-3-319-66916-8
Volume/pages
10509 (2018), p. 153-166
UAntwerpen
Faculty/Department
Research group
Publication type
Subject
Affiliation
Publications with a UAntwerp address
External links
Record
Identification
Creation 03.10.2018
Last edited 15.07.2021