Document Type

Article

Language

eng

Publication Date

7-14-2020

Publisher

Institute of Electrical and Electronics Engineers

Source Publication

2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Source ISBN

9781728168760

Abstract

Advances in topographic techniques, clear satellite imagery, and other means of collecting information are making geospatial datasets larger, more complex, and more heterogeneous. Executing spatial computations and analytics efficiently on such datasets requires parallel processing. To exploit fine-grained parallelism on large-scale compute clusters, skewed datasets must be partitioned in a load-balanced way. In this work, we focus on the spatial join operation, where the inputs are two layers of geospatial data. Our partitioning method for spatial join uses the Adaptive Partitioning (ADP) technique, which is based on Quadtree partitioning. Unlike existing partitioning techniques, ADP partitions the spatial join workload rather than each dataset separately, which provides better load balancing. In our experimental evaluation, ADP partitions spatial data in a more balanced way than Quadtree partitioning and Uniform grid partitioning. ADP uses an output-sensitive duplication avoidance technique that minimizes duplication of geometries that are not part of the spatial join output. In a distributed memory environment, this technique can reduce data communication and storage requirements compared to traditional methods. To improve the performance of ADP, an MPI+Threads based parallelization, ParADP, is presented. With ParADP, a pair of real-world datasets, one with 717 million polylines and another with 10 million polygons, is partitioned into 65,536 grid cells within 7 seconds. ParADP shows good weak scaling and good strong scaling up to 4,032 CPU cores.
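
The idea of partitioning the join workload, rather than each layer on its own, can be illustrated with a small sketch. This is not the paper's ADP or ParADP implementation. It assumes that geometries are reduced to bounding boxes, that a cell's workload can be estimated as the product of the two layers' box counts, and that a cell is split into four quadrants while that estimate exceeds a threshold. The names `partition`, `max_work`, and `max_depth` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)
Leaf = Tuple[Box, List[Box], List[Box]]  # (cell, boxes of layer A, boxes of layer B)


def intersects(a: Box, b: Box) -> bool:
    """True if two closed boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(cell: Box, layer_a: List[Box], layer_b: List[Box],
              max_work: int, max_depth: int = 8) -> List[Leaf]:
    """Quadtree-style split driven by a joint workload estimate (an assumption)."""
    a_in = [m for m in layer_a if intersects(m, cell)]
    b_in = [m for m in layer_b if intersects(m, cell)]
    work = len(a_in) * len(b_in)  # assumed cost model: candidate pairs in the cell

    if work == 0:
        # One layer is absent here, so the cell cannot contribute join output.
        # Skipping it avoids copying geometries that would never be joined.
        return []
    if work <= max_work or max_depth == 0:
        return [(cell, a_in, b_in)]

    xmin, ymin, xmax, ymax = cell
    xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = [
        (xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
        (xmin, ym, xm, ymax), (xm, ym, xmax, ymax),
    ]
    leaves: List[Leaf] = []
    for quad in quadrants:
        leaves.extend(partition(quad, a_in, b_in, max_work, max_depth - 1))
    return leaves


# Skewed toy input: layer A clusters along a diagonal; layer B has one box
# covering that cluster and one box far away with nothing to join against.
layer_a = [(i, i, i + 1, i + 1) for i in range(0, 16, 2)]
layer_b = [(0, 0, 16, 16), (40, 40, 48, 48)]
leaves = partition((0, 0, 64, 64), layer_a, layer_b, max_work=2)
```

In this toy run, the region around the distant layer B box produces no leaf at all, while the dense diagonal is subdivided until each leaf holds at most `max_work` candidate pairs. Partitioning each layer separately would not see that the distant box has no join partners.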

Comments

Accepted version. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (July 14, 2020): 810-820. DOI. © 2020 Institute of Electrical and Electronics Engineers. Used with permission.
