Cheng, Long, Malik, Avinash, Kotoulas, Spyros, Ward, Tomas E. and Theodoropoulos, Georgios (2014) Scalable RDF Data Compression using X10. Working Paper. arXiv.
Abstract
The Semantic Web comprises enormous volumes
of semi-structured data elements. For interoperability, these
elements are represented by long strings. Such representations
are not efficient for the purposes of Semantic Web applications
that perform computations over large volumes of information.
A typical method for alleviating the impact of this problem is
through the use of compression methods that produce more
compact representations of the data. The use of dictionary
encoding for this purpose is particularly prevalent in Semantic
Web database systems. However, centralized implementations
present performance bottlenecks, giving rise to the need for
scalable, efficient distributed encoding schemes. In this paper,
we describe an encoding implementation based on the asynchronous
partitioned global address space (APGAS) parallel
programming model. We evaluate performance on a cluster of
up to 384 cores and datasets of up to 11 billion triples (1.9
TB). Compared to the state-of-the-art MapReduce algorithm, we
demonstrate a speedup of 2.6-7.4× and excellent scalability.
These results illustrate the strong potential of the APGAS
model for efficient implementation of dictionary encoding and
contribute to the engineering of larger-scale Semantic Web
applications.
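To illustrate the dictionary encoding idea the abstract refers to, here is a minimal single-machine sketch in Python. It is not the paper's distributed X10/APGAS implementation; the function names and the example triples are hypothetical and chosen only for illustration. Each distinct term (a long string) is assigned a small integer ID, and triples are stored as tuples of IDs.

```python
def encode_triples(triples):
    """Assign each distinct term an integer ID and rewrite triples as ID tuples.

    Illustrative only: IDs are handed out sequentially in order of first
    appearance, which assumes a single process holding the whole dictionary.
    """
    dictionary = {}
    encoded = []
    for triple in triples:
        ids = []
        for term in triple:
            if term not in dictionary:
                dictionary[term] = len(dictionary)
            ids.append(dictionary[term])
        encoded.append(tuple(ids))
    return dictionary, encoded


def decode_triples(dictionary, encoded):
    """Invert the dictionary and map ID tuples back to the original terms."""
    reverse = {term_id: term for term, term_id in dictionary.items()}
    return [tuple(reverse[term_id] for term_id in triple) for triple in encoded]


# Hypothetical example data, not taken from the paper's datasets.
triples = [
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/bob>"),
    ("<http://example.org/bob>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/alice>"),
]
dictionary, encoded = encode_triples(triples)
```

In this sketch the six long strings in the input collapse to three dictionary entries, and repeated terms cost only one integer each. The bottleneck the abstract mentions shows up here as the single shared `dictionary`, which a distributed scheme would have to partition across nodes.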
Item Type: | Monograph (Working Paper) |
---|---|
Keywords: | RDF; Parallel compression; dictionary encoding; X10; HPC |
Academic Unit: | Faculty of Science and Engineering > Electronic Engineering |
Item ID: | 6278 |
Identification Number: | arXiv:1403.2404 |
Depositing User: | Dr Tomas Ward |
Date Deposited: | 21 Jul 2015 14:53 |
Publisher: | arXiv |
URI: | https://mu.eprints-hosting.org/id/eprint/6278 |
Use Licence: | This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA). |