MURAL - Maynooth University Research Archive Library



    Scalable RDF Data Compression using X10


    Cheng, Long, Malik, Avinash, Kotoulas, Spyros, Ward, Tomas E. and Theodoropoulos, Georgios (2014) Scalable RDF Data Compression using X10. Working Paper. arXiv.


    Abstract

    The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for the purposes of Semantic Web applications that perform computations over large volumes of information. A typical method for alleviating the impact of this problem is through the use of compression methods that produce more compact representations of the data. The use of dictionary encoding for this purpose is particularly prevalent in Semantic Web database systems. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we describe an encoding implementation based on the asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art MapReduce algorithm, we demonstrate a speedup of 2.6-7.4× and excellent scalability. These results illustrate the strong potential of the APGAS model for efficient implementation of dictionary encoding and contribute to the engineering of larger-scale Semantic Web applications.
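
For readers unfamiliar with the term, dictionary encoding as mentioned in the abstract replaces each long term string with a compact integer ID and keeps a lookup table for decoding. The following is a minimal single-machine sketch in Python, not the paper's distributed X10 implementation; the function names, the ID assignment order, and the example URIs are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
EncodedTriple = Tuple[int, int, int]


def encode_triples(triples: List[Triple]) -> Tuple[List[EncodedTriple], Dict[str, int]]:
    """Replace every term with an integer ID (hypothetical helper).

    IDs are assigned in order of first appearance, starting at 0.
    """
    term_to_id: Dict[str, int] = {}
    encoded: List[EncodedTriple] = []
    for s, p, o in triples:
        ids = []
        for term in (s, p, o):
            if term not in term_to_id:
                # First time this term is seen: give it the next free ID.
                term_to_id[term] = len(term_to_id)
            ids.append(term_to_id[term])
        encoded.append((ids[0], ids[1], ids[2]))
    return encoded, term_to_id


def decode_triples(encoded: List[EncodedTriple], term_to_id: Dict[str, int]) -> List[Triple]:
    """Invert the dictionary and restore the original term strings."""
    id_to_term = {i: t for t, i in term_to_id.items()}
    return [(id_to_term[s], id_to_term[p], id_to_term[o]) for s, p, o in encoded]


# Illustrative data only; these URIs do not come from the paper.
triples = [
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/bob>"),
    ("<http://example.org/bob>", "<http://xmlns.com/foaf/0.1/knows>", "<http://example.org/alice>"),
]
encoded, dictionary = encode_triples(triples)
restored = decode_triples(encoded, dictionary)
```

In a centralized version like this one, every term lookup goes through a single dictionary, which is the kind of bottleneck the abstract says motivates a distributed scheme.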
    Item Type: Monograph (Working Paper)
    Keywords: RDF; Parallel compression; dictionary encoding; X10; HPC;
    Academic Unit: Faculty of Science and Engineering > Electronic Engineering
    Item ID: 6278
    Identification Number: arXiv:1403.2404
    Depositing User: Dr Tomas Ward
    Date Deposited: 21 Jul 2015 14:53
    Publisher: arXiv
    URI: https://mu.eprints-hosting.org/id/eprint/6278
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
