MURAL - Maynooth University Research Archive Library



    Greener Data Exchange in the Cloud: A Coding-Based Optimization for Big Data Processing


    Asad, Zakia, Asad Rehman Chaudhry, Mohammad and Malone, David (2016) Greener Data Exchange in the Cloud: A Coding-Based Optimization for Big Data Processing. IEEE Journal on Selected Areas in Communications, 34 (5). pp. 1360-1377. ISSN 0733-8716


    Abstract

    The rise of the cloud and distributed data-intensive (big data) applications puts pressure on data center networks due to the movement of massive volumes of data. Reducing the volume of communication is pivotal for greener data exchange through efficient utilization of network resources. This paper proposes the use of a mixing technique, spate coding, working in tandem with software-defined network control, as a means of dynamically controlled reduction in the volume of communication. We introduce motivating real-world use cases and present a novel spate coding algorithm for data center networks. We also analyze the computational complexity of the general problem of minimizing the volume of communication in a distributed data center application without degrading the rate of information exchange, and provide theoretical limits of such schemes. Moreover, we bridge the gap between theory and practice with a proof-of-concept implementation of the proposed system in a real-world data center. We use Hadoop MapReduce, the most widely used big data processing framework, as our target. Experimental results on two industry-standard benchmarks show the advantage of the proposed system over a vanilla Hadoop implementation, an in-network combiner, and Combine-N-Code. The proposed coding-based scheme shows performance improvement in terms of volume of communication (up to 62%), goodput (up to 76%), disk utilization (up to 38%), and the number of bits that can be transmitted per Joule of energy (up to 200%).
    Item Type: Article
    Keywords: Big data; optimization; Hadoop; green computing; green communication; cloud computing; spate coding; data center networks; middlebox
    Academic Unit: Faculty of Science and Engineering > Mathematics and Statistics
    Faculty of Science and Engineering > Research Institutes > Hamilton Institute
    Item ID: 10046
    Identification Number: 10.1109/JSAC.2016.2520245
    Depositing User: Dr. David Malone
    Date Deposited: 03 Oct 2018 14:16
    Journal or Publication Title: IEEE Journal on Selected Areas in Communications
    Publisher: Institute of Electrical and Electronics Engineers (IEEE)
    Refereed: Yes
    URI: https://mu.eprints-hosting.org/id/eprint/10046
    Use Licence: This item is available under a Creative Commons Attribution Non Commercial Share Alike Licence (CC BY-NC-SA).
