Citation

Berlinska, J. and Drozdowski, M. Mitigating Partitioning Skew in MapReduce Computations. In Proceedings of the 6th Multidisciplinary International Conference on Scheduling: Theory and Applications (MISTA 2013), 27-30 Aug 2013, Ghent, Belgium, pages 80-90, 2013.

Paper


Abstract

In this paper we analyze handling partitioning skew in MapReduce computations. The basic MapReduce implementations strongly depend on the assumption that the data is partitioned evenly for reducing. However, in practical applications the data distribution is often skewed, which leads to decreased MapReduce system performance. Using divisible load theory we analyze two methods of handling data skew in MapReduce computations. The proposed algorithms are evaluated in a series of computational experiments. To the best of our knowledge this is the first analytical study comparing mitigation of partitioning skew in two different stages of MapReduce applications.


pdf

You can download the pdf of this publication from here


doi

This publication does not have a doi, so we cannot provide a link to the original source.

What is a doi?: A doi (Digital Object Identifier) is a unique identifier for scientific papers (and occasionally other material). It provides direct access to the location where the original article is published, using the URL http://dx.doi.org/xxxx (replacing xxxx with the doi). See http://dx.doi.org/ for more information.



URL

This publication does not have a URL associated with it.

The URL is only provided if there is additional information that might be useful. For example, where the entry is a book chapter, the URL might link to the book itself.


Bibtex

@INPROCEEDINGS{2013-080-090-P, author = {J. Berlinska and M. Drozdowski},
title = {Mitigating Partitioning Skew in MapReduce Computations},
booktitle = {In proceedings of the 6th Multidisciplinary International Conference on Scheduling : Theory and Applications (MISTA 2013), 27 - 30 Aug 2013, Ghent, Belgium},
year = {2013},
editor = {G. Kendall and B. McCollum and G. {Vanden Berghe}},
pages = {80--90},
note = {Paper},
abstract = { In this paper we analyze handling partitioning skew in MapReduce computations. The basic MapReduce implementations strongly depend on the assumption that the data is partitioned evenly for reducing. However, in practical applications the data distribution is often skewed, which leads to decreased MapReduce system performance. Using divisible load theory we analyze two methods of handling data skew in MapReduce computations. The proposed algorithms are evaluated in a series of computational experiments. To the best of our knowledge this is the first analytical study comparing mitigation of partitioning skew in two different stages of MapReduce applications.},
owner = {Graham},
timestamp = {2017.01.16},
webpdf = {2013-080-090-P.pdf} }