Table 2 Empirical workloads

From: A Pareto-based scheduler for exploring cost-performance trade-offs for MapReduce workloads

| Workload | Jobs | No. of map tasks | No. of reduce tasks | Deadline (s) | Input size |
|---|---|---|---|---|---|
| Twitter friendship | 3 | 20 | 10 | 1000 | 1.2 GB tweets [a] |
| Sort | 3 | 80 | 10 | 2500 | 5 GB random data |
| Wordcount | 3 | 47 | 10 | 1000 | 3.2 GB movie reviews |
| Twitter-Wordcount | 1 Twitter and 2 Wordcount | 20 Twitter and 47 Wordcount | 10 Twitter and 10 Wordcount | 1000 | 3.2 GB movie reviews and 1.2 GB tweets |

[a] Extracted during the period of January 1, 2013 to April 30, 2013 using the Streaming API 2 of Twitter