Early survey conclusions...

How to write a MapReduce-related paper:

1. Find a feature of MapReduce that performs terribly for a *specific* setup or class of applications

Tips:

  • Read the motivation of the MapReduce system
  • Identify the set of goals, the set of target applications and the set of appropriate physical setups that MapReduce was designed for
  • Choose *anything* outside those sets

 

2. Change this feature using the simplest solution

Tips:

  • Download Hadoop 
  • Locate the class with the corresponding functionality 
  • Write your own implementation 
  • Build
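To make step 2 concrete, here is a minimal toy sketch of what "swap one component" amounts to. It is an in-memory word count in Python, not Hadoop code, and every name in it (`word_count`, `default_partitioner`, `first_letter_partitioner`) is hypothetical. The assumption is that the feature being replaced is the way keys are assigned to reducers.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# A partitioner maps (key, number of reducers) to a reducer index.
Partitioner = Callable[[str, int], int]


def default_partitioner(key: str, num_reducers: int) -> int:
    """Stand-in for the 'vanilla' behaviour: a simple deterministic hash."""
    return sum(ord(c) for c in key) % num_reducers


def first_letter_partitioner(key: str, num_reducers: int) -> int:
    """The 'novel contribution': route keys by their first letter."""
    if not key:
        return 0
    return (ord(key[0].lower()) - ord("a")) % num_reducers


def word_count(
    lines: List[str],
    num_reducers: int,
    partitioner: Partitioner = default_partitioner,
) -> List[Dict[str, int]]:
    """Toy word count; returns one result dict per reducer."""
    # Map phase: emit (word, 1) for every word.
    pairs = [(word.lower(), 1) for line in lines for word in line.split()]

    # Shuffle phase: the only part that depends on the partitioner.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partitioner(key, num_reducers)][key].append(value)

    # Reduce phase: sum the counts for each key.
    return [
        {key: sum(values) for key, values in bucket.items()}
        for bucket in buckets
    ]
```

Calling `word_count(lines, 2, first_letter_partitioner)` instead of `word_count(lines, 2)` changes only where each key lands. The overall counts are identical, which is roughly the scale of change the step describes.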

 

3. Perform evaluation against vanilla MapReduce, using *only* the setup and applications your version was built for.

Tips: This is probably the hardest step...

  • Find the optimal configuration for your system, i.e. number of nodes, data sizes, data formats, etc.
  • Use the default configuration for Hadoop
  • Keep tuning the configuration until your system outperforms Hadoop
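The tuning loop in step 3 can be sketched as a search that stops at the first configuration where your system wins. This is a toy illustration only: `first_winning_config` and both cost functions are hypothetical, and the numbers in the cost models are invented for the example, not measurements of Hadoop or of any real system.

```python
from itertools import product
from typing import Callable, Dict, List, Optional

Config = Dict[str, int]


def first_winning_config(
    grid: Dict[str, List[int]],
    ours: Callable[[Config], float],
    baseline: Callable[[Config], float],
) -> Optional[Config]:
    """Return the first configuration where `ours` beats `baseline`, else None."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        config = dict(zip(keys, values))
        if ours(config) < baseline(config):
            return config  # stop tuning: this is the one that gets published
    return None


# Made-up cost models (seconds), purely illustrative.
def toy_baseline(config: Config) -> float:
    # Higher fixed overhead, lower cost per GB per node.
    return config["data_gb"] / config["nodes"] + 5.0


def toy_ours(config: Config) -> float:
    # Lower fixed overhead, but it scales worse with data size.
    return 2.0 * config["data_gb"] / config["nodes"] + 1.0


if __name__ == "__main__":
    grid = {"nodes": [2, 8], "data_gb": [64, 16]}
    print(first_winning_config(grid, toy_ours, toy_baseline))
```

In this toy model the modified system only wins when each node handles less than 4 GB, so the search skips three losing configurations and reports the fourth. The point of the sketch is the asymmetry: the losing configurations never appear in the evaluation section.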

 

4. Publish

Tips:

  • Find a cool name for your system 
  • Write a long, descriptive paper 
  • Focus on the limitations of MapReduce that your system overcomes 
  • Include a colourful architecture diagram, showing your system as an additional layer to the Hadoop stack (even though it's only a couple of modified classes)
  • Publish

 

Do research, they said... it will be fun, they said... it will be novel, they said...

 

V.