Early survey conclusions...
How to write a MapReduce-related paper:
1. Find a feature of MapReduce that is terrible for a *specific* setup or class of applications
Tips:
- Read the motivation of the MapReduce system
- Identify the goals, the target applications and the physical setups that MapReduce was designed for
- Choose *anything* outside those sets
2. Change this feature using the simplest solution
Tips:
- Download Hadoop
- Locate the class with the corresponding functionality
- Write your own implementation
- Build
3. Perform evaluation against vanilla MapReduce, using *only* the setup and applications your version was built for.
Tips: This is probably the hardest step...
- Find the optimal configuration for your system, e.g. number of nodes, data sizes, data formats
- Use the default configuration for Hadoop
- Keep tuning the configuration until your system outperforms Hadoop
4. Publish
Tips:
- Find a cool name for your system
- Write a long, descriptive paper
- Focus on the limitations of MapReduce that your system overcomes
- Include a colourful architecture diagram, showing your system as an additional layer on top of the Hadoop stack (even though it's only a couple of modified classes)
- Publish
Do research, they said... It will be fun, they said... It will be novel, they said...
V.

