Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2 (2014)

E. Resources and Additional Information

Apache Hadoop is an open-source project of the Apache Software Foundation (http://www.apache.org/). Community involvement is encouraged, and information about the Apache Hadoop project can be found at the project website: http://hadoop.apache.org.

Active discussion also takes place in Apache’s JIRA issue tracker, where issues, ideas, and many important discussions are recorded. You can see all the Hadoop JIRAs by consulting the following:


In addition to the project website and the JIRA issue tracker, you may wish to consult the following resources for further information.

1. Vinod Kumar Vavilapalli, Arun C. Murthy, Chris Douglas, Sharad Agarwal, Mahadev Konar, Robert Evans, Thomas Graves, Jason Lowe, Hitesh Shah, Siddharth Seth, Bikas Saha, Carlo Curino, Owen O’Malley, Sanjay Radia, Benjamin Reed, and Eric Baldeschwieler. Apache Hadoop YARN: Yet Another Resource Negotiator. ACM Symposium on Cloud Computing 2013. http://www.socc2013.org/home/program/a5-vavilapalli.pdf.

2. J. Dean and S. Ghemawat. MapReduce: Simplified data processing on large clusters. Communications of the ACM, 51(1), January 2008.

3. K. Shvachko, H. Kuang, S. Radia, and R. Chansler. The Hadoop Distributed File System. In Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), MSST ’10. Washington, DC: IEEE Computer Society, 2010.

4. O. O’Malley. Hadoop. In Hadoop: The Definitive Guide. O’Reilly Media, 2012.

5. Apache Storm. http://storm-project.net/documentation.html.

6. T. Graves. GraySort and MinuteSort at Yahoo! on Hadoop 0.23. 2013. http://sortbenchmark.org/Yahoo2013Sort.pdf.

7. Apache Tez. http://incubator.apache.org/projects/tez.html.

8. M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly. Dryad: Distributed data-parallel programs from sequential building blocks. In Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007, EuroSys ’07. New York, NY: ACM, 2007.

9. S. Loughran, D. Das, and E. Baldeschwieler. Introducing Hoya: HBase on YARN. 2013. http://hortonworks.com/blog/introducing-hoya-hbase-on-yarn.

10. C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: A not-so-foreign language for data processing. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, SIGMOD ’08. New York, NY: ACM, 2008.

11. A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka, N. Zhang, S. Anthony, H. Liu, and R. Murthy. Hive: A petabyte scale data warehouse using Hadoop. In F. Li, M. M. Moro, S. Ghandeharizadeh, J. R. Haritsa, G. Weikum, M. J. Carey, F. Casati, E. Y. Chang, I. Manolescu, S. Mehrotra, U. Dayal, and V. J. Tsotras, eds., Proceedings of the 26th International Conference on Data Engineering, ICDE 2010, March 1–6, 2010, Long Beach, California, USA. IEEE, 2010.

12. B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A. D. Joseph, R. Katz, S. Shenker, and I. Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In Proceedings of the 8th USENIX Conference on Networked Systems Design and Implementation, NSDI ’11. Berkeley, CA: USENIX Association, 2011.