The Hadoop configuration can be fine-tuned to optimize cluster performance. To manage a cluster effectively, you need to monitor the hardware and system performance to catch issues before they become problems.

You can monitor Java applications such as Hadoop and Datameer with JMX, and monitor the underlying OS with SNMP to watch each machine's CPU activity, memory usage, network traffic, disk I/O, and so on. A variety of applications provide real-time monitoring and alerts.
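
As a rough illustration of what JMX exposes, the following sketch reads a few standard JVM attributes (heap usage, thread count, system load) from the platform MBean server. It only inspects the JVM it runs in. The class name `JmxSnapshot` and the metric keys are made up for this example, and only the standard `java.lang:type=...` MBeans are queried. Reading the same attributes from a running Hadoop or Datameer process would additionally require remote JMX to be enabled on that process and a remote connection, which is not shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical helper for illustration; not part of Hadoop or Datameer.
final class JmxSnapshot {

    // Reads one attribute from an MBean registered in this JVM.
    private static Object read(String objectName, String attribute) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            return server.getAttribute(new ObjectName(objectName), attribute);
        } catch (JMException e) {
            throw new IllegalStateException(
                    "Cannot read " + attribute + " from " + objectName, e);
        }
    }

    // Heap usage is published as CompositeData; convert it to MemoryUsage.
    static MemoryUsage heapUsage() {
        CompositeData data =
                (CompositeData) read("java.lang:type=Memory", "HeapMemoryUsage");
        return MemoryUsage.from(data);
    }

    // Collects a small set of JVM and OS metrics under illustrative key names.
    static Map<String, Number> snapshot() {
        MemoryUsage heap = heapUsage();
        Map<String, Number> values = new LinkedHashMap<>();
        values.put("heapUsedBytes", heap.getUsed());
        values.put("heapCommittedBytes", heap.getCommitted());
        values.put("threadCount",
                (Integer) read("java.lang:type=Threading", "ThreadCount"));
        values.put("availableProcessors",
                (Integer) read("java.lang:type=OperatingSystem", "AvailableProcessors"));
        // Negative when the platform cannot report a load average.
        values.put("systemLoadAverage",
                (Double) read("java.lang:type=OperatingSystem", "SystemLoadAverage"));
        return values;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

A monitoring tool typically polls values like these on a schedule and raises an alert when one crosses a threshold.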

Some tools commonly used for monitoring include:

In a cluster, the most vital machines are the NameNode, SecondaryNameNode, and JobTracker. If slave servers go down, Hadoop's built-in redundancy compensates, so their configuration isn't as vital.

To learn about best monitoring practices, see http://www.cloudera.com/blog/2009/11/hadoop-world-monitoring-best-practices-from-ed-capriolo/

Learn more: