Hadoop and HBase use the GangliaContext class to send the metrics collected by each daemon (such as datanode, tasktracker, jobtracker, HMaster, etc.) to gmonds.
Once you have set up Ganglia successfully, you may want to edit /etc/hadoop/conf/hadoop-metrics.properties and /etc/hbase/conf/hadoop-metrics.properties to announce Hadoop- and HBase-related metrics to Ganglia. Since we use CDH 4.0.1, which is compatible with Ganglia releases 3.1.x, we use the newly introduced GangliaContext31 class (instead of the older GangliaContext class) in the properties files.
Metrics configuration for slaves
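A minimal sketch of what /etc/hadoop/conf/hadoop-metrics.properties may look like on a slave node (hadoop-slave1.IP.address stands for the lead slave node, and the 10-second period is just an example; the same GangliaContext31 pattern applies to the remaining contexts and to the HBase properties file):

```properties
# send each metrics context to the gmond on the lead slave node every 10 seconds
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=hadoop-slave1.IP.address:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=hadoop-slave1.IP.address:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=hadoop-slave1.IP.address:8649
```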
Metrics configuration for master
It should be the same as for the slaves, just with hadoop-master.IP.address:8649 instead of hadoop-slave1.IP.address:8649, for example:
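Taking the jvm context as an example:

```properties
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=hadoop-master.IP.address:8649
```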
Remember to edit both properties files (/etc/hadoop/conf/hadoop-metrics.properties for Hadoop and /etc/hbase/conf/hadoop-metrics.properties for HBase) on all nodes, and then restart the Hadoop and HBase clusters. No further configuration is necessary.
Some more details
Actually, I was surprised that Hadoop's daemons really send data somewhere, instead of just being polled for it. What does this mean? It means, for example, that every single slave node runs several processes (e.g. gmond, datanode, tasktracker and regionserver) that collect metrics and send them to the gmond running on the slave1 node. If we stop the gmonds on slave2, slave3 and slave4, but keep running Hadoop's daemons, we will still get the Hadoop-related metrics (but no metrics about memory or CPU usage, since those were to be sent by the stopped gmonds). Please look at the slave2 node in the picture below to see (more or less) how it works (tt, dd and rs denote tasktracker, datanode and regionserver respectively, while slave4 was removed to improve readability).
Single points of failure
This configuration works well until nodes start to fail. And we know that they will! And we know that, unfortunately, our configuration has at least two single points of failure (SPoF):
- gmond on slave1 (if this node fails, all monitoring statistics about all slave nodes will be unavailable)
- gmetad and the web frontend on master (if this node fails, the full monitoring system will be unavailable. It means that we not only lose the most important Hadoop node (actually, it should be called SUPER-master since it has so many master daemons installed ;), but we also lose a valuable source of monitoring information that might help us detect the cause of the failure by looking at the graphs and metrics for this node generated just a moment before the failure)
Avoiding Ganglia’s SPoF on slave1 node
Fortunately, you may specify as many udp_send_channels as you like to send metrics redundantly to other gmonds (assuming that these gmonds specify udp_recv_channels to listen for incoming metrics).
In our case, we may select slave2 to be an additional lead node (together with slave1) to collect metrics redundantly (and announce them to gmetad). To do so:

- update gmond.conf on all slave nodes and define an additional udp_send_channel section to send metrics to slave2 (port 8649)
- update gmond.conf on slave2 to define a udp_recv_channel (port 8649) to listen for incoming metrics and a tcp_accept_channel (port 8649) to announce them (the same settings should already be set in gmond.conf on slave1)
- update the hadoop-metrics.properties file for the Hadoop and HBase daemons running on the slave nodes to send their metrics to both slave1 and slave2 (see the first sketch below)
- finally, update the data_source "hadoop-slaves" entry in gmetad.conf to poll data from two redundant gmonds (if gmetad cannot pull the data from slave1.node.IP.address, it will continue trying slave2.node.IP.address; see the second sketch below)
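A sketch of the hadoop-metrics.properties change, with the jvm context shown as an example (GangliaContext31 accepts a list of servers):

```properties
# send Hadoop metrics redundantly to the gmonds on slave1 and slave2
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=hadoop-slave1.IP.address:8649,hadoop-slave2.IP.address:8649
```

And the corresponding data_source entry in gmetad.conf (gmetad tries the listed hosts in order):

```
data_source "hadoop-slaves" slave1.node.IP.address:8649 slave2.node.IP.address:8649
```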
Perhaps the picture below is not the most fortunate (so many arrows), but it is meant to show that if slave1 fails, gmetad will still be able to take metrics from the gmond on the slave2 node (since all slave nodes send metrics redundantly to the gmonds running on slave1 and slave2).
Avoiding Ganglia’s SPoF on master node
The main idea here is not to collocate gmetad (and the web frontend) with the Hadoop master daemons, so that we do not lose the monitoring statistics if the master node fails (or simply becomes unavailable). One idea is, for example, to move gmetad (and the web frontend) from the master node to slave3 (or slave4), or simply to introduce a redundant gmetad running on slave3 (or slave4). The former idea seems quite OK, while the latter makes things quite complicated for such a small cluster.
I guess that an even better idea is to introduce an additional node (an "edge" node, if possible) that runs gmetad and the web frontend (it may also have the base Hadoop and HBase packages installed, but it does not run any Hadoop daemons; it acts as a client machine only, used to launch MapReduce jobs and access HBase). Actually, such an "edge" node is common practice for providing the interface between users and the cluster (e.g. it runs Pig, Hive and Oozie).
Troubleshooting and tips that may help
Since debugging various aspects of the configuration was the longest part of setting up Ganglia, I share some tips here. Note that this does not cover all possible troubleshooting; it is rather based on the problems that we encountered and finally managed to solve.
Start small
Although the configuration of Ganglia is not that complex, it is good to start with only two nodes and, once it works, grow the setup to the full cluster. But before you install any Ganglia daemon…
Try to send “Hello” from node1 to node2
Make sure that you can talk to port 8649 on the given target host using the UDP protocol. netcat is a simple tool that helps you verify it. Open port 8649 on node1 (called the "lead node" later) for inbound UDP connections, and then send some text to it from node2.
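For example (hostnames are placeholders; note that netcat variants differ: traditional netcat needs -p together with -l, while BSD netcat takes the port directly, i.e. nc -u -l 8649):

```sh
# on node1 (the lead node): listen for UDP packets on port 8649
nc -u -l -p 8649

# on node2: send a test message to node1 over UDP
echo "Hello node1" | nc -u node1.IP.address 8649
```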
If it does not work, please double-check whether your firewall rules (iptables, or ip6tables if you use IPv6) open port 8649 for both UDP and TCP connections.
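A sketch of iptables rules that open the port (adjust them to your existing policy, and mirror them in ip6tables if you use IPv6):

```sh
iptables -A INPUT -p udp --dport 8649 -j ACCEPT
iptables -A INPUT -p tcp --dport 8649 -j ACCEPT
```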
Let gmond send some data to another gmond
Install gmond on two nodes and verify that one can send its metrics to the other using a UDP connection on port 8649. You may use the following settings in the gmond.conf file on both nodes:
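A minimal sketch of such a gmond.conf (both nodes send to the gmond on hadoop-slave1, which also listens for incoming metrics and announces them over TCP):

```
cluster {
  name = "hadoop-slaves"
}

udp_send_channel {
  host = hadoop-slave1.IP.address
  port = 8649
}

udp_recv_channel {
  port = 8649
}

tcp_accept_channel {
  port = 8649
}
```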
After starting the gmonds (sudo /etc/init.d/ganglia-monitor start), you can use lsof to check whether the connection was established:
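```sh
# list processes with sockets on port 8649; gmond should show up
# with a UDP socket (and a TCP listener on the lead node)
sudo lsof -i :8649
```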
To see whether any data is actually sent to the lead node, use tcpdump to dump the network traffic on port 8649:
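For example (replace eth0 with the interface your nodes actually use):

```sh
sudo tcpdump -i eth0 udp port 8649
```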
Use the debug option
Hopefully, running gmond or gmetad in debug mode gives us more information (it does not run as a daemon in debug mode, so simply stop it using Ctrl+C):
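For example, stop the daemon and start gmond in the foreground with a non-zero debug level (the config path may differ on your distribution):

```sh
sudo /etc/init.d/ganglia-monitor stop
sudo gmond -d 2 -c /etc/ganglia/gmond.conf
```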
We see that various metrics are collected and sent to host=hadoop-slave1.IP.address port=8649. Unfortunately, this alone does not tell us whether they are delivered successfully, since they were sent over UDP…
Do not mix IPv4 and IPv6
Let's have a look at a real situation that we encountered on our cluster (and which was the root cause of a mysterious and annoying Ganglia misconfiguration). First, look at the lsof results.
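This is not the literal output, but an illustrative reconstruction of the kind of thing lsof showed us (unimportant columns elided):

```
# on hadoop-slave2 (sender):   gmond ... IPv6 ... UDP hadoop-slave2:35702->hadoop-slave1.vls:8649
# on hadoop-slave1 (receiver): gmond ... IPv4 ... UDP *:8649
```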
Here, hadoop-slave2 sends metrics to hadoop-slave1 on the right port, and hadoop-slave1 listens on the right port as well. Everything is almost the same as in the snippets from the previous section, except for one important detail: hadoop-slave2 sends over IPv6, but hadoop-slave1 reads over IPv4!
The initial guess was to update the ip6tables rules (in addition to iptables) to open port 8649 for both UDP and TCP connections over IPv6. But it did not work.
It worked when we changed the hostname "hadoop-slave1.vls" to its IP address in the gmond.conf files (yes, earlier I had used hostnames instead of IP addresses in every file).
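In other words, a change along these lines in each udp_send_channel section (the address below is hypothetical):

```
udp_send_channel {
  # host = hadoop-slave1.vls   # the hostname resolved to an IPv6 address
  host = 192.168.1.101         # hypothetical IPv4 address of hadoop-slave1
  port = 8649
}
```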
Make sure that your IP address is correctly resolved to a hostname, and vice versa.
Get cluster summary with gstat
If you managed to send metrics from slave2 to slave1, your cluster is working. In Ganglia's nomenclature, a cluster is a set of hosts that share the same cluster name attribute in the gmond.conf file, e.g. "hadoop-slaves". There is a useful tool provided by Ganglia called gstat that prints the list of hosts that are represented by the gmond running on a given node.
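For example, running it on the lead node:

```sh
# -a prints all hosts known to the local gmond (not just those running gexec)
gstat -a
```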
Check where gmetad polls metrics from
Run the following command on the host that runs gmetad to check which clusters and hosts it is polling metrics from (you may grep the output to display only the useful lines):
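gmetad answers with a full XML dump of its clusters and hosts on its interactive port (8651 by default), so a sketch like this works, assuming netcat is installed:

```sh
nc localhost 8651 | grep -E 'CLUSTER|HOST'
```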
Other issues
Other issues that I saw while using Ganglia are as follows:
- "There was an error collecting ganglia data (127.0.0.1:8652): fsockopen error: Connection refused" (fixed thanks to http://viewsby.wordpress.com/2013/03/12/ganglia-error-collecting-data-127-0-0-18652-fsockopen-error-connection-refused/)
- "PHP Fatal error: Allowed memory size of N bytes exhausted" (fixed thanks to http://www.mail-archive.com/ganglia-general@lists.sourceforge.net/msg02451.html)