Monitoring Hadoop Clusters: a one hour project!

May 11, 2017

Instana - Enterprise Observability and APM for Cloud-Native Applications

What if you could monitor your Hadoop cluster out of the box? APM companies have attempted this to varying degrees for some time now, but none has ever crossed the Rubicon. I’ve personally tried to set up Hadoop monitoring with past solutions many times, and every time it came up short. It was not for lack of trying; there are many reasons it never worked:

  • Hadoop implementations are all different;
  • There are many processes to monitor, each requiring configuration;
  • It’s difficult to choose which extensions to install and configure.

Where do you begin?

It was with trepidation, and a little pride, that I set out over the weekend to see if I could set up viable Hadoop monitoring with Instana.

In the Hortonworks Sandbox there are 21 relevant processes: Yarn, Spark, Tomcat, 15 other Java processes, MySQL, Postgres, and ZooKeeper. That does not even count the dynamic processes that start and stop during data processing.
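
To get a rough feel for how many of these processes are running on a sandbox host, a process listing can be tallied by command name. The following is a minimal sketch, not how Instana performs discovery. The function name `count_named` is hypothetical, and the usage line assumes `ps -e -o comm=` prints one command name per line, as it does on typical Linux systems.

```shell
#!/bin/sh
# count_named NAME: read one command name (or path) per line on stdin
# and print how many have a basename equal to NAME.
# count_named is a hypothetical helper, not part of Hadoop or Instana.
count_named() {
    awk -v name="$1" '
        {
            n = split($0, parts, "/")        # strip any leading directory
            if (n > 0 && parts[n] == name) count++
        }
        END { print count + 0 }              # print 0 when nothing matched
    '
}

# Example usage on a live host (output depends on the machine):
#   ps -e -o comm= | count_named java
```

Matching on the command name rather than the full argument string keeps the counting pipeline itself out of the tally.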

Installation of the Instana Agent was a snap using the one-liner installation from the documentation:

curl -o setup_agent.sh https://setup.instana.io/agent && chmod 700 ./setup_agent.sh && ./setup_agent.sh -a $yourAgentKey -l $location
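
Because the one-liner depends on two values being set, it can help to check them before running the installer. The sketch below is not from Instana's documentation. The function name `build_install_command` is hypothetical, and it uses only the `-a` and `-l` flags that appear in the one-liner above. It prints the command for review rather than executing it.

```shell
#!/bin/sh
# build_install_command KEY LOCATION: print the installer invocation from the
# one-liner above, or fail if either value is empty.
# The function name is hypothetical; only the -a and -l flags shown in the
# one-liner are used.
build_install_command() {
    agent_key=$1
    location=$2
    if [ -z "$agent_key" ] || [ -z "$location" ]; then
        echo "error: agent key and location must both be set" >&2
        return 1
    fi
    printf './setup_agent.sh -a %s -l %s\n' "$agent_key" "$location"
}

# Example usage: review the command before running it yourself.
#   build_install_command "$yourAgentKey" "$location"
```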

As I had hoped, the technology stack and services just showed up, and the appropriate technology sensors started doing their job automatically: continuously monitoring changes and relationships, collecting data, and determining health. Don’t forget there is an actual infrastructure underneath Hadoop. For this project the following sensors were relevant: Hadoop Yarn, ZooKeeper, MySQL, PostgreSQL, Tomcat, Java, Process, Host.

The following are interesting screenshots from the deployment:
Yarn Tracing and Mapping (with zero configuration!).

All the Hortonworks processes are discovered.

The Yarn Dashboard.

Having successfully installed the Agent into the Hortonworks Sandbox, and feeling emboldened, I decided to try the Cloudera Sandbox. I had similar success. More Java processes were discovered in Cloudera, but essentially the same level of discovery and monitoring occurred.

The best part (other than finally seeing something monitor Hadoop) was that I managed to complete all of this in under two hours on a rainy Saturday afternoon. That includes writing this article and taking all these fancy screenshots.

What can Instana do in your environment? Why not take a little time and find out?

Below are some Cloudera screenshots that hint at Instana’s capabilities:

Instana, an IBM company, provides an Enterprise Observability Platform with automated application monitoring capabilities to businesses operating complex, modern, cloud-native applications no matter where they reside – on-premises or in public and private clouds, including mobile devices or IBM Z.

Control hybrid modern applications with Instana’s AI-powered discovery of deep contextual dependencies inside hybrid applications. Instana also gives visibility into development pipelines to help enable closed-loop DevOps automation.

This provides the actionable feedback clients need as they optimize application performance, enable innovation and mitigate risk, helping Dev+Ops add value and efficiency to software delivery pipelines while meeting their service and business level objectives.

For further information, please visit instana.com.