How ElasticSearch and Kibana make the c|g BI IT solution “just work”

Let's start with an intro to what the c|g BI IT products are designed to do. We looked for a way to index several hundred thousand data points and give our own end users easy, real-time access to them. We needed real-time, agentless collection from many different types of sources (servers, routers, switches, tape robots, Arduino boards, etc.). Some data points are simple to track and trend, e.g. “How many tickets related to backup jobs were created?” Others are much more complex: “How many backup jobs are failing due to traffic timeouts in Dallas, TX during inclement weather on Saturdays?”
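To make the gap between those two kinds of question concrete, here is a minimal sketch of how each might be written as an Elasticsearch query body. The helper `build_filter_query` and every field name (`category`, `status`, `failure_reason`, `city`, `weather`, `day_of_week`) are hypothetical; they assume the events were indexed with those fields and say nothing about the actual c|g BI schema.

```python
import json

def build_filter_query(**terms):
    """Build a bool/filter query body from exact-match field values.

    Field names are illustrative assumptions, not a real schema.
    """
    # Each keyword argument becomes one `term` clause; all clauses must match.
    clauses = [{"term": {field: value}} for field, value in sorted(terms.items())]
    return {"query": {"bool": {"filter": clauses}}}

# Simple question: how many tickets relate to backup jobs?
simple_query = build_filter_query(category="backup")

# Complex question: failed backups, traffic timeout, Dallas, bad weather, Saturday.
complex_query = build_filter_query(
    category="backup",
    status="failed",
    failure_reason="traffic_timeout",
    city="Dallas",
    weather="inclement",
    day_of_week="Saturday",
)

if __name__ == "__main__":
    print(json.dumps(complex_query, indent=2))
```

The complex question is only the simple one with more filter clauses added, which is why a search index handles both with the same mechanism. It does assume that attributes such as weather and day of week were attached to each event at index time.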

This led us to investigate several leading commercial solutions: SAP BusinessObjects, IBM Cognos, Oracle OBIEE/Hyperion, and Microsoft BI, to name a few. For us, it came down to a combination of products that needed to be integrated and accessible on a global scale. The result of this effort is threefold.

  1. We have purpose-designed hardware tuned for the unique workloads of BI/EDW clusters.
  2. We have a custom-designed BI platform that actually gives us faster and more specific information than Splunk (an industry standard).
  3. We have installe

How BI tools like Splunk are crucial to any organization.

Tools like Splunk are a fantastic example of what real-time results can do for you. Splunk lets you drill down to the packet level to see what is happening, when, where, and by whom in your organization. In the age of big data, log management is becoming an absolute necessity, as developers, operations, and, yes, even senior management have to deal with and process huge amounts of machine-generated data. Many organizations have turned to Splunk, a pioneer in the space, to help manage the rising tide of log data – but Splunk can get really, really expensive, FAST.

While there is still no single, all-purpose alternative to Splunk that is as robust and stable, several tools can be combined to replicate much of its functionality. In fact, sysadmin Brad Lhotsky documented his quest to build his own central log management system using only open source software.

Of course, his blog entry contains much deeper technical insight, but at a high level he broke his solution down into three components: log centralization (rsyslog), log management (logstash/Kibana), and log visualization (Graphite).

Rsyslog was tapped for log centralization over the similarly popular syslog-ng because rsyslog offers guaranteed delivery and encrypted transfer in its open source edition, two features that Lhotsky says are becoming increasingly important to regulatory compliance auditors. With rsyslog, Lhotsky was able to build a reliable way to transport event logs from Unix hosts to a central repository.

This is where Lhotsky starts entering Splunk’s territory, calling the company “the 1,000 lb Gorilla in the room.” But in lieu of Splunk, Lhotsky writes that he took the MongoDB-powered Graylog2 for a test drive before settling on logstash. Graylog2 is great, he says, but suggests that its ElasticSearch indexing scheme is “broken,” and if you have to keep a large amount of logs around for compliance reasons, you’re going to take a performance hit. Lhotsky goes so far as to speculate that it’s because Graylog2 only implemented ElasticSearch for, well, search fairly late in the game.

On the other side of the coin, logstash also uses ElasticSearch, but with far more focus on scalability, inputs, filters, and outputs. The cost, Lhotsky writes, is the lack of a polished front-end. Enter Kibana, a PHP front-end for logstash that builds search and analysis on top of the ElasticSearch indexes, making the whole platform a lot more usable.
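The input, filter, and output model is easier to picture with a small example. The following is a rough Python sketch of the idea only; it is not logstash code (logstash is configured in its own configuration language), and the log line layout, the `LINE_PATTERN` regex, and the function names are all assumptions made for illustration.

```python
import json
import re

# Hypothetical layout: "<host> <program>: <message>"
LINE_PATTERN = re.compile(r"^(?P<host>\S+) (?P<program>[^:\s]+): (?P<message>.*)$")

def read_input(lines):
    """Input stage: yield non-empty raw lines with trailing whitespace removed."""
    for line in lines:
        line = line.rstrip()
        if line:
            yield line

def parse_filter(raw_lines):
    """Filter stage: turn each raw line into a structured event dict."""
    for raw in raw_lines:
        match = LINE_PATTERN.match(raw)
        if match:
            yield match.groupdict()
        else:
            # Keep unparseable lines, tagged, rather than dropping them silently.
            yield {"message": raw, "tags": ["parse_failure"]}

def write_output(events):
    """Output stage: serialize each event as one JSON document."""
    return [json.dumps(event, sort_keys=True) for event in events]

sample = [
    "backup01 backupd: job 42 failed: traffic timeout\n",
    "not a structured line\n",
]
documents = write_output(parse_filter(read_input(sample)))

if __name__ == "__main__":
    for doc in documents:
        print(doc)
```

Each JSON document stands in for what would be indexed into ElasticSearch. Once a line has been split into named fields, a front-end like Kibana can search on those fields instead of on raw text.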

“Kibana fills the gap with the Logstash interface so perfectly. It doesn’t give me everything I’d get with Splunk, but I’ve just touched the functionality I can extract with Logstash,” as Lhotsky puts it.

Finally, he suggests the popular Graphite for data visualization and graphing all the log data you’ve now collected.
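As a rough illustration of how log data might end up in Graphite, the sketch below counts events per host and formats the totals in Graphite's plaintext protocol, which is one line of metric path, value, and Unix timestamp per data point. The metric prefix `logs.failed_jobs`, the event layout, and the function name are assumptions for the example, not anything taken from Lhotsky's setup.

```python
from collections import Counter

def graphite_lines(events, timestamp, prefix="logs.failed_jobs"):
    """Count events per host and format them as Graphite plaintext lines.

    Each line has the form "<metric.path> <value> <unix_timestamp>\n".
    """
    counts = Counter(event["host"] for event in events)
    return [
        f"{prefix}.{host} {count} {timestamp}\n"
        for host, count in sorted(counts.items())
    ]

# Hypothetical parsed events; only the host field matters here.
sample_events = [
    {"host": "backup01"},
    {"host": "backup02"},
    {"host": "backup01"},
]
lines = graphite_lines(sample_events, 1357395000)

if __name__ == "__main__":
    print("".join(lines), end="")
```

In a real deployment these lines would be written to a socket connected to the Carbon listener. Host names that contain dots would need escaping first, since Graphite treats a dot as a path separator.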

As Lhotsky says, this is just how he tried to match Splunk-like functionality with open source tools, and it’s still a work in progress.

Below is a video from the CEO of Splunk explaining why his product is unchallenged in the space, and just what exactly Splunk is.

“Why Splunk?”

Godfrey Sullivan, Chairman and CEO of Splunk, gives you the essential overview of Splunk. Your machine data contains a definitive record of all user transactions, customer behavior, machine behavior, security threats, system health, fraudulent activity and more. Splunk can help you take this machine data and make business sense of it. We call this operational intelligence. Learn how Splunk can help turn silos of machine data into actionable insights for IT and the business.