Tools like Splunk are a fantastic example of what real-time results can do for you. They let you drill down to the packet level of what is happening in your organization: when, where and by whom. In the age of big data, log management is becoming an absolute necessity, as developers, operations, and, yes, even senior management have to deal with and process huge amounts of machine-generated data. Many organizations have turned to Splunk, a pioneer in the space, to help manage the rising tide of log data, but Splunk can get really, really expensive, FAST.
While there is still no single, all-purpose alternative to Splunk that is as robust and stable, several tools can be combined to replicate much of its functionality. In fact, Booking.com SysAdmin Brad Lhotsky documented his quest to build his own central log management system using only open source software.
Of course, his blog entry contains much deeper technical insight, but at a high level, he broke his solution down into three components: log centralization (rsyslog), log management (logstash/Kibana) and log visualization (Graphite).
Rsyslog was tapped for log centralization over the similarly popular alternative syslog-ng because the former offers guaranteed delivery and encrypted transfer in its open source edition, two features that Lhotsky says are increasingly important to regulatory compliance auditors. With rsyslog, Lhotsky was able to build a reliable way to transport event logs from Unix hosts to a central repository.
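To make "event logs from Unix hosts" a little more concrete, here is a minimal Python sketch of what a classic BSD-style syslog line looks like on the wire, assuming the traditional RFC 3164 layout. It is not taken from Lhotsky's setup; the function names, the host `web01` and the trimmed-down facility table are illustrative assumptions, and rsyslog itself can be configured to use other formats.

```python
from datetime import datetime

# Subset of the standard syslog facility and severity codes (illustrative, not exhaustive).
FACILITIES = {"kern": 0, "user": 1, "daemon": 3, "auth": 4, "local0": 16}
SEVERITIES = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
              "warning": 4, "notice": 5, "info": 6, "debug": 7}
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def syslog_priority(facility, severity):
    """PRI value: facility code times 8, plus severity code."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def format_syslog_line(facility, severity, host, tag, message, when):
    """Build a hypothetical RFC 3164-style line: <PRI>Mmm dd hh:mm:ss host tag: msg."""
    # The day of month is space-padded to two characters in this format.
    stamp = f"{MONTHS[when.month - 1]} {when.day:>2} {when:%H:%M:%S}"
    return f"<{syslog_priority(facility, severity)}>{stamp} {host} {tag}: {message}"


if __name__ == "__main__":
    print(format_syslog_line("auth", "err", "web01", "sshd",
                             "Failed password", datetime(2012, 10, 5, 14, 3, 9)))
```

A central collector receives lines like this from every host, which is why delivery guarantees and encryption in the transport layer matter so much.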
Log management is where Lhotsky starts entering Splunk’s territory, calling the company “the 1,000 lb Gorilla in the room.” But in lieu of Splunk, Lhotsky writes that he took the MongoDB-powered Graylog2 for a test drive before settling on logstash. Graylog2 is great, he says, but he suggests that its ElasticSearch indexing scheme is “broken,” and if you have to keep a large amount of logs around for compliance reasons, you’re going to take a performance hit. Lhotsky goes so far as to speculate that this is because Graylog2 only implemented ElasticSearch for, well, search fairly late in the game.
On the other side of the coin, logstash also uses ElasticSearch, but with far more of a focus on scalability, inputs, filters and outputs. The cost, Lhotsky writes, is the lack of a polished front-end. Enter Kibana, a PHP front-end for logstash that sits on top of the ElasticSearch indexes and provides an interface for search and analysis, making the whole platform a lot more usable.
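The input, filter and output stages are easier to picture with a toy example. The sketch below is plain Python, not logstash configuration and not how logstash is implemented; the regular expression, function names and the `parse_failure` tag are assumptions made for illustration. It reads raw syslog-style lines, parses them into structured fields, and emits JSON documents of the kind a search index could store.

```python
import json
import re

# Assumed pattern for a BSD-style syslog line: timestamp, host, program[pid]: message.
SYSLOG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<program>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)$"
)


def read_input(lines):
    """Input stage: wrap each non-empty raw line in an event dict."""
    for line in lines:
        line = line.rstrip("\n")
        if line:
            yield {"raw": line}


def apply_filter(events):
    """Filter stage: extract structured fields, or tag the event as unparsed."""
    for event in events:
        match = SYSLOG_RE.match(event["raw"])
        if match:
            fields = {k: v for k, v in match.groupdict().items() if v is not None}
            event.update(fields)
        else:
            event["tags"] = ["parse_failure"]
        yield event


def write_output(events):
    """Output stage: serialize each event as one JSON document."""
    return [json.dumps(event, sort_keys=True) for event in events]


def run_pipeline(lines):
    return write_output(apply_filter(read_input(lines)))


if __name__ == "__main__":
    sample = ["Oct  5 14:03:09 web01 sshd[2211]: Failed password for root"]
    for document in run_pipeline(sample):
        print(document)
```

Keeping the three stages separate is what makes this style of pipeline easy to extend: a new log source or destination only touches one stage.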
“Kibana fills the gap with the Logstash interface so perfectly. It doesn’t give me everything I’d get with Splunk, but I’ve just touched the functionality I can extract with Logstash,” as Lhotsky puts it.
Finally, he suggests the popular Graphite for data visualization and graphing all the log data you’ve now collected.
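As a rough illustration of how log data can end up in Graphite, the sketch below counts events per program and formats the counts for Carbon's plaintext protocol, which accepts lines of the form "metric.path value timestamp" (port 2003 by default). The metric prefix `logs.events`, the event dictionaries and the function names are assumptions for this example, not part of Lhotsky's setup.

```python
import re
import socket
from collections import Counter


def count_by_program(events):
    """Count events per program name, skipping events without one."""
    return Counter(e["program"] for e in events if "program" in e)


def to_graphite_lines(counts, timestamp, prefix="logs.events"):
    """Format counts as plaintext-protocol lines: '<path> <value> <timestamp>'."""
    lines = []
    for name, value in sorted(counts.items()):
        # Dots separate path components in Graphite, so replace unusual characters.
        safe_name = re.sub(r"[^A-Za-z0-9_-]", "_", name)
        lines.append(f"{prefix}.{safe_name} {value} {int(timestamp)}")
    return lines


def send_to_graphite(lines, host="localhost", port=2003):
    """Send newline-terminated metric lines to a Carbon plaintext listener."""
    payload = "".join(line + "\n" for line in lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)


if __name__ == "__main__":
    events = [{"program": "sshd"}, {"program": "sshd"}, {"program": "cron"}]
    for line in to_graphite_lines(count_by_program(events), 1349445789):
        print(line)
```

Calling `send_to_graphite` requires a reachable Carbon instance, so the example above only prints the lines it would send.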
As Lhotsky says, this is just how he tried to match Splunk-like functionality with open source tools, and it’s still a work in progress.
Below is a video from the CEO of Splunk explaining why his product is unchallenged in the space, and just what exactly Splunk is.
Godfrey Sullivan, Chairman and CEO of Splunk, gives you the essential overview of Splunk. Your machine data contains a definitive record of all user transactions, customer behavior, machine behavior, security threats, system health, fraudulent activity and more. Splunk can help you take this machine data and make business sense of it. We call this operational intelligence. Learn how Splunk can help turn silos of machine data into actionable insights for IT and the business.