StackState Blog

Smart IT monitoring and root cause analysis need big data

Posted by Mark Bakker on Jun 26, 2015 12:30:00 PM

Our big data engine is ready. We are on track to change the way IT departments work and manage their IT operations.

 

To understand what happens in an IT stack, we need good in-depth data. And as we all know, we gather this data through all kinds of systems. We monitor, measure and analyze how software applications perform, which new deployments we did, which changes we made to our architecture, and which issues we have and are trying to solve. All these pieces are part of one big jigsaw puzzle. When you add business processes, services and infrastructure components, including their dependencies and states, you get what we call Full Stack Chain Monitoring. A single, unified real-time overview is valuable because it shows DevOps teams, architects and IT service managers how healthy their part of the stack is. It is also a great tool for root cause analysis, since it immediately shows where failures or service interruptions originate.

The next step is storing all this knowledge in a big database for IT operations analytics.


Big data for a proactive approach

But what if we could store all this real-time information as big data and use it to make monitoring and root cause analysis much smarter? We could use the live IT stack overview as a time machine to go back and see what our infrastructure looked like a month ago. Or discover where, deep in the stack, a small change or failure 9 hours ago “infected” another component and set off a domino effect through the stack that finally hit one of our core services. Or detect anomalies early, predict problems and make repairs before critical IT service failures occur. The result would be proactive root cause analysis or even self-healing mechanisms.
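To make the time machine idea a bit more concrete, here is a minimal Python sketch. It is only an illustration and says nothing about how StackGraph works internally. The class `TimeTravelGraph`, its methods, the component names and the state labels "ok" and "failed" are all hypothetical. The sketch keeps a timestamped history of component states and dependencies, so you can ask what the stack looked like at any moment and follow failing dependencies down to where a failure started.

```python
from dataclasses import dataclass, field


@dataclass
class TimeTravelGraph:
    """Hypothetical toy model: a dependency graph that remembers its history."""

    # component -> list of (timestamp, state)
    states: dict = field(default_factory=dict)
    # (component, dependency) -> list of (timestamp, present)
    edges: dict = field(default_factory=dict)

    def set_state(self, t, component, state):
        self.states.setdefault(component, []).append((t, state))

    def set_dependency(self, t, component, dependency, present=True):
        self.edges.setdefault((component, dependency), []).append((t, present))

    @staticmethod
    def _latest(history, t, default):
        # Most recent recorded value at or before time t.
        candidates = [(ts, value) for ts, value in history if ts <= t]
        if not candidates:
            return default
        return max(candidates, key=lambda pair: pair[0])[1]

    def state_at(self, t, component):
        return self._latest(self.states.get(component, []), t, "unknown")

    def dependencies_at(self, t, component):
        return sorted(
            dep
            for (comp, dep), history in self.edges.items()
            if comp == component and self._latest(history, t, False)
        )

    def root_causes_at(self, t, component):
        """Failing components below `component` that have no failing dependency."""
        roots, seen, stack = [], set(), [component]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if self.state_at(t, current) != "failed":
                continue
            failing = [
                dep
                for dep in self.dependencies_at(t, current)
                if self.state_at(t, dep) == "failed"
            ]
            if failing:
                stack.extend(failing)  # keep following the domino path
            else:
                roots.append(current)  # nothing below it is failing
        return sorted(roots)


# Made-up example stack: webshop -> payment-api -> database
graph = TimeTravelGraph()
graph.set_dependency(0, "webshop", "payment-api")
graph.set_dependency(0, "payment-api", "database")
for name in ("webshop", "payment-api", "database"):
    graph.set_state(0, name, "ok")

# The domino effect: the database fails first, then everything above it.
graph.set_state(5, "database", "failed")
graph.set_state(6, "payment-api", "failed")
graph.set_state(7, "webshop", "failed")

print(graph.state_at(3, "database"))       # ok
print(graph.root_causes_at(7, "webshop"))  # ['database']
```

Asking the same question for time 3 returns an empty list, because nothing had failed yet. A real engine would of course need far more than this, such as efficient storage and queries over very large histories.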

A huge step forward needed new technology

Taking StackState from real-time only to a combination of real-time and full history called for specific big data storage and retrieval capabilities that we couldn’t find on the market. Sometimes great ideas need newer or better technologies. Over the last 6 months we worked hard to create a new big data engine called StackGraph to fulfill our needs. We have just embedded it in StackState, and we believe it will change how IT departments do root cause analysis and how they manage and control their whole IT stack to improve their service levels.

Open source graph database

Now, before you start asking me all kinds of technical graph database questions (I am not an engineer), I have good news for you. We are planning to release StackGraph as an open source project to share these great capabilities with the world. So be patient and stay tuned.

 

You can also sign up for our StackState early access program and be the first to download StackState.


Topics: Dev/Ops
