It probably sounds familiar: you detect an issue with the end-user experience, perhaps a performance degradation or an outage. Once you’ve discovered the root cause of the problem, it’s not difficult to fix. The real pain point is the time between those two events. Often it is left to the user to find the root cause by assembling teams, processes, and tools, and by piecing together siloed data fragments. Every minute spent looking for the root cause costs your company money and, even worse, damages its reputation. This post covers which evidence you need to collect to accelerate your root cause analysis process.
Mark Arts
Recent Posts
Collecting Critical Evidence for Faster Root Cause Analysis
Topics: DevOps, Root Cause Analysis, AIOps
In my job, I meet a lot of customers who have invested heavily in a data lake. And for good reason, of course. As environments grow more complex and the number of tools in use keeps rising, there’s a tremendous need to have all the data in one place.
Topics: Monitoring, Algorithmic IT Operations, DevOps
Changes in applications or IT infrastructure can lead to application downtime. This not only hits your revenue but also damages your reputation.
Everybody in IT understands the importance of having the right monitoring solutions in place. From an infrastructure perspective to a business perspective, we rely on monitoring tools to get us the right information.
But I wonder: why don’t these tools monitor application updates and infrastructure changes, especially in today’s world, where DevOps and agile working are pushing changes to apps and infrastructure all the time? Manual configuration errors can cost companies up to $72,000 per hour in web application downtime. Application maintenance costs are increasing at a rate of 20% annually, and 35% of those polled said that at least one-quarter of their downtime was caused by configuration errors.
Topics: Dev/Ops, ITSM, Monitoring