Think about the following questions before reading this article: (1) Have you ever encountered an outage in a business chain that left someone who depends on that chain unable to do their job for hours? (2) Does anyone have insight into all the dependencies between systems? (3) Is the status of each system available to everyone? (4) Do departments announce scheduled releases to the departments that depend on them?
After years of experience in the IT Ops space, we have encountered many real-life examples of IT Ops teams and their technology failing to cooperate efficiently when it is needed most. Teams focus on their own department and have little visibility into what is happening beyond their area of responsibility. When data or information is not shared across teams and tools, a department becomes a black box. These so-called silos prevent IT Ops teams from quickly finding the root cause of issues, which results in a higher MTTR. StackState's AIOps platform can help break down these silos by consolidating information across teams and tools. Let us discuss some real-life examples.
1. The mortgage application
A large bank has a mortgage department that specializes in mortgage loans. New mortgage applications are received by mail, fax or other channels. The applications pass through several systems until agents make the final check and decide whether a mortgage is granted or rejected. At each step of the chain, applications flow through mailboxes, message queues or file transfers. In this particular example, the agents were expecting mortgage applications to process but did not receive them in a timely manner. After several hours, the agents started to ask questions. In response, several teams were asked what could be wrong, and these siloed teams each tried to debug the same issue. The cause turned out to be a message queue with problems that made the mortgage application pipeline stagnate. The single team responsible for that message queue had seen the alert but underestimated its severity. StackState could have significantly reduced the number of teams involved by providing insight into the entire chain and performing fast root cause analysis to pinpoint the issue.
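To make the idea of chain-wide root cause analysis concrete, here is a minimal sketch in Python. It is not StackState's API or algorithm. The component names, the `DEPENDS_ON` map, the `HEALTHY` map and the `probable_root_causes` function are all assumptions made up for illustration. The sketch walks the dependency chain from the component where the symptom shows up and reports the unhealthy components that have no unhealthy dependencies of their own.

```python
from typing import Dict, List

# Hypothetical topology: each component maps to the components it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "agent_workbench": ["document_store"],
    "document_store": ["message_queue"],
    "message_queue": ["intake_mailbox"],
    "intake_mailbox": [],
}

# Hypothetical health states, as a monitoring tool might report them.
HEALTHY: Dict[str, bool] = {
    "agent_workbench": False,  # agents receive no applications
    "document_store": False,   # starved because nothing arrives
    "message_queue": False,    # the actual problem
    "intake_mailbox": True,
}


def probable_root_causes(
    symptom: str,
    depends_on: Dict[str, List[str]],
    healthy: Dict[str, bool],
) -> List[str]:
    """Return unhealthy components reachable from `symptom` whose own
    dependencies are all healthy. Unknown components count as healthy."""
    causes: List[str] = []
    seen = set()
    stack = [symptom]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        dependencies = depends_on.get(current, [])
        unhealthy_deps = [d for d in dependencies if not healthy.get(d, True)]
        if not healthy.get(current, True) and not unhealthy_deps:
            causes.append(current)
        stack.extend(dependencies)
    return sorted(causes)


if __name__ == "__main__":
    print(probable_root_causes("agent_workbench", DEPENDS_ON, HEALTHY))
```

In this made-up chain, only the team that owns the message queue would need to be contacted, instead of every team along the pipeline.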
2. Unannounced software deployment
The second example is an unannounced software deployment that impacted business operations. Deploying software upgrades is common within a business chain: software is tested and a process is in place to ensure smooth deployments. The business chain in this example had all the necessary steps in place. One department scheduled an upgrade to its systems, as its process dictates. The release was a success but was not announced to other teams. A day later, a system in another department started reporting problems. The affected team started to investigate but could not immediately find the root cause and began asking around. Through hearsay, the team found out that a deployment had happened the day before in a system it depends on. When asked, the team that had deployed the software recognized the issue and started working on a solution.
The release contained changes to a message format, and the deploying team was unaware that other teams were using that format. With StackState, the teams would have had insight into the dependencies between components, so the deploying team could have analyzed the impact of its deployment beforehand. In addition, the affected team could have correlated the issue with the release, because StackState is able to show version changes.
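The impact analysis described above can be sketched as a walk over reversed dependencies. The following Python example is an illustration under stated assumptions, not StackState functionality: the service names, the `DEPENDS_ON` map and the `impacted_by` function are hypothetical. Given the component that is about to change, it lists every component that depends on it, directly or indirectly.

```python
from collections import deque
from typing import Dict, List

# Hypothetical topology: each component maps to the components it depends on.
DEPENDS_ON: Dict[str, List[str]] = {
    "message_publisher": [],
    "order_service": ["message_publisher"],
    "billing_service": ["order_service"],
    "reporting_service": ["billing_service"],
    "hr_portal": [],
}


def impacted_by(changed: str, depends_on: Dict[str, List[str]]) -> List[str]:
    """Return all components that directly or indirectly depend on `changed`,
    in breadth-first order (closest consumers first)."""
    # Invert the map: provider -> list of consumers.
    consumers: Dict[str, List[str]] = {}
    for consumer, providers in depends_on.items():
        for provider in providers:
            consumers.setdefault(provider, []).append(consumer)

    impacted: List[str] = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for consumer in consumers.get(current, []):
            if consumer not in seen:
                seen.add(consumer)
                impacted.append(consumer)
                queue.append(consumer)
    return impacted


if __name__ == "__main__":
    # Teams owning these components should hear about the release up front.
    print(impacted_by("message_publisher", DEPENDS_ON))
```

A list like this, produced before the release, would tell the deploying team whom to notify about a changed message format.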
3. Dependency hell
Finally, a company has adopted microservices for one of its business chains. The microservices depend heavily on each other to perform business-critical operations. A team needed to upgrade a couple of microservices and was dealing with dependency hell: it was not clear which microservices would be impacted, nor whether the upgrade would be compatible with the microservices that depend on them. Version information was only partly available to the team, and it was unknown whether that information was current. Before it could deploy, the team had to ask other departments which versions their microservices were actually running, which compatibility issues might arise, and what the health status of their microservices was. StackState could have provided fast insight into the dependencies between microservices and shown their current version information. In addition, it is possible to integrate StackState into a CI/CD pipeline and restrict deployments to healthy environments only.
Accelerate your IT Operations by breaking down silos across teams and tools
As seen in the examples above, siloed teams and tools do not contribute to an agile, fast and impact-free IT environment. Especially with the increasingly dynamic nature of modern IT environments, the need for next-gen AIOps platforms that deliver consolidated insights across teams and tools becomes more and more important. StackState can help you reduce the time to detect and analyze an issue by up to 80%. AIOps is the future and the way to go. Want to know the maturity of your current IT Monitoring landscape? And how to improve it? Please download our whitepaper here.
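A deployment gate of the kind mentioned above could look roughly like the following Python sketch. It does not use StackState's actual integration. The `ServiceState` record, the `deployment_blockers` and `gate_exit_code` functions, the service names and the "major version must match" compatibility rule are all assumptions chosen for illustration. In a real pipeline the states would come from a monitoring or topology source rather than from hard-coded values.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ServiceState:
    """Hypothetical snapshot of a microservice the deployment depends on."""
    name: str
    version: str   # assumed to look like "MAJOR.MINOR.PATCH"
    healthy: bool


def deployment_blockers(
    states: List[ServiceState],
    required_major: Dict[str, int],
) -> List[str]:
    """Return human-readable reasons not to deploy; an empty list means go."""
    blockers: List[str] = []
    for state in states:
        if not state.healthy:
            blockers.append(f"{state.name} is unhealthy")
        expected = required_major.get(state.name)
        major = int(state.version.split(".")[0])
        if expected is not None and major != expected:
            blockers.append(
                f"{state.name} runs {state.version}, "
                f"expected major version {expected}"
            )
    return blockers


def gate_exit_code(blockers: List[str]) -> int:
    """Map blockers to a process exit code a CI/CD step could return."""
    return 1 if blockers else 0


if __name__ == "__main__":
    states = [
        ServiceState("pricing", "2.4.1", healthy=True),
        ServiceState("inventory", "1.9.0", healthy=False),
    ]
    for reason in deployment_blockers(states, {"pricing": 2, "inventory": 2}):
        print("blocked:", reason)
```

A pipeline step could pass the result of `gate_exit_code` to `sys.exit`, so that a non-zero code stops the deployment until the environment is healthy and compatible.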