What will the future be for monitoring machine data in 2019?

Why the business relationship with data will go from "more, more, more" to "it's complicated"

This is a contributed article by Colin Fernandes, EMEA Product Marketing Director, Sumo Logic


Data has always been important to business applications and underlying infrastructure workload components. Today, we have more data than ever from more applications, more infrastructure and more services that we can leverage to improve and grow our business - if we use this data correctly. To use this data, we first need to be able to visualize it, analyze it and understand it. The ability to see what is taking place across IT is essential, and the demand for this is growing. According to a recent study from 451 Research, more enterprises are looking to gain real-time insight into their data, with 48 percent of companies with over 10,000 employees in Europe using real-time dashboards.

So what will this mean for us all in 2019?

There's a shift taking place from traditional architectures to agile, microservices-based technologies, where applications are made up of smaller units that interact via APIs rather than being monolithic applications in one place. At the same time, more of our IT is hosted in the cloud, running in containers or moving straight through to serverless platforms like AWS Lambda. The old quote attributed to Peter Drucker - "you can't manage what you can't measure" - should therefore ring truer than ever.


Monitoring new infrastructures

One of the main trends in monitoring will therefore be IT teams meeting the limits of what they can currently see. Traditional IT infrastructure monitoring tools are - of course - built to look at traditional IT networks, applications and servers. There are more and more modern options available for developers to build applications today - from simply hosting on public cloud services through to leveraging emerging technologies like serverless and containers. Monitoring will have to evolve to keep up.

The challenge here with these new technologies is that they don't work in the same way as those traditional environments. They can be set up to run either as permanent infrastructure or on-demand resources. They can live for months, days or even just hours, based on demand levels. This means that it can be hard to gain an accurate level of visibility into all of the data being created from these digital services over time.

All these options can come under the banner of "cloud native" - that is, applications that are built with cloud in mind. Monitoring has to take the same approach and run in a cloud native fashion.


More data needs more forethought

All these individual services create their own data on how they are running over time. With so many smaller components all generating data, monitoring can become a harder and more time-consuming job - the sheer volume of log and metric data can make it difficult to see the wood for the trees. We therefore need to think smarter about how we can efficiently and effectively monitor, analyze and use log and metric data. At the same time, the number of ways that we can use this data is going up across development, security, operations and the business.

There are two elements to this - automation and correlation. Automation refers to how your system can automate the processing of all this data so that your tools can surface the most important data. Rather than looking for the proverbial needle in a haystack yourself, you should get a metal detector - if and when something goes wrong, automation should help you see where the real problem is. By helping you focus on the real issues, automation can help IT teams prioritize where they put their efforts to have the most impact on overall application performance. At the same time, this should free up time for the teams to focus on more innovative and strategic projects.
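As a rough illustration of the "metal detector" idea, the sketch below flags metric samples that sit far from the rest of the series, using a median-based outlier score. It is a minimal example under stated assumptions, not how any particular monitoring product works: the function name surface_anomalies, the sample latency values and the 3.5 threshold are all hypothetical choices made for illustration.

```python
from statistics import median

def surface_anomalies(samples, threshold=3.5):
    """Return (index, value) pairs that look like outliers.

    Uses a modified z-score based on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than a mean-based score.
    The threshold is an assumed default for this sketch.
    """
    if not samples:
        return []
    med = median(samples)
    mad = median([abs(v - med) for v in samples])
    if mad == 0:
        # At least half the samples equal the median; nothing to score against.
        return []
    return [
        (i, v)
        for i, v in enumerate(samples)
        if 0.6745 * abs(v - med) / mad > threshold
    ]

# Hypothetical request latencies in milliseconds, with one obvious spike.
latencies_ms = [102, 98, 101, 99, 100, 103, 97, 100, 950, 101, 99, 102]
flagged = surface_anomalies(latencies_ms)
print(flagged)
```

Here only the 950 ms sample is surfaced, so a team would look at one data point rather than scanning all twelve.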

Correlation also relies on automation - this involves linking different streams of data from application elements together to provide a better picture of what is going on across the full application stack. For developers running container-based applications, this can show when more containers are added in response to demand levels and the effects on wider application performance. This can also indicate where additional bottlenecks are being found that have not been anticipated in the original application design.
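To make the container example concrete, the sketch below lines up two separate data streams by timestamp: container scaling events and latency samples. For each scaling event it compares the average latency just before and just after. Everything here is an assumption for illustration, including the function name latency_around_events, the tuple layouts, the 60-second window and the sample values; real tooling would pull these streams from logs and metrics stores.

```python
from statistics import mean

def latency_around_events(scale_events, latency_samples, window_s=60):
    """Compare mean latency before and after each scaling event.

    scale_events:    list of (timestamp_s, container_count) tuples
    latency_samples: list of (timestamp_s, latency_ms) tuples
    window_s:        how many seconds either side of the event to consider
    """
    report = []
    for event_ts, container_count in scale_events:
        before = [ms for ts, ms in latency_samples
                  if event_ts - window_s <= ts < event_ts]
        after = [ms for ts, ms in latency_samples
                 if event_ts <= ts < event_ts + window_s]
        report.append({
            "event_ts": event_ts,
            "containers": container_count,
            "mean_before_ms": mean(before) if before else None,
            "mean_after_ms": mean(after) if after else None,
        })
    return report

# Hypothetical data: at t=120s the application scaled out to 4 containers.
scale_events = [(120, 4)]
latency_samples = [(70, 300), (90, 340), (110, 380),
                   (130, 200), (150, 180), (170, 160)]

report = latency_around_events(scale_events, latency_samples)
for row in report:
    print(row)
```

If latency did not drop after containers were added, that would be a hint that the bottleneck sits elsewhere in the stack.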

The ability to correlate machine data sets can also be important for alerting technical teams when applications stop working. Traditionally, the split between different teams covering applications, hardware, networking and IT operations could lead to finger pointing and blame when an application failed. Today, the growth of cloud means that there are more points for potential conflict both inside and outside the business, and even less time to troubleshoot and remediate those application issues.

This is a major challenge because, in today's on-demand digital world, any application failure is immediately more apparent to customers, who want a seamless and uninterrupted experience. Gaining accurate insight into any outages to quickly identify where the issue occurred - and who is responsible for resolving it - is therefore going to be even more important in the future. With this valuable data gathered in one place, it is easier to shift the culture towards one that is supportive and promotes cross-functional collaboration. This puts the emphasis on keeping services running as intended and without impacting the overall customer experience.


What's next?

For the next 12 months, it's important that we recognize that we have more machine data available to us than we can ever hope to understand manually. The phrase "analysis paralysis" is as relevant as ever. We can automate our approaches to understand what is really critical to performance over time. However, we also have to work on how our company cultures respond to that availability of data.

Without the proper tools in place to monitor data, it will be difficult to continue to innovate and move the business forward. Tools alone are not enough, though: this also requires a culture of collaboration - as seen in DevOps or DevSecOps - where teams understand the value of machine data as well as how to collaborate and use it collectively to benefit the customer, even though each team's own goals are more specific.

The ability to accurately and effectively use this monitoring data across the business should help teams build, run and secure their modern applications and cloud infrastructure and, more importantly, deliver value to the customer. It can also hold up a mirror to organizations, showing us how we are performing as a whole and allowing us to implement the right goals, culture and support structures in 2019 as we continue on this digital transformation journey.