AI In Code Series: Dynatrace - From observability to understandability to explainability

We all use Artificial Intelligence (AI) almost every day, often without even realising it: many of the apps and online services we connect with have a degree of Machine Learning (ML) and AI in them, providing predictive intelligence, autonomous internal controls and smart data analytics designed to make the end user's experience of the User Interface (UI) more fluid and intuitive.

That’s great. We’re glad the users are happy and getting some AI-goodness. But what about the developers?

What has AI ever done for the programming toolsets and coding environments that developers use every day? How can we expect developers to develop AI-enriched applications if they don’t have the AI advantage at hand at the command line, inside their Integrated Development Environments (IDEs) and across the Software Development Kits (SDKs) that they use on a daily basis?

What can AI do for code logic, function direction, query structure and even for basic read/write functions… what tools are in development? In this age of components, microservices and API connectivity, how should AI work inside coding tools to direct programmers to more efficient streams of development so that they don’t have to ‘reinvent the wheel’ every time?

This Computer Weekly Developer Network series features a set of guest authors who will examine this subject — this post comes from Alois Reitbauer, VP and chief technology strategist at Dynatrace.

Dynatrace is a software intelligence company whose roots are in Application Performance Monitoring (APM)… its application monitoring and testing tools are available as cloud-based SaaS services, or as on-premises software.

Reitbauer writes as follows…

When we look at the applications of AI for developers, we are more specifically looking at algorithms and methodologies that support development tasks. Modern developers are increasingly confronted with the challenge of understanding the dynamics of when and how their code is used, and of solving complex issues when it is not working. To understand why this topic has come to the fore, let's step back and take a look at what is happening at the software engineering coalface.

There are two trends that have led to increased pressure on developers, which validate (and indeed promote) the need for AI in programming tools.

First, DevOps is pushing more operational responsibility onto developers, with an extreme 'you build it, you run it' approach. Second, microservices and dynamic cloud environments that adjust to load or infrastructure issues require developers to understand systems that are 1,000 times more complicated than they used to be. Traditionally, applications consisted of three layers – front end, back end and database – and were deployed four times a year. Modern applications, however, can consist of hundreds of services, which are deployed as often as several times a day.

The cognitive load needed to understand what’s going on in a system is therefore constantly increasing. The industry shift towards observability aims to address this need by providing even more data to help developers understand these complex systems. While this is important and necessary, it leads to developers needing to work with even more data.

From observability to understandability

This leads to a new trend towards understandability, meaning that we need to understand not only the state a system is or was in, but also what led to this state… and potentially how we can get back to a desired state in case of problems.

Dynatrace’s Reitbauer: Data analysis at its core can profit from AI and automation.

The difference between observability and understandability is a thorough understanding of data, which requires a lot of data analysis and interpretation. This is usually done by developers: they look at the data, create a hypothesis, validate it against the data and repeat this process until they've found a likely solution. However, this process takes time – and time is exactly what people don't have when a production application is not working.
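To make that loop concrete, here is a minimal sketch in Python – with invented metric names and thresholds, purely for illustration – that encodes a few candidate hypotheses as tests against recent telemetry and keeps only those the data supports, which is the same cycle a developer otherwise runs by hand:

recent_metrics = {
    "checkout.error_rate": 0.14,       # fraction of failed requests
    "checkout.p99_latency_ms": 420,    # tail latency
    "db.connection_pool_used": 0.97,   # fraction of pool in use
}

# Each hypothesis pairs a human-readable claim with a test against the data.
hypotheses = [
    ("Deployment introduced a latency regression",
     lambda m: m["checkout.p99_latency_ms"] > 2000),
    ("Database connection pool is exhausted",
     lambda m: m["db.connection_pool_used"] > 0.95),
    ("Upstream dependency is failing most requests",
     lambda m: m["checkout.error_rate"] > 0.5),
]

# Validate every hypothesis against the data; keep the ones that survive.
surviving = [claim for claim, test in hypotheses if test(recent_metrics)]
print(surviving)  # -> ['Database connection pool is exhausted']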

Thankfully, this domain lends itself very well to applying AI techniques. First of all, it is a closed domain that can be well understood, and the environment – an IT system – can be modelled very well. Secondly, the environment and its state are presented in formats that machine learning algorithms handle well – namely numbers as time series data, and structured data models.
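As a simple illustration of why time series data suits machine analysis, the sketch below (standard-library Python, with made-up response-time values) learns a baseline from history and flags points that deviate by more than three standard deviations – a primitive stand-in for the far more sophisticated models a real product would use:

import statistics

# Response-time samples in milliseconds; the values are invented.
series = [102, 98, 105, 99, 101, 103, 97, 100, 240, 251]

baseline = series[:8]                  # learn "normal" from history
mean = statistics.mean(baseline)
stdev = statistics.stdev(baseline)

# Flag any point more than three standard deviations from the baseline mean.
anomalies = [(i, v) for i, v in enumerate(series)
             if abs(v - mean) > 3 * stdev]
print(anomalies)  # -> [(8, 240), (9, 251)]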

Benefits of AI in development

Combining research in semantic (graph) data models with data analysis and machine learning approaches provides a very powerful system that can take away a large part of the analysis that humans would have to do manually. Having the analysis results available immediately saves a massive amount of time and also lets developers focus on solving problems, rather than trying to understand what they are. 
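A hypothetical sketch of how a semantic graph model helps: if the system's dependencies are modelled as a graph, finding a likely root cause becomes a traversal rather than guesswork. The service names and health states below are invented for illustration:

# A tiny dependency graph: each service lists the services it depends on.
depends_on = {
    "frontend": ["checkout", "catalog"],
    "checkout": ["payments", "database"],
    "catalog":  ["database"],
    "payments": [],
    "database": [],
}
unhealthy = {"frontend", "checkout", "database"}

def root_causes(service):
    # An unhealthy service whose dependencies are all healthy is a likely
    # root cause; otherwise, the blame propagates to its unhealthy deps.
    bad_deps = [d for d in depends_on[service] if d in unhealthy]
    if not bad_deps:
        return {service}
    causes = set()
    for dep in bad_deps:
        causes |= root_causes(dep)
    return causes

print(root_causes("frontend"))  # -> {'database'}

Here the graph lets the analysis skip past two unhealthy-but-innocent services and point straight at the database – the kind of conclusion a developer would otherwise have to assemble by hand from dashboards.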

Another benefit of machine-based analysis and data interpretation is the often better and more consistent quality of the analysis itself.

Human-led analysis is very often flawed and there are common forms of bias that influence the result. Confirmation bias tends to lead us to use data to confirm our existing assumptions, rather than trying to invalidate them or find new options.

Selection bias sees us select data that is either easier to access or that we prefer for other reasons. There is also often the problem of confounding variables, which leads us to relate two symptoms to each other while ignoring their actual common cause. So, data analysis at its core can profit from AI and automation – especially as we move beyond mechanical data crunching.
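A small worked example of a confounding variable, with invented numbers (statistics.correlation requires Python 3.10+): garbage-collection pauses drive both request latency and queue depth, so the two symptoms correlate almost perfectly even though neither causes the other:

import random, statistics

random.seed(1)

# Hidden common cause: GC pause time in milliseconds.
gc_pause  = [random.uniform(0, 100) for _ in range(200)]
# Two symptoms, each driven by the hidden cause plus independent noise.
latency   = [g * 2.0 + random.gauss(0, 5) for g in gc_pause]   # symptom A
queue_len = [g * 0.5 + random.gauss(0, 3) for g in gc_pause]   # symptom B

# Naive analysis sees a near-perfect correlation between the two symptoms…
print(statistics.correlation(latency, queue_len))  # ~0.98
# …but fixing either symptom directly would achieve nothing: the cause is the GC.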

These approaches, however, are not without challenges.

Onward to explainability

Machine learning in the software development and operations space has some very specific characteristics. Usually, it needs to work on a small data set, as we don't want to learn about servers being down days later, and models need to be constantly updated, since the systems themselves are constantly changing. Last but not least, the results need to be 'explainable', meaning the AI needs to surface not only its findings, but also the rationale for how it arrived at those conclusions.
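A minimal sketch of what 'explainable' might mean in practice – again with an invented metric and thresholds – is a result that carries its own rationale, so a developer can audit why the system flagged (or cleared) a metric:

import statistics

def explainable_check(name, history, latest, sigmas=3):
    # Return the finding together with the evidence that produced it.
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    threshold = sigmas * stdev
    deviation = abs(latest - mean)
    return {
        "metric": name,
        "anomalous": deviation > threshold,
        "rationale": (f"latest={latest}, baseline mean={mean:.1f}, "
                      f"allowed deviation={threshold:.1f} ({sigmas} sigma), "
                      f"observed deviation={deviation:.1f}"),
    }

print(explainable_check("cpu_percent", [41, 39, 43, 40, 42, 38], 95))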

Such explainability is currently not a given for many AI-based applications and is therefore an area worthy of increased focus to ensure AI benefits developers more effectively.

 

Approved image use – source: Dynatrace