Auto-tech series: Coralogix - Automating data discovery for modern observability

This is a guest post for the Computer Weekly Developer Network written by Chris Cooney in his capacity as a developer evangelist for Coralogix – the company is known for its technology that maps software flows to automatically detect production problems and delivers ‘pinpoint insights’ for log analytics.

Thinking about the role of ‘data discovery’ and how automation controls can now aid us in the modern quest for observability, Cooney analyses some of the key intelligence factors playing out in this space.

Cooney writes as follows…

Once upon a time, a half-baked dashboard and a tail command running on a log file were the cutting edge of observability. 

It wasn’t pleasant, but it did the job. Now, observability has grown into a behemoth of complex tools, techniques and architectures. All of this has emerged in an attempt to keep up with one of the biggest problems in modern software engineering.

Namely, what the hell are we going to do with all of this telemetry data?

The scale problem

But why is data scale such a problem?

Because scale impacts everything, from query performance to cost. As soon as a system is working with a large dataset, all of those simple engineering problems change and multiply.

As an engineer, if you are running your own ELK stack (Elasticsearch, Logstash and Kibana), this means your single-node cluster is not going to cut it. If you’re a customer of a full-stack observability platform provider, it typically means high costs and overages.

A lesson from data science

We need to listen to the data scientists

Any good data scientist will tell you that the gold standard of data is the kind they can query quickly, that is consistent in its structure and that can be transformed without waiting for hours. Telemetry data requires more than this gold standard: the information needs to be available instantly, because the circumstances in which it is needed might be dire.

A two-minute query is acceptable if run once, but anyone who has written a Lucene query will know that a successful investigation involves around 20 or 30 queries. That’s 40 to 60 minutes of waiting time.
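To make that concrete, here is a minimal sketch of a single step in such an investigation, using the elasticsearch Python client with a Lucene-style query string. The index pattern, field names and query are invented for illustration, and the call assumes the 7.x-style body parameter.

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # One step of a typical investigation: a Lucene-style query string search.
    # In practice an engineer re-runs variations of this 20 or 30 times,
    # so every second of query latency compounds.
    resp = es.search(
        index="logs-*",  # hypothetical index pattern
        body={
            "size": 50,
            "query": {
                "query_string": {
                    # level, service and message are hypothetical fields
                    "query": 'level:ERROR AND service:"checkout" AND NOT message:timeout'
                }
            },
        },
    )
    print(resp["hits"]["total"]["value"], "matching log lines")

Each refinement of that query string is another round trip; if every round trip takes minutes rather than seconds, the investigation stalls.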

Smooth (automated) data discovery

Coralogix’s Cooney: Listen to data science, you know you want to.

So then, what is needed for smooth – and essentially automated – data discovery?

Data discovery is an interactive process, in which an engineer is able to interrogate the data without long waiting periods. They’re able to transform and inspect the data, aggregating on fields to detect trends, and so on (the sketch after the list below illustrates this kind of step). All of these things require some key capabilities that are primarily driven by automation.

  • Blazing Fast Queries – With the volume of data increasing exponentially, users need queries that complete in seconds at the most. The compounding cost of slow queries is no longer economical. 
  • An IDE Style Experience – Software engineers explore data structures all the time in their IDE, which will index every field, function, method and variable. Observability providers need this capability, because it’s impossible for operators to remember every field in the complex hierarchy of data.
  • No Reliance on Expensive Storage – Easy access and expensive storage MUST be decoupled. Companies simply cannot afford to hold petabytes of data in high-cost storage and this cost barrier is a significant prohibitive factor in adoption.
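As a rough illustration of that interactive loop, the sketch below uses the same hypothetical client and index to aggregate error counts per service over five-minute windows – the “aggregate on a field to spot a trend” step described above. The field names (service.keyword, level, @timestamp) are assumptions for the example, not any particular schema.

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")

    # Aggregate error counts per service over time to surface a trend,
    # without pulling back raw documents (size: 0).
    resp = es.search(
        index="logs-*",  # hypothetical index pattern
        body={
            "size": 0,
            "query": {"term": {"level": "ERROR"}},  # assumes a keyword-mapped field
            "aggs": {
                "per_service": {
                    "terms": {"field": "service.keyword", "size": 10},
                    "aggs": {
                        "over_time": {
                            "date_histogram": {
                                "field": "@timestamp",
                                "fixed_interval": "5m",
                            }
                        }
                    },
                }
            },
        },
    )

    for bucket in resp["aggregations"]["per_service"]["buckets"]:
        print(bucket["key"], bucket["doc_count"])

The point is not the query itself but the loop: the faster each of these round trips completes, and the cheaper the storage underneath it, the more freely an engineer can explore.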

The company that offers users the best experience for data discovery will give itself a significant edge in the market. This involves finding a cost-effective way of maintaining access to vast swathes of data, building a slick user experience for exploring it and finding the performance optimizations that no one else can.

As data volumes increase, the pressure on vendors to deliver on these key points will amplify. 

The race is on!