We need to give AI more time (series)
This is a guest blogpost by Evan Kaplan, CEO, InfluxData.
Two contrasting forces are driving the AI revolution: generative AI (GenAI), known for its creative prowess, and “real-world AI”, built for practical applications. Generative AI captures the imagination, while real-world AI holds the key to solving real-world problems and encompasses everything from factory automation to smart thermostats.
Just like GenAI, real-world AI requires significant volumes of data to deliver insights into system behaviour that pave the way for accurate predictions or more efficient automation. Unlike GenAI, however, real-world AI applications rely on vast amounts of real-world, often real-time, data to deliver the most benefits.
Put another way, they rely on time series data.
Tesla’s self-driving cars are an example of how real-world AI can be put to work. The company uses its vast network of cars to gather visual data that is fed into a training model. The model continuously learns how to react to and solve new problems, and it learns to understand driving and traffic flow patterns. As the model improves over time, it is distributed to cars via updates to the company’s Full Self Driving (FSD) Beta software.
Tesla claims that the fleet of FSD Beta cars has driven more than 500 million combined miles, generating an incredible amount of data that’s proactively used to influence AI in physical real-world situations.
The Sensorification of AI
The backbone of real-world AI is the “sensorification” of the world around us.
By collecting real-time information about everything from air quality in our green spaces to traffic on the road or the energy usage of a supermarket refrigerator, sensors deliver large volumes of time series data – essentially, a collection of observations obtained through repeated measurements over time – which plays a critical role in creating the latest AI and ML models.
And “large volumes” of data may be an understatement. The volume of data points generated by IoT sensors and devices can quickly escalate into the billions, with time series measurements at every minute, second, millisecond – even nanosecond. Time series data serves as the common language of connected devices, presenting a sequence of data points indexed in chronological order, which allows for tracking changes over time from the same source.
Critically, time series data is unique in its ability to display “serial dependence,” which means the value of one data point is statistically dependent on the value of another data point from a different point in time. By using this sort of data, we can create machines that anticipate our actions by analysing historical data to predict future outcomes.
This delivers a predictive analytics model with essentially limitless potential. Traditional statistical models were previously the norm for predictive analytics, but AI/ML-based models have gained significant traction thanks to their accuracy and ability to be deployed by professionals who may not be highly trained statisticians. Expanding access to a broader user base and empowering them to build applications capable of predicting precise outcomes through time series data can drive the democratisation of real-world AI at a level comparable to that of GenAI.
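Serial dependence, as described above, can be measured directly. The following is a minimal sketch, not a production method: it computes the sample autocorrelation at lag 1, one common way to quantify how strongly each reading depends on the one before it. The function name `lag_autocorrelation` and the temperature readings are hypothetical, chosen purely for illustration.

```python
from statistics import mean


def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    if lag <= 0 or lag >= len(values):
        raise ValueError("lag must be between 1 and len(values) - 1")
    mu = mean(values)
    # Total variation of the series around its mean.
    denom = sum((v - mu) ** 2 for v in values)
    if denom == 0:
        return 0.0  # constant series: no variation to correlate
    # Co-variation between each point and the point `lag` steps earlier.
    num = sum(
        (values[i] - mu) * (values[i - lag] - mu)
        for i in range(lag, len(values))
    )
    return num / denom


# Hypothetical hourly temperature readings (illustrative values only).
temperatures = [12.0, 12.5, 13.1, 14.0, 15.2, 16.1,
                16.8, 17.0, 16.5, 15.4, 14.2, 13.0]

r1 = lag_autocorrelation(temperatures, lag=1)
print(f"lag-1 autocorrelation: {r1:.2f}")
```

A value close to 1 suggests that each reading is a strong predictor of the next, which is the property forecasting models exploit. A value near 0 would suggest the readings are close to independent of their predecessors.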
Setting the tempo: creating a time series strategy from the start
Time series data offers a foundation for real-world AI applications, but implementing a time series data strategy alongside AI isn’t as simple as traditional data collection and analysis.
First, organisations need to lose the mindset of limiting the volume of data they collect to improve querying times. Collecting vast quantities of rich data is the key to making AI smarter. Instead of measuring only the temperature and time of day, a sensor might monitor other contributing factors such as the amount of UV light, the position of the sun in the sky, and the temperature ten feet above the surface to help build a deeper, wider-reaching picture to generate better insights.
Using a time series-optimised database helps reduce querying times in these scenarios, so it’s important to build time series data considerations into an AI strategy early to avoid the need to replace a traditional database that’s unable to handle the sheer volume of data required for these applications.
Once a business overcomes these challenges, attention then turns to collecting the time series data points. This means investing in the instrumentation necessary to create and collect this valuable data, which in turn requires a data management strategy that accommodates high data cardinality, or a high number of unique time series. Again, the more sensors an organisation has, the better its end AI product, but only if it can effectively manage that data.
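To make cardinality concrete, here is a small sketch under one stated assumption: that cardinality is counted as the number of distinct combinations of a measurement name and its tag values, which is one common convention among time series databases. The point structure, the field names, and the `series_cardinality` helper are all hypothetical.

```python
def series_cardinality(points):
    """Count distinct (measurement, tag set) combinations.

    Assumes each point is a dict with a "measurement" string and a
    "tags" dict; this layout is illustrative, not a real database schema.
    """
    keys = {
        (p["measurement"], tuple(sorted(p["tags"].items())))
        for p in points
    }
    return len(keys)


# Hypothetical sensor readings: two sensors at one site, one at another.
points = [
    {"measurement": "temperature", "tags": {"site": "A", "sensor": "s1"}, "value": 4.1},
    {"measurement": "temperature", "tags": {"site": "A", "sensor": "s2"}, "value": 4.3},
    {"measurement": "temperature", "tags": {"site": "B", "sensor": "s1"}, "value": 3.9},
    {"measurement": "temperature", "tags": {"site": "A", "sensor": "s1"}, "value": 4.2},
]

print(series_cardinality(points))
```

Under this definition, every additional sensor or tag value multiplies the number of series the database must index, which is why cardinality tends to grow quickly as instrumentation expands.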
AI building blocks
If a business gets time series data right, the potential for organisational AI transformation is huge. By analysing vast amounts of data, AI can identify system behaviours and forecast future scenarios with increasing accuracy. This capability, combined with time series databases to manage huge amounts of data, is instrumental to developing automated systems such as self-driving cars and space-bound rockets.
Real-world AI will also transform the way we work. As organisations continue to unlock the potential of time series data, we can expect to see the end of manual monitoring and physical dashboards. This data allows AI and ML to monitor data-driven trends and react automatically based on predefined rules. Increasingly, this is already happening in more digitally advanced industries, and it’s fast becoming standard. Ultimately, the end of manual monitoring frees up teams from mundane work and allows them to innovate even further.
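As a rough illustration of reacting automatically to a predefined rule, the sketch below flags moments when the rolling average of a sensor reading crosses a fixed threshold. Everything here is an assumption made for the example: the `monitor` function, the window size, the threshold, and the refrigerator temperatures. A real deployment would likely use a learned model or a database's built-in alerting rather than a hand-written loop.

```python
from collections import deque


def monitor(readings, window=3, threshold=8.0):
    """Return (index, rolling_mean) pairs where the rolling mean exceeds the threshold."""
    recent = deque(maxlen=window)  # keeps only the last `window` readings
    alerts = []
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:
            avg = sum(recent) / window
            if avg > threshold:
                alerts.append((i, avg))
    return alerts


# Hypothetical refrigerator temperatures in degrees Celsius.
fridge_temps_c = [4.0, 4.2, 4.1, 6.5, 9.0, 9.8, 10.1, 5.0, 4.3]

alerts = monitor(fridge_temps_c, window=3, threshold=8.0)
for index, avg in alerts:
    print(f"reading {index}: rolling mean {avg:.1f} C exceeds threshold")
```

Averaging over a window rather than checking single readings is a deliberate choice in this sketch: it reduces false alarms from one-off spikes, at the cost of reacting a little later.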
Time series data is already ubiquitous, as it lies in every part of today’s digital businesses, but it has yet to reach its full potential within most organisations. As the amount of data we create continues to grow, it’s clear that the businesses that give themselves the best competitive advantage are those that become experts in harnessing, analysing, and ultimately using time series data.
Businesses need to recognise that time is the new frontier in data; those who embrace this now will be the winners of the next wave of innovation in AI and connected technologies.