Why the IoT is pushing analytics to the limit
IoT Back to Basics, chapter 2: In the era of the Internet of Things (IoT) it is becoming increasingly important to be able to process, filter and analyse data close to where it is created, so it can be acted on remotely, rather than having to bring it back to a data centre or the cloud for filtering and analysis.
The other reason to implement analytics at the edge of the network is that use cases for IoT continue to grow, and in many situations the volume of data generated at the edge requires bandwidth levels – as well as computing power – that overwhelm the available resources. Streams of data from smart devices, sensors and the like could therefore swamp data centres designed for more traditional enterprise-scale needs.
For example, a temperature reading from a wind turbine motor’s sensor that falls within the normal range shouldn’t necessarily be stored every second, as the data volume can soon add up. Rather, it is the readings that fall outside the normal range or signify a trend – perhaps pointing towards an imminent failure of a component – that should create an alert, and possibly be stored centrally only after that first anomaly, for subsequent analysis.
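The wind turbine example can be sketched in a few lines of Python. This is only an illustration of the idea, not any vendor's implementation: the `Reading` type, the function name and the 40–85 °C "normal" band are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Reading:
    timestamp: int   # seconds since the start of the stream (hypothetical)
    celsius: float   # motor temperature


# Hypothetical normal operating band for the motor; real limits would
# come from the equipment's specification.
NORMAL_MIN_C = 40.0
NORMAL_MAX_C = 85.0


def readings_to_forward(readings: Iterable[Reading]) -> Iterator[Reading]:
    """Discard in-range readings until the first anomaly, then forward
    that anomaly and every reading after it for central storage."""
    anomaly_seen = False
    for reading in readings:
        in_range = NORMAL_MIN_C <= reading.celsius <= NORMAL_MAX_C
        if not in_range:
            anomaly_seen = True
        if anomaly_seen:
            yield reading
```

Here only the out-of-range reading and those that follow it leave the edge node; detecting a trend, as opposed to a simple threshold breach, would need more logic than is shown.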
There are too many vendors in this space to produce an exhaustive list here. But it’s perhaps notable that last year a company formerly known as JustOne Database performed a root-and-branch rebranding exercise. It renamed not only its products but also the company itself, which is now Edge Intelligence. It told me it was seeing such good traction for its database – which can run on relatively compact servers at the edge of the network, in a data centre or in the cloud – that it changed its name after more than six years in the business.
So what are some of the characteristics of edge analytics that you might want to consider if you are trying to push at least some analytics to the edge?
Standards and protocol translation
Although there is likely to be a shakeout of some of the standards in this space, opting for technologies that support standards is likely to make future integrations easier. Again, there is a vast array of standards and APIs in this area. Standards and protocols include POSIX and HDFS APIs for file access, SQL for querying, a Kafka API for event streams, and HBase and perhaps an OJAI (Open JSON Application Interface) API to help with compatibility with NoSQL databases. There’s also the need to be able to support older, proprietary telemetry protocols so that legacy equipment (which often has a lifetime measured in decades) can be connected to more modern IoT frameworks. This is especially true in the industrial space, where IoT is of particular value for the likes of predictive maintenance.
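To give a flavour of what protocol translation involves, the sketch below converts a binary telemetry frame into JSON that a modern framework could consume. The frame layout (a 16-bit device ID, a 32-bit timestamp and a 32-bit float, little-endian) is invented for the example; a real legacy protocol would have its own documented format.

```python
import json
import struct

# Hypothetical legacy frame: uint16 device ID, uint32 Unix timestamp,
# float32 measurement, little-endian with no padding (10 bytes).
LEGACY_FRAME = struct.Struct("<HIf")


def legacy_frame_to_json(frame: bytes) -> str:
    """Translate one fixed-width binary frame into a JSON document."""
    device_id, timestamp, value = LEGACY_FRAME.unpack(frame)
    return json.dumps(
        {
            "device_id": device_id,
            "timestamp": timestamp,
            # float32 values carry limited precision, so round for readability.
            "value": round(value, 3),
        }
    )
```

The resulting JSON string could then be handed to whichever event-stream or NoSQL interface the edge platform supports.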
Distributed data aggregation
This is to some extent the bread and butter of edge analytics, providing high-speed local processing, which is especially useful for location-restricted or sensitive data such as personally identifiable information (PII), and can be used also to consolidate IoT data from edge sites.
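As a rough illustration of local aggregation, an edge node might reduce a window of raw readings to a small summary and send only that onwards, keeping the raw (and possibly sensitive) values on site. The function and field names here are assumptions chosen for the example.

```python
from statistics import fmean


def summarise(site: str, values: list[float]) -> dict:
    """Reduce a window of raw readings from one edge site to a compact
    summary record suitable for consolidation at a central cluster."""
    if not values:
        raise ValueError("cannot summarise an empty window")
    return {
        "site": site,
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": fmean(values),
    }
```

For instance, `summarise("turbine-07", [60.0, 62.0, 64.0])` yields a single record with a count of 3 and a mean of 62.0 in place of three separate readings.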
Bandwidth-awareness
This refers to technologies that adjust throughput from the edge to the cloud and/or data centre, even with occasionally-connected sensors or devices.
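One simple way to picture this is a store-and-forward buffer: messages queue up at the edge while the link is down or congested, and are released only as fast as the current bandwidth allows. The class below is a minimal sketch under that assumption; the name `StoreAndForwardBuffer` and the byte-budget approach are illustrative, not a description of any particular product.

```python
from collections import deque


class StoreAndForwardBuffer:
    """Hold messages at the edge and release them within a byte budget."""

    def __init__(self, max_items: int) -> None:
        # When the buffer is full, the oldest message is dropped first.
        self._queue: deque[bytes] = deque(maxlen=max_items)

    def add(self, message: bytes) -> None:
        self._queue.append(message)

    def drain(self, byte_budget: int) -> list[bytes]:
        """Return queued messages, oldest first, that fit in byte_budget.
        A budget of 0 models a link that is currently unavailable."""
        sent: list[bytes] = []
        used = 0
        while self._queue and used + len(self._queue[0]) <= byte_budget:
            message = self._queue.popleft()
            used += len(message)
            sent.append(message)
        return sent

    def __len__(self) -> int:
        return len(self._queue)
```

The caller would set `byte_budget` from whatever it knows about the uplink at that moment, so an occasionally connected device simply drains nothing until the connection returns.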
Converged analytics
This combines operational decision-making with real-time analysis of data at the edge.
Security and identity management
End-to-end IoT security provides authentication, authorization, and access control from the edge to the central clusters. In certain circumstances it will be desirable to offer secure encryption on the wire for data communicated between the edge and the main data centre. Identity management is also a thorny issue: it’s necessary to be able to manage the ’things’ in terms of their authentication, authorization and privileges within or across system and enterprise boundaries.
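As one small piece of that picture, a central cluster can check that a message really came from a known device by having each device sign its payloads with a per-device secret key. The sketch below uses HMAC-SHA256 from Python's standard library; the function names and the idea of a pre-shared key per device are assumptions for the example. Note that this provides authentication and integrity only, not encryption: protecting data on the wire would typically be handled by a protocol such as TLS.

```python
import hashlib
import hmac


def sign(device_key: bytes, payload: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of payload under the device's key."""
    return hmac.new(device_key, payload, hashlib.sha256).hexdigest()


def verify(device_key: bytes, payload: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(device_key, payload), signature)
```

Authorization and privilege management for each 'thing' would sit on top of a check like this and are not shown.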
Enterprise-grade reliability
This delivers a reliable computing environment that can handle the multiple hardware failures that can occur in remote, isolated deployments.
Integration with the cloud
Even if not now, there may be a requirement in the future to have good integration between an edge analytics node and the cloud. This is so that alert data and even ‘baseline’ data points can be stored in the cloud rather than in one’s own data centre. In this regard integration with your cloud provider of choice – if you have one – would be a wise idea. If you don’t already do much in the way of data processing and storage in the cloud, some of the likely execution venues in your future could include Amazon Web Services, Google Cloud Platform or Microsoft Azure, but it wouldn’t do any harm to know there is support for the open source OpenStack infrastructure as a service (IaaS).
Edge analytics has come on in leaps and bounds in the past several years as IoT use cases have shaken out. At the very least it might be worth asking if edge computing has a role to play in any IoT projects that you may be thinking of embarking on.