Confluent - don’t blink, stream in sync with Apache Flink
Data streaming company Confluent has rolled out a set of new capabilities for its Confluent Cloud platform.
The new capabilities are designed to provide confidence that data is trustworthy and can be easily processed and securely shared.
Among them is Data Quality Rules, an expansion of the company’s Stream Governance suite.
It is designed to enable organisations to resolve data quality issues so data can be relied on for making business-critical decisions.
Also shiny and new are Custom Connectors, Stream Sharing, the Kora Engine and an early access program for managed Apache Flink, all of which are designed to make it easier for companies to gain insights from their data on one platform.
Apache Flink
Did we blink, or did you say Apache Flink?
Yes indeed, Apache Flink: the project (and technology) designed to make stateful computations possible over data streams, i.e. data that is in motion inside a data streaming platform, application, software engine or other system. In a stateful computation, code executes a function that takes both an incoming value and the stored state of a given data entity, has access to that state in memory or storage, and returns a result along with a new state.
Open source at its core, Apache Flink is a unified stream processing framework (with batch processing functionality where it is needed) powered by a distributed streaming dataflow engine. Some say data flow, some say data-flow, some say dataflow – either way, Flink is written in Java and Scala and built to execute arbitrary dataflow programs in a data-parallel (across more than one microprocessor) and pipelined manner. Flink’s runtime supports the execution of iterative algorithms natively and the technology ultimately provides a high-throughput, low-latency streaming engine.
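To make “stateful” concrete, here is a minimal sketch using Flink’s DataStream API and its keyed ValueState: a per-key running sum, in which Flink stores the accumulated state and hands it back on each new event. The class, stream and key names are illustrative.

```java
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

// A minimal sketch of a stateful computation: a running sum kept per key.
public class RunningSum extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    private transient ValueState<Long> sum; // per-key state, managed (and checkpointed) by Flink

    @Override
    public void open(Configuration parameters) {
        sum = getRuntimeContext().getState(new ValueStateDescriptor<>("sum", Long.class));
    }

    @Override
    public void flatMap(Tuple2<String, Long> in, Collector<Tuple2<String, Long>> out) throws Exception {
        Long current = sum.value();                      // read the previous state (null on a key's first event)
        long updated = (current == null ? 0L : current) + in.f1;
        sum.update(updated);                             // write back "a new state"...
        out.collect(Tuple2.of(in.f0, updated));          // ...and return a value alongside it
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(Tuple2.of("sensor-a", 3L), Tuple2.of("sensor-a", 5L), Tuple2.of("sensor-b", 2L))
           .keyBy(t -> t.f0)            // partition the stream so state is scoped per key
           .flatMap(new RunningSum())
           .print();
        env.execute("stateful-running-sum");
    }
}
```

Because the state is scoped to a key, Flink can shard it across the data-parallel, pipelined runtime described above and checkpoint it for fault tolerance.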
Stream processing plays a critical role in data streaming infrastructure by filtering, joining and aggregating data in real time, enabling downstream applications and systems to deliver instant insights.
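As a hedged illustration of that filter-and-aggregate pattern (the field names, values and windowing choice are invented for the example), the same DataStream API chains the steps declaratively:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// A sketch of filter-and-aggregate: (user, amount) events are cleaned, grouped
// per user and summed over 10-second windows. With a real unbounded source (e.g.
// a Kafka topic) the windows fire continuously; this bounded demo source exists
// only to show the shape of the pipeline.
public class PaymentsPerUser {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.fromElements(Tuple2.of("alice", 40L), Tuple2.of("bob", -1L), Tuple2.of("alice", 2L))
           .filter(t -> t.f1 > 0)                                       // drop malformed, non-positive amounts
           .keyBy(t -> t.f0)                                            // group by user
           .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))  // 10-second tumbling windows
           .sum(1)                                                      // aggregate the amount field
           .print();                                                    // downstream systems would consume this
        env.execute("payments-per-user");
    }
}
```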
Confluent says customers are turning to Flink for its advanced stream processing capabilities and active developer community, using it to handle large-scale, high-throughput and low-latency data streams. Following Confluent’s Immerok acquisition, the early access program for managed Apache Flink has opened to select Confluent Cloud customers, who can try the service and help shape the roadmap by partnering with the company’s product and engineering teams.
Having high-quality data that can be quickly shared between teams, customers and partners helps businesses make decisions faster. However, this is a challenge many companies face when dealing with highly distributed open source infrastructure like Apache Kafka.
According to Confluent’s new 2023 Data Streaming Report, 72% of IT leaders cite the inconsistent use of integration methods and standards as a challenge or major hurdle to their data streaming infrastructure.
“Real-time data is the lifeblood of every organisation, but it’s extremely challenging to manage data coming from different sources in real-time and guarantee that it’s trustworthy,” said Shaun Clowes, chief product officer at Confluent. “As a result, many organisations build a patchwork of solutions plagued with silos and business inefficiencies. Confluent Cloud’s new capabilities fix these issues by providing an easy path to ensuring trusted data can be shared with the right people in the right formats.”
What are data contracts?
Data contracts are formal agreements between upstream and downstream components around the structure and semantics of data that is in motion. One critical component of enforcing data contracts is rules or policies that ensure data streams are high-quality, fit for consumption and resilient to schema evolution over time.
To address the need for more comprehensive data contracts, Confluent’s Data Quality Rules, a new feature in Stream Governance, enable organisations to deliver trusted, high-quality data streams using customisable rules that ensure data integrity and compatibility.
With Data Quality Rules, schemas stored in Schema Registry can now be augmented with several types of rules (an illustrative sketch follows the list below) so teams can:
- Ensure high data integrity by validating and constraining the values of individual fields within a data stream.
- Resolve data quality issues with customisable follow-up actions on incompatible messages.
- Simplify schema evolution using migration rules to transform messages from one data format to another.
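As an illustrative sketch only: Confluent documents these rules as part of the data contract registered with Schema Registry, with conditions written in Google’s Common Expression Language (CEL) and follow-up actions such as routing failing messages to a dead-letter queue. The subject, field names and topic below are invented, and the exact payload shape should be checked against Confluent’s data contracts documentation.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// A hedged sketch: registering a schema plus a data quality rule via Schema Registry's
// REST API (POST /subjects/{subject}/versions). The ruleSet structure follows Confluent's
// data contracts documentation as we read it; verify before use. A real Confluent Cloud
// cluster would also need an Authorization header with a Schema Registry API key.
public class RegisterContract {
    public static void main(String[] args) throws Exception {
        String payload = """
            {
              "schemaType": "AVRO",
              "schema": "{\\"type\\":\\"record\\",\\"name\\":\\"Order\\",\\"fields\\":[{\\"name\\":\\"quantity\\",\\"type\\":\\"int\\"}]}",
              "ruleSet": {
                "domainRules": [{
                  "name": "quantityIsPositive",
                  "kind": "CONDITION",
                  "type": "CEL",
                  "mode": "WRITE",
                  "expr": "message.quantity > 0",
                  "onFailure": "DLQ",
                  "params": { "dlq.topic": "bad-orders" }
                }]
              }
            }""";

        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("https://psrc-example.confluent.cloud/subjects/orders-value/versions"))
            .header("Content-Type", "application/vnd.schemaregistry.v1+json")
            .POST(HttpRequest.BodyPublishers.ofString(payload))
            .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // returns the new schema id on success
    }
}
```

On success, Schema Registry returns the id of the new schema version, and producers and consumers that fetch the subject’s latest version then see both the schema and its rules.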
Many organisations have unique data architectures and need to build their own connectors to integrate their homegrown data systems and custom applications with Apache Kafka. However, these custom-built connectors then need to be self-managed, requiring manual provisioning, upgrading and monitoring that take valuable time and resources away from other business-critical activities.
Confluent Custom Connectors
By expanding Confluent’s connector ecosystem, Custom Connectors allow teams to:
- Quickly connect to any data system using the team’s own Kafka Connect plugins, without code changes.
- Ensure high availability and performance, using logs and metrics to monitor the health of their connectors and workers.
- Eliminate the operational burden of provisioning and perpetually managing low-level connector infrastructure.
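Because Custom Connectors run a team’s own Kafka Connect plugins unchanged, the starting point is the standard open source Connect API; the packaged plugin is what gets uploaded to Confluent Cloud. A minimal sketch of a source connector, with all names invented:

```java
import java.util.List;
import java.util.Map;
import org.apache.kafka.common.config.ConfigDef;
import org.apache.kafka.connect.connector.Task;
import org.apache.kafka.connect.source.SourceConnector;
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

// A minimal, hedged skeleton of a custom Kafka Connect source plugin; a real
// implementation would do its work in SourceTask.poll().
public class HomegrownSourceConnector extends SourceConnector {

    private Map<String, String> config;

    @Override public void start(Map<String, String> props) { config = props; }

    @Override public Class<? extends Task> taskClass() { return HomegrownSourceTask.class; }

    @Override public List<Map<String, String>> taskConfigs(int maxTasks) {
        return List.of(config);   // one task here; real connectors split work across maxTasks
    }

    @Override public void stop() { }

    @Override public ConfigDef config() {
        // Declares (and lets Connect validate) the settings the plugin expects.
        return new ConfigDef().define("homegrown.endpoint", ConfigDef.Type.STRING,
                ConfigDef.Importance.HIGH, "URL of the in-house system to poll");
    }

    @Override public String version() { return "0.1.0"; }

    // The task Connect instantiates and runs; poll() is where records are produced.
    public static class HomegrownSourceTask extends SourceTask {
        @Override public void start(Map<String, String> props) { }
        @Override public List<SourceRecord> poll() { return List.of(); } // read the external system here
        @Override public void stop() { }
        @Override public String version() { return "0.1.0"; }
    }
}
```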
Businesses handling activities such as inventory management, deliveries and financial trading need to constantly exchange real-time data internally and externally across their ecosystems to make informed decisions, build seamless customer experiences and improve operations. Today, many organisations still rely on flat file transmissions or polling APIs for data exchange, resulting in data delays, security risks and extra integration complexity. Confluent pitches Stream Sharing as an easier and safer alternative for sharing streaming data across organisations.
Using Stream Sharing, teams can:
- Exchange real-time data without delays, directly from Confluent to any Kafka client.
- Safely share and protect data with robust authenticated sharing, access management and layered encryption controls.
- Trust the quality and compatibility of shared data by enforcing consistent schemas across users, teams and organisations.
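Because shared streams are consumed directly from Confluent by any Kafka client, the receiving side is a plain Kafka consumer. A minimal sketch, assuming the share invitation supplies a bootstrap address, credentials and topic name (all placeholders below):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

// A sketch of a standard Kafka consumer reading a shared stream; the endpoint,
// credentials and topic are placeholders supplied by the data provider in practice.
public class SharedStreamReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "pkc-example.us-east-1.aws.confluent.cloud:9092");
        props.put("security.protocol", "SASL_SSL");   // authenticated, encrypted transport
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required "
            + "username=\"API_KEY\" password=\"API_SECRET\";");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "shared-stream-reader");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("shared-orders"));  // hypothetical shared topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("%s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```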
Confluent’s new Custom Connectors are available on AWS in select regions, with support for additional regions and other cloud providers to follow.