Teradata brings big data into the SQL family

Teradata addresses the issue of combining unstructured, semi-structured and traditional row-and-column data warehousing in a single architecture.

Teradata has fired the starting pistol in the race to make unstructured, semi-structured and traditional row-and-column data warehousing manageable and accessible in a single integrated architecture.

With the arrival of its “Unified Data Architecture”, the data warehouse specialist has started a trend that other suppliers – including IBM, EMC and Oracle – are likely to follow. 

The architecture will encompass a raft of technologies designed to make it quicker and cheaper for businesses to capture, store and analyse data in both traditional relational databases and big data Hadoop environments. 

The parallel programming framework MapReduce, developed at Google, and its open-source implementation Hadoop simplify the processing of huge data sets distributed across commodity hardware.
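To illustrate the programming model, the following is a minimal single-machine sketch of a MapReduce-style word count in Python. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, the job a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single total."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence on an in-memory list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Calling `word_count(["big data", "big warehouse"])` counts "big" twice and the other two words once each.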

But MapReduce requires expensive programmers to access and analyse data. Teradata says its architecture allows developers and data scientists to query unstructured data in the Hadoop Distributed File System (HDFS) with SQL, a more commonly used language.
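The appeal of SQL here is that an aggregation becomes a single declarative statement. The sketch below shows a grouped count written in standard SQL, using Python's built-in `sqlite3` module purely as a stand-in engine. It does not use SQL-H or Hadoop, and the `events` table, its columns and the `count_events_by_type` function are hypothetical names chosen for illustration.

```python
import sqlite3


def count_events_by_type(rows):
    """Count rows per event type with one SQL GROUP BY statement."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE events (user_id TEXT, event_type TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT event_type, COUNT(*) FROM events "
        "GROUP BY event_type ORDER BY event_type"
    ).fetchall()
    conn.close()
    return result
```

The grouping and counting that a MapReduce job spells out as separate map and reduce functions is expressed here in the `GROUP BY` clause alone.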

"Now you can have the power of MapReduce and the ease of use of SQL," said Teradata chief technology officer Stephen Brobst. 

"Before, with Hadoop, the only people that could get the data out were the people that put it in."

The vendor has deployed HCatalog, an open-source metadata framework developed by Hortonworks, and SQL-H, which allows analysis of data in HDFS using industry-standard SQL.

Prior to its acquisition by Teradata, Aster invented and patented SQL-MapReduce, which extends SQL with MapReduce functionality and offers more than 50 pre-built analytics applications. 

"It allows you to have the best of both worlds," Brobst said.

The Teradata-Aster Big Analytics Appliance combines these technologies to run, manage and analyse data in Hadoop and Teradata's relational database in a single machine, which can be configured to store 15 petabytes of data across both databases, Teradata says. The architecture will also operate across a multi-server environment.

Together with the appliance server, Teradata has announced new software. Web-based systems management tool Viewpoint has been extended to manage Aster from the same console and will also be able to manage and monitor Hadoop databases from early 2013. Connectors for Hadoop will allow movement of data in and out of Hadoop distributions from Cloudera and Hortonworks. Meanwhile, Teradata Vital Infrastructure will be able to monitor events and identify risks and system faults across traditional, Aster and Hadoop databases.

Eric Rivard, chief operating officer with Cerulium, a consultancy firm, said there will be cost benefits from the use of SQL-H in the architecture. 

“Hadoop is good for unlimited data, but it is difficult to get the data out,” said Rivard.

Robin Bloor, chief analyst and founder of the Bloor Group, said Oracle, Microsoft and IBM are likely to follow with their own systems for combining Hadoop and relational databases in a more manageable and accessible framework, but for the moment Teradata has the head start.

Despite the considerable benefits of combining big data and traditional data in one architecture, businesses are still left with the problem of incorporating existing and historic data into the same management and analytics architecture, Bloor says. 

They are unlikely to move these data into a new environment because of a technology announcement, he says.
