Can one 'multi-model' database rule them all?

In open source, we trust community – and as such, we might reasonably trust benchmarking studies that have been driven by community groups, in theory at least.

The ArangoDB open source NoSQL performance benchmark series is one such open study.

The native ‘multi-model’ NoSQL database company has even published the scripts needed to repeat the benchmark.

A multimodel (or multi-model) database is a data processing platform that supports multiple data models, each of which defines how the information in a database is organised and arranged. Being able to incorporate multiple models into a single database lets users meet various application requirements without needing to deploy different database systems.

The goal of the benchmark is to measure the performance of each database system when no cache is used.

For the 2018 benchmark, three single-model database systems were compared against ArangoDB: Neo4j for graph; MongoDB for document; and PostgreSQL for relational.

The benchmark also tested ArangoDB against another multi-model database, OrientDB.

Benchmark parameters

The benchmark used Node.js 8.9.4. The operating system for the servers was Ubuntu 16.04, including the OS-patch 4.4.0-1049-aws, which includes Meltdown and Spectre V1 patches. Each database had an individual warm-up.
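For readers unfamiliar with how a warm-up phase fits into a benchmark run, here is a minimal sketch in Python. It is not ArangoDB's published script (which runs on Node.js); the function names, the run counts and the in-memory dictionary standing in for a database are all assumptions made purely for illustration.

```python
import statistics
import time
from typing import Callable, Dict, List


def run_benchmark(operation: Callable[[int], object],
                  warmup_runs: int = 100,
                  measured_runs: int = 1000) -> Dict[str, float]:
    """Time `operation` after a separate, unmeasured warm-up phase."""
    # Warm-up: exercise the operation so one-off start-up costs
    # are not counted in the measured results.
    for i in range(warmup_runs):
        operation(i)

    # Measured phase: record the latency of each individual call.
    latencies: List[float] = []
    for i in range(measured_runs):
        start = time.perf_counter()
        operation(i)
        latencies.append(time.perf_counter() - start)

    return {
        "runs": float(len(latencies)),
        "total_s": sum(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }


# Hypothetical stand-in for a database "single read": a dictionary lookup.
# A real test would call the database driver here instead.
fake_store = {i: {"_key": str(i), "age": 20 + i % 50} for i in range(1000)}


def single_read(i: int) -> object:
    return fake_store[i % len(fake_store)]


result = run_benchmark(single_read, warmup_runs=50, measured_runs=200)
print(result)
```

Swapping `single_read` for a call into a real driver would give a rough per-operation latency profile, though a serious comparison would also need identical hardware, datasets and configuration for every system under test.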

What ArangoDB has been trying to suggest (would ‘spin’ be too cruel?) is that a multi-model database can compete with single-model databases in their specialities.

In fundamental queries like Single Read, Single Write and Single Write Sync, ArangoDB says its technology outperformed PostgreSQL.

Claudius Weinberger, CEO of ArangoDB, said: “One of our main objectives, when conducting the benchmark, is to demonstrate that a native multi-model database can compete with single-model databases on their home turf. To get more developers to buy-in to the multi-model approach, ArangoDB needs to continually evolve and innovate.”

The company lists a series of similarly “positive” (its term, not ours) performance stats in areas including document aggregation, computing statistics about age distribution and benchmark results that profile data, shortest path and memory usage.

Need for debate

We’ve been talking about multi-model databases for perhaps half a decade now. The promise is an end to the ‘polyglot persistence’ scenario, where an IT team has to use a variety of databases for different data model requirements and so ends up with multiple storage and operational requirements, plus the additional task of integrating that stack and ensuring fault tolerance controls are applied across the spectrum. Multi-model does indeed provide a means of alleviating those concerns… but we need to hear some balancing arguments put forward by the single-model cognoscenti in order for us to judge more broadly for all use cases.