Security Think Tank: Understand data for risk-based protection

Why is it important to know where data flows, with whom it is shared and where it lives at rest, and what is the best way of achieving this?

Mapping what data lives where, and how it gets there, is now backed by a legal requirement under the EU’s General Data Protection Regulation (GDPR).

But regardless of the legal requirement, developing a “data dictionary” is good business practice. It acknowledges that data is an asset with value, and allows organisations to understand what data they have, how it is accessed and why.
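As a purely illustrative sketch of what such a data dictionary might record, one possible shape is a structured entry per data asset – the field names and example values below are hypothetical, not a prescribed format:

from dataclasses import dataclass

@dataclass
class DataDictionaryEntry:
    """One entry in a simple data dictionary (illustrative fields only)."""
    name: str                # e.g. "customer_contact_details"
    owner: str               # business owner accountable for the data
    classification: str      # e.g. "public", "internal", "confidential"
    storage_locations: list  # systems or locations where it lives at rest
    shared_with: list        # internal teams and external parties
    purpose: str             # why the data is held and accessed

entry = DataDictionaryEntry(
    name="customer_contact_details",
    owner="Customer Services",
    classification="confidential",
    storage_locations=["CRM database", "marketing data warehouse"],
    shared_with=["fulfilment partner"],
    purpose="Order fulfilment and customer support",
)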

And from there, the level of rigour with which this data needs to be protected can be defined, thereby reducing the risk that it will be breached during a cyber attack – the occurrence of which is now widely considered to be a matter of when, not if.

Without knowing what data is stored and where it is when “at rest”, it is almost impossible to determine the different levels of control required at specific locations. For example, some information may need to be encrypted at all times and rigorously protected, even from most internal employees, while other material can be viewed internally, but is not appropriate for external workers.

Without an accurate understanding of what information is stored in which locations (physical and electronic), an organisation would logically have to assume that all data is highly sensitive and protect everything to the highest standard. And although it is technically possible to apply matching (strong) levels of encryption throughout the organisation, this is potentially expensive – and unnecessary. In other words, it doesn’t make good business sense.

By contrast, knowing that a specific server holds particularly sensitive data allows the appropriate level of protection – such as access-driven malware scanning, strong encryption algorithms or simply good user authorisation controls – to be put in place on that part of the network.
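One hedged way to express that idea is a simple lookup from data classification to minimum controls, so the protection applied to a server follows from the most sensitive data it holds; the classification levels and control names below are illustrative assumptions, not a standard:

# Illustrative mapping of data classification to minimum controls.
MINIMUM_CONTROLS = {
    "public":       {"authentication"},
    "internal":     {"authentication", "access_logging"},
    "confidential": {"authentication", "access_logging",
                     "encryption_at_rest", "malware_scanning_on_access"},
}

def required_controls(classifications):
    """Return the union of minimum controls for all data classes on a server."""
    controls = set()
    for level in classifications:
        controls |= MINIMUM_CONTROLS.get(level, set())
    return controls

# A server holding both internal and confidential data needs the stricter set.
print(sorted(required_controls(["internal", "confidential"])))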

Identifying breaches and minimising impact

Mapping data also means that defence-in-depth principles can be deployed to protect relevant data from network intruders. Any breaches are identified quickly – which is critical in view of GDPR rules on notification – while knowing which data flows and patterns should take place allows anomalies to be flagged and mitigating steps to be taken. Prompt remedial action reduces the impact of an attack.

For example, once an attacker has breached the network, they will need to extract the data they have obtained. They may do this by “hiding” it among less critical or sensitive data to increase the time it takes for the breach to be noticed.

However, knowledge of where the data should be flowing allows controls to be put in place to prevent abnormal data flows, while enabling legitimate ones to take place so that business processes and activities can continue unhindered.

Enabling network and endpoint data leak prevention (DLP) in this way provides another layer of network defence.
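As a minimal sketch of that flow-checking idea, assume the data map yields a set of approved flows (source, destination, data class); any observed transfer outside that set can then be blocked or flagged. The systems and flows named here are hypothetical:

# Approved data flows taken from the data map: (source, destination, data class).
APPROVED_FLOWS = {
    ("crm_db", "billing_system", "customer_contact_details"),
    ("billing_system", "payment_gateway", "payment_details"),
}

def check_flow(source, destination, data_class):
    """Flag any transfer that does not match an approved flow in the data map."""
    if (source, destination, data_class) in APPROVED_FLOWS:
        return "allow"
    return "alert"  # abnormal flow: block or raise for investigation

# A legitimate business flow continues unhindered...
print(check_flow("crm_db", "billing_system", "customer_contact_details"))  # allow
# ...while an unexpected outbound transfer is flagged.
print(check_flow("crm_db", "external_ftp", "customer_contact_details"))    # alert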

Technology for data mapping

Updating an existing data model or map is much easier than building one from scratch, and is usually possible when a mature enterprise application or standard software is in use, because the data model should already be at least partially available.

It may need to be customised for each use case, but adopting the standard functionality as a base will significantly accelerate the task of defining the data model.

There are various discovery and configuration management tools to help with data mapping.

It is also important to recognise that data structure may change for each business project, and a mechanism for identifying these variations needs to be incorporated into the early stages of the project delivery process.

For example, in-depth access to sensitive information on customer purchasing habits, such as size and payment preferences, may be pivotal to the outcome of a venture. Any additional technology implemented for the project will also have access to this data, so it will affect the data model and may require additional checks.

Although this could be incorporated as part of the mandatory GDPR requirement for privacy by design, it can mean additional assessments are needed, leading to increased project costs or delays.

Efficiency as well as compliance

Accurately populated data maps also drive efficiency, and the time taken to review potential changes is reduced significantly if there is clarity on the data model from the outset. Identifying and enforcing a primary data source minimises duplication, reduces the risk of the data being questioned or amended, and allows controls to be dedicated to one data point rather than several.

It is also worth noting that, when it comes to developing a data map, the challenge faced by many organisations is the sheer quantity of data they handle. Locating all of it, along with the associated data flows, would be an enormous and costly task. Prioritising is therefore a key element of mapping where data flows and lives.

A key tactic is to start with the data that is classified as highly sensitive because it would impact the organisation most severely should it be leaked. The location of the master data is identified, as well as any copies that exist. Communicating with the business team to understand what this data is and why it is used will help to determine its lifecycle flow.

Data with lower levels of sensitivity can then be tackled on a cost-benefit basis.
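One simple way to express that prioritisation, assuming a rough inventory with sensitivity scores and estimated mapping effort, is to order the mapping work so the most damaging data is tackled first; the assets and numbers below are made-up examples:

# Hypothetical inventory: (data asset, sensitivity score 1-5, estimated mapping effort in days).
inventory = [
    ("customer_payment_details", 5, 10),
    ("employee_hr_records", 4, 6),
    ("marketing_campaign_stats", 2, 3),
    ("public_price_list", 1, 1),
]

# Map the most sensitive data first; within a sensitivity band, start with the
# cheaper mapping work so lower-sensitivity data is handled on a cost-benefit basis.
prioritised = sorted(inventory, key=lambda item: (-item[1], item[2]))

for asset, sensitivity, effort_days in prioritised:
    print(f"{asset}: sensitivity {sensitivity}, est. {effort_days} days to map")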