Auto-tech series - Cohesity: How diverse threads of automation tighten the data fabric

This is a guest post for the Computer Weekly Developer Network written in full by Mark Molyneux in his position as EMEA CTO for Cohesity – a firm known for its data management platform technologies.

Molyneux argues that automation is essential in enabling companies to secure and manage their exploding data volumes, regardless of where they originate.

He reminds us that data volumes are growing rapidly: according to a report from the Enterprise Strategy Group (ESG), for every 1TB of production data, companies need an additional 4TB of storage for secondary data, which they hold for privacy and other non-production reasons.

This often unclassified data is held for long periods (indefinitely in some cases) and fragmented across technology platforms… so how can automation help us move forwards?

Molyneux writes as follows…

IT teams are also flooded with warning messages from their security architecture [about unclassified data issues] that they cannot process. The sheer volume of informational messages and false alarms means that teams assess incidents poorly and, in an emergency, take limited or no remedial measures – and in some cases the wrong ones.

According to Forrester Consulting’s The 2020 State of Security Operations study, IT teams are, in effect, trying to put out a wildfire with a garden hose.

Automation will greatly reduce the workload for these teams, as IT systems will be able to react autonomously to potential threats and take important precautions without a member of the IT team intervening. Modern data security and management platforms use AI and ML to analyse snapshots of all data and issue a warning to higher-level SIEM platforms as soon as they detect an anomaly.
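As a rough illustration of the idea (not Cohesity's actual detection logic), a simple statistical check over snapshot change rates shows the basic shape of this kind of anomaly detection. The change-rate figures and the threshold are invented for the sketch:

```python
from statistics import mean, stdev

def detect_anomaly(history, latest, threshold=3.0):
    """Flag a snapshot whose change rate deviates sharply from the
    recent baseline (a simple z-score heuristic, for illustration)."""
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return False
    return abs(latest - mu) / sigma > threshold

# Hypothetical daily change rates (% of blocks changed per snapshot)
baseline = [1.2, 0.9, 1.4, 1.1, 1.0, 1.3, 1.2]
todays_rate = 27.5  # e.g. mass re-encryption of files by ransomware

if detect_anomaly(baseline, todays_rate):
    # A real platform would forward this to a SIEM (via syslog,
    # a webhook or similar), not just print it.
    print("ALERT: snapshot change-rate anomaly detected")
```

A production system would of course use far richer signals (entropy, deletion rates, access patterns), but the principle of comparing each snapshot against a learned baseline is the same.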

This can be, but is not necessarily, an indication of an attack. Nevertheless, actions and rules can be stored for every anomaly type, and copies of the affected production systems can be triggered automatically. This has huge advantages, both in protecting data and in recovering it cleanly.
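The "stored actions and rules" idea can be sketched as a simple lookup table. The anomaly categories and action names below are hypothetical, purely to show how each anomaly type maps to an ordered response:

```python
# Hypothetical rule table: map an anomaly classification to the
# automated responses a platform might take (names are illustrative).
RESPONSE_RULES = {
    "mass_change":   ["snapshot_affected_systems", "alert_siem"],
    "mass_deletion": ["snapshot_affected_systems", "lock_retention", "alert_siem"],
    "odd_access":    ["alert_siem"],
}

def respond(anomaly_type):
    """Return the ordered list of actions for an anomaly type,
    falling back to a SIEM alert for anything unrecognised."""
    return RESPONSE_RULES.get(anomaly_type, ["alert_siem"])

print(respond("mass_deletion"))
# A real platform would execute these actions (take an immutable
# copy, notify the SIEM) rather than just return their names.
```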

Isolated external cyber vault

If there are clear indications of an attack, fresh copies of the systems/data defined by resiliency categories can be generated immediately without anyone having to intervene. This data is already held in an isolated external cyber vault, with multiple copies and encryption, so IT teams can restore the data from there in the event of a disaster. This would recover into a clean room and be thoroughly assessed and cleansed as required, again largely through automation, to ensure it’s free of threat.
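The notion of "systems/data defined by resiliency categories" can be sketched as a tiering table that decides how many encrypted vault copies an automated response fires for a given system. The tier names, copy counts and RPO values are invented for illustration:

```python
# Hypothetical resiliency tiers: how many vault copies each class
# of system gets, and how fresh they must be (values illustrative).
RESILIENCY = {
    "tier-0-critical":  {"copies": 3, "rpo_minutes": 15},
    "tier-1-important": {"copies": 2, "rpo_minutes": 60},
    "tier-2-standard":  {"copies": 1, "rpo_minutes": 24 * 60},
}

def vault_plan(system, tier):
    """Build the copy jobs an automated response could trigger
    for one system, all targeting the isolated vault."""
    spec = RESILIENCY[tier]
    return [{"system": system, "copy": n + 1, "encrypted": True,
             "target": "isolated-vault"} for n in range(spec["copies"])]

jobs = vault_plan("erp-db", "tier-0-critical")
print(len(jobs), "encrypted vault copies scheduled")
```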

Furthermore, if it later turns out that the anomalies were actually caused by an attack, security teams can search the historical snapshots of the past weeks and months for fingerprints without having to touch the production systems themselves. In the timed snapshots, teams can locate the various attack artifacts and reconstruct the path of the intrusion.
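A minimal sketch of this fingerprint search, assuming indicators of compromise expressed as SHA-256 file hashes (the snapshot data and IOC below are invented; the hash shown is simply the SHA-256 of the bytes `b"test"`):

```python
import hashlib

# Hypothetical indicator of compromise: the hash of a dropped file.
IOC_HASHES = {"9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"}

def scan_snapshot(snapshot_files):
    """Return paths in one point-in-time snapshot whose contents
    match a known IOC hash. snapshot_files: {path: bytes}."""
    return [path for path, data in snapshot_files.items()
            if hashlib.sha256(data).hexdigest() in IOC_HASHES]

# Walk snapshots oldest-to-newest to find when the artifact first
# appears, reconstructing the timeline without touching production.
snapshots = {
    "2024-05-01": {"/bin/legit": b"ok"},
    "2024-05-02": {"/bin/legit": b"ok", "/tmp/dropper": b"test"},
}
for day, files in sorted(snapshots.items()):
    if scan_snapshot(files):
        print(f"first attack artifact seen in snapshot {day}")
        break
```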

With this knowledge, the abused weaknesses and gaps in the production system can be closed so that it can be restored and hardened.

Screening, cleaning… no intervening

Automation in modern data security and management solutions also helps to screen the growing mountains of data and to handle files correctly from a compliance and security perspective. Depending on the data type, actions can be defined automatically: personal data is encrypted and not allowed to leave certain storage regions, while strict access rights control who can even open it. These rules can then be enforced end to end, regardless of where the data is stored and without a user having to do anything manually.
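One way to picture such end-to-end rules is a policy table keyed by classification. The classifications, region names and access labels here are all hypothetical:

```python
# Hypothetical policy table mapping a data classification to the
# handling rules a platform could enforce (values illustrative).
POLICIES = {
    "personal":  {"encrypt": True,  "allowed_regions": {"eu-west"}, "access": "restricted"},
    "financial": {"encrypt": True,  "allowed_regions": {"eu-west", "us-east"}, "access": "finance-team"},
    "public":    {"encrypt": False, "allowed_regions": None, "access": "all"},
}

def placement_allowed(classification, region):
    """Check whether data of a given class may be stored in a region
    (None means no regional restriction applies)."""
    regions = POLICIES[classification]["allowed_regions"]
    return regions is None or region in regions

print(placement_allowed("personal", "us-east"))  # → False: must stay in eu-west
```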

Cohesity’s Molyneux: Automation can be applied at so many levels of the data management fabric – and it’s pretty much all good news for users (and systems too).


Eww, that data is stale

The rules can also be used to enforce retention and expiration dates for data.

Today, it’s down to the user to make the final decision as to what data records can be deleted. In the future, this process could happen automatically for information clearly identified as scrap.
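A minimal sketch of such expiry-driven deletion, assuming each record carries a creation date, a retention period and a "disposable" classification (all field names and values invented; a human-approval step could easily sit before the actual delete):

```python
from datetime import date, timedelta

def expired(records, today):
    """Return the names of records whose retention period has lapsed
    and which are classified as disposable ('scrap')."""
    return [r["name"] for r in records
            if r["disposable"] and today - r["created"] > r["retention"]]

records = [
    {"name": "tmp-export.csv", "created": date(2020, 1, 1),
     "retention": timedelta(days=365), "disposable": True},
    {"name": "contract.pdf",   "created": date(2020, 1, 1),
     "retention": timedelta(days=365), "disposable": False},
]
print(expired(records, date(2024, 1, 1)))  # only the disposable, out-of-date file
```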

The same also applies to archiving tasks. Automatic classification recognises data that companies have to keep in accordance with Relevant Records strategy and autonomously moves it to an appropriate archive.

Cost-related rules can also ensure that less important data is pushed into a slow but cheap archive, while data that users access frequently is automatically moved to fast but expensive storage resources. Automated control of data allows a company to remain compliant, secure and cost-effective, without the need for manual intervention.
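The tiering decision above reduces to a frequency threshold check. The tier names and thresholds here are invented to show the shape of the rule:

```python
def choose_tier(accesses_per_month, hot_threshold=10, cold_threshold=1):
    """Pick a storage tier from access frequency
    (thresholds and tier names are illustrative)."""
    if accesses_per_month >= hot_threshold:
        return "fast-expensive"
    if accesses_per_month <= cold_threshold:
        return "slow-cheap-archive"
    return "standard"

for freq in (50, 5, 0):
    print(freq, "accesses/month ->", choose_tier(freq))
```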