How to manage the risk of the human factor in database administration

Fail to manage the risk of the human factor in database administration and you court disaster. Should DBAs, like air traffic controllers, work in pairs?

RAID. Replicas. Mirroring. Clustering. Backups. Transactional replication. SAN replication. Dual comms links. Tier 4 datacentres. UPS. ITIL. Change management.

If you've worked anywhere near a database in the last 30 years, you'll be familiar with these terms, writes Jon Reade. Many of these technologies and working practices are also used to ensure reliability and business continuity for other critical corporate services, such as websites and email. They cost businesses a small fortune and without them Slough would not exist. But they are deemed worthy, because the alternative is simply too awful to think about.

What is that alternative? Without accurate, available data, most modern, information-based businesses would quickly go under. Data is their lifeblood. Data that resides in databases. Administered by DBAs.

And here’s the snag. Often, there’s only one DBA in an organisation. Only one air traffic controller in the tower, by analogy.

Few people would dispute that regular long hours cause tiredness. Running a team so ragged that one person picks up all of the support, all of the time, causes tiredness too. Unrelenting tiredness in turn causes people to make mistakes. Fact. Mistakes take time and money to recover from. Sometimes, those mistakes cannot be recovered from: the losses are of such magnitude that the business ceases to exist.

So why, I wondered, do a large percentage of the clients I have worked for address the risk of technology failure, but not human failure?

In the last twenty years I’ve worked for some great technical people. Many understand bathtub curves, MTBF and probability of failure. They understand risk and, from a process point of view, manage it well.

Yet they don’t always “get” people. In their risk calculations, they ignore them.

The human factor

It is for these reasons I like to work with managers who recognise the human factor. Some are fantastic with technology, some more hands off. But the good ones share two common factors which I now look for when I choose to work with someone.

First, they exhibit self-awareness. They are aware of what long, unrelenting hours, with little holiday, working weekends and no lunch breaks, do to a human being. That includes the macho types who kid themselves that 90-hour weeks can be brushed off with no ill effect. These managers know that everyone gets ground down in that situation.

Secondly, they exhibit empathy. Unfortunately, management attracts a small percentage of narcissists and sociopaths. These types are devoid of empathy and are best not hired in the first place. They present a risk to the organisation, because they simply do not understand people. Instead of trying to understand the root human cause of a problem, they transfer any problem that dents their persona to their staff. They ask what their staff can do for them.

And yet the best managers I have worked with always think the opposite, namely: What can I do to support my staff?

This latter group tend to share other common characteristics. They are knowledgeable, more humble people who praise their staff when they do well, attempt to understand “why” when things go wrong, and don’t steal the limelight when things go right. Their support is visible and tangible to the people they manage. They speak up for their staff when those staff exercise their right to a holiday or a weekend.

Oddly, these managers find their staff will do more for them when the chips are down. Possibly because they are treated respectfully as human beings and are not persistently on their last legs, they have something left in reserve.

When I manage teams of DBAs, I keep a close eye on two things: excessive hours, and knowledge concentrated in single team members. Both present a risk to the business. My team cross-trains where practical, which reduces risk by allowing the rotation of staff on critical projects. Where there is only one DBA, I share the technical load, bringing myself up to speed, fast.
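
For readers who want to make those two checks concrete, here is a minimal sketch in Python. It is an illustration only, not a tool described in this article: the function names, the 48-hour threshold and the sample data are all assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical threshold: average weekly hours above which someone is flagged.
MAX_WEEKLY_HOURS = 48


def single_points_of_knowledge(skills):
    """Return a mapping of each system that only one person can support
    to that person's name."""
    owners = defaultdict(set)
    for person, systems in skills.items():
        for system in systems:
            owners[system].add(person)
    return {
        system: next(iter(people))
        for system, people in owners.items()
        if len(people) == 1
    }


def overworked(hours, limit=MAX_WEEKLY_HOURS):
    """Return, in alphabetical order, the people whose average weekly
    hours exceed the limit. People with no recorded weeks are skipped."""
    return sorted(
        person
        for person, weeks in hours.items()
        if weeks and sum(weeks) / len(weeks) > limit
    )


# Illustrative, made-up data: who can support which system, and hours per week.
skills = {
    "alice": {"billing_db", "warehouse"},
    "bob": {"warehouse"},
}
hours = {"alice": [62, 58, 65], "bob": [40, 42, 39]}

print(single_points_of_knowledge(skills))  # systems with a single owner
print(overworked(hours))                   # people over the threshold
```

In this invented example, the billing database depends on one person, and that same person is working the longest hours, which is exactly the combination that cross-training and staff rotation are meant to remove.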

If I didn’t, the way I manage my team would itself present a real risk to my employer. My staff would get demoralised. They would become tired and more error-prone. They would not have time to cope with their life outside of work. Eventually they would make a mistake, or leave.

The result: I would spend most of my working life on the phone to recruiters, ploughing through endless CVs trying to find someone of equal technical quality to the person I had lost. It would not be an easy task. The replacement would not have the business knowledge their predecessor had. When I did find them, it would cost my employer a fortune in recruitment fees. With luck, they would be able to start before their predecessor left; having no handover would compound the risk.

In summary, if you want to keep valuable people, don’t treat them like machines.

They don’t RAID, mirror or cluster well.

And when they do fail over, it’s not pretty.


About the author

Jon Reade is a DBA manager and SQL Server 2012 consultant. He has worked in financial services and has a master’s degree in business intelligence from Dundee University.
