Raedan AI

Reflections on Data Design: Part 2

Mastering data

Instead of trying to understand the root cause of the issue, we replace the technology and hope this time around it will work out.

Technology is rarely the real obstacle; in most instances, human nature is the real culprit. Here are some examples where human perspective can be the limiting factor:

  • In decision-making on the scope of a data initiative.
  • In having opinions about how the world works and therefore what the data will tell us.
  • In assessing your own skill to interpret information.
  • In recognising that voicing what information you need requires a continuous conversation and demands close collaboration.
  • In choosing between the advice of like-minded people and data-savvy analysts.
  • In preferring certain technologies based upon personal experience (or slick marketing).
  • In underestimating the complexity involved in data aggregation at scale.

These obstacles can be overcome. Assignment after assignment, I have battled these perceptions and trained executives in asking the right questions.

The most fundamental fact is that data is not objective.

Data is the breadcrumb trail of human activity. Like a real breadcrumb trail, it never shows the full picture of what happened. What we see in data represents us, with all of the idiosyncrasies that we expose. And often, the data records the activity, but not the context in which the activity took place. There is a lot of information missing from the information at our disposal.

The essence of 22 years of experience

When I started out, fumbling with data was a niche market. We were regarded as the sorcerer’s apprentice. These days, data makes the headlines.

A number of impediments have remained constant.

We lack any form of formal education in data. The current boom is making a familiar mistake: analytics is not data management, data scientists do not do data management, and the most valuable information for decision-making is the unstructured data that is not in your databases or data warehouses.

The data industry is currently a hobbyist playground.

The wheel is reinvented over and over again. If there is training, the focus is on technical skills. This is not without consequences. There are very few people who really understand what data is, what can be done with it, and where its limits lie.

Snake oil sells well in our industry. There is an endless supply of thought leaders, consultants, and architects who will sell you silver-bullet solutions for a wicked problem. These problems are very often hard-wired into your data management culture, for which technology is not the answer.

Technology for data processing is still in its infancy. Though we have seen an explosion of alternative solutions, not many survive. Most were invented because the variety of data formats has exploded, not because they solve a strategic business issue.

We have just begun to find out what works, through experimentation.

Many home-grown open source solutions, even made by the brightest minds in this industry, have started out focused on a problem for which no solution was available. Others sought to fill the gap when market leaders could not adapt quickly enough.

The volatility of the last 20 years in the data industry has surfaced the same challenges in different places at the same point in time. People have had to work on answers in parallel, independently of each other. Time weeds out which solutions work best. We are still in the weeding phase.

To be truthful, the boardroom and the C-suite do not have a clue what to do with the staggering amounts of data amassed. And their advisers do not even know where the value in it is, if there is any at all. All we know is that capturing and maintaining that huge pile of data is costly. The more the cost of storage and processing technology drops, the more we store. The bottom line: I do not think we create that much more value from data than we did 22 years ago.

Towards a brighter future?

Is this a gloomy outlook on the state of affairs? No. My motivation is to improve the state of the data industry and provide people with the tools to improve their skills in the complex subject that is data. My premise is that if you want to improve, you should put the focus on what needs to be improved.

I could reminisce over all the hypes that have risen and fizzled out. But those hypes are all necessary to figure things out. It is an evolution in a pattern of two steps forward and one step back. And the steps are smaller than you might think. What people actually do with data, even in the sexiest data-driven companies on this planet, is not very different from what we did 22 years ago.

What would we need to progress faster and create more value with the technological capabilities that are at our disposal?

There are two important actions:

  • Educate: We need better curricula for teaching students at our schools and universities. Data is not a field of IT or mathematics. It belongs in the humanities and requires a strong IT foundation. Much of the insight gained by professionals over the last decades is not reflected in what is currently taught.
  • Humanise: Recognize that humans consume information for human purposes. Making decisions is an inherently human activity, even when humans have to program a computer to do so. It requires human knowledge to create information that is meaningful. I see a slow shift from a technology-focused practice to a people-focused practice, but that movement needs a louder voice if the message that data is a people industry is to resonate.

Will the future be brighter?

Time will tell. I am neither a futurologist nor a fortune teller. What I do know is that we have a wealth of technical means at our disposal. It is certain that new technologies will appear on the horizon, that existing technologies will be improved, and that algorithms and the means to create those algorithms will become more sophisticated. We have to connect the human dimension to the data industrial complex, find more balance in organizing our collaboration, and find the fitting technologies to support that collaboration.

Article by Andrew Smailes