By Andrew Smailes

Instead of trying to understand the root cause, we replace the technology and hope this time around it will work out.

But technology is seldom the real obstacle. In most instances, human nature is:

  • In deciding on the scope of a data initiative and how clearly it is defined.
  • In having opinions about what the world looks like and thus what the data will tell us.
  • In overestimating one’s skill in interpreting information.
  • In not accepting that voicing what information you need is a continuous conversation that demands close collaboration with IT-minded people and data-savvy analysts at all times.
  • In preference for certain technologies based upon limited experience or slick marketing stories.
  • In underestimating the complexity involved in the automation of data aggregation at scale.

These obstacles can be overcome. Assignment after assignment, I have battled perceptions and trained users to ask the right questions.

But the most fundamental fact is that data is not objective.

Data is the breadcrumb trail of human activity. Like a real breadcrumb trail, it never shows the full picture of what happened. What we see in data represents us, with all of the idiosyncrasies that we expose. And often, the data records the activity, but not the context in which the activity took place. There is a lot of information missing from the information at our disposal.

The essence of 22 years of experience

When I started out, fumbling with data was a niche market. We were regarded as the sorcerer’s apprentice. These days, data makes the headlines.

A number of impediments have remained constant.

We lack any form of formal education in data. The current boom is making a familiar mistake: analytics is not data management, data scientists do not do data management, and the most valuable information for decision making is in the unstructured data that is not in your databases or data warehouses.

The data industry is a hobbyist playground.

The wheel is reinvented over and over again. If there is training, the focus is on technical skills. This is not without consequences: there are very few people around who really understand what data is, what can be done with it, and what the limitations are.

Snake oil sells well in our industry. There is an endless supply of thought leaders, consultants, and architects who will sell you silver-bullet solutions for persistent problems. These problems are very often inherent to data management practices, for which technology is not the answer.

Technology for data processing is still in its infancy. Though we have seen an explosion of alternative solutions, not many survive. Most have been invented because the variety of available data formats has exploded, not because they solve a strategic business issue.

We have just begun to find out what works, through experimentation.

Many home-grown open source solutions, even made by the brightest minds in this industry, have started out as a solution for a problem for which no solution was available. Others sought to fill the gap when market leaders could not adapt to changing requirements quickly enough.

The volatility of the last 20 years in this industry has surfaced the same challenges, at different places, at the same point in time. People started to work out answers, in parallel, independent of each other. Time weeds out which solutions work best. We are still in the weeding phase.

We do not have a clue what to do with the staggering amounts of data we have amassed. We do not even know where there is value in it, if at all. All we know is that capturing and maintaining that huge pile of data is costly. The more the cost of storage and processing technology drops, the more we amass. The bottom line: I do not think we create that much more value than we did 22 years ago.

Towards a brighter future?

Is this a gloomy outlook on the state of affairs? No. My motivation is to improve the state of affairs in the data industry and to provide people with the tools to improve their skills in the complex subject that is data. My premise is that if you want to improve, you should put the focus on what needs to be improved.

I could reminisce over all the hypes that have risen and fizzled out. But those hypes are all necessary to figure things out. It is an evolution in a pattern of two steps forward and one step back. And the steps are smaller than you might think. What people actually do with data, even in the sexiest data-driven companies on this planet, is not very different from what we did 22 years ago.

What would we need to progress faster and create more value with the technological capabilities that are at our disposal?

There are two important actions:

We need to have better curricula for teaching students at our schools and universities. Data is not a field of IT or mathematics. It belongs in the humanities and requires a strong IT foundation. Much of the insight gained by professionals over the last decades is not reflected in what is currently taught.

Recognize that it is humans who consume information, for human purposes. Making decisions is an inherently human activity, even when humans have programmed a computer to do so. It requires human knowledge to create information that is meaningful. I see a slow shift from a technology-focused practice to a people-focused practice, but that movement needs a louder voice to let the message that data is a people industry resonate.

Will the future be brighter?

Time will tell. I am neither a futurologist nor a fortune teller. What I do know is that we have a wealth of technical means at our disposal. It is certain that new technologies will appear on the horizon, that existing technologies will be improved, and that algorithms and the means to create those algorithms will become more sophisticated. We have to connect the human dimension to the data industrial complex and find more balance in organizing our collaboration and finding the fitting technologies to support that collaboration.