Dick Zeeman, CDMP Master: "Good predictive models stand or fall on data governance."
Business Intelligence Consultant at DataTalents
Summary
Dick Zeeman, Certified Data Management Professional (CDMP) Master, emphasises in this article the crucial importance of data governance for the success of predictive models. He explains that a solid data governance framework is essential to safeguard the quality, reliability and relevance of data, which in turn improves the accuracy and effectiveness of predictive models. Zeeman highlights the value of a holistic approach to data governance, with collaboration between different stakeholders and departments at the centre, to deliver a robust and sustainable data management process.
Data has long ceased to be a by-product of processes; for many companies it is now a critical asset with significant value. But before you can do anything with that data, it needs to be of good quality, and that is often where the problem lies. That is reason enough to invest in data governance, and it is why Dick Zeeman of Data Talents took the Certified Data Management Professional (CDMP) training at Connected Data Academy, followed by the specialist trainings Data Quality and Data Governance. Thanks to his high exam scores and convincing CV, he is now a CDMP Master.
"The CDMP Master certificate is truly unique," says Erik Fransen of Connected Data Academy, part of Connected Data Group, an Open Line company. He is also a CDMP Trainer and secretary on the DAMA Netherlands board. "No public list is released, but as far as I know there are only a handful of CDMP Masters in the Netherlands." It is no surprise that many clients are eager for Dick's expertise. Data governance consultants are scarce, and consultants at Dick's level even more so.
Dick's interest in data quality stems from the many missed opportunities he sees in this area. He explains: "Data is more than a by-product of processes. You can extract more value from it than financially oriented BI reports alone. Think of predictive models that help you set up processes more efficiently. In many organisations, however, there is one big obstacle: data quality. If you are going to use data to steer processes precisely, you need to be able to rely on that data being correct."
Trainings
The importance of data quality was the reason Dick took, in addition to the CDMP Fundamentals training, two specialist trainings at Connected Data Academy: Data Quality and Data Governance. He says: "Central to our field is the DAMA DMBoK (Data Management Body of Knowledge). It is not the most exciting book to read. By taking the trainings at Connected Data Academy together with a few colleagues from Data Talents, you get a much better feel for how to apply the DMBoK in practice. The trainers continuously translate theory into practice, using examples that participants bring in. That makes the material come alive."
Besides picking up new knowledge, Dick also uses the trainings to refresh existing knowledge and put it in perspective. "I have been working in the data field for fifteen years. As with driving a car, you reach a point where you do things on autopilot. These trainings and the framework they use remind you why you are doing what you are doing. They start from a common foundation and a common language, which makes it easy to translate examples from one domain to another."
From data quality to data governance
If data is of mediocre quality, that is usually because governance is not properly in place, says Dick. "Governance means you create policy. That policy is needed, because data quality does not stand alone. It goes hand in hand with questions like: which data do we use in which processes? What is the quality of that data and what quality do we need to make predictions? Who owns which data? You need someone in the business with enough mandate to make improvements and resolve structural problems."
One important reason to make processes more data-driven is the labour shortage in many sectors. After all, the more efficiently you deploy your available staff, the fewer people you need. "You want to make processes predictable and you want them executed first time right," says Dick.
Predicting processes
Take a large logistics company as an example. To steer and predict its processes, it depends on a lot of data provided by external partners, such as suppliers and carriers. When one of those partners delivers incorrect data, the entire chain suffers: delays, extra costs and dissatisfied customers. Dick: "If, for example, a large shipment of incoming goods arrives a day later than you expect, you have scheduled staff at the wrong time, you cannot deliver orders to customers, and so on. That is why it is important to have a clear view of data quality. You get that view by measuring the quality. Then you can intervene on time, and you can also make structural improvements by making the right agreements with the partners."
This example shows clearly that data governance is not an IT exercise. "It is a business topic. It is about people, processes and behaviour."
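The measurement Dick describes for the logistics example could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Shipment` record, the partner names, the sample dates and the 0.9 threshold are all hypothetical and do not come from the interview. It treats an announced arrival date as "correct" only when it matches the recorded arrival date exactly.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Shipment:
    # All fields are hypothetical, chosen only for this illustration.
    partner: str             # supplier or carrier that provided the data
    announced_arrival: date  # arrival date as supplied by the partner
    actual_arrival: date     # arrival date recorded on receipt


def arrival_accuracy(shipments):
    """Return, per partner, the share of shipments whose announced date was correct."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for shipment in shipments:
        totals[shipment.partner] += 1
        if shipment.announced_arrival == shipment.actual_arrival:
            correct[shipment.partner] += 1
    return {partner: correct[partner] / totals[partner] for partner in totals}


def partners_below(accuracy, threshold):
    """Return the partners whose accuracy falls below the agreed threshold."""
    return sorted(p for p, score in accuracy.items() if score < threshold)


# Made-up sample data; a real measurement would read from operational systems.
shipments = [
    Shipment("Carrier A", date(2024, 3, 4), date(2024, 3, 4)),
    Shipment("Carrier A", date(2024, 3, 5), date(2024, 3, 6)),  # one day late
    Shipment("Supplier B", date(2024, 3, 4), date(2024, 3, 4)),
    Shipment("Supplier B", date(2024, 3, 7), date(2024, 3, 7)),
]

accuracy = arrival_accuracy(shipments)
flagged = partners_below(accuracy, threshold=0.9)
print(accuracy)
print(flagged)
```

A per-partner score like this gives the business a concrete starting point for the agreements with partners that Dick mentions. Which threshold is acceptable is a business decision, not a technical one.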
Finding balance
In addition to people and behaviour, data governance is also about rules, procedures and other things people generally dislike. That is why it is always important to look for balance, says Dick. "You do not want to add extra rules or controls; you want to make sure data is captured well at the source and then carried over automatically into other processes. That usually means redesigning processes and the IT systems that support them."
This immediately raises the next issue: costs. Adapting processes or systems always costs money. "But the returns are usually much higher," says Dick. The Data Governance training at Connected Data Academy therefore also covers building a business case. "Because ultimately you convince management with a business case," he says.
Tips
Another topic the training covers is scoping a project. As a rule, companies generate huge volumes of data and the teams in charge of data governance and data quality are too small to tackle everything at once. Dick therefore advises: "Start with the processes where you experience the most bottlenecks."
In the example above, those are the dependencies between the various collaborating parties: supplier, transport company, distribution centre. To better predict processes and reduce the risk of disruption, it is important to know how the data is used in the processes, but also who is responsible for it within the organisation. Dick: "For all those data sources, you appoint an owner who becomes responsible for the data quality. If the quality is insufficient, the data governance specialist helps the data owner come up with measures to raise it."
A second piece of advice Dick gives: "Try to get the organisation to see that data and technology are two different things. Data does not belong to an IT department; it belongs to the business. Of course there is a relationship between data and IT, but that starts with the data and never with the technology. In practice I often see organisations start with the technology. They appoint the product owner of an IT system as data owner. That is wrong. I also hear this regularly from colleagues at Data Talents, and the subject came up often during the trainings at Connected Data Academy. So be aware that it must be the business that owns the data. IT supports, but it certainly does not lead."