Owning the Data
According to a widely quoted study by Gartner, 85% of artificial intelligence (AI) projects will deliver erroneous outcomes. The reason, Gartner asserted, often comes down to issues with the source data. That sparks a couple of obvious questions: what exactly is good data, and how should companies and organizations go about capturing and managing it?
George Von Zedwitz-Liebenstein, the final guest in the first series of our Future Says interviews with leading professionals in AI, is well qualified to provide answers. As information and analytics lead of the financial services arm at Scania, the global manufacturer of commercial vehicles, he is at the forefront of the mobility revolution. The emergence of the connected vehicle is generating a tsunami of potentially valuable data, and AI is central to Scania’s vision for a truly sustainable transport system. As such, the company provides another vivid illustration of the power of big data and AI to do good.
At first glance, it might seem a little illogical to conclude series one with a discussion focusing on building the right foundations for AI. In fact, the interview with Von Zedwitz-Liebenstein very much expands on what has gone before. All our guests have stressed the need for delivery over wild ambition. Amidst all the hype surrounding AI, they have also emphasized the importance of the human dimension.
Von Zedwitz-Liebenstein insists that the most important task for any emerging AI and data-driven strategy is to establish a solid footing. Reflecting the findings of the Gartner study, that means providing good data. Achieving it is as much a question of attitude as technology.
Rather neatly, he describes data quality as a team sport. The absolute priority is to create clear ownership of data and ensure that a data-product mindset permeates the organization. In seeking to become a data-driven company, one of the more common pitfalls is to use a vast, enterprise-wide data lake as little more than a dumping ground. This strategy typically fails to deliver on its expected benefits, primarily because the issue of ownership remains unresolved.
A far more effective solution is to give data teams end-to-end responsibility for making data available. Only by creating clear and transparent ownership of data products will their value to the wider organization become evident and easy to exploit. Establishing these relationships between data and its owners also makes it far simpler to address key questions surrounding the privacy, sensitivity, and protection of that data.
Von Zedwitz-Liebenstein’s attitude to data ownership also informs his expectations for the future. The emergence of the data mesh will, he believes, be one of the defining trends in this field. Basically, this means applying a microservices approach; instead of slow, inflexible, and unresponsive data lakes and pipelines, organizations will be looking to create much more manageable data products that can be deployed quickly and independently by their lifetime owners. In other words, the speed and ease with which data can be leveraged is central to our original question: what makes good data?
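To make the contrast with a monolithic data lake a little more concrete, the sketch below shows one way a data product with an explicit, accountable owner might be described in code. It is purely illustrative: the `DataProduct` fields, team names, and example values are assumptions for the sake of the example, not details taken from Scania or from the interview.

```python
from dataclasses import dataclass
from datetime import date

# Illustrative sketch only: field names, team names, and values are
# hypothetical, not drawn from Scania's actual data platform.
@dataclass
class DataProduct:
    """A self-contained data product with a clearly named owning team."""
    name: str          # what the product is
    owner: str         # the team accountable for it end to end
    sensitivity: str   # e.g. "public", "internal", "personal"
    schema: dict       # the published contract consumers can rely on
    refreshed: date    # when the data was last made available

# Each owning team publishes and evolves its product independently,
# rather than dumping raw tables into a shared, unowned lake.
vehicle_telemetry = DataProduct(
    name="vehicle-telemetry-daily",
    owner="connected-vehicle-team",
    sensitivity="internal",
    schema={"vehicle_id": "str", "odometer_km": "float", "fuel_used_l": "float"},
    refreshed=date(2021, 6, 1),
)

print(f"{vehicle_telemetry.name} is owned by {vehicle_telemetry.owner}")
```

Because every product carries its owner and sensitivity alongside the data itself, questions of privacy, protection, and accountability have an obvious home, which is precisely the point Von Zedwitz-Liebenstein makes about ownership.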
Back in 2016, IBM estimated that bad data was costing the US economy $3 trillion a year. Whatever the veracity of that claim – and how much it may have changed over the last five years – there can be no doubt that recognition of the importance of data is often not matched by the ability of enterprises and organizations to put it to good use. As Von Zedwitz-Liebenstein stresses, it may well be that they have simply failed to build the foundations for success.
Watch the whole interview with George here and catch past and future episodes of Future Says here.