By Samantha Barnes, International Banker
By now, most of us will have encountered a comparison between data and oil, with the former being touted as “the oil of the 21st century”, or words to that effect. Indeed, the insight and utility now being gleaned from data, and the fact that it increasingly represents the lifeblood of a diverse range of industries, make the comparison to black gold far from unwarranted. But while oil can be transmitted through pipelines and shared with the world in a relatively straightforward manner, the same can’t always be said for data.
The growth of the digital economy means that sharing data between various stakeholders, whether connected or not, has become crucial for driving value in terms of intra-organisational efficiency, inter-organisational standards and practices, and even solving problems and improving living standards for the wider public. But a number of roadblocks stand in the way of these goals, preventing the free, unencumbered flow of data between parties. While implementing effective data-analytics techniques is frequently discussed as the key challenge in extracting useful information from raw, unstructured data, a more immediate but arguably less highlighted obstacle comes earlier: actually gaining access to the data in the first place.
Perhaps the biggest obstacle that entities face in this regard is the inaccessibility of data held in “silos”—that is, the closed walls around data that ultimately make obtaining it a costly, resource-intensive process. Such insular systems are typically unable to interoperate with other systems and thus prevent important data from being shared. This creates gross inefficiencies across organisations, as management is denied access to the data from all the business divisions that it needs for comparative analysis. “Silos are nothing more than the barriers that exist between departments within an organization, causing people who are supposed to be on the same team to work against one another,” business-management expert Patrick Lencioni wrote in his book Silos, Politics and Turf Wars: A Leadership Fable About Destroying the Barriers That Turn Colleagues Into Competitors. “And whether we call this phenomenon departmental politics, divisional rivalry, or turf warfare, it is one of the most frustrating aspects of life in any sizable organization.”
Silos arise for multiple reasons. For one, the structure of an organisation can be wholly inhibitive to data sharing. With software applications often designed to support a very specific business outcome at one point in time, for example, the potential sharing of data as a long-term benefit of the design is rarely prioritised or, indeed, given any thought at all. Common organisational structures also create silos between business divisions, as each one is often set up to manage its own data under its own IT (information technology) infrastructure and with its own policies governing that data. Such a culture is decidedly unconducive to the free inter-divisional sharing of data.
The common problem of “vendor lock-in” is also a major hindrance to data sharing. It arises because software vendors keep the data contained within their applications proprietary and intentionally difficult to export or share. With software-as-a-service (SaaS) applications, this becomes even more problematic, as the vendor will invariably work to keep users operating within its cloud ecosystem. And with different organisational divisions using different solutions and technologies to manage data in their own ways, data is inevitably pushed into silos, becoming increasingly detached from other systems.
A whole host of other issues continue to prevent data sharing, both within organisations and across industries. In the world of research, for example, the sharing of research data is crucially important in the pursuit of more robust conclusions and more productive research studies. But according to a 2018 report from academic-publishing firm Springer Nature, respondents involved in frequent data sharing identified five key challenges that hamper their ability to share data effectively: organising data in a presentable and useful way (46 percent), uncertainty about copyright and licensing (37 percent), not knowing which repository to use (33 percent), lack of time to deposit data (26 percent) and the costs of sharing data (19 percent). “While we continue to see researchers increasingly sharing data, the majority of the research community are not yet managing or sharing data in ways that make it findable, accessible or reusable. The utopia of findable, accessible, interoperable and reusable (FAIR) data is still some way off,” according to Grace Baynes, Springer Nature’s vice president of research data and new product development.
The company has since published a white paper in which it emphasises five key factors to accelerate the process of data sharing:
- Clear policy: from funders, institutions, journals, publishers and research communities. Establishing clear standards for data management and sharing will lead to a shift in researcher behaviour.
- Better credit: formally recognising researchers who share data, through citations, authorship, inclusion in research assessments and career advancement.
- Explicit funding: for the management, sharing and publishing of data, which, in turn, should incentivise more data sharing.
- Practical help: for organising data, finding appropriate repositories and providing faster, easier routes to share data.
- Training and education: to boost knowledge around data sharing, encourage best data practices and address common areas of concern.
Privacy issues are also front and centre in stakeholders’ minds when sharing data. In the finance industry, for example, data can be hugely important to financial institutions in terms of providing their customers with higher-quality services and enhancing the robustness of their security for preventing fraud and money laundering. But concerns over the privacy and security of data mean that financial institutions remain reluctant to share data they believe may end up in the wrong hands, whether those of a competitor or even cyber-criminals.
Along with the World Economic Forum (WEF), Deloitte has identified a series of privacy-enhancing techniques (PETs) that have the potential to eliminate the privacy risks of sharing data and “allow institutions, customers, and regulators to analyse and share insights from data without distributing the underlying data itself”. These techniques are:
- Differential privacy, which adds noise to analytical systems to prevent the inputs from being reverse-engineered;
- Federated analysis, whereby parties share the insights from their analyses without sharing the data itself;
- Homomorphic encryption, through which data is encrypted before it is shared, such that it can still be analysed but not decoded into the original information;
- Zero-knowledge proofs, whereby users can prove their knowledge of a value without revealing the value itself;
- Secure multiparty computation, in which data analysis is spread across multiple parties such that no individual party can see the complete set of inputs.
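To make these techniques more concrete, the sketch below illustrates two of them in Python: a Laplace-noise mechanism of the kind used for differential privacy and additive secret sharing, a basic building block behind secure multiparty computation. This is a minimal, illustrative sketch under assumptions of my own (the function names, the three-bank scenario and all parameter choices are invented for the example), not a production-grade implementation of any institution’s system.

```python
import math
import random

# --- Differential privacy: add calibrated Laplace noise to a count query ---
def laplace_noise(scale):
    """Draw one sample from a Laplace(0, scale) distribution via inverse CDF."""
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1, u) * math.log(1 - 2 * abs(u))

def private_count(true_count, epsilon):
    """Release a count (sensitivity 1) under epsilon-differential privacy."""
    return true_count + laplace_noise(1.0 / epsilon)

# --- Secure multiparty computation: additive secret sharing ---
MODULUS = 2**61 - 1  # arithmetic is done modulo a large prime

def share(value, n_parties):
    """Split an integer into n random shares that sum to value mod MODULUS."""
    shares = [random.randrange(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    return sum(shares) % MODULUS

# Hypothetical scenario: three banks each hold a private transaction count;
# the aggregator learns only the total, never any single institution's figure.
counts = [120, 75, 300]
all_shares = [share(c, 3) for c in counts]
# Party i collects the i-th share from every bank and publishes a partial sum.
partial_sums = [sum(s[i] for s in all_shares) for i in range(3)]
total = reconstruct(partial_sums)  # 495, with no individual count revealed
```

In the secret-sharing half, each party only ever sees uniformly random values, yet the published partial sums reconstruct the exact total; in the differential-privacy half, a smaller epsilon means more noise and therefore stronger protection against reverse-engineering the inputs.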
It’s not only at the corporate level that the need for better data-sharing capabilities has become important. Intergovernmental data sharing can be crucial in developing harmonised global solutions and generating value for the public, particularly during times of crisis, such as the ongoing COVID-19 pandemic. Ultimately, this could enable a wholly new method of public service, which would utilise both the power of data sharing at the back-end and the capability to simplify the transaction process at the front-end through digitalisation.
But intergovernmental data sharing is far from being adopted on a widespread basis, according to the recent report “Silo Busting: The Challenges and Success Factors for Sharing Intergovernmental Data”, published in December 2020 by the IBM Center for the Business of Government and Jane Wiseman of the Ash Center for Democratic Governance and Innovation at the Harvard Kennedy School. “The COVID-19 pandemic has clearly demonstrated the importance and value of being able to share data quickly between levels of government,” IBM’s executive director, Daniel J. Chenok, and general manager, Timothy Paydos, stated in the report. “Even with the stumbles that have occurred in standing up a national system for sharing pandemic-related health data, it has been far more successful than previous efforts to share data between levels of government—or across government agencies at the same level.”
The report also highlights the work of the Commonwealth of Virginia, which was able to quickly provide a COVID-19 dashboard as a result of creating a data-sharing platform that integrated public safety, public health and other data in response to its opioid crisis, as well as Allegheny County in Pennsylvania, which built a data warehouse to prioritise service delivery where most needed. “Individual level data has led to development of risk modelling tools for child welfare and for homeless service delivery,” the report noted.
As such solutions demonstrate, data-sharing problems can’t be solved by any single stakeholder alone. Solving them will require community-wide and industry-wide solutions and collaboration among governments to encourage widespread sharing across organisational and geographical boundaries. Ultimately, working together will make the biggest difference in addressing individual stakeholders’ needs and preferences and in ensuring that ethics, privacy and regulatory concerns are suitably addressed.