The evidence of data
I spend much of my working day talking about data, which is not surprising as I work in the cloud data services industry! However, with the pandemic and the need to support working from home, it seems that not only the industry, but the rest of the world has started talking about data too.
In the UK, we have had months of scientists referring to data about the coronavirus pandemic and making predictions on infection rates and the resulting impact on our health service. We have had senior government officials stating that decisions of national importance will be made on the “evidence of the data”. It has got to the point where almost every news bulletin has an endless stream of graphs and other data points – seemingly to reinforce this point.
The UK Government has also recently published a consultation paper in the form of the National Data Strategy. This talks about the goals of the government to foster an ambitious, pro-growth strategy that drives the UK to build a world-leading data economy. Similarly, the Department of Health and Social Care (DHSC) has stated that improving data collection within the NHS would help staff improve patient care and improve efficiency.
Many people are happy to accept that data is now shaping our lives in this way. They willingly agree to the terms of service that appear fleetingly as they download a new app, accepting that their data may be used for commercial purposes because they value the service or capability that a particular app or online service provides. However, there is also a heightened general awareness of the potential for personal data to be stolen through ransomware and broader cyber security threats. This requires us all to find an acceptable balance between risk and opportunity – a task which is made ever harder by the increasingly sophisticated nature of these threats and the growing value attributed to electronic information.
Data makes all the difference – but how?
The dialogue around data is now much more of a business-led rather than technology-led discussion. How can we use data to save on costs or improve our service to customers? Can we use artificial intelligence or machine learning to harness the hidden value of our data to gain competitive advantage? How can we protect our intellectual property and personally identifiable data against theft to avoid reputational damage and risk of financial penalties? All this is set against the context of organisations exploring how the cloud and data-intensive computing can improve service levels and agility whilst removing the burden of legacy technical debt.
In the public sector, there are also growing calls for data to be shared more effectively within and between government agencies to improve service for citizens and businesses. Indeed, the Data Saves Lives whitepaper, published in June 2021, states that “data made all the difference” in combatting the coronavirus pandemic, and one senses a desire across government to continue in this vein going forward. The paper sets out fundamental principles and a mandate for the Secretary of State for Health and Social Care to define how data should be collected and stored, how it will flow through the system in a usable way, and when and how it should be accessed.
However, one fundamental question keeps cropping up in my mind when I hear and read about such topics: Do we have a common understanding of what is meant by the term “data”?
IT professionals often talk about storage efficiency, data protection, uptime and availability – words which do not crop up in these more business-led strategic conversations. This has led me to conclude that these conversations are really about “information” in a more generic sense. Whereas data is measured in bits and bytes that do not carry any specific meaning, information is presented in a meaningful context and in a way that is useful for decision-making. More pertinently, information depends on data, but data does not depend on information.
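To make the distinction concrete, here is a minimal Python sketch. The byte values, the `describe` helper and the "sensor reading" context are all hypothetical, chosen purely for illustration: the same four bytes yield different values depending on the context applied to them, and only become information once that context is agreed.

```python
import struct

# Four raw bytes: on their own they carry no specific meaning.
raw = bytes([0x42, 0x28, 0x00, 0x00])

# The same bytes read under three different (assumed) contexts.
as_integer = struct.unpack(">I", raw)[0]  # big-endian unsigned 32-bit integer
as_float = struct.unpack(">f", raw)[0]    # big-endian IEEE 754 single-precision float
as_text = raw[:2].decode("ascii")         # first two bytes as ASCII characters


def describe(raw_bytes: bytes) -> str:
    """Turn data into information by applying a hypothetical agreed context."""
    value = struct.unpack(">f", raw_bytes)[0]
    return f"Sensor reading: {value:.1f}"


print(as_integer, as_float, repr(as_text))
print(describe(raw))
```

The bytes never change. What changes is the interpretation layered on top of them, which is why the information depends on the data and not the other way round.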
So where does this leave us?
Firstly, context is all-important when referring to data. We need to be sensitive to the fact that different individuals and organisations have different terms of reference for their use of the word, and so we need to check understanding when commencing a conversation about data. As IT professionals we should relish the fact that the world at large is more reliant on the services we architect and deliver, even though the people who use those services rightly care little about how they are built and run. What they care about is having access to the information they require to make effective decisions – our job is to orchestrate the underlying infrastructure and data services to deliver the required data in such a way that it can be presented as information via the customer’s preferred application suite.
Secondly, accept that data sharing will be increasingly commonplace in the future and that new access control and security methods will be required to maintain control across increasingly complex hybrid cloud ecosystems. Consider how immutable copies of data can be used to achieve these goals without introducing additional risk, and how storage-efficient technologies such as cloning can reduce waste by avoiding the need to create multiple full copies of data.
Thirdly, and most importantly, we should think about building a data fabric. This presents the opportunity to “decouple” the dependencies between applications, data centres and the underlying storage, freeing us to think more holistically about how we manage the entire lifecycle of data. Done well, it provides greater protection against cyber security threats and allows a “zero trust” model to be introduced to safeguard against wider data loss. It will enable individuals and organisations to maintain control of their data while capitalising on the power and flexibility of the public cloud, and so accelerate their data-driven future.