May 2021 Month Notes

Welcome to another set of (slightly delayed) month notes! Here are my previous month notes from April if you want the “story so far”.

ODI

Strategic work

Writing month notes always makes me think about both how much I do and how little I achieve. I’m usually surprised by how much I’ve done – the number of things I’ve been involved with, the number of conversations I’ve had and people I have learned from and/or touched in my own way. But I also always have mixed feelings about how much I’ve achieved: things just seem to move more slowly than I expect them to, and by the time something is actually done or has an impact, it feels somehow smaller or harder to recognise, because it is the culmination of something that has taken much longer. It’s rare that I feel I’ve achieved something meaningful within the space of a single month.

During May I wrote a much longer reflection: a final report for Luminate on the period during which ODI received its previous grant, the three years 2018-2020, bracketed by the Facebook/Cambridge Analytica scandal and the Covid-19 pandemic. It was interesting reflecting on how the narrative and interests of the ODI (and the wider data community) have changed over that time. I brought out the growing tension and polarised thinking between openness and trustworthiness; the importance of and interest in data institutions; and, with Olivier’s help, the attraction and limitations of data ethics (something explored in the report we commissioned from Consequential on the next generation of data ethics tools).

Reflecting on how we work, I highlighted the programmatic approach we’ve started using. I also explored some of the things that we’ve found particularly impactful: working on problems that matter, oriented around the SDGs; developing practical advocacy tools and the training and consulting support offers that help organisations embed them in their practice; using stimulus funds to create peer networks, provide financial and non-financial support to projects of interest, and to learn from practical experience. I also highlighted our approach to working with commercial businesses, partnering with other organisations to achieve scale and complement our expertise, and our focus on communications that enables us to reach other audiences. Finally, I talked about how we are approaching improving our diversity, equity and inclusion as an organisation.

And guess what, even with a three year span, I still felt surprised by how much we’d done and have mixed feelings about what we’ve achieved! Perhaps it’s just in my nature to never be satisfied at work (in fact, someone said more or less that to me this month).

Diversity, equity and inclusion

I’ve written over the past couple of months about how I’ve been trying to push forward our work on a diversity census for the organisation. This month this work got passed up to the Board and Ethics Committee to look at, and to make the balancing decisions between the utility of collecting such data and the risks posed by doing so.

To help inform the discussion, I went through each of the characteristics that we’d identified as potential things we could measure, and classified them using a RAG (red/amber/green) rating against each of the following properties:

special category data – these require particular scrutiny; some characteristics are clearly so, others are proxies or related to special category data (eg data about neurodivergence could be seen as health data) and some unrelated
properties to do with the purpose to which the data might be put:
- protected characteristics – those where there are legal obligations, which we should therefore keep an eye on
- hiring – those where we might conceivably take positive action in our hiring practice (eg in choosing where to advertise, or mandating diversity in our shortlisting) to improve our diversity
- policies – those where the diversity of the team might influence our internal policies or practices
- recognition – those where asking questions about these characteristics might make people feel seen, recognised and welcomed for who they are
properties to do with the harmful impacts collecting the data might have:
- sensitive – those where people might be concerned about their particular response becoming widely known
- trigger – those where asking the question might trigger traumatic memories or feelings (eg asking about someone’s parents, when perhaps they are ill or recently deceased)
- UK bias – those where the question might make people not from the UK feel excluded because they contain assumptions about cultural context or make those from outside the UK feel “other”
- participation – those where it’s likely that people are already commonly known to have particular characteristics, so not seeing them present in the reported answers might reveal they didn’t complete the diversity census

We also ran another survey of the team, to pick apart how they felt about questions being included vs whether they would personally answer them, the vast majority of which was positive (but of course, what the majority thinks is not necessarily the point in DE&I activities). I’m still hopeful that I’ll be able to share more about the entire process, including the aspects that are really challenging in collecting this kind of sensitive, personal data, at some point in the future. It isn’t straightforward, or comfortable.

If you’re interested in DEI efforts within organisations, I found this thread and particularly this tweet thought-provoking.

Spending Review

If you are good at reading between the lines, you’ll have intuited from the government’s response to the response to the National Data Strategy consultation and the announcement of ODI’s programmes of work that ODI again secured government funding for its work, as part of the 2021/22 Spending Review. Since its initial £10m grant ended, the ODI has received about a third of its income over the last four years through an R&D programme grant from InnovateUK. Our new grant comes direct from DCMS, as part of its programme of work to implement the National Data Strategy.

One of my jobs is making a strong case for ODI to continue to receive government funding in the next Spending Review. In previous years, a lot of this has fallen to me, but this year I’m trying to empower the programme leads to take more ownership over their parts of ODI’s proposal, and Louise, our MD, on the bits about the general development of ODI. So, with the capable support of Kim, a new delivery manager at ODI, I’ve been spending time this month setting up an internal project to support our SR bid and engagement, with planning and milestones and everything.

Project work

After a quiet month last month, I’ve been involved in a number of projects this month:

I was one of the judges for the Microsoft Education Challenge
I’ve been supporting our work exploring challenges faced by data institutions in low- and middle-income countries for the World Bank
I was also involved in another piece of work with the World Bank on business data sharing in SE Asia
I’ve been helping Milly and Deb to shape some public policy work to support our data assurance programme, in particular to make the case for due diligence around the collection, use and sharing of data to be a part of regular corporate audits
I’ve also been supporting Milly and Gavin to scope a project to explore the UK data policy landscape
We managed to win a piece of work with the World Health Organisation with a very quick turnaround, to support a conference on health data governance that they’re planning at the end of June
I’ve had great fun working with Joe and Jack on the next stage of a data institutions register that they’ve been working on for a while

Interesting conversations

This month I’ve chatted with:

Elena Simperl about the state of open data and the new work that she’s doing on the European Open Data Portal
Catherine at Creative Commons
Tania at 360 Giving
Diane Coyle at the Bennett Institute
people at the Legal Services Board as they plan their next steps around influencing or building data infrastructure in the legal sector
Schmidt Futures about a new initiative they’re putting together
Lisa Quest from Oliver Wyman about the future of data and tech regulation
Various people at DCMS about their and our progress and plans
people in the Data Standards Authority about embedding data into the service manual
The Rockefeller Foundation about some projects we’re hoping they’ll fund
Our Future Health to work through some of the issues to do with potential cross-border data transfers
MHCLG, having been invited by the Leasehold Knowledge Partnership, about the availability of data to support the post-Grenfell remediation. This was a really interesting conversation because it highlighted how sensitive non-personal data can be (the locations of buildings with Grenfell-style cladding need to be protected because of arson threats), and the challenges posed when central authorities want to have access to data but there’s no quid pro quo (only problems) for those they want to supply it
Swee Leng at Luminate
Simon McDougall for a final chat before he leaves ICO

Not included in the above, to protect identities, were four conversations I had with different people (outside ODI) who were all in the middle of difficult life choices about their careers and wanted some advice / a listening ear, which I was very glad to provide.

I was interviewed as part of research by:

W.I.R.E., a Swiss think-tank, about the future of health data governance, as they are planning an event on the topic later in the year
researchers carrying out an evaluation of the Next Generation Services Industrial Strategy Challenge Fund

Finally, Lisa Allen joined us at ODI this month! She’s already making her mark in a wonderfully proactive way, and has brought great insight into where things are at for those implementing data strategies inside government.

GPAI

At GPAI, within the Data Governance Working Group, we have defined our two projects:

Enabling data sharing for social benefit through data institutions which will look to do the fundamental work to support the creation of data institutions / data trusts (we’re being deliberately ambiguous at the moment) by GPAI.
Advancing research and practice on data justice: preliminary guidance for AI developers, policy makers and communities affected by the use of AI which aims to help practitioners and users to include considerations of equity and justice in terms of access to, and visibility and representation in, data used in the development of AI/ML systems.

I’m honestly bowled over by the amount of commitment and expertise within the group and really excited about being able to take these pieces of work forward. We only have a little budget, but raising the international profile for these pieces of work feels important.

We’re currently seeking consultants to help us deliver this work, so please, if you’re interested and it suits you, take a look at the terms of reference for the data justice and data trusts projects, and please share these with others you think might be interested. Tweet thread is here if the best way to do that is to retweet it.

We’ve also gathered volunteers in the group to support work being led by other Working Groups, specifically around:

Climate action, led by the Responsible AI Working Group
Drug discovery, co-led by the Pandemic Response and Responsible AI Working Groups
Pressing new IP issues, led by the Innovation and Commercialisation Working Group
Social media, led by the Responsible AI Working Group

Other work

The beginning of May saw a couple of data-relevant government announcements. I had fun working with Milly and the team to craft Twitter threads responding to the Queen’s Speech and to the National Data Strategy consultation response response. It was particularly gratifying to realise how many of the varied topics within the Queen’s Speech we had previous work on that we could point to.

I wrote half an op-ed on the future of data protection regulation in the UK post-Brexit and how we should position ourselves as a trustworthy data harbour rather than a dodgy Cayman Islands style data haven. Then one of the Allegory comms team who we work with chatted to me about it, rewrote it, and made it ten times punchier and more readable. I used to think I was good at writing. Anyway, it was still reflecting my original ideas and I did do a final pass to make a few corrections and add some nuance, so I guess I can claim some credit for the result. It hasn’t been published yet, but if you’re interested in reading it, drop me a line and I’ll share it with you (we might publish it as a blog if it doesn’t get picked up anywhere else, I guess).

I chaired a roundtable for DCMS on data intermediaries and the kind of interventions that government needs to make to enable them (or the right types of them) to flourish. Most of the people at the roundtable were people I know well so I felt very comfortable doing it, but I do just generally enjoy chairing things. The only difficulty is when I have strong opinions that I feel I shouldn’t express because of the role I’m playing; I need to find better ways to navigate that.

I attended three advisory groups/boards this month:

The OpenSAFELY Oversight Group, where one of the topics was the perennial observation that finding funding for infrastructural work like OpenSAFELY is really hard.
The Oversight Group for the Geospatial Commission’s work on public attitudes to the ethics of location data, where I have to say that the Ada Lovelace Institute and Traverse have done a fantastic job pulling together the materials and plan for the public consultation exercise they’re running. This post by Aidan Peppin explores some of the issues. Since there were lots of great voices in the group highlighting the privacy issues, I tried to inject some thinking about discrimination due to differential data quality – the classic Pokémon Go data ethics problem – and competition issues introduced when some mapping providers can provide better services because of their existing dominance and exclusive access to the location data of their customers (ie Google Maps gives good routes because it has access to data from everyone using Google Maps).
The Board of Directors for the Global Partnership for Sustainable Development Data, where I have to admit I always feel out of place compared to the experience and wisdom of the amazing other board members.

I spoke at:

the Data4Good Festival about data sharing and stewardship in the social sector; slides are here and there should be a blog coming soon. My basic argument was that lots of data infrastructure we need will end up being supported by the social sector, so we need to get better at recognising and supporting it.
a panel at Data and the Future of Financial Services on ESG data
a Heads of Data Strategy group (an informal group of heads of data strategy in UK public sector organisations) meetup, to talk about data institutions – time was too short!

I also chaired half a day at the Digital Government Virtual Summit, which included asking questions of Minister Julia Lopez and chairing two panels: one on government’s data transformation, and one on digital identity. It was not a particularly easy experience, in part due to the particular technology choice and malfunctions throughout the morning, but it was fun chatting with Sue Bateman and Ollie Buckley, who have been working on data in government for about as long as I have, about this next phase.

I was also interviewed for a Wired piece about Citymapper, and what the future might hold for it. As usual a lot of the discussion was cut for the piece, but I tried to make points about the utility of the kind of data Citymapper captures from commuters for local government decision makers, alongside the issues they (and other similar services) might encounter using location data in this way.

Thoughts I had

Broken links

I tweeted about a study looking at the number of broken links in New York Times articles: up to 72% of links in articles from 1998. Early hypertext systems had links that couldn’t break: if a page disappeared it invalidated those pages that linked to it. This works fine – indeed well, as you get alerted to and can correct broken links – for self-contained hypertext systems such as a book. But for the World Wide Web to flourish, the responsibility for maintaining it had to be distributed and links had to be breakable.

Tim wrote that Cool URIs don’t change in 1998 (is that really 23 years ago? god, I feel old). But of course they do, because so many are technology-bound and information-architecture bound and each generation of webmasters (as if we have webmasters any more) will want to change one or the other or both. (This happened on the ODI website when it was redesigned, for example – some of the old content never got moved over to the new system.)

The web is designed to enable growth and change over time, and – like our brains – it is designed to forget. It could have been designed differently. The Memento project shows how HTTP headers can enable browsers to indicate the date and time of the content of the page (such as when a newspaper article was written), and request the content of linked pages-at-a-particular-time. Servers can redirect historic links to archives maintained by themselves, or the internet archive, or more specialised ones.
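To make that a bit more concrete, here is a minimal sketch of what the request side of Memento might look like. The Memento specification (RFC 7089) defines an `Accept-Datetime` request header, and a server that supports it answers with a `Memento-Datetime` response header. Everything else here is an assumption for illustration: the `memento_request` helper is hypothetical, and the TimeGate URL is a made-up example rather than a real archive endpoint. The sketch only builds the request; it doesn’t send it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def memento_request(timegate_url: str, when: datetime) -> Request:
    """Build (but do not send) a request asking for a page as it was at `when`.

    `timegate_url` is assumed to point at a Memento TimeGate for the page;
    `when` must be a timezone-aware datetime.
    """
    # HTTP dates are always expressed in GMT, so normalise to UTC first.
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return Request(timegate_url, headers={"Accept-Datetime": http_date})


# Hypothetical example: ask for an article as it looked on 1 May 1998.
request = memento_request(
    "https://archive.example.org/timegate/https://news.example.com/article",
    datetime(1998, 5, 1, 12, 0, tzinfo=timezone.utc),
)
print(request.header_items())
```

A browser or crawler following links from a 1998 article could, in principle, attach the same header to every outgoing link, leaving it up to the server or an archive to pick the closest snapshot it holds.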

Web archives are certainly important. We surely want future historians to be able to understand our lives and how we got to where they are, when they are, rather than this being another dark ages in the historical record.

But is it a problem that historic links don’t work automatically? That you might have to go searching for the reference document in an archive? That an archive might not have been made in the first place?

I think it depends on the link, and the likelihood that a link will be important depends on the nature of the linked page. There are links that simply provide colour or further information in case of interest. There are links to content that make it far easier to understand the context of a page you’re looking at (such as those to a video game that is the subject of a review). And there are links where the linked page provides essential material without which the full import of the linking page cannot be understood, a good example of this being when standards are referenced from legislation.

Having cool URIs that don’t change, and stewards that think about and honour the fact that others may be linking to their content as essential reference points, is really important for data and information infrastructure. It’s why we put so much thought into the design of legislation URLs for legislation.gov.uk, to make them independent of any technology choices that might be made in the future.

But for the rest? The web is a collaborative enterprise that we are building and evolving together constantly. It reminds me of Ellen’s recent video where she questions so eloquently our approach to data and AI, including how we use industrial metaphors for what we do with data (extraction, refinement) rather than natural ones (burrowing, foraging). How would our attitude to link rot change if we saw the web not as an engineered thing we have built, but as a forest we are tending? Missing pages not as threats to structural integrity, but as part of the natural process of renewal?

I think this is why I find it beautiful that so many links from old New York Times articles no longer work. Those articles record the shape of a memory of the past, like ivy wrapped around a dead tree, and one day they too will fade. But that’s OK, there’ll be plenty of new growth. We can let them go.

Changing data narratives

As part of reflecting on the National Data Strategy, I revisited the 2012 Open Data White Paper from Cabinet Office that laid out a government-wide mandate to “put data and transparency at the heart of government”. Looking back at it now, there are many familiar themes. Francis Maude and the Coalition government at the time wanted to “inspire innovation and enterprise that spurs social and economic growth”. It wanted to “ensure public services are more personalised and efficient”. Some things don’t change.

But what struck me was how the Open Data White Paper and the National Data Strategy were almost inverses of each other in how they think about achieving these goals.

The 2012 Open Data White Paper over-emphasised the role of the public sector as a supplier of data for others to consume, and under-emphasised the importance of data sharing from other sectors – including the private sector – and the use of data by government itself. By contrast, the National Data Strategy focuses on getting data shared better in the wider economy (Mission One), and on government’s own use of data (Mission Three) but misses the importance of government’s role as a steward of public data.

I am troubled by this, particularly given that the gap between these two missions is mirrored by a gap in the responsibility to fulfil them, falling as it does between the remit of DCMS and the Central Digital and Data Office (CDDO) in Cabinet Office. Whose responsibility is it to ensure that data held by the public sector, that would be useful in the wider economy, is made as open as possible?

Another thing that is worrying me at the moment (as I mentioned above) is the growing polarisation in the UK between the narrative of “data as a public good” and “data as a privacy threat”. Earlier in the year, we saw Oliver Dowden saying in the FT (my emphasis):

I want to set a bold new approach that capitalises on all we’ve learnt during the pandemic, which forced us to share data quickly, efficiently and responsibly for the public good. It is one that no longer sees data as a threat, but as the great opportunity of our time.

and of course Cummings’ worship of data scientists (tw/ Daily Mail). This is a narrative that emphasises the good and opportunities that can come from the use of data. I’m scheduled to speak on a panel at CogX with Sam Gilbert, author of Good Data, which also makes this case, including about uses of data for private-interest purposes (that bring public benefit through creating jobs, economic growth and innovation).

And then we see the film People You May Know and the noise around the use of data from GPs for research (GPDPR), both of which emphasise harms and risks about the collection, use and sharing of data.

Both sides paint pictures of worlds that could be, whether utopias to provoke hopes or dystopias to provoke fears; whether about miraculous predictions granted through data science or about the identification, targeting and detention of citizens by tech firms.

I’ve spent many years trying to find a nuanced path between these two extremes that does not trade public good for personal privacy, or vice versa, but achieves both. The debate seems to me to increasingly lack this nuance, even though it’s the grey areas that most need the attention. This concerns me for two reasons.

First, I would really like the public and political dialogue about data to be more informed and based in reality. I was particularly perturbed by People You May Know in this regard because I felt that by creating a fictional world that was close to the real one, it blurred the boundaries between the two. I don’t know how viewers are supposed to tell which bits are real things happening now and which are made up. My sense is that many will assume that some of the parts that were fiction are actually fact. Isn’t this the essence of disinformation?

Second, I am worried that as the debate becomes more polarised, the concerns that the other side raise, or claims they make, become seen as misrepresentation and exaggeration. I’m particularly concerned, as the debates become more shrill, about pro-good-data people dismissing the concerns of pro-privacy people as paranoia. Conspiracy theory. Project fear. I am worried that a perception of panic-mongering will further entrench whatever view there is inside government that privacy is passé, rather than undermining it. As psychologists have found, “heated debates only convince the already converted, and further entrench the opposition”.

But these are typical “insider” concerns. “Can’t we all be reasonable?” “Don’t rock the boat.” Logically, I recognise that I have a bias to a different style, approach and theory of change. Logically, I also believe strong “outsider” voices are essential for keeping issues on the agenda, and I can see what happens when we lack them (as we have for a few years in the open data space). So I have very mixed feelings and thoughts, but can’t deny there is this niggle in the back of my head, and in my gut, which I have learned not to ignore, that the heat in the current debate might be making things worse rather than better.

Personal

Work life

Working from home

As pressure increases to return to the office, I’ve been thinking about the particular challenges of hybrid working at the organisation level. It feels to me that in any working pattern, there are sets of individual needs, role needs, team needs, organisational needs, and societal needs at play, and we (all of us, I’m not particularly talking about ODI) now have a chance to re-examine and re-set the balance between those needs, and how we choose to meet them, if we choose to take it.

Individuals have different needs due to their lives, personalities, and preferences, and these also change over time, as demands on them change and emergencies come up, or as their confidence and competence grow around their work. Different roles involve different levels and types of interaction with other people and with the physical office space. Different teams prefer to work with autonomy, or in pairs, or in groups. Different organisations have more or less need for physical presences for visitors or events, and different office atmospheres that they want to project or cultures they want to preserve. And different regions and societies have different patterns for the physical distribution of office space and the shops that support them.

I can’t help feeling it would be good to take some time to unpack these different needs, and apply some imagination in working out whether the needs that have tended to override individual needs might be satisfied in different ways. We’ve seen that teams can work remotely, just as they can work in person; which works better for which tasks, and what tools can help? We’ve seen that some people just can’t work from home, not because of their job but because of their living situation or personality, but they might equally dislike their commute and a half-empty office; perhaps they can find nearby flexible, creative, buzzing, shared offices to work in instead?

I have a feeling that without proactive exploration of this space, we will drift back to what we were used to, and that will be a missed opportunity.

Free Fridays

I’m no longer going to be working (as such) on Fridays, from the beginning of June. So I spent some time during May working out what I’ll do with the extra flexible day each week. Some life admin, which I never feel like doing and is sometimes hard to do at the weekend, certainly. But I also want to use them to do the two things I don’t get enough time for: writing, and coding. So my plan is to spend roughly half each Friday writing a book, and half doing some kind of data analysis and visualisation, or similar.

The book I’ve been thinking about is to get down on the page a lot of what I’ve learned over the last fifteen years about data strategy, specifically data strategy that incorporates an understanding of and influence on the wider data ecosystem that an organisation plays a part in. I’m going to try to get a rough outline together by the end of June. If there’s something particular you’d like me to write about, do let me know.

On the coding / analysis / visualisation side, I’m less certain about where to focus. Maybe just something interesting from the week’s news. Maybe some of the economic statistics that I want to get my head around. We’ll see. Suggestions welcome!

Home life

It was my birthday at the end of May, and with some restrictions lifted and Covid-19 rates low, we hired a car and visited my family in Cambridge on the final Sunday. It was a wonderful day. The weather was perfect – sunny but not too hot. We had a delicious vegan picnic. And I had a lovely long chat with my mum. Cambridge itself was packed, but we were outside the whole time so felt safe. We visited the college that my eldest will be applying to; I’m still choosing to live in denial that she will ever leave home though.

The other brilliant thing over the weekend was that my autistic youngest child was in one of their chatty states. We went for a walk on the Saturday to some nearby National Trust land that we hadn’t previously visited, and they loved it, as did my partner who rarely gets to see and interact with them when they’re like that. Seeing them together made me very happy.

Video games we’ve been playing:

My eldest and I finished off It Takes Two – a cooperative puzzle/adventure game where you play two doll versions of divorcing parents. We loved the balance of the gameplay, the imagination of the setting, and the story worked too. One horrific moment where we had to dismember a stuffed elephant will live with us a long time…
I played a bit of The Long Dark – a pretty tough survival game that might be easier to play on a PC than on a console. I enjoyed the realistic grind, and how the threats are mostly about finding warmth, food and medicine, but I haven’t gone back to it.
Andrew recommended Buildings Have Feelings Too! which I did enjoy. You have to gradually build up different neighbourhoods based on what buildings, and the businesses or homes they house, need. It’s challenging and sometimes panic-inducing, when you have to deal with the ripple effects of the collapse of a business, but a great twist on city-management games.

We had our first session of the role playing game Good Society, where we created the main characters, worked out their backstories, the relationships between them and some secondary characters to make their lives difficult. I’m so happy getting to roleplay again, with both this and our ongoing game of Masks. I’m always tired coming into it, but two hours of collective creativity and laughter perk me right up.

TV we’ve been watching:

The Alienist – I remember really enjoying the original book when I read it (checks dates) 25 years ago, but almost nothing of the plot! These were good watching.
Star Trek Discovery – we’d stopped watching part way through Season Two so picked it up again from there. Definitely watchable, but we do get a bit frustrated that the crew seem to frequently spend precious minutes on deep and meaningful conversations in the middle of emergency situations.
The Politician – my eldest and I started watching this on the odd evenings when my partner is busy doing other things, so we’re not making very rapid progress through it, but I’ve enjoyed what we’ve seen.
Law & Order – now part way through season 8. We are starting to try to predict which piece of evidence will be suppressed to create drama during the courtroom portion of the show.

Family film watching:

Labyrinth for the nth time. “It’s so stimulating being your hat!”
The Mole Agent has a premise that did not particularly appeal to me – a documentary about a private investigator hiring an elderly man to go undercover in a nursing home in Chile – but it was excellent. Funny, sad and touching. Highly recommended.
The Mitchells vs the Machines – US animated family fare, with some nice touches and digs at big tech. I think I overall preferred The Croods.

Mental health

My youngest has settled into what seems to be a relatively stable pattern of going to school for the full school day, but not always attending every class if their anxiety is too high. (When they don’t, they go to a specific area where they can do work set by their class teacher, or any other work that they feel like.) They have been spending some of this time learning – because they want to – A level maths and A level politics materials, which they then tell me about in great detail and with huge joy when we relax in the evening. The school is sending me letters with concerns about their attendance, which I’m ignoring.
