Government Should Do its Own Data Homework

I’ve been reflecting a little since OpenTech on the relationship between the developer community and government.

Let me set out my perspective first. My goal is to help ensure that the public sector publishes reusable data in the long term.

To do that, data publication needs to be sustainable. It needs to be embedded within the day-to-day activity of the public sector, something that seems as natural as the generation of PDF reports seems today. It also needs to be useful. It needs to be easy for anyone to understand and reuse the data, with minimal effort. It cannot be the case, long term, that you need to be an expert hacker to reuse government data.

To get there, we need to work towards a virtuous cycle in which the public sector is rewarded for publishing useful data well. The reward may come from financial savings, from increasing data quality, from better delivery of its remit, or simply from kudos. It doesn’t matter how, but there needs to be some reward, or it just won’t happen.

Over the last few years, government has had to be persuaded that it’s a good idea to release its data at all. The message from the developer community has been “give us your data and we’ll show you what we can do with it!” Through hack days and various similar activities, developers have excited, wowed and dazzled officials and politicians, opening their eyes to what could be done. Through sustained argument and political pressure, developers have set out the economic and moral case that releasing data not only could, but should happen.

They have been incredibly successful. We have data.gov.uk, open data from Ordnance Survey, strong commitments to open data within the Coalition Agreement, and the Public Sector Transparency Board who are now applying that pressure, with authority, at the heart of government.

My perception is that the argument that government should open up its data has basically been won. The questions within the public sector are now about how, not whether. And as a result, in this changed environment, I’m growing slightly uneasy about the core developer message of “give us your data and we’ll show you what we can do with it!”

There are two things about that message that concern me. First, it implies government is doing it all wrong. Second, it implies that government doesn’t need to do any better, because the developer community can take up all the slack and fill in all the gaps. It’s like getting fed up with a child struggling with their homework, and saying “oh, just give it here and I’ll do it!” It’s a narrative that simultaneously undermines the best efforts of those within government and removes from them the motivation and opportunity to learn to do better.

Of course there is a tricky balance here. We don’t want to let up pressure on the government to release important information. We don’t want government to feel that they have to get their data perfect before releasing it. And we can’t always wait for government, which can be slow-moving as an organisation, to provide everything we need right now.

However, there are certain things that only the owners of data, those within the public sector, can do. People who own data understand it so much better than third parties: what codes mean, what values are used to indicate missing data, what gets included and what gets left out, which columns aren’t really used any more, which interpretations are safe and which are meaningless. Data owners can be trusted in a way that no one outside could be; when data publication becomes a sustainable part of their activity, they are much better placed to provide a steady, reliable flow of data than a third-party API that could disappear or get out of date whenever the volunteer behind it moves on to something new.

People in government must be given the responsibility to publish their data well. And there are three core ways in which I think developers could help them.

First, while there are many more technically savvy people within government than is sometimes made out, the average civil servant lacks both know-how and tooling. I think developers could help a huge amount here. What about hack days where developers sit side by side with civil servants to help them clean and publish their data? What about engaging with the owners of a particular data set to help them to publish it in a way that is reusable and sustainable? What about writing services, accessible through the locked-down IT systems that civil servants have to use, that enable them to convert their data into multiple formats, and to link up the ways they refer to things with the way other people do?
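To make the idea of a conversion service a little more concrete, here is a minimal Python sketch of its core step: reading a CSV export and re-emitting it as JSON, with the data owner's missing-value codes turned into explicit nulls. Everything in it is an assumption for illustration only: the function names, the set of missing-value codes and the sample rows are all made up, and a real service would need the data owner to say which codes actually apply.

```python
import csv
import io
import json

# Hypothetical codes that a data owner might use to mean "no value recorded".
# Only the owner of a given dataset can say which codes really apply.
MISSING_CODES = {"", "n/a", "-"}


def csv_to_records(csv_text, missing_codes=MISSING_CODES):
    """Parse CSV text into a list of dicts, mapping missing-value codes to None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    records = []
    for row in reader:
        cleaned = {}
        for column, value in row.items():
            if column is None:
                # Extra cells beyond the header row; ignored in this sketch.
                continue
            value = (value or "").strip()
            cleaned[column] = None if value.lower() in missing_codes else value
        records.append(cleaned)
    return records


def records_to_json(records):
    """Serialise the cleaned records as indented JSON."""
    return json.dumps(records, indent=2)


# Made-up sample data, purely for illustration.
sample = "body,spend\nCouncil A,1200\nCouncil B,n/a\n"
records = csv_to_records(sample)
print(records_to_json(records))
```

The point of the sketch is that the knowledge about what counts as "missing" lives in one small, editable place, which is exactly the kind of knowledge that data owners have and third parties have to guess at.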

Second, while government needs to be responsible for publishing its data, it can’t be responsible for building everything that end-users need based on that data. Developers have the facility to create applications that bring together data from diverse parts of the public sector, and combine it with data from outside. This has always been a feature of hack days, of course; all I’m arguing for is a focus on applications that the public sector shouldn’t be doing itself.

Third, we need to build the virtuous cycle that I talked about above. Government needs to hear about what works for developers, as well as what doesn’t. What data releases have been helpful and why? Who are the stars? Who should be rewarded and emulated? We need ways of feeding back in a constructive way to public sector workers who are trying their best with the resources they have — often extensive subject-matter expertise but little time, locked-down technology and contracting finances.

The vitality and engagement of the developer community has played a massively important role in the open government data initiative within the UK, and I’m sure it will continue to do so. We are incredibly lucky, here, to have a collection of talented and motivated developers who volunteer their time to work with government data. My hope is simply that the relationship between government and developers can grow into one that is more encouraging and supportive, that understands the constraints and concerns of those within government, and that provides practical help to overcome them.

Comments

Re: Government Should Do its Own Data Homework

Thank you for this brilliant post. I agree with you: everyone in government should have the opportunity (and deserves the respect) to learn whatever skills are necessary to publish useful data. I’m happy to do whatever I can to support that. And as you’ve said, there is enormous scope for the developer community to help government in this endeavour. (I might add that many developers don’t understand government as well as they might; as a community, we might help there as well.)

I’ll go a step further to say that, in addition to developers building whizzy apps and neat visualisations for the public, I think that everyone in government should have more tools on their own data which are tailored to them. When I think of civil servants as consumers of the data in addition to publishers, I think there is potential to help and inform them in their jobs using a lot of this data. Benchmarking against other organisations, perhaps, or historical queries on last year’s spending. There seems to be an unseen cache of opportunities for building things to make government function more effectively on the inside. That’s something I would love to see happen.

You’ve set forth a vision here, a role for government that is very useful for LinkedGov and which I will happily support. The challenge is how to get there. That vision is of considerable scale, requiring significant time and resources.

My sense (feel free to correct me) is that we are ultimately talking about thousands, if not tens of thousands, of data holders across the public sector. Each of them is facing cuts and reduced teams, increased responsibilities and the added load of publishing their data. It’s a chaotic time for everyone, of course, and though it could be interesting to launch a campaign to skill up every one of them on publishing open data, it’s a hugely ambitious people project, which would be time-consuming (running into multiple years) and expensive. (Furthermore, it’s not something I have the power to kick off… though if you do, I’d be happy to help!)

So while I think your vision should be realised, I also believe it will take a while. I similarly worry that if we were to ask all development on government data to wait until the sector is up to speed, we would lose a number of opportunities for building great tools. There is energy and momentum now. I’d like to help that momentum turn into something useful.

LinkedGov was created from expressions of frustration (from both civil servants and developers) about how cumbersome data is to work with right now. I think you’ve nailed it: their frustrations will largely be eradicated when clean, human-readable and machine-readable interconnected (or linked) data is published from all data holders in government. At that point, I would think that LinkedGov will no longer be needed. I’ll go find another project to work on. :)

Re: Government Should Do its Own Data Homework

One last point on the virtuous cycle you’ve outlined between government and developers (your third way that developers can help):

LinkedGov is planning to track popularity and usage of data, demand for unpublished topics (judged by queries), feedback on specific datasets and format preferences. We’d like to help inform your virtuous cycle. It might be useful at some point to talk about what you would like to see from these, or whether there is anything else you would like to have tracked. If there are any other ways in which the project can help, we’d love to hear your thoughts.

Re: Government Should Do its Own Data Homework

Thanks for the post. It reflects some of the concerns that I have for Open Data at present.

I run the Open Data Cities project which is encouraging the local authorities and public bodies within Greater Manchester to open up their data. My main concern is about sustainability. I think your post highlights an unrealistic expectation of what developers can do and what government is able to do, at this point in time.

With local authorities we often feel that they hold an expectation that as soon as a new dataset is released, applications and visualisations should start appearing. This is not always the case: most developers have to earn money, and although some fantastic stuff is created, it rarely pays. This creates a problem, and it also highlights the point that just because data is released doesn’t mean that people are going to use it, or at least not immediately.

We should think of Open Data as a manifestation of a healthy, functioning government. Data should be open at the point of collation, allowing the development of innovative applications and services not only outside government but within it too. What Bryan Sivak and his team have done in Washington DC (http://track.dc.gov/) shows that Open Data isn’t just for developers.

Closing the gap between developers and government is key to creating a more sustainable Open Data Ecology. With our own project we organise “meet the developer” days. These meetups are for developers and local government data people to talk in a non-procurement setting. Many people within government (local government at least) want to engage with the developer community but need to be given the tools, the backing and the environment for this to happen.

Re: Government Should Do its Own Data Homework

I agree with your comments, Julian, but I would also say that local government needs to be a bit more active itself in seeking out engagement with developers. Social media tools provide an easy entry point in this regard, yet how many council officers understand how to use social media? I recently did a survey across all policy units in London boroughs asking “do you use social media?”, and 80% of the replies came back n/a. On the broader question about government and developers, I accept that developers need to monetize and will not always be able to make the contribution they make now to the open data movement, essentially volunteering their own time and efforts for free. But I think we all realise that now is not the time to worry too much about what will happen. We need to work together to reach a critical mass of data release, then see what new business models emerge. Right now we are still charging up the hill, but for me as a public servant it’s fantastic that so many creative developers are charging up the hill with me. I just need to get more of my public sector colleagues to join us.