Q&A: The challenge of open data in humanitarian response
When natural or man-made disasters occur, humanitarian actors need timely and accurate information to plan their course of action. Data, however, isn’t always easy to come by.
When Sarah Telford joined the United Nations Office for the Coordination of Humanitarian Affairs (OCHA), she had little prior experience with data.
“I was an English major, so when I started, I worked on sitreps (situation reports) and different information products,” she said. “They were mostly written in long, block narratives.”
Reading through them gave her an idea. “There is so much hidden data in these reports, so much information that needs to be extracted. So, I started thinking, what if we could get just a little more analytical with them?”
But when she tried finding the source data, she hit a roadblock. “It was really all over the place,” she said. “Someone would have it in a local file or on their laptop, and I just thought, this has got to be simpler. We have to be able to easily locate and use this data.”
In 2014, Telford teamed up with her colleagues to create the Humanitarian Data Exchange, or HDX.
Today, the online platform, which gathers data from over 350 humanitarian organisations, “is used in almost every country in the world,” Telford said.
“And it has gotten real traction over the years,” she added. “We started with maybe 600 datasets, now we have 6,000. I can honestly say we’ve created efficiency in locating and working with humanitarian data.”
HDX is managed from OCHA’s Centre for Humanitarian Data in The Hague, which Telford leads. We caught up with her as she visited Brussels to attend the Digitalisation in Humanitarian Aid workshop organised by the European Commission’s Directorate-General for Civil Protection and Humanitarian Aid Operations (DG ECHO).
Capacity4dev (C4D): You’ve been with HDX since the very beginning. How has it changed over the years?
Sarah Telford (ST): When we started, our focus was on getting organisations to sign up and make data openly accessible. Now, with data privacy and security in the picture, the process has become more complicated. We’re faced with the broader issues of data policy and literacy, and the capacity to absorb data and use it.
I think where we need to go now is to bring all of this together, but things get more and more complex the more ambitious we become. There are so many different players out there, so many issues related to interoperability, shared standards and so on. So, it’s not an easy task.
One of the areas we’re focusing on now is speeding up the flow of data. Imagine you’re a policy officer sitting at your desk here in Brussels, and you’ve just received a report. With how data is transmitted these days, it could be months old by the time it gets to you. You’re always responding to the past and what has already happened. You never get to look at the real-time picture, let alone make predictions about the future.
We need to get to a point where we have a shared infrastructure, a clear understanding of which data is sensitive and which isn’t, what can be shared and what shouldn’t. We also need to have the ability to navigate through it all more quickly. Only then, I think, will we be at a point where we can say we’re working with current data, responding to the reality that exists now, rather than the state of things from months ago. That’s where we’d like to be with HDX.
It’s not an easy task, but what keeps me going is that in the humanitarian sector we cannot be effective if we don’t know where people are. We need to know what they need, who’s there, who’s responding and where the gaps are, so data is essential for our work.
C4D: What are the main challenges to working with open data?
ST: There are many issues, especially when it comes to privacy. At HDX, we’ve never allowed the sharing of personal data. We have a quality assurance process, where we go through every file and check if there are any names or personal information. Where this gets more complicated is when you’re dealing with so-called community-identifiable and demographically-identifiable information – CII and DII.
The problem there is that the risk of re-identification is not always that obvious. For instance, we recently received data about a camp where a survey was done. We’re seeing more of these types of community-perception surveys, which is good. When we looked at the file, we didn’t find anything wrong with it.
Then, a colleague alerted us that if you conduct an analysis on the data, there’s a high risk of re-identification, and you could locate women who had been sexually abused in the camps.
And I was horrified. I thought, this is the reality now. It’s no longer immediately apparent that this data is dangerous, but it is, or could be in the wrong hands.
So, at HDX, we’ve now started looking at how to make the data even more secure. This concept we’re working on – HDX Secure – will involve passwords and encryption. If the main challenge with HDX was to grow the platform, HDX Secure will be a hundred times harder, but I think we have to go in that direction.
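The re-identification risk described in this answer can be illustrated with a small sketch. It assumes a simple k-anonymity-style check: count how many survey rows share each combination of seemingly harmless attributes, and flag combinations held by very few people. The column names, the sample rows and the `small_groups` helper are all hypothetical, and this is not a description of HDX's actual quality assurance process.

```python
from collections import Counter


def small_groups(rows, quasi_identifiers, k=5):
    """Return attribute combinations shared by fewer than k rows.

    A combination held by only a handful of respondents can single
    them out, even though the file contains no names.
    """
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Entirely made-up survey rows; no names or direct identifiers are present.
survey = [
    {"block": "A", "age_group": "18-29", "sex": "F"},
    {"block": "A", "age_group": "18-29", "sex": "F"},
    {"block": "A", "age_group": "30-44", "sex": "M"},
    {"block": "B", "age_group": "18-29", "sex": "F"},
    {"block": "B", "age_group": "45-59", "sex": "F"},
]

# With k=2, any combination that appears only once is flagged.
risky = small_groups(survey, ["block", "age_group", "sex"], k=2)
for combo, n in sorted(risky.items()):
    print(combo, n)
```

A file can pass a check for names and still fail a check like this one, which is why the risk is easy to miss on a first read of the data.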
C4D: If this will make sharing data more difficult, is it worth it in the long run?
ST: I strongly believe we need a more secure infrastructure for the exchange of data in the humanitarian sector. But you’re right, it’s not without consequences. Just last week, we were in Amman (Jordan), looking at the Whole of Syria operation and how data is being shared from the different hubs in Damascus, Gaziantep (Turkey) and Jordan.
They’re doing an amazing job, really. They have clear protocols, so, for instance, community-level data is only allowed to be shared at the hub level. I was really impressed.
But I also saw that the fear and confusion around which data is safe and which isn’t can lead to a point where no data is shared publicly. Donors might say, ‘We’d really like to see this data that’s in your product, can we have the file?’ And this just sets up a whole process that I think should be easier.
But I’m optimistic. In the Middle East, I saw an incredible level of interest in data policy. In the Amman office for our OCHA Yemen operations, I met an information management officer who had a book on the ethics of data right next to his desk that he reads in his spare time, and I thought, wow, these colleagues are really trying to figure this out. From our side, we need to give them more guidelines and support, so that they’re not on their own.
C4D: Going forward, what do you think is key in how we approach conversations about open data?
ST: This question brings me back to the very beginning. HDX was created in 2014, just as the Ebola outbreak was at its peak in West Africa. Around that time, an organisation was putting out PDF reports about the outbreak, which included the number of cases and deaths. You can imagine the kind of inefficiency this created, with people having to go through these reports, extract the data and put it into other formats.
And so, we thought we’d do it for them. We’ll just take the data out, put it into an accessible format – a CSV file – and make it available on the HDX platform. This became our most popular dataset ever, downloaded thousands upon thousands of times.
And it was really that simple, taking data that was trapped and making it open. The New York Times used this data to create a visual, shared on its homepage, showing the dimensions of the crisis. This visual would have been seen by millions of people. So that was really encouraging to me.
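The step described here, moving figures out of a narrative report and into a CSV file, can be sketched in a few lines. This is a minimal illustration using Python's standard `csv` module; the rows, the column names and the `to_csv` helper are hypothetical, and the numbers are placeholders rather than real outbreak data.

```python
import csv
import io

# Placeholder figures, as if typed out of a PDF report; not real data.
extracted = [
    {"date": "2000-01-01", "location": "District A", "cases": 10, "deaths": 4},
    {"date": "2000-01-01", "location": "District B", "cases": 3, "deaths": 1},
]


def to_csv(rows, fieldnames):
    """Serialise a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(extracted, ["date", "location", "cases", "deaths"])
print(csv_text)
```

Once the figures sit in one row per date and location, anyone can load them into a spreadsheet or a charting tool without retyping them from the report.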
But the thing that upsets me now is that we recently witnessed a new Ebola outbreak in the Democratic Republic of the Congo, yet we’re seeing the same PDF sitreps. And I wonder, why have we not moved beyond this? Where’s the leverage to get organisations to open their data and make it accessible?
We have to help people navigate these spaces and bring in the right levers. Is it the donors who are going to make things move? Is it HDX creating these services that will eventually help? Is it showing the value of data by creating visuals?
The data we’re collecting in the humanitarian community is a public good. It’s funded by governments, so it should not be kept in private hands. Certainly, we need to be careful about sensitive information, but this shouldn’t be an excuse for other types of data.
In healthcare and research, there’s an obligation to make data accessible. We have not yet seen this in the humanitarian sector, so there’s more work to be done by donors, organisations and everyone involved.
Every day, we’re having these discussions at our level, but it would make things that much easier if there was someone senior saying, ‘listen, we need to make data open to everyone’. If I had one wish, I guess it would be this one.