Paul-Olivier Dehaye: 'Data governance should connect with individuals to have a global reach'

Paul-Olivier Dehaye. Photo: Alain Herzog.

Data. A notion that scares as much as it fascinates. More than that, controlling or protecting data has become a major political, economic and social issue, as the global digital society emerges from data sharing.

As people conduct their lives increasingly online, the question of digital rights, particularly rights to privacy and freedom of expression, is becoming increasingly important. What role does multilateralism play in the process? And is global data governance possible or a mere illusion? These are some of the questions that will be examined during the conference organized by the University of Geneva, as part of the digital series launched in September.

Among the guest speakers is Paul-Olivier Dehaye, director of a nonprofit organization and board member of MyData Global. A brilliant mathematician, he found new meaning and a new challenge in his interest in data, playing a key role in revealing the Cambridge Analytica scandal. Today, he reflects on a complex subject and the need for inclusive governance.

How did science become your life partner?

I don't know that science has remained my life partner. Both of my parents have a scientific mind, and I've always been very analytical. I always knew I wanted to study math. Why I like math has changed over time. I like the universality of it, the fact that it's a language that allows you to construct very abstract structures in your mind, and express them. Someone else might recreate the same structure, but express it differently. Those nuances are interesting. And I also liked the ability of some branches of math to construct tools that allow us to keep secrets, cryptography for example. That has always fascinated me. So when I went into data protection, that's when I started to see that I liked math, also for that reason.

How did you transition from academia to data, from a more abstract environment to a more concrete application?

You are actually right. To me, it is more concrete, although it is still very abstract for most people. And that's the challenge I actually like: to be between the very abstract and the very concrete, to have to talk to the average person about extremely abstract topics, to make it relevant to them and move society that way. I think that's precisely what attracted me to data. If you choose an academic career, you become an expert in a very narrow field. You see that over time the set of people who understand and appreciate your work shrinks drastically. You become very isolated and I just wanted that to change.

Your nonprofit organization is an expression of that change. What is its purpose?

To make individual data rights actionable, and collectively useful. I thought about that a while ago, but it is still very relevant. I think that individuals and civil society should shape the debate around data, AI, and other similar topics. There needs to be a push in that direction, there needs to be an effort to have a voice in this debate. When you look really concretely at what the available mechanisms are, those that allow you to speak for yourself and not have someone else speaking in your name, then data rights are an essential tool. And assuming that those data rights are actually practical, which is far from the case, then what? There is this individual dimension. It is no good if you alone use your rights, you still have to build power, collective power, and social power. That is the other part: to pool efforts and learnings, and to make sense of the digital world. That's the frontier I tried to set up.

You used the word effort. Why do you think it is so difficult?

There are two reasons why it is hard. I don't know which one comes first, but it is a vicious cycle between the two, I guess. First of all, transparency, and those data rights in general, are toxic to companies, not so much to the small bakery around the corner, maybe a bit more to a bigger company. Think about Swisscom, for instance. They might not want to be super transparent about what they do with your data, even if they claim they are.

But it is extremely toxic to the biggest of players, a Google or a Facebook. It becomes extremely difficult for them to be transparent, because they sometimes don't even know what they are doing with this data, or what the implications are. You have a system where everyone sort of wants to be the top dog. You hear of initiatives like ‘we want to be the Google of Switzerland’. And that can only really work by taking people's data and abusing it in some way. As a consequence, people feel dispossessed, they feel like nothing can change. It is a bit like climate change: ‘I cycled to work, what's that going to change about the bigger picture?’ Because of that, it becomes even more exceptional for people to exercise their rights, and this is the second reason it is an effort: you become exceptional because you care. In that sense, it is a vicious cycle.

How must transparency be understood?

Often, you hear things like, ‘transparency? Yeah, it's nice, on paper’. But then the individual has this data, and then what? This is why there is a collectively useful aspect to transparency. In other words, you get the same resources that the other side has. You don't necessarily understand the data, but you can go with it, and build your own path around this resource. You can reach out to others who can help you understand it, and who can help you see if there's a problem.

Can we trust the notion of transparency?

In many contexts, we can. If you look at the business model in some ecosystems, advertising for example, it's about being promiscuous with your data, about sharing your data with others. So if you take a legalistic approach to this, every data flow should be accounted for, should have clear responsibilities attached to it, and so on. And if you start investigating and asking, ‘where did my data flow, who was responsible here or here’, by its very nature, you can ask multiple people about that. You can ask the source or the destination, and maybe even later recipients who might need to know the provenance of this data. And you can cross-check the information. When you start doing that, you realize that a huge number of business models are about leaving large parts of these things very blurry. To one actor, the data is personal, to the next, it is anonymous. But if you are able to track all of that down together, very quickly, you will find contradictions.

To what extent do we understand data?

There is a lot of blurriness. And there certainly is a lot of blurriness in the risks. So even if you had a good definition of what your personal data is, you still wouldn't know the risks associated with it. If someone has your picture, does that mean that they have the capacity to find you in five years? Or not? Or if they have your picture from five, ten, fifteen years ago, do they have the capacity to predict what your face will look like in ten years? Who knows? Another aspect is that the definition of personal data is changing over time, because of technological progress for instance. It is a very complex and dynamic notion. But the question of transparency is more basic than that. There is data on Facebook's servers that I know and they know is personal, but that I won't see. They have decided to profile me according to all kinds of characteristics and they just don't want to be transparent about that.

So how do we act then and take back control over our data?

Regardless of whether you are a technical person or not, reach out in the physical world to people who want the same thing and are exploring this topic. The hardest bit is to build collective power around those digital questions. One avenue is MyData. It is a global movement. If you are in Geneva, join MyData Geneva and so on. Those are places where you can discuss the matter because you are not going to solve this on your own even for your own data.

MyData Global is a nonprofit organization based in Finland that gathers 100 organizations and 1,000 people who care about those topics, and who are trying to develop solutions or to advocate for things like this. It is a network of people, and some subunits are concretely building prototypes and alternatives. I am the local leader of the Geneva hub. Here, it is about discussing those issues, raising awareness, building a network of people who care about those things. You do meet people from all kinds of places, from the data scientist who does that all day, to the grandma who just cares and doesn't know what to do about it. It is actually very rewarding to cross all those borders on such a universal topic.

What would be a good sustainable model around data governance?

The problem with data is that it is universal and, at the same time, crisscrosses many different sectors. Data must be governed in many different contexts at once. As everyone realizes that there is a huge amount of power in this, everyone wants to be the one deciding for everything. But at the same time, it is an intensely personal topic. That means that what happens around this should be partly an individual decision, whatever top-down structure is chosen. It should really connect with movements like MyData, that try to have a global reach at a lower level. Global reach for individuals, global reach for cities, global reach for small nonprofit organizations, touching on aspects of data that you care about. It might be work, autonomy, discrimination. Those voices should pop up very quickly on the international scene. It is another key element of the strategy of MyData Geneva to connect these horizontal and vertical dimensions of the governance problem.

The second aspect is that some principles of governance should be higher than others. So, the topmost part of the structure should be focused on individual rights and on ensuring that whatever structures exist below, and many different types will exist, are respectful of those rights. Already at the bottommost level, people have different opinions. It is going to be very hard to reconcile all of that at the topmost level. It is really about this juggling between the top global scale and the individual scale.

How can Geneva help in the process?

Geneva is a city of networks and of networks of networks. It is pretty clear that the role for Geneva here is to make those networks visible, to connect the lowest layer of networks that exist to the topmost layer, and to try to zip the two of them together as effectively as possible.