At Data Days 2014, one of the most compelling talks came from Lorena Jaume-Palasi, a lecturer and PhD candidate at the Ludwig Maximilian University of Munich. Her talk addressed intriguing questions about data ethics: do we need to create new ethical paradigms to deal with data? Where does the abstract right to privacy, and our fear of technological advancements that may impinge on it, stem from? Here is what she had to say.


Ethics & Big Data 

When we talk about big data, we hear a lot of panic. We hear a lot of people suggesting we rethink our scientific rules and scientific theory, that we need to create new paradigms. We also hear a lot of demand for new ethical concepts to grasp the concept of big data. The internet community, the related private sector, the scientific community, civil society, and regulators (political decision makers) think that there is a need for this. I disagree.

From the perspective of scientific theory, we have been coping with inductive theories since the very beginning of science and of philosophical thinking about science and scientific methods. The challenges with big data are issues of man-made statistical bias. These are grounded in problems that human beings have in general: subjectivity and prejudice, and intellectual or social biases and constraints. From that point of view, I don't think we need a new paradigm, since having more data, and more unstructured and heterogeneous data than before, doesn't change the abstract concepts of the theory of science.

As for the ethical demands, in terms of a need for different ethical principles to cope with big data, again I think this is unnecessary. We may have more data, and of course more heterogeneous data, but the ethical principles we have do not depend on its size. They depend on abstract concepts like justice, freedom from discrimination, the right to honor, and the right to a private sphere.

The challenges of big data still come down to the human decision to discriminate on the basis of a data correlation. It's not the data that leads to this sort of discrimination; it's the algorithms, which might be programmed by someone who decides to discriminate based on correlations that are possibly inaccurate, and ethically and scientifically wrong. But this is not new.

The Question of Privacy

Humankind has always been afraid of new technologies that alter previously held perceptions. We experienced this with the introduction of the car, the telegraph, and television. We even had it with the invention of the printing press. We've experienced it any time we invented something new that would automate some process. This is the first aspect, and sometimes this fear has good reasons; sometimes it is merely fear of the unknown.

Because the unknown is difficult to grasp, a person cannot calculate their behaviour and the consequences of that behaviour. This is one aspect: not knowing the consequences.

The other aspect is also very interesting: many of the innovations made on the internet originated in the porn industry, the military, or academia. These are places not visible to the many. Only a very small elite focused on the development of different techniques, proving them among their peers. Only once a technique had been well developed and used within its own sector did it begin to spread to other sectors. Take frying pans with Teflon, for instance. Teflon was developed by NASA with a very specific technical application in mind, and with further development it somehow landed in every kitchen in the world. Now it is something that is neither new nor very innovative; it is common.

So we had this sort of development in a niche for many, many years, and then it became mainstream. Nowadays the main economic sector plays a role in these innovations, so it's not just those niches. With the internet of things we're looking at existing consumer goods like washing machines and fridges: companies working with everyday tools and experimenting with data. The user is in the middle of this mainstream, perceiving the many, many different actors developing with their data. This is a significant point: having the impression that every aspect of one's life can be treated as a subject of interest for economic development. This raises questions, since development was usually attached to a neutral purpose.

In Europe, while not as pronounced as in America, there is an expectation of research for the sake of research, for the sake of human knowledge and the betterment of humankind. There is no harm if that research is passed to the private sector. In contrast, research and development driven by economic gain can raise concerns about the methods and ethics involved.

We also have a semantic problem, which stems from the history of data protection. When data protection regulations were put in place, the regulators had in mind states and huge corporations with an incredible capacity to process data. There were only three or four actors able to do that, particularly states. Regulations were designed to protect people from the power of a very powerful, autocratic state, rather than designed from the point of view of human rights. Instead of saying we want to protect privacy or we want to protect freedom of speech, they decided to say we want to protect data. Instead of concentrating on which values should be regulated, they made classifications of data sensitivity. What they didn't see coming was the permeation of this technology, and how everyone would become an actor in this. It's all of us, as individuals. Whenever we send an SMS, whenever we blog, or whenever we write something on Facebook, we are processing data automatically. This changes the whole situation, because the regulation that was supposed to affect only governments is also affecting us in our everyday lives.

This is changing the expectations we have of each other, all because of this semantic concept: data protection, it's my data. We never had this perception with the traditional concept of privacy. When we have a conversation online, it is data. When we have it in private, offline, you wouldn't call it data.

Then the question becomes: whose data is that? Is it your data, or my data? Is it the data of the telephone that is recording it? Who is behind that? Is it the data of the telephone service provider? We have new questions, and without this semantic difference between privacy and data protection, perhaps many of these questions wouldn't be raised.

So there are three aspects: the fear of technology; the way innovation is being produced right now, and the incentives behind it; and the regulation that has produced new expectations.

The current reforms are still attached to the principles used in the '90s, which originated in the '70s, so they are still based on how data protection was originally conceived. Data protection needs first to define which values we want to protect. Is it about honor? Is it about freedom from discrimination, freedom of speech, freedom of choice? Is it about privacy, and about the right to intimacy as well?

Do we have different levels of privacy? Is there privacy in the public sphere? What does the public sphere mean? Are there any boundaries when we talk about the public sphere? These are the questions being raised.

(image credit: vintagedept)
