A data set detailing the personal attributes of 3 million Facebook users turned out to have been accessible to anyone for four years


By Stock Catalog

Facebook has been badly damaged by the scandal over the misuse of user data by Cambridge Analytica, and it has now emerged that a data set detailing the attributes of more than 3 million users was left accessible to anyone for four years. Although Facebook has begun investigating the issue, it is unlikely to escape responsibility for letting its policy on the use of personal information break down.

Huge new Facebook data leak exposed intimate details of 3m users | New Scientist
https://www.newscientist.com/article/2168713-huge-new-facebook-data-leak-exposed-intimate-details-of-3m-users/

Anyone could download Cambridge researchers' 4-million-user Facebook data set for years | TechCrunch
https://techcrunch.com/2018/05/14/anyone-could-download-cambridge-researchers-4-million-user-facebook-dataset-for-years/

According to an investigation by New Scientist, personal information was collected from 6 million users by an app called "myPersonality," developed by researchers at the University of Cambridge. The myPersonality app analyzed users' attributes from their answers to psychological tests. About half of the app's 6 million users, some 3.1 million people, agreed to share their Facebook profile data with the myPersonality project. In this way, a large amount of personal information indicating individual tastes and preferences was collected, using a method that analyzes personality psychologically along the "Big Five" traits: openness, conscientiousness, extraversion, agreeableness, and neuroticism.

By Adriano Gasparri

The personal data of users who consented to data sharing was anonymized by removing user names and compiled into a single data set suitable for analysis. Because the data set could be used for research purposes, it was made widely available on condition that recipients register as collaborators of the myPersonality project and abide by the terms of use. The researchers who managed the data set include Michal Kosinski, who is said to have conceived the data-utilization techniques used by Cambridge Analytica.

The problem is that the data set was used not only by researchers who had agreed to the terms, but by more than 280 people at 150 organizations, including IT companies such as Google, Microsoft, and Yahoo!. Although individual names had been removed, it is highly likely that individuals could be identified by cross-referencing the data set with other data collected from Facebook, and that the data was used commercially, for example for targeted advertising. Facebook's policy at the time prohibited distributing collected user data to others, so the use of the data collected by myPersonality violated the terms of use. Yet for some reason, Facebook staff also appear to have applied to register with the myPersonality project.

In addition, Cambridge Analytica applied to join the myPersonality project as a collaborator in 2013, but the application was rejected over concerns about political use. However, Dr. Alexander Kogan, who provided Cambridge Analytica with its data analysis method, is known to have had his application to participate approved in 2014.


Use of the myPersonality data was supposed to be limited to research purposes and subject to a degree of screening, but one researcher participating in the project had posted credentials on GitHub so that students could use the data. According to New Scientist, those credentials could be found with a single search for four years, which means that anyone could easily obtain them and access the myPersonality data set.

Facebook revealed that it has suspended the myPersonality app for policy violations, but said the suspension took place only a month ago. Criticism is likely to mount over the fact that a large amount of user data collected on Facebook was left accessible until then.

in Web Service, Security, Posted by darkhorse_log