Data firm left detailed profiles of 48 million people on a publicly accessible website

In the wake of Facebook's Cambridge Analytica scandal, another data firm was discovered to have amassed similar user profiles of millions of people.


A report published Wednesday reveals how a data firm built psychographic profiles on 48 million people, using data from Facebook, Twitter, LinkedIn, Zillow, and others—and then left that trove of data unprotected on a cloud storage repository.

The data was compiled by LocalBlox, a firm that “automatically crawls, discovers, extracts, indexes, maps and augments data in a variety of formats from the web and from exchange networks” to build consumer profiles that it sells to companies.  

In February, Chris Vickery, an ethical data breach hunter and director of cyber risk research at the security firm UpGuard, was able to access millions of these profiles on an unlisted and unprotected Amazon Web Services S3 bucket. The bucket contained a 151.3-gigabyte compressed file that expanded to 1.2 terabytes of user profiles when decompressed. It was aptly named “final_people_data_2017_5_26_48m.json.”

“In the wake of the Facebook/Cambridge Analytica debacle, the importance of massive sets of psychographic data is becoming more and more apparent,” UpGuard’s report reads. “The exposed LocalBlox dataset combines standard personal information like name and address, with data about the person’s internet usage, such as their LinkedIn histories and Twitter feeds. This combination begins to build a three-dimensional picture of every individual affected—who they are, what they talk about, what they like, even what they do for a living—in essence, a blueprint from which to create targeted persuasive content, like advertising or political campaigning.”

The consumer profiles amassed by LocalBlox vary in level of detail. Much of the information can be harvested from public sources—the email address listed on your Facebook profile, or the city of residence shown on your Twitter page. Some of the information is believed to have been collected from non-public sources, such as purchased marketing data.

In a ZDNet article published Wednesday, LocalBlox’s chief technology officer Ashfaq Rahman said most of the data discovered by Vickery was fabricated for internal tests, and that Vickery had “hacked in” to the publicly accessible repository. But Vickery had informed LocalBlox that he accessed the repository after discovering the vulnerability in February, and it was reportedly secured soon after.

“Rahman would not say why he restricted the bucket’s permissions hours later,” reads the ZDNet article.

According to Rahman, “no other individual is believed to have accessed this file from the S3 bucket.”

LocalBlox didn’t break any laws in its harvesting of consumer data, though it’s not clear whether it violated the terms of websites like LinkedIn, Facebook, and Zillow, all of which explicitly prohibit data scraping.

In a 2013 article, LocalBlox’s president Sabira Arefin said it’s “up to the individual sites and system to determine the terms and conditions and then enforce any security mechanism in place if they want to prevent scraping.”

Vickery said that companies like LocalBlox should be more responsible in the way they handle and store people’s data.

“Concentrating millions of people's details can become by its very nature a weaponized thing, and something that can lead to a lot of harm,” Vickery said.

UpGuard’s report concludes:

“The profitability gained by data must come with the responsibility of protecting its integrity and privacy. Cloud storage itself provides functionality and speed at a reasonable cost, but cloud assets require careful configuration—the thin line between private and public can be erased with the flip of a single switch. The lack of controls around common IT processes are what allow critical errors like this to slip into production, eroding the privacy of millions of people.”
