DefCon 2017: ‘Anonymous’ Browsing Data Easy To De-Anonymise

A pair of German researchers have shown just how easy it is to identify individuals and track their internet browsing habits in detail using supposedly ‘anonymised’ data sources.

The research, presented at DefCon in Las Vegas by journalist Svea Eckert and data scientist Andreas Dewes, sheds light on the practices of companies that collect user data, either for their own purposes or to sell on to third parties.

Defeating anonymisation

The data is, in theory, stripped of identifying information before being used, but Eckert and Dewes found that in some cases as few as 10 web addresses are enough to identify who the ‘clickstream’ belongs to.

Once the individual has been identified, often by matching particular information in the clickstream to data that’s publicly available, the stream indicates everything that user has been doing online, minute by minute, Eckert and Dewes said.

For instance, their experiment uncovered the porn habits of a judge and the drug preferences of a German MP.

After matching one clickstream to a particular police detective, they found details of an ongoing police investigation by examining Google Translate URLs, which store the full text of any query.
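To illustrate why such URLs are so revealing, the following is a minimal Python sketch, not the researchers' own tooling. The host `translate.example.org`, the `text` query parameter and the `#source/target/text` fragment layout are assumptions made for the example.

```python
from urllib.parse import urlsplit, parse_qs, unquote

def translated_text(url):
    """Return any text carried in a translation URL, or None."""
    parts = urlsplit(url)
    # Case 1 (assumed): the text is passed as a "text" query parameter.
    params = parse_qs(parts.query)
    if "text" in params:
        return params["text"][0]
    # Case 2 (assumed): the fragment is laid out as "source/target/text".
    pieces = parts.fragment.split("/", 2)
    if len(pieces) == 3:
        return unquote(pieces[2])
    return None

# Hypothetical URLs; example.org stands in for the translation service.
query_style = "https://translate.example.org/?sl=de&tl=en&text=Wo+ist+der+Zeuge%3F"
fragment_style = "https://translate.example.org/#de/en/Wo%20ist%20der%20Zeuge%3F"

print(translated_text(query_style))
print(translated_text(fragment_style))
```

Anyone holding the clickstream can recover the text this way without access to the service itself, because the query travels inside the address.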

In many cases identity could be established by examining particular URLs – for instance, if someone logs into their own analytics page on Twitter an address is generated that contains their own username.
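A rough Python sketch of this kind of check is shown below. The URL shape `/user/<username>/` and the host `analytics.example.org` are assumptions for illustration, as are all the names in the code.

```python
import re
from collections import Counter

# Assumed URL shape, for illustration only: /user/<username>/...
USERNAME_IN_PATH = re.compile(
    r"^https?://analytics\.example\.org/user/([A-Za-z0-9_]+)/"
)

def candidate_usernames(clickstream):
    """Count usernames that appear in self-identifying URLs."""
    hits = Counter()
    for url in clickstream:
        match = USERNAME_IN_PATH.match(url)
        if match:
            hits[match.group(1)] += 1
    return hits

# Hypothetical clickstream belonging to one pseudonymous user.
clickstream = [
    "https://news.example.org/politics/story-42",
    "https://analytics.example.org/user/jane_doe/home",
    "https://analytics.example.org/user/jane_doe/tweets",
    "https://shop.example.org/cart",
]

print(candidate_usernames(clickstream).most_common(1))
```

Only the account owner would normally load such a page, so a single hit can be enough to attach a name to the whole stream.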

In other cases the clickstream might indicate that the user visited a particular page, say a YouTube video, at a particular time, while the individual has mentioned watching that video on a publicly visible source such as a blog.
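This matching of public mentions against pseudonymous clickstreams can be sketched in Python as follows. The data, the 24-hour window and all names are hypothetical, and the approach is a simplified illustration rather than the method the researchers actually used.

```python
from datetime import datetime, timedelta

def matches(visits, mention, window):
    """True if the stream holds a visit to the URL shortly before the post."""
    url, posted_at = mention
    return any(
        visit_url == url and timedelta(0) <= posted_at - visited_at <= window
        for visit_url, visited_at in visits
    )

def link_public_mentions(clickstreams, mentions, window=timedelta(hours=24)):
    """Return pseudonymous IDs whose stream is consistent with every mention."""
    return [
        user_id
        for user_id, visits in clickstreams.items()
        if all(matches(visits, mention, window) for mention in mentions)
    ]

# Hypothetical pseudonymous clickstreams: ID -> list of (URL, visit time).
clickstreams = {
    "u1": [
        ("https://video.example.org/watch?v=abc", datetime(2016, 8, 1, 20, 15)),
        ("https://news.example.org/story-42", datetime(2016, 8, 2, 9, 0)),
    ],
    "u2": [
        ("https://video.example.org/watch?v=abc", datetime(2016, 8, 1, 21, 0)),
    ],
}

# Hypothetical public posts by one named person: (URL, time of post).
mentions = [
    ("https://video.example.org/watch?v=abc", datetime(2016, 8, 1, 22, 0)),
    ("https://news.example.org/story-42", datetime(2016, 8, 2, 10, 30)),
]

print(link_public_mentions(clickstreams, mentions[:1]))  # ['u1', 'u2']
print(link_public_mentions(clickstreams, mentions))      # ['u1']
```

Each additional public mention shrinks the set of candidate streams, which is why a small number of addresses can single out one person.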

Nosy browser extensions

“The increase in publicly available information on many people makes de-anonymisation via linkage attacks easier than ever before,” the researchers said in the presentation.

The data was surprisingly easy to obtain, with 95 percent of it coming from 10 popular browser extensions. Such extensions offer users a service, but also monitor everything they do online and either use the data themselves, for purposes such as targeting advertisements, or sell it to third parties.

They estimated up to 10,000 extensions collect detailed user data, but most have a relatively small user base.


“When thinking about surveillance, everyone worries about government agencies like the NSA and big corporations like Google and Facebook,” Eckert wrote in a blog post. “But actually there are hundreds of companies that have also discovered data collection as a revenue source… Most of them keep their data to themselves, some exchange it, but a few sell it to anyone who’s willing to pay.”

Social engineering

The researchers posed as a marketing company that wanted to buy browsing information to train its machine learning tools. It took them about two weeks to obtain one month’s worth of browsing information on 3 million German users, a database of 3 billion URLs spread across 9 million sites.

The information was so sensitive they deleted it after the investigation for fear it might fall into the hands of hackers. In her blog post Eckert said the way browser extension companies collect and resell user data is “often illegal” under European law.

A data broker provided Eckert and Dewes, free of charge, with information obtained from browser plugins including Web Of Trust (WOT), which, ironically, provides reviews of websites’ privacy practices.

After German public broadcasting network NDR published a report based on the study last November, WOT reworked its extensions and mobile app to better protect users’ anonymity, also giving users the ability to opt out of data collection.

But Eckert and Dewes said it’s next to impossible to make a clickstream fully anonymous.

“High-dimensional, user-related data is really hard to robustly anonymise, even if you really try to do so,” they said in the DefCon presentation.

Users who want to anonymise their clickstream themselves can use services such as Tor or a VPN with rotating exit nodes, or client-side software that blocks trackers, they said.

Matthew Broersma

Matt Broersma is a long-standing tech freelance who has worked for Ziff-Davis, ZDNet and other leading publications.
