Canada’s leading surveillance agency is monitoring millions of Internet users’ file downloads in a dragnet search to identify extremists, according to top-secret documents.
The covert operation, revealed Wednesday by CBC News in collaboration with The Intercept, taps into Internet cables and analyzes records of up to 15 million downloads daily from popular websites commonly used to share videos, photographs, music, and other files.
The revelations about the spying initiative, codenamed LEVITATION, are the first from the trove of files provided by National Security Agency whistleblower Edward Snowden to show that the Canadian government has launched its own globe-spanning Internet mass surveillance system.
According to the documents, the LEVITATION program can monitor downloads in several countries across Europe, the Middle East, North Africa, and North America. It is led by the Communications Security Establishment, or CSE, Canada’s equivalent of the NSA. (The Canadian agency was known as “CSEC” until a recent name change.)
The latest disclosure sheds light on Canada’s broad existing surveillance capabilities at a time when the country’s government is pushing for a further expansion of security powers following attacks in Ottawa and Quebec last year.
Ron Deibert, director of University of Toronto-based Internet security think tank Citizen Lab, said LEVITATION illustrates the “giant X-ray machine over all our digital lives.”
“Every single thing that you do – in this case uploading/downloading files to these sites – that act is being archived, collected and analyzed,” Deibert said, after reviewing documents about the online spying operation for CBC News.
The ostensible aim of the surveillance is to sift through vast amounts of data to identify people uploading or downloading content that could be connected to terrorism – such as bomb-making guides and hostage videos.
In the process, however, CSE combs through huge volumes of data showing uploads and downloads initiated by Internet users not suspected of any wrongdoing.
In a top-secret PowerPoint presentation dated mid-2012, an analyst from the agency jokes about how, while hunting for extremists, the LEVITATION system gets clogged with information on innocuous downloads of the musical TV series Glee.
CSE finds some 350 “interesting” downloads each month, the presentation notes, a number that amounts to less than 0.0001 per cent of the total collected data.
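The proportion can be checked with a back-of-the-envelope calculation. The sketch below assumes a 30-day month and that the upper figure of 15 million download records is reached every day; neither assumption comes from the documents, and the constant names are illustrative.

```python
# Rough check of the "less than 0.0001 per cent" figure.
# Assumptions (not from the documents): a 30-day month, and the upper
# bound of 15 million download records collected every day.
DOWNLOADS_PER_DAY = 15_000_000
DAYS_PER_MONTH = 30
INTERESTING_PER_MONTH = 350

total_per_month = DOWNLOADS_PER_DAY * DAYS_PER_MONTH
share_percent = INTERESTING_PER_MONTH / total_per_month * 100

print(f"{total_per_month:,} records per month")
print(f"{share_percent:.7f} per cent flagged as interesting")
```

Under those assumptions the total is 450 million records a month, and the flagged share comes to roughly 0.00008 per cent, consistent with the presentation’s claim.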
The agency stores details about downloads and uploads to and from 102 different popular file-sharing websites, according to the 2012 document, which describes the collected records as “free file upload,” or FFU, “events.” Only three of the websites are named: RapidShare, SendSpace, and the now defunct MegaUpload.
SendSpace said in a statement that “no organization has the ability/permission to trawl/search Sendspace for data,” adding that its policy is not to disclose user identities unless legally compelled. Representatives from RapidShare and MegaUpload had not responded to a request for comment at time of publication.
LEVITATION does not rely on cooperation from any of the file-sharing companies. A separate secret CSE operation codenamed ATOMIC BANJO obtains the data directly from Internet cables that it has tapped into, and the data can then be viewed through the agency’s OLYMPIA program. CSE then sifts out the unique IP address of each computer that downloaded files from the targeted websites.
The IP addresses are valuable pieces of information to CSE’s analysts, helping to identify people whose downloads have been flagged as suspicious. The analysts use the IP addresses as a kind of search term, entering them into other surveillance databases that they have access to, such as the vast repositories of intercepted Internet data shared with the Canadian agency by the NSA and its British counterpart Government Communications Headquarters.
Once a suspicious file-downloader is identified, analysts can plug that person’s IP address into MUTANT BROTH, a database run by the British electronic spy agency Government Communications Headquarters (GCHQ), to see five hours of that computer’s online traffic before and after the download occurred, opening the door for further surveillance of the downloader’s activities.
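The time-window lookup described here amounts to filtering records by address and timestamp. The following is a minimal sketch of that idea only, not a description of how MUTANT BROTH works: the `Record` fields, the `traffic_around` function, and the sample data are all hypothetical, and it assumes “five hours before and after” means a window of five hours on each side of the download.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Record(NamedTuple):
    """One hypothetical traffic record; the field names are illustrative."""
    timestamp: datetime
    ip: str
    description: str


def traffic_around(records, ip, download_time, hours=5):
    """Return records for `ip` within `hours` before or after `download_time`."""
    window = timedelta(hours=hours)
    return [
        r for r in records
        if r.ip == ip and abs(r.timestamp - download_time) <= window
    ]


# Made-up sample data using documentation-only IP addresses (RFC 5737).
download_time = datetime(2012, 6, 1, 12, 0)
records = [
    Record(datetime(2012, 6, 1, 6, 30), "192.0.2.10", "outside window"),
    Record(datetime(2012, 6, 1, 9, 15), "192.0.2.10", "before download"),
    Record(datetime(2012, 6, 1, 14, 45), "192.0.2.10", "after download"),
    Record(datetime(2012, 6, 1, 13, 0), "198.51.100.7", "different address"),
]

matches = traffic_around(records, "192.0.2.10", download_time)
for r in matches:
    print(r.timestamp, r.description)
```

Only the two records from the same address that fall inside the window are returned; the earlier record and the one from a different address are dropped.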
That can sometimes lead them to a Facebook profile page and to a string of Google and other cookies used to track online users’ activities for advertising purposes. This can help identify an individual.
In one example in the top-secret document, analysts also used the U.S. National Security Agency’s powerful MARINA database, which keeps online metadata on people for up to a year, to search for further information about a target’s Facebook profile. It helped them find an email address.
After doing its research, the LEVITATION team passes on a list of suspects to CSE’s Office of Counter Terrorism.
Since the secret 2012 presentation about LEVITATION was authored, both RapidShare and SendSpace have toughened security by encrypting users’ connections to their websites, which may have thwarted CSE’s ability to target them for surveillance. But many other popular file-sharing sites have still not adopted encryption, meaning they remain vulnerable to the snooping.
As of mid-2012, CSE was maintaining a list of 2,200 particular download links that it regarded as connected to suspicious “documents of interest.” Anyone clicking on those links could have found themselves subject to extra scrutiny from the spies.
The file-sharing surveillance also raises questions about the number of Canadians whose downloading habits could have been swept up as part of LEVITATION’s dragnet.
By law, CSE isn’t allowed to target Canadians. Canada’s commissioner charged with reviewing the secretive group found it unintentionally swept up private communications of 66 Canadians while monitoring signals intelligence abroad, but concluded there was no sign of unlawful practice.
In the LEVITATION presentation, however, two Canadian IP addresses that trace back to a web server in Montreal appear on a list of suspicious downloads found across the world. The same list includes downloads that CSE monitored in closely allied countries, including the United Kingdom, United States, Spain, Brazil, Germany and Portugal.
It is unclear from the document whether LEVITATION has ever prevented any terrorist attacks. The agency cites only two successes of the program in the 2012 presentation: the discovery of a hostage video through a previously unknown target, and an uploaded document that contained the hostage strategy of a terrorist organization. The hostage in the discovered video was ultimately killed, according to public reports.
A CSE spokesman declined to comment to The Intercept on whether LEVITATION remained active, and would not provide examples of useful intelligence gleaned from the spying, or explain how long data swept up under the operation is retained.
I’ve argued the NSA does similar analysis on upstream collection using known codes tied to Inspire (not the URL, necessarily, but possibly the encryption code included in each Inspire edition), which would basically identify the people within the US who had downloaded AQAP’s propaganda magazine. One reason I’m so confident NSA does this is the high number of FBI sting operations that seem to arise from some 20-year-old downloading Inspire, which then appears to get sent out to a local FBI office for further research into online activities and, ultimately, an approach by a paid informant or undercover officer.
But as the “Scoreboard” slide in this presentation makes clear, what this process gives you is not validated IDs, but rather probabilistic matches (which FISC appears to deal with using minimization procedures, suggesting they let NSA collect on these probabilistic matches with the understanding they have to treat the data in some certain way if it ends up being a false positive).
That’s important not just for the young men whom FBI decides might make worthwhile targets (even if they’re being targeted, largely, on their First Amendment activities).
It’s important, too, for the false negatives, by far the most important of which I believe to be the Tsarnaev brothers, both of whom reportedly had downloaded multiple issues of Inspire, as well as other similar jihadist material, and on whom NSA had collected data it never accessed until after the attack, but neither of whom got targeted off this correlation process before they attacked the Boston Marathon.
That is, this really important possible false negative, just as much as the dubious positives that end up getting unbalanced young men targeted by the FBI, may say as much about the reliability of this process as anything else.
This CSE PPT is not yet proof that my suspicions are entirely accurate (though my claims here about correlations are based on officially released documents). But it strongly suggests my suspicions have been correct.
And, particularly given ODNI’s refusal to release what appears to be a key opinion describing the terms on which FISC permits the use of these correlations, this ought to elicit far more conversations about how NSA and its Five Eyes partners “correlate” identities and how those correlations get used.