Logo for Special Source Operations branch of NSA from PowerPoint disclosed by Edward Snowden

Millions of users from around the world have had their contact lists from their personal email and instant messaging accounts harvested by the National Security Agency, according to a report from The Washington Post’s Barton Gellman and Ashkan Soltani.

The collection “sweeps in the contacts of many Americans,” as two unnamed “senior US intelligence officials” confirmed. The number of Americans affected could be “in the millions or tens of millions.”

The previously undisclosed program is detailed in documents obtained from former NSA contractor Edward Snowden. Neither Congress nor the Foreign Intelligence Surveillance Court has authorized the collection of “contact lists in bulk.” It would be illegal to do this kind of collection at facilities in the United States; however, the agency is able to avoid these legal restrictions by “intercepting contact lists from access points ‘all over the world.’” Since none of these points are on US territory, there are no limits to what can be collected and stored.

The agency does not have to restrict its intake to “contact lists belonging to specified foreign intelligence targets.” Anything passing through is assumed not to be from US persons.


In practice, data from Americans is collected in large volumes — in part because they live and work overseas, but also because data crosses international boundaries even when its American owners stay at home. Large technology companies, including Google and Facebook, maintain data centers around the world to balance loads on their servers and work around outages.

The only check against abuse by officials in the NSA appears to be that they will not search the database containing all of this data unless a case can be made that there is information on a “valid foreign intelligence target.” However, there is no outside review of the NSA’s decision to conduct a search of this data. It can abuse its authority, stretch what is permissible under “minimization rules,” and not face any legal challenge or other consequences.

An internal PowerPoint presentation slide indicates that on a single day in 2012 the Special Source Operations branch collected the following number of email address books: 444,743 from Yahoo!, 105,068 from Hotmail, 33,697 from Gmail, 82,857 from Facebook and 22,881 from other providers.

This typical intake suggests more than 250 million address books are collected in a year. However, on a single day, less than 14% could be attributed to a target, which means that at that rate the NSA collected and stored about 215 million contact lists a year that it would never be able to use.
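The arithmetic behind these yearly figures can be checked with a short sketch. It assumes, as the estimate itself does, that the single-day counts from the slide repeat every day of the year, and it treats 14% as the upper bound on the share tied to a target. The variable and function names are illustrative only.

```python
# Per-provider address-book counts for a single day in 2012, as quoted from the slide.
DAILY_ADDRESS_BOOKS = {
    "Yahoo!": 444_743,
    "Hotmail": 105_068,
    "Gmail": 33_697,
    "Facebook": 82_857,
    "Other providers": 22_881,
}

DAYS_PER_YEAR = 365   # assumption: the same intake every day of the year
TARGET_SHARE = 0.14   # assumption: "less than 14%" treated as an upper bound


def yearly_estimate(daily_counts, days=DAYS_PER_YEAR):
    """Extrapolate one day's intake to a full year."""
    return sum(daily_counts.values()) * days


def untargeted_estimate(total, target_share=TARGET_SHARE):
    """Address books that could not be attributed to any target."""
    return round(total * (1 - target_share))


daily_total = sum(DAILY_ADDRESS_BOOKS.values())
yearly_total = yearly_estimate(DAILY_ADDRESS_BOOKS)
untargeted = untargeted_estimate(yearly_total)

print(f"Collected per day:   {daily_total:,}")
print(f"Collected per year:  {yearly_total:,}")
print(f"Not tied to a target: {untargeted:,}")
```

Under these assumptions the daily total is 689,246 address books, or roughly 251.6 million a year, of which about 216 million would not be tied to any target. Rounding the yearly total down to 250 million gives the figure of about 215 million.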

Five hundred thousand “buddy lists” and inboxes were collected on a typical day, according to the same presentation. Ninety percent of the data is collected because the contact lists contained what the NSA was looking for in the search. In other words, the data was most often collected from a person who was not a target or whose content was definitely not relevant to any sort of investigation.

The collection of too much data that is not useful and will never be necessary for any investigation is acknowledged as a problem on a page of an internal Wikipedia-style wiki called “Intellipedia.”

In 2011, a “significant portion of collection was repetitive.” The data was also found to be “of little foreign intelligence value.”

A program called SCISSORS was apparently developed to help reduce the amount of extraneous data collected. It would block “ownerless address books” at “points of access.” A “throttle” would also be placed in another previously reported program, XKeyScore, which The Guardian described as a tool that makes it possible to collect “nearly everything a user does on the internet.” [A feature related to Yahoo! Messenger is highlighted as contributing to “excessive collection.”]

Another PowerPoint presentation highlights the need to shift “collection philosophy” at the NSA. It suggested that analysts “memorialize what you need” versus “order one of everything off the menu and eat what you want.” Storing “less of the wrong data” was recommended, and a slide indicated that a 20% reduction in “content to long-term repositories” was in the process of taking place.

Internet companies are not notified by the NSA of the collection because the data is collected “on the fly” as it crosses Internet switches. Companies probably like this arrangement because they can claim there was no way they could challenge violations of their users’ privacy.

As NSA whistleblower William Binney told The Daily Caller in June:

They’re making themselves dysfunctional by collecting all of this data. They’ve got so much collection capability but they can’t do everything. They’re probably getting something on the order of 80 percent of what goes up on the network. So they’re going into the telecoms who have recorded all of the material that has gone across the network. And the telecoms keep a record of it for I think about a year. They’re asking the telecoms for all the data so they can fill in the gaps. So between the two sources of what they’ve collected, they get the whole picture.

They can do textual processing at a rate of about 10 gigabits a second. What that means is about a million and a quarter 1,000-character emails a second. They’ve got something like 10 to 20 sites for this around the United States. So you can really see why they need to build something like Utah to store all of this stuff. But the basic problem is they can’t figure out what they have, so they store it all in the hope that down the road they might figure something out and they can go back and figure out what’s happening and what people did. It’s retroactive analysis. The FBI is using it that way too.

This “hoarding complex” actually increases the likelihood that America will be attacked by terrorists.

Six terrorist attacks have been carried out by people whom the FBI or CIA had previously identified. The US intelligence community’s commitment to building the haystack bigger and bigger and bigger contributed to a failure to detect David Headley, Abdulhakim Mujahid Muhammad, Major Nidal Hasan, Umar Farouk Abdulmutallab and Tamerlan Tsarnaev. The sheer amount of data overwhelmed analysts and they failed to connect the dots.

Sen. Dianne Feinstein, a key defender of preserving and even expanding intelligence agencies’ surveillance powers, would have Americans believe that if these kinds of programs had been available prior to the September 11th attacks, planes may not have been hijacked. However, as FBI whistleblower Coleen Rowley has pointed out, dots were not connected back then because there was too much intelligence, not a lack of intelligence. Agencies had all the information; they just were not able to act on, assess or appreciate what they had collected.

None of the presentations suggest the NSA concedes that its hoarding of data is getting in the way of its ability to “protect” the country. The acknowledgment that the agency collects too much data is an acknowledgment that there are technical issues that employees should prevent.

There are at least twelve recent known cases in which NSA employees abused surveillance tools to spy on girlfriends, husbands suspected of cheating, or family, or to target women for sex. These were all self-reported cases, which raises the possibility that many more cases of abuse have taken place.

What the collection of data from contact lists allows is the construction of “maps of a person’s life.” One can see their personal, professional, political and religious connections. Though there is no indication of how contact list data has contributed to the fight against terrorism, it seems reasonable to ask whether this collected data is more likely to be abused for sexual purposes or private investigations (that may also be political) before the vast collection of data ever helps thwart a terrorist plot or activity.