From Encrypted Drives To Amazon's Cloud -- The Amazing Flight Of The Panama Papers

Updated Apr 11, 2016, 10:43am EDT
This article is more than 8 years old.

It was an epic haul. Whoever caused the Panama Papers breach at tax avoidance and offshore company specialist Mossack Fonseca leaked an astonishing 11 million documents and 2.6 terabytes of data, the largest leak of all time. Previous mega-leaks were in the gigabyte territory: Wikileaks' Cablegate at 1.7GB, Ashley Madison at 30GB, Sony Pictures at an estimated 230GB.

The logistics of the journalistic operation behind the Panama Papers were equally astounding: a year of exchanging information over bespoke open source software between more than 100 publications, from the Guardian to the BBC, and 400 journalists. All of Mossack Fonseca's emails, files and images had to be stored on encrypted drives, then moved to the cloud securely to keep the story from spilling ahead of time or to uninvited parties, whilst remaining usable for both technical and non-technical journalists.

And after that concerted effort, the stories are spilling out fast. A cello-playing close pal of Putin's shovelling money through foreign entities that ended up funding a ski resort where the Russian leader's daughter was married. The prime minister of Iceland failing to disclose his wife owned an offshore firm that had a claim on failed banks. And the father of UK prime minister David Cameron running an offshore company to avoid tax. More are coming.

Where's all of that data stored now? In an Amazon cloud data center, accessible to anyone who knows the URL and has a password. The journey of those files, from the leaks to the revelations, is an astonishing example of developers working with journalists to keep whistleblowers and the information they supply safe and, just as crucially, usable. With the extra kicker: it was largely done using free, open source technology.

The hack

A leaked message to customers would indicate it all started with a typical hack, a preventable one at that. In a letter, dated April 1 and posted on Wikileaks' Twitter profile, the firm told customers it was investigating an email server hack. Mossack Fonseca did not respond to repeated requests for comment on the breach, though director Ramon Fonseca told Reuters the hack was "limited" and complained of an "international campaign against privacy," despite the significant amount of data that was siphoned out of the organization.

Now it's in the media spotlight, Mossack Fonseca is being mocked for alleged poor security practices, as well as facing accusations it facilitated widespread tax avoidance, even where criminal proceeds were involved. (Its full response to those allegations, which it largely denies, can be found here). Its emails were not encrypted, according to ACLU privacy and encryption expert Christopher Soghoian, whilst its websites were peppered with potential weaknesses, ripe for any willing hacker.

FORBES discovered the firm ran a three-month-old version of WordPress for its main site, known to contain some vulnerabilities. More worrisome, according to Internet records, the portal customers used to access sensitive data was most likely run on a three-year-old version of Drupal, 7.23. That platform has at least 25 known vulnerabilities at the time of writing, two of which could have been used by a hacker to upload their own code to the server and start hoovering up data. Back in 2014, Drupal warned of a swathe of attacks on websites based on its code, telling users that anyone who had not updated to version 7.32 within seven hours of its release should assume they’d been hacked.
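The comparison behind that warning, whether a reported version predates the patched release, can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_version` and `predates_patch` are hypothetical, and a version string read from Internet records says nothing about patches applied without bumping the version number.

```python
def parse_version(text):
    """Turn a dotted version string such as '7.23' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# 7.32 is the release named in Drupal's 2014 warning.
PATCHED_RELEASE = parse_version("7.32")


def predates_patch(reported_version):
    """True if the reported version is older than the patched release."""
    return parse_version(reported_version) < PATCHED_RELEASE


print(predates_patch("7.23"))  # the version Internet records suggested
print(predates_patch("7.32"))  # the patched release itself
```

Comparing tuples of integers rather than raw strings matters here: as text, "7.9" sorts after "7.32", even though 7.9 is the older release.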

That critical vulnerability may have been open for more than two-and-a-half years on Mossack Fonseca's site, unless it was patched at the time without the site's version records being updated. It remains a valid route for hackers to try to get more data from the firm and its customers. On its site, the company claims: "Your information has never been safer than with Mossack Fonseca's secure Client Portal." That boast now looks somewhat misguided.

Critical encryption

Whatever weakness was exploited by the leaker, for at least a year, the company didn't notice the breach, or did not issue a public alert. After the leaker pilfered the information and shifted it to their own servers, they made initial contact with Bastian Obermayer, journalist at Süddeutsche Zeitung (SZ), through encrypted chat. That could have been a Jabber client, or Android and iPhone apps such as Signal, Telegram or Wickr.

The leaker, using the name John Doe, made it clear how they would communicate from that point forward. "There are a couple of conditions. My life is in danger. We will only chat over encrypted files. No meeting, ever."

Soon after, the data started pouring in, though not all at once. SZ coordinated with the International Consortium of Investigative Journalists (ICIJ) to handle the humongous data troves that came in incrementally. In the first group meeting, they had to decide on what to do with 1TB. Similar meetings were held until the Papers totalled 2.6TB. According to Mar Cabra, head of the data and research unit at the ICIJ, the files and their replicas were spread across different encrypted hard drives, using the VeraCrypt software to lock up the information. (Obermayer told me he used the same on his PC when handling the Panama Papers).

VeraCrypt is a burgeoning open source file encryption tool that many see as a more secure version of the once widely-used TrueCrypt software. A fork of TrueCrypt, VeraCrypt was designed by French cryptographers at IDRIX, with beta versions released in 2013 for Apple OS X, Microsoft Windows and Linux. After TrueCrypt's developers declared support would no longer be provided, IDRIX's creation became one of a few immediately popular successors.

Mounir Idrassi, the main developer of VeraCrypt, told FORBES over encrypted mail his software fixed many vulnerabilities discovered in TrueCrypt and used stronger algorithms. Idrassi claimed that where the password used to lock and unlock the data is strong - with plenty of different letters, numbers and characters - "it's virtually impossible to crack VeraCrypt volumes."
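Why a mix of letters, numbers and other characters matters can be shown with a rough entropy estimate. The sketch below is not how VeraCrypt evaluates passwords; the function name `estimated_entropy_bits` is hypothetical, and the estimate assumes every character was picked at random, so it overstates the strength of passwords built from dictionary words or predictable patterns.

```python
import math
import string


def estimated_entropy_bits(password):
    """Rough upper bound: length * log2(size of the character pool used).

    Characters outside the four ASCII classes below are not counted
    towards the pool.
    """
    pool = 0
    if any(c in string.ascii_lowercase for c in password):
        pool += 26
    if any(c in string.ascii_uppercase for c in password):
        pool += 26
    if any(c in string.digits for c in password):
        pool += 10
    if any(c in string.punctuation for c in password):
        pool += len(string.punctuation)
    if pool == 0:
        return 0.0
    return len(password) * math.log2(pool)


# Hypothetical example passwords of equal length.
print(estimated_entropy_bits("correcthorse"))
print(estimated_entropy_bits("c0rRect-H0rs"))
```

Each extra bit doubles the number of guesses a brute-force attacker needs, which is why both length and variety count.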

The "hidden volumes" feature lets users unlock the visible and less sensitive part of a VeraCrypt volume with one password, whilst another is used for the sensitive information. "It is technically impossible to prove if there is a hidden volume," said Idrassi. Such a mechanism is important for plausible deniability; a journalist under pressure could hand over the first password but not reveal the second, used to access the most valuable data.

But VeraCrypt is not yet proven. It has not been audited by independent hackers as TrueCrypt was; that audit found TrueCrypt largely secure even after support had been dropped. According to Steve Lord, a whitehat hacker with UK-based Mandalorian, the software is not enough on its own to protect files. "There needs to be associated processes for handling and communicating information securely, and I’d avoid connecting any system with the raw Panama data to the Internet, if at all possible. There are a lot of people who want that data," Lord said.

According to Cabra, however, there was only one problem they had with using VeraCrypt: one drive became corrupted as the initial batch of data was moved over. The team simply had to re-start the process. As far as she's aware, there's been no indication the leaks have been leaked again, outside the grasp of the journalists privy to the files.

Up to Amazon

And she wasn't afraid to make the information reachable on the web. To make all the information accessible to more than 400 journalists, Cabra said the files were uploaded to Amazon, a lengthy process, but not as time-consuming as sorting data into searchable formats.

All the software used was open source, tweaked to suit the reporters' needs. The search tool, allowing reporters to hunt for names like Putin or places like the British Virgin Islands, was based on Apache Solr, used by a large number of search-heavy organizations, including DuckDuckGo, the privacy-focused search engine. Solr was combined with Apache Tika, a content extraction toolkit that can parse different file types, be they PDFs or emails as in the Panama Papers, separating the text from the non-essential data. Layered on top was the shiny interface, built using Blacklight, another open source project.
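The general pattern described here, extract the text from each file and then index it so that names can be looked up quickly, can be illustrated with a toy inverted index in plain Python. This is a sketch of the idea only, not Solr, Tika or the ICIJ's code: `extract_text` stands in for the extraction step, and the sample records are invented.

```python
import re
from collections import defaultdict


def extract_text(document):
    """Stand-in for the extraction step: keep only the body text."""
    return document["body"]


def build_index(documents):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, document in documents.items():
        for word in re.findall(r"\w+", extract_text(document).lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample records, not taken from the leak.
documents = {
    "email-001": {"type": "email", "body": "Transfer approved for Example Holdings"},
    "memo-002": {"type": "pdf", "body": "Example Holdings registered in Tortola"},
}
index = build_index(documents)
print(sorted(search(index, "example holdings")))
print(sorted(search(index, "tortola")))
```

A production search engine adds a great deal on top of this, such as ranking, stemming and fuzzy matching, but the lookup table from word to documents is the core reason a search across millions of files can return in moments.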

Once built, more than 400 reporters, who would meet in person at events organized by SZ and the ICIJ throughout 2015 and 2016, only needed the link and a randomly-generated password to start rummaging around the Panama Papers for leads. Outside of security against brute force guessing of usernames and passwords, there was no other access protection, though anyone communicating with the site did so over encrypted lines using the SSL protocol, just as cryptographically-protected websites from banks to Facebook do.
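Those two protections, a randomly generated password and a limit on guessing, might look something like the Python sketch below. It is an assumption-laden illustration rather than the ICIJ's implementation: `generate_password`, `LoginGuard`, the 20-character length and the five-attempt limit are all hypothetical, and a real site would store password hashes rather than compare plaintext.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def generate_password(length=20):
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


class LoginGuard:
    """Refuse further attempts once an account has too many failures in a row."""

    def __init__(self, max_failures=5):
        self.max_failures = max_failures
        self.failures = {}

    def attempt(self, username, supplied, expected):
        if self.failures.get(username, 0) >= self.max_failures:
            return False  # locked out, even if the password is right
        # Constant-time comparison avoids leaking how many characters matched.
        if secrets.compare_digest(supplied.encode(), expected.encode()):
            self.failures[username] = 0
            return True
        self.failures[username] = self.failures.get(username, 0) + 1
        return False


password = generate_password()
guard = LoginGuard()
print(guard.attempt("reporter", password, password))
```

The lockout is what blunts brute-force guessing: without it, an attacker who knows the URL can keep trying passwords indefinitely.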

To understand what they were looking at, the reporters could use integrated data visualization, which combined the graph database Neo4j with Linkurious to make it easier to spot connections between files.
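The kind of question a graph database answers, how one name is connected to another, can be sketched with a breadth-first search over a small in-memory graph. This is plain Python rather than Neo4j or Linkurious, and the entities and links below are invented for illustration.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search for the shortest chain of links between two entities."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # no connection found


# Hypothetical entities and relationships, not taken from the leak.
edges = [
    ("Person A", "Shell Co 1"),
    ("Shell Co 1", "Nominee Director"),
    ("Nominee Director", "Shell Co 2"),
    ("Shell Co 2", "Person B"),
]
print(shortest_path(edges, "Person A", "Person B"))
```

Storing people, companies and intermediaries as nodes, with relationships as edges, turns "who is linked to whom" into a path query instead of a manual trawl through documents.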

A separate site, a "virtual newsroom" as Cabra called it, did include an extra protection: two-factor authentication using Google Authenticator, providing an additional one-time code to enter after the password was provided. In that space, reporters could update colleagues with their latest story ideas, all delivered via a Facebook-like newsfeed, whilst using the chat feature for further collaboration. Again, the social network was constructed on open source software, Oxwall. (ICIJ makes some of its own tools open source too, the most recent addition on Github being a command line tool for content analysis).
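Google Authenticator generates its one-time codes with the time-based one-time password (TOTP) algorithm standardized in RFC 6238. A minimal Python sketch of that algorithm follows; how the newsroom site was actually configured is not known, so the 30-second step and six-digit default are assumptions based on common practice, and the secret is the published RFC test key rather than anything real.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32, for_time=None, step=30, digits=6):
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1)."""
    if for_time is None:
        for_time = time.time()
    key = base64.b32decode(secret_b32, casefold=True)
    # The moving factor is the number of time steps since the Unix epoch.
    counter = int(for_time) // step
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks an offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Shared secret from the RFC 6238 test vectors, base32-encoded as authenticator apps expect.
secret = base64.b32encode(b"12345678901234567890").decode("ascii")
print(totp(secret, for_time=59, digits=8))  # RFC 6238 test vector: 94287082
print(totp(secret))  # six-digit code for the current 30-second window
```

Because the server and the phone derive the code from the same secret and the current time, a stolen password alone is not enough to log in.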

Some reporters, including those at SZ, also used Nuix, a proprietary tool often used by law enforcement bodies and auditing companies to uncover evidence in epic data repositories. The Sydney-based company's CEO Eddie Sheehy said though 2.6TB was large, his organization's software had helped search through 300-400TB of information before. Customers include the US Secret Service, Homeland Security, the European Commission and the Home Office. Nuix has partnered with the ICIJ for the last five years. "We break data down to its smallest component and start to tell stories about them, whether that's IP addresses, telephone numbers, company names," Sheehy said.

The value of cryptography

All of this, Cabra told me, was designed to employ cryptography in a usable way - something all organizations, including Mossack Fonseca, would benefit from. "Reporters are getting more and more used to using encryption and it’s getting less and less complicated." Cabra might also have just helped coordinate the most significant open source project ever seen.

And at a time when the FBI and various governments are seeking to install backdoors in major products, most notably Apple's iPhone, the Panama Papers show just how vital encryption is to revealing stories of corruption that are undoubtedly in the public interest.

"It is a proud moment to know that VeraCrypt was useful for such important revelations especially in the current political climate where the use of encryption is vilified in almost every media outlet," added Idrassi, who did not know his creation was used to secure the Panama Papers. "It is important to take this opportunity to educate the masses and politicians alike that encryption is not only used for bad things but more importantly it is of critical use by journalists, human right activists and other dissidents living under repressive regimes."

UPDATE Though Internet records would indicate Mossack Fonseca is running the vulnerable Drupal 7.23, the company may have patched certain vulnerabilities and not updated those records. This is, according to whitehat hacker Steve Lord, unlikely but possible. The story has been updated to reflect that.

The company is still running an old version of the software, Drupal 7, according to the source code for the site, which has numerous known vulnerabilities.
