December 13, 2018

My Bucket's Got a Hole in it - Cloud Storage vs Security


I think we all know of Hank Williams Sr from when he famously made popular the hit song, "My S3 Bucket's Got a Hole in it".  No?  I've gotta get a handle on things.. (harder to think of puns about buckets and clouds, but I'll do what I can)  Even though it may not have been mentioned explicitly, I think it's safe to assume he was referring to remote data storage as it relates to information security.  In this article I'd like to explore the traditional storage of data on web servers as well as cloud storage, such as Amazon's S3 buckets, and weigh the pros and cons of both.  Specifically, I'd like to focus on storage for content a customer may upload, say through a web portal.  No country music artists were harmed in the making of this blog.

Recent Events with Cloud Storage

Leaky buckets have been all over the news lately.  Big blunders came from the US Government (NSA, Pentagon, & US Voter Records), GoDaddy, Booz Allen Hamilton, Dow Jones, Verizon, Time Warner.. the list goes on and on.  There are smaller ones all the time, but they pale (🙈) in comparison.  Millions upon millions of sensitive customer and corporate records were dumped out on the public Internet, allowing anyone to read the content.  Their buckets have major holes in them, and I'm talking the big 5 gallon Home Depot ones!  It used to be that large data exposures were the result of a vulnerability that was exploited, such as SQL Injection or internal malware that allowed attackers to have access to sensitive internal data stores.

So why today, when we're seeing a decrease in things like SQL Injection, are we continuing to see these large-scale data exposures?  You guessed it, leaky buckets!  When an organization plans to put sensitive data in the cloud, they expect it to be internally visible and not publicly open, so why do these things happen?  Like with many security issues, a lack of secure configuration hardening standards and change control mishaps may be to blame here.  According to some reports, as much as 7% of S3 instances are publicly accessible without requiring authentication and with many being unencrypted.  That's scary stuff!  Is it worth the risk at all?  Should we be using buckets?  Is Hank Williams Sr responsible?  Is anyone still reading this?

Traditional Internal Storage

This is how things were always done, before "the cloud".  For our younger readers, think of the Monolith from 2001: A Space Odyssey.

Mankind Before the Cloud

Before "the cloud" (ooh.. ahhh..) and still today, user uploads and other data specific to the web service were primarily hosted on the same physical server as the web service itself.  (Crazy! No way)  Sometimes, even within the same web service path.  If not handled properly, this can lead to an entire slew of vulnerabilities.  As a web application penetration tester at Pondurance, I'm always looking for the following issues as they relate to file upload functionality, either as a way to compromise the host or to gain access to sensitive data.

Web Shells

Think Spiderman at the beach.  Hackers love shells.  In their simplest form, they allow an attacker to remotely control a compromised host through commands, typically via a terminal.  A common example is a reverse shell, which connects back to the attacker and exposes, say, the Windows command prompt or bash on Linux.  The attacker can then issue OS commands just as the victim could, in that user's context.

Issuing OS Commands (df -h)

Now, think about that upload functionality you've seen on websites before.  If there's an unrestricted file upload vulnerability, meaning the server doesn't limit what content you can upload (or you can bypass those controls), then in theory you can upload your own content for the web server to execute.  If the content you upload lands in a web directory and its filename is known, you can view the files you upload in the browser on that site.

What happens if you upload your own web page with the same server-side scripting language the service is using?  It'll be processed as if it's part of the site!  Now you can add in some OS command execution scripts and essentially run commands on the OS from the web browser, AKA a web shell.

Simple PHP Web Shell ("df -h" via Browser)
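The core of any web shell, sketched here in Python for illustration (real-world versions are usually a one-line PHP file, but the mechanics are identical): attacker-supplied text from a request parameter is handed straight to the operating system.

```python
import shlex
import subprocess

def web_shell(cmd: str) -> str:
    """The essence of a web shell: execute whatever command string
    arrives (e.g. from a ?cmd= query parameter) and return the output.
    Real PHP shells typically pass the string to system() or exec();
    you never want anything like this reachable on a server."""
    result = subprocess.run(shlex.split(cmd), capture_output=True, text=True)
    return result.stdout

# An attacker requesting shell.php?cmd=df+-h effectively triggers
# the same call with "df -h"; here's a harmless demonstration:
print(web_shell("echo pwned"))  # prints "pwned"
```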

If you're using a path external to the web service or cloud storage, this won't result in a web shell.  

Directory Indexing

I see examples of this most often in WordPress instances, but it's also a common configuration default in Apache.  It's possible with any web service, really.  This is potentially dangerous because it allows the user to browse the web directory structure and see file contents they may otherwise not be privy to.  I've done penetration tests in the past where directory listing exposed files with unique, hard-to-guess names that content discovery techniques would never have found, such as SQL database dumps, configuration files, etc.

Another issue is in the form of information disclosure.  It's generally a bad idea to allow attackers to know more about the layout of your site than necessary.  Even if there's nothing sensitive sitting in a directory now, it's possible something could get saved out there and grabbed at a later time.

Wordpress Directory Listing Example

As with web shells, if your data is stored elsewhere, such as the cloud, this isn't likely to be a vulnerability you'll see.
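If you are hosting content locally on Apache, the usual hardening step is to disable indexing explicitly.  A minimal sketch, assuming a typical /var/www/html document root (paths and context will vary by distribution):

```apache
# Disable automatic directory listings for the document root.
# "-Indexes" removes the Indexes option, so mod_autoindex returns
# 403 Forbidden instead of a file listing when no index file exists.
<Directory "/var/www/html">
    Options -Indexes +FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>
```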

Directory Traversal

Depending on the permissions of the account associated with the running web service, it may be possible to inject characters into a relative path to traverse up the directory tree.  For example, if the web service is calling a file you just uploaded, say "ThisIsAllYourFaultHank.pdf", and the URL that fetches it includes the filename as a path, you might be able to inject "../" before the filename to go up a directory.  That traversed path won't exist, but if you go up enough times, you might be able to call a file the web service account can read, such as "/etc/passwd".

Directory Traversal Example

If your content is hosted anywhere on this server, even potentially outside of the web directory structure, it may be accessible depending on permissions.
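If you do serve uploads from local disk, the standard defense is to canonicalize the user-supplied path and verify it still lives under the intended root.  A minimal Python sketch, assuming a hypothetical upload root:

```python
import os

UPLOAD_ROOT = "/var/www/uploads"  # hypothetical storage root

def safe_path(filename: str) -> str:
    """Resolve a user-supplied filename against the upload root and
    refuse anything that escapes it (e.g. "../../etc/passwd")."""
    candidate = os.path.normpath(os.path.join(UPLOAD_ROOT, filename))
    # After normalization, the path must still live under UPLOAD_ROOT.
    if os.path.commonpath([UPLOAD_ROOT, candidate]) != UPLOAD_ROOT:
        raise ValueError("traversal attempt blocked: " + repr(filename))
    return candidate

print(safe_path("ThisIsAllYourFaultHank.pdf"))  # resolves under the root
try:
    safe_path("../../etc/passwd")
except ValueError as e:
    print(e)  # the "../" sequences are caught after normalization
```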

Direct Object Reference

With the exception of directory traversal (and references leaked through other means), a direct object reference is needed for most of these attacks to work.  Even if you successfully upload a web shell or malware, if you don't know the path needed to call that file back, your attack is going to be limited.  Indirect object references are a way of exposing a unique identifier, such as a hash or GUID, instead of the filename and path itself.  (When direct references are exposed insecurely, that's the classic Insecure Direct Object Reference, or IDOR, finding.)  This is just another preventative layer to help protect against issues like malware execution, web shell execution, information disclosure, and directory traversal.
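One way to sketch that indirect layer in Python, using a hypothetical in-memory map (a real application would persist the mapping in a database):

```python
import uuid

# Hypothetical in-memory mapping of opaque tokens to real paths.
_object_map = {}

def store_reference(real_path: str) -> str:
    """Issue an opaque token instead of exposing the file's real path."""
    token = str(uuid.uuid4())
    _object_map[token] = real_path
    return token

def resolve_reference(token: str) -> str:
    """Look up the real path server-side; unknown tokens get nothing."""
    if token not in _object_map:
        raise KeyError("no such object")
    return _object_map[token]

token = store_reference("/var/www/uploads/report.pdf")
# The client only ever sees something like /download?id=<token>,
# never the path itself, so there's nothing to traverse or guess.
print(resolve_reference(token))
```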

Malware Execution

If upload functionality within a web site is unrestricted, meaning it doesn't properly filter the uploaded content (by content type, file extension, etc.), it may allow uploading of malicious files.  Depending on the circumstances, the malware may be executed by the bad actor or discovered and executed by a curious system administrator at a later time.  Since the web service is storing uploaded content on the same OS, the OS can become infected and lead to a compromise of all other user data or anything else on the host.  This isn't really possible within an S3 environment, unless the web server executes content locally after retrieving it from cloud storage.
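A hedged sketch of the kind of filtering described above, assuming a hypothetical allowlist.  Extension checks alone are easy to spoof by renaming a file, so this also peeks at the file's magic bytes (real applications should additionally store uploads outside the web root and rename them):

```python
import os

# Hypothetical allowlist; adjust to what your application actually needs.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".txt"}

# Magic-byte prefixes for a couple of common formats, so that
# "evil.php" renamed to "evil.pdf" still gets rejected.
MAGIC_BYTES = {
    ".pdf": b"%PDF",
    ".png": b"\x89PNG",
}

def upload_allowed(filename: str, content: bytes) -> bool:
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False
    expected = MAGIC_BYTES.get(ext)
    if expected is not None and not content.startswith(expected):
        return False
    return True

print(upload_allowed("report.pdf", b"%PDF-1.7 ..."))  # True
print(upload_allowed("shell.php", b"<?php system($_GET['cmd']); ?>"))  # False
print(upload_allowed("shell.pdf", b"<?php system($_GET['cmd']); ?>"))  # False
```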

Traditional Storage Takeaway

Okay, so these are bad.. but what about the benefits of self-hosted content?  Well, for one, you get to keep it close to you.  Give me a minute.. I'm working on finding another pro..  I can't say availability, because AWS often has more redundancies in place.  I can't say load times, since CDNs are made for this very reason.  I can't really say access control, since buckets are pretty granular when it comes to permissions.  Let's just go with "keeping your data close".  That doesn't mean it's not reachable; any of the methods mentioned above could lead to a compromise of data no matter where it's hosted.  I'm also not saying that housing your own data means it's insecure; there are secure ways of doing this, for sure.

So let's look at the positive and negative side of cloud storage and see how it stacks up!

Buckets and Containers

Amazon has its Simple Storage Service (S3) and Microsoft Azure has its Blob Storage containers.  There's a laundry list of other cloud data storage solutions out there as well.  For the most part, they all operate the same way: they allow for the storage of any data type, and they're scalable and reliable.

Both S3 buckets and Blob containers are private and secure by default.  This is probably the opposite of what most people assume when they see these stories about public leaks of data.  In fact, most issues are introduced by administrators accidentally exposing these buckets to the public Internet or making permissions too loose.  Human error whaaat?
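To make "permissions too loose" concrete: in S3, the usual hole is an ACL grant to one of Amazon's predefined global groups.  This sketch inspects an ACL dict shaped like the response from boto3's get_bucket_acl(), using only the standard library so nothing here touches a real bucket:

```python
# Amazon's predefined groups: AllUsers means literally anyone on the
# Internet; AuthenticatedUsers means any AWS account, not just yours.
PUBLIC_GROUPS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}

def public_grants(acl: dict) -> list:
    """Return the ACL grants that expose a bucket beyond its owner."""
    return [
        g for g in acl.get("Grants", [])
        if g.get("Grantee", {}).get("Type") == "Group"
        and g["Grantee"].get("URI") in PUBLIC_GROUPS
    ]

# A leaky bucket's ACL might look like this:
leaky_acl = {
    "Grants": [
        {"Grantee": {"Type": "CanonicalUser", "ID": "owner"},
         "Permission": "FULL_CONTROL"},
        {"Grantee": {"Type": "Group",
                     "URI": "http://acs.amazonaws.com/groups/global/AllUsers"},
         "Permission": "READ"},
    ]
}
print(public_grants(leaky_acl))  # flags the AllUsers READ grant
```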

As a penetration tester, I know there are a number of tools out there that assist in identifying cloud storage buckets and containers that are not properly configured.  One such tool is Bucket Stream, which cleverly watches certificate transparency logs to match up domains with potential S3 buckets.

Open source tools such as prowler can be leveraged to perform a security configuration hardening audit based on the CIS Amazon Web Services Foundations Benchmark.  This is useful not just for S3 buckets, but for many of the AWS services.  There's a variety of similar tools, depending on what cloud infrastructure you have, to measure your security posture against best practices.


Securing data or information has been a challenge as long as the digital world has been around.  That's the whole premise behind information security, so this isn't exactly a new problem.  Whether you decide to go with cloud storage or traditional storage, I hope this blog helps to shine a light on the pros and cons of each.  Cloud storage has gotten a bit of a bad reputation lately, but as we've seen, security issues can be introduced in either solution.  Also, it's entirely possible your organization intends to make some S3 data public.  After all, it doesn't have to be private.  Just make sure the data you classify as private is handled appropriately with ACLs or separate buckets.

Having a policy that enforces secure baseline configuration standards is a must, and compliance should be checked from time to time to ensure no issues have been introduced.  Permissions should also be evaluated on a recurring basis to avoid accidental exposure to unauthorized users.  This holds true whether you're talking about an S3 bucket, an Apache configuration file, or operating system policies.

Amazon offers support guidance on securing S3 buckets.  Some of the recommendations include encrypting data in case it falls into unexpected hands, enabling logging, and restricting access.  Similarly, Microsoft has a white paper on securing Azure containers.

Following this advice will hopefully prevent you from having that awkward, "Uh hum.. your.. data is showing" conversation with a stranger.  Thanks to Hank, the authors of the awesome tools I referenced, and my dad's guitar hobby for the inspiration for this blog!  Is anyone still here? 😜

- Curtis Brazzell