Should You Back Up an Ultra-Redundant Object Storage Service? Absolutely!
Amazon Simple Storage Service, commonly known as “Amazon S3” or “S3,” is praised across the industry as one of the most reliable, secure, scalable and highly available public cloud storage services. AWS itself says: “Amazon S3 is designed for 99.999999999% of data durability because it automatically creates and stores copies of all S3 objects across multiple systems. This means your data is available when needed and protected against failures, errors, and threats.”
This all sounds very promising, but what if your archives, your analytical data and the critical data behind your websites and mobile applications all live in S3, and S3 does suffer a failure? What if you cannot reach your data for a few minutes, hours or days? IT managers know there is always a small chance that systems will not work as planned, or that human error will cause the accidental deletion of critical data. Data redundancy and additional security need to be factored in to limit the effect of a failure.
Fortunately, there are several ways to secure and back up S3 buckets before a critical failure occurs. These include versioning, MFA Delete, cross-region replication and syncing the S3 bucket to a local filesystem (for example, an EBS volume attached to an EC2 instance) using the AWS CLI. All of these options are built into AWS, but many users are not aware of them. Let’s find out how these AWS-native solutions can help further protect your cloud storage.
Versioning is a quick and easy way to protect your bucket from overwrites and accidental deletions. With versioning enabled, a bucket can hold multiple versions of the same object, each identified by its own version ID. Every overwrite creates a new version rather than replacing the old one, so you can restore any previous version as necessary. Deletions work similarly: instead of removing the data, S3 inserts a delete marker that becomes the current version of the object, while the earlier versions remain stored. Removing the delete marker restores an accidentally deleted object.
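To make the delete-marker behavior concrete, here is a minimal sketch using the AWS CLI, assuming versioning is already enabled on the bucket. The bucket name, object key, `MARKER_ID` variable and the `run` dry-run helper are illustrative assumptions, not part of any AWS tooling. With `DRY_RUN=1` (the default in this sketch) the commands are only printed.

```shell
#!/bin/sh
# Sketch: inspect the versions of one key and undo an accidental delete.
# BUCKET and KEY are hypothetical names used for illustration only.
set -eu

BUCKET="${BUCKET:-example-backup-bucket}"
KEY="${KEY:-reports/summary.csv}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# List every stored version and delete marker under a key prefix.
list_versions() {
  run aws s3api list-object-versions --bucket "$1" --prefix "$2"
}

# Delete one specific version ID. When that ID belongs to a delete marker,
# the previous version becomes current again. When it belongs to a real
# object version, that version is permanently removed, so check the ID first.
remove_delete_marker() {
  run aws s3api delete-object --bucket "$1" --key "$2" --version-id "$3"
}

list_versions "$BUCKET" "$KEY"

# MARKER_ID must be copied by hand from the "DeleteMarkers" section of the
# listing above; nothing is deleted unless it is set.
if [ -n "${MARKER_ID:-}" ]; then
  remove_delete_marker "$BUCKET" "$KEY" "$MARKER_ID"
fi
```

Run it once to read the listing, then run it again with `MARKER_ID` set to the delete marker's version ID (and `DRY_RUN=0`) to bring the object back.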
Versioning can be enabled on a bucket using the AWS Console or the AWS CLI. Do keep in mind, however, that once versioning is enabled on a bucket, it can only be suspended, not disabled. By suspending versioning, you can keep all your current versions, but S3 just won’t create new versions of any objects.
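As a rough illustration of the CLI route, the sketch below enables versioning and then reads back the bucket's versioning state. The bucket name and the `run` dry-run helper are assumptions made for this example; with `DRY_RUN=1` (the default here) the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: enable (or suspend) versioning on a bucket with the AWS CLI.
# BUCKET is a hypothetical name used for illustration only.
set -eu

BUCKET="${BUCKET:-example-backup-bucket}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = bucket, $2 = Enabled or Suspended.
# A versioned bucket can be suspended but never returned to "unversioned".
set_versioning() {
  run aws s3api put-bucket-versioning \
    --bucket "$1" \
    --versioning-configuration "Status=$2"
}

# Report the current versioning state of the bucket.
show_versioning() {
  run aws s3api get-bucket-versioning --bucket "$1"
}

set_versioning "$BUCKET" Enabled
show_versioning "$BUCKET"
```

Calling `set_versioning "$BUCKET" Suspended` later stops new versions from being created while leaving the existing ones in place.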
Just as online banking encourages account holders to enable two-step verification to limit suspicious activity, S3 has an option to require Multi-Factor Authentication (MFA) before objects are deleted. Known as MFA Delete, this feature offers an additional level of security for S3 buckets. Essentially, MFA Delete requires an authentication code from a software- or hardware-based MFA device before a user can change the bucket's versioning state or permanently delete an object version.
MFA Delete can be enabled only with the root account's credentials, and only through the AWS CLI or API, not through the console or by any other user. MFA Delete is part of a bucket's versioning configuration, so the two work together: versioning keeps earlier versions recoverable after an accidental delete or overwrite, and MFA Delete ensures those versions cannot be permanently removed without the second factor.
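A minimal sketch of enabling MFA Delete from the CLI follows. It assumes the CLI is configured with the root account's credentials; the bucket name, the MFA device ARN, the `MFA_CODE` variable and the `run` dry-run helper are placeholders invented for this example. With `DRY_RUN=1` (the default here) the command is only printed.

```shell
#!/bin/sh
# Sketch: turn on MFA Delete together with versioning.
# Must be run with the root account's credentials; all names are hypothetical.
set -eu

BUCKET="${BUCKET:-example-backup-bucket}"
MFA_SERIAL="${MFA_SERIAL:-arn:aws:iam::123456789012:mfa/root-account-mfa-device}"
MFA_CODE="${MFA_CODE:-000000}"   # placeholder; use the current code from the device
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = bucket, $2 = MFA device serial number or ARN, $3 = current MFA code.
# The --mfa value is a single argument: the serial, a space, then the code.
enable_mfa_delete() {
  run aws s3api put-bucket-versioning \
    --bucket "$1" \
    --versioning-configuration Status=Enabled,MFADelete=Enabled \
    --mfa "$2 $3"
}

enable_mfa_delete "$BUCKET" "$MFA_SERIAL" "$MFA_CODE"
```

Once this is in place, permanently deleting a specific object version also requires passing `--mfa` with a fresh code.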
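We now know what we can do to prevent human errors when administering S3, but how can we make sure that our data is safe if S3 has an internal system failure, or an entire region goes dark? Most S3 storage classes already store objects redundantly across multiple Availability Zones within a region, so the remaining risk is a region-wide problem. For that case, AWS offers Cross-Region Replication (CRR) rules, which copy objects from a bucket in one region to a bucket in a different region. For example, we can use CRR to asynchronously copy a bucket located in the Northern California region to a bucket located in the London region. Note that a replication rule applies to objects written after it is enabled; objects already in the bucket are not copied automatically.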
When enabling CRR, keep in mind that versioning must be enabled on both the source and the destination bucket. Object replicas keep their key names and metadata, and you can choose a different storage class for the replicas when creating a rule. Although AWS designs S3 Standard for 99.99% availability and the chance that an entire region will go down is extremely slim, Cross-Region Replication makes it simple to keep a backup copy of your S3 buckets in another region.
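The sketch below shows one way a replication rule might be set up from the CLI: it writes a replication configuration to a JSON file and applies it to the source bucket. It assumes both buckets already exist with versioning enabled and that an IAM role S3 can assume for replication already exists. The bucket names, role ARN, rule ID, file path and `run` dry-run helper are all hypothetical. With `DRY_RUN=1` (the default here) the file is written but the AWS command is only printed.

```shell
#!/bin/sh
# Sketch: replicate every new object from SRC_BUCKET to DEST_BUCKET in
# another region, storing replicas in a cheaper storage class.
# All names and ARNs below are hypothetical.
set -eu

SRC_BUCKET="${SRC_BUCKET:-example-source-bucket}"
DEST_BUCKET="${DEST_BUCKET:-example-replica-bucket}"
ROLE_ARN="${ROLE_ARN:-arn:aws:iam::123456789012:role/example-s3-replication-role}"
CONFIG="${CONFIG:-${TMPDIR:-/tmp}/s3-replication.json}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = role ARN, $2 = destination bucket name, $3 = output file.
# An empty prefix filter matches every object in the source bucket.
write_replication_config() {
  cat > "$3" <<EOF
{
  "Role": "$1",
  "Rules": [
    {
      "ID": "replicate-all-objects",
      "Status": "Enabled",
      "Priority": 1,
      "Filter": { "Prefix": "" },
      "DeleteMarkerReplication": { "Status": "Disabled" },
      "Destination": {
        "Bucket": "arn:aws:s3:::$2",
        "StorageClass": "STANDARD_IA"
      }
    }
  ]
}
EOF
}

# $1 = source bucket, $2 = path to the configuration file.
enable_replication() {
  run aws s3api put-bucket-replication \
    --bucket "$1" \
    --replication-configuration "file://$2"
}

write_replication_config "$ROLE_ARN" "$DEST_BUCKET" "$CONFIG"
enable_replication "$SRC_BUCKET" "$CONFIG"
```

Leaving delete marker replication disabled, as in this sketch, means a delete in the source bucket is not mirrored to the replica, which suits a backup use case.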
Sync to EBS using AWS CLI
Another way to back up an S3 bucket is to sync it to an EBS volume attached to an EC2 instance using the AWS CLI. The AWS CLI is AWS’s own command-line interface, which lets a user interact programmatically with most AWS services. Because it is programmatic, it tends to offer more capabilities than the console, though it can also be more technically demanding for the average user.
Amazon makes it easy to sync an S3 bucket to an EC2 instance. The first step is to make sure that we have an EC2 instance running with enough storage provisioned to hold the bucket's contents. Then, we connect to the instance and configure the AWS CLI with credentials (an access key and secret key) that have permission to read the bucket. After configuring the CLI, we create a local directory on the instance named after our S3 bucket. Once the directory exists, we can run a sync command between the S3 bucket and the newly created local directory.
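The steps above can be sketched roughly as follows, assuming the AWS CLI is already configured on the instance and the EBS volume is mounted at `/mnt/s3-backups`. The mount point, bucket name and `run` dry-run helper are assumptions for illustration. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: copy the contents of an S3 bucket to a directory on an EBS volume.
# BUCKET and BACKUP_ROOT are hypothetical; adjust them to your environment.
set -eu

BUCKET="${BUCKET:-example-backup-bucket}"
BACKUP_ROOT="${BACKUP_ROOT:-/mnt/s3-backups}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode; execute it otherwise.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# $1 = bucket name, $2 = root directory on the EBS volume.
# Creates a directory named after the bucket, then downloads new and
# changed objects into it.
sync_bucket() {
  dest="$2/$1"
  run mkdir -p "$dest"
  run aws s3 sync "s3://$1" "$dest"
}

sync_bucket "$BUCKET" "$BACKUP_ROOT"
```

This sketch deliberately omits the `--delete` flag of `aws s3 sync`, so files removed from the bucket stay in the local copy, which is usually what a backup should do.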
The initial sync may take a while, depending on network throughput and the size of the bucket, but later syncs only transfer new and changed objects. We can automate them by writing a simple script and scheduling it with a cron job. The job can run hourly, daily or at whatever interval your organization requires.
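As an illustration of the scheduling step, the sketch below prints crontab entries that run a backup script and append its output to a log file. The script path `/usr/local/bin/s3-backup.sh` and the log path are hypothetical; the script itself would contain the sync command for your bucket.

```shell
#!/bin/sh
# Sketch: build crontab entries for a recurring S3 backup job.
# The script and log paths are hypothetical.
set -eu

# $1 = five-field cron schedule, $2 = script to run, $3 = log file.
# Prints one crontab line that appends stdout and stderr to the log.
cron_entry() {
  printf '%s %s >> %s 2>&1\n' "$1" "$2" "$3"
}

# Daily at 02:00, in the instance's local time.
cron_entry "0 2 * * *" /usr/local/bin/s3-backup.sh /var/log/s3-backup.log

# Hourly, on the hour.
cron_entry "0 * * * *" /usr/local/bin/s3-backup.sh /var/log/s3-backup.log
```

Pick one of the printed lines and paste it into the crontab opened by `crontab -e` on the EC2 instance.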
Better Safe Than Sorry
The chance of an entire AWS region going down is extremely slim, but it doesn’t hurt to add additional layers of security and redundancy to your S3 buckets. We learned that there are multiple ways within AWS to do this. For accidental deletion there is versioning, which allows you to restore previous object versions; coupled with MFA Delete, it also keeps unauthorized users from permanently deleting them. Cross-Region Replication and syncing to an EBS volume help ensure that your data remains safe in the unlikely event that S3 does suffer an outage. Redundancy and security are key when designing an architecture that stays reliable even during an outage.