Learn how to define hybrid cloud storage, describe the architecture of an AWS Storage Gateway implementation, and identify how Storage Gateway can be used for both data migration and hybrid storage.
- [Instructor] We've discussed S3's incredible durability and availability. Unfortunately, it can be a bit challenging to integrate with other systems, since it doesn't present a traditional file system interface. By default, you must interact with it via RESTful HTTP calls, AWS CLI commands, or the AWS web console. But what if there were a way to mount S3 as a file system? That's where AWS Storage Gateway comes in. Storage Gateway is a software appliance you can place between your data stores and S3.
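To make that concrete, here's what typical S3 interaction looks like without a gateway: every operation is an API call, not a file system operation. A minimal sketch using the AWS CLI (the bucket name is hypothetical, and the commands assume you have AWS credentials configured):

```shell
# Hypothetical bucket name; requires configured AWS credentials to run.
aws s3 mb s3://example-reports-bucket               # create a bucket
aws s3 cp ./report.csv s3://example-reports-bucket/ # upload an object
aws s3 ls s3://example-reports-bucket/              # list objects
```

None of these commands let an unmodified application simply open a path like /mnt/reports/report.csv; that gap is what Storage Gateway fills.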
Storage Gateway acts as a go-between, presenting a file system front end to S3 and Glacier storage. It can be configured to provide file, volume, or tape interfaces. That's where the Glacier aspect comes in, of course: tape storage is often used for large quantities of infrequently-accessed data. Storage Gateway is smart, and it will cache hot data locally so you don't have to go all the way to S3 each time you access frequently-used data. Here's where things get interesting. It can be deployed either to EC2 or to VMware.
Deploying to VMware means that Storage Gateway can act as a bridge between your existing local data center and the cloud. In a hybrid cloud scenario, you deploy your gateway as a VMware appliance in your local data center. Servers in your data center can now mount a file system on that gateway and seamlessly gain access to cloud storage. You can persist in this mode and maintain a true hybrid cloud, or you can use the gateway as a bridge to AWS, leveraging it as a way to get data to S3 in a one-time migration.
Storage Gateway provides a few different ways to present a file system. File gateway allows the client to mount via NFS version 3 or 4.1. Volume gateway uses an iSCSI interface. Tape gateway presents a virtual tape library, or VTL, interface. Let's dig into the first mode. As a file gateway, AWS Storage Gateway can present an NFS file system that is backed remotely by an S3 bucket. This file interface can be mounted by any instance that supports NFS version 3 or 4.1.
This can occur entirely in AWS, or the gateway can be on premises in VMware, making the file gateway a local bridge to cloud storage. Unlike mounting EFS from outside AWS, this model does not require the use of AWS Direct Connect. Within a single gateway instance, you create one or more file shares. These map to individual buckets on the back end and will be your actual NFS mount points. This means that you can potentially serve many different applications and use cases with a single file gateway.
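As a sketch of what this looks like from a client in your data center, here's a mount of a file share. The gateway address and share name are hypothetical; the nolock and hard options follow AWS's documented recommendation for file gateway NFS mounts:

```shell
# Hypothetical gateway IP and share name (file shares export under the bucket name).
sudo mkdir -p /mnt/s3-share
sudo mount -t nfs -o nolock,hard 203.0.113.10:/example-reports-bucket /mnt/s3-share
```

From here, ordinary file operations under /mnt/s3-share are translated by the gateway into object operations on the backing S3 bucket.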
You might be wondering how file gateway handles NFS metadata like file permissions. Permission bits and ownership attributes are stored in S3 object metadata. To provide optimal performance, the file gateway pays attention to what data is used most often and keeps it in a local cache, so hot content can be read and written quickly without incurring network latency. Files are synced to S3 behind the scenes, so you get your data without lag. File gateway does come with a few caveats. First of all, AWS makes clear that although you are technically able to access the back end S3 directly, you should avoid doing so.
The behavior of file gateway is not well-defined when you alter the back end S3 storage and may result in errors or lost data. Likewise, you don't want to set any data management rules on the S3 back end either. Any S3 object that moves to Glacier will result in an error for the gateway. Other things to consider: AWS File Gateway cannot support symbolic or hard links, so if those features are critical to your use case, you'll need to use another solution. Also, since S3 does not support renaming objects directly, file gateway implements the operation by copying the affected file objects and deleting the old ones.
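Here's a local sketch of that copy-then-delete behavior, using hypothetical file names. The gateway performs the equivalent of this loop per object against S3 (via server-side copies, so the data never leaves AWS), rather than a single atomic rename of the whole folder:

```shell
# Simulate file gateway's rename: one copy plus one delete per object.
mkdir -p reports && touch reports/q1.csv reports/q2.csv reports/q3.csv

mkdir -p reports-archive
for f in reports/*; do
  cp "$f" "reports-archive/$(basename "$f")"  # copy under the new key...
  rm "$f"                                     # ...then delete the old key
done
rmdir reports
```

With thousands of objects, that per-file loop is why renaming a large folder tree on a file gateway is not instantaneous.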
This means that your gateway won't have to send any additional data over the network, but it also means that mass renames, such as renaming a folder containing many files and sub-folders, may not be an instantaneous task. Finally, although the NFS protocol will allow connections from more than one client, AWS again recommends that you do not do this, and instead maintain a one-to-one connection policy between clients and file shares. This will help avoid possible data corruption. If what you need is an NFS file system that supports multiple clients as a first-class use case, you're going to want to look at EFS.
However, if your goal is to take advantage of S3's highly-durable, relatively cheap storage, while avoiding the complexities of interacting directly with the S3 API, AWS Storage Gateway is a great way to do just that.
Join AWS architect Brandon Rich and learn how to configure object storage solutions and lifecycle management in Simple Storage Service (S3), a web service offered by AWS, and migrate, back up, and replicate relational data in RDS. Find out how to leverage flexible network storage with Elastic File System (EFS), and use the new AWS Glue service to move and transform data. Plus, learn how Snowball can help you transfer truckloads of data in and out of the cloud.
- What is data management?
- AWS S3 basics
- S3 bucket creation
- S3 upload and logging
- S3 event notifications
- S3 data lifecycle configuration
- Working with Amazon Elastic Block Store volumes
- Creating and mounting an EFS
- Creating an AWS RDS instance
- RDS backup and recovery
- Moving data with AWS Database Migration Service
- Moving data with Data Pipeline and Glue