Join Brian Eiler for an in-depth discussion in this video Other storage and data services, part of Amazon Web Services: Implementing and Troubleshooting IaaS Products.
- [Instructor] Aside from all the different types of storage that we've talked about, there are still a few more types, or data services, that we can talk about. One type of data storage is actually a storage gateway, and it's designed to facilitate the migration of data from your on-premises applications up to the cloud. Now, there are several different use cases for this. It could be archiving, it could be backups, it could even be things like cloud bursting, where you might have applications that traditionally run within your on-premises data center but every now and then you need more capacity than what your data center can hold.
So, you'll run your compute workloads in the cloud, and this particular device is going to give you the ability to synchronize the data between your on-premises data center and S3 storage. The storage gateway is actually an appliance that you'll install at your on-premises data center. It's a virtual appliance, and it gives you a point of contact where your individual workstations or servers can access the storage gateway using protocols like NFS or iSCSI.
It's going to automatically buffer the data that's used on premises and then it's going to efficiently move it in and out of the cloud. There are several types of these storage gateways. One of them is a tape gateway. It effectively looks like a virtual tape library. It's great for solutions that natively want to talk to a VTL in order to back up your data. The storage gateway will pretend to be a tape library so that you can write your data to it, and then it's synchronized up to the cloud for long-term retention.
Another option would be to set it up as a file gateway. This is going to use NFS, and the data is going to be synchronized up to S3. The third option is a block storage-based solution, and this is the volume gateway. The storage is going to be mounted over iSCSI on premises, and the servers will then send and receive data back and forth with the storage appliance, but the storage gateway is going to synchronize that data up to the cloud. There are two subtypes of these volume gateways, and the difference has to do with where the data is going to reside.
The top one, cached volumes, says that the primary copy remains in S3 and there's going to be a cache of data on-premises. This is helpful if you have a scenario where most of the reading and writing of the data is done in the cloud but you'd like to keep the frequently accessed data available locally, either because you've got analytics running at your office or other components where you need to have access to that data in real time. With the bottom option, where it says stored volumes, the primary copy remains on-premises and periodically, through asynchronous replication, the storage gateway pushes that data up to S3.
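The difference between the two volume types can be sketched as a toy model. This is only an illustration under simplifying assumptions, not how the storage gateway is implemented: the class names, the dictionary standing in for S3, the least-recently-used eviction, and the immediate write to S3 in the cached case are all hypothetical choices made for the example.

```python
from collections import OrderedDict


class CachedVolume:
    """Toy model: S3 holds the primary copy; a small local cache holds recent blocks."""

    def __init__(self, cache_blocks):
        self.s3 = {}                    # stands in for the primary copy in S3
        self.cache = OrderedDict()      # on-premises cache, least recently used first
        self.cache_blocks = cache_blocks

    def _remember(self, block, data):
        self.cache[block] = data
        self.cache.move_to_end(block)
        while len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block

    def write(self, block, data):
        self.s3[block] = data           # simplification: primary copy updated at once
        self._remember(block, data)

    def read(self, block):
        if block in self.cache:
            self.cache.move_to_end(block)
            return self.cache[block], "local cache"
        data = self.s3[block]           # cache miss: fetch from the primary copy
        self._remember(block, data)
        return data, "S3"


class StoredVolume:
    """Toy model: the primary copy is on premises; S3 receives asynchronous copies."""

    def __init__(self):
        self.local = {}                 # primary copy on premises
        self.s3 = {}                    # stands in for the copy in S3
        self.pending = set()            # blocks written since the last replication

    def write(self, block, data):
        self.local[block] = data
        self.pending.add(block)

    def read(self, block):
        return self.local[block], "local disk"

    def replicate(self):
        """Push every pending block to S3 and report how many were sent."""
        for block in self.pending:
            self.s3[block] = self.local[block]
        sent = len(self.pending)
        self.pending.clear()
        return sent


cached = CachedVolume(cache_blocks=2)
for i in range(3):
    cached.write(i, f"data-{i}")
print(cached.read(0))  # block 0 was evicted from the cache, so it comes from S3
print(cached.read(0))  # the second read is served from the local cache

stored = StoredVolume()
stored.write(0, "data-0")
print(stored.s3)           # empty until replication runs
print(stored.replicate())  # number of blocks pushed to S3
```

In this model, the cached volume can only serve a read locally when the block happens to be in its cache, while the stored volume serves every read from local disk and treats S3 purely as a secondary copy.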
Stored volumes are a better fit if you happen to be using the gateway for more of a backup approach, where the application servers are going to be talking directly to the storage gateway and it, in turn, just pushes a copy up to the cloud for safekeeping. Let's say that you've got an incredibly large data center with so much data that pushing it up to the cloud just seems like a formidable challenge, something that's going to take an incredibly long amount of time. One option that's available within AWS is called Snowball.
It's a service that allows you to do an offline transfer of up to petabytes' worth of data into the Amazon cloud. Each one of these devices holds up to 100 terabytes. The way you get it is by creating a job in the AWS console. AWS will then mail you the physical device. You'll plug it in at your data center, transfer the files that you would like to have up in the cloud onto that device, and ship it back to AWS, and they will push the data into S3.
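To see why an offline transfer can make sense, here is a rough back-of-the-envelope sketch. The 100-terabyte device capacity comes from the discussion above; the 250 TB data set, the 1 Gbps link, and the 80% sustained utilization are assumptions picked only for illustration, and the function names are hypothetical.

```python
import math

TB = 10**12  # decimal terabyte, in bytes


def network_transfer_days(data_bytes, link_mbps, utilization=0.8):
    """Days needed to push data over a link at the given sustained utilization."""
    usable_bits_per_second = link_mbps * 1_000_000 * utilization
    seconds = data_bytes * 8 / usable_bits_per_second
    return seconds / 86_400  # seconds per day


def devices_needed(data_bytes, device_capacity_bytes=100 * TB):
    """Number of devices required if each holds up to device_capacity_bytes."""
    return math.ceil(data_bytes / device_capacity_bytes)


data = 250 * TB  # assumed size of the data set
print(round(network_transfer_days(data, link_mbps=1000), 1))  # 28.9 days
print(devices_needed(data))                                   # 3 devices
```

Under these assumptions the upload would keep a 1 Gbps link busy for about four weeks, while the same data fits on three devices. Shipping and copy time are not modeled here.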
While it's on the device, all the data's encrypted, and as soon as they're done pushing the data into S3, all of that data is erased from the hardware. Now, this solution can also be used to export data from S3 back to your on-premises location. It's fast, it's cheap, and it's also easy to use. Let's say that you've got more than 100 terabytes of data. Let's say we're talking about the entire data center. For this, we're going to be looking at something that can move data in, well, a tractor trailer.
It is actually a semi-truck trailer that's going to be dropped off at your facility. It's 45 feet long and it can hold up to 100 petabytes of data. So, each one of those trailers can handle a very large quantity of data, and just like with Snowball, all of the data in that container is going to be encrypted while it's being moved over to Amazon. Now, from a security point of view, one of the concerns when you're moving this much data is, wow, what happens if somebody were to hijack it? While they're performing the transfer, there's dedicated security staff, GPS tracking, alarms, and video surveillance, and if you really need it, you can even have a security vehicle escort for the data as it makes its journey to the data center.
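For a sense of scale, the two capacities mentioned above (100 terabytes per Snowball device, 100 petabytes per trailer) can be compared with a little arithmetic. This is a minimal sketch: the example data sizes are made up, the function name is hypothetical, and decimal units (1 PB = 1,000 TB) are assumed.

```python
import math

PB_IN_TB = 1000                 # decimal units assumed: 1 PB = 1,000 TB

SNOWBALL_TB = 100               # per-device capacity, as stated above
TRAILER_TB = 100 * PB_IN_TB     # per-trailer capacity, as stated above


def units_needed(data_tb, unit_capacity_tb):
    """Number of devices or trailers needed to hold data_tb terabytes."""
    return math.ceil(data_tb / unit_capacity_tb)


# Hypothetical data sizes, chosen only to show the difference in scale.
for label, size_tb in [("500 TB", 500),
                       ("20 PB", 20 * PB_IN_TB),
                       ("150 PB", 150 * PB_IN_TB)]:
    print(label,
          "Snowball devices:", units_needed(size_tb, SNOWBALL_TB),
          "trailers:", units_needed(size_tb, TRAILER_TB))
```

One full trailer corresponds to 1,000 fully loaded 100-terabyte devices in this arithmetic, which is why the trailer is positioned for moving an entire data center.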
- AWS global infrastructure
- VPC use cases
- EC2 instance types
- EC2 purchasing and troubleshooting
- Creating AMIs
- Using AWS storage solutions such as EBS, EFS, S3, and Glacier
- Versioning and cross-region replication on S3