See Amazon Macie in action. It's a machine learning service that finds potentially sensitive data in selected S3 buckets. Macie also analyzes CloudTrail log activity and presents dashboards of highlighted activity on your AWS account.
- [Instructor] As part of planning encryption, you need to find the data that needs to be encrypted. Amazon has a new service that can help. It's called Macie. It's a security service that uses machine learning to automatically discover, classify, and protect sensitive data in AWS, data like PII, or personally identifiable information, or intellectual property. And you get dashboards and alerts. It's a pretty cool service. It's new, and as of this recording it's only available in the US East and US West regions. So I'm going to show it to you in the subsequent movie.
So to set up, you want to follow the setup steps here. In particular, you need to set up the appropriate IAM role. And they have some CloudFormation templates. I'll scroll down. And you can click there and then that will launch the template and just basically click start. And then wait for it to complete. And then you'll see over in IAM under Roles, you have two roles. You have a customer service role and you have a customer setup role. Then you go into the main console and you bring up Macie and then you have to say which accounts you want to have which objects monitored.
So I'm going to select my account. And here I have S3. And inside of here, I've already set up a bucket. And there is a charge for this. So this talks about that. And I had run through this one time. So you can see that this is selected. So I'm going to deselect it. Typically it would look like this when you first start. And I set up a bucket with some potentially sensitive information. And I will provide you with those files or you can make your own to try this out.
And I'm going to click that to say scan this bucket, basically. I'm going to scroll down and say review and save. And then it's telling me I'm logging everything in S3 so I need to turn that on. And then click save. And now this is going to be monitored. Now while the monitoring is occurring, I'm just going to show you some of the settings and what Macie monitors for by default. So if I click in the settings, there are four categories. And again, we're looking for sensitive information in S3 buckets that we might want to encrypt.
So we have content types, file extensions, themes and regex. Probably the easiest thing to understand is file extensions. So let's look there. Literally, these are just files of certain types. And some of the types of information that Macie is warning that you may want to protect are things like source code, certificates, and so forth. So you can see that we have a classification, and then we have a risk value. Now this is going to be important in terms of how we query because the risk value goes from 1 to 10.
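To make the idea of a classification plus a 1-to-10 risk value concrete, here is a minimal Python sketch. It is not Macie's implementation or API. The rule table, the risk numbers, and the function names are all assumptions made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical extension rules: extension -> (classification, risk 1-10).
# These entries are illustrative only, not Macie's actual defaults.
EXTENSION_RULES = {
    ".py": ("Source code", 1),
    ".json": ("Data file", 1),
    ".pem": ("Certificate or key", 8),
}


def classify_by_extension(key):
    """Return (classification, risk) for an S3 object key, or None if no rule applies."""
    suffix = PurePosixPath(key).suffix.lower()
    return EXTENSION_RULES.get(suffix)


def keys_at_or_above(keys, min_risk):
    """Keep only the keys whose extension rule has a risk of at least min_risk."""
    hits = []
    for key in keys:
        rule = classify_by_extension(key)
        if rule is not None and rule[1] >= min_risk:
            hits.append((key, rule[0], rule[1]))
    return hits
```

Calling `keys_at_or_above(["a.py", "certs/server.pem"], 5)` on this made-up table would keep only the `.pem` key, which mirrors how a higher risk threshold narrows a query.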
And when you query for information that could be sensitive, you set the risk value. And notice over here, you have it enabled. This is configurable, you can turn it on and off. You get a set of settings by default. And this is how you would edit it. So the first settings are file extensions. The next thing Macie looks at are pattern matches for regex. Those have categories around personally identifiable information. And a lot of it is US at this time. So things like California driver's licenses, phone numbers, zip codes, and then hacker-types of patterns such as Metasploit modules.
And you can see, we have a minimum number of matches and then we have a risk. So just to give you a couple of high-risk configurations, we have Cisco router configuration, things that Amazon has seen in people's S3 buckets. Huawei config file, and if we scroll down, my personal favorite is the AWS secret key. We all know we shouldn't be putting that unencrypted in buckets. And so Amazon assigns that a risk of 10. So now we're going to go back and we'll look at the other classifications, which are around content types.
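The regex settings described above pair a pattern with a minimum number of matches and a risk. A rough Python sketch of that shape follows. The pattern shown is an assumption written for this example (a config-style `aws_secret_access_key = ...` assignment followed by 40 key-like characters), not the expression Macie uses, and `RegexRule` and `rule_fires` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RegexRule:
    name: str
    pattern: re.Pattern
    min_matches: int  # how many matches are needed before the rule fires
    risk: int         # 1 (low) to 10 (high)


# Hypothetical rule, for illustration only.
SECRET_KEY_RULE = RegexRule(
    name="AWS secret key (illustrative)",
    pattern=re.compile(r"aws_secret_access_key\s*=\s*[A-Za-z0-9/+=]{40}"),
    min_matches=1,
    risk=10,
)


def rule_fires(rule, text):
    """True when the text contains at least rule.min_matches matches of the pattern."""
    return len(rule.pattern.findall(text)) >= rule.min_matches
```

A rule with `min_matches=1` fires on a single hit, which fits something as sensitive as a secret key. A noisier pattern such as zip codes would plausibly want a higher minimum.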
And so these are types of files. And again, if I look at the risk to see what is most risky, it's kind of interesting. So we have application pgp, we have pgp keys, so encryption keys. And if we scroll down here, we can see exchange server certificates, so security keys, basically, checking those in unencrypted. And we have a number of office documents and work business documents here. So one more category and that would be themes.
So in terms of themes, I think this is the most interesting, actually. You have to have a minimum number of keyword combinations. And we have things like banking, corporate proposals. So financial information, basically, MasterCard credit card keywords. If we scroll down here, we can see in this case this one, network scanner keywords is actually disabled by default. And we can look and see, we've got interesting stuff like password keywords. So if we click in here, it's looking for two keyword combinations of common passwords.
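One way to picture a theme is as a keyword list plus a minimum number of keyword combinations. The Python sketch below reads "combinations" as "distinct keywords found in the text", which is an assumption about the setting, not documented Macie behavior. The keyword list and the function name are also made up.

```python
# Hypothetical keyword list for a password-style theme (illustrative only).
PASSWORD_KEYWORDS = ["password", "passwd", "letmein", "qwerty"]


def theme_matches(text, keywords, min_combinations):
    """True if at least min_combinations distinct keywords appear in the text."""
    lowered = text.lower()
    found = {kw for kw in keywords if kw.lower() in lowered}
    return len(found) >= min_combinations
```

With a minimum of two, a file that mentions only `password` would not match, while one that contains both `password` and `qwerty` would.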
So, interesting set of things that it's looking for. In addition to scanning your designated S3 buckets, Macie also will show you information about CloudTrail events. And CloudTrail events of course can be sensitive because they are a record of API calls against services. So we see both events and errors. Now how do you get the Macie information? There are two ways. You can go to the alerts, which gives you what Amazon tells you are suspicious activities.
You can look at the alert information. But what I think is even more interesting is the dashboard. In the dashboard, you have a rollup of critical assets and so forth. But the really sort of fun thing to me is the queries you can do here. So you have S3 and we can see that we found some problems in this data, based on the criteria in the settings. We have some JSON checked in, we have some Python code and this will be unencrypted. And we can pass our mouse over it and we can see that we have one document in our bucket.
And by the way, our bucket looks like this. So we have a JSON file there and a Python file there. And Macie found that. And we can then do different types of queries here. We can go to the S3 objects. And we can see that we've got some financial keywords, as well. So if we want to see what that is, we can click on it and we can see that we have a file, one result matched, and this is in the research section in S3, and if we scroll down, we can see information about this.
We can see the account ID, the bucket owner, the name, and this is a match to the financial keywords. So the object theme is financial keywords. If we open this up, let's scroll down so you can see it, we can see that we have a matching value there. And then if we click into this, we can see even more information about this file. And we can see the modification, the risk, whether it's PII, and here we matched on these keywords. Now I just copied all the keywords into the file, to be honest with you, to make it really simple.
But it is interesting the types of things that are matching in Macie. Now just to go back to the dashboard and to show what else we can do here. We can look at S3 objects by PII. And we've got some potential email addresses. So if we click on that, we can see, it originally found 'em. But I guess it hasn't found them yet. Once it finds them, I can save this as an alert, I can mark it as a favorite and I can look at my favorite query. So the research and the dashboard kind of go hand-in-hand.
In addition to PII, I can look at high-risk CloudTrail events. And this is peripheral to encryption but it's still an interesting capability. You can see that we have these types of events that people typically want to have logging around. So let's just pick one around encryption, CreateKey. And then we can look at the CreateKey event and we can drill down into more information about it and see who did that and when that occurred. So it's a visualization of logging.
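The same "who did that and when" question can be asked of a raw CloudTrail log file, which is a JSON document with a top-level `Records` list. The Python sketch below is a minimal illustration of that drill-down, not how Macie builds its dashboard. The sample log, the user name, and the timestamps are invented, and `find_events` is a hypothetical helper.

```python
import json


def find_events(log_text, event_name):
    """Return (eventTime, caller ARN) for each record with the given eventName."""
    records = json.loads(log_text).get("Records", [])
    return [
        (r.get("eventTime"), r.get("userIdentity", {}).get("arn"))
        for r in records
        if r.get("eventName") == event_name
    ]


# Made-up sample in the shape of a CloudTrail log file (illustrative data only).
SAMPLE_LOG = json.dumps({
    "Records": [
        {
            "eventTime": "2017-10-01T12:00:00Z",
            "eventSource": "kms.amazonaws.com",
            "eventName": "CreateKey",
            "userIdentity": {"type": "IAMUser",
                             "arn": "arn:aws:iam::111122223333:user/alice"},
        },
        {
            "eventTime": "2017-10-01T12:05:00Z",
            "eventSource": "s3.amazonaws.com",
            "eventName": "PutObject",
            "userIdentity": {"type": "IAMUser",
                             "arn": "arn:aws:iam::111122223333:user/alice"},
        },
    ]
})

for when, who in find_events(SAMPLE_LOG, "CreateKey"):
    print(when, who)
```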
So this is a really interesting service that is very new from the Amazon security team, and I think it has great relevance around deciding where to encrypt in S3, and then it does have the CloudTrail stuff in addition to this. Just to be complete, we have high-risk CloudTrail events. We have activity locations. We have CloudTrail events, activity ISPs and CloudTrail user identity types. So it's a great set of services to help you when you're planning your encryption strategy for your S3 buckets.
- Core AWS security design concepts
- Designing using a data flow diagram
- Using negative use cases
- Working with IAM user and role objects
- Design concepts for encryption
- Design encryption with AWS Key Management Service
- Third-party data security tools
- Designing for disaster recovery services