Microsoft Sentinel – How to Leverage built-in Amazon Web Services S3 Data Connector

In late November 2021, Microsoft published something many organizations using Microsoft Sentinel in a multi-cloud architecture had been waiting for: a built-in AWS S3 data connector for Microsoft Sentinel. At the time of writing, there are two versions of the AWS data connector:

Legacy connector for CloudTrail (CT) management and data logs
New data connector, Amazon S3, which can ingest the following logs from an S3 bucket:
- Amazon Virtual Private Cloud (VPC) Flow Logs
- Amazon GuardDuty findings
- AWS CloudTrail management and data logs

This blog focuses on the new S3 version, which supports a wider range of log types. How to establish AWS integration with the legacy model is covered in my earlier blog, which describes the different options for data ingestion from AWS, including integration with Microsoft Defender for Cloud Apps (MDCA).

Kudos to Panu Mälkki, who helped me set up the AWS configurations.

Architecture

In the legacy connector, Microsoft Sentinel relies on AWS CloudTrail and the data it provides (management & data logs). The new S3 connector is built differently. Naturally, the same relationship between Amazon Web Services and Microsoft Sentinel still needs to be established (role & permissions).

The main benefit of using the AWS S3 bucket for the integration is that you need to establish integration from Sentinel to one (1) AWS account and S3 bucket instead of all AWS accounts (CloudTrail scenario).

Side note: In the example described in the next chapter, I have used three (3) different S3 buckets, one for each data type.

How it works (source: docs.microsoft.com):

AWS services are configured to send logs to S3 (Simple Storage Service) storage buckets.
- Available services (at the time of writing): VPC Flow logs, GuardDuty & CloudTrail
The S3 bucket sends notification messages to the SQS (Simple Queue Service) message queue whenever it receives new logs.
The Microsoft Sentinel AWS S3 connector polls the SQS queue at regular, frequent intervals. If there is a message in the queue, it will contain the path to the log files.
The connector reads the message with the path, then fetches the files from the S3 bucket.
To connect to the SQS queue and the S3 bucket, Microsoft Sentinel uses AWS credentials and connection information embedded in the AWS S3 connector’s configuration.
- The AWS credentials are configured with a role and a permissions policy giving them access to those resources.
- Similarly, the Microsoft Sentinel workspace ID is embedded in the AWS configuration, so there is in effect two-way authentication.
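To make the notification step above more concrete, here is a minimal Python sketch of how a consumer could pull the log file path out of an S3 event notification delivered through SQS. It is only an illustration and not the connector's actual implementation. The function name `extract_log_paths`, the bucket name, and the object key are hypothetical, and the sketch assumes the S3 bucket notifies the SQS queue directly, so the message body is the S3 event JSON itself.

```python
import json
from urllib.parse import unquote_plus


def extract_log_paths(sqs_message_body):
    """Return (bucket, key) pairs for new objects named in an S3 event notification."""
    event = json.loads(sqs_message_body)
    paths = []
    # The s3:TestEvent message sent when a notification is first configured has no "Records" key.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        paths.append((bucket, key))
    return paths


# Hypothetical message body, trimmed to the fields used above.
sample_body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-guardduty-logs"},
            "object": {"key": "AWSLogs/111122223333/GuardDuty/eu-west-1/finding+batch.jsonl.gz"},
        },
    }]
})

for bucket, key in extract_log_paths(sample_body):
    print(f"s3://{bucket}/{key}")
```

The message carries only the location of the log file, not the log content, which is why the connector needs read access to both the queue and the bucket.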

Configuration

In a nutshell, the following tasks need to be done to establish the connection between Sentinel and AWS:

In the AWS environment, configure the services (VPC Flow Logs, CloudTrail & GuardDuty findings) whose logs you would like to have in Microsoft Sentinel to send them to the S3 bucket.
Define the necessary assumed roles & permissions so that Sentinel can read the needed audit data.
On the Microsoft Sentinel side, configure the Amazon Web Services S3 connector (Role ARN, SQS URL, and destination table) to ingest the data.

In AWS environment:
- Configure your AWS service(s) to send logs to an S3 bucket.
- Create a Simple Queue Service (SQS) queue to provide notification.
- Create an assumed role to grant permissions to your Microsoft Sentinel account (external ID) to access your AWS resources.
- Attach the appropriate IAM permissions policies to grant Microsoft Sentinel access to the appropriate resources (S3 bucket, SQS).
In Microsoft Sentinel:
- Enable and configure the AWS S3 Connector in the Microsoft Sentinel portal.
- Verify the data flow
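As a rough illustration of the assumed-role step, the following Python sketch builds an IAM trust policy that lets an external AWS account assume the role only when it presents the Sentinel workspace ID as the external ID. The function name `build_trust_policy` is hypothetical, and both the account ID and the workspace ID below are placeholders. Take the real values from the connector page in your own Sentinel instance.

```python
import json


def build_trust_policy(sentinel_account_id, workspace_id):
    """Build a trust policy document for the role that Sentinel assumes."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The external AWS account that is allowed to assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{sentinel_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The role can only be assumed when the workspace ID is passed as the external ID.
                "Condition": {"StringEquals": {"sts:ExternalId": workspace_id}},
            }
        ],
    }


# Placeholder values for illustration only.
policy = build_trust_policy(
    sentinel_account_id="000000000000",
    workspace_id="00000000-0000-0000-0000-000000000000",
)
print(json.dumps(policy, indent=2))
```

The permissions policies for the S3 bucket and SQS queue are attached to this role separately. The trust policy only controls who may assume it.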

The integration takes place through S3 bucket(s) and SQS queues. This example scenario uses three (3) different S3 buckets (one for each log type), but you could also use only one (1) S3 bucket. Both options have pros & cons; in this specific case, there were management reasons to go with separate buckets.

When more services are added (and are supported by the S3 data connector), configuration is still needed on both the Sentinel and AWS sides, no matter which option you have selected.

With three (3) S3 buckets, the log ingestion flow is as follows:

The AWS service exports logs to an S3 bucket.
The S3 bucket is configured with an event notification that creates a new entry in an SQS queue.
Sentinel has its own AWS account, from which it assumes a role in your environment. This role needs read/write access to the SQS queues, and the S3 buckets need bucket policies that grant permission to S3 and KMS.
Using the IAM role, Sentinel polls the SQS queue and pulls the logs from the S3 buckets.
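The event notification in the flow above can be sketched in Python as the configuration document that tells a bucket to notify a queue about new objects under one prefix. This is a sketch under assumptions: the function name `build_queue_notification`, the queue ARN, and the prefix are hypothetical. The dictionary has the shape that boto3's `put_bucket_notification_configuration` accepts as `NotificationConfiguration`, but the call itself is left out so the example runs without AWS access.

```python
import json


def build_queue_notification(queue_arn, prefix):
    """Build an S3 notification configuration that targets one SQS queue."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                # Notify on every kind of object creation (Put, Post, Copy, multipart upload).
                "Events": ["s3:ObjectCreated:*"],
                # Limit notifications to the path that holds one log type.
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


# Hypothetical queue ARN and prefix for the GuardDuty bucket.
notification = build_queue_notification(
    queue_arn="arn:aws:sqs:eu-west-1:111122223333:example-guardduty-queue",
    prefix="AWSLogs/111122223333/GuardDuty/",
)
print(json.dumps(notification, indent=2))
```

With three buckets, each bucket would get its own configuration pointing at its own queue.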

Detailed information is available at docs.microsoft.com, which provides instructions for both manual and automated setup options.

When all the prerequisites are configured on the AWS side, add the “Role ARN”, “SQS URL” & destination table to the Sentinel connector to finalize the configuration.

Once the configuration is in place, expect AWS data to appear in Microsoft Sentinel after approximately 15-30 minutes.

Analytics Rules

At the time of writing, the following analytics rules that use AWS audit data are available out of the box. The data tables that contain the AWS data are:

AWS CloudTrail – AWSCloudTrail
AWS GuardDuty findings – AWSGuardDuty
AWS VPC Flow Logs – AWSVPCFlow

Built-in rules:

AWS Guard Duty Findings

Earlier, I created three (3) separate analytics rules for AWS GuardDuty findings, one for each severity (low, medium, high). Now, there is a new template available in Microsoft Sentinel that achieves the same end result with only one (1) rule. This template has already been pushed to Sentinel instances and is also available in the Microsoft Sentinel GitHub repo.

AWSGuardDuty 
| extend tokens = split(ActivityType,":") 
| extend ThreatPurpose = tokens[0], tokens= split(tokens[1],"/") 
| extend ResourceTypeAffected = tokens[0], ThreatFamilyName= tokens[1] 
| extend UniqueFindingId = Id | extend AWSAcoundId = AccountId 
| project-away tokens,ActivityType, Id, AccountId 
| project-away TimeGenerated, TenantId, SchemaVersion, Region, Partition 
| extend Severity= iff(Severity between (7.0..8.9),"High",iff(Severity between (4.0..6.9), "Medium", iff(Severity between (1.0..3.9),"Low","Unknown")))
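For readers less familiar with KQL, the two key steps of the template above (splitting the finding type and bucketing the numeric severity) can be mirrored in a small Python sketch. The function names are hypothetical and the sample finding type is used only for illustration. The severity ranges are copied from the template, so values that fall outside them map to "Unknown".

```python
def parse_activity_type(activity_type):
    """Split a finding type such as 'Recon:EC2/PortProbeUnprotectedPort' into its parts."""
    tokens = activity_type.split(":")
    rest = tokens[1].split("/") if len(tokens) > 1 else []
    return {
        "ThreatPurpose": tokens[0],
        "ResourceTypeAffected": rest[0] if len(rest) > 0 else None,
        "ThreatFamilyName": rest[1] if len(rest) > 1 else None,
    }


def severity_label(severity):
    """Map a numeric severity to a label using the same inclusive ranges as the template."""
    if 7.0 <= severity <= 8.9:
        return "High"
    if 4.0 <= severity <= 6.9:
        return "Medium"
    if 1.0 <= severity <= 3.9:
        return "Low"
    return "Unknown"


parts = parse_activity_type("Recon:EC2/PortProbeUnprotectedPort")
print(parts["ThreatPurpose"], parts["ResourceTypeAffected"], parts["ThreatFamilyName"])
print(severity_label(5.0))
```

Because one rule now covers every severity, the label is computed per finding instead of being fixed by the rule's filter.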

One thing to mention is that the way GuardDuty findings/alerts are handled in Sentinel caused an extra headache for me. The problem was that Sentinel offers little room to configure how alerts are created from events, compared to what is available for incidents. With the typical settings, all GuardDuty findings ended up underneath one (1) incident. By configuring the following settings, incidents were created in a more logical manner (especially with sample findings).

Also, pay attention to entities; adding custom details to the incident helps analysts in the investigation.

Below is my example rule for medium-severity findings. In my case, I created a separate rule for each severity category:

Low 1.0-3.9
Medium 4.0-6.9
High 7.0 and above

AWSGuardDuty
// Medium-severity findings only
| where Severity between (4.0..6.9)
| extend AWSAccountId = AccountId
| extend UniqueFindingId = Id
// Finding type format: ThreatPurpose:ResourceTypeAffected/ThreatFamilyName
| extend tokens = split(ActivityType,":")
| extend ThreatPurpose = tokens[0], tokens= split(tokens[1],"/")
| extend ResourceTypeAffected = tokens[0], ThreatFamilyName= tokens[1]
| extend AWSArn = Arn
| extend ResourceType = ResourceDetails.resourceType
| extend ActionType = parse_json(tostring(ServiceDetails.action)).actionType
// Remote IP details of the API call that triggered the finding
| extend City = parse_json(tostring(parse_json(tostring(parse_json(tostring(parse_json(tostring(ServiceDetails.action)).awsApiCallAction)).remoteIpDetails)).city)).cityName
| extend Country = parse_json(tostring(parse_json(tostring(parse_json(tostring(parse_json(tostring(ServiceDetails.action)).awsApiCallAction)).remoteIpDetails)).country)).countryName
| extend IPAddress = parse_json(tostring(parse_json(tostring(parse_json(tostring(ServiceDetails.action)).awsApiCallAction)).remoteIpDetails)).ipAddressV4
| extend userName = parse_json(tostring(ResourceDetails.accessKeyDetails)).userName
| project TimeGenerated, Severity, AWSAccountId, Region, UniqueFindingId, AWSArn, ResourceType, ActionType, Title, Description, Country, IPAddress, userName

As a result, the GuardDuty findings appear in Microsoft Sentinel with detailed information.

Threat Hunting

To use Microsoft Sentinel threat hunting, you need to ingest raw data into the underlying Azure Log Analytics workspace. The Microsoft community has contributed a wide range of built-in hunting queries to Microsoft Sentinel that can be used to get started.

For AWS, the built-in hunting queries are available for the following data sources: ‘AWSCloudTrail’, ‘AWSS3BucketAPILogsParsed’, ‘AWSBucketAPILogs_CL’. The last two require custom log import through Logstash (+ S3 KQL parser installation).

Caveats

Different types of logs can be stored in the same S3 bucket, but should not be stored in the same path.
Each SQS queue should point to one type of message, so if you want to ingest GuardDuty findings and VPC flow logs, you should set up separate queues for each type.
Similarly, a single SQS queue can serve only one path in an S3 bucket, so if for any reason you are storing logs in multiple paths, each path requires its own dedicated SQS queue.
Once the integration is in place, it pulls only new logs from the S3 bucket by default.
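The queue-related caveats above can be checked before connecting anything to Sentinel. The Python sketch below flags any SQS queue that would serve more than one log type or more than one S3 path. The function name `find_queue_conflicts`, the queue URLs, and the prefixes are all hypothetical. The check only inspects a list you write by hand and does not query AWS.

```python
from collections import defaultdict


def find_queue_conflicts(sources):
    """Return queues that serve more than one (log type, S3 path) combination.

    sources is a list of (queue_url, log_type, s3_prefix) tuples.
    """
    by_queue = defaultdict(set)
    for queue_url, log_type, s3_prefix in sources:
        by_queue[queue_url].add((log_type, s3_prefix))
    # A queue is valid only when it maps to exactly one log type and one path.
    return {queue: sorted(items) for queue, items in by_queue.items() if len(items) > 1}


# Hypothetical planned setup: the second and third entries wrongly share a queue.
planned = [
    ("https://sqs.eu-west-1.amazonaws.com/111122223333/cloudtrail-queue", "CloudTrail", "AWSLogs/ct/"),
    ("https://sqs.eu-west-1.amazonaws.com/111122223333/shared-queue", "GuardDuty", "AWSLogs/gd/"),
    ("https://sqs.eu-west-1.amazonaws.com/111122223333/shared-queue", "VPCFlow", "AWSLogs/vpc/"),
]

for queue, items in find_queue_conflicts(planned).items():
    print(queue, "serves", items)
```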

References

Microsoft Sentinel AWS S3 data connector

AWS detections in the Microsoft Sentinel GitHub repo

2 thoughts on “Microsoft Sentinel – How to Leverage built-in Amazon Web Services S3 Data Connector”

Luke says:

October 7, 2022 at 09:23

Great blog explaining how the new connector works. I have been using the new connector to ingest logs for VPCFlow and GuardDuty. I haven’t yet switched from using the legacy CloudTrail connector to the new S3 connector for CloudTrail.

Do you know if I will get duplicate CloudTrail events when cutting over from legacy to SQS/S3? There are 600,000+ items sitting in the SQS queue right now, and I’m concerned I will get duplicate events when I turn off all the old connections to each account, then enable the new one.
Thanks

1. Sami Lamppu says:
  
  October 10, 2022 at 11:46
  
  Hi Luke,
  First of all, thanks for reading! You’re right, the S3 + SQS connector has “old” data based on the data retention policy on the SQS side. By default, it’s 4 days but it can be configured to something else. You should check this one before doing a cutover.
  
  If you want to have a “clean slate” when doing the cutover you can purge the SQS queue but I personally would like to avoid situations where I will miss events from the data source.
  
  That being said, I would do the following:
  – Verify data retention policy (from CT queue). To the best of my knowledge, all of the queues have their own data retention policy.
  – Verify that existing connector is up to date and you have all the needed data in Sentinel
  – If there are a lot of duplicates, reduce data retention from the default to for example 1 day
  – Do cutover
  – Change data retention on SQS back to default or the custom value you had
  
  If you have more questions, you can reach me by email (samilamppu@hotmail.com) and we can continue the discussion there.

Sam's Corner

Site about cloud security
