GitHub
This document explains how to ingest GitHub audit logs to Google Security Operations (Google SecOps) using Amazon S3. The parser extracts data from the message field using various grok patterns, handling both JSON and non-JSON formats. Based on the extracted process_type, it applies specific parsing logic using grok, kv, and other filters to map the raw log data to the Unified Data Model (UDM) schema.
Before you begin
Make sure you have the following prerequisites:
- A Google SecOps instance.
- Privileged access to the GitHub Enterprise Cloud tenant with enterprise owner permissions.
- Privileged access to AWS (S3, IAM).
Collect GitHub Enterprise Cloud prerequisites (Enterprise access)
- Sign in to the GitHub Enterprise Cloud Admin Console.
- Go to Enterprise settings > Settings > Audit log > Log streaming.
- Make sure you have enterprise owner permissions to configure audit log streaming.
- Copy and save in a secure location the following details:
- GitHub Enterprise name
- Organization names under the enterprise
Configure AWS S3 bucket and Identity and Access Management for Google SecOps
- Create an Amazon S3 bucket following this user guide: Creating a bucket.
- Save the bucket Name and Region for future reference (for example, github-audit-logs).
- Create a User following this user guide: Creating an IAM user.
- Select the created User.
- Select Security credentials tab.
- Click Create Access Key in the Access Keys section.
- Select Third-party service as Use case.
- Click Next.
- Optional: Add a description tag.
- Click Create access key.
- Click Download .CSV file to save the Access Key and Secret Access Key for future reference.
- Click Done.
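If you prefer to script the steps above instead of using the console, the following is a minimal boto3 sketch. It assumes AWS admin credentials are already configured locally (for example, via the AWS CLI), uses the example bucket name and region from this guide, and uses github-audit-writer as a hypothetical name for the IAM user that GitHub will stream with; adjust any of these to match your environment.

```python
import boto3

BUCKET = "github-audit-logs"   # example bucket name from this guide
REGION = "us-east-1"           # assumed region; use your own
WRITER = "github-audit-writer" # hypothetical IAM user name

s3 = boto3.client("s3", region_name=REGION)
iam = boto3.client("iam")

# Create the bucket (us-east-1 must omit CreateBucketConfiguration).
if REGION == "us-east-1":
    s3.create_bucket(Bucket=BUCKET)
else:
    s3.create_bucket(
        Bucket=BUCKET,
        CreateBucketConfiguration={"LocationConstraint": REGION},
    )

# Create the IAM user GitHub will use to write audit logs,
# then create programmatic credentials for it.
iam.create_user(UserName=WRITER)
keys = iam.create_access_key(UserName=WRITER)

# Store these values securely; you will enter them in GitHub later.
print(keys["AccessKey"]["AccessKeyId"])
print(keys["AccessKey"]["SecretAccessKey"])
```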
Configure the IAM policy for GitHub S3 streaming
- In the AWS console, go to IAM > Policies > Create policy > JSON tab.
- Copy and paste the following policy (replace github-audit-logs if you entered a different bucket name):

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Sid": "AllowPutObjects",
        "Effect": "Allow",
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::github-audit-logs/*"
      }
    ]
  }
- Click Next > Create policy.
- Name the policy GitHubAuditStreamingPolicy and click Create policy.
- Go back to the IAM user created earlier.
- Select the Permissions tab.
- Click Add permissions > Attach policies directly.
- Search for and select GitHubAuditStreamingPolicy.
- Click Next > Add permissions.
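The same policy can be created and attached with boto3. This is a sketch only: it reuses the example policy document above and the hypothetical github-audit-writer user name from the earlier sketch.

```python
import json
import boto3

iam = boto3.client("iam")

policy_doc = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowPutObjects",
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::github-audit-logs/*",
        }
    ],
}

# Create the customer-managed policy for GitHub streaming.
resp = iam.create_policy(
    PolicyName="GitHubAuditStreamingPolicy",
    PolicyDocument=json.dumps(policy_doc),
)

# Attach it to the writer user created earlier (hypothetical name).
iam.attach_user_policy(
    UserName="github-audit-writer",
    PolicyArn=resp["Policy"]["Arn"],
)
```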
Configure GitHub Enterprise Cloud audit log streaming
- Sign in to GitHub Enterprise Cloud as an enterprise owner.
- Click your profile photo, then click Enterprise settings.
- In the enterprise account sidebar, click Settings > Audit log > Log streaming.
- Select Configure stream and click Amazon S3.
- Under Authentication, click Access keys.
- Provide the following configuration details:
- Region: Select the bucket's region (for example, us-east-1).
- Bucket: Type the name of the bucket you want to stream to (for example, github-audit-logs).
- Access Key ID: Enter your access key ID from the IAM user.
- Secret Key: Enter your secret key from the IAM user.
- Click Check endpoint to verify that GitHub can connect and write to the Amazon S3 endpoint.
- After you've successfully verified the endpoint, click Save.
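Once the stream is saved, GitHub begins writing audit log objects to the bucket. If you want to confirm this outside the GitHub UI, a short boto3 check like the following can list the newest objects; it assumes the example bucket name from this guide and locally configured credentials that can read the bucket.

```python
import boto3

s3 = boto3.client("s3")

# List up to 10 keys to confirm GitHub is delivering audit log objects.
resp = s3.list_objects_v2(Bucket="github-audit-logs", MaxKeys=10)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["LastModified"], obj["Size"])
```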
Create read-only IAM user & keys for Google SecOps
- Go to AWS Console > IAM > Users > Add users.
- Click Add users.
- Provide the following configuration details:
- User: Enter secops-reader.
- Access type: Select Access key – Programmatic access.
- Click Create user.
- Attach the minimal read policy (custom): Users > secops-reader > Permissions > Add permissions > Attach policies directly > Create policy.
JSON:

  {
    "Version": "2012-10-17",
    "Statement": [
      {
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::github-audit-logs/*"
      },
      {
        "Effect": "Allow",
        "Action": ["s3:ListBucket"],
        "Resource": "arn:aws:s3:::github-audit-logs"
      }
    ]
  }
- Name the policy secops-reader-policy and click Create policy.
- Go back to the user: search for and select secops-reader-policy, then click Next > Add permissions.
- Create an access key for secops-reader: go to Security credentials > Access keys > Create access key, then download the .CSV file (you'll paste these values into the feed).
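As with the writer user, these read-only steps can be scripted. The sketch below uses boto3 with the same names shown above (secops-reader and secops-reader-policy) and the example bucket name; it is an optional equivalent of the console flow, not a required step.

```python
import json
import boto3

iam = boto3.client("iam")

read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": "arn:aws:s3:::github-audit-logs/*"},
        {"Effect": "Allow", "Action": ["s3:ListBucket"],
         "Resource": "arn:aws:s3:::github-audit-logs"},
    ],
}

# Create the read-only user and its minimal policy, then attach the policy.
iam.create_user(UserName="secops-reader")
policy = iam.create_policy(
    PolicyName="secops-reader-policy",
    PolicyDocument=json.dumps(read_policy),
)
iam.attach_user_policy(
    UserName="secops-reader",
    PolicyArn=policy["Policy"]["Arn"],
)

# These are the credentials you will paste into the Google SecOps feed.
keys = iam.create_access_key(UserName="secops-reader")
print(keys["AccessKey"]["AccessKeyId"])
print(keys["AccessKey"]["SecretAccessKey"])
```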
Configure a feed in Google SecOps to ingest GitHub logs
- Go to SIEM Settings > Feeds.
- Click + Add New Feed.
- In the Feed name field, enter a name for the feed (for example, GitHub audit logs).
- Select Amazon S3 V2 as the Source type.
- Select GitHub as the Log type.
- Click Next.
- Specify values for the following input parameters:
- S3 URI: s3://github-audit-logs/
- Source deletion options: Select the deletion option according to your preference.
- Maximum File Age: Include files modified in the last number of days. Default is 180 days.
- Access Key ID: User access key with access to the S3 bucket.
- Secret Access Key: User secret key with access to the S3 bucket.
- Asset namespace: The asset namespace.
- Ingestion labels: The label applied to the events from this feed.
- Click Next.
- Review your new feed configuration in the Finalize screen, and then click Submit.
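If the feed fails to connect, you can sanity-check the secops-reader credentials outside Google SecOps before retrying. The sketch below performs the same kind of access the Amazon S3 V2 feed needs (list the bucket and read an object); the access key values are placeholders for the ones in the downloaded .CSV file.

```python
import boto3

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",          # placeholder: secops-reader access key ID
    aws_secret_access_key="your-secret",  # placeholder: secops-reader secret key
)

# List one object and read it back to confirm both permissions work.
resp = s3.list_objects_v2(Bucket="github-audit-logs", MaxKeys=1)
for obj in resp.get("Contents", []):
    body = s3.get_object(Bucket="github-audit-logs", Key=obj["Key"])["Body"].read()
    print(obj["Key"], len(body), "bytes")
```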
UDM mapping table
Log Field | UDM Mapping | Logic
---|---|---
actor | principal.user.userid | The value is taken from the actor field.
actor_id | principal.user.attribute.labels.value | The value is taken from the actor_id field.
actor_ip | principal.ip | The value is taken from the actor_ip field.
actor_location.country_code | principal.location.country_or_region | The value is taken from the actor_location.country_code field.
application_name | target.application | The value is taken from the application_name field.
business | target.user.company_name | The value is taken from the business field.
business_id | target.resource.attribute.labels.value | The value is taken from the business_id field.
config.url | target.url | The value is taken from the config.url field.
created_at | metadata.event_timestamp | The value is converted from UNIX milliseconds to a timestamp.
data.cancelled_at | extensions.vulns.vulnerabilities.scan_end_time | The value is converted from ISO8601 format to a timestamp.
data.email | target.email | The value is taken from the data.email field.
data.event | security_result.about.labels.value | The value is taken from the data.event field.
data.events | security_result.about.labels.value | The value is taken from the data.events field.
data.head_branch | security_result.about.labels.value | The value is taken from the data.head_branch field.
data.head_sha | target.file.sha256 | The value is taken from the data.head_sha field.
data.hook_id | target.resource.attribute.labels.value | The value is taken from the data.hook_id field.
data.started_at | extensions.vulns.vulnerabilities.scan_start_time | The value is converted from ISO8601 format to a timestamp.
data.team | target.user.group_identifiers | The value is taken from the data.team field.
data.trigger_id | security_result.about.labels.value | The value is taken from the data.trigger_id field.
data.workflow_id | security_result.about.labels.value | The value is taken from the data.workflow_id field.
data.workflow_run_id | security_result.about.labels.value | The value is taken from the data.workflow_run_id field.
enterprise.name | additional.fields.value.string_value | The value is taken from the enterprise.name field.
external_identity_nameid | target.user.email_addresses | If the value is an email address, it is added to the target.user.email_addresses array.
external_identity_nameid | target.user.userid | The value is taken from the external_identity_nameid field.
external_identity_username | target.user.user_display_name | The value is taken from the external_identity_username field.
hashed_token | network.session_id | The value is taken from the hashed_token field.
job_name | target.resource.attribute.labels.value | The value is taken from the job_name field.
job_workflow_ref | target.resource.attribute.labels.value | The value is taken from the job_workflow_ref field.
org | target.administrative_domain | The value is taken from the org field.
org_id | additional.fields.value.string_value | The value is taken from the org_id field.
programmatic_access_type | additional.fields.value.string_value | The value is taken from the programmatic_access_type field.
public_repo | additional.fields.value.string_value | The value is taken from the public_repo field.
public_repo | target.location.name | If the value is "false", it is mapped to "PRIVATE". Otherwise, it is mapped to "PUBLIC".
query_string | additional.fields.value.string_value | The value is taken from the query_string field.
rate_limit_remaining | additional.fields.value.string_value | The value is taken from the rate_limit_remaining field.
repo | target.resource.name | The value is taken from the repo field.
repo_id | additional.fields.value.string_value | The value is taken from the repo_id field.
repository_public | additional.fields.value.string_value | The value is taken from the repository_public field.
request_body | additional.fields.value.string_value | The value is taken from the request_body field.
request_method | network.http.method | The value is converted to uppercase.
route | additional.fields.value.string_value | The value is taken from the route field.
status_code | network.http.response_code | The value is converted to an integer.
timestamp | metadata.event_timestamp | The value is converted from UNIX milliseconds to a timestamp.
token_id | additional.fields.value.string_value | The value is taken from the token_id field.
token_scopes | additional.fields.value.string_value | The value is taken from the token_scopes field.
transport_protocol_name | network.application_protocol | The value is converted to uppercase.
url_path | target.url | The value is taken from the url_path field.
user | target.user.user_display_name | The value is taken from the user field.
user_agent | network.http.user_agent | The value is taken from the user_agent field.
user_agent | network.http.parsed_user_agent | The value is parsed.
user_id | target.user.userid | The value is taken from the user_id field.
workflow.name | security_result.about.labels.value | The value is taken from the workflow.name field.
workflow_run.actor.login | principal.user.userid | The value is taken from the workflow_run.actor.login field.
workflow_run.event | additional.fields.value.string_value | The value is taken from the workflow_run.event field.
workflow_run.head_branch | security_result.about.labels.value | The value is taken from the workflow_run.head_branch field.
workflow_run.head_sha | target.file.sha256 | The value is taken from the workflow_run.head_sha field.
workflow_run.id | target.resource.attribute.labels.value | The value is taken from the workflow_run.id field.
workflow_run.workflow_id | security_result.about.labels.value | The value is taken from the workflow_run.workflow_id field.
N/A | metadata.event_type | The value is determined based on the action and actor fields. If the action field contains "_member", the value is set to "USER_RESOURCE_UPDATE_PERMISSIONS". If the action field is not empty and the actor field is not empty, the value is set to "USER_RESOURCE_UPDATE_CONTENT". Otherwise, the value is set to "USER_RESOURCE_ACCESS".
N/A | metadata.log_type | The value is set to "GITHUB".
N/A | metadata.product_name | The value is set to "GITHUB".
N/A | metadata.vendor_name | The value is set to "GITHUB".
N/A | target.resource.resource_type | The value is set to "STORAGE_OBJECT".
N/A | security_result.about.labels.key | The value is set to a constant string based on the corresponding data field. For example, for data.workflow_id, the key is set to "Workflow Id".
N/A | target.resource.attribute.labels.key | The value is set to a constant string based on the corresponding data field. For example, for data.hook_id, the key is set to "Hook Id".
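As an illustration of two of the documented rules, the sketch below restates the metadata.event_type selection and the public_repo location mapping in plain Python. This is not the Google SecOps parser itself, only the logic described in the table above.

```python
def event_type(action: str, actor: str) -> str:
    # Documented rule: "_member" in action wins, then non-empty action+actor,
    # otherwise the default access event type.
    if action and "_member" in action:
        return "USER_RESOURCE_UPDATE_PERMISSIONS"
    if action and actor:
        return "USER_RESOURCE_UPDATE_CONTENT"
    return "USER_RESOURCE_ACCESS"

def location_name(public_repo: str) -> str:
    # Documented rule: "false" maps to PRIVATE, anything else to PUBLIC.
    return "PRIVATE" if public_repo == "false" else "PUBLIC"

print(event_type("org.add_member", "octocat"))  # USER_RESOURCE_UPDATE_PERMISSIONS
print(location_name("false"))                   # PRIVATE
```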