GITHUB

This document explains how to ingest GitHub audit logs into Google Security Operations using Amazon S3. The parser first attempts to extract data from the message field using various grok patterns, handling both JSON and non-JSON formats. Based on the extracted process_type value, it then applies specific parsing logic using grok, kv, and other filters to map the raw log data to the Unified Data Model (UDM) schema.
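The two-stage behavior described above can be sketched in Python. This is an illustrative approximation only: the non-JSON sample format, field names, and regex below are assumptions for demonstration, not the parser's actual grok patterns.

```python
import json
import re

def parse_message(message: str) -> dict:
    """Sketch of the parser's first pass: try JSON, then fall back to a
    grok-like regex for non-JSON messages. The non-JSON layout here
    ("<process_type>: key=value ...") is a hypothetical example."""
    try:
        return json.loads(message)
    except json.JSONDecodeError:
        pass
    match = re.match(r"^(?P<process_type>\w+):\s*(?P<rest>.*)$", message)
    if not match:
        return {}
    fields = {"process_type": match.group("process_type")}
    # Collect key=value pairs from the remainder of the line.
    for kv in re.finditer(r"(\w+)=(\S+)", match.group("rest")):
        fields[kv.group(1)] = kv.group(2)
    return fields
```

In the real pipeline, the extracted process_type would select which downstream grok or kv filter runs next.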

Before you begin

Make sure you have the following prerequisites:

  • Google SecOps instance.
  • Privileged access to GitHub Enterprise Cloud tenant with enterprise owner permissions.
  • Privileged access to AWS (S3, IAM).

Collect GitHub Enterprise Cloud prerequisites (Enterprise access)

  1. Sign in to the GitHub Enterprise Cloud Admin Console.
  2. Go to Enterprise settings > Settings > Audit log > Log streaming.
  3. Make sure you have enterprise owner permissions to configure audit log streaming.
  4. Copy and save in a secure location the following details:
    • GitHub Enterprise name
    • Organization names under the enterprise

Configure AWS S3 bucket and Identity and Access Management for Google SecOps

  1. Create an Amazon S3 bucket following this user guide: Creating a bucket
  2. Save the bucket name and region for future reference (for example, github-audit-logs).
  3. Create a user following this user guide: Creating an IAM user.
  4. Select the created User.
  5. Select Security credentials tab.
  6. Click Create Access Key in the Access Keys section.
  7. Select Third-party service as Use case.
  8. Click Next.
  9. Optional: Add a description tag.
  10. Click Create access key.
  11. Click Download .CSV file to save the Access Key and Secret Access Key for future reference.
  12. Click Done.

Configure the IAM policy for GitHub S3 streaming

  1. In the AWS console, go to IAM > Policies > Create policy > JSON tab.
  2. Copy and paste the following policy.
  3. Policy JSON (replace github-audit-logs if you entered a different bucket name):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "AllowPutObjects",
          "Effect": "Allow",
          "Action": "s3:PutObject",
          "Resource": "arn:aws:s3:::github-audit-logs/*"
        }
      ]
    }
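If you script your setup, a small Python sketch like the following can render the same policy for an arbitrary bucket name. The helper function is hypothetical; only the policy content comes from this document.

```python
import json

def render_streaming_policy(bucket: str) -> str:
    """Render the S3 streaming policy shown above for a given bucket name."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPutObjects",
                "Effect": "Allow",
                "Action": "s3:PutObject",
                # Object-level ARN: grants write access to keys in the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=4)
```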
    
  4. Click Next > Create policy.

  5. Name the policy GitHubAuditStreamingPolicy and click Create policy.

  6. Go back to the IAM user created earlier.

  7. Select the Permissions tab.

  8. Click Add permissions > Attach policies directly.

  9. Search for and select GitHubAuditStreamingPolicy.

  10. Click Next > Add permissions.

Configure GitHub Enterprise Cloud audit log streaming

  1. Sign in to GitHub Enterprise Cloud as an enterprise owner.
  2. Click your profile photo, then click Enterprise settings.
  3. In the enterprise account sidebar, click Settings > Audit log > Log streaming.
  4. Select Configure stream and click Amazon S3.
  5. Under Authentication, click Access keys.
  6. Provide the following configuration details:
    • Region: Select the bucket's region (for example, us-east-1).
    • Bucket: Type the name of the bucket you want to stream to (for example, github-audit-logs).
    • Access Key ID: Enter your access key ID from the IAM user.
    • Secret Key: Enter your secret key from the IAM user.
  7. Click Check endpoint to verify that GitHub can connect and write to the Amazon S3 endpoint.
  8. After you've successfully verified the endpoint, click Save.

Create read-only IAM user & keys for Google SecOps

  1. Go to AWS Console > IAM > Users.
  2. Click Add users.
  3. Provide the following configuration details:
    • User: Enter secops-reader.
    • Access type: Select Access key – Programmatic access.
  4. Click Create user.
  5. Attach the minimal read policy (custom): Users > secops-reader > Permissions > Add permissions > Attach policies directly > Create policy.
  6. JSON:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": ["s3:GetObject"],
          "Resource": "arn:aws:s3:::github-audit-logs/*"
        },
        {
          "Effect": "Allow",
          "Action": ["s3:ListBucket"],
          "Resource": "arn:aws:s3:::github-audit-logs"
        }
      ]
    }
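As a sanity check, a sketch like the following can confirm that a policy grants only the two read actions used above. This is an illustrative stdlib-only helper, not part of any AWS or Google SecOps tooling.

```python
import json

# The only actions the secops-reader policy should allow.
READ_ONLY_ACTIONS = {"s3:GetObject", "s3:ListBucket"}

def is_read_only(policy_json: str) -> bool:
    """Return True if every Allow statement grants only read actions."""
    policy = json.loads(policy_json)
    for stmt in policy["Statement"]:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt["Action"]
        if isinstance(actions, str):  # Action may be a string or a list.
            actions = [actions]
        if not set(actions) <= READ_ONLY_ACTIONS:
            return False
    return True
```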
    
  7. Name the policy secops-reader-policy.

  8. Click Create policy, then search for and select secops-reader-policy > Next > Add permissions.

  9. Create an access key for secops-reader: Security credentials > Access keys > Create access key > download the .CSV (you'll paste these values into the feed).

Configure a feed in Google SecOps to ingest GitHub logs

  1. Go to SIEM Settings > Feeds.
  2. Click + Add New Feed.
  3. In the Feed name field, enter a name for the feed (for example, GitHub audit logs).
  4. Select Amazon S3 V2 as the Source type.
  5. Select GitHub as the Log type.
  6. Click Next.
  7. Specify values for the following input parameters:
    • S3 URI: s3://github-audit-logs/
    • Source deletion options: Select deletion option according to your preference.
    • Maximum File Age: Include files modified within the specified number of days. Default is 180 days.
    • Access Key ID: User access key with access to the S3 bucket.
    • Secret Access Key: User secret key with access to the S3 bucket.
    • Asset namespace: the asset namespace.
    • Ingestion labels: the label applied to the events from this feed.
  8. Click Next.
  9. Review your new feed configuration in the Finalize screen, and then click Submit.

UDM mapping table

Log Field | UDM Mapping | Logic
actor | principal.user.userid | The value is taken from the actor field.
actor_id | principal.user.attribute.labels.value | The value is taken from the actor_id field.
actor_ip | principal.ip | The value is taken from the actor_ip field.
actor_location.country_code | principal.location.country_or_region | The value is taken from the actor_location.country_code field.
application_name | target.application | The value is taken from the application_name field.
business | target.user.company_name | The value is taken from the business field.
business_id | target.resource.attribute.labels.value | The value is taken from the business_id field.
config.url | target.url | The value is taken from the config.url field.
created_at | metadata.event_timestamp | The value is converted from UNIX milliseconds to a timestamp.
data.cancelled_at | extensions.vulns.vulnerabilities.scan_end_time | The value is converted from ISO 8601 format to a timestamp.
data.email | target.email | The value is taken from the data.email field.
data.event | security_result.about.labels.value | The value is taken from the data.event field.
data.events | security_result.about.labels.value | The value is taken from the data.events field.
data.head_branch | security_result.about.labels.value | The value is taken from the data.head_branch field.
data.head_sha | target.file.sha256 | The value is taken from the data.head_sha field.
data.hook_id | target.resource.attribute.labels.value | The value is taken from the data.hook_id field.
data.started_at | extensions.vulns.vulnerabilities.scan_start_time | The value is converted from ISO 8601 format to a timestamp.
data.team | target.user.group_identifiers | The value is taken from the data.team field.
data.trigger_id | security_result.about.labels.value | The value is taken from the data.trigger_id field.
data.workflow_id | security_result.about.labels.value | The value is taken from the data.workflow_id field.
data.workflow_run_id | security_result.about.labels.value | The value is taken from the data.workflow_run_id field.
enterprise.name | additional.fields.value.string_value | The value is taken from the enterprise.name field.
external_identity_nameid | target.user.email_addresses | If the value is an email address, it is added to the target.user.email_addresses array.
external_identity_nameid | target.user.userid | The value is taken from the external_identity_nameid field.
external_identity_username | target.user.user_display_name | The value is taken from the external_identity_username field.
hashed_token | network.session_id | The value is taken from the hashed_token field.
job_name | target.resource.attribute.labels.value | The value is taken from the job_name field.
job_workflow_ref | target.resource.attribute.labels.value | The value is taken from the job_workflow_ref field.
org | target.administrative_domain | The value is taken from the org field.
org_id | additional.fields.value.string_value | The value is taken from the org_id field.
programmatic_access_type | additional.fields.value.string_value | The value is taken from the programmatic_access_type field.
public_repo | additional.fields.value.string_value | The value is taken from the public_repo field.
public_repo | target.location.name | If the value is "false", it is mapped to "PRIVATE". Otherwise, it is mapped to "PUBLIC".
query_string | additional.fields.value.string_value | The value is taken from the query_string field.
rate_limit_remaining | additional.fields.value.string_value | The value is taken from the rate_limit_remaining field.
repo | target.resource.name | The value is taken from the repo field.
repo_id | additional.fields.value.string_value | The value is taken from the repo_id field.
repository_public | additional.fields.value.string_value | The value is taken from the repository_public field.
request_body | additional.fields.value.string_value | The value is taken from the request_body field.
request_method | network.http.method | The value is converted to uppercase.
route | additional.fields.value.string_value | The value is taken from the route field.
status_code | network.http.response_code | The value is converted to an integer.
timestamp | metadata.event_timestamp | The value is converted from UNIX milliseconds to a timestamp.
token_id | additional.fields.value.string_value | The value is taken from the token_id field.
token_scopes | additional.fields.value.string_value | The value is taken from the token_scopes field.
transport_protocol_name | network.application_protocol | The value is converted to uppercase.
url_path | target.url | The value is taken from the url_path field.
user | target.user.user_display_name | The value is taken from the user field.
user_agent | network.http.user_agent | The value is taken from the user_agent field.
user_agent | network.http.parsed_user_agent | The value is parsed.
user_id | target.user.userid | The value is taken from the user_id field.
workflow.name | security_result.about.labels.value | The value is taken from the workflow.name field.
workflow_run.actor.login | principal.user.userid | The value is taken from the workflow_run.actor.login field.
workflow_run.event | additional.fields.value.string_value | The value is taken from the workflow_run.event field.
workflow_run.head_branch | security_result.about.labels.value | The value is taken from the workflow_run.head_branch field.
workflow_run.head_sha | target.file.sha256 | The value is taken from the workflow_run.head_sha field.
workflow_run.id | target.resource.attribute.labels.value | The value is taken from the workflow_run.id field.
workflow_run.workflow_id | security_result.about.labels.value | The value is taken from the workflow_run.workflow_id field.
N/A | metadata.event_type | The value is determined based on the action and actor fields. If the action field contains "_member", the value is set to "USER_RESOURCE_UPDATE_PERMISSIONS". If the action field is not empty and the actor field is not empty, the value is set to "USER_RESOURCE_UPDATE_CONTENT". Otherwise, the value is set to "USER_RESOURCE_ACCESS".
N/A | metadata.log_type | The value is set to "GITHUB".
N/A | metadata.product_name | The value is set to "GITHUB".
N/A | metadata.vendor_name | The value is set to "GITHUB".
N/A | target.resource.resource_type | The value is set to "STORAGE_OBJECT".
N/A | security_result.about.labels.key | The value is set to a constant string based on the corresponding data field. For example, for data.workflow_id, the key is set to "Workflow Id".
N/A | target.resource.attribute.labels.key | The value is set to a constant string based on the corresponding data field. For example, for data.hook_id, the key is set to "Hook Id".
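A few of the conditional mappings in the table above can be expressed as a Python sketch. This is illustrative only; the production parser implements this logic in its own filter configuration.

```python
from datetime import datetime, timezone

def map_event_type(action: str, actor: str) -> str:
    """metadata.event_type selection, per the table above."""
    if action and "_member" in action:
        return "USER_RESOURCE_UPDATE_PERMISSIONS"
    if action and actor:
        return "USER_RESOURCE_UPDATE_CONTENT"
    return "USER_RESOURCE_ACCESS"

def map_timestamp(unix_ms: int) -> str:
    """created_at/timestamp: UNIX milliseconds to an RFC 3339 timestamp."""
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc).isoformat()

def map_public_repo(public_repo: str) -> str:
    """public_repo to target.location.name."""
    return "PRIVATE" if public_repo == "false" else "PUBLIC"
```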
