Note: Storage batch operations is available only if you've configured Storage Intelligence.

This document describes storage batch operations, a Cloud Storage capability that lets you perform operations on billions of objects in a serverless manner. Using storage batch operations, you can automate large-scale API operations on billions of objects, reducing the development time required to write and maintain scripts for each request.

To learn how to create storage batch operations jobs, see Create and manage storage batch operations jobs.
Storage batch operations let you run one of four transformations on multiple objects at once: placing an object hold, deleting an object, updating object metadata, or rewriting an object. To use storage batch operations, you create a job configuration that defines which transformations should be applied to which objects.

Creating a batch operation returns a long-running operation (LRO) that indicates the status of your request: whether the transformation has been applied to all objects specified in your request.
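To make the LRO lifecycle concrete, here is a minimal Python polling sketch. It assumes the returned operation follows the standard google.api_core long-running operation interface (done() and result()); the client and create call shown in the usage comment are illustrative assumptions, not a confirmed API surface.

```python
import time


def wait_for_batch_job(operation, poll_interval_seconds=30, timeout_seconds=3600):
    """Poll a long-running operation (LRO) until the batch job completes.

    `operation` is assumed to follow the standard google.api_core
    Operation interface: done() reports completion, and result()
    returns the finished job or raises the job's error.
    """
    deadline = time.monotonic() + timeout_seconds
    while not operation.done():
        if time.monotonic() > deadline:
            raise TimeoutError("Batch job did not finish within the timeout.")
        time.sleep(poll_interval_seconds)
    # result() raises if the transformation could not be applied to the
    # objects specified in the request.
    return operation.result()


# Hypothetical usage -- the client and method names below are
# illustrative assumptions, not a confirmed API:
#
#   client = StorageBatchOperationsClient()
#   operation = client.create_job(parent=..., job=..., job_id="job-01")
#   job = wait_for_batch_job(operation)
```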
Benefits
Scalability: Perform transformations on millions of objects with a single storage batch operations job.
Serverless execution: Run batch jobs in a serverless environment, eliminating the need to manage infrastructure.
Automation: Automate complex and repetitive tasks, improving operational efficiency.
Reduced development time: Avoid writing and maintaining complex custom scripts.
Performance: Complete time-sensitive operations within the required time. With multiple batch jobs running concurrently on a bucket, you can process up to one billion objects within three hours.
Automatic retries: Failed operations are retried automatically.
Job monitoring: Detailed progress tracking to monitor the status and completion of all jobs.
Use cases

When used with Storage Insights datasets, storage batch operations let you accomplish the following tasks:

Security management: Set encryption keys on multiple objects using the rewrite object method, and apply or remove object holds to control object immutability.
Compliance: Use object holds to meet data retention requirements for regulatory compliance, and delete data between specific timeframes to meet wipeout compliance requirements.
Data transformation: Perform bulk updates to object metadata.
Cost optimization: Bulk delete objects in Cloud Storage buckets to reduce storage costs.
Job configurations
To create a storage batch operations job, you need to set the following job configurations. Job configurations are parameters that control how the job is defined for different processing requirements.
Job name: A unique name that identifies the storage batch operations job. It is used for tracking, monitoring, and referencing the job. Job names are alphanumeric, for example, job-01.
Job description (optional): A brief description of the job's purpose. This helps with understanding and documenting the job details. For example, Deletes all objects in a bucket.
Bucket name: The name of the storage bucket that contains the objects to process. This is essential for locating the input data. For example, my-bucket. You can specify only one bucket name per job.
Object selection: The selection criteria that define which objects to process. You can specify the criteria using any one of the following options:
Manifest: Create a manifest and specify its location when you create the storage batch operations job. The manifest is a CSV file, uploaded to Google Cloud, that contains one object or a list of objects that you want to process. Each row in the manifest must include the bucket and name of the object. You can optionally specify the generation of the object; if you don't specify the generation, the current version of the object is used.
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-08-26(UTC)"],[],[],null,["# Storage batch operations\n\n| Storage batch operations is available only if you've configured [Storage Intelligence](/storage/docs/storage-intelligence/overview).\n\nThis document describes storage batch operations, a\nCloud Storage capability that lets you perform operations on billions of\nobjects in a serverless manner. Using\nstorage batch operations, you can automate large-scale API\noperations on billions of objects, reducing the development time required to\nwrite and maintain scripts for each request.\n\nTo learn how to create storage batch operations jobs, see\n[Create and manage storage batch operations jobs](/storage/docs/batch-operations/create-manage-batch-operation-jobs).\n\nOverview\n--------\n\nStorage batch operations let you run one of four transformations on\nmultiple objects at once: placing an object hold, deleting an object,\nupdating object metadata, and rewriting objects. To use\nstorage batch operations, you create a [job configuration](#job-configurations) that\ndefines what transformations should be applied to which objects.\n\nCreating a batch operation returns a long-running operation\n(LRO) that indicates the status of your request: whether the transformation has\nbeen applied to all specified objects in your request.\n\n### Benefits\n\n- **Scalability**: Perform transformations on millions of objects with a single storage batch operations job.\n- **Serverless execution**: Run batch jobs in a serverless environment, eliminating the need to manage infrastructure.\n- **Automation**: Automate complex and repetitive tasks, improving operational efficiency.\n- **Reduced development time**: Avoid writing and maintaining complex custom scripts.\n- **Performance**: Complete time-sensitive operations within the required time. 
The file must include a header row of the following format:

bucket,name,generation

The following is an example of a manifest:

```
bucket,name,generation
bucket_1,object_1,generation_1
bucket_1,object_2,generation_2
bucket_1,object_3,generation_3
```

Caution: Ensure that the manifest only includes objects from the bucket provided in the storage batch operations job. Rows referencing other buckets are ignored.

You can also create a manifest using Storage Insights datasets. For details, see Create a manifest using Storage Insights datasets. For an illustration of generating and uploading a manifest programmatically, see the sketch after this list.

Object prefixes: Specify a list of prefixes to filter objects within the bucket. Only objects with these prefixes are processed. If empty, all objects in the bucket are processed.
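As referenced in the manifest option above, the following is a minimal Python sketch of generating and uploading a manifest with the google-cloud-storage client library. The bucket name, prefix, and manifest path are placeholder assumptions; only the CSV layout (the bucket,name,generation header shown earlier) comes from this document.

```python
import csv
import io

from google.cloud import storage  # pip install google-cloud-storage


def build_manifest(bucket_name, prefix=None, manifest_path="manifests/manifest.csv"):
    """List objects under a prefix and upload a bucket,name,generation manifest.

    All names here (bucket, prefix, manifest path) are placeholders.
    """
    client = storage.Client()
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["bucket", "name", "generation"])  # required header row
    for blob in client.list_blobs(bucket_name, prefix=prefix):
        # Pinning the generation is optional; omit it to target the
        # current version of each object.
        writer.writerow([bucket_name, blob.name, blob.generation])
    # Upload the manifest. Every row references the job's bucket, since
    # rows referencing other buckets are ignored.
    manifest_blob = client.bucket(bucket_name).blob(manifest_path)
    manifest_blob.upload_from_string(buffer.getvalue(), content_type="text/csv")
    return f"gs://{bucket_name}/{manifest_path}"
```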
Job type: Storage batch operations supports the following job types, running a single job per batch operation.

Object deletion: You can delete objects within a bucket. This is crucial for cost optimization, data lifecycle management, and compliance with data deletion policies.

Caution: By default, Cloud Storage retains soft-deleted objects for a duration of seven days. If you accidentally delete objects, you can restore these soft-deleted objects during this period. However, if you have disabled soft delete for your bucket, you cannot recover deleted objects.

Metadata updates: You can modify object metadata. This includes updating custom metadata, storage class, and other object properties.

Object hold updates: You can enable or disable object holds. Object holds prevent objects from being deleted or modified, which is essential for compliance and data retention purposes.

Object encryption key updates: You can manage the customer-managed encryption keys for one or more objects. This includes applying or changing encryption keys using the rewrite object method.

Limitations

Storage batch operations has the following limitations:

Storage batch operations jobs have a maximum lifetime of 14 days. Any ongoing job that doesn't complete within 14 days of its creation is automatically cancelled.
We don't recommend running more than 20 concurrent batch operations jobs on the same bucket.
Storage batch operations is not supported on the following buckets:
Buckets that have Requester Pays enabled.
Buckets located in the us-west8 region.

What's next

Create and manage storage batch operations jobs
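To make the bucket-related limitations above easier to catch early, here is a hedged preflight sketch using the google-cloud-storage client. The helper name and the idea of preflighting are our own, not part of the product; the Requester Pays and us-west8 checks mirror the limitations listed in this document.

```python
from google.cloud import storage

UNSUPPORTED_LOCATIONS = {"US-WEST8"}  # per the limitations above


def check_batch_operations_eligibility(bucket_name):
    """Raise if the bucket hits a documented storage batch operations limitation."""
    # Note: this lookup itself fails on a Requester Pays bucket unless a
    # billing project is configured on the client.
    bucket = storage.Client().get_bucket(bucket_name)
    if bucket.requester_pays:
        raise ValueError(f"{bucket_name}: Requester Pays buckets are not supported.")
    if (bucket.location or "").upper() in UNSUPPORTED_LOCATIONS:
        raise ValueError(f"{bucket_name}: buckets in us-west8 are not supported.")
```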