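Use Cloud Storage as a mounted file system
Cloud Storage FUSE lets you load training data to a Cloud Storage bucket and access that data from your custom training job as if it were a mounted file system. Using Cloud Storage FUSE has the following benefits:
- Training data is streamed to your training job instead of being downloaded to replicas, which can speed up data loading and setup when the job starts running.
- Training jobs can handle input and output at scale without making API calls, handling responses, or integrating with client-side libraries.
- Cloud Storage FUSE provides high throughput for sequential reads of large files and in distributed training scenarios.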
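Use cases
We recommend using Cloud Storage to store training data in the following situations:
- Your training data is unstructured data, such as images, text, and video.
- Your training data is structured data in a format such as TFRecord.
- Your training data contains large files, such as raw video.
- You use distributed training.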
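How it works
Custom training jobs can access your Cloud Storage buckets as subdirectories of the root `/gcs` directory. For example, if your training data is located at `gs://example-bucket/data.csv`, your Python training application can read it with `open('/gcs/example-bucket/data.csv', 'r')`, and it can write to the same bucket with ordinary file operations, for instance by appending to a log with `open('/gcs/example-bucket/epoch3.log', 'a')`.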
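As a minimal sketch of this path mapping, the helpers below translate a bucket name and object name into a path under `/gcs` and then use ordinary Python file I/O. The function names (`bucket_path`, `read_lines`, `append_line`) and the `root` parameter are hypothetical conveniences, not part of Vertex AI or Cloud Storage FUSE. The `root` parameter exists only so the code can be tried against a local directory.

```python
# Assumption: inside a Vertex AI custom training job, each bucket appears
# as a subdirectory of /gcs. Outside such a job this directory does not exist.
GCS_ROOT = "/gcs"


def bucket_path(bucket, object_name, root=GCS_ROOT):
    """Map gs://<bucket>/<object_name> to its path under the mounted root."""
    return f"{root.rstrip('/')}/{bucket}/{object_name}"


def read_lines(path):
    """Read a text object through the mounted file system."""
    with open(path, "r") as f:
        return f.readlines()


def append_line(path, text):
    """Append one line to an object, creating the object if it is missing."""
    with open(path, "a") as f:
        f.write(text + "\n")
```

Inside a training job, `read_lines(bucket_path("example-bucket", "data.csv"))` would read the file from the example above, and `append_line(bucket_path("example-bucket", "epoch3.log"), "success!")` would append to a log object in the same bucket.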
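Bucket access permissions
By default, a custom training job can access any Cloud Storage bucket within the same Google Cloud project by using the Vertex AI Custom Code Service Agent.
To control access to buckets, you can assign a custom service account to the job. In this case, access to a Cloud Storage bucket is granted based on the permissions associated with the Cloud Storage roles of the custom service account.
For example, if you want to give the custom training job read and write access to Bucket-A but only read access to Bucket-B, you can assign a custom service account that has the following roles to the job:
- `roles/storage.objectAdmin` for Bucket-A
- `roles/storage.objectViewer` for Bucket-B
If the training job attempts to write to Bucket-B, a "permission denied" error is returned.
For more information on Cloud Storage roles, see [IAM roles for Cloud Storage](/storage/docs/access-control/iam-roles).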
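One way to picture the read-write versus read-only split is as a pair of bucket-level IAM bindings, one with `roles/storage.objectAdmin` and one with `roles/storage.objectViewer`. The sketch below is an illustration only: `role_bindings` and `apply_bindings` are hypothetical helper names, and the bucket names and service account email in the usage note are placeholders. `apply_bindings` assumes the `google-cloud-storage` package is installed and that the caller has credentials allowed to change bucket IAM policies.

```python
def role_bindings(service_account_email, read_write_buckets, read_only_buckets):
    """Return {bucket_name: binding} for one service account.

    Read-write buckets get roles/storage.objectAdmin and read-only buckets
    get roles/storage.objectViewer.
    """
    member = f"serviceAccount:{service_account_email}"
    bindings = {}
    for name in read_write_buckets:
        bindings[name] = {"role": "roles/storage.objectAdmin", "members": {member}}
    for name in read_only_buckets:
        bindings[name] = {"role": "roles/storage.objectViewer", "members": {member}}
    return bindings


def apply_bindings(bindings):
    """Add each binding to its bucket's IAM policy (needs network access)."""
    # Imported here so role_bindings stays usable without the client library.
    from google.cloud import storage

    client = storage.Client()
    for bucket_name, binding in bindings.items():
        bucket = client.bucket(bucket_name)
        policy = bucket.get_iam_policy(requested_policy_version=3)
        policy.bindings.append(binding)
        bucket.set_iam_policy(policy)
```

For example, `role_bindings("trainer@example-project.iam.gserviceaccount.com", ["bucket-a"], ["bucket-b"])` builds the two bindings, and passing the result to `apply_bindings` would attach them. The same grants can also be made in the Google Cloud console or with other tooling. The code form is shown only to make the role-to-bucket mapping explicit.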
Best practices
- Avoid renaming directories. A rename operation is not atomic in Cloud Storage FUSE. If the operation is interrupted, some files remain in the old directory.
- Avoid unnecessarily closing (`close()`) or flushing (`flush()`) files. Closing or flushing a file pushes it to Cloud Storage, which incurs a cost.

Performance optimization guidelines
To get optimal read throughput when using Cloud Storage as a file system, we recommend implementing the following guidelines:
- To reduce the latency introduced by looking up and opening objects in a bucket, store data in larger and fewer files.
- Use [distributed training](/vertex-ai/docs/training/distributed-training) to maximize bandwidth utilization.
- Cache frequently accessed files to improve read performance. For details, see [Overview of caching in Cloud Storage FUSE](/storage/docs/gcsfuse-cache).
- Use local storage for checkpointing and logs instead of Cloud Storage.

Limitations
To learn about the limitations of Cloud Storage FUSE, including the differences between Cloud Storage FUSE and POSIX file systems, see [Limitations and differences from POSIX file systems](/storage/docs/gcs-fuse#differences-and-limitations).

Use Cloud Storage FUSE
To use Cloud Storage FUSE for custom training, do the following:
1. [Create a Cloud Storage bucket](/storage/docs/creating-buckets). Note that dual-region and multi-region buckets are not supported for custom training.
2. Upload your training data to the bucket. For details, see [Uploads](/storage/docs/uploads-downloads#uploads). To learn about other options for transferring data to Cloud Storage, see [Data transfer options](/storage-transfer/docs/transfer-options).
3. [Install Cloud Storage FUSE](/storage/docs/gcsfuse-install).
4. [Mount the bucket in your training application](#how_it_works).

What's next
- [See Cloud Storage FUSE documentation](/storage/docs/gcs-fuse).
- [Learn about Cloud Storage FUSE pricing](/storage/docs/gcs-fuse#charges).
- [Prepare your training application](/vertex-ai/docs/training/code-requirements) for use on Vertex AI.

Last updated: 2025-09-04 (UTC)