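Use GPUs with Google Cloud Serverless for Apache Spark

You can attach GPU accelerators to your Google Cloud Serverless for Apache Spark batch workloads to achieve the following results:

Speed up the processing of large-scale data analytics workloads.
Accelerate model training on large datasets by using GPU machine learning libraries.
Perform advanced data analytics, such as video or natural language processing.

All supported Serverless for Apache Spark runtimes add the Spark RAPIDS library to each workload node. Serverless for Apache Spark runtime version 1.1 also adds the XGBoost library to workload nodes. These libraries provide powerful data transformation and machine learning tools that you can use in your GPU-accelerated workloads.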

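GPU benefits

Using GPUs with your Serverless for Apache Spark workloads provides the following benefits:

Performance improvement: GPU acceleration can significantly boost Spark workload performance, particularly for compute-intensive tasks such as machine learning and deep learning, graph processing, and complex analytics.
Faster model training: For machine learning tasks, attaching GPUs can substantially reduce the time required to train models, which lets data scientists and engineers iterate and experiment quickly.
Scalability: You can add more GPU nodes or more powerful GPUs to handle increasingly complex processing needs.
Cost efficiency: Although GPUs require an initial investment, you can achieve cost savings over time through reduced processing times and more efficient resource utilization.
Enhanced data analytics: GPU acceleration lets you perform advanced analytics, such as image and video analysis and natural language processing, on large datasets.
Improved products: Faster processing enables quicker decision-making and more responsive applications.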

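Limitations and considerations

GPU accelerators are available in the premium pricing tier.
You can attach NVIDIA A100 or NVIDIA L4 GPUs to Serverless for Apache Spark batch workloads. A100 and L4 accelerators are subject to Compute Engine GPU regional availability.
The XGBoost library is provided only to Serverless for Apache Spark GPU-accelerated workloads that use a Serverless for Apache Spark runtime version 1.x.
Serverless for Apache Spark GPU-accelerated batch workloads with XGBoost use increased Compute Engine quotas. For example, to run a serverless batch workload that uses an NVIDIA L4 GPU, you must allocate the NVIDIA_L4_GPUS quota.
Accelerator-enabled jobs aren't compatible with the constraints/compute.requireShieldedVm organization policy. If your organization enforces this policy, accelerator-enabled jobs won't run successfully.
When you use RAPIDS GPU acceleration with supported Serverless for Apache Spark runtimes earlier than version 2.2, you must set the default character set to UTF-8. For more information, see Create a serverless batch workload with GPU accelerators.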
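
The character set rule in the last item can be expressed as a small version check. The following is a minimal sketch, not part of the product: the function names needs_utf8_lang and utf8_properties are hypothetical, and the sketch assumes runtime versions are written as MAJOR.MINOR strings. The two property names are the ones used in the submission example later on this page.

```python
def needs_utf8_lang(runtime_version: str) -> bool:
    """Return True when the runtime version is earlier than 2.2."""
    # Assumption: versions look like "1.1" or "2.2" (MAJOR.MINOR).
    major, minor = (int(part) for part in runtime_version.split(".")[:2])
    return (major, minor) < (2, 2)


def utf8_properties(runtime_version: str) -> dict[str, str]:
    """Return the LANG properties to add for the given runtime, if any."""
    if not needs_utf8_lang(runtime_version):
        return {}
    return {
        "spark.dataproc.driverEnv.LANG": "C.UTF-8",
        "spark.executorEnv.LANG": "C.UTF-8",
    }


print(utf8_properties("1.1"))
print(utf8_properties("2.2"))
```

Comparing the version as a tuple of integers, rather than as a string, keeps a version such as 2.10 from sorting before 2.2.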

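Pricing

GPU accelerators are available in the premium pricing tier. For accelerator pricing information, see Serverless for Apache Spark pricing.

Before you begin

Before you create a serverless batch workload with attached GPU accelerators, do the following: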

Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.

In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

Roles required to select or create a project

Select a project: Selecting a project doesn't require a specific IAM role. You can select any project that you've been granted a role on.
Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.

Go to project selector

Verify that billing is enabled for your Google Cloud project.

Enable the Dataproc, Compute Engine, and Cloud Storage APIs.

Roles required to enable APIs

To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.

Enable the APIs

Install the Google Cloud CLI.

If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

To initialize the gcloud CLI, run the following command:

gcloud init

In the Google Cloud console, go to the Cloud Storage Buckets page.
Go to Buckets
Click Create.
On the Create a bucket page, enter your bucket information. To go to the next step, click Continue.
1. In the Get started section, do the following:
  - Enter a globally unique name that meets the bucket naming requirements.
  - To add a bucket label, expand the Labels section, click Add label, and specify a key and a value for your label.
2. In the Choose where to store your data section, do the following:
  1. Select a Location type.
  2. Choose a location where your bucket's data is permanently stored from the Location type drop-down menu.
    - If you select the dual-region location type, you can also choose to enable turbo replication by using the relevant checkbox.
  3. To set up cross-bucket replication, select Add cross-bucket replication via Storage Transfer Service and follow these steps:
    Set up cross-bucket replication
    
    In the Bucket menu, select a bucket.
    
    In the Replication settings section, click Configure to configure settings for the replication job.
    
    The Configure cross-bucket replication pane appears.
    
    To filter objects to replicate by object name prefix, enter a prefix that you want to include or exclude objects from, then click Add a prefix.
    
    To set a storage class for the replicated objects, select a storage class from the Storage class menu. If you skip this step, the replicated objects will use the destination bucket's storage class by default.
    
    Click Done.
3. In the Choose how to store your data section, do the following:
  1. Select a default storage class for the bucket or Autoclass for automatic storage class management of your bucket's data.
  2. To enable hierarchical namespace, in the Optimize storage for data-intensive workloads section, select Enable hierarchical namespace on this bucket.
    Note: You cannot enable hierarchical namespace in existing buckets.
4. In the Choose how to control access to objects section, select whether or not your bucket enforces public access prevention, and select an access control method for your bucket's objects.
  Note: You cannot change the Prevent public access setting if this setting is enforced by an organization policy.
5. In the Choose how to protect object data section, do the following:
  - Select any of the options under Data protection that you want to set for your bucket.
    - To enable soft delete, click the Soft delete policy (For data recovery) checkbox, and specify the number of days you want to retain objects after deletion.
    - To set Object Versioning, click the Object versioning (For version control) checkbox, and specify the maximum number of versions per object and the number of days after which the noncurrent versions expire.
    - To enable the retention policy on objects and buckets, click the Retention (For compliance) checkbox, and then do the following:
      - To enable Object Retention Lock, click the Enable object retention checkbox.
      - To enable Bucket Lock, click the Set bucket retention policy checkbox, and choose a unit of time and a length of time for your retention period.
  - To choose how your object data will be encrypted, expand the Data encryption section, and select a Data encryption method.
Click Create.

Create a serverless batch workload with GPU accelerators

Submit a Serverless for Apache Spark batch workload that uses NVIDIA L4 GPUs to run a parallelized PySpark task. Follow these steps using the gcloud CLI:

Using a text or code editor, create the following PySpark code and save it to a test-py-spark-gpu.py file on your local machine.

#!/usr/bin/env python

"""S8s Accelerators Example."""

import subprocess
from typing import Any
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.sql.types import IntegerType
from pyspark.sql.types import StructField
from pyspark.sql.types import StructType

spark = SparkSession.builder.appName("joindemo").getOrCreate()


def get_num_gpus(_: Any) -> int:
  """Returns the number of GPUs."""
  p_nvidia_smi = subprocess.Popen(
      ["nvidia-smi", "-L"], stdin=None, stdout=subprocess.PIPE
  )
  p_wc = subprocess.Popen(
      ["wc", "-l"],
      stdin=p_nvidia_smi.stdout,
      stdout=subprocess.PIPE,
      stderr=subprocess.PIPE,
      universal_newlines=True,
  )
  [out, _] = p_wc.communicate()
  return int(out)


num_workers = 5
result = (
    spark.sparkContext.range(0, num_workers, 1, num_workers)
    .map(get_num_gpus)
    .collect()
)
num_gpus = sum(result)
print(f"Total accelerators: {num_gpus}")

# Run the join example
schema = StructType([StructField("value", IntegerType(), True)])
df = (
    spark.sparkContext.parallelize(range(1, 10000001), 6)
    .map(lambda x: (x,))
    .toDF(schema)
)
df2 = (
    spark.sparkContext.parallelize(range(1, 10000001), 6)
    .map(lambda x: (x,))
    .toDF(schema)
)
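joined_df = df.select(col("value").alias("a")).join(
    df2.select(col("value").alias("b")), col("a") == col("b")
)
# explain() prints the physical plan and returns None, so call it on the
# joined DataFrame instead of assigning its result.
joined_df.explain()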
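
The script counts GPUs by piping the output of nvidia-smi -L through wc -l. As an alternative, the following minimal sketch counts the entries in Python, which avoids the second process. The function names count_gpus_in_listing and get_num_gpus_local are hypothetical, the sample listing uses made-up UUIDs, and the sketch assumes that each GPU appears on a line that starts with "GPU ".

```python
import subprocess


def count_gpus_in_listing(listing: str) -> int:
    """Count GPU entries in text produced by `nvidia-smi -L`."""
    # Assumption: each physical GPU is listed on a line starting with "GPU ".
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def get_num_gpus_local() -> int:
    """Run `nvidia-smi -L` on this machine and count the GPUs it lists."""
    completed = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return count_gpus_in_listing(completed.stdout)


# Illustrative listing; the UUIDs are made up.
sample = (
    "GPU 0: NVIDIA L4 (UUID: GPU-00000000-0000-0000-0000-000000000000)\n"
    "GPU 1: NVIDIA L4 (UUID: GPU-11111111-1111-1111-1111-111111111111)\n"
)
print(count_gpus_in_listing(sample))  # 2
```

Only get_num_gpus_local requires a machine with the NVIDIA driver installed. The parsing function can be checked anywhere.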

Use the gcloud CLI on your local machine to submit a Serverless for Apache Spark batch job with five workers, each accelerated with an L4 GPU:

gcloud dataproc batches submit pyspark test-py-spark-gpu.py \
    --project=PROJECT_ID \
    --region=REGION \
    --deps-bucket=BUCKET_NAME \
    --version=1.1 \
    --properties=spark.dataproc.executor.compute.tier=premium,spark.dataproc.executor.disk.tier=premium,spark.dataproc.executor.resource.accelerator.type=l4,spark.executor.instances=5,spark.dataproc.driverEnv.LANG=C.UTF-8,spark.executorEnv.LANG=C.UTF-8,spark.shuffle.manager=com.nvidia.spark.rapids.RapidsShuffleManager

Notes:

PROJECT_ID: Your Google Cloud project ID.
REGION: An available Compute Engine region to run the workload.
BUCKET_NAME: The name of your Cloud Storage bucket. Spark uploads workload dependencies to a /dependencies folder in this bucket before running the batch workload.
--version: All supported Google Cloud Serverless for Apache Spark runtimes add the RAPIDS library to each node of a GPU-accelerated workload. Only runtime version 1.1 adds the XGBoost library to each node of a GPU-accelerated workload.
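
If you submit the workload from a script, it can be easier to keep the properties in a dictionary than in one long comma-separated string. The following is a minimal sketch that assembles the same command line as the example. The function name build_submit_command is hypothetical, and PROJECT_ID, REGION, and BUCKET_NAME remain placeholders that you replace with your own values.

```python
import shlex


def build_submit_command(
    script: str,
    project_id: str,
    region: str,
    bucket_name: str,
    version: str,
    properties: dict[str, str],
) -> list[str]:
    """Assemble the arguments for `gcloud dataproc batches submit pyspark`."""
    # --properties takes a single comma-separated list of key=value pairs.
    joined = ",".join(f"{key}={value}" for key, value in properties.items())
    return [
        "gcloud", "dataproc", "batches", "submit", "pyspark", script,
        f"--project={project_id}",
        f"--region={region}",
        f"--deps-bucket={bucket_name}",
        f"--version={version}",
        f"--properties={joined}",
    ]


# Property values copied from the example command.
properties = {
    "spark.dataproc.executor.compute.tier": "premium",
    "spark.dataproc.executor.disk.tier": "premium",
    "spark.dataproc.executor.resource.accelerator.type": "l4",
    "spark.executor.instances": "5",
    "spark.dataproc.driverEnv.LANG": "C.UTF-8",
    "spark.executorEnv.LANG": "C.UTF-8",
    "spark.shuffle.manager": "com.nvidia.spark.rapids.RapidsShuffleManager",
}
command = build_submit_command(
    "test-py-spark-gpu.py", "PROJECT_ID", "REGION", "BUCKET_NAME", "1.1",
    properties,
)
print(shlex.join(command))
```

The sketch only builds and prints the command. To run it, you could pass the list to subprocess.run(command, check=True) on a machine where the gcloud CLI is installed and initialized.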

--properties (see Spark resource allocation properties):

spark.dataproc.driverEnv.LANG=C.UTF-8 and spark.executorEnv.LANG=C.UTF-8 (required with runtime versions earlier than 2.2): These properties set the default character set to C.UTF-8.
spark.dataproc.executor.compute.tier=premium (required): GPU-accelerated workloads are billed using premium Data Compute Units (DCUs). See Serverless for Apache Spark accelerator pricing.
spark.dataproc.executor.disk.tier=premium (required): Nodes with A100-40, A100-80, or L4 accelerators must use the premium disk tier.
spark.dataproc.executor.resource.accelerator.type=l4 (required): You must specify exactly one GPU type. The example job selects the L4 GPU. You can specify the following accelerator types by using the following argument names:

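GPU type	Argument name
A100 40GB	a100-40
A100 80GB	a100-80

spark.executor.instances=5 (required): Must be at least 2. This example sets it to 5.
spark.executor.cores (optional): You can set this property to specify the number of core vCPUs. Valid values for L4 GPUs are 4 (the default), 8, 12, 16, 24, 48, or 96. The only valid value for A100 GPUs, which is also the default, is 12. Configurations with L4 GPUs and 24, 48, or 96 cores have 2, 4, or 8 GPUs attached to each executor, respectively. All other configurations have 1 GPU attached.
spark.dataproc.executor.disk.size (required): L4 GPUs have a fixed disk size of 375 GB, except for configurations with 24, 48, or 96 cores, which have 750, 1,500, or 3,000 GB, respectively. If you set this property to a different value when you submit an L4-accelerated workload, an error occurs. If you select an A100 40 or A100 80 GPU, the valid sizes are 375g, 750g, 1500g, 3000g, 6000g, and 9000g.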
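
The relationship between L4 core counts, attached GPUs, and disk size can be captured in a lookup table. The following is a minimal sketch for checking a configuration before you submit it. The names L4_SHAPES, A100_CORES, A100_DISK_SIZES_GB, and l4_shape are hypothetical, and the values are transcribed from the property descriptions above.

```python
# (GPUs per executor, disk size in GB) for each valid L4 core count.
L4_SHAPES = {
    4: (1, 375),
    8: (1, 375),
    12: (1, 375),
    16: (1, 375),
    24: (2, 750),
    48: (4, 1500),
    96: (8, 3000),
}

# A100 executors accept a single core count and a set of disk sizes.
A100_CORES = 12
A100_DISK_SIZES_GB = (375, 750, 1500, 3000, 6000, 9000)


def l4_shape(cores: int = 4) -> tuple[int, int]:
    """Return (GPUs per executor, disk size in GB) for an L4 core count."""
    if cores not in L4_SHAPES:
        valid = ", ".join(str(count) for count in L4_SHAPES)
        raise ValueError(f"L4 cores must be one of {valid}; got {cores}")
    return L4_SHAPES[cores]


gpus, disk_gb = l4_shape(24)
print(f"24 cores: {gpus} GPUs, {disk_gb} GB disk")  # 24 cores: 2 GPUs, 750 GB disk
```

Rejecting an unsupported core count locally surfaces the problem before the workload is submitted.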

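spark.executor.memory (optional) and spark.executor.memoryOverhead (restricted): You can set spark.executor.memory, but you can't set spark.executor.memoryOverhead. The amount of available memory not consumed by the set property is applied to the unset property. spark.executor.memoryOverhead is set to 40% of available memory for PySpark batch workloads and to 10% for other workloads (see Spark resource allocation properties).

The following table shows the maximum amount of memory that you can set for different A100 and L4 GPU configurations. The minimum value for either property is 1024 MB.

	A100 (40 GB)	A100 (80 GB)	L4 (4 cores)	L4 (8 cores)	L4 (12 cores)	L4 (16 cores)	L4 (24 cores)	L4 (48 cores)	L4 (96 cores)
Maximum total memory (MB)	78040	165080	13384	26768	40152	53536	113072	160608	321216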
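
To see how the memory limits interact, the following minimal sketch splits the maximum total memory between spark.executor.memory and spark.executor.memoryOverhead. It is an illustration under stated assumptions, not the service's implementation: the names MAX_TOTAL_MEMORY_MB and split_memory are hypothetical, "available memory" is assumed to equal the table maximum, and percentages are rounded down to whole megabytes.

```python
# Maximum total memory in MB, keyed by (accelerator type, executor cores).
# Values are transcribed from the table above.
MAX_TOTAL_MEMORY_MB = {
    ("a100-40", 12): 78040,
    ("a100-80", 12): 165080,
    ("l4", 4): 13384,
    ("l4", 8): 26768,
    ("l4", 12): 40152,
    ("l4", 16): 53536,
    ("l4", 24): 113072,
    ("l4", 48): 160608,
    ("l4", 96): 321216,
}
MIN_PROPERTY_MB = 1024


def split_memory(accelerator, cores, executor_memory_mb=None, pyspark=True):
    """Return (memory MB, overhead MB) for one executor."""
    total = MAX_TOTAL_MEMORY_MB[(accelerator, cores)]
    if executor_memory_mb is None:
        # Overhead share: 40% for PySpark workloads, 10% otherwise.
        overhead = total * (40 if pyspark else 10) // 100
        memory = total - overhead
    else:
        # Memory not consumed by the set property goes to the unset one.
        memory = executor_memory_mb
        overhead = total - memory
    if memory < MIN_PROPERTY_MB or overhead < MIN_PROPERTY_MB:
        raise ValueError("each property must be at least 1024 MB")
    return memory, overhead


print(split_memory("l4", 4))                            # (8031, 5353)
print(split_memory("l4", 4, executor_memory_mb=10000))  # (10000, 3384)
```

For example, requesting 13000 MB of executor memory on a 4-core L4 configuration would leave only 384 MB of overhead, which is below the 1024 MB minimum, so the sketch raises an error.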

Spark RAPIDS properties (optional): By default, Serverless for Apache Spark sets the following Spark RAPIDS property values:
- spark.plugins=com.nvidia.spark.SQLPlugin
- spark.executor.resource.gpu.amount=1
- spark.task.resource.gpu.amount=1/$spark_executor_cores
- spark.shuffle.manager='': By default, this property is unset. NVIDIA recommends that you enable the RAPIDS shuffle manager when you use GPUs to improve performance. To do so, set spark.shuffle.manager=com.nvidia.spark.rapids.RapidsShuffleManager when you submit a workload.
- spark.rapids.sql.concurrentGpuTasks=minimum of (gpuMemoryinMB / 8, 4)
- spark.rapids.shuffle.multiThreaded.writer.threads=minimum of (number of CPU cores in the VM / number of GPUs per VM, 32)
- spark.rapids.shuffle.multiThreaded.reader.threads=minimum of (number of CPU cores in the VM / number of GPUs per VM, 32)
To set Spark RAPIDS properties, see RAPIDS Accelerator for Apache Spark Configuration. To set Spark advanced properties, see RAPIDS Accelerator for Apache Spark Advanced Configuration.
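
To make the default formulas concrete, the following minimal sketch computes several of the listed values for a given executor shape. The function name rapids_defaults is hypothetical. The sketch assumes that the number of CPU cores in the VM equals spark.executor.cores and that the division for thread counts rounds down, neither of which is stated above.

```python
def rapids_defaults(executor_cores: int, gpus_per_vm: int = 1) -> dict[str, str]:
    """Compute a subset of the default Spark RAPIDS property values."""
    # Shuffle threads: cores per GPU, capped at 32 (integer division assumed).
    threads = min(executor_cores // gpus_per_vm, 32)
    return {
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.executor.resource.gpu.amount": "1",
        # Each task is assigned a 1/cores fraction of a GPU.
        "spark.task.resource.gpu.amount": str(1 / executor_cores),
        "spark.rapids.shuffle.multiThreaded.writer.threads": str(threads),
        "spark.rapids.shuffle.multiThreaded.reader.threads": str(threads),
    }


for key, value in rapids_defaults(executor_cores=4).items():
    print(f"{key}={value}")
```

With the default L4 shape of 4 cores and 1 GPU, the task GPU amount is 0.25 and both shuffle thread counts are 4.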
