StorageConfig(mapping=None, *, ignore_unknown_fields=False, **kwargs)

Storage configuration for a model deployment.
Attributes

| Name | Description |
|---|---|
| model_bucket_uri | `str`. Optional. The Google Cloud Storage bucket URI to load the model from. This URI must point to the directory containing the model's config file (`config.json`) and model weights. A tuned GCSFuse setup can improve LLM Pod startup time by more than 7x. Expected format: gs://. |
| xla_cache_bucket_uri | `str`. Optional. The URI for the GCS bucket containing the XLA compilation cache. If using TPUs, the XLA cache will be written to the same path as model_bucket_uri. This can speed up vLLM model preparation for repeated deployments. |
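
As a minimal sketch of how these two attributes might be populated and checked, the example below uses a hypothetical stand-in dataclass, `StorageConfigSketch`, instead of the real `StorageConfig` class, because this page does not state the import path. The helper `is_gcs_uri`, the bucket names, and the paths are all illustrative assumptions. The check also assumes that both URIs use the `gs://` scheme, which the description above states only for `model_bucket_uri`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageConfigSketch:
    """Hypothetical stand-in that mirrors the two documented attributes."""

    model_bucket_uri: Optional[str] = None
    xla_cache_bucket_uri: Optional[str] = None


def is_gcs_uri(uri: Optional[str]) -> bool:
    """Return True if uri is unset, or starts with gs:// and has text after it.

    Assumption: both attributes are expected to use the gs:// scheme.
    """
    if uri is None:
        return True  # both attributes are documented as optional
    return uri.startswith("gs://") and len(uri) > len("gs://")


# Placeholder bucket and paths, not real resources.
config = StorageConfigSketch(
    model_bucket_uri="gs://example-bucket/models/example-model",
    xla_cache_bucket_uri="gs://example-bucket/xla-cache",
)

if not (is_gcs_uri(config.model_bucket_uri) and is_gcs_uri(config.xla_cache_bucket_uri)):
    raise ValueError("Storage URIs are expected to start with gs://")
```

Given the `**kwargs` in the constructor signature above, the real class would presumably accept the same attribute names as keyword arguments, but that should be confirmed against the library itself.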