Class BlobWriteSessionConfigs (2.37.0)

public final class BlobWriteSessionConfigs

Factory class to select and construct BlobWriteSessionConfigs.

There are several strategies which can be used to upload a Blob to Google Cloud Storage. This class provides factories which allow you to select the appropriate strategy for your workload.

Comparison of Strategies
Strategy: Default (Chunk based upload)
Factory Method(s): #getDefault()
Description: Buffer up to a configurable number of bytes in memory, and write to Cloud Storage when the buffer is full or on close. Buffer size is configurable via DefaultBlobWriteSessionConfig#withChunkSize(int).
Transport(s) Supported: gRPC, HTTP
Considerations: The network will only be used for the following operations:
  1. Creating the Resumable Upload Session
  2. Transmitting zero or more incremental chunks
  3. Transmitting the final chunk and finalizing the Resumable Upload Session
  4. If any of the above are interrupted with a retryable error, the Resumable Upload Session will be queried to reconcile client side state with Cloud Storage
Retry Support: Each chunk is retried up to the limitations specified in StorageOptions#getRetrySettings().
Cloud Storage API used: Resumable Upload

Strategy: Buffer to disk then upload
Factory Method(s): #bufferToTempDirThenUpload(), #bufferToDiskThenUpload(Path), #bufferToDiskThenUpload(Collection<Path>)
Description: Buffer bytes to a temporary file on disk. On close(), upload the entire file's contents to Cloud Storage, then delete the temporary file.
Transport(s) Supported: gRPC, HTTP
Considerations:
  1. A Resumable Upload Session will be used to upload the file on disk.
  2. If the upload is interrupted with a retryable error, the Resumable Upload Session will be queried to restart the upload from Cloud Storage's last received byte.
Retry Support: Upload the file in the fewest number of RPCs possible, retrying within the limitations specified in StorageOptions#getRetrySettings().
Cloud Storage API used: Resumable Upload

Strategy: Journal to disk while uploading
Factory Method(s): #journaling(Collection<Path>)
Description: Create a Resumable Upload Session. Before transmitting bytes to Cloud Storage, write them to a recovery file on disk. If the stream to Cloud Storage is interrupted with a retryable error, query the offset of the Resumable Upload Session, then open the recovery file from that offset and transmit the bytes to Cloud Storage.
Transport(s) Supported: gRPC
Considerations:
  1. The stream to Cloud Storage will be held open until either a) the write is complete or b) the stream is interrupted.
  2. Because the bytes are journaled to disk, the upload to Cloud Storage can only be as fast as the disk.
  3. The use of Compute Engine Local NVMe SSD is strongly encouraged compared to Compute Engine Persistent Disk.
Retry Support: Opening the stream for upload will be retried up to the limitations specified in StorageOptions#getRetrySettings(). All bytes are buffered to disk and allow for recovery from any arbitrary offset.
Cloud Storage API used: Resumable Upload

Strategy: Parallel Composite Upload
Factory Method(s): #parallelCompositeUpload()
Description: Break the stream of bytes into smaller part objects, uploading each part in parallel, then compose the parts together to make the ultimate object.
Transport(s) Supported: gRPC, HTTP
Considerations:
  1. Performing parallel composite uploads costs more money. Class A operations are performed to create each part and to perform each compose. If a storage tier other than STANDARD is used, early deletion fees apply to deletion of the parts.

    An illustrative example: upload a 5GiB object using 64MiB as the max size per part.

    1. 80 parts will be created (Class A)
    2. 3 compose calls will be performed (Class A)
    3. 80 parts along with 2 intermediary compose objects will be deleted (free, as long as the STANDARD storage class is used)

    Once the parts and intermediary compose objects are deleted, there will be no storage charges related to those temporary objects.
  2. The service account/credentials used to perform the parallel composite upload require storage.objects.delete (https://cloud.google.com/storage/docs/access-control/iam-permissions#object_permissions) in order to clean up the temporary part and intermediary compose objects.

    To handle part and intermediary compose object deletion out of band, pass PartCleanupStrategy#never() to ParallelCompositeUploadBlobWriteSessionConfig#withPartCleanupStrategy(PartCleanupStrategy); this will prevent automatic cleanup.
  3. See the Parallel composite uploads documentation (https://cloud.google.com/storage/docs/parallel-composite-uploads) for a more in-depth explanation of the limitations of parallel composite uploads.
  4. A failed upload can leave part and intermediary compose objects behind. These count as storage usage, and you will be billed for them.

    By default, if an upload fails, an attempt will be made to clean up the part and intermediary compose objects. However, if the program crashes, the client has no means of performing the cleanup.

    Every part and intermediary compose object is created with a name which ends in .part. An Object Lifecycle Management rule can be set up on your bucket to automatically clean up objects with that suffix after some period of time. See Object Lifecycle Management (https://cloud.google.com/storage/docs/lifecycle) for full details and a guide on how to set up a Delete rule (https://cloud.google.com/storage/docs/lifecycle#delete) with a suffix match condition (https://cloud.google.com/storage/docs/lifecycle#matchesprefix-suffix).
  5. Parallel composite uploads are not a one-size-fits-all solution. They carry real overhead until the object being uploaded is large enough. The inflection point depends on many factors, and there is no single value that suits every case. You will need to experiment with your deployment and workload to determine if parallel composite uploads are useful to you.
Retry Support: Automatic retries will be applied for the following:
  1. Creation of each individual part
  2. Performing an intermediary compose
  3. Performing a delete to clean up each part and intermediary compose object

Retrying the creation of the final object is contingent upon an appropriate precondition being supplied when calling Storage#blobWriteSession(BlobInfo, BlobWriteOption...). Either BlobTargetOption#doesNotExist() or Storage.BlobTargetOption#generationMatch(long) should be specified in order to make the final request idempotent.

Each operation will be retried up to the limitations specified in StorageOptions#getRetrySettings().
Cloud Storage API used:
  - Parallel composite uploads (https://cloud.google.com/storage/docs/parallel-composite-uploads)
  - Direct uploads (https://cloud.google.com/storage/docs/uploading-objects-from-memory)
  - Compose (https://cloud.google.com/storage/docs/composite-objects)
  - Object delete (https://cloud.google.com/storage/docs/deleting-objects)

See Also: Storage#blobWriteSession(BlobInfo, BlobWriteOption...), BlobWriteSessionConfig, GrpcStorageOptions.Builder#setBlobWriteSessionConfig(BlobWriteSessionConfig)
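
The figures in the illustrative parallel composite upload example (5GiB object, 64MiB parts) can be checked with a little arithmetic. The sketch below is not part of the library; the class and method names are hypothetical. It assumes that a compose request accepts at most 32 source objects and that each compose after the first reuses the previous intermediary object as one of its sources, which reproduces the 80 parts, 3 compose calls, and 2 intermediary objects quoted in the comparison of strategies.

```java
final class ParallelCompositeUploadMath {
  /** Assumed limit: a compose request accepts at most 32 source objects. */
  static final int MAX_COMPOSE_SOURCES = 32;

  /** Number of part objects needed for the object (ceiling division). */
  static long partCount(long objectSizeBytes, long maxPartSizeBytes) {
    return (objectSizeBytes + maxPartSizeBytes - 1) / maxPartSizeBytes;
  }

  /**
   * Number of compose calls for the given number of parts (parts >= 1), assuming the first
   * compose consumes up to 32 parts and every later compose consumes the previous
   * intermediary object plus up to 31 new parts.
   */
  static long composeCalls(long parts) {
    if (parts <= MAX_COMPOSE_SOURCES) {
      return 1;
    }
    long remaining = parts - MAX_COMPOSE_SOURCES;
    long perFollowUp = MAX_COMPOSE_SOURCES - 1;
    return 1 + (remaining + perFollowUp - 1) / perFollowUp;
  }

  public static void main(String[] args) {
    long parts = partCount(5L << 30, 64L << 20); // 5GiB object, 64MiB per part
    long composes = composeCalls(parts);
    // Every compose except the last one produces an intermediary object.
    System.out.println(parts + " parts, " + composes + " composes, "
        + (composes - 1) + " intermediary objects");
  }
}
```

Each part creation and each compose counted here is a Class A operation, so the same two methods give a rough operation count for other object and part sizes.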

Inheritance

java.lang.Object > BlobWriteSessionConfigs

Static Methods

bidiWrite()

public static BidiBlobWriteSessionConfig bidiWrite()

Factory to produce a resumable upload using a bi-directional stream. This should provide a small performance increase compared to a regular resumable upload.

Configuration of the buffer size can be performed via BidiBlobWriteSessionConfig#withBufferSize(int).

Returns
Type Description
BidiBlobWriteSessionConfig

bufferToDiskThenUpload(Path path)

public static BufferToDiskThenUpload bufferToDiskThenUpload(Path path)

Create a new BlobWriteSessionConfig which will first buffer the content of the object to a temporary file under the specified path.

Once the file on disk is closed, the entire file will then be uploaded to Cloud Storage.

See Also: Storage#blobWriteSession(BlobInfo, BlobWriteOption...), GrpcStorageOptions.Builder#setBlobWriteSessionConfig(BlobWriteSessionConfig)

Parameter
Name Description
path Path
Returns
Type Description
BufferToDiskThenUpload
Exceptions
Type Description
IOException
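
To make the buffer-then-upload behaviour concrete, here is a minimal, library-independent sketch of the idea: bytes are written to a temporary file under a given directory, and only when the channel is closed is the whole file handed to an uploader and then deleted. This is not the library's implementation. `BufferToDiskSketch` and its `Uploader` interface are hypothetical names, and the uploader stands in for the upload to Cloud Storage.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class BufferToDiskSketch implements WritableByteChannel {
  /** Hypothetical stand-in for uploading the finished file to Cloud Storage. */
  interface Uploader {
    void upload(Path file) throws IOException;
  }

  private final Path tempFile;
  private final FileChannel fileChannel;
  private final Uploader uploader;
  private boolean open = true;

  BufferToDiskSketch(Path directory, Uploader uploader) throws IOException {
    // The temporary file lives under the caller-supplied directory.
    this.tempFile = Files.createTempFile(directory, "buffer-", ".tmp");
    this.fileChannel = FileChannel.open(tempFile, StandardOpenOption.WRITE);
    this.uploader = uploader;
  }

  @Override
  public int write(ByteBuffer src) throws IOException {
    // Nothing touches the network here; bytes only go to the local file.
    int written = 0;
    while (src.hasRemaining()) {
      written += fileChannel.write(src);
    }
    return written;
  }

  @Override
  public boolean isOpen() {
    return open;
  }

  @Override
  public void close() throws IOException {
    if (!open) {
      return;
    }
    open = false;
    fileChannel.close();
    try {
      // Upload the entire file in one go, now that its full content is known.
      uploader.upload(tempFile);
    } finally {
      // The temporary file is removed whether or not the upload succeeded.
      Files.deleteIfExists(tempFile);
    }
  }

  Path tempFile() {
    return tempFile;
  }
}
```

Deferring the upload until close means the total size is known before any bytes are sent, at the cost of needing enough free disk space for the whole object.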

bufferToDiskThenUpload(Collection<Path> paths)

public static BufferToDiskThenUpload bufferToDiskThenUpload(Collection<Path> paths)

Create a new BlobWriteSessionConfig which will first buffer the content of the object to a temporary file under one of the specified paths.

Once the file on disk is closed, the entire file will then be uploaded to Cloud Storage.

The specifics of how the work is spread across multiple paths are undefined and subject to change.

See Also: Storage#blobWriteSession(BlobInfo, BlobWriteOption...), GrpcStorageOptions.Builder#setBlobWriteSessionConfig(BlobWriteSessionConfig)

Parameter
Name Description
paths Collection<Path>
Returns
Type Description
BufferToDiskThenUpload
Exceptions
Type Description
IOException

bufferToTempDirThenUpload()

public static BlobWriteSessionConfig bufferToTempDirThenUpload()

Create a new BlobWriteSessionConfig which will first buffer the content of the object to a temporary file under the directory named by the java.io.tmpdir system property.

Once the file on disk is closed, the entire file will then be uploaded to Cloud Storage.

See Also: Storage#blobWriteSession(BlobInfo, BlobWriteOption...), GrpcStorageOptions.Builder#setBlobWriteSessionConfig(BlobWriteSessionConfig)

Returns
Type Description
BlobWriteSessionConfig
Exceptions
Type Description
IOException

getDefault()

public static DefaultBlobWriteSessionConfig getDefault()

Factory to produce the default configuration for uploading an object to Cloud Storage.

Configuration of the chunk size can be performed via DefaultBlobWriteSessionConfig#withChunkSize(int).

See Also: GrpcStorageDefaults#getDefaultStorageWriterConfig()

Returns
Type Description
DefaultBlobWriteSessionConfig
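
As a usage sketch, the default configuration can be given a custom chunk size, registered on the gRPC options builder, and then used through Storage#blobWriteSession. The class name, the 8MiB chunk size, and the 30 second timeout are illustrative assumptions; running `upload` needs credentials and an existing bucket, and the google-cloud-storage library on the classpath.

```java
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.BlobWriteSession;
import com.google.cloud.storage.BlobWriteSessionConfigs;
import com.google.cloud.storage.DefaultBlobWriteSessionConfig;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.util.concurrent.TimeUnit;

final class DefaultChunkedUploadExample {
  /** Default strategy with an 8MiB in-memory buffer (illustrative value). */
  static DefaultBlobWriteSessionConfig config() {
    return BlobWriteSessionConfigs.getDefault().withChunkSize(8 * 1024 * 1024);
  }

  /** Uploads the bytes to bucket/object; both names are supplied by the caller. */
  static BlobInfo upload(String bucket, String object, byte[] content) throws Exception {
    StorageOptions options =
        StorageOptions.grpc().setBlobWriteSessionConfig(config()).build();
    try (Storage storage = options.getService()) {
      BlobInfo info = BlobInfo.newBuilder(bucket, object).build();
      // The doesNotExist() precondition makes the write fail if the object already exists.
      BlobWriteSession session =
          storage.blobWriteSession(info, Storage.BlobWriteOption.doesNotExist());
      try (WritableByteChannel channel = session.open()) {
        ByteBuffer buffer = ByteBuffer.wrap(content);
        while (buffer.hasRemaining()) {
          channel.write(buffer);
        }
      }
      // The result becomes available once the channel has been closed.
      return session.getResult().get(30, TimeUnit.SECONDS);
    }
  }
}
```

A larger chunk size means fewer requests per object but more memory held per open session, so the value is worth tuning against the number of concurrent uploads.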

journaling(Collection<Path> paths)

public static JournalingBlobWriteSessionConfig journaling(Collection<Path> paths)

Create a new BlobWriteSessionConfig which will journal writes to a temporary file under one of the specified paths before transmitting the bytes to Cloud Storage.

The specifics of how the work is spread across multiple paths are undefined and subject to change.

See Also: Storage#blobWriteSession(BlobInfo, BlobWriteOption...), GrpcStorageOptions.Builder#setBlobWriteSessionConfig(BlobWriteSessionConfig)

Parameter
Name Description
paths Collection<Path>
Returns
Type Description
JournalingBlobWriteSessionConfig
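
A short configuration sketch follows, assuming the journal directories are supplied by the caller (ideally on fast local disks, since upload throughput is bounded by disk speed). The class and method names are hypothetical; the wrapper methods declare `throws Exception` only to stay agnostic about what the factory may throw. Because journaling is listed as gRPC only, the config is registered on the gRPC options builder.

```java
import com.google.cloud.storage.BlobWriteSessionConfigs;
import com.google.cloud.storage.JournalingBlobWriteSessionConfig;
import com.google.cloud.storage.StorageOptions;
import java.nio.file.Path;
import java.util.Collection;

final class JournalingConfigExample {
  /** Builds a journaling config over the given directories. */
  static JournalingBlobWriteSessionConfig config(Collection<Path> journalDirs) throws Exception {
    return BlobWriteSessionConfigs.journaling(journalDirs);
  }

  /** Registers the journaling config on gRPC storage options. */
  static StorageOptions options(Collection<Path> journalDirs) throws Exception {
    return StorageOptions.grpc().setBlobWriteSessionConfig(config(journalDirs)).build();
  }
}
```

Passing several directories lets the library spread recovery files across disks, although how it does so is left unspecified.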

parallelCompositeUpload()

public static ParallelCompositeUploadBlobWriteSessionConfig parallelCompositeUpload()

Create a new BlobWriteSessionConfig which will perform a Parallel Composite Upload by breaking the stream into parts and composing the parts together to make the ultimate object.

See Also: Storage#blobWriteSession(BlobInfo, BlobWriteOption...), GrpcStorageOptions.Builder#setBlobWriteSessionConfig(BlobWriteSessionConfig)

Returns
Type Description
ParallelCompositeUploadBlobWriteSessionConfig
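
The sketch below shows one way the two knobs discussed for parallel composite uploads might be combined: disabling automatic part cleanup with PartCleanupStrategy#never() (for deployments that delete `.part` objects out of band), and supplying a precondition so the final object creation can be retried. The class and method names are hypothetical. Since Storage#blobWriteSession accepts BlobWriteOption arguments, the sketch uses Storage.BlobWriteOption.doesNotExist() as the precondition.

```java
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.BlobWriteSession;
import com.google.cloud.storage.BlobWriteSessionConfigs;
import com.google.cloud.storage.ParallelCompositeUploadBlobWriteSessionConfig;
import com.google.cloud.storage.ParallelCompositeUploadBlobWriteSessionConfig.PartCleanupStrategy;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

final class ParallelCompositeUploadExample {
  /** Parallel composite upload that never deletes parts automatically. */
  static ParallelCompositeUploadBlobWriteSessionConfig config() {
    return BlobWriteSessionConfigs.parallelCompositeUpload()
        .withPartCleanupStrategy(PartCleanupStrategy.never());
  }

  /** Registers the config on gRPC storage options. */
  static StorageOptions options() {
    return StorageOptions.grpc().setBlobWriteSessionConfig(config()).build();
  }

  /** Opens a session whose final object creation is idempotent thanks to the precondition. */
  static BlobWriteSession newSession(Storage storage, String bucket, String object) {
    BlobInfo info = BlobInfo.newBuilder(bucket, object).build();
    return storage.blobWriteSession(info, Storage.BlobWriteOption.doesNotExist());
  }
}
```

With cleanup disabled, leftover part objects keep accruing storage charges until something else removes them, so this setting is best paired with a lifecycle rule on the bucket.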