Storage bucket naming guidelines
Bucket names must adhere to the following naming conventions:
- Be unique within the project. A project appends a unique prefix to the bucket name, ensuring there aren't clashes within the organization. In the unlikely event of a prefix and bucket name clash across organizations, the bucket creation fails with a bucket name in use error.
- Have at least one and no more than 57 characters.
- Refrain from including any personally identifiable information (PII).
- Be DNS-compliant.
- Start with a letter and contain only letters, numbers, and hyphens.
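As a quick pre-flight check, the following minimal shell sketch validates a candidate name against the length and character rules above (it is an illustration only; it does not check DNS compliance, PII, or uniqueness):
# Hypothetical helper: accepts 1-57 characters, starting with a letter and
# containing only letters, numbers, and hyphens.
check_bucket_name() {
  local name="$1"
  if [[ "$name" =~ ^[a-zA-Z][a-zA-Z0-9-]{0,56}$ ]]; then
    echo "OK: $name"
  else
    echo "Invalid bucket name: $name" >&2
    return 1
  fi
}

check_bucket_name "my-audit-logs"   # OK
check_bucket_name "1-bad-name"      # Invalid: must start with a letter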
Install the s3cmd CLI tool
The s3cmd tool is a command-line tool for managing object storage.
- To download the tool, navigate to the directory where the GDC bundle was extracted. Run the following commands to extract the s3cmd image, s3cmd.tar.tar.gz, to an empty temporary directory:
  tmpdir=$(mktemp -d)
  gdcloud artifacts extract oci/ $tmpdir \
    --image-name=$(gdcloud artifacts tree oci | grep s3cmd.tar | sed 's/^.* //')
- Use scp to copy the tar file to a client machine where you use s3cmd for object operations, then unpack and install the image.
Choose one of the following installation methods to install the s3cmd tool:
Install through tar file
To unpack the archive and install the s3cmd package, run the following commands. You must have the Python distutils module to install the package. The module is often part of the core Python package, or you can install it using your package manager.
tar xvf /tmp/gpc-system-tar-files/s3cmd.tar.tar.gz
cd /tmp/gpc-system-tar-files/s3cmd
sudo python3 setup.py install
Optional: Clean up the downloaded files:
rm /tmp/gpc-system-tar-files/s3cmd.tar.tar.gz
rm -r /tmp/gpc-system-tar-files/s3cmd
Install with the Docker image
To install the s3cmd image, run the following commands:
docker load --input s3cmd-docker.tar
export S3CFG=/EMPTY_FOLDER_PATH/
alias s3cmd="docker run -it --net=host --mount=type=bind,source=/$S3CFG/,target=/g/ s3cmd-docker:latest -c /g/s3cfg"
Optional: Clean up the downloaded files:
rm s3cmd-docker.tar
Add the export and alias to the .bashrc file so that they persist after restarting the client.
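For example, a minimal sketch of persisting the settings (the folder path is a placeholder):
echo 'export S3CFG=/EMPTY_FOLDER_PATH/' >> ~/.bashrc
echo 'alias s3cmd="docker run -it --net=host --mount=type=bind,source=/$S3CFG/,target=/g/ s3cmd-docker:latest -c /g/s3cfg"' >> ~/.bashrc
source ~/.bashrc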
Configure the s3cmd tool
Use the s3cmd tool for object-based operations.
Run the s3cmd --configure command and specify the following:
- Access Key: Enter the access key obtained from the secret in Get bucket access credentials.
- Secret Key: Enter the secret key obtained from the secret in Get bucket access credentials.
- Default Region: Press ENTER.
- S3 Endpoint: Enter the endpoint your Infrastructure Operator (IO) provides.
- For DNS-style bucket naming, enter s3://%(bucket).
- Optional: Enter an encryption password to protect files in transit.
- In Path to GPG, enter /usr/bin/gpg.
- Enter Yes to use the HTTPS protocol.
- Press Enter to skip entering the proxy server name.
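After the configuration completes, you can verify access with a few basic s3cmd operations, for example (the bucket and file names are placeholders):
s3cmd ls                                                   # List the buckets you can access.
s3cmd put ./example.txt s3://FULLY_QUALIFIED_BUCKET_NAME   # Upload an object.
s3cmd ls s3://FULLY_QUALIFIED_BUCKET_NAME                  # List the objects in the bucket.
s3cmd get s3://FULLY_QUALIFIED_BUCKET_NAME/example.txt ./example-copy.txt   # Download the object.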
Create storage buckets
Before you begin
A project namespace manages bucket resources in the root admin cluster. You must have a project to create a bucket. To create a new project, see Create a project. You must also have appropriate bucket permissions to perform the following operations; see Grant bucket access.
Create a bucket
To create a bucket, apply a bucket specification to your project namespace:
kubectl apply -f bucket.yaml
The following is an example of a bucket specification:
apiVersion: object.gdc.goog/v1alpha1
kind: Bucket
metadata:
  name: BUCKET_NAME
  namespace: NAMESPACE_NAME
spec:
  description: DESCRIPTION
  storageClass: standard-rwo
  bucketPolicy:
    lockingPolicy:
      defaultObjectRetentionDays: RETENTION_DAY_COUNT
For more details, see the Bucket API reference.
List storage buckets
To list all the buckets that you have access to in a given object storage tenant, complete the following steps:
Run the following commands to list buckets, either across all namespaces or within a specific project namespace:
kubectl get buckets --all-namespaces
kubectl get buckets --namespace NAMESPACE_NAME
Delete storage buckets
You can delete storage buckets by using the CLI. Buckets must be empty before you can delete them.
Use the GET or DESCRIBE command in the View bucket configuration section to get the fully qualified bucket name.
If the bucket is not empty, empty the bucket:
s3cmd rm --recursive --force s3://FULLY_QUALIFIED_BUCKET_NAME
Delete the empty bucket:
kubectl delete buckets/BUCKET_NAME --namespace NAMESPACE_NAME
View bucket configuration
Use either command to view the configuration details for a bucket:
kubectl describe buckets/BUCKET_NAME --namespace NAMESPACE_NAME
kubectl get buckets/BUCKET_NAME --namespace NAMESPACE_NAME -o yaml
Set an object retention period
By default, you can delete objects at any time. Enable object locking with a retention period to prevent all objects in the bucket from deletion for the specified number of days. You cannot delete the bucket until all of its objects have passed their retention period and you have deleted them.
You must enable object locking when creating the bucket. You cannot enable or disable object locking after you create a bucket. However, you can modify the default object retention period.
You can create a bucket with or without enabling object locking. If you've enabled object locking, specifying a default retention period is optional.
To modify the retention period, update the Bucket.spec.bucketPolicy.lockingPolicy.defaultObjectRetentionDays field in the Bucket resource.
The following is an example of updating the field in the Bucket resource:
apiVersion: object.gdc.goog/v1alpha1
kind: Bucket
metadata:
  name: BUCKET_NAME
  namespace: NAMESPACE_NAME
spec:
  description: "This bucket has a default retention period specified."
  storageClass: standard-rwo
  bucketPolicy:
    lockingPolicy:
      defaultObjectRetentionDays: RETENTION_DAY_COUNT
---
apiVersion: object.gdc.goog/v1alpha1
kind: Bucket
metadata:
  name: BUCKET_NAME
  namespace: NAMESPACE_NAME
spec:
  description: "This would enable object locking but not specify a default retention period."
  storageClass: standard-rwo
  bucketPolicy:
    lockingPolicy:
---
apiVersion: object.gdc.goog/v1alpha1
kind: Bucket
metadata:
  name: BUCKET_NAME
  namespace: NAMESPACE_NAME
spec:
  description: "This bucket does not have locking or retention enabled."
  storageClass: standard-rwo
Any updates to the retention period apply to objects created in the bucket after the update. For pre-existing objects, the retention period does not change.
When you've enabled object locking, if you attempt to overwrite an object, you add a new version of the object. You can retrieve both object versions.
For details on how to list object versions, see ListObjectVersions in the Amazon Web Services documentation: https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectVersions.html
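For example, if the AWS CLI is available on your client, a hedged sketch of listing object versions against your object storage endpoint might look like the following (the endpoint and bucket name are placeholders, and S3 API compatibility for this call is assumed):
aws s3api list-object-versions \
  --endpoint-url https://OBJECT_STORAGE_ENDPOINT \
  --bucket FULLY_QUALIFIED_BUCKET_NAME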
To create a write-once, read-many (WORM) bucket, see the Create a WORM bucket section.
Grant bucket access
You can provide bucket access to other users or service accounts by creating and applying RoleBindings with predefined roles.
Predefined roles
project-bucket-object-viewer: This role lets a user list all buckets in the project, list objects in those buckets, and read objects and object metadata. It does not grant write operations on objects, such as uploading, overwriting, or deleting.
project-bucket-object-admin: This role lets a user list all buckets in the project and perform read and write operations on objects, such as uploading, overwriting, or deleting.
project-bucket-admin: This role lets users manage all buckets in the given namespace, as well as all the objects in those buckets.
To see a complete list of the permissions granted for these roles, see the preset role permissions section.
To get the permissions that you need to create project role bindings, ask your Project IAM Admin to grant you the Project IAM Admin (project-iam-admin) role.
The following is an example of creating a RoleBinding for granting access to a user and a service account:
Create a YAML file on your system, such as rolebinding-object-admin-all-buckets.yaml:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: NAMESPACE_NAME
  name: readwrite-all-buckets
roleRef:
  kind: Role
  name: project-bucket-object-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  namespace: NAMESPACE_NAME
  name: SA_NAME
- kind: User
  namespace: NAMESPACE_NAME
  name: bob@example.com # Could be bob or bob@example.com based on your organization settings.
  apiGroup: rbac.authorization.k8s.io
Apply the YAML file:
kubectl apply \
  -f rolebinding-object-admin-all-buckets.yaml
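Optionally, you can confirm that the binding took effect with a quick authorization check, for example (the user name is a placeholder, and your account must be allowed to impersonate users):
kubectl auth can-i list buckets \
  --namespace NAMESPACE_NAME \
  --as bob@example.com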
Get bucket access credentials
When you grant access to a bucket, the access credentials are created in a Secret.
The format of the secret name is object-storage-key-SUBJECT_TYPE-SUBJECT_HASH.
- Values for SUBJECT_TYPE are the following:
  - user: the user.
  - sa: the ServiceAccount.
- SUBJECT_HASH is the base32-encoded SHA256 hash of the subject name.
As an example, the user bob@foo.com has a secret named:
object-storage-key-user-oy6jdqd6bxfoqcecn2ozv6utepr5bgh355vfku7th5pmejqubdja
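As an illustrative sketch only, you can approximate the hash portion from a subject name as follows. This assumes an unpadded, lowercase base32 encoding, which matches the example above but is not guaranteed by this document:
echo -n "bob@foo.com" \
  | openssl dgst -sha256 -binary \
  | base32 \
  | tr -d '=' \
  | tr '[:upper:]' '[:lower:]'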
Access the user secret
For a user subject, the Secret is in the object-storage-access-keys namespace in the root admin cluster.
Find the secret name:
kubectl auth can-i --list --namespace object-storage-access-keys | grep object-storage-key-
You receive an output similar to the following:
secrets [] [object-storage-key-nl-user-oy6jdqd6bxfoqcecn2ozv6utepr5bgh355vfku7th5pmejqubdja,object-storage-key-std-user-oy6jdqd6bxfoqcecn2ozv6utepr5bgh355vfku7th5pmejqubdja] [get]
Get the contents of the corresponding Secret to access buckets:
kubectl get -o yaml --namespace object-storage-access-keys secret object-storage-key-rm-user-oy6jdqd6bxfoqcecn2ozv6utepr5bgh355vfku7th5pmejqubdja
You receive an output similar to the following:
data:
  access-key-id: MEhYM08wWUMySjcyMkVKTFBKRU8=
  create-time: MjAyMi0wNy0yMiAwMTowODo1OS40MTQyMTE3MDMgKzAwMDAgVVRDIG09KzE5OTAuMzQ3OTE2MTc3
  secret-access-key: Ump0MVRleVN4SmhCSVJhbmlnVDAwbTJZc0IvRlJVendqR0JuYVhiVA==
Decode the access key ID and secret:
echo "MEhYM08wWUMySjcyMkVKTFBKRU8=" | base64 -d \ && echo \ && echo "Ump0MVRleVN4SmhCSVJhbmlnVDAwbTJZc0IvRlJVendqR0JuYVhiVA==" | base64 -d
You receive an output similar to the following:
0HX3O0YC2J722EJLPJEO
Rjt1TeySxJhBIRanigT00m2YsB/FRUzwjGBnaXbT
Use the resulting access key ID and secret key to complete the Configure the s3cmd tool section.
Access the service account secret
For a service account (SA) subject, the Secret is in the same namespace as the bucket. To find the name, run:
kubectl get --namespace NAMESPACE_NAME secrets \
  -o=jsonpath='{.items[?(@.metadata.annotations.object\.gdc\.goog/subject=="SA_NAME")].metadata.name}'
You receive an output similar to the following:
object-storage-key-rm-sa-mng3olp3vsynhswzasowzu3jgzct2ert72pjp6wsbzqhdwckwzbq
You can reference the Secret in your pod as environment variables (https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-environment-variables) or files (https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets-as-files-from-a-pod).
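For example, a minimal Pod sketch (the container image and names are placeholders) that exposes the keys from the example Secret above as environment variables:
apiVersion: v1
kind: Pod
metadata:
  name: object-client
  namespace: NAMESPACE_NAME
spec:
  containers:
  - name: app
    image: APP_IMAGE
    env:
    - name: ACCESS_KEY_ID
      valueFrom:
        secretKeyRef:
          name: object-storage-key-rm-sa-mng3olp3vsynhswzasowzu3jgzct2ert72pjp6wsbzqhdwckwzbq
          key: access-key-id
    - name: SECRET_ACCESS_KEY
      valueFrom:
        secretKeyRef:
          name: object-storage-key-rm-sa-mng3olp3vsynhswzasowzu3jgzct2ert72pjp6wsbzqhdwckwzbq
          key: secret-access-key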
Preset role permissions
project-bucket-object-viewer permissions
This role grants permissions to get and list objects and objects' metadata in the bucket.
The project-bucket-object-viewer role has the following permissions:
Bucket API permissions:
- Get
- List
- Watch
S3 object storage permissions:
- GetObject
- GetObjectAcl
- GetObjectVersion
- ListBucket
- ListBucketVersions
- ListBucketMultipartUploads
- ListMultipartUploadParts
project-bucket-object-admin permissions
This role grants permissions to put and delete objects, object versions, and tags in the bucket. It also grants all permissions in the project-bucket-object-viewer role.
The project-bucket-object-admin role has the following object storage permissions:
S3 object storage permissions:
- AbortMultipartUpload
- DeleteObject
- DeleteObjectVersion
- PutObject
- RestoreObject
project-bucket-admin permissions
This role grants permissions to create, update, or delete Bucket resources in the project namespace. It also grants all permissions in the project-bucket-object-admin role.
The project-bucket-admin role has the following permissions:
Bucket API permissions:
- Create
- Update
- Delete
Create a WORM Bucket
A WORM bucket ensures that objects cannot be overwritten and are retained for a minimum period of time. Audit logging is an example use case for a WORM bucket.
Take the following steps to create a WORM bucket:
Set a retention period when creating the bucket. For example, the following bucket has a retention period of 365 days:
apiVersion: object.gdc.goog/v1alpha1
kind: Bucket
metadata:
  name: foo-logging-bucket
  namespace: foo-service
spec:
  description: "Audit logs for foo"
  storageClass: standard-rwo
  bucketPolicy:
    lockingPolicy:
      defaultObjectRetentionDays: 365
Grant the project-bucket-object-viewer role to all users who need read-only access:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: foo-service
  name: object-readonly-access
roleRef:
  kind: Role
  name: project-bucket-object-viewer
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  namespace: foo-service
  name: foo-log-processor
- kind: User
  name: bob@example.com
  apiGroup: rbac.authorization.k8s.io
Grant the project-bucket-object-admin role to users who need to write content to the bucket:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: foo-service
  name: object-write-access
roleRef:
  kind: Role
  name: project-bucket-object-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  namespace: foo-service
  name: foo-service-account
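With these bindings in place, writers can add objects, and deletions within the retention window are expected to be rejected. For example, a sketch with s3cmd (the fully qualified bucket name is a placeholder, and the exact error behavior depends on the service):
s3cmd put ./audit-2024-01-01.log s3://FULLY_QUALIFIED_BUCKET_NAME
s3cmd rm s3://FULLY_QUALIFIED_BUCKET_NAME/audit-2024-01-01.log   # Expected to fail until the 365-day retention period elapses.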
Restore from object storage to file system on block storage
Allocate a persistent volume
To restore files from an object storage endpoint, follow these steps:
Allocate a persistent volume (PV) to target in the restore. Use a persistent volume claim (PVC) to allocate the volume, as shown in the following example:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restore-pvc
  namespace: restore-ns
spec:
  storageClassName: standard-rwo
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi # Needs sufficient capacity for a full restoration.
Check the status of the PVC:
kubectl get pvc restore-pvc -n restore-ns
After the PVC is in a Bound state, it is ready to consume inside the Pod that rehydrates it.
If a StatefulSet eventually consumes the PV, you must match the rendered StatefulSet PVCs. The pods that the StatefulSet produces consume the hydrated volumes. The following example shows volume claim templates in a StatefulSet named ss:
volumeClaimTemplates:
- metadata:
    name: pvc-name
  spec:
    accessModes: [ "ReadWriteOnce" ]
    storageClassName: standard-rwo
    resources:
      requests:
        storage: 1Gi
Pre-allocate PVCs with names such as pvc-name-ss-0 and pvc-name-ss-1 (the claim template name followed by the pod name) to ensure that the resultant Pods consume the pre-allocated volumes, as shown in the sketch after this list.
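For example, a sketch of one such pre-allocated claim, matching the pvc-name template and the ss StatefulSet above (the namespace is a placeholder):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-name-ss-0
  namespace: NAMESPACE_NAME
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard-rwo
  resources:
    requests:
      storage: 1Gi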
Hydrate the Persistent Volume (PV)
After the PVC is bound to a PV, start the Job to populate the PV:
apiVersion: batch/v1
kind: Job
metadata:
  name: transfer-job
  namespace: transfer
spec:
  template:
    spec:
      restartPolicy: Never # Jobs require an explicit restartPolicy of Never or OnFailure.
      serviceAccountName: data-transfer-sa
      volumes:
      - name: data-transfer-restore-volume
        persistentVolumeClaim:
          claimName: restore-pvc
      containers:
      - name: storage-transfer-pod
        image: gcr.io/private-cloud-staging/storage-transfer:latest
        command:
        - /storage-transfer
        args:
        - --src_endpoint=https://your-src-endpoint.com
        - --src_path=/your-src-bucket
        - --src_credentials=transfer/src-secret
        - --dst_path=/restore-pv-mnt-path
        - --src_type=s3
        - --dst_type=local
        volumeMounts:
        - mountPath: /restore-pv-mnt-path
          name: data-transfer-restore-volume
After the Job has finished running, the data from the object storage bucket populates the volume. A separate pod can consume the data by using the same standard mechanisms for mounting a volume.
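For example, a minimal Pod sketch (the image name is a placeholder) that mounts the restored volume:
apiVersion: v1
kind: Pod
metadata:
  name: restore-consumer
  namespace: restore-ns # Must be the same namespace as the PVC.
spec:
  containers:
  - name: app
    image: APP_IMAGE
    volumeMounts:
    - mountPath: /restored-data
      name: restored-volume
  volumes:
  - name: restored-volume
    persistentVolumeClaim:
      claimName: restore-pvc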