Connect from Compute Engine

This guide explains how to create a single Compute Engine client VM and connect it to your Parallelstore instance.

To create and connect from multiple Compute Engine clients, follow the instructions in Connect from Compute Engine: multiple clients.

For best performance, create client Compute Engine VMs in the same zone as the Parallelstore instance.

Required permissions

You must have the following IAM role in order to create a Compute Engine VM:

Create a Compute Engine VM

Follow the instructions to create a Compute Engine VM using one of the following images: HPC Rocky Linux 8, Rocky Linux 9 Optimized, RHEL 9, Ubuntu 22.04, or Debian 12.

You can choose any machine type and boot disk. We recommend at least a c2-standard-4 machine type. For higher client performance, choose a machine type with more vCPUs, which increases the available network throughput. For example, a c3-standard-176 with Tier 1 networking provides 200 Gbps of egress bandwidth.

Google Cloud console

  1. In the Google Cloud console, go to the VM instances page.

  2. Select your project and click Continue.

  3. Click Create instance.

  4. Enter a name for your VM in Name. For more information, see Resource naming convention.

  5. Select the Region and Zone from the drop-down menus for this VM. Your VM should be in the same zone as your Parallelstore instance.

  6. Select a Machine configuration for your VM from the list.

  7. In the Boot disk section, click Change.

  8. Select the Public images tab.

  9. From the Operating system drop-down, select one of: HPC VM image, Ubuntu, or Debian.

  10. From the Version drop-down, select one of: HPC Rocky Linux 8, Ubuntu 22.04 LTS, or Debian GNU/Linux 12 (bookworm). Select either the x86/64 version or the Arm64 version to match your machine type.

  11. To confirm your boot disk options, click Select.

  12. Expand the Advanced Options section, then expand Networking.

  13. Under Network interfaces, select the VPC network you created in Configure a VPC network.

  14. To create and start the VM, click Create.

gcloud

Use the gcloud command-line tool to create a VM:

HPC Rocky Linux 8

gcloud compute instances create VM_NAME \
  --project=PROJECT_ID \
  --zone=LOCATION \
  --machine-type=c2d-standard-112 \
  --network-interface=stack-type=IPV4_ONLY,subnet=NETWORK_NAME,nic-type=GVNIC \
  --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
  --create-disk=auto-delete=yes,boot=yes,device-name=VM_NAME,\
image=projects/cloud-hpc-image-public/global/images/hpc-rocky-linux-8-v20240126,\
mode=rw,size=100,type=pd-balanced

Rocky Linux 9 Optimized

gcloud compute instances create VM_NAME \
  --project=PROJECT_ID \
  --zone=LOCATION \
  --machine-type=c2d-standard-112 \
  --network-interface=stack-type=IPV4_ONLY,subnet=NETWORK_NAME,nic-type=GVNIC \
  --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
  --create-disk=auto-delete=yes,boot=yes,device-name=VM_NAME,\
image=projects/rocky-linux-cloud/global/images/rocky-linux-9-optimized-gcp-v20241112,\
mode=rw,size=100,type=pd-balanced

RHEL 9

gcloud compute instances create VM_NAME \
  --project=PROJECT_ID \
  --zone=LOCATION \
  --machine-type=c2d-standard-112 \
  --network-interface=stack-type=IPV4_ONLY,subnet=NETWORK_NAME,nic-type=GVNIC \
  --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
  --create-disk=auto-delete=yes,boot=yes,device-name=VM_NAME,\
image=projects/rhel-cloud/global/images/rhel-9-v20241112,\
mode=rw,size=100,type=pd-balanced

Ubuntu 22.04

gcloud compute instances create VM_NAME \
  --project=PROJECT_ID \
  --zone=LOCATION \
  --machine-type=c2d-standard-112 \
  --network-interface=stack-type=IPV4_ONLY,subnet=NETWORK_NAME,nic-type=GVNIC \
  --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
  --create-disk=auto-delete=yes,boot=yes,device-name=VM_NAME,\
image=projects/ubuntu-os-cloud/global/images/ubuntu-2204-jammy-v20240927,\
mode=rw,size=100,type=pd-balanced

Debian 12

gcloud compute instances create VM_NAME \
  --project=PROJECT_ID \
  --zone=LOCATION \
  --machine-type=c2d-standard-112 \
  --network-interface=stack-type=IPV4_ONLY,subnet=NETWORK_NAME,nic-type=GVNIC \
  --network-performance-configs=total-egress-bandwidth-tier=TIER_1 \
  --create-disk=auto-delete=yes,boot=yes,device-name=VM_NAME,\
image=projects/debian-cloud/global/images/debian-12-bookworm-v20240415,\
mode=rw,size=100,type=pd-balanced

For more information about available options, see the Compute Engine documentation.

SSH to the client VM

Google Cloud console

To SSH to your Compute Engine VM, you must first create a firewall rule allowing SSH.

  1. In the Google Cloud console, go to the Firewall policies page.

  2. Click Create firewall rule.

  3. Enter a Name for the rule.

  4. For Network, select the VPC network you created earlier.

  5. Select Ingress as the Direction of traffic, and Allow as the Action on match.

  6. From the Targets drop-down, select All instances in the network.

  7. In the Source IPv4 ranges field, enter 0.0.0.0/0.

  8. From Protocols and ports, select Specified protocols and ports.

  9. Select TCP and enter 22 in the Ports field.

  10. Click Create.

Then, SSH to your VM:

  1. In the Google Cloud console, go to the VM instances page.

  2. In the instances table, find your instance's row, and click SSH in the column titled Connect.

  3. If prompted to do so, click Authorize to allow the connection.

gcloud

To SSH to your Compute Engine VM, you must first create a firewall rule allowing SSH.

gcloud compute firewall-rules create FIREWALL_RULE_NAME \
  --allow=tcp:22 \
  --network=NETWORK_NAME \
  --source-ranges=0.0.0.0/0 \
  --project=PROJECT_ID

Then connect using gcloud compute ssh:

gcloud compute ssh VM_NAME --zone=ZONE --project=PROJECT_ID

Install the DAOS client library

The DAOS client library provides a POSIX-like interface to the Parallelstore data layer. The software runs as an agent on your client machines and must be installed and running before you can access your data.

HPC Rocky Linux 8

The following commands must be executed on each Compute Engine VM.

  1. Add the Parallelstore package repository:

    sudo tee /etc/yum.repos.d/parallelstore-v2-6-el8.repo << EOF
    [parallelstore-v2-6-el8]
    name=Parallelstore EL8 v2.6
    baseurl=https://us-central1-yum.pkg.dev/projects/parallelstore-packages/v2-6-el8
    enabled=1
    repo_gpgcheck=0
    gpgcheck=0
    EOF
    
  2. Update the local metadata cache:

    sudo dnf makecache
    
  3. Install daos-client:

    sudo dnf install -y epel-release && \
    sudo dnf install -y daos-client
    
  4. Upgrade libfabric:

    sudo dnf upgrade -y libfabric
    

Rocky Linux 9 Optimized

The following commands must be executed on each Compute Engine VM.

  1. Add the Parallelstore package repository:

    sudo tee /etc/yum.repos.d/parallelstore-v2-6-el9.repo << EOF
    [parallelstore-v2-6-el9]
    name=Parallelstore EL9 v2.6
    baseurl=https://us-central1-yum.pkg.dev/projects/parallelstore-packages/v2-6-el9
    enabled=1
    repo_gpgcheck=0
    gpgcheck=0
    EOF
    
  2. Update the local metadata cache:

    sudo dnf makecache
    
  3. Install daos-client:

    sudo dnf install -y epel-release && \
    sudo dnf install -y daos-client
    
  4. Upgrade libfabric:

    sudo dnf upgrade -y libfabric
    

RHEL 9

The following commands must be executed on each Compute Engine VM.

  1. Add the Parallelstore package repository:

    sudo tee /etc/yum.repos.d/parallelstore-v2-6-el9.repo << EOF
    [parallelstore-v2-6-el9]
    name=Parallelstore EL9 v2.6
    baseurl=https://us-central1-yum.pkg.dev/projects/parallelstore-packages/v2-6-el9
    enabled=1
    repo_gpgcheck=0
    gpgcheck=0
    EOF
    
  2. Update the local metadata cache:

    sudo dnf makecache
    
  3. Install daos-client:

    sudo dnf install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm
    
    sudo dnf install -y epel-release && \
    sudo dnf install -y daos-client
    
  4. Upgrade libfabric:

    sudo dnf upgrade -y libfabric
    

Ubuntu 22.04

The following commands must be executed on each Compute Engine VM.

  1. Add the Parallelstore package repository:

    curl https://us-central1-apt.pkg.dev/doc/repo-signing-key.gpg | sudo apt-key add -
    echo "deb https://us-central1-apt.pkg.dev/projects/parallelstore-packages v2-6-deb main" | sudo tee -a /etc/apt/sources.list.d/artifact-registry.list
    
  2. Update the package index:

    sudo apt update
    
  3. Install daos-client:

    sudo apt install -y daos-client
    

Debian 12

The following commands must be executed on each Compute Engine VM.

  1. Add the Parallelstore package repository:

    curl https://us-central1-apt.pkg.dev/doc/repo-signing-key.gpg | sudo apt-key add -
    echo "deb https://us-central1-apt.pkg.dev/projects/parallelstore-packages v2-6-deb main" | sudo tee -a /etc/apt/sources.list.d/artifact-registry.list
    
  2. Update the package index:

    sudo apt update
    
  3. Install daos-client:

    sudo apt install -y daos-client
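
On any of the distributions above, it can help to confirm that the client binaries are on the PATH before continuing. The following is a minimal sketch, not part of the official procedure. The helper name check_commands is hypothetical, and the binary names daos_agent and dfuse are taken from the commands used later in this guide.

```shell
# Report whether each named command is on PATH.
# Returns non-zero if any command is missing.
check_commands() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "found: $cmd"
    else
      echo "missing: $cmd"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on a client VM after installing daos-client:
#   check_commands daos_agent dfuse
```

If either binary is reported as missing, rerun the installation steps for your distribution.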
    

Increase the open files limit (Ubuntu only)

For VMs running Ubuntu 22.04, you must increase the open files limit to 131072 to support dfuse and the interception library.

If you choose not to use the interception library, you can alternatively run ulimit -n 131072 immediately before starting dfuse.

To increase the open files limit from the default of 1024, run the following commands on each VM.

sudo tee -a /etc/security/limits.conf <<EOF
* soft nofile 131072
* hard nofile 131072
EOF

Then, reboot:

sudo reboot

SSH to the client VM again once it finishes rebooting.
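
After logging back in, you can check that the new limit is in effect. The following sketch prints the soft and hard limits for the current shell and compares the soft limit with the 131072 value required above. The helper name limit_ok is hypothetical.

```shell
# Succeed only if the given limit meets the required minimum.
# A value of "unlimited" always passes.
limit_ok() {
  current=$1
  required=$2
  [ "$current" = "unlimited" ] || [ "$current" -ge "$required" ]
}

soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "soft nofile limit: $soft"
echo "hard nofile limit: $hard"

if limit_ok "$soft" 131072; then
  echo "open files limit is sufficient"
else
  echo "open files limit is below 131072"
fi
```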

Update the DAOS agent configuration

Update /etc/daos/daos_agent.yml as follows:

  1. Uncomment and update access_points with the accessPoints IP addresses from the Parallelstore instance properties. For example: access_points: ['172.21.95.2', '172.21.95.4', '172.21.95.5'].

    To print the access points in the correct format to copy and paste, run the following command:

    echo access_points\: $(gcloud beta parallelstore instances describe \
      INSTANCE_ID --location LOCATION --project PROJECT_ID \
      --format "value[delimiter=', '](format("{0}", accessPoints))")
    
  2. Uncomment the following two lines. Indentation matters, so keep the spaces in front of allow_insecure:

    # transport_config:
    #   allow_insecure: false
    
  3. Change the value of allow_insecure to true, because certificates are not supported:

     transport_config:
       allow_insecure: true
    
  4. Specify the network interface that provides connectivity to the Parallelstore instance. The interface is often eth0, ens4, or enp0s3, but might be different depending on your network configuration. You can use the route command to show your VM's default gateway; the interface to specify is usually the one sharing a subnet with the gateway.

    First, list all available network interfaces:

    ip a
    

    The output is similar to the following:

    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue state UP group default
        link/ether e4:9x:3f:x7:dx:f7 brd ff:ff:ff:ff:ff:ff link-netnsid 0
        inet 10.88.0.3/16 brd 10.88.255.255 scope global eth0
           valid_lft forever preferred_lft forever
    3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1460 qdisc noqueue state DOWN group default
        link/ether 02:4x:6y:1z:84:45 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
           valid_lft forever preferred_lft forever
    

    Run route to display the routing table:

    route
    

    The output is similar to the following:

    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    default         10.88.0.1       0.0.0.0         UG    0      0        0 eth0
    10.88.0.0       0.0.0.0         255.255.0.0     U     0      0        0 eth0
    172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
    

    In the example, the default gateway is 10.88.0.1, and it's shared by eth0, so specify eth0 as the interface to use.

    Edit /etc/daos/daos_agent.yml. Uncomment include_fabric_ifaces and update the value:

    include_fabric_ifaces: ["eth0"]
    

    Save and close the file.
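
As an alternative to reading the route output by eye, the interface attached to the default route can be extracted with the ip tool. This is a minimal sketch that assumes the iproute2 ip command is installed and that the default-route interface is the one that reaches the Parallelstore instance, which might not hold for every network configuration. The helper name default_route_iface is hypothetical.

```shell
# Print the interface name that follows "dev" on each default-route line
# read from standard input.
default_route_iface() {
  awk '$1 == "default" {
    for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); break }
  }'
}

# Only query the routing table when the iproute2 "ip" tool is available.
if command -v ip >/dev/null 2>&1; then
  ip route show default | default_route_iface
fi
```

For the example routing table shown above, the default route goes through eth0, so that is the value to put in include_fabric_ifaces.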

Start the DAOS agent

HPC Rocky Linux 8

sudo systemctl start daos_agent.service

You can check the status to make sure the agent is running:

systemctl status daos_agent.service

Rocky Linux 9 Optimized

sudo systemctl start daos_agent.service

You can check the status to make sure the agent is running:

systemctl status daos_agent.service

RHEL 9

sudo systemctl start daos_agent.service

You can check the status to make sure the agent is running:

systemctl status daos_agent.service

Ubuntu 22.04

sudo mkdir -p /var/run/daos_agent && \
sudo daos_agent -o /etc/daos/daos_agent.yml &

Debian 12

sudo mkdir -p /var/run/daos_agent && \
sudo daos_agent -o /etc/daos/daos_agent.yml &
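
On Ubuntu and Debian the agent is started as a background process rather than through systemd, so systemctl status does not apply. One way to confirm that it is running is to look for the process by name. The following is a sketch that assumes pgrep is installed. The helper name process_running is hypothetical.

```shell
# Succeed if at least one process with exactly the given name is running.
process_running() {
  pgrep -x "$1" >/dev/null 2>&1
}

if process_running daos_agent; then
  echo "daos_agent is running"
else
  echo "daos_agent is not running"
fi
```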

Set up logging

If needed, set up local logging to aid client-side debugging:

export D_LOG_MASK=INFO
export D_LOG_FILE_APPEND_PID=1
rm -f /tmp/client.log.*
export D_LOG_FILE=/tmp/client.log
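
With D_LOG_FILE_APPEND_PID=1 set, each client process is expected to write to its own file, with the process ID appended to the D_LOG_FILE path. The following sketch finds the most recently modified of those files. The helper name latest_log is hypothetical.

```shell
# Print the most recently modified file whose name is the given prefix
# followed by a dot and a suffix (for example, a process ID).
latest_log() {
  ls -t "$1".* 2>/dev/null | head -n 1
}

# Usage after running a client command:
#   tail -n 50 "$(latest_log /tmp/client.log)"
```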

Mount the instance using dfuse

Mount the Parallelstore instance using dfuse (DAOS FUSE).

  1. Edit /etc/fuse.conf to add user_allow_other.

  2. Specify the --multi-user option with dfuse:

    mkdir -p /tmp/parallelstore
    dfuse -m /tmp/parallelstore \
      --pool default-pool \
      --container default-container \
      --disable-wb-cache \
      --thread-count=20 \
      --eq-count=10 \
      --multi-user
    

For help optimizing the values of --thread-count and --eq-count, see the Thread count and event queue count section of the Performance considerations page.
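
Step 1 above can be scripted so that it is safe to run more than once. The following sketch appends user_allow_other to a file only when that exact line is not already present. A commented-out "#user_allow_other" line does not count as a match. The helper name ensure_line is hypothetical.

```shell
# Append a line to a file only if that exact line is not already present.
ensure_line() {
  file=$1
  line=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage on a client VM, from a script run as root
# (writing to /etc/fuse.conf requires root):
#   ensure_line /etc/fuse.conf user_allow_other
```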

Access your Parallelstore instance

Your Parallelstore instance is now mounted to your Compute Engine VM at the path specified by the -m flag, and is readable and writable using standard POSIX syntax, with some exceptions.

If you run df on the instance, the SIZE value is 1.5x the value specified with --capacity-gib. The amount of usable space is still --capacity-gib, because of the erasure coding that Parallelstore uses: every 2 bytes written consume 3 bytes from the perspective of df.
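
The 1.5x relationship can be worked through with shell arithmetic. The capacity of 12000 GiB below is a hypothetical value chosen for illustration, and the helper name df_reported_gib is also hypothetical.

```shell
# Size that df is expected to report, in GiB, for a given --capacity-gib value.
# Every 2 bytes of data consume 3 bytes of raw space, so the factor is 3/2.
df_reported_gib() {
  echo $(( $1 * 3 / 2 ))
}

df_reported_gib 12000   # prints 18000
```

In this example, df would show a size of about 18000 GiB, while the usable capacity remains 12000 GiB.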

Unmount the instance

Unmount the Parallelstore instance with the following command:

sudo umount /tmp/parallelstore/

What's next