Deploying the Elastifile Cross-Cloud Data Fabric

By Allon Cohen, PhD, VP Products, and Adi Sprachman, Director Product Management, Elastifile

This article describes how to deploy and use the Elastifile cross-cloud data fabric with Google Cloud Platform (GCP).

Elastifile's cross-cloud data fabric is a new platform that enables hybrid-cloud data access and in-cloud data processing, with dynamic workflows that span sites, clouds, and active and inactive data sources. Elastifile provides native file system compatibility, which allows you to deploy existing and new applications in the cloud without refactoring. Elastifile creates an elastic shared storage layer for in-cloud applications by delivering strong consistency, POSIX compliance, native protocols (NFS), and linear, predictable performance at scale.

Objectives

This tutorial teaches you how to do the following:

  • Understand Elastifile's features and architecture.
  • Launch the Elastifile Management Server (EMS).
  • Configure an Elastifile Cloud File System (ECFS) with multiple storage nodes.
  • Mount the ECFS from a test instance and assess performance.

Elastifile Cloud File System

Elastifile Cloud File System (ECFS) is a unique software-defined infrastructure solution that seamlessly runs as a distributed file system within the cloud, letting you run enterprise applications on GCP without modification. In GCP, you deploy ECFS using standardized node images that run natively on Compute Engine. As shown in Figure 1, each ECFS node includes flash storage resources, and the collection of Elastifile storage nodes dynamically aggregates all assigned resources for capacity pooling and linear performance scaling. Elastifile presents this storage pool through NFS as a distributed file system encompassing all of the provisioned resources.

Figure 1: ECFS in-cloud deployment (cluster of Elastifile storage nodes)

For details on the benefits and capabilities of the Elastifile Cloud File System for lifting and shifting applications to the cloud, see the Elastifile CloudConnect information page.

Cloud scale elasticity

When the cluster grows within GCP, Elastifile expands distributed file system performance linearly. The following example, using a backend of 30 Elastifile in-cloud nodes, demonstrates this linear scaling while supporting up to 10,000 simultaneous NFS client connections generated by 1,000 containerized applications. For all of these workloads, at every scale point, latency remains consistently below 2 milliseconds. The environment is shown in Figure 2 below.

Figure 2: ECFS linear file system scaling in GCP

System architecture

Elastifile Cloud File System (ECFS) is built from 1 Elastifile Management Server (EMS) and at least 3 dedicated storage nodes. You can scale out by adding storage nodes, which increases system capacity and performance.

In GCP, ECFS deployment supports two types of storage nodes:

  • Persistent storage node: 5 x 2-TB SSD persistent disks per node
  • Scratch storage node: 4 x 0.75-TB local SSD per node
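For example, the minimal three-node persistent configuration used in this tutorial provisions 3 × 5 × 2 TB = 30 TB of raw SSD persistent disk, which is why the quota request later in this tutorial asks for 30 TB of Persistent Disk SSD.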

Deploying ECFS in GCP

Deployment starts with installing the EMS and launching its graphical user interface. From the GUI, you select the required storage type and system capacity, and the EMS spawns and configures the storage nodes. After you click Deploy, the system is ready for use.

As shown in the following diagram, this tutorial explains how to deploy a minimal Elastifile configuration including 1 EMS instance, 3 persistent storage nodes, and 1 test instance:

Deploying a minimal Elastifile configuration

Costs

This tutorial uses billable components of GCP, including Compute Engine virtual machines and SSD persistent disks.

The Pricing Calculator estimates that the GCP cost of this tutorial is about $10 per hour. Additional charges might apply based on your consumption of GCP resources.

Before you begin

  1. Sign in to your Google Account.

    If you don't already have one, sign up for a new account.

  2. Select or create a GCP project.

    Go to the Manage resources page

  3. Make sure that billing is enabled for your project.

    Learn how to enable billing

  4. Enable the Compute Engine API.

    Enable the API

Requesting quotas

This tutorial requires 64 CPU cores and 30 TB of Persistent Disk SSD in your target region.

If you don't have enough CPUs or Persistent Disk SSD, then you can increase your quotas as follows. Request these additional resources at least a few days in advance to help ensure that there is enough time to fulfill your request.

  1. Go to the Quotas page.

  2. On the Quotas page, select the quotas you want to change.

  3. Click the Edit Quotas button at the top of the page.
  4. Fill in your name, email, and phone number, and click Next.
  5. Fill in your quota request, and click Next.

    • CPUs: request at least 64 more than the current CPU consumption in the target region.
    • Persistent Disk SSD (GB): request at least 30,000 more than the current persistent disk SSD consumption in the target region.
  6. Submit your request. The Compute Engine team will respond to your request within 48 hours.
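To see how much headroom you have before filing the request, you can check your current regional usage and limits from Cloud Shell. The following is a read-only check; substitute your own target region for us-east1 if you deploy elsewhere.

gcloud compute regions describe us-east1 --flatten="quotas[]" \
    --format="table(quotas.metric, quotas.limit, quotas.usage)"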

Requesting access to Elastifile resources

Launching Elastifile instances requires access to Elastifile-maintained Compute Engine images. You can launch Elastifile instances directly from Cloud Launcher through the GCP Console. Alternatively, you can gain access to these images by creating a service account for Elastifile instances and requesting access to Elastifile images for that service account.

Create a service account for Elastifile administration:

  1. Start Cloud Shell:

    Open Cloud Shell

  2. Create an Elastifile service account:

    gcloud iam service-accounts create elastifile --display-name "Elastifile service account"
    

  3. Request image access from Elastifile by emailing support@elastifile.com. Include the following in the body of your email:

    • The email address of your Google Account
    • The email address of the service account you created in the previous step, such as elastifile@[YOUR-PROJECT-ID].iam.gserviceaccount.com

After your quota increase request has been approved and Elastifile has responded to your request for image access, you can proceed with deploying the EMS.
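If Elastifile granted image access to your Google Account as well as to the service account, one way to confirm access before proceeding is to describe an Elastifile image directly. This assumes the images are shared from the elastifile-ci project, as with the EMS image referenced later in this tutorial; use whatever image name Elastifile provides in its response.

gcloud compute images describe emanage-2-1-0-10-e0090fefba5b --project elastifile-ci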

Initializing environment variables

Initialize environment variables by running the following commands in Cloud Shell. You can change the values for REGION and ZONE to specify where to deploy resources.

REGION=us-east1
ZONE=us-east1-b
PROJECT_ID=$DEVSHELL_PROJECT_ID
SERVICE_ACCOUNT=elastifile@${PROJECT_ID}.iam.gserviceaccount.com
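Before continuing, it can help to confirm that the variables resolved as expected, particularly the service account address:

echo "Project: $PROJECT_ID"
echo "Service account: $SERVICE_ACCOUNT"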

Authorizing the service account

To grant permissions to the Elastifile service account, run the following commands in Cloud Shell.

gcloud iam service-accounts add-iam-policy-binding $SERVICE_ACCOUNT --member \
    serviceAccount:$SERVICE_ACCOUNT --role roles/iam.serviceAccountActor
gcloud projects add-iam-policy-binding $PROJECT_ID --member \
    serviceAccount:$SERVICE_ACCOUNT --role roles/compute.instanceAdmin.v1
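To verify that the project-level binding took effect, you can list the roles granted to the service account. This check is optional and read-only:

gcloud projects get-iam-policy $PROJECT_ID --flatten="bindings[].members" \
    --filter="bindings.members:$SERVICE_ACCOUNT" \
    --format="table(bindings.role)"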

Deploying the Elastifile Management Server

The EMS hosts the web-based graphical user interface that you use in the following sections to select the required storage type and system capacity, spawn and configure the storage nodes, and deploy the system.

To launch the EMS instance, run the following command. Be sure to use the image URL provided by Elastifile in response to your access request, which might be different than the --image value shown below.

gcloud compute instances create tutorial-ems --image \
    https://www.googleapis.com/compute/v1/projects/elastifile-ci/global/images/emanage-2-1-0-10-e0090fefba5b \
    --service-account $SERVICE_ACCOUNT --machine-type n1-standard-2 \
    --scopes=cloud-platform --tags http-server --zone $ZONE
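Before creating the firewall rule, you can confirm that the EMS instance has reached the RUNNING state:

gcloud compute instances describe tutorial-ems --zone $ZONE --format "value(status)"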

Creating a firewall rule for HTTP traffic

To create a firewall rule allowing HTTP traffic to the EMS, run the following command.

gcloud compute firewall-rules create default-allow-http --direction=INGRESS \
    --network=default --action=ALLOW --rules=tcp:80 --source-ranges=0.0.0.0/0 \
    --target-tags=http-server
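The rule above allows HTTP access to the EMS console from any address. If you prefer to limit access to a known network, you can narrow the source range; the CIDR below is only a placeholder for your own address range:

gcloud compute firewall-rules update default-allow-http --source-ranges=203.0.113.0/24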

Configuring the Elastifile Management Server

Configure the EMS using its web-based console.

  1. In Cloud Shell, determine the external address of the EMS instance:

    gcloud compute instances describe \
        tutorial-ems --zone $ZONE --format \
        "value(networkInterfaces[0].accessConfigs[0].natIP)"
    

  2. Using the IP address returned by the previous command, navigate to the EMS console in your web browser.

  3. Sign in using default credentials.

    • Username: admin
    • Password: changeme
  4. Review and accept the license agreement by clicking I Accept.

  5. Complete the Change login password form and click Save.

  6. On the System Configuration page, scroll down and click Next.

  7. On the DNS page, enter the service name tutorial-nfs.local and click Next.

  8. On the Notifications page, scroll down and click Configure System.

Deploying storage nodes

The system displays the System View page. The navigation icon to return to this page later is at the bottom of the left-hand menu.

  1. On the System line, where it says Hosts: 0 Storage nodes, Devices: 0, and Raw Capacity: 0.0 B, click Add.

  2. On the Add capacity screen, select Persistent (10 TB per node), then click Add and Close.

  3. Visit the VM instances page in the GCP Console. The page lists the 3 storage nodes launching, with names like tutorial-ems-elfs-abcd1234. (You can also list them from Cloud Shell, as shown after this procedure.)

  4. After a couple of minutes, tiles for the new storage nodes show "Ready to deploy" on the System View page in the EMS web console. Click the Deploy button at the top right. When the dialog appears, click Deploy again, then Close.

  5. After about five minutes, the storage node tiles show Active, at which point you can proceed with the next section to provision a file system.
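While you wait, the following command lists the storage node instances from Cloud Shell, matching the naming pattern shown in step 3:

gcloud compute instances list --zones $ZONE --filter="name~^tutorial-ems-elfs"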

Provisioning a file system

Elastifile's namespaces are provisioned using a Data Container (DC). The DC can be shared between a few specific clients (such as an application's cluster) or open to multiple clients as a shared namespace. For each DC, you can define a soft and hard quota and enable or disable compression and deduplication.

In this section, you create a data container and configure the NFS export.

  1. In the left-hand navigation menu, click Data Containers.

  2. Mouse over the red plus-sign icon in the upper right, then click the Add Data Container cube icon in the dropdown.

  3. Select Public, then click Select.

  4. Fill in the following fields, then click Create.

    • Data container name: tutorial-dc
    • Soft Quota (GB): 1000
    • Hard Quota (GB): 1000
  5. Reveal the Edit panel by clicking the slash (/) row under Exports (1).

  6. Update the following fields, then click Save.

    • User Mapping: Map Everyone to
    • User ID: root

Deploying a test instance

Next, launch a Compute Engine instance to mount the NFS export and perform tests.

  1. In Cloud Shell, use gcloud to deploy the test instance:

    gcloud compute instances create "tutorial-test" --zone $ZONE \
        --machine-type "n1-standard-2" --image-family  "debian-9" --image-project \
        "debian-cloud" --boot-disk-size "200" --boot-disk-device-name \
        "tutorial-test" --metadata=ems=$(gcloud compute instances describe \
        tutorial-ems --zone $ZONE --format "value(networkInterfaces[0].networkIP)")
    

Configuring an NFS mount on a test instance

  1. In Cloud Shell, use SSH to connect to your test instance:

    gcloud compute ssh tutorial-test --zone $ZONE

  2. In your test instance SSH session, install NFS libraries.

    sudo apt-get -y install nfs-common

  3. Use the EMS address stored in the instance metadata to configure the DHCP client to use the EMS instance as a nameserver.

    echo -e "\nsupersede domain-name-servers $(curl -sH "Metadata-Flavor: Google" \
        "http://metadata/computeMetadata/v1/instance/attributes/ems");" | \
        sudo tee -a /etc/dhcp/dhclient.conf
        sudo ifdown -a; sudo ifup -a
    

  4. Show the NFS mounts exported by the EMS instance.

    sudo showmount -e tutorial-nfs.local

    That command displays results like the following:

    Export list for tutorial-nfs.local:
    /tutorial-dc/root *
    

  5. Mount the export.

    sudo mkdir -p /mnt/elastifile
    sudo chown $(whoami):$(whoami) /mnt/elastifile
    sudo mount tutorial-nfs.local:tutorial-dc/root /mnt/elastifile
    

  6. Verify the mount.

    df -h

    Note the last line of the output, which is similar to the following:

    Filesystem                             Size  Used Avail Use% Mounted on
    udev                                   3.7G     0  3.7G   0% /dev
    tmpfs                                  749M   10M  739M   2% /run
    /dev/sda1                              197G 1005M  188G   1% /
    tmpfs                                  3.7G     0  3.7G   0% /dev/shm
    tmpfs                                  5.0M     0  5.0M   0% /run/lock
    tmpfs                                  3.7G     0  3.7G   0% /sys/fs/cgroup
    tutorial-nfs.local:tutorial-dc/root 1000G     0 1000G   0% /mnt/elastifile
    

Testing Elastifile

Next, use the fio utility to assess the read/write performance of the Elastifile Cloud File System accessed through the NFS export.

  1. Install fio.

    sudo apt-get install -y fio

  2. Run a fio test to verify performance.

    fio --filename=/mnt/elastifile/tutorial.fio --direct=1 --rw=randrw \
        --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=8k \
        --size=2G --rwmixread=70 --iodepth=5 --numjobs=5 --runtime=120 \
        --group_reporting --name=tutorial
    

  3. Wait for fio to finish, then inspect the output.

    read : io=7167.4MB, bw=68003KB/s, iops=8500, runt=107923msec
    .
    .
    .
    write: io=3072.1MB, bw=29157KB/s, iops=3644, runt=107923msec
    .
    .
    .
    Run status group 0 (all jobs):
       READ: io=7165.2MB, aggrb=75110KB/s, minb=75110KB/s, maxb=75110KB/s,
       mint=97685msec, maxt=97685msec
       WRITE: io=3074.9MB, aggrb=32232KB/s, minb=32232KB/s, maxb=32232KB/s,
       mint=97685msec, maxt=97685msec
    

  4. When you're done, close your ssh session with the test instance.

    exit
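As a quick sanity check on the sample output above: at the 8 KB block size used in the test, 68,003 KB/s corresponds to roughly 8,500 read IOPS and 29,157 KB/s to roughly 3,644 write IOPS, which matches the iops values fio reports and the 70/30 read/write mix requested with --rwmixread=70.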

Cleaning up

Clean up the compute and IAM resources that this tutorial created.

  1. If your original Cloud Shell environment has timed out, you might need to reinitialize your environment variables by repeating the commands from "Initializing environment variables" above.

  2. Delete all instances matching the naming convention used in this tutorial.

    gcloud compute instances delete --zone $ZONE --quiet $(gcloud compute \
        instances list --filter="zone:$ZONE" --format "value(name)" | \
        grep -E '^tutorial-(ems(-elfs-[a-z0-9]*)?|test)$')
    

  3. Delete the service account and bindings.

    gcloud iam service-accounts remove-iam-policy-binding $SERVICE_ACCOUNT \
        --member serviceAccount:$SERVICE_ACCOUNT \
        --role roles/iam.serviceAccountActor
    

    gcloud projects remove-iam-policy-binding $PROJECT_ID \
        --member serviceAccount:$SERVICE_ACCOUNT \
        --role roles/compute.instanceAdmin.v1
    

    gcloud iam service-accounts delete $SERVICE_ACCOUNT --quiet
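To confirm that no tutorial instances are left running, you can list anything that still matches the tutorial naming prefix; the command should return no results:

gcloud compute instances list --filter="name~^tutorial-"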

Summary

Elastifile provides a cross-cloud data fabric that was designed to enable holistic active and inactive data storage and management dynamically across sites and clouds. For active data workloads, it supports both transactional and batch workloads with consistently high performance and powerful data services for on-premises environments or clouds, or spanning both.

By leveraging an architecture optimized for cross-cloud deployment, enterprises deploying ECFS can enable dynamic workflows for applications, with efficient and cost-effective access to data regardless of application, environment, or location. The combination of ECFS in-cloud deployment and Elastifile's CloudConnect enables lifting enterprise applications from any on-premises storage solution and shifting to flexible use on GCP. The new paradigm enabled by Elastifile allows you to gain the benefits of cloud flexibility and cost structures for both your existing and your new applications, without requiring investment in application refactoring for cloud migration.
