By Allon Cohen, PhD, VP Products, and Adi Sprachman, Director Product Management, Elastifile
This article describes how to deploy and use the Elastifile cross-cloud data fabric with Google Cloud Platform (GCP).
Elastifile's cross-cloud data fabric is a new platform that enables hybrid-cloud data access and in-cloud data processing with new dynamic workflows that span sites, clouds, and active and inactive data sources. Elastifile provides native file system compatibility, which allows you to deploy existing and new applications in the cloud without refactoring. Elastifile creates an elastic shared storage layer for in-cloud applications by delivering strong consistency, POSIX compliance, native protocols (NFS), and linear, predictable performance at scale.
This tutorial teaches you how to do the following:
- Understand Elastifile's features and architecture.
- Launch the Elastifile Management Server (EMS).
- Configure an Elastifile Cloud File System (ECFS) with multiple storage nodes.
- Mount the ECFS from a test instance and assess performance.
Elastifile Cloud File System
Elastifile Cloud File System (ECFS) is a unique software-defined infrastructure solution that seamlessly runs as a distributed file system within the cloud, letting you run enterprise applications on GCP without modification. In GCP, you deploy ECFS using standardized node images that run natively on Compute Engine.
As shown in Figure 1, each ECFS node includes flash storage resources, and the collection of Elastifile storage nodes dynamically aggregates all assigned resources for capacity pooling and linear performance scaling. Elastifile presents this storage pool through NFS as a distributed file system encompassing all of the provisioned resources.
Figure 1: ECFS in-cloud deployment
For details on the benefits and capabilities of the Elastifile Cloud File System for lifting and shifting applications to the cloud, see the Elastifile CloudConnect information page.
Cloud scale elasticity
When growing the cluster within GCP, Elastifile linearly expands distributed file system performance. The following example, using a backend of 30 Elastifile in-cloud nodes, demonstrates the linearly scalable performance, supporting up to 10,000 simultaneous NFS client connections, generated by 1,000 containerized applications. For all these workloads, at all scalability points, latency remains consistently below 2 milliseconds. The environment is presented in Figure 2 below.
Figure 2: ECFS linear file system scaling in GCP
Elastifile Cloud File System (ECFS) is built from one Elastifile Management Server (EMS) and at least three dedicated storage nodes. You can scale the system by adding storage nodes, which increases both capacity and performance.
In GCP, ECFS deployment supports the following types of storage nodes:
- Standard persistent storage node: 4 x 1-TB standard persistent disks per node
- Small SSD persistent storage node: 4 x 0.175-TB SSD persistent disks per node
- Medium SSD persistent storage node: 4 x 1-TB SSD persistent disks per node
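To get a feel for what these node types mean for a minimum three-node cluster, the raw capacity arithmetic can be sketched in shell. The function name raw_capacity_gb is hypothetical, and the sketch assumes 1 TB = 1,000 GB and ignores any replication or file system overhead, which this article does not quantify.

```shell
#!/usr/bin/env bash
# Hypothetical helper: raw capacity in GB for a cluster of identical nodes.
# Arguments: node count, disks per node, disk size in GB.
raw_capacity_gb() {
  echo $(( $1 * $2 * $3 ))
}

# Minimum cluster of 3 nodes, using the disk layouts listed above.
echo "standard:   $(raw_capacity_gb 3 4 1000) GB"
echo "small SSD:  $(raw_capacity_gb 3 4 175) GB"
echo "medium SSD: $(raw_capacity_gb 3 4 1000) GB"
```

These figures are raw disk totals only; usable capacity depends on how the file system is configured.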
Deploying ECFS in GCP
Deployment starts with installing the EMS and launching the graphical user interface. From the GUI, you can select the type of the required storage and the system's capacity. The EMS spawns and configures the storage nodes. After you click Deploy, the system is ready for use.
As shown in the following diagram, this tutorial explains how to deploy a minimal Elastifile configuration including 1 EMS instance, 3 persistent storage nodes, and 1 test instance:
This tutorial uses billable components of GCP, including Compute Engine instances and persistent disks.
The Pricing Calculator estimates that the GCP cost of this tutorial is about $10 per hour. Additional charges might apply based on your consumption of GCP resources.
Before you begin
Sign in to your Google Account.
If you don't already have one, sign up for a new account.
Select or create a GCP project.
Make sure that billing is enabled for your Google Cloud Platform project. Learn how to enable billing.
Enable the Compute Engine API.
This tutorial requires 64 CPU cores and 30 TB of Persistent Disk SSD in your target region.
If you don't have enough CPUs or Persistent Disk SSD, then you can increase your quotas as follows. Request these additional resources at least a few days in advance to help ensure that there is enough time to fulfill your request.
Go to the Quotas page.
In the Quotas page, select the quotas you want to change.
Click the Edit Quotas button at the top of the page.
Fill in your name, email, and phone number, and click Next.
Fill in your quota request, and click Next.
- CPUs: request at least 64 more than the current CPU consumption in the target region.
- Persistent Disk SSD (GB): request at least 30,000 more than the current persistent disk-SSD consumption in the target region.
Submit your request. The Compute Engine team will respond to your request within 48 hours.
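The size of each quota request follows from simple arithmetic: current usage plus what this tutorial needs, minus the current limit. The sketch below illustrates that calculation; the function name quota_shortfall and the example limit and usage figures are assumptions for illustration, not values from this article. Current limits and usage appear on the Quotas page, and running gcloud compute regions describe $REGION also lists them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: how much extra quota to request.
# Arguments: current limit, current usage, amount this tutorial needs.
quota_shortfall() {
  local shortfall=$(( $2 + $3 - $1 ))
  if [ "$shortfall" -gt 0 ]; then
    echo "$shortfall"
  else
    echo 0
  fi
}

# Assumed example: a limit of 72 CPUs with 24 already in use; the tutorial needs 64.
echo "CPUs to request:   $(quota_shortfall 72 24 64)"
# Assumed example: a 40,000 GB SSD limit with 15,000 GB in use; the tutorial needs 30,000 GB.
echo "SSD GB to request: $(quota_shortfall 40000 15000 30000)"
```

A result of 0 means the existing quota already leaves enough headroom.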
Requesting access to Elastifile resources
Launching Elastifile instances requires access to Elastifile-maintained Compute Engine images. You can launch Elastifile instances directly from Cloud Launcher through the GCP Console. Alternatively, you can gain access to these images by creating a service account for Elastifile instances and requesting access to Elastifile images for that service account.
Create a service account for Elastifile administration:
Start Cloud Shell:
Create an Elastifile service account:
gcloud iam service-accounts create elastifile --display-name "Elastifile service account"
Request image access from Elastifile by emailing firstname.lastname@example.org. Include the following in the body of your email:
- The email address of your Google Account
- The email address of the service account you created in the previous step, which has the form elastifile@[YOUR-PROJECT-ID].iam.gserviceaccount.com
After your quota increase request has been approved and Elastifile has responded to your request for image access, you can proceed with deploying the EMS.
Initializing environment variables
Initialize environment variables by running the following commands in Cloud Shell. You can change the values for REGION and ZONE to specify where to deploy resources.
REGION=us-east1
ZONE=us-east1-b
PROJECT_ID=$DEVSHELL_PROJECT_ID
SERVICE_ACCOUNT=elastifile@${PROJECT_ID}.iam.gserviceaccount.com
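Because every later command depends on these variables, a quick sanity check before continuing can save a failed deployment. The following is a minimal sketch: the function names are hypothetical, example-project is a placeholder project ID, and the checks only test the general shape of the values (a zone is its region plus a one-letter suffix; a service account address ends in .iam.gserviceaccount.com).

```shell
#!/usr/bin/env bash
# Hypothetical check: a zone name is its region name plus a one-letter suffix.
# Arguments: zone, region.
zone_in_region() {
  case "$1" in
    "$2"-[a-z]) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical check: simple shape test for a service account address.
is_service_account_email() {
  case "$1" in
    ?*@[a-z]*.iam.gserviceaccount.com) return 0 ;;
    *) return 1 ;;
  esac
}

# Example values; example-project is a placeholder, not a real project ID.
REGION=us-east1
ZONE=us-east1-b
SERVICE_ACCOUNT=elastifile@example-project.iam.gserviceaccount.com

zone_in_region "$ZONE" "$REGION" && echo "zone matches region"
is_service_account_email "$SERVICE_ACCOUNT" && echo "service account address looks well formed"
```

In Cloud Shell you would drop the example assignments and run the two checks against the variables you already set.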
Authorizing the service account
To grant permissions to the Elastifile service account, run the following commands in Cloud Shell.
gcloud iam service-accounts add-iam-policy-binding $SERVICE_ACCOUNT --member \
    serviceAccount:$SERVICE_ACCOUNT --role roles/iam.serviceAccountActor
gcloud projects add-iam-policy-binding $PROJECT_ID --member \
    serviceAccount:$SERVICE_ACCOUNT --role roles/compute.instanceAdmin.v1
Deploying the Elastifile Management Server
Start deployment by installing the EMS and launching the graphical user interface. From the GUI, you can select the type of the required storage and system capacity. The EMS spawns and configures the storage nodes. After you click Deploy, the system is ready for use.
To launch the EMS instance, run the following command. Be sure to use the image URL provided by Elastifile in response to your access request, which might be different from the --image value shown below.
gcloud compute instances create tutorial-ems --image \
    https://www.googleapis.com/compute/v1/projects/elastifile-ci/global/images/emanage-2-1-0-10-e0090fefba5b \
    --service-account $SERVICE_ACCOUNT --machine-type n1-standard-2 \
    --scopes=cloud-platform --tags http-server --zone $ZONE
Creating a firewall rule for HTTP traffic
To create a firewall rule allowing HTTP traffic to the EMS, run the following command.
gcloud compute firewall-rules create default-allow-http --direction=INGRESS \
    --network=default --action=ALLOW --rules=tcp:80 --source-ranges=0.0.0.0/0 \
    --target-tags=http-server
Configuring the Elastifile Management Server
Configure the EMS, using its web-based console.
In Cloud Shell, determine the external address of the EMS instance:
gcloud compute instances describe \
    tutorial-ems --zone $ZONE --format \
    "value(networkInterfaces.accessConfigs.natIP)"
Using the IP address returned by the previous command, navigate to the EMS console in your web browser.
Sign in using default credentials.
Review and accept the license agreement by clicking I Accept.
Complete the Change login password form and click Save.
On the System Configuration page, scroll down and click Next.
On the DNS page, enter the service name tutorial-nfs.local and click Next.
On the Notifications page, scroll down and click Configure System.
Deploying storage nodes
The system displays the System View page. The navigation icon to return to this page later is at the bottom of the left-hand menu.
On the System line, where it says Hosts: 0 Storage nodes, Devices: 0, and Raw Capacity: 0.0 B, click Add.
On the Add capacity screen, select Persistent (10 TB per node), then click Add and Close.
Visit the VM instances page in the GCP Console. The page lists the 3 storage nodes launching with names that begin with tutorial-ems-elfs-.
After a couple of minutes, tiles for the new storage nodes show "Ready to deploy" on the System View page in the EMS web console. Click the Deploy button at the top right. When the dialog appears, click Deploy again, then Close.
After about five minutes, the storage node tiles show Active, at which point you can proceed with the next section to provision a file system.
Provisioning a file system
Elastifile's namespaces are provisioned using a Data Container (DC). The DC can be shared between a few specific clients (such as an application's cluster) or open to multiple clients as a shared namespace. For each DC, you can define a soft and hard quota and enable or disable compression and deduplication.
In this section, you create a data container and configure the NFS export.
In the left-hand navigation menu, click Data Containers.
Mouse over the red plus-sign icon in the upper right, then click the dropdown Add Data Container cube icon.
Select Public, then click Select.
Fill in the following fields, then click Create.
- Data container name:
- Soft Quota (GB):
- Hard Quota (GB):
Reveal the Edit panel by clicking the slash (/) row under Exports (1).
Update the following fields, then click Save.
- User Mapping: Map Everyone to
- User ID:
Deploying a test instance
Next, launch a Compute Engine instance to mount the NFS export and perform tests.
In Cloud Shell, use gcloud to deploy the test instance:
gcloud compute instances create "tutorial-test" --zone $ZONE \
    --machine-type "n1-standard-2" --image-family "debian-9" --image-project \
    "debian-cloud" --boot-disk-size "200" --boot-disk-device-name \
    "tutorial-test" --metadata=ems=$(gcloud compute instances describe \
    tutorial-ems --zone $ZONE --format "value(networkInterfaces.networkIP)")
Configuring an NFS mount on a test instance
In Cloud Shell, use SSH to connect to your test instance:
gcloud compute ssh tutorial-test --zone $ZONE
In your test instance SSH session, install NFS libraries.
sudo apt-get -y install nfs-common
Use the metadata address to configure the DHCP client to use the EMS instance as a nameserver.
echo -e "\nsupersede domain-name-servers $(curl -sH "Metadata-Flavor: Google" \
    "http://metadata/computeMetadata/v1/instance/attributes/ems");" | \
    sudo tee -a /etc/dhcp/dhclient.conf
sudo ifdown -a; sudo ifup -a
Show the NFS mounts exported by the EMS instance.
sudo showmount -e tutorial-nfs.local
That command displays results like the following:
Export list for tutorial-nfs.local:
/tutorial-dc/root *
Mount the export.
sudo mkdir -p /mnt/elastifile
sudo chown $(whoami):$(whoami) /mnt/elastifile
sudo mount tutorial-nfs.local:tutorial-dc/root /mnt/elastifile
Verify the mount by listing the mounted file systems, for example with df -h. Note the last line of the output, which is similar to the following:
Filesystem                           Size  Used Avail Use% Mounted on
udev                                 3.7G     0  3.7G   0% /dev
tmpfs                                749M   10M  739M   2% /run
/dev/sda1                            197G 1005M  188G   1% /
tmpfs                                3.7G     0  3.7G   0% /dev/shm
tmpfs                                5.0M     0  5.0M   0% /run/lock
tmpfs                                3.7G     0  3.7G   0% /sys/fs/cgroup
tutorial-nfs.local:tutorial-dc/root 1000G     0 1000G   0% /mnt/elastifile
Next, use the fio utility to assess the read/write performance of the Elastifile Cloud File System accessed through the NFS export.
sudo apt-get install -y fio
Run a fio test to verify performance.
fio --filename=/mnt/elastifile/tutorial.fio --direct=1 --rw=randrw \
    --refill_buffers --norandommap --randrepeat=0 --ioengine=libaio --bs=8k \
    --size=2G --rwmixread=70 --iodepth=5 --numjobs=5 --runtime=120 \
    --group_reporting --name=tutorial
Wait for fio to finish, then inspect the output.
read : io=7167.4MB, bw=68003KB/s, iops=8500, runt=107923msec
. . .
write: io=3072.1MB, bw=29157KB/s, iops=3644, runt=107923msec
. . .
Run status group 0 (all jobs):
READ: io=7165.2MB, aggrb=75110KB/s, minb=75110KB/s, maxb=75110KB/s, mint=97685msec, maxt=97685msec
WRITE: io=3074.9MB, aggrb=32232KB/s, minb=32232KB/s, maxb=32232KB/s, mint=97685msec, maxt=97685msec
When you're done, close your SSH session with the test instance.
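When reading the fio results, it helps to know that the reported IOPS and bandwidth are two views of the same measurement: with the 8 KB block size set by --bs=8k, IOPS is roughly bandwidth in KB/s divided by 8. The sketch below checks this against the sample read and write lines shown earlier. The helper names iops_from_bw and fio_iops are hypothetical, and the sed expression assumes the fio output layout shown in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: approximate IOPS from bandwidth (KB/s) and block size (KB).
iops_from_bw() {
  echo $(( $1 / $2 ))
}

# Hypothetical helper: print the iops= figure from the "read" or "write"
# line of fio output supplied on standard input.
fio_iops() {
  sed -n "s/^ *$1 *:.*iops=\([0-9]*\).*/\1/p"
}

# The per-direction lines from the sample output in this article.
sample='read : io=7167.4MB, bw=68003KB/s, iops=8500, runt=107923msec
write: io=3072.1MB, bw=29157KB/s, iops=3644, runt=107923msec'

echo "read IOPS reported:    $(printf '%s\n' "$sample" | fio_iops read)"
echo "read IOPS from bw/8:   $(iops_from_bw 68003 8)"
echo "write IOPS reported:   $(printf '%s\n' "$sample" | fio_iops write)"
echo "write IOPS from bw/8:  $(iops_from_bw 29157 8)"
```

If the two figures for a direction differ noticeably in your own run, check that the block size you divide by matches the --bs value you passed to fio.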
Clean up the compute and IAM resources that this tutorial created.
If your original Cloud Shell environment has timed out, you might need to reinitialize your environment variables by repeating the commands from "Initializing environment variables" above.
Delete all instances matching the naming convention used in this tutorial.
gcloud compute instances delete --zone $ZONE --quiet $(gcloud compute \
    instances list --filter="zone:$ZONE" --format "value(name)" | grep \
    '^tutorial-\(ems\(-elfs-[a-z0-9]*\)\?\|test\)$')
Delete the service account and bindings.
gcloud iam service-accounts remove-iam-policy-binding $SERVICE_ACCOUNT \
    --member serviceAccount:$SERVICE_ACCOUNT \
    --role roles/iam.serviceAccountActor
gcloud projects remove-iam-policy-binding $PROJECT_ID \
    --member serviceAccount:$SERVICE_ACCOUNT \
    --role roles/compute.instanceAdmin.v1
gcloud iam service-accounts delete $SERVICE_ACCOUNT --quiet
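Because the instance deletion command above selects instances purely by name and runs with --quiet, it can be worth previewing what its grep pattern matches before deleting anything. The sketch below applies the same pattern to a handful of made-up instance names. All of the names are hypothetical, and the pattern relies on GNU grep extensions to basic regular expressions, which is assumed to be the grep available in Cloud Shell.

```shell
#!/usr/bin/env bash
# The instance-name filter used by the delete command in this tutorial.
pattern='^tutorial-\(ems\(-elfs-[a-z0-9]*\)\?\|test\)$'

# Hypothetical instance names; only the first three follow the tutorial's naming.
names='tutorial-ems
tutorial-ems-elfs-ab12cd34
tutorial-test
tutorial-testing
my-other-vm'

# Keep only the names the pattern would select for deletion.
matched=$(printf '%s\n' "$names" | grep "$pattern")
printf '%s\n' "$matched"
```

In a real project, running only the inner gcloud compute instances list pipeline from the delete command gives the same kind of preview against your actual instances.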
Elastifile provides a cross-cloud data fabric designed to store and manage both active and inactive data dynamically across sites and clouds. For active data, it supports both transactional and batch workloads with consistently high performance and powerful data services, whether on-premises, in the cloud, or spanning both.
By leveraging an architecture optimized for cross-cloud deployment, enterprises deploying ECFS can enable dynamic workflows for applications, with efficient and cost-effective access to data regardless of application, environment, or location. The combination of ECFS in-cloud deployment and Elastifile's CloudConnect enables lifting enterprise applications from any on-premises storage solution and shifting to flexible use on GCP. The new paradigm enabled by Elastifile allows you to gain the benefits of cloud flexibility and cost structures for both your existing and your new applications, without requiring investment in application refactoring for cloud migration.