Troubleshooting

This page shows you troubleshooting steps that you might find helpful if you run into problems using Filestore.

Slow performance

  1. Ensure that you are using the recommended machine type for the client VM.
  2. If your client VM is running Linux, confirm that you're using the default mount options.

  3. Ensure that the client VM is located in the same region as the Filestore instance. Mounting across regions not only reduces performance, it also incurs a networking cost.

  4. Ensure that your Filestore instance isn't at or near full capacity. When capacity is nearly full, any remaining space is highly fragmented, causing read and write operations to slow down. The amount of free space needed to avoid this scenario is case-dependent. We recommend setting up low disk space alerts.

  5. Test the performance of your Filestore instance using the fio tool (see the example command below).

    If the test results show abnormally slow performance, contact your account representative. If the results show performance similar to or better than expected, continue to the next section.
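
    For example, a minimal fio read-throughput test might look like the following. This is a sketch: the mount point /mnt/nfs and the job parameters are assumptions to adjust for your workload.

    # Sequential-read job; adjust the mount point and parameters as needed.
    fio --name=read_test --directory=/mnt/nfs \
        --ioengine=libaio --direct=1 --rw=read \
        --bs=1M --size=1G --numjobs=4 --group_reporting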

Use cases that cause slow performance

Here are some use cases and scenarios that cause poor performance:

Workloads involving high volumes of small files

Filestore file shares use the sync export option for data safety and NFS protocol compliance. For most data-modifying operations, the Filestore instance waits for the data to be committed to storage before replying to requests from the client VM. When many files are involved in an operation, the client makes a long series of synchronous operations and the cumulative latency adds up.

An example of this scenario is extracting an archive, such as a tar file, on the file share. The tar utility performs a long series of synchronous operations when extracting an archive that contains many files, so performance is reduced.

If you're trying to copy many small files to a file share, try parallelizing file creation with a tool like gsutil:

mkdir -p /mnt/nfs/many_files_rsync/
time gsutil -m -q rsync -rp many_files /mnt/nfs/many_files_rsync/

Copying data between Cloud Storage and Filestore

Copying data from Cloud Storage to a Filestore instance using gsutil is known to be slow. There is no known mitigation.

Latency when mounting and unmounting a file share

When mounting a file share using the default mount options, the mount command attempts to discover the supported transport method of the Filestore instance, which introduces a three-second latency.

The mountd daemon first tries to use UDP, which Filestore doesn't support. After the initial attempt times out, it falls back to TCP. To bypass this discovery process and eliminate the added latency, specify the tcp mount option, for example:

sudo mount -o tcp 10.0.0.2:/vol1 /mnt/nfs

This mount option is especially important if you're automounting with autofs.
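
For example, a minimal autofs setup that passes the tcp option might look like the following. This is a sketch; the mount point /mnt/auto, the map file /etc/auto.filestore, and the map key nfs are hypothetical names, and the instance IP is the example address used elsewhere on this page.

# Map /mnt/auto to a hypothetical map file in the autofs master map.
echo '/mnt/auto /etc/auto.filestore' | sudo tee -a /etc/auto.master

# Map entry that mounts the share at /mnt/auto/nfs with the tcp option.
echo 'nfs -rw,tcp 10.0.0.2:/vol1' | sudo tee /etc/auto.filestore

# Restart autofs to pick up the new maps (systemd-based systems).
sudo systemctl restart autofs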

Filestore is unresponsive

Filestore instance not responding to ping or traceroute requests

Filestore instances do not respond to ping or traceroute requests because Filestore doesn't allow ICMP.

To test for connectivity to a Filestore instance, you can run showmount from the client:

sudo showmount -e <filestore-ip>

The Filestore instance responds with its exported file system, for example:

Export list for 10.139.19.98:
/vol1 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8

You can also check whether the client can reach Filestore's RPC information by running:

sudo rpcinfo -p <filestore-ip>

The response looks like:

program vers proto   port  service
 100000    4   tcp    111  portmapper
 100000    3   tcp    111  portmapper
 100000    2   tcp    111  portmapper
 100000    4   udp    111  portmapper
 100000    3   udp    111  portmapper
 100000    2   udp    111  portmapper
 100024    1   udp   2046  status
 100024    1   tcp   2046  status
 100003    3   tcp   2049  nfs
 100227    3   tcp   2049
 100021    1   udp   4045  nlockmgr
 100021    3   udp   4045  nlockmgr
 100021    4   udp   4045  nlockmgr
 100021    1   tcp   4045  nlockmgr
 100021    3   tcp   4045  nlockmgr
 100021    4   tcp   4045  nlockmgr
 100005    3   udp   2050  mountd
 100005    3   tcp   2050  mountd

Scheduled maintenance

Occasionally, Filestore becomes unresponsive for a few minutes during a scheduled maintenance event and then becomes responsive again. For Filestore's SLA, see the SLA page.

Filestore does not support customer-defined maintenance windows, and the schedule of maintenance windows is not available to customers.

Instance was deleted while still mounted to the client

If a Unix command such as df or ls, or any read or write operation, stops responding, the Filestore instance was likely deleted while still mounted to the client.

Check to see if the instance still exists:

    gcloud filestore instances list

If the instance is no longer listed, you can recover control by creating a new instance with the same IP address and file share name as the deleted instance. After the new instance is created, the unresponsive operation exits with an error. If you don't need the Filestore instance, you can then unmount the file share and delete the new instance.
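
For example, if the deleted instance exported vol1 at 10.0.0.2 (the values used elsewhere on this page), the re-creation command might look like the following sketch. The instance name, zone, tier, and capacity are placeholders; Filestore assigns the instance an IP address from the reserved range you specify.

gcloud filestore instances create recovered-instance \
    --zone=us-central1-c \
    --tier=BASIC_HDD \
    --file-share=name=vol1,capacity=1TB \
    --network=name=default,reserved-ip-range=10.0.0.0/29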

To prevent this situation in the future, always unmount the file share from all clients before deleting a Filestore instance.
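
For example, assuming the share is mounted at /mnt/nfs and the instance is named instance-id (both placeholders):

# Unmount on every client first, then delete the instance.
sudo umount /mnt/nfs
gcloud filestore instances delete instance-id --zone=zone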

Instance shows status REPAIRING

The Filestore instance is in an unhealthy state caused by internal factors beyond your control and is automatically repairing itself. The instance is unavailable during this time, and no action is required on your part.

Capacity issues

"No space left on device"

Check if the Filestore instance has sufficient inodes by running the following command on the client VM:

df -i

The command returns something similar to the following:

Filesystem          Inodes  IUsed      IFree  IUse%  Mounted on
10.0.0.2:/vol1   134217728     13  134217715     1%  /mnt/test

Each file stored on the file share consumes one inode. If IUse% reaches 100%, you are not able to store more files on the file share even if you haven't reached the maximum allocated capacity. The number of inodes scales with capacity. If you want to add more inodes, you must add more capacity.
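
For example, the following sketch increases the capacity of a share named vol1, which also increases the inode count. The instance ID, zone, and target capacity are placeholders to replace with your own values:

gcloud filestore instances update instance-id \
    --zone=zone \
    --file-share=name=vol1,capacity=2TB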

'df' and 'du' commands report different amounts of free disk space

When a file that is open in a running process is deleted, the disk space that the file consumes is not freed until the file is closed. The df command accounts for the space consumed by deleted-but-open files, whereas the du command does not. This difference in accounting is why du often reports less used space than df.

To display the deleted files that are still open by a running process, run:

lsof | grep deleted
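
If you can't stop the owning process, one common way to release the space is to truncate the deleted file through the /proc filesystem. In this sketch, the PID (1234) and file descriptor number (5) are hypothetical values taken from the lsof output:

# Truncate the deleted-but-open file; PID and FD are hypothetical.
sudo sh -c ': > /proc/1234/fd/5'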

Unable to create an instance

PERMISSION DENIED when creating a Filestore instance

  1. Check if the Filestore API is enabled:

    gcloud services enable file.googleapis.com
    
  2. Check whether you have the roles/file.editor role. For details, see IAM roles and permissions.

  3. If you are still encountering the error, then the Filestore service account might have had its file.serviceAgent role removed. To check if this is the case, run:

    gcloud projects get-iam-policy project-id-or-number  \
        --flatten="bindings[].members" \
        --format='table(bindings.role)' \
        --filter="bindings.members:service-project-number@cloud-filer.iam.gserviceaccount.com"
    

    where:

    • project-id-or-number is the ID or number of your Google Cloud project.
    • project-number is the number of your Google Cloud project.

    The command should return something similar to the following:

    ROLE
    roles/file.serviceAgent
    

    If roles/file.serviceAgent is not listed, you can restore it by running:

    gcloud projects add-iam-policy-binding project-id-or-number  \
        --member serviceAccount:service-project-number@cloud-filer.iam.gserviceaccount.com  \
        --role roles/file.serviceAgent
    

System limit for internal resources has been reached error when creating an instance

This error is caused by Filestore reaching an internal network quota. For every VPC network on which you create a Filestore instance, Filestore must create an internal network that peers with that network. These internal networks are preserved even after the associated Filestore instances and VPC networks are deleted.

Once the number of internal networks in a project reaches 49, Filestore can no longer create new internal networks, which prevents you from creating Filestore instances on new VPC networks. Attempting to do so results in the following error:

System limit for internal resources has been reached. Please request to adjust limit here: https://forms.gle/PFPJ2QD4KnCHzYEx9

You can clear the internal networks by disabling and then re-enabling the Filestore API:

gcloud services disable file.googleapis.com

gcloud services enable file.googleapis.com

If you can't disable the API because you have Filestore instances that you can't delete, or because you don't want to lose quota granted through quota increase requests, you can fill out the following form to have your network limit adjusted:

https://forms.gle/PFPJ2QD4KnCHzYEx9

If you need to regularly delete and create VPC networks and Filestore instances, there are two ways to avoid running out of network quota:

  • When you create a VPC network, use the same name as a previous network that's been used for Filestore instance creation.

  • Cycle through a pool of no more than 49 VPC networks instead of deleting and then recreating them.

Unable to mount file share

My VM or GKE pod can't access Filestore

Confirm whether the Filestore instance is reachable (ping and traceroute are not supported) by running:

sudo showmount -e <filestore-ip>

The command should respond with a list of exported file systems. Then check whether the client can reach Filestore's RPC information by running:

sudo rpcinfo -p <filestore-ip>

If the Filestore instance is not reachable, common causes include misconfigured network or ACL settings, or an attempt to mount the wrong instance.

  1. Check whether IP-based access control is enabled and whether the IP address of the client is restricted. For details, see IP-based access control.
  2. Check your firewall settings to make sure that the required ports are open (see the example rule after this list). For details, see Configuring firewall rules.
  3. If you're trying to access Filestore from a GKE cluster, and are getting the error mount.nfs: access denied by server while mounting ..., see Unable to access file share from GKE clusters.
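
For example, an egress rule that opens the ports shown in the rpcinfo listing earlier on this page might look like the following. This is a sketch; the rule name, network, and destination range (your instance's reserved IP range) are placeholders:

gcloud compute firewall-rules create allow-filestore-nfs \
    --network=default \
    --direction=EGRESS \
    --action=ALLOW \
    --rules=tcp:111,tcp:2046,tcp:2049,tcp:2050,tcp:4045 \
    --destination-ranges=10.0.0.0/29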

Permission denied when trying to mount a file share

Confirm whether there are any NFS Export Options listed for the instance:

gcloud filestore instances describe instance-id \
    --zone=zone

where:

  • instance-id is the instance ID of the Filestore instance.
  • zone is the zone where the Filestore instance resides.

The command returns something similar to:

createTime: '2019-10-11T17:28:23.340943077Z'
fileShares:
- capacityGb: '1024'
  name: vol1
  nfsExportOptions:
  - accessMode: READ_WRITE
    ipRanges:
    - 128.0.0.0/29
    squashMode: NO_ROOT_SQUASH
name: projects/yourproject/locations/us-central1-c/instances/nfs-server
networks:
- ipAddresses:
  - 10.0.0.2
  modes:
  - MODE_IPV4
  network: default
  reservedIpRange: 10.0.0.0/29
state: READY
tier: BASIC_HDD

If you find nfsExportOptions listed, check if the IP address of your client is within one of the ranges listed under ipRanges for the expected accessMode. If it isn't, you must edit the NFS Export Options.

Unable to mount a file share to App Engine

Filestore does not support App Engine.

Unable to mount a file share from a GKE cluster

You cannot directly mount Filestore file shares to GKE clusters. Instead, you must configure a PersistentVolume (PV) and a PersistentVolumeClaim (PVC).

Unable to access file share from GKE clusters

For more troubleshooting information relating to Kubernetes or Google Kubernetes Engine, you can also refer to the official Kubernetes troubleshooting guide and the GKE troubleshooting guide.

Error: Output: mount.nfs: access denied by server while mounting x.x.x.x:/file-share-name

Make sure that the values of the PV spec.nfs.path and spec.nfs.server match with the name of the file share and the IP address of the Filestore instance, respectively.

Example:

If your file share is named vol1 and the IP address of the Filestore instance is 10.0.0.2, the PV spec.nfs.path and spec.nfs.server must match those values:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: fileserver
spec:
  capacity:
    storage: 2T
  accessModes:
  - ReadWriteMany
  nfs:
    path: /vol1
    server: 10.0.0.2
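
A matching PersistentVolumeClaim then binds to this PV by name. The following is a sketch; the claim name fileserver-claim is a placeholder, and storageClassName is left empty so that the claim binds to the statically provisioned PV rather than triggering dynamic provisioning:

kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fileserver-claim
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: ""
  volumeName: fileserver
  resources:
    requests:
      storage: 2T
EOF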

Filestore API cannot be disabled

Make sure that all of your Filestore-related resources, such as Filestore instances and backups, are deleted. You cannot disable the Filestore API while Filestore instances are deployed.
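
To find remaining resources, list your instances and backups, for example (the backups listing may require a --region flag depending on your gcloud version):

gcloud filestore instances list
gcloud filestore backups list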

Error: Failed to create subnetwork. Couldn't find free blocks in allocated IP ranges.

For a given private connection, if you exhaust your allocated IP address space, Google Cloud returns this error: Failed to create subnetwork. Couldn't find free blocks in allocated IP ranges.

For details on how to resolve this issue, see IP address range exhaustion.