Troubleshooting

This page shows you troubleshooting steps that you might find helpful if you run into problems using Filestore.

Slow performance

  1. Ensure that you are using the recommended machine type for the client VM.
  2. If your client VM is running Linux, confirm that you're using the default mount options.

  3. Ensure that the client VM is located in the same region as the Filestore instance. Mounting across regions not only reduces performance, it also incurs a networking cost.

  4. Ensure that your Filestore instance isn't at or near full capacity. When capacity is nearly full, any remaining space is highly fragmented, causing read and write operations to slow down. The amount of free space needed to avoid this scenario is case dependent. We recommend setting up low disk space alerts.

  5. Test the performance of your Filestore instance using the fio tool.

    If the test results show abnormally slow performance, contact your account representative. If the test results show performance similar to or greater than expected, continue to the next section.
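
    For example, a basic write-throughput test with fio might look like the following. This is a minimal sketch; the mount point /mnt/nfs, file size, and runtime are assumptions to adjust for your setup:

    sudo fio --name=write_test --directory=/mnt/nfs --numjobs=4 \
        --size=1G --time_based --runtime=60 --rw=write --bs=1M \
        --ioengine=libaio --direct=1 --group_reporting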

Use cases that cause slow performance

Here are some use cases and scenarios that cause poor performance:

Workloads involving high volumes of small files

Filestore file shares use the sync export option for data safety and NFS protocol compliance. This means that for most data modifying operations, the Filestore instance waits for the data to be committed to storage before replying to requests from the client VM. When many files are involved in an operation, the client makes a long series of synchronous operations and the cumulative latency adds up.

An example of this is extracting an archive, such as a TAR file, onto the file share. TAR makes a large number of synchronous operations in series when extracting an archive containing many files. As a result, performance is significantly reduced.

If you're trying to copy a large number of small files to a file share, try parallelizing file creation with a tool like gsutil:

mkdir -p /mnt/nfs/many_files_rsync/
time gsutil -m -q rsync -rp many_files /mnt/nfs/many_files_rsync/

Copying data between Cloud Storage and Filestore

Copying data from Cloud Storage to a Filestore instance using gsutil is currently known to be slow. There is no known mitigation.

Latency when mounting and unmounting a file share

There's a three-second latency when mounting a file share using the default mount options, caused by the mount command's attempt to discover which transport protocol the Filestore instance supports.

The mount command first tries UDP, which Filestore doesn't support. Once the initial attempt times out, it falls back to TCP. To bypass this discovery process and eliminate the added latency, you can specify the tcp mount option, for example:

sudo mount -o tcp 10.0.0.2:/vol1 /mnt/nfs

This is especially important if you're automounting with autofs.
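
For example, a minimal autofs configuration that passes the tcp option might look like the following. This is a sketch only; the map file name /etc/auto.filestore is a placeholder, and the 10.0.0.2:/vol1 share matches the example above:

# /etc/auto.master
/mnt/nfs  /etc/auto.filestore

# /etc/auto.filestore
vol1  -rw,hard,tcp  10.0.0.2:/vol1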

Filestore is unresponsive

Filestore instance not responding to ping or traceroute requests

Filestore instances do not respond to ping or traceroute requests because Filestore doesn't allow ICMP.

To test for connectivity to a Filestore instance, you can run showmount from the client:

sudo showmount -e <filestore-ip>

The Filestore instance responds with its exported filesystem, for example:

Export list for 10.139.19.98:
/vol1 192.168.0.0/16,172.16.0.0/12,10.0.0.0/8

You can also check whether the client can reach Filestore's RPC information by running:

sudo rpcinfo -p <filestore-ip>

The response looks like:

program vers proto   port  service
 100000    4   tcp    111  portmapper
 100000    3   tcp    111  portmapper
 100000    2   tcp    111  portmapper
 100000    4   udp    111  portmapper
 100000    3   udp    111  portmapper
 100000    2   udp    111  portmapper
 100024    1   udp   2046  status
 100024    1   tcp   2046  status
 100003    3   tcp   2049  nfs
 100227    3   tcp   2049
 100021    1   udp   4045  nlockmgr
 100021    3   udp   4045  nlockmgr
 100021    4   udp   4045  nlockmgr
 100021    1   tcp   4045  nlockmgr
 100021    3   tcp   4045  nlockmgr
 100021    4   tcp   4045  nlockmgr
 100005    3   udp   2050  mountd
 100005    3   tcp   2050  mountd

Scheduled maintenance

If Filestore is unresponsive for a few minutes and then becomes responsive again, the unresponsive period is likely caused by a scheduled maintenance event. For Filestore's SLA, see the SLA page.

Filestore does not support customer-defined maintenance windows, and the maintenance schedule is not available to customers.

Instance was deleted while still mounted to the client

If a file operation or Unix command such as df or ls, or any read or write operation, stops responding, the Filestore instance was likely deleted while still mounted to the client.

Check to see if the instance still exists:

    gcloud filestore instances list

If the instance is no longer listed, you can recover control by creating a new instance with the same IP address and file share name as the instance that was deleted. Once the instance is created, the unresponsive operation exits with an error. You can proceed to unmount the file share and delete the Filestore instance if it's not needed.
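
For example, using the instance configuration shown in the example output later on this page (share vol1, default network, reserved range 10.0.0.0/29), a replacement instance could be created as follows. The instance name, zone, tier, and capacity are placeholders; reserving the same IP range makes it likely that the instance receives the same address, but verify the assigned IP after creation:

gcloud filestore instances create nfs-server \
    --zone=us-central1-c \
    --tier=BASIC_HDD \
    --file-share=name=vol1,capacity=1TB \
    --network=name=default,reserved-ip-range=10.0.0.0/29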

To prevent something like this from happening in the future, make sure you unmount the Filestore instance first before deleting it.
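
For example, assuming the share is mounted at /mnt/nfs:

sudo umount /mnt/nfs

If the mount point is already unresponsive, a lazy unmount detaches it so the client no longer blocks on it:

sudo umount -l /mnt/nfs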

Instance shows status REPAIRING

The Filestore instance is in an unhealthy state from internal causes beyond the user's control and is automatically repairing itself. The instance is unavailable during this time and you don't need to take any further actions.

Capacity issues

"No space left on device"

  1. Check if the Filestore instance has sufficient inodes by running the following command on the client VM:

    df -i
    

    The command returns something similar to the following:

    Filesystem           Inodes        IUsed      IFree         IUse%  Mounted on
    10.0.0.2:/vol1    134217728        13         134217715     1%     /mnt/test
    

    Each file stored on the file share consumes one inode. If IUse% is at 100%, that means there are no free inodes and you are not able to store more files on the file share even if you haven't reached the maximum allocated capacity. The number of inodes scales with capacity. If you want to add more inodes, you must add more capacity.

  2. If you still have inodes remaining, then you may have reached the maximum number of entries (either files or subdirectories) for a single directory. This maximum depends on the name lengths of the entries, so it is a probabilistic limit rather than a fixed one. To fix this, create a deeper file hierarchy by distributing your entries across subdirectories, as sketched below.
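
    The following is a minimal sketch that fans files out into subdirectories keyed by the first two characters of each file name. The directory names are placeholders, and the prefix length should be adjusted to your file-name distribution:

    cd /mnt/nfs/flat_dir
    for f in *; do
      prefix="${f:0:2}"
      mkdir -p "../fanned_out/$prefix"
      mv -- "$f" "../fanned_out/$prefix/"
    done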

'df' and 'du' commands report different amounts of free disk space

When a file that is open by a running process is deleted, the disk space that the file consumes is not freed until the file is closed. The df command accounts for the space consumed by deleted files that are still open, whereas the du command does not. This is why du often reports less disk usage than df.

To display the deleted files that are still open by a running process, run:

lsof | grep deleted

Unable to create an instance

PERMISSION DENIED when creating a Filestore instance

  1. Make sure that the Filestore API is enabled:

    gcloud services enable file.googleapis.com
    
  2. Check if you have the roles/file.editor role. For details, see IAM roles and permissions.

  3. If you are still encountering the error, then the Filestore service account may have had its roles/file.serviceAgent role removed. To check this, run:

    gcloud projects get-iam-policy project-id-or-number  \
        --flatten="bindings[].members" \
        --format='table(bindings.role)' \
        --filter="bindings.members:service-project-number@cloud-filer.iam.gserviceaccount.com"
    

    where:

    • project-id-or-number is the ID or number of your Google Cloud project.
    • project-number is the number of your Google Cloud project.

    The command should return something similar to the following:

    ROLE
    roles/file.serviceAgent
    

    If roles/file.serviceAgent is not listed, you can restore it by running:

    gcloud projects add-iam-policy-binding project-id-or-number  \
        --member serviceAccount:service-project-number@cloud-filer.iam.gserviceaccount.com  \
        --role roles/file.serviceAgent
    

Receiving an error code 13 when creating an instance

There are a few possible causes for error code 13 during instance creation, but the most common is that Filestore has reached an internal network quota.

For every VPC network that you create a Filestore instance on, Filestore must create an internal network that peers with that network. These internal networks are preserved even when the Filestore instances and VPC networks associated with them are deleted.

Once the number of internal networks reaches 49 for a project, Filestore is no longer able to create new internal networks, which prevents you from creating Filestore instances on new VPC networks. Attempting to do so results in an error:

Error code 13, message: an internal error has occurred

The only way you can clear the internal networks is to disable and then re-enable the Filestore API. Before you can disable the API, you must delete all Filestore-related resources, such as Filestore instances and backups.

gcloud services disable file.googleapis.com

gcloud services enable file.googleapis.com

If this is not possible because you have Filestore instances that you need and cannot delete, then you must contact your account representative to have the peered networks cleared manually.

If you need to regularly delete and create VPC networks and Filestore instances, there are two ways to avoid running out of network quota:

  • When you create a VPC network, use the same name as a previous network that's been used for Filestore instance creation.

  • Cycle through a pool of no more than 49 VPC networks instead of deleting and then recreating them.

Unable to mount file share

My VM or GKE pod can't access Filestore

Confirm whether the Filestore instance is reachable (ping and traceroute are not supported) by running:

sudo showmount -e <filestore-ip>

The command should respond with a list of exported filesystems. Then check whether the client can reach Filestore's RPC information by running:

sudo rpcinfo -p <filestore-ip>

If the Filestore instance is not reachable, common causes include misconfigured network or ACL settings, or attempting to mount the wrong instance.

  1. Check whether IP-based access control is enabled and whether the IP address of the client is restricted. For details, see the IP-based access control documentation.
  2. Check your firewall settings to make sure that the required ports are open. For details, see Configuring firewall rules; a sample rule is sketched after this list.
  3. If you're trying to access Filestore from a GKE cluster, and are getting the error mount.nfs: access denied by server while mounting ..., see Unable to access file share from GKE clusters.
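
The following is a minimal sketch of an egress allow rule covering the NFS-related TCP ports shown in the rpcinfo output earlier on this page (111, 2046, 2049, 2050, and 4045). The rule name is a placeholder, and 10.0.0.0/29 is assumed to be the instance's reserved IP range; such a rule is only needed if your network has restrictive egress rules:

gcloud compute firewall-rules create allow-filestore-egress \
    --network=default \
    --direction=EGRESS \
    --action=ALLOW \
    --rules=tcp:111,tcp:2046,tcp:2049,tcp:2050,tcp:4045 \
    --destination-ranges=10.0.0.0/29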

Permission denied when trying to mount a file share

Confirm whether there are any NFS Export Options listed for the instance:

gcloud filestore instances describe instance-id \
    --zone=zone

where:

  • instance-id is the instance ID of the Filestore instance.
  • zone is the zone where the Filestore instance resides.

The command returns something similar to:

createTime: '2019-10-11T17:28:23.340943077Z'
fileShares:
- capacityGb: '1024'
  name: vol1
  nfsExportOptions:
  - accessMode: READ_WRITE
    ipRanges:
    - 128.0.0.0/29
    squashMode: NO_ROOT_SQUASH
name: projects/yourproject/locations/us-central1-c/instances/nfs-server
networks:
- ipAddresses:
  - 10.0.0.2
  modes:
  - MODE_IPV4
  network: default
  reservedIpRange: 10.0.0.0/29
state: READY
tier: BASIC_HDD

If you find nfsExportOptions listed, check if the IP address of your client is within one of the ranges listed under ipRanges for the expected accessMode. If it isn't, you must edit the NFS Export Options. For instructions on how to do this, see Editing instances.

Unable to mount a file share to App Engine

Filestore does not support App Engine.

Unable to mount a file share from a GKE cluster

You cannot directly mount Filestore file shares to GKE clusters. Instead, you must configure a PersistentVolume (PV) and a PersistentVolumeClaim (PVC).

Unable to access file share from GKE clusters

For more troubleshooting information relating to Kubernetes or Google Kubernetes Engine, you can also refer to the official Kubernetes troubleshooting guide and the GKE troubleshooting guide.

Error: Output: mount.nfs: access denied by server while mounting x.x.x.x:/file-share-name

Make sure that the values of the PV's spec.nfs.path and spec.nfs.server match the name of the file share and the IP address of the Filestore instance, respectively.

Example:

If your file share is named vol1 and the IP address of the Filestore instance is 10.0.0.2, the PV's spec.nfs.path and spec.nfs.server should match those values:

apiVersion: v1
kind: PersistentVolume
metadata:
 name: fileserver
spec:
 capacity:
   storage: 2T
 accessModes:
 - ReadWriteMany
 nfs:
   path: /vol1
   server: 10.0.0.2
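
A matching PersistentVolumeClaim can then bind to this PV by name. The following is a minimal sketch; the claim name fileserver-claim is a placeholder, and the empty storageClassName assumes a statically provisioned volume:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
 name: fileserver-claim
spec:
 accessModes:
 - ReadWriteMany
 storageClassName: ""
 volumeName: fileserver
 resources:
   requests:
     storage: 2T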

Filestore API cannot be disabled

Make sure that all of your Filestore-related resources, such as Filestore instances and backups, are deleted. You cannot disable the Filestore API while Filestore instances are deployed.