Vector Search FAQs

How many IP addresses should I reserve?

If there are no restrictions on the IP range you can allocate, we recommend that you reserve a large IP range, such as a /16, to avoid exhausting IP addresses later.

If you don't want to over-allocate IP ranges, you can make a rough estimate based on your data size and traffic. Each shard can host about 20 GB of data in Avro format, and each replica of a shard can serve about 800 to 1,000 queries per second (QPS). For JSON or CSV format, you can estimate the size as (number of embeddings * number of dimensions * 4) bytes if the data has no restricts or labels; with them, an accurate figure is difficult to obtain. The total cost estimate depends on the total size after conversion, not on the raw input file size.
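The size formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample figures (10 million embeddings, 768 dimensions) are hypothetical, and the factor of 4 is assumed to be the byte width of one 32-bit float value.

```python
def estimate_data_size_bytes(num_embeddings: int,
                             num_dimensions: int,
                             bytes_per_value: int = 4) -> int:
    """Rough size of embedding data that has no restricts or labels.

    bytes_per_value=4 assumes each dimension is stored as a 32-bit float.
    """
    return num_embeddings * num_dimensions * bytes_per_value


# Hypothetical dataset: 10 million embeddings with 768 dimensions each.
size_bytes = estimate_data_size_bytes(10_000_000, 768)
print(f"{size_bytes / 1e9:.2f} GB")  # prints "30.72 GB"
```

The result is a lower bound for data that carries restricts or labels, so treat it as a starting point rather than an exact figure.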

The actual QPS each replica can serve depends on factors such as your embedding size, dimensions, and algorithm configuration. We strongly recommend that you run a load test to determine an accurate number.

The total number of deployed index nodes needed is (the number of shards * the number of replicas per shard). For example, if your data size is 30 GB and your target QPS is 1,200, you need at least 2 shards (30 GB is more than the roughly 20 GB one shard can host) and 2 replicas per shard (1,200 QPS is more than one replica can serve), which is a total of 4 deployed index nodes.
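The node calculation can be sketched as follows. The function name and its defaults are hypothetical: 20 GB per shard comes from the rough figure above, and 800 QPS per replica is the conservative end of the 800 to 1,000 range. Replace both with numbers from your own load test.

```python
import math


def estimate_deployed_index_nodes(data_size_gb: float,
                                  target_qps: float,
                                  gb_per_shard: float = 20.0,
                                  qps_per_replica: float = 800.0):
    """Return (shards, replicas_per_shard, total_nodes) as a rough estimate."""
    # Round up: a partially filled shard or replica still needs a full node.
    shards = max(1, math.ceil(data_size_gb / gb_per_shard))
    replicas_per_shard = max(1, math.ceil(target_qps / qps_per_replica))
    return shards, replicas_per_shard, shards * replicas_per_shard


# The example from the text: 30 GB of data and 1,200 QPS.
print(estimate_deployed_index_nodes(30, 1200))  # prints (2, 2, 4)
```

Using the lower QPS figure as the default errs toward reserving slightly more capacity than needed rather than too little.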

After estimating your total deployed index nodes, you can pick the IP range prefix based on the following table:

Total deployed index nodes    Recommended reserved IP prefix
1 - 10                        /21
11 - 25                       /20
26 - 50                       /19
51 - 120                      /18
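The table lookup can be expressed as a small helper. This is a sketch: the names are hypothetical, the thresholds are copied from the table, and node counts beyond the table return None because the table gives no recommendation for them. The address count is the standard IPv4 calculation of 2^(32 - prefix length).

```python
# (maximum total deployed index nodes, recommended prefix length)
PREFIX_TABLE = [(10, 21), (25, 20), (50, 19), (120, 18)]


def recommended_prefix_length(total_nodes: int):
    """Return the recommended prefix length, or None if the table has no entry."""
    if total_nodes < 1:
        raise ValueError("total_nodes must be at least 1")
    for max_nodes, prefix_length in PREFIX_TABLE:
        if total_nodes <= max_nodes:
            return prefix_length
    return None  # more than 120 nodes: not covered by the table


def addresses_in_prefix(prefix_length: int) -> int:
    """Number of IPv4 addresses in a range with the given prefix length."""
    return 2 ** (32 - prefix_length)


prefix = recommended_prefix_length(4)
print(f"/{prefix} ({addresses_in_prefix(prefix)} addresses)")  # prints "/21 (2048 addresses)"
```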

How do I resolve an IP exhausted error?

To resolve an IP exhausted error, complete the following steps:

  1. Check for any unused DeployedIndexes and undeploy them to free up IP addresses.

  2. Expand existing reserved IP ranges, or allocate more IP ranges.

For more information, see IP address range exhaustion.

Why can't I reuse the deployed index ID after the previous DeployedIndex is undeployed?

UndeployIndex cleanup takes a further 10 to 20 minutes to complete, even after the operation reports that it succeeded. We recommend that you either wait 10 to 20 minutes before reusing the same ID, or use a different ID.