Querying Collections for Data Objects

The Query API retrieves Data Objects from a Collection that match a filter, much like querying a database table with a SQL WHERE clause. It also supports aggregation, which returns a count of the Data Objects matching a filter.

Filter expression language

In addition to k-nearest-neighbor (KNN) and approximate-nearest-neighbor (ANN) search, Vector Search 2.0 supports filtering with its own query language. The following table describes the filter operators.

| Filter | Description | Supported types | Example |
|--------|-------------|-----------------|---------|
| $eq | Matches Data Objects with field values that are equal to a specified value. | Number, string, boolean | {"genre": {"$eq": "documentary"}} |
| $ne | Matches Data Objects with field values that are not equal to a specified value. | Number, string, boolean | {"genre": {"$ne": "drama"}} |
| $gt | Matches Data Objects with field values that are greater than a specified value. | Number | {"year": {"$gt": 2019}} |
| $gte | Matches Data Objects with field values that are greater than or equal to a specified value. | Number | {"year": {"$gte": 2020}} |
| $lt | Matches Data Objects with field values that are less than a specified value. | Number | {"year": {"$lt": 2020}} |
| $lte | Matches Data Objects with field values that are less than or equal to a specified value. | Number | {"year": {"$lte": 2020}} |
| $in | Matches Data Objects with field values that are in a specified array. | String | {"genre": {"$in": ["comedy", "documentary"]}} |
| $nin | Matches Data Objects with field values that are not in a specified array. | String | {"genre": {"$nin": ["comedy", "documentary"]}} |
| $and | Joins query clauses with a logical AND. | - | {"$and": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]} |
| $or | Joins query clauses with a logical OR. | - | {"$or": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]} |
| $all | Matches Data Objects whose array field contains all specified values. | - | {"colors": {"$all": ["red", "blue"]}} |
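
Because $and and $or each take an array of clauses, and a clause can itself be an $and or $or filter, larger filters can be assembled from smaller ones. The following shell sketch shows one way to do that. The helper names and_filter and or_filter are hypothetical and not part of the API, and the helpers only concatenate JSON text without validating it.

```shell
# Hypothetical helpers (not part of the API): join JSON filter clauses
# into a single $and / $or filter. They only concatenate text; they do
# not check that each argument is valid JSON.

# Each function body runs in a subshell so the IFS change stays local.
and_filter() (
  IFS=,
  printf '{"$and": [%s]}\n' "$*"
)

or_filter() (
  IFS=,
  printf '{"$or": [%s]}\n' "$*"
)

# Comedies or documentaries released in 2020 or later.
recent=$(and_filter \
  '{"genre": {"$in": ["comedy", "documentary"]}}' \
  '{"year": {"$gte": 2020}}')

# Either the clause above, or anything directed by Akira Kurosawa.
combined=$(or_filter "$recent" '{"director": {"$eq": "Akira Kurosawa"}}')

printf '%s\n' "$combined"
```

The resulting string can be used as the value of the "filter" field in a query request body.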

Querying Collections

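The following example queries the Collection movies for Data Objects whose director is Akira Kurosawa, or whose director is David Fincher and whose genre is not Thriller. It requests up to 10 results per page.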

# Query Data Objects
curl -X POST \
  'https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies/dataObjects:query' \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H 'Content-Type: application/json' \
  -d '{
    "page_size": 10,
    "page_token": "",
    "filter": {
      "$or": [
        {
          "director": {
            "$eq": "Akira Kurosawa"
          }
        },
        {
          "$and": [
            {
              "director": {
                "$eq": "David Fincher"
              }
            },
            {
              "genre": {
                "$ne": "Thriller"
              }
            }
          ]
        }
      ]
    },
    "output_fields": {
      "data_fields": "*",
      "vector_fields": "*",
      "metadata_fields": "*"
    }
  }'
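
The request above sets page_size and an empty page_token, which suggests token-based paging: request a page, then pass a token from the response back as page_token to fetch the next one. This section does not show the response format, so the sketch below only builds the request body for a given token. query_body is a hypothetical helper, and where the next token appears in the response is an assumption to check against the API reference.

```shell
# Hypothetical helper (not part of the API): build a query request body.
#   $1 = page size, $2 = page token ("" for the first page), $3 = filter JSON
# The token is inserted as-is, so it must not contain quotes or backslashes.
query_body() {
  printf '{"page_size": %d, "page_token": "%s", "filter": %s}\n' "$1" "$2" "$3"
}

filter='{"director": {"$eq": "Akira Kurosawa"}}'

# First page: empty token, as in the curl example above.
first_page=$(query_body 10 "" "$filter")

# Later page: TOKEN_FROM_PREVIOUS_RESPONSE is a placeholder for the value
# returned with the previous page.
next_page=$(query_body 10 "TOKEN_FROM_PREVIOUS_RESPONSE" "$filter")

printf '%s\n%s\n' "$first_page" "$next_page"
```

Either body can then be sent by passing it to curl as -d "$first_page" in place of the inline JSON.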

The following example demonstrates how to count all Data Objects in the Collection movies.

# Count Data Objects
curl -X POST \
  'https://vectorsearch.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION/collections/movies/dataObjects:query' \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H 'Content-Type: application/json' \
  -d '{
    "aggregate": "count"
  }'
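
The introduction notes that aggregation can count the Data Objects matching a filter. A minimal sketch of such a request body follows, assuming that "filter" and "aggregate" can be sent together in the same request. This section only shows "aggregate" on its own, so treat the combination as unverified. count_body is a hypothetical helper.

```shell
# Hypothetical helper (not part of the API): build a count request body.
# Assumption: "filter" and "aggregate" may appear in the same request.
#   $1 = filter JSON
count_body() {
  printf '{"filter": %s, "aggregate": "count"}\n' "$1"
}

# Count movies released in 2020 or later.
body=$(count_body '{"year": {"$gte": 2020}}')
printf '%s\n' "$body"
```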

What's next?