Class SpannerVectorStore (0.6.0)

SpannerVectorStore(
    instance_id: str,
    database_id: str,
    table_name: str,
    embedding_service: langchain_core.embeddings.embeddings.Embeddings,
    id_column: str = "langchain_id",
    content_column: str = "content",
    embedding_column: str = "embedding",
    client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
    metadata_columns: typing.Optional[typing.List[str]] = None,
    ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
    metadata_json_column: typing.Optional[str] = None,
    query_parameters: langchain_google_spanner.vector_store.QueryParameters = <langchain_google_spanner.vector_store.QueryParameters object>,
)

Initialize the SpannerVectorStore.

Parameters:

  • instance_id (str): The ID of the Spanner instance.
  • database_id (str): The ID of the Spanner database.
  • table_name (str): The name of the table.
  • embedding_service (Embeddings): The embedding service.
  • id_column (str): The name of the row ID column. Defaults to ID_COLUMN_NAME ('langchain_id').
  • content_column (str): The name of the content column. Defaults to CONTENT_COLUMN_NAME ('content').
  • embedding_column (str): The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME ('embedding').
  • client (Optional[Client]): The Spanner client. Defaults to Client().
  • metadata_columns (Optional[List[str]]): List of metadata columns. Defaults to None.
  • ignore_metadata_columns (Optional[List[str]]): List of metadata columns to ignore. Defaults to None.
  • metadata_json_column (Optional[str]): The generic metadata column. Defaults to None.
  • query_parameters (QueryParameters): The query parameters. Defaults to QueryParameters().
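
As a rough illustration of how these parameters fit together, the sketch below collects the constructor arguments in a dictionary and passes them to SpannerVectorStore. The instance, database, table, and metadata column names are placeholders, and store_kwargs and build_store are hypothetical helper names, not part of the library. The import is deferred into build_store because it needs the langchain-google-spanner package and Spanner credentials.

```python
from typing import Any, Dict


def store_kwargs(embedding_service: Any) -> Dict[str, Any]:
    """Collect constructor arguments; every literal below is a placeholder."""
    return {
        "instance_id": "my-instance",
        "database_id": "my-database",
        "table_name": "documents",
        "embedding_service": embedding_service,
        # Assumption: the table has a "source" column for this metadata key.
        "metadata_columns": ["source"],
    }


def build_store(embedding_service: Any):
    """Create the store; requires langchain-google-spanner and credentials."""
    from langchain_google_spanner.vector_store import SpannerVectorStore

    return SpannerVectorStore(**store_kwargs(embedding_service))
```

Any object implementing the LangChain Embeddings interface can be passed as embedding_service.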

Methods

_generate_sql

_generate_sql(
    dialect,
    table_name,
    id_column,
    content_column,
    embedding_column,
    column_configs,
    primary_key,
    secondary_indexes: typing.Optional[
        typing.List[langchain_google_spanner.vector_store.SecondaryIndex]
    ] = None,
)

Generate SQL for creating the vector store table.

Parameters:

  • dialect: The database dialect.
  • table_name: The name of the table.
  • id_column: The name of the row ID column.
  • content_column: The name of the content column.
  • embedding_column: The name of the embedding column.
  • column_configs: List of tuples containing metadata column information.

Returns:

  • str: The generated SQL.

_select_relevance_score_fn

_select_relevance_score_fn() -> typing.Callable[[float], float]

The 'correct' relevance function may differ depending on a few things, including:

  • the distance / similarity metric used by the VectorStore
  • the scale of your embeddings (OpenAI's are unit normed. Many others are not!)
  • embedding dimensionality
  • etc.

Vectorstores should define their own selection-based method of relevance.
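
To make the dependence on the distance metric concrete, here is a minimal sketch of two common distance-to-relevance conversions and a selector that returns a Callable[[float], float]. It assumes unit-normed embeddings and is not taken from this class's implementation; all function names are illustrative.

```python
import math
from typing import Callable


def cosine_relevance(distance: float) -> float:
    """Map cosine distance (0 means identical direction) to a relevance score."""
    return 1.0 - distance


def euclidean_relevance(distance: float) -> float:
    """Map Euclidean distance between unit vectors to a relevance score.

    Orthogonal unit vectors are sqrt(2) apart, which maps to 0.
    """
    return 1.0 - distance / math.sqrt(2)


def select_relevance_fn(metric: str) -> Callable[[float], float]:
    """Pick a conversion by metric name ('cosine' or 'euclidean')."""
    table = {"cosine": cosine_relevance, "euclidean": euclidean_relevance}
    if metric not in table:
        raise ValueError(f"unsupported metric: {metric}")
    return table[metric]
```

If the embeddings are not unit normed, the Euclidean conversion above can produce scores outside the 0 to 1 range, which is one reason the choice is left to each vector store.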

add_documents

add_documents(
    documents: typing.List[langchain_core.documents.base.Document],
    ids: typing.Optional[typing.List[str]] = None,
    **kwargs: typing.Any
) -> typing.List[str]

Add documents to the vector store.

Parameters:

  • documents (List[Document]): Documents to add to the vector store.
  • ids (Optional[List[str]]): Optional list of IDs for the documents.

Returns:

  • List[str]: List of IDs of the added documents.

add_texts

add_texts(
    texts: typing.Iterable[str],
    metadatas: typing.Optional[typing.List[dict]] = None,
    ids: typing.Optional[typing.List[str]] = None,
    batch_size: int = 5000,
    **kwargs: typing.Any
) -> typing.List[str]

Add texts to the vector store index.

Parameters:

  • texts (Iterable[str]): Iterable of strings to add to the vector store.
  • metadatas (Optional[List[dict]]): Optional list of metadatas associated with the texts.
  • ids (Optional[List[str]]): Optional list of IDs for the texts.
  • batch_size (int): The batch size for inserting data. Defaults to 5000.

Returns:

  • List[str]: List of IDs of the added texts.
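
The batch_size parameter suggests that rows are written in chunks. The sketch below shows one plausible way to split an iterable of texts into batches of at most batch_size items. It illustrates the general technique only and is an assumption, not the library's actual insertion code; batched is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(items: Iterable[str], batch_size: int = 5000) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Because the helper consumes an iterator lazily, it also works for generators, which matches the Iterable[str] type of texts.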

delete

delete(
    ids: typing.Optional[typing.List[str]] = None,
    documents: typing.Optional[
        typing.List[langchain_core.documents.base.Document]
    ] = None,
    **kwargs: typing.Any
) -> typing.Optional[bool]

Delete records from the vector store.

Parameters:

  • ids (Optional[List[str]]): List of IDs to delete.
  • documents (Optional[List[Document]]): List of documents to delete.

Returns:

  • Optional[bool]: True if deletion is successful, False otherwise, None if not implemented.
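
Since the return value has three possible states, callers may want to handle each one explicitly. The following sketch does so against any object exposing a compatible delete method; SupportsDelete and delete_or_raise are hypothetical names introduced for illustration.

```python
from typing import Any, List, Optional, Protocol


class SupportsDelete(Protocol):
    """Structural type for anything with a compatible delete method."""

    def delete(
        self, ids: Optional[List[str]] = None, **kwargs: Any
    ) -> Optional[bool]: ...


def delete_or_raise(store: SupportsDelete, ids: List[str]) -> None:
    """Delete by ID and turn the tri-state result into exceptions."""
    result = store.delete(ids=ids)
    if result is None:
        raise NotImplementedError("store does not implement delete")
    if not result:
        raise RuntimeError(f"failed to delete ids: {ids}")
```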

from_documents

from_documents(
    documents: typing.List[langchain_core.documents.base.Document],
    embedding: langchain_core.embeddings.embeddings.Embeddings,
    instance_id: str,
    database_id: str,
    table_name: str,
    id_column: str = "langchain_id",
    content_column: str = "content",
    embedding_column: str = "embedding",
    ids: typing.Optional[typing.List[str]] = None,
    client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
    metadata_columns: typing.Optional[typing.List[str]] = None,
    ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
    metadata_json_column: typing.Optional[str] = None,
    query_parameter: langchain_google_spanner.vector_store.QueryParameters = <langchain_google_spanner.vector_store.QueryParameters object>,
    **kwargs: typing.Any
) -> langchain_google_spanner.vector_store.SpannerVectorStore

Initialize SpannerVectorStore from a list of documents.

Parameters:

  • documents (List[Document]): List of documents.
  • embedding (Embeddings): The embedding service.
  • instance_id (str): The ID of the Spanner instance.
  • database_id (str): The ID of the Spanner database.
  • table_name (str): The name of the table.
  • id_column (str): The name of the row ID column. Defaults to ID_COLUMN_NAME.
  • content_column (str): The name of the content column. Defaults to CONTENT_COLUMN_NAME.
  • embedding_column (str): The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME.
  • ids (Optional[List[str]]): Optional list of IDs for the documents. Defaults to None.
  • client (Optional[Client]): The Spanner client. Defaults to Client().
  • metadata_columns (Optional[List[str]]): List of metadata columns. Defaults to None.
  • ignore_metadata_columns (Optional[List[str]]): List of metadata columns to ignore. Defaults to None.
  • metadata_json_column (Optional[str]): The generic metadata column. Defaults to None.
  • query_parameter (QueryParameters): The query parameters. Defaults to QueryParameters().

Returns:

  • SpannerVectorStore: Initialized SpannerVectorStore instance.

from_texts

from_texts(
    texts: typing.List[str],
    embedding: langchain_core.embeddings.embeddings.Embeddings,
    instance_id: str,
    database_id: str,
    table_name: str,
    metadatas: typing.Optional[typing.List[dict]] = None,
    id_column: str = "langchain_id",
    content_column: str = "content",
    embedding_column: str = "embedding",
    ids: typing.Optional[typing.List[str]] = None,
    client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
    metadata_columns: typing.Optional[typing.List[str]] = None,
    ignore_metadata_columns: typing.Optional[typing.List[str]] = None,
    metadata_json_column: typing.Optional[str] = None,
    query_parameter: langchain_google_spanner.vector_store.QueryParameters = <langchain_google_spanner.vector_store.QueryParameters object>,
    **kwargs: typing.Any
) -> langchain_google_spanner.vector_store.SpannerVectorStore

Initialize SpannerVectorStore from a list of texts.

Parameters:

  • texts (List[str]): List of texts.
  • embedding (Embeddings): The embedding service.
  • instance_id (str): The ID of the Spanner instance.
  • database_id (str): The ID of the Spanner database.
  • table_name (str): The name of the table.
  • metadatas (Optional[List[dict]]): Optional list of metadatas associated with the texts. Defaults to None.
  • id_column (str): The name of the row ID column. Defaults to ID_COLUMN_NAME.
  • content_column (str): The name of the content column. Defaults to CONTENT_COLUMN_NAME.
  • embedding_column (str): The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME.
  • ids (Optional[List[str]]): Optional list of IDs for the texts. Defaults to None.
  • client (Optional[Client]): The Spanner client. Defaults to Client().
  • metadata_columns (Optional[List[str]]): List of metadata columns. Defaults to None.
  • ignore_metadata_columns (Optional[List[str]]): List of metadata columns to ignore. Defaults to None.
  • metadata_json_column (Optional[str]): The generic metadata column. Defaults to None.
  • query_parameter (QueryParameters): The query parameters. Defaults to QueryParameters().

Returns:

  • SpannerVectorStore: Initialized SpannerVectorStore instance.

init_vector_store_table

init_vector_store_table(
    instance_id: str,
    database_id: str,
    table_name: str,
    client: typing.Optional[google.cloud.spanner_v1.client.Client] = None,
    id_column: typing.Union[
        str, langchain_google_spanner.vector_store.TableColumn
    ] = "langchain_id",
    content_column: str = "content",
    embedding_column: str = "embedding",
    metadata_columns: typing.Optional[
        typing.List[langchain_google_spanner.vector_store.TableColumn]
    ] = None,
    primary_key: typing.Optional[str] = None,
    vector_size: typing.Optional[int] = None,
    secondary_indexes: typing.Optional[
        typing.List[langchain_google_spanner.vector_store.SecondaryIndex]
    ] = None,
) -> bool

Initialize a new vector store table in Google Cloud Spanner.

Parameters:

  • instance_id (str): The ID of the Spanner instance.
  • database_id (str): The ID of the Spanner database.
  • table_name (str): The name of the table to initialize.
  • client (Client): The Spanner client. Defaults to Client(project="span-cloud-testing").
  • id_column (Union[str, TableColumn]): The name of the row ID column, or a TableColumn describing it. Defaults to ID_COLUMN_NAME.
  • content_column (str): The name of the content column. Defaults to CONTENT_COLUMN_NAME.
  • embedding_column (str): The name of the embedding column. Defaults to EMBEDDING_COLUMN_NAME.
  • metadata_columns (Optional[List[TableColumn]]): List of TableColumn objects describing the metadata columns. Defaults to None.
  • vector_size (Optional[int]): The size of the vector. Defaults to None.
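
A minimal sketch of creating the table before constructing the store is shown below. It assumes the method can be called directly on the class, as the signature without a self parameter suggests. The instance, database, and table names are placeholders, the vector_size of 768 is an arbitrary example value that should match the embedding model in use, and create_table is a hypothetical helper name. The import is deferred because the call needs the langchain-google-spanner package and Spanner credentials.

```python
def create_table(vector_size: int = 768) -> bool:
    """Create the vector store table; every literal below is a placeholder."""
    from langchain_google_spanner.vector_store import SpannerVectorStore

    return SpannerVectorStore.init_vector_store_table(
        instance_id="my-instance",
        database_id="my-database",
        table_name="documents",
        # Assumption: must equal the dimensionality of the embedding model.
        vector_size=vector_size,
    )
```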

max_marginal_relevance_search

max_marginal_relevance_search(
    query: str,
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    pre_filter: typing.Optional[str] = None,
    **kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.
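
The trade-off can be sketched in plain Python. The version below is a generic greedy MMR over cosine similarity, written as an illustration under the assumption that candidates holds the fetch_k vectors already retrieved for the query; it is not this class's implementation, and cosine and mmr are illustrative names.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(
    query: Sequence[float],
    candidates: List[Sequence[float]],
    k: int = 4,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k candidates chosen greedily by MMR."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Highest similarity to anything already selected (0.0 if none).
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With lambda_mult close to 1 the selection follows pure similarity to the query; lower values penalize candidates that resemble ones already chosen, so near-duplicates are skipped in favor of more diverse results.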

max_marginal_relevance_search_by_vector

max_marginal_relevance_search_by_vector(
    embedding: typing.List[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    pre_filter: typing.Optional[str] = None,
    **kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]

Return docs selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

max_marginal_relevance_search_with_score_by_vector

max_marginal_relevance_search_with_score_by_vector(
    embedding: typing.List[float],
    k: int = 4,
    fetch_k: int = 20,
    lambda_mult: float = 0.5,
    pre_filter: typing.Optional[str] = None,
) -> typing.List[typing.Tuple[langchain_core.documents.base.Document, float]]

Return docs and their similarity scores selected using the maximal marginal relevance.

Maximal marginal relevance optimizes for similarity to query AND diversity among selected documents.

similarity_search

similarity_search(
    query: str,
    k: int = 4,
    pre_filter: typing.Optional[str] = None,
    **kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]

Perform similarity search for a given query.

Parameters:

  • query (str): The query string.
  • k (int): The number of nearest neighbors to retrieve. Defaults to 4.
  • pre_filter (Optional[str]): Pre-filter condition for the query. Defaults to None.

Returns:

  • List[Document]: List of documents most similar to the query.
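
A short usage sketch follows. It works against any object exposing a compatible similarity_search method and returns the text of each hit through the Document page_content attribute. The helper name search_contents is hypothetical, and the example filter string in the comment assumes pre_filter takes a SQL-style condition on table columns, which the description above does not spell out.

```python
from typing import Any, List, Optional


def search_contents(
    store: Any,
    query: str,
    pre_filter: Optional[str] = None,
    k: int = 4,
) -> List[str]:
    """Run a similarity search and return the content of each document."""
    # Assumed filter shape, for example: "source = 'faq'"
    docs = store.similarity_search(query, k=k, pre_filter=pre_filter)
    return [doc.page_content for doc in docs]
```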

similarity_search_by_vector

similarity_search_by_vector(
    embedding: typing.List[float],
    k: int = 4,
    pre_filter: typing.Optional[str] = None,
    **kwargs: typing.Any
) -> typing.List[langchain_core.documents.base.Document]

Perform similarity search by vector.

Parameters:

  • embedding (List[float]): The embedding vector.
  • k (int): The number of nearest neighbors to retrieve. Defaults to 4.
  • pre_filter (Optional[str]): Pre-filter condition for the query. Defaults to None.

Returns:

  • List[Document]: List of documents most similar to the query.

similarity_search_with_score

similarity_search_with_score(
    query: str,
    k: int = 4,
    pre_filter: typing.Optional[str] = None,
    **kwargs: typing.Any
) -> typing.List[typing.Tuple[langchain_core.documents.base.Document, float]]

Perform similarity search for a given query with scores.

Parameters:

  • query (str): The query string.
  • k (int): The number of nearest neighbors to retrieve. Defaults to 4.
  • pre_filter (Optional[str]): Pre-filter condition for the query. Defaults to None.

Returns:

  • List[Tuple[Document, float]]: List of tuples containing Document and similarity score.

similarity_search_with_score_by_vector

similarity_search_with_score_by_vector(
    embedding: typing.List[float],
    k: int = 4,
    pre_filter: typing.Optional[str] = None,
    **kwargs: typing.Any
) -> typing.List[typing.Tuple[langchain_core.documents.base.Document, float]]

Perform similarity search by vector with scores.

Parameters:

  • embedding (List[float]): The embedding vector.
  • k (int): The number of nearest neighbors to retrieve. Defaults to 4.
  • pre_filter (Optional[str]): Pre-filter condition for the query. Defaults to None.

Returns:

  • List[Tuple[Document, float]]: List of tuples containing Document and similarity score.