RAG quickstart for Python

This page shows you how to use the Vertex AI SDK to run Vertex AI RAG Engine tasks.

You can also follow along using the Intro to Vertex AI RAG Engine notebook.

Prepare your Google Cloud console

To use Vertex AI RAG Engine, do the following:

  1. Install the Vertex AI SDK for Python.

  2. Run this command in the Google Cloud console to set your project:

    gcloud config set project {project}

  3. Run this command to authorize your login:

    gcloud auth application-default login
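
Before running the sample, it can help to confirm that the setup steps above took effect. The following is a minimal sketch, not part of the official quickstart. It assumes the SDK is importable as `vertexai` and that `gcloud auth application-default login` wrote credentials to the default well-known location. The helper names `adc_file_path` and `check_setup` are hypothetical, and a custom `CLOUDSDK_CONFIG` directory is not handled.

```python
import importlib.util
import os
from pathlib import Path


def adc_file_path() -> Path:
    """Return the default location of the Application Default Credentials file.

    Assumption: the gcloud configuration directory has not been relocated.
    """
    if os.name == "nt":
        base = Path(os.environ.get("APPDATA", ""))
        return base / "gcloud" / "application_default_credentials.json"
    return Path.home() / ".config" / "gcloud" / "application_default_credentials.json"


def check_setup() -> dict:
    """Report whether the SDK and local credentials appear to be available."""
    sdk_installed = importlib.util.find_spec("vertexai") is not None
    # Credentials can come from an explicit key file or from the gcloud login step.
    adc_found = bool(os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")) or adc_file_path().is_file()
    return {"sdk_installed": sdk_installed, "adc_found": adc_found}


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If either check reports `missing`, repeat the corresponding step in the list above before continuing.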

Run Vertex AI RAG Engine

Copy and paste this sample code into the Google Cloud console to run Vertex AI RAG Engine.

Python

from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Create a RAG Corpus, Import Files, and Generate a response

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# paths = ["https://drive.google.com/file/d/123", "gs://my_bucket/my_files_dir"]  # Supports Google Cloud Storage and Google Drive Links

# Initialize Vertex AI API once per session
vertexai.init(project=PROJECT_ID, location="us-central1")

# Create RagCorpus
# Configure embedding model, for example "text-embedding-004".
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)

backend_config = rag.RagVectorDbConfig(rag_embedding_model_config=embedding_model_config)

rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=backend_config,
)

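# List the corpora in your project, including the one you just created.
# Store the result under a separate name so that `rag_corpus` still refers to the new corpus.
corpora = rag.list_corpora()

# Import Files to the RagCorpus
# Choose a corpus to import files to. You can use rag_corpus.name for the corpus you just created,
# or use a name returned by list_corpora().
corpus_name = rag_corpus.name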

transformation_config = rag.TransformationConfig(
    chunking_config=rag.ChunkingConfig(
        chunk_size=512,
        chunk_overlap=100,
    ),
)

rag.import_files(
    corpus_name,
    paths,
    transformation_config=transformation_config,  # Optional
    max_embedding_requests_per_min=1000,  # Optional
)

# Alternatively, use the async import instead of rag.import_files() above.
# Note: `await` requires an async context, such as a notebook cell or a coroutine.
response = await rag.import_files_async(
    corpus_name,
    paths,
    transformation_config=transformation_config,  # Optional
    max_embedding_requests_per_min=1000,  # Optional
)
result = await response.result()
print(result)

# List the files in the rag corpus
rag.list_files(corpus_name)

# Direct context retrieval
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5)  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why is it helpful?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

# Enhance generation
# Create a RAG retrieval tool
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)
# Create a Gemini model instance
rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)

# Generate response
response = rag_model.generate_content("What is RAG and why is it helpful?")
print(response.text)
# Example response:
#   RAG stands for Retrieval-Augmented Generation.
#   It's a technique used in AI to enhance the quality of responses
# ...
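
The sample sets `chunk_size=512` and `chunk_overlap=100` without showing what these values do to a document. The following is a minimal sketch of fixed-size chunking with overlap in plain Python. It is an illustration under stated assumptions, not the RAG Engine's actual chunker: it assumes the sizes count tokens, it treats any list of items as tokens, and the function name `chunk_tokens` is hypothetical.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of chunk_size that share chunk_overlap tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the input
        start += step
    return chunks


if __name__ == "__main__":
    # Illustrative input: 1000 placeholder tokens.
    tokens = [f"tok{i}" for i in range(1000)]
    chunks = chunk_tokens(tokens, chunk_size=512, chunk_overlap=100)
    print([len(c) for c in chunks])
```

With these settings each window advances by 412 tokens, so the last 100 tokens of one chunk are repeated at the start of the next. A larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed.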

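The retrieval configuration in the sample combines `top_k=3` with `vector_distance_threshold=0.5`. The following sketch shows how those two settings can interact, in plain Python. It rests on assumptions that the sample does not confirm: that the distance is cosine distance, and that only contexts with a distance below the threshold are kept. The functions `cosine_distance` and `retrieve` and the toy vectors are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity. Assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, top_k=3, vector_distance_threshold=0.5):
    """Keep chunks closer than the threshold, then return the top_k nearest."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] < vector_distance_threshold]
    kept.sort(key=lambda pair: pair[0])  # nearest first
    return kept[:top_k]


if __name__ == "__main__":
    # Toy two-dimensional embeddings, for illustration only.
    chunks = [
        ("a", [1.0, 0.0]),
        ("b", [0.0, 1.0]),
        ("c", [1.0, 1.0]),
        ("d", [1.0, 0.1]),
    ]
    for distance, text in retrieve([1.0, 0.0], chunks):
        print(text, round(distance, 4))
```

In this sketch the threshold is applied first and `top_k` second, so a query can return fewer than `top_k` contexts when few chunks are close enough.
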
What's next