This page shows you how to run Vertex AI RAG Engine tasks by using the Vertex AI SDK.
You can also follow along by using the Introduction to Vertex AI RAG Engine notebook.
Prepare the Google Cloud console
To use Vertex AI RAG Engine, follow these steps:
Run this command in the Google Cloud console to set your project:
gcloud config set project PROJECT_ID
Run this command to authorize your login:
gcloud auth application-default login
Run Vertex AI RAG Engine
To run Vertex AI RAG Engine, copy and paste this sample code into the Google Cloud console:
Python
from vertexai import rag
from vertexai.generative_models import GenerativeModel, Tool
import vertexai

# Create a RAG corpus, import files, and generate a response.

# TODO(developer): Update and un-comment the lines below.
# PROJECT_ID = "your-project-id"
# display_name = "test_corpus"
# paths = ["https://drive.google.com/file/d/123", "gs://my_bucket/my_files_dir"]  # Supports Cloud Storage and Google Drive links

# Initialize the Vertex AI API once per session.
vertexai.init(project=PROJECT_ID, location="us-central1")

# Create a RagCorpus.
# Configure the embedding model, for example "text-embedding-004".
embedding_model_config = rag.EmbeddingModelConfig(
    publisher_model="publishers/google/models/text-embedding-004"
)
backend_config = rag.RagVectorDbConfig(
    rag_embedding_model_config=embedding_model_config
)

rag_corpus = rag.create_corpus(
    display_name=display_name,
    backend_config=backend_config,
)

# List the corpora, including the one you just created.
print(rag.list_corpora())

# Import files into the RagCorpus.
# Choose a corpus to import files into: use rag_corpus.name for the corpus
# you just created, or a name returned by rag.list_corpora().
corpus_name = rag_corpus.name

transformation_config = rag.TransformationConfig(
    chunking_config=rag.ChunkingConfig(
        chunk_size=512,
        chunk_overlap=100,
    ),
)

rag.import_files(
    corpus_name,
    paths,
    transformation_config=transformation_config,  # Optional
    max_embedding_requests_per_min=1000,  # Optional
)

# Alternatively, from within an async context you can use async import:
# response = await rag.import_files_async(
#     corpus_name,
#     paths,
#     transformation_config=transformation_config,  # Optional
#     max_embedding_requests_per_min=1000,  # Optional
# )
# result = await response.result()
# print(result)

# List the files in the RAG corpus.
rag.list_files(corpus_name)

# Direct context retrieval
rag_retrieval_config = rag.RagRetrievalConfig(
    top_k=3,  # Optional
    filter=rag.Filter(vector_distance_threshold=0.5),  # Optional
)
response = rag.retrieval_query(
    rag_resources=[
        rag.RagResource(
            rag_corpus=corpus_name,
            # Optional: supply IDs from `rag.list_files()`.
            # rag_file_ids=["rag-file-1", "rag-file-2", ...],
        )
    ],
    text="What is RAG and why is it helpful?",
    rag_retrieval_config=rag_retrieval_config,
)
print(response)

# Enhance generation:
# create a RAG retrieval tool.
rag_retrieval_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[
                rag.RagResource(
                    rag_corpus=corpus_name,  # Currently only 1 corpus is allowed.
                    # Optional: supply IDs from `rag.list_files()`.
                    # rag_file_ids=["rag-file-1", "rag-file-2", ...],
                )
            ],
            rag_retrieval_config=rag_retrieval_config,
        ),
    )
)

# Create a Gemini model instance.
rag_model = GenerativeModel(
    model_name="gemini-1.5-flash-001", tools=[rag_retrieval_tool]
)

# Generate a response.
response = rag_model.generate_content("What is RAG and why is it helpful?")
print(response.text)
# Example response:
#   RAG stands for Retrieval-Augmented Generation.
#   It's a technique used in AI to enhance the quality of responses
#   ...
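The chunk_size and chunk_overlap values passed to ChunkingConfig control how imported documents are split before embedding: each chunk shares some trailing context with the next one. As a rough illustration only, and not the SDK's actual chunker (which works on tokens and is more sophisticated), a fixed-size split with overlap can be sketched in plain Python with a hypothetical helper:

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split `text` into windows of up to `chunk_size` characters.

    Each window starts `chunk_size - chunk_overlap` characters after the
    previous one, so consecutive chunks share `chunk_overlap` characters
    of context.
    """
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# With chunk_size=4 and chunk_overlap=2, each chunk repeats the last
# 2 characters of the previous chunk.
chunks = chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

A larger overlap reduces the chance that a sentence relevant to a query is cut in half at a chunk boundary, at the cost of indexing some text twice.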
Next steps
- To learn more about the RAG API, see the Vertex AI RAG Engine API.
- To learn more about RAG responses, see Retrieval and generation output of Vertex AI RAG Engine.
- To learn more about Vertex AI RAG Engine, see the Vertex AI RAG Engine overview.