# Run LLM inference on Cloud Run GPUs with Hugging Face TGI

Last updated: 2025-09-04 (UTC)

This example shows how to run a backend service on Cloud Run GPUs that serves Llama 3 with the [Hugging Face Text Generation Inference (TGI) toolkit](https://huggingface.co/docs/text-generation-inference), a toolkit for deploying and serving Large Language Models (LLMs).

For the complete walkthrough, see [Deploy Llama 3.1 8B with TGI DLC on Cloud Run](https://huggingface.co/docs/google-cloud/examples/cloud-run-tgi-deployment).
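
Once such a service is deployed, a client could query it over HTTP. The following is a minimal sketch, not part of the linked walkthrough. It assumes the service exposes TGI's `/generate` endpoint at its root URL. The service URL in the comments is hypothetical, as are the helper names `build_generate_request` and `generate`. The optional `token` parameter assumes the service requires an identity token sent as a bearer token, which depends on how the service's access is configured.

```python
import json
import urllib.request


def build_generate_request(base_url, prompt, max_new_tokens=64, token=None):
    """Build a POST request for TGI's /generate endpoint (hypothetical helper)."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the service expects an identity token as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def generate(base_url, prompt, max_new_tokens=64, token=None, timeout=120):
    """Send the prompt to the service and return the generated text."""
    request = build_generate_request(base_url, prompt, max_new_tokens, token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["generated_text"]


# Example call, with a placeholder URL to replace with your own service URL:
# print(generate("https://tgi-service-example.a.run.app", "What is Cloud Run?"))
```

Separating request construction from the network call makes it possible to inspect the payload without a running service. The generous default timeout is a guess to allow for a slow first request while the model loads.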