[[["容易理解","easyToUnderstand","thumb-up"],["確實解決了我的問題","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["難以理解","hardToUnderstand","thumb-down"],["資訊或程式碼範例有誤","incorrectInformationOrSampleCode","thumb-down"],["缺少我需要的資訊/範例","missingTheInformationSamplesINeed","thumb-down"],["翻譯問題","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["上次更新時間:2025-09-04 (世界標準時間)。"],[],[],null,["# Custom inference routines let you build [custom containers](/vertex-ai/docs/predictions/use-custom-container) with preprocessing and postprocessing code, without dealing with the details of setting up an HTTP server or building a container from scratch. You can use preprocessing to normalize and transform the inputs or make calls to external services to get additional data, and use postprocessing to format the model inference or run business logic.\n\nThe following diagram depicts the user workflow both with and without custom inference routines.\n\nThe main differences are:\n\n- You don't need to write a model server or a Dockerfile. The model server, which is the HTTP server that hosts the model, is provided for you.\n\n- You can deploy and debug the model locally, speeding up the iteration cycle during development.\n\nBuild and deploy a custom container\n-----------------------------------\n\nThis section describes how to use CPR to build a custom container with preprocessing and postprocessing logic and deploy to both a local and online endpoint.\n\n### Setup\n\nYou must have [Vertex AI SDK for Python](https://github.com/googleapis/python-aiplatform) and [Docker](https://www.docker.com/) installed in your environment.\n\n### Write custom `Predictor` inference interface\n\nImplement the [`Predictor`](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/prediction/predictor.py) interface.\n\nFor example, see [Sklearn's `Predictor` implementation](https://github.com/googleapis/python-aiplatform/blob/main/google/cloud/aiplatform/prediction/sklearn/predictor.py).\n\n### Write custom `Handler` (optional)\n\nCustom handlers have access to the raw request object, and thus, are useful in rare cases where you need to customize web server related logic, such as supporting additional request and response headers or deserializing non-JSON formatted inference requests.\n\nHere is a [sample notebook](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/prediction/custom_prediction_routines/SDK_Custom_Predict_and_Handler_SDK_Integration.ipynb) that implements both Predictor and Handler.\n\nAlthough it isn't required, for better code organization and reusability, we recommend that you implement the web server logic in the Handler and the ML logic in the Predictor as shown in the default handler.\n\n### Build custom container\n\nPut your custom code and an additional `requirements.txt` file, if you need to install any packages in your images, in a directory.\n\nUse Vertex AI SDK for Python to build custom containers as follows: \n\n from google.cloud.aiplatform.prediction import LocalModel\n\n # {import your predictor and handler}\n\n local_model = LocalModel.build_cpr_model(\n {PATH_TO_THE_SOURCE_DIR},\n f\"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}\",\n predictor={PREDICTOR_CLASS},\n handler={HANDLER_CLASS},\n requirements_path={PATH_TO_REQUIREMENTS_TXT},\n )\n\nYou can inspect the container specification to get useful information such as image URI and environment variables. 
### Run the container locally (optional)

This step is required only if you want to run and test the container locally, which is useful for faster iteration. In the following example, you deploy to a local endpoint and send an inference request (format for the [request body](/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints/predict#request-body)):

    with local_model.deploy_to_local_endpoint(
        artifact_uri={GCS_PATH_TO_MODEL_ARTIFACTS},
        credential_path={PATH_TO_CREDENTIALS},
    ) as local_endpoint:
        health_check_response = local_endpoint.run_health_check()
        predict_response = local_endpoint.predict(
            request_file={PATH_TO_INPUT_FILE},
            headers={ANY_NEEDED_HEADERS},
        )

Print the health check and inference responses:

    print(health_check_response, health_check_response.content)
    print(predict_response, predict_response.content)

Print all of the container logs:

    local_endpoint.print_container_logs(show_all=True)

### Upload to Vertex AI Model Registry

Your model needs access to your model artifacts (the files from training), so make sure that you've uploaded them to Google Cloud Storage.

Push the image to [Artifact Registry](/artifact-registry/docs):

    local_model.push_image()

Then, upload the model to Model Registry:

    from google.cloud import aiplatform

    model = aiplatform.Model.upload(
        local_model=local_model,
        display_name={MODEL_DISPLAY_NAME},
        artifact_uri={GCS_PATH_TO_MODEL_ARTIFACTS},
    )

After your model is uploaded to Model Registry, you can use it to [get batch inferences](/vertex-ai/docs/predictions/batch-predictions) or deploy it to a Vertex AI endpoint to get online inferences.

### Deploy to a Vertex AI endpoint

    endpoint = model.deploy(machine_type="n1-standard-4")

After your model is deployed, you can [get online inferences](/vertex-ai/docs/predictions/get-predictions#get_online_predictions).
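For example, a simple online inference request to the deployed endpoint might look like the following sketch; the instance values are placeholders, and the expected format depends on what your `Predictor`'s `preprocess` method accepts.

    # The feature values below are placeholders; send instances that match
    # what your Predictor's preprocess method expects.
    response = endpoint.predict(instances=[[1.0, 2.0, 3.0, 4.0]])
    print(response.predictions)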
Notebook samples
----------------

These samples showcase the different ways that you can deploy a model with custom preprocessing and postprocessing by using Vertex AI Inference.

- [Custom Predictor with custom pre/post-processing for Sklearn, build your own container with Vertex AI SDK for Python.](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/prediction/custom_prediction_routines/SDK_Custom_Preprocess.ipynb)
  - Implement only the loading of the serialized preprocessor and the preprocess and postprocess methods in the Predictor.
  - Inherit the default model loading and predict behavior from the Vertex AI-distributed `SklearnPredictor`.
- [Custom Predictor, build your own container with Vertex AI SDK for Python.](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/prediction/custom_prediction_routines/SDK_Custom_Predict_SDK_Integration.ipynb)
  - Custom implementation of the entire Predictor.
- [Custom Predictor and Handler, build your own container with Vertex AI SDK for Python.](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/prediction/custom_prediction_routines/SDK_Custom_Predict_and_Handler_SDK_Integration.ipynb)
  - Custom implementation of the Predictor and Handler.
  - Customizing the Handler lets the model server handle CSV inputs.
- [Custom Predictor, build your own container with Vertex AI SDK for Python and PyTorch.](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/prediction/custom_prediction_routines/SDK_Pytorch_Custom_Predict.ipynb)
  - Custom implementation of the Predictor.
- [Existing image, test inference locally and deploy models with Vertex AI SDK for Python.](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/prediction/custom_prediction_routines/SDK_Triton_PyTorch_Local_Prediction.ipynb)
  - Use the NVIDIA Triton inference server for PyTorch models.
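As a rough illustration of the CSV-handling customization described in the Predictor-and-Handler sample above, a custom handler might look like the following sketch. It assumes, as in the sample notebooks, that the default `PredictionHandler` loads the `Predictor` and exposes it as `self._predictor`, and that the model server passes a FastAPI request object; verify these details against the SDK version you use.

    import json

    from fastapi import Request, Response

    from google.cloud.aiplatform.prediction.handler import PredictionHandler


    class CsvHandler(PredictionHandler):
        """Sketch: accepts newline-delimited CSV rows instead of a JSON body."""

        async def handle(self, request: Request) -> Response:
            # Read the raw request body instead of assuming JSON.
            raw_body = await request.body()
            rows = [
                [float(value) for value in line.split(",")]
                for line in raw_body.decode("utf-8").splitlines()
                if line.strip()
            ]

            # Reuse the Predictor that the parent handler loaded (assumed attribute).
            results = self._predictor.postprocess(
                self._predictor.predict(self._predictor.preprocess({"instances": rows}))
            )
            return Response(content=json.dumps(results), media_type="application/json")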