[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-05。"],[[["\u003cp\u003eImplement retry logic and timeouts in both the client and middleware to handle delays and errors caused by failed requests to the Cloud Healthcare API.\u003c/p\u003e\n"],["\u003cp\u003eIdentify the origin of errors—whether they occurred in the client, middleware, or the Cloud Healthcare API—to simplify debugging, especially when planning for final error states.\u003c/p\u003e\n"],["\u003cp\u003eCertain error codes, such as \u003ccode\u003e408 Request Timeout\u003c/code\u003e and \u003ccode\u003e500 Internal Server Error\u003c/code\u003e, are retryable, while others are not, so choose which errors to retry based on your system's architecture and error analysis.\u003c/p\u003e\n"],["\u003cp\u003ePlan for final error states, which occur when retries are exhausted, and consider whether human intervention or logging is necessary to resolve or review these states.\u003c/p\u003e\n"],["\u003cp\u003eAnalyze percentile latency, such as the 50th and 99th percentiles, over extended periods and with large request samples to understand how the system performs under varying loads and to account for potential latency variations.\u003c/p\u003e\n"]]],[],null,["# Request latency and error handling best practices\n\nThis page describes best practices for optimizing request latency and\nhandling errors in the Cloud Healthcare API. Implement\nthese practices as you plan and design your system architecture.\n\nGoogle provides a service-level agreement (SLA) that defines the\nexpected uptime of the Cloud Healthcare API service and how clients can handle errors.\nFor more information, see\n[Cloud Healthcare Service Level Agreement (SLA)](/healthcare/sla).\n\nImplement retry logic and timeouts\n----------------------------------\n\nTo handle delays and errors caused by failed requests, implement\nappropriate retry logic and timeouts. When setting the timeout duration, allow\nsufficient time to do the following:\n\n- Let the Cloud Healthcare API process the request.\n- Determine if the error originated from the service or the client.\n\nYou can retry some errors, but others are non-retryable and persist across\nmultiple retries. For example, if the request data is incorrectly formatted,\nthe server responds with a `400 Bad Request` status code. The request won't\nsucceed until you fix the data.\n\nTo handle these situations, you need to [plan for final error states](#plan-for).\n\nFor more information on retry logic and timeouts, see\n[Retry failed requests](/healthcare-api/docs/best-practices-data-throughput#retry-failed).\n\nHandle errors at multiple layers\n--------------------------------\n\nWhen middleware interacts with the Cloud Healthcare API, implement retry logic\nand timeouts in the client and middleware. If a client encounters\nerrors past its retry limit, you must be\nable to identify if the error occurred in the client, the middleware, or the\nunderlying Cloud Healthcare API service. This is especially important when\nplanning for final error states.\n\nConsider the following scenario:\n\n1. The middleware receives a `500 Internal Server Error` error from the Cloud Healthcare API when sending a request.\n2. 
Handle errors at multiple layers
--------------------------------

When middleware interacts with the Cloud Healthcare API, implement retry logic
and timeouts in both the client and the middleware. If a client encounters
errors past its retry limit, you must be able to identify whether the error
occurred in the client, the middleware, or the underlying Cloud Healthcare API
service. This is especially important when planning for final error states.

Consider the following scenario:

1. The middleware receives a `500 Internal Server Error` error from the
   Cloud Healthcare API when sending a request.
2. The middleware layer retries the request five more times, reaching its
   limit, and then stops retrying.
3. The client receives a final `500 Internal Server Error` error.

It's important to understand that the error originated in the
Cloud Healthcare API, not the middleware. To simplify debugging, provide this
information in the error returned to the client.

The following diagram shows a scenario where a middleware proxy receives
`500 Internal Server Error` errors when forwarding a request from a client to
the Cloud Healthcare API. The client and proxy both implement error handling
and retries.

**Figure 1.** The layers where you need to implement retry logic and timeouts
are the **Client** and the **Proxy**.

Figure 1 shows the following steps:

1. The client sends a valid request to the Cloud Healthcare API through a
   middleware proxy.
2. The proxy forwards the request to the Cloud Healthcare API.
3. The Cloud Healthcare API returns a `500 Internal Server Error` error to the
   proxy. The proxy retries the request five more times until its retry limit
   is reached.
4. The proxy returns the final error state, `500 Internal Server Error`, to
   the client.

   Using the recommendations shown earlier, you can debug the final error
   state by having the proxy return the following error to the client:

   ```none
   Error with underlying FHIR store in Cloud Healthcare API after 5 retries: 500 Internal Server Error
   ```

   Include any additional information about the error returned from the
   Cloud Healthcare API.

Sometimes, the client or proxy receives `500 Internal Server Error` errors
past its retry limit and can't retry again. In this case, a human might need
to intervene to diagnose whether the error came from the proxy or the
Cloud Healthcare API.

Choose which errors to retry
----------------------------

Depending on your [system's architecture](#system-architecture), you can retry
certain errors and ignore others. The following is a non-exhaustive list of
retryable Cloud Healthcare API error codes:

- `408 Request Timeout`
- `425 Too Early`
- `429 Too Many Requests`
- `500 Internal Server Error`
- `502 Bad Gateway`
- `503 Service Unavailable`
- `504 Gateway Timeout`

These errors typically don't occur at the same frequency, and some might
never occur.

### System architecture effects

Your system's architecture influences how and when you retry errors.

For example, in a direct client-to-server architecture, a client that receives
a `401 UNAUTHENTICATED` error from the Cloud Healthcare API can
re-authenticate and retry its request.

Suppose a system has a middleware layer between the client and the
Cloud Healthcare API. If the client authenticated correctly and an expired
[authentication token](/docs/authentication/token-types) caused the issue,
then the middleware must
[refresh the token](/docs/authentication/token-types#refresh) and retry the
request.

After analyzing final error states, you can adjust the errors your client
retries based on your findings.
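For example, a middleware layer might handle the expired-token case by
refreshing its credentials and retrying the request once. The following is a
minimal sketch, assuming the `google-auth` and `requests` libraries and
module-level credentials; the `forward_request` function and its parameters
are illustrative, not a real middleware API.

```python
"""Sketch: middleware that re-authenticates on 401 UNAUTHENTICATED and
retries once before returning the response to the client.
"""
import google.auth
import google.auth.transport.requests
import requests

credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)


def forward_request(url, timeout_seconds=30):
    """Forwards a GET request, refreshing the access token on a 401."""
    auth_request = google.auth.transport.requests.Request()
    if not credentials.valid:
        credentials.refresh(auth_request)

    headers = {"Authorization": f"Bearer {credentials.token}"}
    response = requests.get(url, headers=headers, timeout=timeout_seconds)

    if response.status_code == 401:
        # The token might have expired mid-flight; refresh it and retry once.
        credentials.refresh(auth_request)
        headers = {"Authorization": f"Bearer {credentials.token}"}
        response = requests.get(url, headers=headers, timeout=timeout_seconds)

    return response
```

Whether the client or the middleware owns this refresh-and-retry step depends
on your architecture, as described earlier.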
Plan for final error states
---------------------------

Even after implementing retry logic and timeouts, a client or middleware might
receive errors until its retries are exhausted. The last error returned when
retries and timeouts are exhausted is the *final error state*. You might
encounter a final error state for data consistency errors.

Sometimes, a final error state requires human intervention. Try to implement a
solution that resolves the final error state for a request. Otherwise, log the
final error state so that a human can review it, as shown in the sketch after
the following list.

Consider the following when planning how to handle final error states:

- Whether there are processing dependencies that need to stop if a FHIR
  transaction or bundle can't complete successfully.
- If many virtual machine (VM) instances start failing permanently, a client
  must report the requests that failed. After the problem is fixed, the client
  must retry the requests.
- Monitoring, alerting systems, and
  [service-level objectives (SLOs)](/stackdriver/docs/solutions/slo-monitoring/ui/create-slo)
  are necessary for ensuring the stability of your system. See
  [Test and monitor](/healthcare-api/docs/best-practices-data-throughput#test-and)
  for more information.
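When you log a final error state, record enough context to show where the
error originated and how many retries were attempted. The following is a
minimal sketch using Python's standard `logging` module; the
`log_final_error_state` function and its fields are illustrative assumptions,
not part of any Cloud Healthcare API client library.

```python
"""Sketch: record a final error state as structured JSON for human review."""
import json
import logging

logger = logging.getLogger("healthcare_client")


def log_final_error_state(method, url, status_code, error_body, retries):
    """Logs enough context to diagnose where the error originated."""
    logger.error(
        json.dumps(
            {
                "message": "Final error state after retries were exhausted",
                # Distinguishes "Cloud Healthcare API" from "client" or
                # "middleware" so a reviewer knows which layer failed.
                "origin": "Cloud Healthcare API",
                "method": method,
                "url": url,
                "status_code": status_code,
                "error_body": error_body,
                "retries": retries,
            }
        )
    )
```

A structured record like this can also feed the monitoring and alerting
systems mentioned in the preceding list.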
Plan for increased latency
--------------------------

The Cloud Healthcare API is a scalable and performant service, but request
latency can still vary for the following reasons:

- Small differences between requests, even if they seem insignificant, can
  cause extra processing time.
- Similar requests might have different latencies. For example, two similar
  requests that add a record to data storage might have different latencies if
  one crosses a threshold that triggers an extra task, like allocating more
  storage.
- The Cloud Healthcare API handles many requests concurrently. The time when a
  client sends a request, measured in fractions of a second, might coincide
  with a time when the Cloud Healthcare API is under heavier load than usual.
- If a Cloud Healthcare API physical resource, such as a disk, is handling
  many requests, it needs to complete its queued tasks before handling other
  requests.
- Sometimes, the Cloud Healthcare API retries errors on the server side, which
  can increase latency for clients.
- There might be multiple copies of data in different data centers in a
  regional or multi-regional location. If your requests are routed across
  multiple data centers, either on the original request or on a retry, there
  might be increased latency.

### Plan using percentile latency

You can plan for increased latency by analyzing the *percentile latency* of
your requests. The following examples describe the *50th percentile latency*
and the *99th percentile latency*:

- The 50th percentile latency is the maximum latency, in seconds, for the
  fastest 50% of requests. For example, if the 50th percentile latency is
  0.5 seconds, then the Cloud Healthcare API processed 50% of requests within
  0.5 seconds. The 50th percentile latency is also called the "median latency".
- The 99th percentile latency is the maximum latency, in seconds, for the
  fastest 99% of requests. For example, if the 99th percentile latency is
  two seconds, then the Cloud Healthcare API processed 99% of requests within
  two seconds.

If you analyze the percentile latency over an interval when the
Cloud Healthcare API only processed a few requests, the percentile latency
might not be useful or indicative of overall performance because outlier
requests can have a large influence.

For example, suppose a process in the Cloud Healthcare API processes
100 requests in 100 minutes. The 99th percentile latency for the 100 minutes
would be based on the single slowest request. A latency measurement using a
single request isn't sufficient for understanding whether there are
performance issues.

Gathering a larger request sample over a longer time period, like 24 hours,
can provide more insight into the overall behavior of your system. You can use
these samples to determine how your system responds to heavy traffic.
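As an illustration, you can compute percentile latency from a sample of
request latencies with Python's standard `statistics` module. The
`latencies_seconds` values below are made-up sample data; in practice you
would collect latencies from your own logs or monitoring system over a period
such as 24 hours.

```python
"""Sketch: compute 50th and 99th percentile latency from a latency sample."""
import statistics

# Hypothetical per-request latencies, in seconds.
latencies_seconds = [0.21, 0.35, 0.42, 0.47, 0.55, 0.61, 0.78, 0.94, 1.3, 2.8]

# quantiles() with n=100 returns the 1st through 99th percentiles; the
# "inclusive" method treats the sample as the full population, so results
# stay within the observed range.
percentiles = statistics.quantiles(latencies_seconds, n=100, method="inclusive")
p50 = percentiles[49]  # 50th percentile (median latency)
p99 = percentiles[98]  # 99th percentile

print(f"p50 latency: {p50:.2f} s, p99 latency: {p99:.2f} s")
```

With only a handful of measurements, the 99th percentile is effectively
determined by the slowest request in the sample, which is why a larger sample
over a longer period gives a more reliable picture of your system's behavior.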