# Failure modes

Last updated: 2025-09-04 (UTC).

A failure mode is an incorrect application state that triggers an alert. The application must recover from a failure mode to run successfully. For example, the system raises an alert when the AI pre-trained APIs still aren't ready for use after the designated enable time limit. If a failure mode occurs and the application cannot recover, contact your Infrastructure Operator for help.

The following failure modes (FMs) might occur and trigger an alert:

- [Service readiness failures](#service-readiness-failures)
- [AI data-plane runtime failures](#ai-data-plane-runtime-failures)
- [User interface failures](#user-interface-failures)

### Service readiness failures

Service readiness failures occur because of one of the following FMs:

- **FM1 - Unable to schedule workloads**: One or more of the AI service workloads cannot be scheduled because of a lack of resources, such as GPU or memory, or because of some other error.
- **FM3 - Unable to configure components**: One of the required components of an AI service, such as DNS or Ingress, cannot be configured or created because of incorrect permissions or other issues.
- **FM4 - Services not reaching the `Enabled` status**: The pre-trained services do not become ready after the enablement process starts. The page keeps displaying the `Enabling` status for one or more services and, possibly, for the AI infrastructure, without changing to the `Enabled` status.

### User interface failures

User interface failures occur because of one of the following FMs:

- **Frontend and backend communication failure**: The page displays an error message reporting a problem with backend communication. Error log entries have codes from `AIPL0500` to `AIPL0502`.
- **Service API endpoints aren't displayed on the page**: If there is an error, the page shows the `Unable to fetch the endpoint` message instead of the endpoint.
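
When you investigate a frontend and backend communication failure, it can help to filter the error log for the codes `AIPL0500` to `AIPL0502` mentioned above. The following is a minimal sketch, not an official tool. It assumes that each log entry is a line of plain text containing the code verbatim. The function name `find_communication_failures`, the sample log lines, and the code `AIPL0100` are hypothetical and used only for illustration.

```python
import re

# Assumption: error codes appear in log lines as "AIPL" followed by four digits.
AIPL_CODE = re.compile(r"\bAIPL(\d{4})\b")

# Codes associated with frontend/backend communication failures:
# AIPL0500, AIPL0501, and AIPL0502.
COMMUNICATION_FAILURE_CODES = range(500, 503)


def find_communication_failures(log_lines):
    """Return (code, line) pairs for entries with codes AIPL0500 to AIPL0502."""
    matches = []
    for line in log_lines:
        for m in AIPL_CODE.finditer(line):
            if int(m.group(1)) in COMMUNICATION_FAILURE_CODES:
                matches.append((m.group(0), line))
    return matches


# Hypothetical log lines; the real log format may differ.
sample_log = [
    "2025-09-04T10:00:00Z ERROR AIPL0500 backend request failed",
    "2025-09-04T10:00:05Z INFO AIPL0100 page loaded",
    "2025-09-04T10:00:09Z ERROR AIPL0502 backend unreachable",
]

for code, line in find_communication_failures(sample_log):
    print(code, "->", line)
```

If your logs are structured (for example, JSON entries with a dedicated code field), match on that field instead of scanning the raw text.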