# Failure modes

Last updated 2025-09-04 UTC.

A failure mode is an incorrect application state that triggers an alert. The application must recover from a failure mode to run successfully. For example, the system raises an alert when the AI pre-trained APIs are not ready for use within the designated enablement time limit. If a failure mode occurs and the application cannot recover, contact your Infrastructure Operator for help.

The following failure modes (FMs) might occur and trigger an alert:

- [Service readiness failures](#service-readiness-failures)
- [AI data-plane runtime failures](#ai-data-plane-runtime-failures)
- [User interface failures](#user-interface-failures)

### Service readiness failures

Service readiness failures occur because of one of the following FMs:

- **FM1 - Unable to schedule workloads**: One or more of the AI service workloads cannot be scheduled because of a lack of resources, such as GPU or memory, or because of another error.
- **FM3 - Unable to configure components**: One of the required components of an AI service, for example DNS or Ingress, cannot be configured or created because of incorrect permissions or other issues.
- **FM4 - Services not reaching the `Enabled` status**: The pre-trained services do not become ready after the enablement process starts. The page keeps displaying the `Enabling` status for one or more services, and possibly for the AI infrastructure, without changing to the `Enabled` status.

### User interface failures

User interface failures occur because of one of the following FMs:

- **Frontend and backend communication failure**: The page displays an error message that reports a backend communication problem. The corresponding error log entries have codes from `AIPL0500` to `AIPL0502`.
- **Service API endpoints aren't displayed on the page**: If there is an error, the page shows the `Unable to fetch the endpoint` message instead of the endpoint.
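
When investigating a frontend and backend communication failure, it can help to pull only the log entries that carry the codes `AIPL0500` to `AIPL0502`. The following is a minimal sketch, not part of the product: it assumes that log entries are available as plain-text lines containing the code verbatim, and the function name `find_backend_communication_errors` and the sample log layout are hypothetical.

```python
import re

# The code range AIPL0500 to AIPL0502 comes from the failure mode description.
# Assumption: each log entry is one plain-text line that contains the code verbatim.
UI_BACKEND_CODES = re.compile(r"\bAIPL050[0-2]\b")


def find_backend_communication_errors(log_lines):
    """Return (line_number, code, text) tuples for entries with codes AIPL0500 to AIPL0502."""
    matches = []
    for line_number, line in enumerate(log_lines, start=1):
        found = UI_BACKEND_CODES.search(line)
        if found:
            matches.append((line_number, found.group(0), line.rstrip("\n")))
    return matches


if __name__ == "__main__":
    # Hypothetical sample entries, for illustration only.
    sample_log = [
        "2025-09-04T10:00:00Z INFO frontend started",
        "2025-09-04T10:00:05Z ERROR AIPL0501 backend unreachable",
        "2025-09-04T10:00:09Z ERROR AIPL0503 unrelated code",
    ]
    for line_number, code, text in find_backend_communication_errors(sample_log):
        print(f"line {line_number}: {code}: {text}")
```

The word boundaries in the pattern keep codes outside the documented range, such as `AIPL0503`, from matching. Entries found this way are useful context to share with your Infrastructure Operator if the application cannot recover.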