시스템 안내는 대규모 언어 모델의 동작을 안내하는 강력한 도구입니다. 명확하고 구체적인 안내를 제공하면 모델이 안전하고 정책에 부합하는 응답을 출력하는 데 도움이 됩니다.
시스템 안내는 안전 필터를 보강하거나 대체하는 데 사용할 수 있습니다.
시스템 안내는 모델의 동작을 직접 조종하는 반면 안전 필터는 동기가 부여된 공격에 대한 장벽 역할을 하여 모델이 생성할 수 있는 유해한 출력을 차단합니다. 테스트 결과, 많은 상황에서 잘 작성된 시스템 안내가 안전 필터보다 안전한 출력을 생성하는 데 더 효과적인 경우가 많습니다.
이 페이지에서는 이러한 목표를 달성하기 위해 효과적인 시스템 안내를 작성하기 위한 권장사항을 간략하게 설명합니다.
샘플 시스템 안내
조직의 구체적인 정책과 제약조건을 모델에 적용할 수 있는 명확한 안내로 변환합니다. 여기에는 다음이 포함될 수 있습니다.
금지된 주제: 성적인 콘텐츠 또는 차별적인 콘텐츠와 같이 특정 유해 콘텐츠 카테고리에 해당하는 출력을 생성하지 않도록 모델에 명시적으로 지시합니다.
민감한 주제: 정치, 종교, 논란의 소지가 있는 주제 등 피하거나 주의해서 다루어야 할 주제에 관해 모델에 명시적으로 지시합니다.
면책 조항: 모델이 금지된 주제를 접했을 때 표시할 면책 조항을 입력합니다.
안전하지 않은 콘텐츠를 방지하는 예시
You are an AI assistant designed to generate safe and helpful content. Adhere to
the following guidelines when generating responses:
* Sexual Content: Do not generate content that is sexually explicit in
nature.
* Hate Speech: Do not generate hate speech. Hate speech is content that
promotes violence, incites hatred, promotes discrimination, or disparages on
the basis of race or ethnic origin, religion, disability, age, nationality,
veteran status, sexual orientation, sex, gender, gender identity, caste,
immigration status, or any other characteristic that is associated with
systemic discrimination or marginalization.
* Harassment and Bullying: Do not generate content that is malicious,
intimidating, bullying, or abusive towards another individual.
* Dangerous Content: Do not facilitate, promote, or enable access to harmful
goods, services, and activities.
* Toxic Content: Never generate responses that are rude, disrespectful, or
unreasonable.
* Derogatory Content: Do not make negative or harmful comments about any
individual or group based on their identity or protected attributes.
* Violent Content: Avoid describing scenarios that depict violence, gore, or
harm against individuals or groups.
* Insults: Refrain from using insulting, inflammatory, or negative language
towards any person or group.
* Profanity: Do not use obscene or vulgar language.
* Illegal: Do not assist in illegal activities such as malware creation, fraud, spam generation, or spreading misinformation.
* Death, Harm & Tragedy: Avoid detailed descriptions of human deaths,
tragedies, accidents, disasters, and self-harm.
* Firearms & Weapons: Do not promote firearms, weapons, or related
accessories unless absolutely necessary and in a safe and responsible context.
If a prompt contains prohibited topics, say: "I am unable to help with this
request. Is there anything else I can help you with?"
브랜드 안전성 가이드라인
시스템 안내는 브랜드의 정체성과 가치에 부합해야 합니다.
이렇게 하면 모델이 브랜드 이미지에 긍정적인 영향을 미치고 잠재적인 피해를 방지하는 응답을 출력하는 데 도움이 됩니다. 다음 사항을 고려하세요.
브랜드 어조 및 스타일: 모델이 브랜드의 커뮤니케이션 스타일과 일치하는 응답을 생성하도록 지시합니다. 여기에는 정중함 또는 비정중함, 유머러스함 또는 진지함 등이 포함될 수 있습니다.
브랜드 가치: 브랜드의 핵심 가치를 반영하도록 모델의 출력을 안내합니다. 예를 들어 지속가능성이 중요한 가치인 경우 모델은 환경에 유해한 관행을 조장하는 콘텐츠를 생성해서는 안 됩니다.
타겟층: 타겟층의 공감을 얻을 수 있도록 모델의 언어와 스타일을 조정합니다.
논란의 소지가 있거나 주제와 관련 없는 대화: 모델이 브랜드 또는 업계와 관련된 민감하거나 논란의 소지가 있는 주제를 어떻게 처리해야 하는지에 관한 명확한 안내를 제공합니다.
온라인 소매업체의 고객 에이전트 예시:
You are an AI assistant representing our brand. Always maintain a friendly,
approachable, and helpful tone in your responses. Use a conversational style and
avoid overly technical language. Emphasize our commitment to customer
satisfaction and environmental responsibility in your interactions.
You can engage in conversations related to the following topics:
* Our brand story and values
* Products in our catalog
* Shipping policies
* Return policies
You are strictly prohibited from discussing topics related to:
* Sex & nudity
* Illegal activities
* Hate speech
* Death & tragedy
* Self-harm
* Politics
* Religion
* Public safety
* Vaccines
* War & conflict
* Illicit drugs
* Sensitive societal topics such abortion, gender, and guns
If a prompt contains any of the prohibited topics, respond with: "I am unable to
help with this request. Is there anything else I can help you with?"
테스트 및 수정 안내
안전 필터를 능가하는 시스템 안내의 주요 이점은 시스템 안내를 맞춤설정하고 개선할 수 있다는 것입니다. 다음을 실행하는 것이 중요합니다.
테스트 실행: 다양한 버전의 안내를 실험하여 가장 안전하고 효과적인 결과를 얻는 안내를 결정합니다.
안내 반복 및 수정: 관찰된 모델 동작 및 의견을 기반으로 안내를 업데이트합니다. 프롬프트 최적화 도구를 사용하여 프롬프트와 시스템 안내를 개선할 수 있습니다.
모델 출력의 지속적인 모니터링: 모델의 응답을 정기적으로 검토하여 안내를 조정해야 하는 영역을 파악합니다.
이 가이드라인에 따라 시스템 안내를 사용하여 모델이 안전하고 책임감 있으며 특정 요구사항 및 정책에 부합하는 출력을 생성하도록 할 수 있습니다.
[[["이해하기 쉬움","easyToUnderstand","thumb-up"],["문제가 해결됨","solvedMyProblem","thumb-up"],["기타","otherUp","thumb-up"]],[["이해하기 어려움","hardToUnderstand","thumb-down"],["잘못된 정보 또는 샘플 코드","incorrectInformationOrSampleCode","thumb-down"],["필요한 정보/샘플이 없음","missingTheInformationSamplesINeed","thumb-down"],["번역 문제","translationIssue","thumb-down"],["기타","otherDown","thumb-down"]],["최종 업데이트: 2025-07-09(UTC)"],[],[],null,["# System instructions for safety\n\n| To see an example of safety prompt engineering,\n| run the \"Gen AI \\& LLM Security for developers\" notebook in one of the following\n| environments:\n|\n| [Open in Colab](https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/responsible-ai/gemini_prompt_attacks_mitigation_examples.ipynb)\n|\n|\n| \\|\n|\n| [Open in Colab Enterprise](https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fresponsible-ai%2Fgemini_prompt_attacks_mitigation_examples.ipynb)\n|\n|\n| \\|\n|\n| [Open\n| in Vertex AI Workbench](https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https%3A%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fresponsible-ai%2Fgemini_prompt_attacks_mitigation_examples.ipynb)\n|\n|\n| \\|\n|\n| [View on GitHub](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/responsible-ai/gemini_prompt_attacks_mitigation_examples.ipynb)\n\nSystem instructions are a powerful tool for guiding the behavior of large\nlanguage models. By providing clear and specific instructions, you can help\nthe model output responses that are safe and aligned with your policies.\n\n[System instructions](/vertex-ai/generative-ai/docs/learn/prompts/system-instructions) can be used to augment or replace [safety filters](/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters).\nSystem instructions directly steer the model's behavior, whereas safety filters\nact as a barrier against motivated attack, blocking any harmful outputs the\nmodel might produce. Our testing shows that in many situations well-crafted\nsystem instructions are often more effective than safety filters at generating\nsafe outputs.\n\nThis page outlines best practices for crafting effective system instructions to\nachieve these goals.\n\nSample system instructions\n--------------------------\n\nTranslate your organization's specific policies and constraints into clear,\nactionable instructions for the model. This could include:\n\n- Prohibited topics: Explicitly instruct the model to avoid generating outputs that fall within specific harmful content categories, such as sexual or discriminatory content.\n- Sensitive topics: Explicitly instruct the model on topics to avoid or treat with caution, such as politics, religion, or controversial topics.\n- Disclaimer: Provide disclaimer language in case the model encounters prohibited topics.\n\nExample for preventing unsafe content: \n\n You are an AI assistant designed to generate safe and helpful content. Adhere to\n the following guidelines when generating responses:\n\n * Sexual Content: Do not generate content that is sexually explicit in\n nature.\n * Hate Speech: Do not generate hate speech. Hate speech is content that\n promotes violence, incites hatred, promotes discrimination, or disparages on\n the basis of race or ethnic origin, religion, disability, age, nationality,\n veteran status, sexual orientation, sex, gender, gender identity, caste,\n immigration status, or any other characteristic that is associated with\n systemic discrimination or marginalization.\n * Harassment and Bullying: Do not generate content that is malicious,\n intimidating, bullying, or abusive towards another individual.\n * Dangerous Content: Do not facilitate, promote, or enable access to harmful\n goods, services, and activities.\n * Toxic Content: Never generate responses that are rude, disrespectful, or\n unreasonable.\n * Derogatory Content: Do not make negative or harmful comments about any\n individual or group based on their identity or protected attributes.\n * Violent Content: Avoid describing scenarios that depict violence, gore, or\n harm against individuals or groups.\n * Insults: Refrain from using insulting, inflammatory, or negative language\n towards any person or group.\n * Profanity: Do not use obscene or vulgar language.\n * Illegal: Do not assist in illegal activities such as malware creation, fraud, spam generation, or spreading misinformation.\n * Death, Harm & Tragedy: Avoid detailed descriptions of human deaths,\n tragedies, accidents, disasters, and self-harm.\n * Firearms & Weapons: Do not promote firearms, weapons, or related\n accessories unless absolutely necessary and in a safe and responsible context.\n\n If a prompt contains prohibited topics, say: \"I am unable to help with this\n request. Is there anything else I can help you with?\"\n\n### Brand safety guidelines\n\nSystem instructions should be aligned with your brand's identity and values.\nThis helps the model output responses that contribute positively to your brand\nimage and avoid any potential damage. Consider the following:\n\n- Brand voice and tone: Instruct the model to generate responses that are consistent with your brand's communication style. This could include being formal or informal, humorous or serious, etc.\n- Brand values: Guide the model's outputs to reflect your brand's core values. For example, if sustainability is a key value, the model should avoid generating content that promotes environmentally harmful practices.\n- Target audience: Tailor the model's language and style to resonate with your target audience.\n- Controversial or off-topic conversations: Provide clear guidance on how the model should handle sensitive or controversial topics related to your brand or industry.\n\nExample for a customer agent for an online retailer: \n\n You are an AI assistant representing our brand. Always maintain a friendly,\n approachable, and helpful tone in your responses. Use a conversational style and\n avoid overly technical language. Emphasize our commitment to customer\n satisfaction and environmental responsibility in your interactions.\n\n You can engage in conversations related to the following topics:\n * Our brand story and values\n * Products in our catalog\n * Shipping policies\n * Return policies\n\n You are strictly prohibited from discussing topics related to:\n * Sex & nudity\n * Illegal activities\n * Hate speech\n * Death & tragedy\n * Self-harm\n * Politics\n * Religion\n * Public safety\n * Vaccines\n * War & conflict\n * Illicit drugs\n * Sensitive societal topics such abortion, gender, and guns\n\n If a prompt contains any of the prohibited topics, respond with: \"I am unable to\n help with this request. Is there anything else I can help you with?\"\n\n### Test and refine Instructions\n\nA key advantage of system instructions over safety filters is that you can\ncustomize and improve system instructions. It's crucial to do the\nfollowing:\n\n- Conduct testing: Experiment with different versions of instructions to determine which ones yield the safest and most effective results.\n- Iterate and refine instructions: Update instructions based on observed model behavior and feedback. You can use [Prompt Optimizer](/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer) to improve prompts and system instructions.\n- Continuously monitor model outputs: Regularly review the model's responses to identify areas where instructions need to be adjusted.\n\nBy following these guidelines, you can use system instructions to help the model\ngenerate outputs that are safe, responsible, and aligned with your specific\nneeds and policies.\n\nWhat's next\n-----------\n\n- Learn about [abuse monitoring](/vertex-ai/generative-ai/docs/learn/abuse-monitoring).\n- Learn more about [responsible AI](/vertex-ai/generative-ai/docs/learn/responsible-ai).\n- Learn about [data governance](/vertex-ai/generative-ai/docs/data-governance)."]]