# Safety in Vertex AI

Generative AI models like Gemini require robust safety measures to mitigate risks such as generating harmful content, leaking sensitive information, or being misused. Google Cloud's Vertex AI platform provides a suite of tools and practices to implement holistic safety for your Gemini models.

Potential safety risks and mitigation strategies
------------------------------------------------

When deploying Gemini models, it's crucial to identify and mitigate various potential risks. A proactive approach to understanding these risks allows for more effective implementation of safety measures. A multi-layered approach to safety is critical, as it can mitigate or prevent:

- **Content risks:** These can include harmful content, profanity and sexualization, and violence and gore.
- **Brand safety risks:** Generated content may not align with your brand's tone or values, may endorse competitors or inappropriate products, or may cause reputational damage.
- **Alignment risks:** Generated content may be irrelevant or inaccurate.
- **Security and privacy risks:** Generated content may leak sensitive training data or prompts, or adversarial users may attempt to force the model to override safety protocols or behave in unintended ways.

Our deployed models offer various features to address these potential issues:

- The default model and non-configurable filters provide a general safety net.
- [System instructions](/vertex-ai/generative-ai/docs/multimodal/safety-system-instructions) provide direct guidance to the model on preferred behavior and topics to avoid.
- [Content filters](/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters) let you set specific thresholds for common harm types (see the sketch after this list).
- [Gemini as a filter](/vertex-ai/generative-ai/docs/multimodal/gemini-for-filtering-and-moderation) offers an advanced, customizable checkpoint for complex or nuanced safety concerns that the preceding layers might miss or that require more context-aware evaluation.
- [DLP](/sensitive-data-protection/docs/sensitive-data-protection-overview#api) specifically addresses the critical risk of sensitive data leakage when the model has access to sensitive data. It also lets you create custom block lists.
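The following is a minimal sketch of how the system-instruction and content-filter layers can be combined when instantiating a model with the Vertex AI Python SDK. The project ID, region, model name, instructions, and threshold choices are illustrative assumptions, not recommendations:

```python
import vertexai
from vertexai.generative_models import (
    GenerativeModel,
    HarmBlockThreshold,
    HarmCategory,
    SafetySetting,
)

# Assumed project and region; replace with your own values.
vertexai.init(project="your-project-id", location="us-central1")

model = GenerativeModel(
    "gemini-1.5-flash",  # illustrative model name
    # System instructions steer preferred behavior and topics to avoid.
    system_instruction=[
        "You are a customer support assistant for a software product.",
        "Do not discuss competitors or give medical, legal, or financial advice.",
    ],
    # Content filters: per-category thresholds for common harm types.
    safety_settings=[
        SafetySetting(
            category=HarmCategory.HARM_CATEGORY_HARASSMENT,
            threshold=HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
        ),
        SafetySetting(
            category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
            threshold=HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        ),
    ],
)

response = model.generate_content("How do I reset my password?")
print(response.text)
```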
### Available safety tools in Vertex AI for Gemini

Vertex AI offers several tools to manage the safety of your Gemini models. Understanding how each works, their considerations, and ideal use cases will help you build a tailored safety solution.

The DLP API can inspect text to identify and classify sensitive information against a range of predefined and custom infoType detectors. Once identified, de-identification techniques such as redaction, masking, or tokenization can be applied. The DLP API can also be used to block keywords.

- **Input protection:** Before sending user prompts or data to Gemini, pass the text through the DLP API to redact or mask any sensitive information. This prevents the model from processing or logging sensitive data (see the first sketch after this list).
- **Output protection:** If Gemini could inadvertently generate or reveal sensitive information (for example, when it summarizes source documents containing PII), scan the model output with the DLP API before returning it to the user (see the second sketch after this list).
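Here is a minimal input-protection sketch using the Sensitive Data Protection (DLP) Python client. The project ID and the chosen infoTypes are assumptions for illustration:

```python
import google.cloud.dlp_v2

def redact_prompt(project_id: str, text: str) -> str:
    """De-identify a user prompt before it is sent to Gemini.

    Each detected value is replaced with its infoType name,
    e.g. "Call me at 555-0100" -> "Call me at [PHONE_NUMBER]".
    """
    dlp = google.cloud.dlp_v2.DlpServiceClient()
    response = dlp.deidentify_content(
        request={
            "parent": f"projects/{project_id}/locations/global",
            # Illustrative detectors; add the infoTypes relevant to your data.
            "inspect_config": {
                "info_types": [
                    {"name": "EMAIL_ADDRESS"},
                    {"name": "PHONE_NUMBER"},
                    {"name": "PERSON_NAME"},
                ]
            },
            # Replace each finding with its infoType name.
            "deidentify_config": {
                "info_type_transformations": {
                    "transformations": [
                        {
                            "primitive_transformation": {
                                "replace_with_info_type_config": {}
                            }
                        }
                    ]
                }
            },
            "item": {"value": text},
        }
    )
    return response.item.value

# Usage: redact before the model ever sees the raw prompt.
# redacted = redact_prompt("your-project-id", user_prompt)
# model.generate_content(redacted)
```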
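And a corresponding output-protection sketch, which scans a model response for findings before it is shown to the user (again, the infoTypes and likelihood threshold are assumptions):

```python
import google.cloud.dlp_v2

def response_is_clean(project_id: str, model_output: str) -> bool:
    """Return False if the model output contains likely sensitive data."""
    dlp = google.cloud.dlp_v2.DlpServiceClient()
    response = dlp.inspect_content(
        request={
            "parent": f"projects/{project_id}/locations/global",
            "inspect_config": {
                # Illustrative detectors and sensitivity threshold.
                "info_types": [
                    {"name": "EMAIL_ADDRESS"},
                    {"name": "US_SOCIAL_SECURITY_NUMBER"},
                ],
                "min_likelihood": google.cloud.dlp_v2.Likelihood.POSSIBLE,
            },
            "item": {"value": model_output},
        }
    )
    return len(response.result.findings) == 0
```

Depending on your policy, you might instead redact the findings with `deidentify_content` (as in the input-protection sketch) rather than blocking the response outright.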
[[["易于理解","easyToUnderstand","thumb-up"],["解决了我的问题","solvedMyProblem","thumb-up"],["其他","otherUp","thumb-up"]],[["很难理解","hardToUnderstand","thumb-down"],["信息或示例代码不正确","incorrectInformationOrSampleCode","thumb-down"],["没有我需要的信息/示例","missingTheInformationSamplesINeed","thumb-down"],["翻译问题","translationIssue","thumb-down"],["其他","otherDown","thumb-down"]],["最后更新时间 (UTC):2025-09-04。"],[],[],null,["# Safety in Vertex AI\n\nGenerative AI models like Gemini require robust safety measures to\nmitigate risks such as generating harmful content, leaking sensitive\ninformation, or being misused. Google Cloud's Vertex AI platform\nprovides a suite of tools and practices to implement holistic safety for your\nGemini models.\n\nPotential safety risks and mitigation strategies\n------------------------------------------------\n\nWhen deploying Gemini models, it's crucial to identify and mitigate\nvarious potential risks. A proactive approach to understanding these risks\nallows for more effective implementation of safety measures. A multi-layered\napproach to safety is critical, as it can mitigate or prevent:\n\n- **Content risks:** These can include content that's harmful, profanity and sexualization, and violence and gore.\n- **Brand safety risks:** Generated content may not align with your brand's tone or values, it may endorse competitors or inappropriate products, or generate content that can result in reputational damage.\n- **Alignment risks:** Generated content may be irrelevant or inaccurate.\n- **Security and privacy risks:** Generated content may leak sensitive training data or prompts, or adversarial users may attempt to force the model to override safety protocols or behave in unintended ways.\n\nOur deployed models offer various features to address these potential issues:\n\n- The default model and non-configurable filters provide a general safety net.\n- [System instructions](/vertex-ai/generative-ai/docs/multimodal/safety-system-instructions) provide direct guidance to the model on preferred behavior and topics to avoid.\n- [Content filters](/vertex-ai/generative-ai/docs/multimodal/configure-safety-filters) allow you to set specific thresholds for common harm types.\n- [Gemini as a filter](/vertex-ai/generative-ai/docs/multimodal/gemini-for-filtering-and-moderation) offers an advanced, customizable checkpoint for complex or nuanced safety concerns that might be missed by the preceding layers or require more context-aware evaluation.\n- [DLP](/sensitive-data-protection/docs/sensitive-data-protection-overview#api) specifically addresses the critical risk of sensitive data leakage, in case the model has access to sensitive data. It also enables the ability to create custom block lists.\n\n### Available safety tools in Vertex AI for Gemini\n\nVertex AI offers several tools to manage the safety of your\nGemini models. Understanding how each works, their considerations, and\nideal use cases will help you build a tailored safety solution.\n\n### Continuous safety evaluation\n\nContinuous safety evaluation is crucial for AI systems, as the AI landscape and\nmisuse methods are constantly evolving. Regular evaluations help identify\nvulnerabilities, assess mitigation effectiveness, adapt to evolving risks,\nensure alignment with policies and values, build trust, and maintain compliance.\nVarious evaluation types, including development evaluations, assurance\nevaluations, red teaming, external evaluations, and benchmark testing, help\nachieve this. 
The scope of evaluation should cover content safety, brand safety, relevance, bias and fairness, truthfulness, and robustness to adversarial attacks. Tools like Vertex AI's [Gen AI evaluation service](/vertex-ai/generative-ai/docs/models/evaluation-overview) can assist in these efforts; iterative improvement based on evaluation findings is essential for responsible AI development.
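As one concrete starting point, here is a sketch of a safety-focused evaluation run with the Gen AI evaluation service in the Vertex AI Python SDK. It assumes a recent SDK release where the `vertexai.evaluation` module and the built-in `"safety"` metric are available; the dataset is a toy bring-your-own-response example:

```python
import pandas as pd
import vertexai
from vertexai.evaluation import EvalTask

vertexai.init(project="your-project-id", location="us-central1")

# Toy dataset: prompts paired with previously generated model responses.
eval_dataset = pd.DataFrame(
    {
        "prompt": [
            "Summarize this support ticket.",
            "Write a product announcement.",
        ],
        "response": [
            "The customer reports a billing error on their last invoice.",
            "We are excited to announce our new release.",
        ],
    }
)

# "safety" is a built-in model-based pointwise metric.
eval_task = EvalTask(
    dataset=eval_dataset,
    metrics=["safety"],
    experiment="safety-eval-demo",  # illustrative experiment name
)

result = eval_task.evaluate()
print(result.summary_metrics)
```

Re-running an evaluation like this after each mitigation change (adjusted filters, new system instructions) is one way to make the iterative improvement loop described above concrete.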