Enhancing Safety Moderation with AI: A Deep Dive

A closer look at how we can integrate AI to support the expertise of human moderators while scaling our efforts to maintain a safe gaming environment.

October 28, 2024

AI is one of the most transformative technologies of our time, and its potential to revolutionize content moderation in gaming is immense. At Microsoft, we are committed to leveraging AI to enhance human expertise in moderation and scale our efforts to keep our community safe. Our approach is grounded in Microsoft's six Responsible AI principles: Fairness, Reliability and Safety, Privacy and Security, Inclusiveness, Transparency, and Accountability. These principles guide our use and development of AI systems, ensuring they are applied responsibly and in ways that benefit people.

We see AI as an invaluable tool for providing our moderators with information and analysis at unprecedented scale and speed. This helps trust and safety subject matter experts catch potential harms more efficiently: by integrating AI, they can scale their efforts across a broader range of content and focus on more complex and nuanced challenges. As they lean on AI to handle high-volume, low-difficulty, and repetitive tasks, they retain full control over the moderation process, with AI supporting and empowering their work.

Community managers and trust and safety experts are also able to monitor and adjust how AI is applied in the context of their work. This oversight, operating at the "speed of trust," fosters a collaborative environment where human expertise and AI capabilities work in tandem. Our experience in Trust and Safety shows that combining AI solutions with human moderators is effective at keeping our players safe and allows us to build the most vibrant communities. We are working diligently to continue this approach in the age of AI-enabled solutions.

Our AI-enhanced safety strategy at Microsoft is designed to address the challenges of content moderation dynamically and effectively. By integrating AI, we can support the expertise of human moderators while scaling our efforts to maintain a safe gaming environment. Here’s how we do it: 

Customer-Defined Safety Policy 

The foundation of our strategy begins with understanding and implementing customer-defined safety policies. Community Managers define what is permissible on their platforms, creating a clear set of guidelines for our AI systems to follow. 

  • Policy Definition: Customers, represented by Community Managers, specify their safety requirements (in plain language), delineating what content is acceptable and what is not. 
  • AI Translation: Our AI systems then translate these guidelines into actionable system configurations, leveraging templates and predefined verticals to ensure accurate and quick deployment. 
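
To make this policy-to-configuration translation more concrete, here is a minimal Python sketch of the kind of output it might produce. The names (SafetyPolicy, ModerationConfig, build_configuration) and the rule-based logic are illustrative assumptions standing in for the AI interpretation, not our production system.

    from dataclasses import dataclass, field

    @dataclass
    class SafetyPolicy:
        """Plain-language safety requirements as a Community Manager might state them."""
        community: str
        allowed_topics: list[str]
        disallowed_examples: list[str]   # e.g. "no sharing of phone numbers"
        audience: str = "teen"           # informs how strict the defaults should be

    @dataclass
    class ModerationConfig:
        """Actionable system configuration derived from the policy."""
        languages: list[str]
        harm_categories: dict[str, str]  # category -> enforcement level
        notes: list[str] = field(default_factory=list)

    def build_configuration(policy: SafetyPolicy) -> ModerationConfig:
        """Translate a plain-language policy into a starter configuration.
        A simple rule-based stand-in illustrates the shape of the output that
        an AI interpretation of the free-text requirements would produce."""
        strictness = "block" if policy.audience in ("child", "teen") else "flag"
        config = ModerationConfig(
            languages=["en"],
            harm_categories={
                "hate_speech": "block",   # never acceptable, regardless of audience
                "profanity": strictness,
                "personal_info": "block",
            },
        )
        config.notes.extend(policy.disallowed_examples)
        return config

    policy = SafetyPolicy(
        community="ExampleRacingGame",
        allowed_topics=["gameplay", "trading"],
        disallowed_examples=["no sharing of phone numbers"],
    )
    print(build_configuration(policy))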

Automated Configurations and Templates 

To streamline the setup process, we utilize AI to automate configurations based on customer policy. This approach not only speeds up deployment but also ensures consistent enforcement across different platforms. 

  • Template Utilization: Pre-designed templates act as starter configurations tailored to the specific needs of each platform, including language support and issue identification. 
  • Enforcement Flexibility: AI configurations provide flexibility, allowing customers to specify different levels of enforcement based on varying scenarios. 
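
One way such a starter template with scenario-specific enforcement could be expressed is sketched below; the schema and field names are hypothetical, not our actual configuration format.

    # Hypothetical starter template: supported languages, the issues the platform
    # cares about, and how strictly each issue is enforced in different scenarios.
    STARTER_TEMPLATE = {
        "languages": ["en", "es", "fr"],
        "issues": ["hate_speech", "bullying", "spam"],
        "enforcement": {
            "username":   {"hate_speech": "block", "bullying": "block", "spam": "block"},
            "live_chat":  {"hate_speech": "block", "bullying": "block", "spam": "flag"},
            "forum_post": {"hate_speech": "block", "bullying": "flag",  "spam": "flag"},
        },
    }

    def enforcement_for(scenario: str, issue: str) -> str:
        """Look up the enforcement level for an issue in a given scenario."""
        return STARTER_TEMPLATE["enforcement"].get(scenario, {}).get(issue, "flag")

    print(enforcement_for("live_chat", "spam"))   # -> flag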

Content Classification and Moderation 

Our system employs efficient content classification and moderation through advanced language models (LMs), enabling real-time decision-making and enhancing the accuracy of our moderation efforts. 

  • Classification Process: Content is categorized and annotated for thorough analysis. 
  • Real-Time Decision-Making: AI makes rapid accept/reject decisions within milliseconds, ensuring minimal delay in content moderation while providing contextual explanations to customers. 
  • Detailed Labeling: Content is labeled with various potential harms, topics, and other relevant factors to guide the decision-making process. 
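
In simplified form, the classification and decision flow described above might look like the sketch below. ModerationDecision and classify are illustrative names, and the keyword lookup is only a stand-in for the language-model classifier.

    from dataclasses import dataclass

    @dataclass
    class ModerationDecision:
        accepted: bool
        labels: list[str]   # potential harms, topics, and other relevant factors
        explanation: str    # contextual explanation surfaced to the customer

    def classify(text: str, blocked_labels: set[str]) -> ModerationDecision:
        """Label content and make an accept/reject decision against the configuration.
        A keyword lookup stands in here for the language-model classifier."""
        labels = []
        if "idiot" in text.lower():
            labels.append("bullying")
        if "buy now" in text.lower():
            labels.append("spam")
        rejected = [label for label in labels if label in blocked_labels]
        return ModerationDecision(
            accepted=not rejected,
            labels=labels,
            explanation=(f"Rejected: matched blocked label(s) {rejected}"
                         if rejected else "Accepted: no blocked labels detected"),
        )

    print(classify("you idiot", blocked_labels={"bullying"}))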

Continuous Learning and Improvement 

A cornerstone of our approach is a continuous learning and improvement loop, which keeps our AI systems evolving and refining their processes while ensuring that no data or learnings spill over between our customers. A simplified sketch of this loop follows the list below.

  • Feedback Loop: Direct feedback from content classification is utilized for continuous refinement and improvement of our AI systems. 
  • System Enhancement: This iterative process allows for the consistent alignment of the AI with customer-defined policy, enhancing moderation accuracy. 
  • Policy Testing: Before being activated, our systems are thoroughly tested against a variety of content to predict the impact of customer-defined safety measures. 
  • Human Moderator Decisions: Each moderator decision is analyzed to determine whether policy definitions need adjustment; suggested adjustments then flow to Community Managers for review.
  • New Harm Detection: The learning loop examines the aggregate view of decisions within a community to surface new trends or patterns that warrant human review and action.
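
As a rough illustration of this loop, the sketch below aggregates moderator decisions per label and flags labels where moderators often overturned the automated decision. The review_queue helper, its field names, and the threshold are assumptions made for illustration.

    from collections import Counter

    def review_queue(decisions: list[dict], disagreement_threshold: float = 0.2) -> list[str]:
        """Aggregate moderator decisions per label and flag labels where moderators
        frequently overturned the automated decision, a hint that the policy
        definition (or the classifier) may need a Community Manager's attention."""
        totals, overturned = Counter(), Counter()
        for decision in decisions:
            for label in decision["labels"]:
                totals[label] += 1
                if decision["moderator_action"] != decision["ai_action"]:
                    overturned[label] += 1
        return [
            label for label, count in totals.items()
            if overturned[label] / count >= disagreement_threshold
        ]

    decisions = [
        {"labels": ["spam"], "ai_action": "reject", "moderator_action": "accept"},
        {"labels": ["spam"], "ai_action": "reject", "moderator_action": "reject"},
        {"labels": ["bullying"], "ai_action": "reject", "moderator_action": "reject"},
    ]
    print(review_queue(decisions))   # -> ['spam']: half the decisions were overturned

In practice, flagged labels would feed the policy definition review described above rather than changing any configuration automatically.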

Transparency and Trust Building 

Transparency is key to building trust with our customers. We ensure that operational details and decision-making processes are transparent and accessible. 

  • Insights System: This system provides valuable insights to Community Managers about actions taken and the effectiveness of safety enforcement on their platforms. 
  • Transparency Reporting: Detailed reports offer customers in-depth understanding of system activities, helping them stay informed and confident in the moderation process. 
  • Organizational Reporting: Customers can use these insights for internal reporting, fostering a well-informed and transparent safety protocol within their organizations. 
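
A transparency summary for Community Managers could be assembled along the lines of the sketch below; the report fields are illustrative rather than our actual reporting schema.

    from collections import Counter

    def transparency_report(decisions: list[dict]) -> dict:
        """Summarize moderation activity for Community Managers: volume handled,
        actions taken, and the most common reasons content was rejected."""
        actions = Counter(decision["action"] for decision in decisions)
        reasons = Counter(
            label
            for decision in decisions if decision["action"] == "reject"
            for label in decision["labels"]
        )
        return {
            "total_items_reviewed": len(decisions),
            "actions_taken": dict(actions),
            "top_rejection_reasons": reasons.most_common(3),
        }

    decisions = [
        {"action": "accept", "labels": []},
        {"action": "reject", "labels": ["spam"]},
        {"action": "reject", "labels": ["hate_speech"]},
    ]
    print(transparency_report(decisions))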

There are countless benefits for customers who leverage our solutions to meet their community’s needs. In fact, this is the same system we use for our own customers within Microsoft Gaming. We are also constantly working with other teams at Microsoft to apply the latest innovations and improve our offerings.

  • Increased Moderation Capacity: Moderation teams can handle larger volumes of content more efficiently. 
  • Focus on Critical Content: Human experts can concentrate on severe and high-priority issues, ensuring a safer gaming environment. 
  • Enhanced Transparency: By showcasing our moderation practices, we offer our customers clear evidence of the high standards we uphold in safety and inclusivity. 

Through these measures, Microsoft Gaming leverages the power of AI to maintain a safe and enjoyable gaming environment. Our commitment to integrating AI responsibly ensures that we continue to lead the charge in promoting trust and safety within the gaming community. 

We invite you to explore our innovative work in AI-powered content moderation. Join us in pioneering safe and inclusive gaming experiences powered by responsible AI. For more insights, reach out to the Community Sift team.