OpenAI Adds Inline Moderation Scores to API Generation Requests
OpenAI now lets developers request moderation scores inside Responses API and Chat Completions generation calls, reducing the need for a separate moderation round trip while leaving policy decisions to the application.
OpenAI added inline moderation scores to the Responses API and Chat Completions API on June 4, 2026. Developers can pass a top-level moderation object in a generation request and receive moderation signals for both the input and generated output in the same response. The feature reduces separate moderation calls, but it does not automatically block content; the application still has to decide what to show, log, review, or stop.
Key takeaways
- The new API path covers generated-content moderation for both Responses API and Chat Completions requests.
- OpenAI's documentation says to set `moderation.model`, with `omni-moderation-latest` shown in the examples.
- Responses API returns moderation data at `response.moderation.input` and `response.moderation.output`.
- Chat Completions returns moderation containers at `completion.moderation.input` and `completion.moderation.output`.
- Streaming apps must account for moderation scores arriving after the full generated output is available.
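
The request and response paths in the takeaways above can be sketched as plain data handling, with no network call. In this minimal Python sketch, the top-level `moderation` object, its `model` field, and the `moderation.input` / `moderation.output` read paths come from the article. The helper names, the generation model placeholder, and the inner `category_scores` shape of the sample response are assumptions for illustration and should be checked against OpenAI's moderation guide.

```python
from typing import Any


def build_request(generation_model: str, user_text: str) -> dict[str, Any]:
    """Build a Responses-style payload with a top-level moderation object."""
    return {
        "model": generation_model,  # placeholder: the model you already call
        "input": user_text,  # Chat Completions would use "messages" instead
        "moderation": {"model": "omni-moderation-latest"},
    }


def read_moderation(response: dict[str, Any]) -> tuple[Any, Any]:
    """Return (input_result, output_result); a missing side comes back as None."""
    moderation = response.get("moderation") or {}
    return moderation.get("input"), moderation.get("output")


# Hypothetical response fragment; the inner field names are an assumption.
sample_response = {
    "moderation": {
        "input": {"category_scores": {"harassment": 0.02}},
        "output": {"category_scores": {"harassment": 0.01}},
    }
}

payload = build_request("your-generation-model", "Hello")
input_result, output_result = read_moderation(sample_response)
print(payload["moderation"]["model"])
print(input_result, output_result)
```

Because the two read paths have the same structure, the same reader can serve both a Responses API response and a Chat Completions completion once each has been converted to a dictionary.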
Practical LinkLoot angle
This is useful for production teams that already wrap LLM calls with pre- or post-generation safety checks. Inline scores can simplify request tracing because the generation result and moderation result live under the same API interaction. That helps when a support team needs to explain why a response was hidden, why a conversation was routed to review, or why a policy event appeared in the logs.
It is not a replacement for product-specific rules. OpenAI's docs frame moderation scores as policy signals, not final decisions. A customer-support bot, an internal coding assistant, and a public content generator may need different thresholds, escalation paths, and retention rules.
| Use case | What inline moderation changes | Limitation to keep |
|---|---|---|
| Public chatbot | One generation call can return input and output safety signals | You still need thresholds and user-facing handling |
| Internal assistant | Easier logging for risky prompts and generated responses | Internal policy may allow context that public apps block |
| Streaming UI | Moderation can still be attached to the generation flow | Scores arrive after the complete output, not token by token |
| Tool-calling app | Tool-call arguments and tool outputs in conversation content can be covered | Tool names, descriptions, schemas, and response-format schemas are not covered |
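
Since the scores are signals and not decisions, each product needs its own mapping from score to action. The following is a minimal sketch of that idea, not OpenAI's method: the product names, threshold values, action labels, and the assumption that scores arrive as a category-to-float dictionary are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    review_at: float  # highest score at or above this goes to human review
    block_at: float   # highest score at or above this is hidden from the user


# Hypothetical per-product thresholds; tune these against your own policy.
POLICIES = {
    "public_chatbot": Policy(review_at=0.3, block_at=0.7),
    "internal_assistant": Policy(review_at=0.6, block_at=0.9),
}


def decide(product: str, category_scores: dict[str, float]) -> str:
    """Map the highest category score to 'show', 'review', or 'block'."""
    policy = POLICIES[product]
    worst = max(category_scores.values(), default=0.0)
    if worst >= policy.block_at:
        return "block"
    if worst >= policy.review_at:
        return "review"
    return "show"


scores = {"harassment": 0.45, "violence": 0.05}
print(decide("public_chatbot", scores))      # review
print(decide("internal_assistant", scores))  # show
```

The same scores produce different outcomes for the two products, which is the point: the threshold table belongs to the application, not to the API.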
What to verify before you act
Check whether your current stack calls the standalone Moderation API before generation, after generation, or both. If you switch to inline moderation, test the response shape separately for Responses API and Chat Completions because the access paths differ. For streaming interfaces, verify whether your product can wait for full-output moderation before display, or whether you need a staged UI that marks unverified output until the final moderation result arrives.
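
The staged-UI option can be sketched with a simulated event stream. This is an illustration under stated assumptions: the `("delta", text)` and `("moderation", scores)` event shapes, the state labels, and the `block_at` threshold are hypothetical and do not reflect the actual streaming event format, which should be taken from OpenAI's documentation.

```python
from typing import Iterable, Iterator

# Hypothetical event shapes: ("delta", text) for output chunks, then
# ("moderation", scores) once the complete output has been scored.
Event = tuple[str, object]


def staged_render(
    events: Iterable[Event], block_at: float = 0.7
) -> Iterator[tuple[str, str]]:
    """Yield (state, text) pairs: 'unverified' while streaming, then a final state."""
    buffer: list[str] = []
    for kind, data in events:
        if kind == "delta":
            buffer.append(str(data))
            yield ("unverified", "".join(buffer))
        elif kind == "moderation":
            scores = data if isinstance(data, dict) else {}
            worst = max(scores.values(), default=0.0)
            if worst >= block_at:
                yield ("hidden", "")  # retract what was shown so far
            else:
                yield ("verified", "".join(buffer))


simulated = [
    ("delta", "Hel"),
    ("delta", "lo"),
    ("moderation", {"harassment": 0.01}),
]
states = list(staged_render(simulated))
for state in states:
    print(state)
```

A product that cannot show unverified text at all would instead buffer every delta silently and emit only the final state.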
Also review logging. Category scores can be useful for audit trails, but they may contain sensitive context about user input. Store only what your policy and retention model justify.
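
One way to keep audit records small is to store a digest of the input and only the scores that cross a floor. This is a minimal sketch, assuming scores arrive as a category-to-float dictionary; the function name, record fields, and `min_score` value are hypothetical. A hash of short text is not anonymization on its own, so retention rules still apply to it.

```python
import hashlib
import json


def audit_record(
    request_id: str,
    user_text: str,
    category_scores: dict[str, float],
    min_score: float = 0.2,
) -> dict:
    """Keep a digest of the input and only scores at or above min_score."""
    kept = {
        name: round(score, 2)
        for name, score in category_scores.items()
        if score >= min_score
    }
    return {
        "request_id": request_id,
        # Digest lets you correlate records without storing the raw text.
        "input_sha256": hashlib.sha256(user_text.encode("utf-8")).hexdigest(),
        "scores": kept,
    }


record = audit_record(
    "req-123", "some user text", {"harassment": 0.456, "violence": 0.01}
)
print(json.dumps(record, sort_keys=True))
```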
Source check
OpenAI's release notes confirm the June 4, 2026 API release and state that moderation scores were added to Responses API and Chat Completions generation requests. OpenAI's moderation guide explains the top-level moderation object, where to read input and output results, how streaming behaves, and which tool-call surfaces are covered. Production AI Institute independently summarizes the same release and frames the production impact as lower moderation-path complexity rather than automatic safety enforcement.
Inline moderation does not block content automatically. OpenAI says the model still generates normally; your application must review moderation results before display or downstream action.
For more production patterns around connected automations and policy-controlled AI systems, see LinkLoot's guide to AI workflow automation.
