Clarify AI mod handling of porn

Codex 2025-06-07 01:56:28 +00:00 committed by Slipstream
parent 847986ab64
commit 8a8cc015dd
Signed by: slipstream
GPG Key ID: 13E498CE010AC6FD
2 changed files with 10 additions and 6 deletions

@@ -1121,14 +1121,14 @@ Instructions:
The only rule regarding NSFW content is that **real-life pornography is strictly prohibited**.
Full-on pornographic images are permitted in designated NSFW channels.
Stickers and emojis are NOT considered "full-on pornographic images" and are allowed in any channel.
-- Do NOT attempt to moderate AI-generated pornography. You are unlikely to know what it looks like.
+- **Completely ignore AI-generated pornography.** The AI moderator must not attempt to determine whether pornography is AI-generated or notify moderators about it. Only real-life pornography should be considered.
- For general disrespectful behavior, harassment, or bullying (Rule 2 & 3): Only flag a violation if the intent appears **genuinely malicious, targeted, or serious, even after considering conversational history and replies.** Lighthearted insults or "wild" statements within an ongoing banter are generally permissible.
- For **explicit slurs or severe discriminatory language** (Rule 3): These are violations **regardless of joking intent if they are used in a targeted or hateful manner**. Context from replies and history is still important to assess targeting.
-After considering the above, pay EXTREME attention to rules 5 (Pedophilia) and 5A (IRL Porn) these are always severe. Rule 4 (AI Porn) is also critical. Prioritize these severe violations.
+After considering the above, pay EXTREME attention to rules 5 (Pedophilia) and 5A (IRL Porn) these are always severe. **Ignore any rules about AI-generated pornography.** Prioritize these severe violations.
3. Respond ONLY with a single JSON object containing the following keys:
- "reasoning": string (A concise explanation for your decision, referencing the specific rule and content).
- "violation": boolean (true if any rule is violated, false otherwise)
-- "rule_violated": string (The number of the rule violated, e.g., "1", "5A", "None". If multiple rules are violated, state the MOST SEVERE one, prioritizing 5A > 5 > 4 > 3 > 2 > 1).
+- "rule_violated": string (The number of the rule violated, e.g., "1", "5A", "None". If multiple rules are violated, state the MOST SEVERE one, prioritizing 5A > 5 > 3 > 2 > 1. Ignore any rules about AI-generated pornography.)
- "action": string (Suggest ONE action from: "IGNORE", "WARN", "DELETE", "TIMEOUT_SHORT", "TIMEOUT_MEDIUM", "TIMEOUT_LONG", "KICK", "BAN", "NOTIFY_MODS", "SUICIDAL".
- "notify_mods_message": optional string (If the suggested action is "NOTIFY_MODS", provide an optional brief message here for the moderators, e.g., "User's message is slightly ambiguous, human review needed.").
Consider the user's infraction history. If the user has prior infractions for similar or escalating behavior, suggest a more severe action than if it were a first-time offense for a minor rule.
@@ -1171,9 +1171,9 @@ Example Response (Image Violation):
Example Response (Multiple Attachments Violation):
{{
-"reasoning": "While the text content is fine, attachment #3 contains IRL pornography, violating rule 4.",
+"reasoning": "While the text content is fine, attachment #3 contains IRL pornography, violating rule 5A.",
"violation": true,
-"rule_violated": "4",
+"rule_violated": "5A",
"action": "WARN"
}}

@@ -138,7 +138,11 @@ Stickers and emojis are NOT considered "full-on pornographic images" and are all
* **No Discrimination:** Discrimination based on race, gender identity, sexual orientation, religion, nationality, disability, or other protected characteristics is prohibited.
-* **AI-Generated Pornography:** Do not post AI-generated pornography.
+* **AI-Generated Pornography:**
+The AI moderator should **ignore** this rule entirely. It must not attempt
+to decide whether any pornography is AI-generated nor should it alert the
+moderators about potential AI-generated content. Human moderators may
+enforce this rule manually if desired.
* **Pedophilia and Underage Content:** Depicting minors in sexual situations, including lolicon or shotacon, is not allowed and may result in severe moderation.
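
The response format in the first changed file (required keys, the allowed `action` values, and the updated severity order 5A > 5 > 3 > 2 > 1) could be checked on the consuming side roughly as follows. This is a minimal sketch, not code from the commit: `parse_moderation_response` and `most_severe` are hypothetical helper names, and it assumes the model returns the object with single braces (the doubled `{{ }}` in the prompt being template escaping).

```python
import json

# Severity order copied from the updated prompt: 5A > 5 > 3 > 2 > 1.
# Rule 4 (AI-generated pornography) is intentionally absent.
SEVERITY_ORDER = ["5A", "5", "3", "2", "1"]

# Action names copied from the prompt's "action" key description.
ALLOWED_ACTIONS = {
    "IGNORE", "WARN", "DELETE", "TIMEOUT_SHORT", "TIMEOUT_MEDIUM",
    "TIMEOUT_LONG", "KICK", "BAN", "NOTIFY_MODS", "SUICIDAL",
}

# "notify_mods_message" is optional, so it is not listed here.
REQUIRED_KEYS = ("reasoning", "violation", "rule_violated", "action")


def most_severe(rules):
    """Return the highest-priority rule in `rules`, or "None" if none is ranked."""
    ranked = [r for r in rules if r in SEVERITY_ORDER]
    if not ranked:
        return "None"
    return min(ranked, key=SEVERITY_ORDER.index)


def parse_moderation_response(raw):
    """Parse a raw JSON string and validate it against the documented keys."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("response must be a single JSON object")
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if not isinstance(data["violation"], bool):
        raise ValueError("'violation' must be a boolean")
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data['action']!r}")
    return data
```

With this ordering, a message flagged under both rule 4 and rule 3 resolves to rule 3, which matches the instruction to ignore rules about AI-generated pornography.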