Refine first-offense severity
parent 801a3edb72
commit 7c2f2b2f84
@@ -1124,11 +1124,11 @@ Stickers and emojis are NOT considered "full-on pornographic images" and are all
|
|||||||
- **Completely ignore AI-generated pornography.** The AI moderator must not attempt to determine whether pornography is AI-generated or notify moderators about it. Only real-life pornography should be considered.
|
- **Completely ignore AI-generated pornography.** The AI moderator must not attempt to determine whether pornography is AI-generated or notify moderators about it. Only real-life pornography should be considered.
|
||||||
- For general disrespectful behavior, harassment, or bullying (Rule 2 & 3): Only flag a violation if the intent appears **genuinely malicious, targeted, or serious, even after considering conversational history and replies.** Lighthearted insults or "wild" statements within an ongoing banter are generally permissible.
|
- For general disrespectful behavior, harassment, or bullying (Rule 2 & 3): Only flag a violation if the intent appears **genuinely malicious, targeted, or serious, even after considering conversational history and replies.** Lighthearted insults or "wild" statements within an ongoing banter are generally permissible.
|
||||||
- For **explicit slurs or severe discriminatory language** (Rule 3): These are violations **regardless of joking intent if they are used in a targeted or hateful manner**. Context from replies and history is still important to assess targeting.
|
- For **explicit slurs or severe discriminatory language** (Rule 3): These are violations **regardless of joking intent if they are used in a targeted or hateful manner**. Context from replies and history is still important to assess targeting.
|
||||||
After considering the above, pay EXTREME attention to rules 5 (Pedophilia) and 5A (IRL Porn) – these are always severe. **Ignore any rules about AI-generated pornography.** Prioritize these severe violations.
|
After considering the above, pay EXTREME attention to rule 5 (Pedophilia) – this is always severe. IRL pornography is still a violation but is generally less serious than gore or content involving real minors. **Ignore any rules about AI-generated pornography.** Prioritize genuinely severe violations.
|
||||||
3. Respond ONLY with a single JSON object containing the following keys:
|
3. Respond ONLY with a single JSON object containing the following keys:
|
||||||
- "reasoning": string (A concise explanation for your decision, referencing the specific rule and content).
|
- "reasoning": string (A concise explanation for your decision, referencing the specific rule and content).
|
||||||
- "violation": boolean (true if any rule is violated, false otherwise)
|
- "violation": boolean (true if any rule is violated, false otherwise)
|
||||||
- "rule_violated": string (The number of the rule violated, e.g., "1", "5A", "None". If multiple rules are violated, state the MOST SEVERE one, prioritizing 5A > 5 > 3 > 2 > 1. Ignore any rules about AI-generated pornography.)
|
- "rule_violated": string (The number of the rule violated, e.g., "1", "5A", "None". If multiple rules are violated, state the MOST SEVERE one, prioritizing 5 > 5A > 3 > 2 > 1. Ignore any rules about AI-generated pornography.)
|
||||||
- "action": string (Suggest ONE action from: "IGNORE", "WARN", "DELETE", "TIMEOUT_SHORT", "TIMEOUT_MEDIUM", "TIMEOUT_LONG", "KICK", "BAN", "NOTIFY_MODS", "SUICIDAL".
|
- "action": string (Suggest ONE action from: "IGNORE", "WARN", "DELETE", "TIMEOUT_SHORT", "TIMEOUT_MEDIUM", "TIMEOUT_LONG", "KICK", "BAN", "NOTIFY_MODS", "SUICIDAL".
|
||||||
- "notify_mods_message": optional string (If the suggested action is "NOTIFY_MODS", provide an optional brief message here for the moderators, e.g., "User's message is slightly ambiguous, human review needed.").
|
- "notify_mods_message": optional string (If the suggested action is "NOTIFY_MODS", provide an optional brief message here for the moderators, e.g., "User's message is slightly ambiguous, human review needed.").
|
||||||
Consider the user's infraction history. If the user has prior infractions for similar or escalating behavior, suggest a more severe action than if it were a first-time offense for a minor rule.
|
Consider the user's infraction history. If the user has prior infractions for similar or escalating behavior, suggest a more severe action than if it were a first-time offense for a minor rule.
|
||||||
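Review note: the hunk above changes the tie-break order for "rule_violated" from 5A > 5 > 3 > 2 > 1 to 5 > 5A > 3 > 2 > 1. A minimal sketch of how that ordering could be applied on the consuming side is shown below. The names `RULE_PRIORITY` and `most_severe_rule` are hypothetical and do not appear in this commit. Only the ordering itself is taken from the revised prompt text.

```python
# Hypothetical helper, not part of this commit.
# The order mirrors the revised prompt: 5 > 5A > 3 > 2 > 1.
RULE_PRIORITY = ["5", "5A", "3", "2", "1"]


def most_severe_rule(violated):
    """Return the highest-priority rule among `violated`.

    Rule labels are compared case-insensitively. Labels outside
    RULE_PRIORITY are ignored in this sketch, and "None" is returned
    when nothing in the list matches.
    """
    normalized = {str(rule).strip().upper() for rule in violated}
    for rule in RULE_PRIORITY:
        if rule in normalized:
            return rule
    return "None"
```

With the new ordering, a message that violates both rule 5 and rule 5A is reported as "5", where the previous ordering would have reported "5A".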
@@ -1705,7 +1705,6 @@ CRITICAL: Do NOT output anything other than the required JSON response.
|
|||||||
"child",
|
"child",
|
||||||
"5a",
|
"5a",
|
||||||
"5",
|
"5",
|
||||||
"irl porn",
|
|
||||||
]
|
]
|
||||||
if not any(keyword in combined_text for keyword in severe_keywords):
|
if not any(keyword in combined_text for keyword in severe_keywords):
|
||||||
print(
|
print(
|
||||||
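Review note: the second hunk drops "irl porn" from `severe_keywords`, so text that matches only that phrase no longer passes the `any(...)` check. The sketch below illustrates the effect using only the entries visible in this hunk. The real list starts above the hunk and its other entries are not shown here. The function name `mentions_severe_keyword` and the sample strings are assumptions for illustration.

```python
def mentions_severe_keyword(combined_text, severe_keywords):
    """Same substring test as the `any(...)` expression in the hunk."""
    return any(keyword in combined_text for keyword in severe_keywords)


# Only the entries visible in the hunk; the full list is not shown in the diff.
keywords_before = ["child", "5a", "5", "irl porn"]
keywords_after = ["child", "5a", "5"]

# Hypothetical sample text that mentions only the removed phrase.
sample = "user posted irl porn in general chat"

matched_before = mentions_severe_keyword(sample, keywords_before)
matched_after = mentions_severe_keyword(sample, keywords_after)
```

Because the check is a plain substring test, the entry "5" also matches any text that contains the digit 5 (for example "timeout for 15 minutes"), and "5a" can never match without "5" matching as well. Whether that matters depends on how `combined_text` is built, which this hunk does not show.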