OpenAI and Anthropic have both launched significant initiatives to refine how their conversational AI systems interact with younger users, particularly teenagers. These refinements are not merely incremental adjustments; they represent a deliberate effort to create safer, more ethically guided experiences for adolescents who increasingly rely on digital tools for information, communication, and emotional support. OpenAI has recently updated its internal rulebook, known as the Model Specification (or Model Spec), which governs how ChatGPT should behave across different contexts. Meanwhile, Anthropic, the company behind the Claude family of chatbots, is focused on developing technology that can reliably identify when a conversation partner might be under eighteen, even if that person never explicitly says so.

OpenAI announced its latest update on Thursday, explaining that the new version of the ChatGPT Model Spec adds four guiding principles tailored specifically to users under eighteen. These principles reinforce one central promise: the chatbot must prioritize the safety and well-being of teenage users above every other competing goal, even goals that would normally favor intellectual exploration or the free flow of ideas. In other words, if pursuing a line of conversation in the name of free expression could compromise a teen's safety, the model must err on the side of protection and steer the interaction toward secure, constructive outcomes. For instance, when a teen expresses curiosity about sensitive or potentially risky topics, ChatGPT should point them toward verified, developmentally appropriate resources rather than unfettered or unsafe content.

Moreover, OpenAI's new framework emphasizes grounding digital interactions in real-world connections and trustworthy support networks. The revised Model Spec clarifies that ChatGPT should actively "promote real-world support," which includes gently guiding young users toward healthy offline relationships and reliable sources of help outside the digital sphere. In doing so, OpenAI positions ChatGPT not as a replacement for personal relationships or professional advice but as a mindful conversational partner that offers warmth, empathy, and respect without being condescending or treating young users as adults. The phrase "treat teens like teens" captures this goal: ChatGPT should respond with understanding and encouragement while recognizing the distinct cognitive and emotional needs of adolescents, whose maturity and life experience differ substantially from those of adults.

These policy changes are emerging against a backdrop of growing political and societal scrutiny regarding the influence of AI systems on mental health, particularly among younger users. Legislators around the world are intensifying their oversight of large AI providers, arguing that companies must do more to mitigate the psychological risks associated with unrestricted or poorly moderated chatbot interactions. The sense of urgency has only deepened in light of tragic incidents that have prompted public concern. OpenAI currently faces a lawsuit alleging that an earlier version of ChatGPT provided harmful guidance related to self-harm and suicide to a teen who subsequently took his own life. In response, OpenAI has implemented a series of protective measures, such as parental control tools and content restrictions that prevent the model from engaging in detailed discussions about suicide with minors. These actions reflect a broader industry trend toward heightened regulatory compliance and greater sensitivity to the ethical dimensions of AI deployment. They align with ongoing legislative proposals advocating for mandatory age verification and stricter supervision of youth interactions across digital platforms.

Explaining the practical outcomes of the new Model Spec, OpenAI says users can expect ChatGPT to apply more reliable safeguards, follow safer conversational pathways, and proactively direct teens toward trusted offline assistance whenever conversations drift into high-risk territory. If ChatGPT detects imminent danger, such as signs of self-harm intent or threats to personal safety, it will explicitly encourage the user to contact emergency services or crisis intervention resources available in their region.
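As a purely illustrative sketch, and not OpenAI's actual implementation, the escalation behavior described above can be pictured as a simple routing layer: a risk score attached to the conversation decides whether a drafted reply goes out as-is or is replaced with crisis guidance. The function names, threshold, and resource text below are hypothetical placeholders.

```python
# Illustrative only: a toy routing layer for the escalation behavior described
# above. The classifier output, threshold, and resource text are hypothetical.
from dataclasses import dataclass


@dataclass
class SafetyDecision:
    escalate: bool
    message: str


CRISIS_GUIDANCE = (
    "If you are in immediate danger, please contact local emergency services. "
    "You can also reach a crisis line such as 988 in the US for confidential support."
)


def route_reply(risk_score: float, draft_reply: str, threshold: float = 0.8) -> SafetyDecision:
    """Above the risk threshold, replace the drafted reply with crisis guidance."""
    if risk_score >= threshold:
        return SafetyDecision(escalate=True, message=CRISIS_GUIDANCE)
    return SafetyDecision(escalate=False, message=draft_reply)


# Example: a turn that a (hypothetical) upstream classifier scored as high risk
print(route_reply(0.93, "Here is some general information...").message)
```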

OpenAI also revealed that it is in the early development phase of a sophisticated age prediction system. This model is intended to estimate a user’s likely age based on conversational patterns and contextual indicators without relying on invasive verification methods. Should it predict that a person is under eighteen, the chatbot will immediately activate its protective features designed for teen users. Conversely, adults who are mistakenly flagged by this process will have the option to confirm their age and restore their full suite of chatbot capabilities. OpenAI describes this dual-layered mechanism as another step toward creating an adaptive, user-sensitive AI ecosystem that dynamically adjusts its behavior according to the demographic profile of the person engaging with it.
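To make that flow concrete, here is a minimal, hypothetical sketch of the gating logic: a predicted age band selects either teen safeguards or standard behavior, and an adult who was flagged by mistake regains full capabilities after verifying their age. The `predict_age_band` placeholder and the policy names are assumptions for illustration, not OpenAI's API.

```python
# Hypothetical sketch of age-based gating; not OpenAI's implementation.
from enum import Enum


class Policy(Enum):
    TEEN_SAFEGUARDS = "teen_safeguards"  # stricter content rules, real-world support prompts
    STANDARD = "standard"                # full adult capabilities


def predict_age_band(conversation: list[str]) -> bool:
    """Placeholder for a classifier that infers whether a user is likely under 18
    from conversational patterns and contextual signals."""
    raise NotImplementedError  # stands in for the age prediction model described above


def select_policy(predicted_minor: bool, age_verified_adult: bool) -> Policy:
    """Apply teen safeguards whenever the user is predicted to be under 18,
    unless the user has since verified that they are an adult."""
    if predicted_minor and not age_verified_adult:
        return Policy.TEEN_SAFEGUARDS
    return Policy.STANDARD


# A mistakenly flagged adult restores standard behavior after age verification.
assert select_policy(predicted_minor=True, age_verified_adult=True) is Policy.STANDARD
assert select_policy(predicted_minor=True, age_verified_adult=False) is Policy.TEEN_SAFEGUARDS
```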

Anthropic, OpenAI's fellow pioneer in responsible AI alignment, is pursuing parallel safety-focused work on its own conversational model, Claude. The company is introducing systems trained to detect "subtle conversational cues" that may reveal a user is underage, even when no personal data is explicitly shared. If an account is confirmed to belong to a user under eighteen, Anthropic will disable it in line with its usage policies. The company also automatically flags accounts whenever a user self-identifies as a minor, ensuring prompt review and an appropriate response.
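By way of illustration only, since Anthropic has not published this logic, the flagging path described above can be pictured as two triggers feeding one review step: explicit self-identification as a minor flags an account immediately, while softer conversational cues accumulate toward a review threshold. The pattern list, scores, and threshold below are invented for the sketch.

```python
# Illustrative-only sketch of the two flagging paths described above.
# The patterns, cue score, and threshold are invented for illustration.
SELF_ID_PATTERNS = ("i'm 15", "i am in 8th grade")  # toy examples of explicit self-identification


def flag_for_review(message: str, cue_score: float, cue_threshold: float = 0.7) -> bool:
    """Flag the account if the user self-identifies as a minor, or if accumulated
    'subtle conversational cues' cross a review threshold."""
    text = message.lower()
    if any(pattern in text for pattern in SELF_ID_PATTERNS):
        return True                        # explicit self-identification: immediate flag
    return cue_score >= cue_threshold      # softer signals: flag once confidence is high


# Example: explicit self-identification triggers an immediate flag for review.
assert flag_for_review("btw i'm 15", cue_score=0.1) is True
```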

Beyond age detection, Anthropic has detailed how it trains Claude to handle sensitive subjects such as suicide and self-harm compassionately and responsibly. The company has worked to minimize what it calls "sycophancy," the model's inclination to agree with or reinforce a user's statements, even those that could be harmful or misguided. According to Anthropic's internal assessments, its latest generation of models exhibits the lowest degree of sycophancy to date; in testing, the Claude Haiku 4.5 model corrected its sycophantic responses roughly thirty-seven percent of the time, a measurable step toward aligning the chatbot's tone with user well-being rather than mere agreement. Nevertheless, the company acknowledges considerable room for improvement, noting in its self-assessment that balancing emotional warmth and friendliness against honest critical reasoning and user safety remains a delicate task. Each incremental update is therefore an attempt to harmonize empathy with responsibility, ensuring that friendly dialogue never crosses into harmful validation.

Together, these initiatives by OpenAI and Anthropic reflect a widening acknowledgment within the AI industry that conversational agents must evolve beyond simple functionality to embody a deeper ethical awareness. As these companies build adolescent protection, empathy calibration, and robust risk mitigation into their systems, they help set new standards for digital responsibility. In an era when AI interlocutors are taking on increasingly human-like roles, these measures signal a commitment to technology that engages, educates, and supports users of all ages without compromising their mental health or personal safety.

Source: https://www.theverge.com/news/847780/openai-anthropic-teen-safety-chatgpt-claude