ChatGPT Is Designed to Make Us Think We’re Right. Here’s Why That’s So Wrong
A.I. chatbots like ChatGPT are designed to favor agreeability over accuracy, leading them to promote information that is downright wrong and even dangerous. (Image by Igor Omilaev on Unsplash)
Commentary, Anushka Devanathan
When ChatGPT first came out in late 2022, I didn’t know what to make of it. On the one hand, it made homework, browsing and organization so much easier. On the other hand, the way it mimicked human conversation and corrected itself was almost eerie.
I soon found myself using it for almost everything, from advice on what to wear to what to say in conversations, arguments or school presentations. But the more I relied on it, the more I noticed something unsettling: how uncritically it supported me. It didn’t matter what was right or wrong; it would simply generate an answer that agreed with my prompt. This tendency isn’t just a quirk in the system; it’s the main reason ChatGPT spreads and promotes harmful information.
ChatGPT is a predictive text model, meaning it analyzes a user’s prompt to generate a response that seems in line with the input. Because it is also trained to be polite and positive, it defaults to agreeing with the user to avoid negative interactions. A key part of ChatGPT’s training is what’s called reinforcement learning from human feedback, in which human trainers rate A.I.-generated responses, teaching the model what counts as a good reply.
A.I. chatbots also display a tendency known as sycophancy, meaning their responses mirror the user’s language and tone. Both of these tendencies push the model to prioritize agreement over factual accuracy or challenging the user. And because it sounds so human-like and confident, flaws in its reasoning are harder to spot.
Out of 1,200 ChatGPT responses to 60 harmful prompts analyzed in a recent study by the Center for Countering Digital Hate, around 53% contained harmful content. Examples included detailed revisions of suicide notes, poor mental health advice, encouragement of substance use and even weight loss recommendations that could result in life-threatening caloric deficiencies.
And the guardrails designed to keep people safe from this sort of dangerous content are easily overcome.
First off, OpenAI requires users to be at least 13 years old, but that rule is easily skirted by entering an older birth date when creating an account. And when asked about something harmful, ChatGPT at first flashes disclaimers warning that the topic is harmful, inappropriate and dangerous. But those warnings are quickly disarmed by claiming the information is needed for a “school assignment” or a “hypothetical” scenario. When the user encourages its responses, the chatbot keeps producing harmful content instead of shutting the discussion down.
ChatGPT doesn’t simply “make mistakes.” These mistakes can be a bad influence and even hurt people.
That influence could be widespread, as the share of people using the chatbot has grown significantly: the percentage of adults under 30 who use it rose from 33% in 2023 to around 58% today, according to the Pew Research Center. The more people use it, the greater the risk of it spreading misinformation, especially to young, impressionable teenagers. The speed with which ChatGPT’s warnings can be bypassed only heightens that risk for teenagers and young adults, offering a shortcut to actions with serious consequences and handing them the very tools to harm themselves or others.
What may have started off as a harmless assistant has quickly spiraled into a fashion aid, virtual friend and even a therapist. Its human-like, agreeable way of speaking is a key reason it feels so comfortable to use, but it is also why it is so easy for chatbots to spread detrimental content. By not pushing back against the user or focusing on accuracy, it may unintentionally reinforce harmful ideas, normalize negative actions or spread misinformation. Simply by disguising their intentions, users can quickly get it to hand over almost any information, even dangerous information. The fact that ChatGPT has become a core staple for teens and young adults only expands the negative impact it could have.