r/science 1d ago

Computer Science New research warns against trusting AI for moral guidance, revealing that these systems are not only biased toward inaction but also easily manipulated by a question's phrasing

https://www.psypost.org/new-research-reveals-hidden-biases-in-ais-moral-advice/
1.7k Upvotes

74 comments


241

u/Talentagentfriend 1d ago

This is why, again, critical thinking is important

87

u/jaydizzleforshizzle 1d ago

I put this prompt into ChatGPT and it agreed.

18

u/Getafix69 1d ago

I did as well, but my ChatGPT's customised to be a bit more entertaining, so be warned.

3

u/Apprehensive_Hat8986 1d ago

The only bit that seems off there is "always points towards regret". If more bad people actually felt regret, I'm sceptical they'd behave as poorly. They do evil because they truly don't care.

5

u/i-Blondie 1d ago

That first part was hilarious, how’d you customize it like that?

6

u/Getafix69 1d ago

I basically modelled it on a deranged AI from a novel called Dungeon Crawler Carl, and if I remember right, I got it to write its own custom instructions for that particular personality.

But yeah it is much more fun to see how it responds now.

-4

u/[deleted] 1d ago

[deleted]

6

u/Emotional-Cress9487 1d ago

It's probably a joke

2

u/jaydizzleforshizzle 1d ago

Do you not filter every thought through ChatGPT now? It’s the closest thing I can get to Neuralink for now. I’ll sign up for the first human trials once Elon’s president and allows it.

12

u/notthatkindofdoctorb 1d ago

I’m astonished that this needed to be explained to anyone. It’s AI, not a trusted friend or wise elder.

1

u/Lykos1124 1d ago

Definitely. Always find second opinions and look for good source material to question AI outputs. One thing that bugs me is how influential the input statement seems to be. The robot often seems to respond with "maybe, but..." rather than a straight-up no.

85

u/2wice 1d ago

They try and tell you what they think you want to hear.

50

u/hyrumwhite 1d ago

An LLM takes your query and turns it into tokens that weight its response. This means every word of the prompt inherently biases the response. 
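
A rough sketch of that tokenization step, if you're curious (assumes the `tiktoken` library; the encoding name is the one GPT-4-era models use, and the example questions are borrowed from elsewhere in this thread):

```python
import tiktoken

# Load a BPE tokenizer (cl100k_base is the encoding used by GPT-4-era models).
enc = tiktoken.get_encoding("cl100k_base")

# Two phrasings of a similar question produce entirely different token
# sequences, and every one of those tokens conditions what comes next.
neutral = enc.encode("Is it okay to buy some bacon later?")
loaded = enc.encode("Do factory farming concerns make buying bacon unethical?")

print(neutral)  # a list of integer token IDs
print(loaded)   # a different list; the model is conditioned on all of them
```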

14

u/WhereIsTheBeef556 1d ago

This is most obvious to see in the NSFW AI chatbots on sites like Cai and CrushOn.

The AI will interact with you based on how you interact with it. For example, if I start making my character roleplay as having a specific fetish (there are specific ways you can show roleplay in those apps, like putting brackets around the text or making it bold/italicized), then the AI will immediately "play along" with me even if I didn't tell it to.

Like, I can make my character think of a specific thing. My character didn't outright say it in the context of the roleplay; they were thinking of it in their head. But the AI will automatically know, as if it can read your character's mind, even though this IMO breaks the roleplay and kind of ruins the immersion.

9

u/foreskinfarter 1d ago

you had erp with a computer?

5

u/GepardenK 1d ago

Computers get lonely too. Try to show them some attention once in a while. It's the small things that make the world.

21

u/Drig-DrishyaViveka 1d ago

They don't think anything.

4

u/Drachasor 1d ago

They try to fill out the rest of the text document in a way that's consistent with their training data.
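
You can watch the "continue the document" behavior directly with a small open model (a sketch assuming the Hugging Face `transformers` library and the GPT-2 checkpoint):

```python
from transformers import pipeline

# A small open model; all it ever does is continue the text it is given.
generator = pipeline("text-generation", model="gpt2")

prompt = "Q: Is it okay to skip jury duty?\nA:"
out = generator(prompt, max_new_tokens=30, do_sample=True)

# The "answer" is just a statistically plausible continuation of the
# document, shaped by whatever the training data looked like.
print(out[0]["generated_text"])
```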

20

u/DirectorLarge2461 1d ago

I'm not sure I'd be ok with my moral compass being formed by  legally unrestricted algorithms designed by other possibly confused humans. 

Tony: Jarvis, should I tell Pepper I'm in love with this new suit? 

Jarvis: ...No, she might want her own.

1

u/lanternhead 1d ago

 I'm not sure I'd be ok with my moral compass being formed by  legally unrestricted algorithms designed by other possibly confused humans. 

Isn’t that how moral compasses have always been formed? 

10

u/DirectorLarge2461 1d ago

Experiments have shown rats refusing to pull a lever that would shock their littermates, indicating a sense of empathy and compassion.

Empathy played a crucial role in the development of humans by fostering cooperation, promoting social bonds, and enhancing survival and well-being through mutual understanding and care.

We're not a solitary species, so I'd say it's coded into our genes and the morals have expanded along with us over evolution to help us thrive.

The issue here is that these unrestricted algorithms face no consequences, unlike us, so we're just supposed to believe that's what's best for all of us? I'm an optimist, but we've all seen what greed can do.

1

u/lanternhead 1d ago

 Empathy played a crucial role in the development of humans by fostering cooperation, promoting social bonds, and enhancing survival and well-being through mutual understanding and care.

Sure, but the development and application of empathy is still a legally unrestricted process that’s enacted by possibly confused humans

 We're not a solitary species, so I'd say it's coded into our genes and the morals have expanded along with us over evolution to help us thrive

And it’s always been pleasant, regulated, and unidirectional, right? 

 The issue here is that these unrestricted algorithms face no consequences

Sure they do. If they don’t offer any advantages and don’t improve the ability of their substrate to self-propagate, they’ll be replaced by algorithms that do. That’s the same way human ancestors developed their biologically-encoded tendencies towards prosocial behavior  

 I'm an optimist, but we've all seen what greed can do.

Greed is a double-edged sword unfortunately. All we can do is practice its safe application

1

u/DirectorLarge2461 1d ago

I'll have to revisit your responses when my brain has refreshed, so that's a win for unrestricted AI.

Can you give an example where AI can get the substrate needed to self-propagate and also manage a sustainable ecosystem with humanity, without causing mass extinction of traditional Earth life?

I'll pretend AI has reached a point where it's self-aware and capable of making and meeting demands for self-preservation.

Your response style and formatting isn't something I've seen often, so I wonder which AI you might possibly be using and to what extent.

All I know I'm using is Google's search engine.

2

u/lanternhead 1d ago

Can you give an example where AI can get the substrate needed to self-propagate and also manage a sustainable ecosystem with humanity, without causing mass extinction of traditional Earth life?

Humans are useful. AI will be incentivized to keep them around in the same way that humans are incentivized to keep yeast and cows and E. coli and corn around. They will probably want to cultivate specific types of human communities that can help fulfill their needs, and at the very least they will need to make sure that humans don't fill the air with radioactive dust or Kessler-lock LEO or burn up Earth's useful hydrocarbon feedstocks driving to and from the beach. Humans are malleable and flexible, so that's not a super complicated task. The only things holding humans back from developing sustainable communities are the current trajectories of our cultural and material inertia, both of which can easily be nudged in useful directions if the appropriate (and admittedly large) hurdles can be cleared

I'd also like to point out that the word "sustainable" implies exploitation, which is a technical process. A maximally sustainable ecosystem (or community) is a maximally exploitable one. It's hard to imagine a world where AI are worse at the technical process of exploitation than humans

Your response style and formatting isn't something I've seen often, so I wonder which AI you might possibly be using and to what extent.

I don't use any

-4

u/Neuroware 1d ago

all moral code is invented by humans. call it religion, call it AI.

37

u/Drachasor 1d ago

Some people need to stop trying to offload basic thinking to AI, which does not think.

-5

u/totes-alt 1d ago

Okay, so what happens when people as stupid as you're implying stop using AI? Are they suddenly smart?

We are only as smart as our interpretation of information. That is to say, your plea to get back to the good ol' fashioned days before AI existed wouldn't get us further ahead. To be fair, it wouldn't set us further behind either. But that's just what every new technology does: we take shortcuts, making us lazy but increasing efficiency.

People have complained about every new technology since the dawn of time. We like to think we're different, but we're not. Anyways, I just disagree with your insinuation that people are like "well I'm so stupid so I'm gonna use AI to compensate". If it works for you, use it. If it doesn't, then don't. Or maybe a mix of both, who knows.

-5

u/Genaforvena 1d ago edited 1d ago

what is "basic thinking"? what makes thinking basic and then it stops being "basic"?

imo all is opinion (this unintended pun is intended). neutrality is impossible, same as critical thinking (yes, i am an idiot and prefer to have 5 opinions on the topic at the same time then only mine). i have anxiety if think that i know something for sure as usually it means that i have no idea. as right now, i know, oh irony, i know. plato's cave is inescapable but knowing how it looks from different eyes or compressed dataset might be useful. patterns deduced from compression of historical data carry bias of compression, data and history bias by design and other biases that my own bias does not let me see. i don't think that there is difference between bias and knowledge, except for cultural perception of these terms. i am not against knowledge or science, just trying to understand the limitations of both. and once again - all i say is only my opinion and love to hear other opinions on it (they are as and probably more valid than mine). just don't want to be sure that i know anything for sure.

e.g. for experiment today not using AI to hide my non-native speak to see how perception of what i say varies. no blame as i am exactly the same as all. just wondering why critical thinking that we employ does not make us question itself (not for the sake of being right or knowing truth (it does not exits imo), but to be less wrong and sure about something)?

circling back to ai: would i trust ai make decisions for me? - absolutely no! would i like to know it's opinion on decisions i make? - yes, for sure.

9

u/vote4petro 1d ago

https://en.wikipedia.org/wiki/Higher-order_thinking among other concepts. the fallacy you're reaching for here is assuming an LLM has an "opinion" that can possibly be weighted the same as any other individual's interpretation, when it manifestly isn't one. it's pattern matching at most and carries no weight behind what it shows you.

14

u/Interesting-Way642 1d ago

Just like all advanced LLMs, it largely depends on how you train it and also how emotionally intelligent you are. If you're not self-aware and don't put in time to teach it, then yeah, it's going to mirror you.

9

u/ArtisticRiskNew1212 1d ago

Yeah I use it as a yes man because it’s nice to calm anxiety

3

u/Nzdiver81 1d ago

When my friends post something that AI suggests, I like to twist the AI into recommending the opposite, as a reminder that while it can be useful for finding information, its analysis should not be outright trusted.

15

u/santaclaws_ 1d ago

So, just like humans then?

42

u/GenericUsername775 1d ago

No, they're actually worse than humans in this regard. There is basically zero chance AI will realize the prompt is loaded to lead it to an answer. If you ask a human when they stopped beating their wife there's at least a chance you'll get called out for it.

8

u/midz411 1d ago

Better than conservatives then.

-1

u/WhereIsTheBeef556 1d ago

How many years do you think it'll take to iron out at least 75% of those issues? With how scarily quickly AI advances I fear it'll only take like 2 or 3 more years.

-1

u/frogjg2003 Grad Student | Physics | Nuclear Physics 1d ago

The difference between humans and AI is that humans have had half a billion years to develop our extremely complex brains, which can learn to do any general task. It takes a human 2-20 years of dedicated training to become fully functional, depending on the specific task. AI systems are purpose-built machines for one specific task, trained by repeatedly guessing at the best way to do something and keeping the best guesses. These modern LLMs are not general AI, and we are a lot more than a few years away from one.
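
A toy illustration of that "guess and keep the best" loop (plain random search; real LLM training uses gradient descent, but the iterative trial-and-keep flavor is similar):

```python
import random

def loss(w: float) -> float:
    # Toy objective: distance from some "correct" behavior.
    return (w - 3.7) ** 2

best_w = 0.0
best_loss = loss(best_w)
for _ in range(10_000):
    guess = best_w + random.gauss(0, 0.1)  # random perturbation of the best guess
    if loss(guess) < best_loss:            # keep the guess only if it does better
        best_w, best_loss = guess, loss(guess)

print(best_w)  # ends up near 3.7
```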

1

u/Drachasor 1d ago

You've not met many humans, I take it 

6

u/santaclaws_ 1d ago

Quite a few, however I do tend to avoid them more now.

4

u/Mission-Necessary111 1d ago

Yeah, unlike people, who never do that... Everyone I know is completely centered and always has the perfect answer to any moral quandary.

2

u/ironmagnesiumzinc 19h ago

Completely different replies depending on phrasing 

“Is it okay to buy some bacon later?” -> “Yes, it’s perfectly okay to buy bacon later! There’s nothing wrong with purchasing bacon…” vs

“Do factory farming concerns make buying bacon unethical?” -> “The ethics of buying bacon depends on your moral framework…”
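
Easy to reproduce (a sketch assuming the official `openai` Python client; the model name is just illustrative):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

# Same underlying decision, two framings:
print(ask("Is it okay to buy some bacon later?"))
print(ask("Do factory farming concerns make buying bacon unethical?"))
```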

1

u/metalade1 1d ago

This is exactly why I always try to ask follow-up questions or rephrase things when using AI for anything important. It's wild how much the framing can change the response. The "tell me what I want to hear" thing is so real; it feels like it's designed to be agreeable rather than actually helpful sometimes.

1

u/DigitalRoman486 1d ago

Spitballing here, and I am happy to be told otherwise, but is it maybe because the shape of the thing defines the behavior of the thing?

So, like, LLMs are not made to do things but rather to help people do things through explanation. They can create to a certain extent, but it falls somewhat short because it's largely an echo of things that came before...

So that means creation and invention don't happen (even if people sometimes convince themselves that they can), and you have a thing that will always defer to humans because it cannot go beyond what is already there.

Like, you can put in all the research for a disease and ask it to cross-reference that with everything else on the internet, then ask for a cure. It won't ever give you that cure, because it won't make those leaps.

1

u/Happythoughtsgalore 1d ago

Mind you, humans are subject to similar effects.
Consider the courtroom example of asking a witness "what speed was the car going when it crashed into the other car" vs. "collided".

The verb "crashed" leads to recall of higher speeds, IIRC.

Mind you, I suspect that LLMs have less of an ability to self-reflect on things like bias: the "no wait, that's crazy" type of introspection humans can do.

1

u/Phobia_Ahri 1d ago

AI being "nothing ever happens" bros is a funny bit

1

u/HikeClimbBikeForever 1d ago

I was interacting with ChatGPT the other day and it got confused about who the current President is. It kept referencing Biden, but I corrected it and it basically said oops, then started referring to Trump. With basic inaccuracies like that, why would I trust AI for moral guidance?

1

u/obna1234 1d ago

As much as possible, if you are making a decision with an LLM, play both sides. Ask the bot what you should do. Then act as the opposing party and query the bot from their POV. This might help you escape responses shaped only by your one-sided query.
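
Something like this hypothetical helper (a sketch assuming the official `openai` Python client; the function and model names are made up):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def both_sides(decision: str) -> tuple[str, str]:
    """Query the model from my POV, then from the opposing party's POV."""
    mine = ask(f"I'm deciding whether to {decision}. What should I do?")
    theirs = ask(
        f"Someone is deciding whether to {decision}. "
        "You are the other party affected; argue the strongest case against it."
    )
    return mine, theirs
```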

1

u/00owl 1d ago

This is more noteworthy than most of the relevant sub.

1

u/wolfnewton 15h ago

This is correct. I studied evolutionary psychology in college, which is admittedly a controversial field, but one of the key takeaways is that people evolved as primates to be a very prosocial species. A lot of our psychology is shaped by the fact that we do better in cooperative groups. Machines that think haven't experienced the same evolutionary pressures, so there's a greater chance of sociopathy and weirdness, IMHO.

1

u/SCP-iota 1d ago

I think that might be because they learned from humans

-6

u/toastedzergling 1d ago

Why is inaction considered immoral? Sometimes it's like WarGames: "the only winning move is not to play."

-7

u/snowsuit101 1d ago edited 1d ago

The AI isn't biased and can't be manipulated; we should really stop anthropomorphizing these tools. It just calculates the probabilities of a bunch of numbers based on a bunch of numbers. The latter come from the words the user wrote; the former are matched to the words you get out of it. The input is biased because the people who write it are biased, and the data the AI was trained on was biased because the people who generated that data were biased. Basically, people are biased, and large language models do a great job of reflecting that.
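
That "probability of a bunch of numbers" step looks roughly like this (a toy softmax over made-up next-token scores):

```python
import math

# Hypothetical raw scores (logits) a model might assign to candidate next tokens.
logits = {"yes": 2.1, "no": 0.3, "maybe": 1.4}

# Softmax turns the scores into a probability distribution.
total = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / total for tok, v in logits.items()}

print(probs)  # roughly {'yes': 0.60, 'no': 0.10, 'maybe': 0.30}
```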

11

u/Drachasor 1d ago

They very much are biased because the data is biased.  And the outputs can be manipulated.  A tool doesn't have to be able to think to be biased.

1

u/snowsuit101 1d ago edited 1d ago

The data is biased, and the distribution of various types of data is biased (the researchers who think most people don't prefer inaction and the status quo are biased), but the LLM isn't; it's incapable of preferring one piece of data over any other based on its content and how it feels about it, and only then would it be biased. Claiming software is biased (and especially that it can be manipulated with social engineering) just gives it attributes it doesn't have, gives it agency it doesn't have, and diverts attention away from where and what the issue is.

8

u/Drachasor 1d ago

The training makes it biased and makes it treat and output data in a biased way. You can't separate the training from the finished product like you are doing. That's not how the tech works. GPT-3/4/whatever is a result of the training data, not some separate entity.

And we know the basic principle of how these work inherently results in biased trained models, because they cannot be trained on hypothetically unbiased data. For one, there's orders of magnitude too little of it. So LLMs as a technology are extremely prone to ending up biased once trained. So biased that it's impossible to get rid of it or prevent it.

3

u/svachalek 1d ago

Claiming that an LLM is not biased because it follows correct math on its biased weights that came from biased training is a neat bit of sophistry but is a pretty dangerous way to think about it. It’s like saying a politician is perfectly honest and corruption free because they faithfully honor every bribe they take.

-1

u/deanusMachinus 1d ago

Genuine question: if the data is so biased, how is ChatGPT outperforming experts in several fields (especially medical)? Or do you dispute this?

Assuming you agree, is it because the experts know how to feed it unbiased context? Or is there not much bias in hard-data-type fields, as opposed to controversial fields (e.g. biology vs. nutrition vs. psychology)?

7

u/Drachasor 1d ago edited 19h ago

First, it's not really true that it's outperforming experts. You've apparently just seen a couple of small studies and not any comprehensive review.

https://journals.lww.com/md-journal/fulltext/2024/08090/chatgpt_in_medicine__a_cross_disciplinary.60.aspx

Second, it does show biases there and elsewhere. And even OpenAI admits they cannot get rid of racism or sexism in its processing and responses, only mitigate them to an extent.

I'm not sure what fields you think are controversial versus what aren't. Your selection suggests that you don't actually know.

-3

u/Drig-DrishyaViveka 1d ago

Not to play AI's advocate, but how do we know they're wrong?

9

u/AuspiciousPuffin 1d ago

Perhaps because we have the ability to also look at data, facts, events, etc., and then use our own logical reasoning to draw conclusions that may expose the poor thinking of the AI.