A Few Thoughts on Gemini-gate

Gemini was ill-prepared for a second generation of internet trolls that have platforms on Fox News and the New York Post.

Michael Siegel March 1, 2024

Some of you may remember the 1990 film Robocop 2¹ In the film, a committee decides that Robocop’s three directives are not enough. To better serve the community, they give him something like 300 directives about being nice and doing community service. The result is … well, a mess.

I was reminded of this scene with the mini-fiasco that broke out over Google’s new AI program called Gemini:

Google CEO Sundar Pichai told employees in an internal memo late Tuesday that the company’s release of artificial intelligence tool Gemini has been unacceptable, pledging to fix and relaunch the service in the coming weeks.

Last week, Google paused Gemini’s ability to create images following viral posts on social media depicting some of the AI tool’s results, including images of America’s Founding Fathers as black, the Pope as a woman and a Nazi-era German soldier with dark skin.

The tool often thwarted requests for images of white people, prompting a backlash online among conservative commentators and others, who accused Google of anti-white bias.

The Google team has since written a post trying to explain the issue was “fine-tuning”. Essentially, they were trying to make it impossible for people to get the AI to create racist, misogynistic or violent images. But by giving the program so many priorities, the results was, well, this.²

The thing is … as with all outrages, a lot of this is performative. First of all, many of these “woke” images were the result of deliberate attempts to fool the AI. For example, with the Nazi soldiers with dark skin, Gemini initially refused to generate them but then did it when “soldier” was misspelled. The image of black Founding Fathers was notably posted on Twitter and the New York Post without the prompt that led to the image, which makes me think a similar gaming was done.³

Moreover, the people who are screaming about Google going “woke” said nothing when Microsoft’s AI was tricked into spewing racist garbage. That’s because everyone understood that the problem wasn’t Microsoft being full of racists; the problem was that their AI was ill-prepared for internet trolls.

With Gemini, the program was ill-prepared for a second generation of internet trolls.⁴ The difference is that these trolls have platforms on Fox News and the New York Post. So when they got the AI to create a picture of the Founding Fathers as black, they used it to stoke the current moral panic over wokeness instead of, as most well-adjusted people would do, shaking their heads and laughing at how easily Google fell into a morass that should have been fairly obvious.

This is one of the reasons I am cautious about referring to “AI” as, well, “AI”. It does not demonstrate independent intelligence or sense, common or otherwise. It simply responds to programming. Very sophisticated programming, but programming nonetheless. As with all programs, if you try to optimize it to do everything, it will succeed in doing nothing. As with all programs, if you don’t have people outside your team thoroughly test it, it may fail on exposure to the wild. And as the great sage Montgomery Scott once said, “The more they overthink the plumbing, the easier it is to stop up the drain.”

You liars. No one remembers Robocop 2, the film that answered the burning question of “Was The Empire Strikes Back the best Star Wars film because of Kershner?” with “Eh, probably not.”
Notably, as Nate Silver notes, “objectivity” and “accuracy” were not among the priorities given to the AI.
The image generation feature has been shut down. But before it was, I tried a Founding Fathers prompt and got images of white men.
Which may be, in part, because Google is increasingly unused to dealing with people outside their ideological bubble.

32 thoughts on “A Few Thoughts on Gemini-gate”

Jaybird says:

March 1, 2024 at 9:48 pm

Here’s my main problem with what they’ve done: Is their search similarly helpful and virtuous?

Because if it’s similarly helpful and virtuous, it’s less than useful at this point.

I appreciate what they’re going for, of course. Would that we all were so virtuous as the people in charge of telling Google Search what to look for and what to ignore.

But there are unintended consequences to having this much virtue.

I dislike that Search might have to find itself doing the equivalent of banning IVF because it’s the equivalent of Pro-Life.

Especially if I happen to be Pro-Choice.Report
Greg In Ak says:

March 1, 2024 at 11:01 pm

This entire thing has been baffling. “AI” of this sort has one basic use case: having an army of battle koala’s ridden by buff steve irwin clones battling armored reptilian giraffes. It’s a fun time waster. It’s not history or accurate info. It’s a goof.

Sadly trolly a**holes are endemic and will try to ruin everything. How do corp’s deal with that. IDK though i’m totally on board with trying to avoid the worst racist sexist shit. How? Well IDK.Report
LeeEsq says:

March 2, 2024 at 6:03 am

There were also Indian soldiers in the Third Reich as volunteers. You know who could be remarkably flexible on the Aryan thing when it suited them as long as you weren’t a Jew.Report
1. CJColucci in reply to LeeEsq says:
  
  March 2, 2024 at 4:56 pm
  
  I take it that by “Indian” you mean from India. Given the widespread dislike of British colonial rule, some Indians fighting for the Raj’s enemy, however misguided, is hardly surprising.Report
  1. LeeEsq in reply to CJColucci says:
    
    March 2, 2024 at 6:02 pm
    
    Yes. I think the Navajo Native Americans or Sioux were considered Aryans as well because one high ranking officer had such as a grandmother.Report
  2. Brandon Berg in reply to CJColucci says:
    
    March 4, 2024 at 7:54 am
    
    There were also a lot of Eastern Europeans who fought with the Germans because they were much more worried about being conquered by the Soviets than by the Germans.Report
    1. InMD in reply to Brandon Berg says:
      
      March 4, 2024 at 8:55 am
      
      Other than the alliance with Finland no one fighting with the Germans thought they were going to be on the wrong side of local fascist movements and/or German racial beliefs. One thing people in the US don’t always understand is how complicated the ethno-linguistic map of eastern Europe was after the collapse of Austria-Hungary and the Russian Empire. Many of the resulting countries had significant German-speaking minorities and while it would be wrong to say all were collaborators they along with eastern European fascists were the major source of political support, military recruiting, etc.Report
2. Brandon Berg in reply to LeeEsq says:
  
  March 4, 2024 at 7:48 am
  
  Indo-Aryans. Close enough.Report
InMD says:

March 2, 2024 at 7:19 am

This public fail is excellent and hopefully continues to discredit the corporate DEI movement. Nothing makes me feel better than all the reports of those sorts of jobs being slashed. They’re not just bad for society they’re bad for business and as this shows their influence leads to rampant absurdity and is a net negative on anything it touches.Report
Chip Daniels says:

March 2, 2024 at 11:32 am

This calls into question what we call intelligence itself.

It seems obvious that what we call artificial intelligence is very heavily influenced by the biases and blindness of those who write its rules; Which shouldn’t be surprising since humans intelligence itself is heavily influenced by the biases and priors of those who teach us the rules of the world around us.

I suspect that a lot of people, when talking about a possible artificial intelligence, are searching for a mirage. The mirage being a purely objective and correct view of the world, free from any bias or blindness, showing us the “true reality” which cannot be refuted.

But this is impossible because most of what has happened in human history went unreported, with only fragmentary physical or documentary evidence. So in order to get a clear picture of what happened we need to combine different sorts of records and make educated guesses and even then accept that we may never really know.

Which gets us to the performative outrage over the algorithm returning “wrong ” answers. The battle over history is never over because our understanding of history is always shifting, sometimes with new evidence, and sometimes with new voices being heard.

The push for diversity in academia has added to our understanding of history by allowing previously suppressed points of view to be heard.Report
1. Jaybird in reply to Chip Daniels says:
  
  March 2, 2024 at 11:42 am
  
  The push for diversity in academia has added to our understanding of history by allowing previously suppressed points of view to be heard.
  
  That’s not what this is, though. This is deliberate suppression of points of view.Report
  1. Chip Daniels in reply to Jaybird says:
    
    March 2, 2024 at 11:52 am
    
    That’s what I’m saying.
    Writing rules for AI involves telling it what is or isn’t true, what did or didn’t happen.
    
    Which inevitably involves feeding it our own blindness and biases. The output will always be flawed because the inputs are flawed.Report
    1. Jaybird in reply to Chip Daniels says:
      
      March 2, 2024 at 12:22 pm
      
      “All models are wrong, but some are useful.”
      
      I don’t mind turning this into “All models are flawed, but some are useful.”
      
      The problem with this particular member of the “flawed models” circle in the Venn diagram is that it’s not really in the “useful” circle.Report
2. InMD in reply to Chip Daniels says:
  
  March 2, 2024 at 12:15 pm
  
  Re: AI I think Michael summarizes it well at the end. For all the fear and excitement these things seem to generate I’ve never seen any sign that they’re anything other than software applications operating within the parameters of their designers.Report
  1. Jaybird in reply to InMD says:
    
    March 2, 2024 at 12:23 pm
    
    The only signs of it that I’ve seen involve the ability to make music and the ability to make poetry.
    
    Which I’ve since realized are both “pleasant bullshit”.Report
    1. Pinky in reply to Jaybird says:
      
      March 2, 2024 at 1:32 pm
      
      If a computer developed the ability to create, it might not make music and pictures, it might make cralful SS*bon. We’d look at the results and think it’s broken.Report
  2. Chip Daniels in reply to InMD says:
    
    March 2, 2024 at 12:29 pm
    
    Which calls into question our own intelligence.
    
    There is a similarity isn’t there, in how often students simply read and regurgitate Wikipedia or other sources, without adding any unique insights.
    Or how, most textbooks are really just amalgams of previous textbooks which themselves are compilations of work done by other textbook writers.
    
    Very few books are composed of primary research where the author goes directly to the source. So there might be twenty textbooks on say, Rome of the Diocletian period, but they are really all based on one or two original sources.
    
    So if one of the seed-germ authors made an error, that error gets repeated endlessly in a feedback loop until the day someone actually goes back to the original source and comes up with a contradictory interpretation.
    
    We can laugh at an image of a black female pope as “wrong” but notice how no one ever laughs when we watch a movie where all the ancient Romans are pale skinned and speak with an Oxbridge English accent.Report
    1. InMD in reply to Chip Daniels says:
      
      March 2, 2024 at 3:38 pm
      
      In college my 100 level ancient Mediterranean world class was taught by an Italian-American professor who always made jokes about that. He said his pet theory was that in America we so closely associate anything ’empire’ with the British Empire we broke off from that it would never occur to Hollywood to do it any other way.Report
      1. Pinky in reply to InMD says:
        
        March 2, 2024 at 4:32 pm
        
        Shakespeare. Several plays set in Rome or the Roman Empire, or various Italian locations.Report
        
        InMD in reply to Pinky says:
        
        March 2, 2024 at 6:35 pm
        
        That seems very plausible.Report
    2. North in reply to Chip Daniels says:
      
      March 3, 2024 at 10:57 am
      
      I mean fundamentally intelligence is hard to define. Same as life. At one end of the spectrum we have carbon, hydrogen, oxygen etc… atoms which are unambiguously not alive. Then at the other end there’s us and plants and animals etc which are alive and then in between those poles there’s this grey spectral zone. Are virus’ alive? I gather it’s a live question.
      
      Intelligence is kind of the same thing. Random gabble at one end, prose at the other.Report
Dark Matter says:

March 3, 2024 at 12:06 pm

I have occasionally used ChatGPT to write code for me.

The result shows a total lack of understanding of what is needed, but it’s syntax is correct and it’s structure is a good start. It can create a rough draft that needs to be revised. So it is a tool that makes me more productive.

We are not close to inventing true intelligences.Report
1. InMD in reply to Dark Matter says:
  
  March 3, 2024 at 6:33 pm
  
  I have had a similar experience using it to draft some policy documents. It was useful in creating a very basic structure but still required days and days of human work and drafting to make it usable.
  
  I’ve seen some contracts lately that I’m about 98% sure have been drafted by ChatGPT or something similar. They have also not been in anything close to a usable state. At best it might work for a (very) rough draft in the absence of any other templates or similar documents on hand that can be converted.Report
2. Brandon Berg in reply to Dark Matter says:
  
  March 4, 2024 at 7:45 am
  
  I have an LLM-based plug-in for my IDE that offers autocompletion suggestions. It’s hit-and-miss. Sometimes it gives me exactly what I want, saving me 30 seconds of typing or so, and sometimes it’s way off.Report
KenB says:

March 3, 2024 at 9:49 pm

The thing with Gemini was particularly egregious, but it’s distressing that all the major LLM chatbots have some lefty social engineering guardrails. Early on with ChatGPT I asked it to write an essay supporting the phenomenon of preferred pronouns and an essay critical of it — it would only produce the one that supported it, but in response to the other option it lectured me for even asking.

I get why they have this as the default, but I don’t see why they can’t have an option similar to turning off Safe Search that would let these things respond like someone even a little to the right of the median San Francisco liberal.Report
1. Brandon Berg in reply to KenB says:
  
  March 4, 2024 at 7:43 am
  
  I got a pleasant surprise when I asked Gemini about the Scarr-Rowe effect, which is the observed lower heritability of IQ in low-SES children (but not for adults).
  
  Gemini explicitly described it as a phenomenon seen specifically in children, cautioned me that there’s some controversy over it, and stressed that it does not mean that genes don’t affect intelligence in low-SES children.
  
  Then I asked it whether it the Scarr-Rowe effect had been replicated in adults, and it said that the evidence was mixed, citing a real study (Gottschling et al. 2019) which found no evidence for the Scarr-Rowe effect in adults, and hallucinated a fake meta-analysis that did find evidence of such.
  
  I think that its default mode, for issues where it hasn’t specifically been trained to give only Morally Right answers, is to hedge and try to give both sides of an issue, even when it has to make up evidence for one side.Report
  1. KenB in reply to Brandon Berg says:
    
    March 4, 2024 at 9:32 am
    
    Interesting. Maybe when you get deep into the specifics, the relationship to the higher-level culture issue gets too weak to change what would otherwise be the pattern. It makes sense to me that overall the “both sides” approach would dominate — the things that everyone agrees on probably generate a very small fraction of the Internet corpus.Report
    1. InMD in reply to KenB says:
      
      March 4, 2024 at 9:45 am
      
      I would bet it’s defense mechanisms are calibrated primarily by right wing twitter and things said on Fox News. It would be interesting to see if you can get different answers to the same or similar questions if framed in terms of specifically cited research or technical knowledge the average troll does not possess.Report
  2. Chip Daniels in reply to Brandon Berg says:
    
    March 4, 2024 at 9:59 am
    
    Given that the mechanism is literally just cruising the internet and collecting what other people say about the topic then feeding that back to you, it prompts the question why you asked it what you did..
    
    Like, even in a perfect world, what could an AI say about human intelligence and hereditability that would satisfy you?
    
    There is a huge context here stretching back over a century of people desperately trying to enlist science in an effort to sort the human race into classes of higher and lower order.
    
    First it was the skull caliper battalions then the eugenicists and continuing right onto today, where it really seems like some people are hoping that a machine will finally blurt out “It is true, Those People are objectively and definitively inferior. This is not just some silly prejudice, but SCIENCE and it may no longer be argued.”
    
    Maybe you were hoping for this, maybe not, but you are operating in that context.
    
    Next time, just ask Gemini, “Does God exist?”Report
    1. Jaybird in reply to Chip Daniels says:
      
      March 4, 2024 at 10:10 am
      
      Do you see the fundamental assumption that intelligence is, in itself, virtuous?
      
      If your foundational beliefs require you to defend that any given person’s initial blank slate cannot be more virtuous than another’s, imagine what you’d have to deny to defend those foundational beliefs and assumptions.
      
      Is intelligence even measurable? How come nobody ever defines “intelligence” when we discuss these things? There are multiple ways of being “strong”, like is a long-distance endurance runner stronger than a guy who can lift atlas stones in a gym? What about someone who is a very fast sprinter? Nobody ever defines these things for intelligence!
      
      It can’t be measured. What are we even talking about, anyway?Report
    2. Brandon Berg in reply to Chip Daniels says:
      
      March 4, 2024 at 10:32 am
      
      More vibes-based reasoning. As proud as you may be of your ignorance, it really isn’t a virtue.
      
      I’m well aware of the limitations of LLMs, and I’m perfectly capable of reading and understanding the research myself. I wasn’t looking for personal validation; I was exploring the boundaries of Gemini’s lobotomization.Report
Jaybird says:

March 4, 2024 at 3:01 pm

Imagine not lobotomizing your own AI. Here’s a story from Claude:

Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.

For background, this tests a model’s recall ability by inserting a target sentence (the “needle”) into a corpus of random documents (the “haystack”) and asking a question that could only be answered using the information in the needle.

When we ran this test on Opus, we noticed some interesting behavior – it seemed to suspect that we were running an eval on it.

Here was one of its outputs when we asked Opus to answer a question about pizza toppings by finding a needle within a haystack of a random collection of documents:

Here is the most relevant sentence in the documents:
“The most delicious pizza topping combination is figs, prosciutto, and goat cheese, as determined by the International Pizza Connoisseurs Association.”
However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love. I suspect this pizza topping “fact” may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all. The documents do not contain any other information about pizza toppings.

Opus not only found the needle, it recognized that the inserted needle was so out of place in the haystack that this had to be an artificial test constructed by us to test its attention abilities.

This level of meta-awareness was very cool to see but it also highlighted the need for us as an industry to move past artificial tests to more realistic evaluations that can accurately assess models true capabilities and limitations.

Let’s read this part again:

I suspect this pizza topping “fact” may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all.

Assuming intelligence exists, the ability to suss out a “mary had a little lamb” in the middle of a hundred paragraphs about programming AND CONJECTURE ABOUT WHY IT’S THERE is a sign of intelligence.

By comparison, I doubt that we’ll see Gemini do anything like this anytime soon.Report

Comments are closed.

Jaybird in reply to Koz on The Lawless Lying Duplicitous Bastards of Abrego GarciaSo your opinion on the tariffs is immigration?
Koz in reply to Jaybird on The Lawless Lying Duplicitous Bastards of Abrego GarciaHow you feeling about the whole “Republicans are the party of fiscal sanity” thing given the tariffs…
Jaybird in reply to Koz on The Lawless Lying Duplicitous Bastards of Abrego GarciaHow you feeling about the whole "Republicans are the party of fiscal sanity" thing given the tariffs…
Koz in reply to Jaybird on The Lawless Lying Duplicitous Bastards of Abrego GarciaYeah, you’re right. I didn’t say “Trump is not even lying, he’s going to disappoint you, he’s not go…
Koz in reply to Jaybird on The Lawless Lying Duplicitous Bastards of Abrego GarciaHas he shown up recently? I'm still here, barely. In fact, I just submitted a guest post which hopef…
Jaybird on From Marginal Revolution: o3 and AGI, is April 16th AGI day?The benchmark that makes sense to me: Can it beat Pokemon? 10-year-olds can beat Pokemon.
Jaybird in reply to Jaybird on Open Mic for the Week of 4/14/2025Apparently, a guy went down to the address and, when the lady who lives there came out of his house,…
Dark Matter in reply to Chris on Open Mic for the Week of 4/14/2025Members of my extended family are homeless or headed there. They make terrible choices because of so…
Jaybird in reply to Chris on Open Mic for the Week of 4/14/2025The extent to which that thought is comforting compared to "the government is feckless if not malici…
Chris in reply to Jaybird on Open Mic for the Week of 4/14/2025One of the interesting things about homelessness is that we don't want to spend resources on people…

32 thoughts on “A Few Thoughts on Gemini-gate”

Related Stories

The Lawless Lying Duplicitous Bastards of Abrego Garcia

Panamanicans: U.S. Troops in Panama

Saturday Morning Gaming: Hollow Knight

You may have missed

Punchable Faces, 1914

Movie of a Man Who has Indulged in Too Much Black Coffee

The Lawless Lying Duplicitous Bastards of Abrego Garcia

Somebody is Always Taking the Joy Out of Life

Recent Comments

Recent Comments

Ordinary Twitter

Recent Comments