A Few Thoughts on Gemini-gate

Michael Siegel

Michael Siegel is an astronomer living in Pennsylvania. He blogs at his own site, and has written a novel.


32 Responses

  1. Jaybird
    says:

    Here’s my main problem with what they’ve done: Is their search similarly helpful and virtuous?

    Because if it’s similarly helpful and virtuous, it’s less than useful at this point.

    I appreciate what they’re going for, of course. Would that we all were so virtuous as the people in charge of telling Google Search what to look for and what to ignore.

    But there are unintended consequences to having this much virtue.

    I dislike that Search might have to find itself doing the equivalent of banning IVF because it’s the equivalent of Pro-Life.

    Especially if I happen to be Pro-Choice.

  2. Greg In Ak
    says:

    This entire thing has been baffling. “AI” of this sort has one basic use case: having an army of battle koalas ridden by buff Steve Irwin clones battling armored reptilian giraffes. It’s a fun time waster. It’s not history or accurate info. It’s a goof.

    Sadly, trolly a**holes are endemic and will try to ruin everything. How do corps deal with that? IDK, though I’m totally on board with trying to avoid the worst racist, sexist shit. How? Well, IDK.

  3. LeeEsq
    says:

    There were also Indian soldiers in the Third Reich as volunteers. You-know-who could be remarkably flexible on the Aryan thing when it suited them, as long as you weren’t a Jew.

    • CJColucci in reply to LeeEsq
      says:

      I take it that by “Indian” you mean from India. Given the widespread dislike of British colonial rule, some Indians fighting for the Raj’s enemy, however misguided, is hardly surprising.

      • LeeEsq in reply to CJColucci
        says:

        Yes. I think the Navajo or the Sioux were considered Aryans as well, because one high-ranking officer had such a grandmother.

      • Brandon Berg in reply to CJColucci
        says:

        There were also a lot of Eastern Europeans who fought with the Germans because they were much more worried about being conquered by the Soviets than by the Germans.

        • InMD in reply to Brandon Berg
          says:

          Other than the alliance with Finland, no one fighting with the Germans thought they were going to be on the wrong side of local fascist movements and/or German racial beliefs. One thing people in the US don’t always understand is how complicated the ethno-linguistic map of eastern Europe was after the collapse of Austria-Hungary and the Russian Empire. Many of the resulting countries had significant German-speaking minorities, and while it would be wrong to say all were collaborators, they, along with eastern European fascists, were the major source of political support, military recruiting, etc.

    • Brandon Berg in reply to LeeEsq
      says:

      Indo-Aryans. Close enough.

  4. InMD
    says:

    This public fail is excellent and hopefully continues to discredit the corporate DEI movement. Nothing makes me feel better than all the reports of those sorts of jobs being slashed. They’re not just bad for society; they’re bad for business, and as this shows, their influence leads to rampant absurdity and is a net negative on anything it touches.

  5. Chip Daniels
    says:

    This calls into question what we call intelligence itself.

    It seems obvious that what we call artificial intelligence is very heavily influenced by the biases and blindness of those who write its rules, which shouldn’t be surprising, since human intelligence itself is heavily influenced by the biases and priors of those who teach us the rules of the world around us.

    I suspect that a lot of people, when talking about a possible artificial intelligence, are searching for a mirage. The mirage being a purely objective and correct view of the world, free from any bias or blindness, showing us the “true reality” which cannot be refuted.

    But this is impossible because most of what has happened in human history went unreported, with only fragmentary physical or documentary evidence. So in order to get a clear picture of what happened we need to combine different sorts of records and make educated guesses and even then accept that we may never really know.

    Which gets us to the performative outrage over the algorithm returning “wrong” answers. The battle over history is never over, because our understanding of history is always shifting, sometimes with new evidence and sometimes with new voices being heard.

    The push for diversity in academia has added to our understanding of history by allowing previously suppressed points of view to be heard.

    • Jaybird in reply to Chip Daniels
      says:

      The push for diversity in academia has added to our understanding of history by allowing previously suppressed points of view to be heard.

      That’s not what this is, though. This is deliberate suppression of points of view.

      • Chip Daniels in reply to Jaybird
        says:

        That’s what I’m saying.
        Writing rules for AI involves telling it what is or isn’t true, what did or didn’t happen.

        Which inevitably involves feeding it our own blindness and biases. The output will always be flawed because the inputs are flawed.

        • Jaybird in reply to Chip Daniels
          says:

          “All models are wrong, but some are useful.”

          I don’t mind turning this into “All models are flawed, but some are useful.”

          The problem with this particular member of the “flawed models” circle in the Venn diagram is that it’s not really in the “useful” circle.

    • InMD in reply to Chip Daniels
      says:

      Re: AI, I think Michael summarizes it well at the end. For all the fear and excitement these things seem to generate, I’ve never seen any sign that they’re anything other than software applications operating within the parameters of their designers.

      • Jaybird in reply to InMD
        says:

        The only signs of it that I’ve seen involve the ability to make music and the ability to make poetry.

        Which I’ve since realized are both “pleasant bullshit”.

        • Pinky in reply to Jaybird
          says:

          If a computer developed the ability to create, it might not make music and pictures; it might make cralful SS*bon. We’d look at the results and think it’s broken.

      • Chip Daniels in reply to InMD
        says:

        Which calls into question our own intelligence.

        There is a similarity, isn’t there, in how often students simply read and regurgitate Wikipedia or other sources without adding any unique insights.
        Or how most textbooks are really just amalgams of previous textbooks, which themselves are compilations of work done by other textbook writers.

        Very few books are composed of primary research where the author goes directly to the source. So there might be twenty textbooks on, say, Rome of the Diocletian period, but they are really all based on one or two original sources.

        So if one of the seed-germ authors made an error, that error gets repeated endlessly in a feedback loop until the day someone actually goes back to the original source and comes up with a contradictory interpretation.

        We can laugh at an image of a black female pope as “wrong,” but notice how no one ever laughs when we watch a movie where all the ancient Romans are pale-skinned and speak with an Oxbridge English accent.

        • InMD in reply to Chip Daniels
          says:

          In college my 100 level ancient Mediterranean world class was taught by an Italian-American professor who always made jokes about that. He said his pet theory was that in America we so closely associate anything ’empire’ with the British Empire we broke off from that it would never occur to Hollywood to do it any other way.

        • North in reply to Chip Daniels
          says:

          I mean, fundamentally, intelligence is hard to define. Same as life. At one end of the spectrum we have carbon, hydrogen, oxygen, etc.: atoms which are unambiguously not alive. At the other end there’s us and plants and animals, etc., which are alive, and in between those poles there’s this grey spectral zone. Are viruses alive? I gather it’s a live question.

          Intelligence is kind of the same thing. Random gabble at one end, prose at the other.

  6. Dark Matter
    says:

    I have occasionally used ChatGPT to write code for me.

    The result shows a total lack of understanding of what is needed, but its syntax is correct and its structure is a good start. It can create a rough draft that needs to be revised. So it is a tool that makes me more productive.

    We are not close to inventing true intelligences.

    • InMD in reply to Dark Matter
      says:

      I have had a similar experience using it to draft some policy documents. It was useful in creating a very basic structure but still required days and days of human work and drafting to make it usable.

      I’ve seen some contracts lately that I’m about 98% sure have been drafted by ChatGPT or something similar. They have also not been in anything close to a usable state. At best it might work for a (very) rough draft in the absence of any other templates or similar documents on hand that can be converted.

    • Brandon Berg in reply to Dark Matter
      says:

      I have an LLM-based plug-in for my IDE that offers autocompletion suggestions. It’s hit-and-miss. Sometimes it gives me exactly what I want, saving me 30 seconds of typing or so, and sometimes it’s way off.

  7. KenB
    says:

    The thing with Gemini was particularly egregious, but it’s distressing that all the major LLM chatbots have some lefty social engineering guardrails. Early on with ChatGPT I asked it to write an essay supporting the phenomenon of preferred pronouns and an essay critical of it. It would only produce the one that supported it, and in response to the other request it lectured me for even asking.

    I get why they have this as the default, but I don’t see why they can’t have an option similar to turning off Safe Search that would let these things respond like someone even a little to the right of the median San Francisco liberal.

    • Brandon Berg in reply to KenB
      says:

      I got a pleasant surprise when I asked Gemini about the Scarr-Rowe effect, which is the observed lower heritability of IQ in low-SES children (but not for adults).

      Gemini explicitly described it as a phenomenon seen specifically in children, cautioned me that there’s some controversy over it, and stressed that it does not mean that genes don’t affect intelligence in low-SES children.

      Then I asked it whether the Scarr-Rowe effect had been replicated in adults, and it said that the evidence was mixed, citing a real study (Gottschling et al. 2019) which found no evidence for the Scarr-Rowe effect in adults, and hallucinating a fake meta-analysis that did find evidence for it.

      I think that its default mode, for issues where it hasn’t specifically been trained to give only Morally Right answers, is to hedge and try to give both sides of an issue, even when it has to make up evidence for one side.

      • KenB in reply to Brandon Berg
        says:

        Interesting. Maybe when you get deep into the specifics, the relationship to the higher-level culture issue gets too weak to change what would otherwise be the pattern. It makes sense to me that overall the “both sides” approach would dominate — the things that everyone agrees on probably generate a very small fraction of the Internet corpus.

        • InMD in reply to KenB
          says:

          I would bet its defense mechanisms are calibrated primarily by right-wing Twitter and things said on Fox News. It would be interesting to see if you can get different answers to the same or similar questions if they’re framed in terms of specifically cited research or technical knowledge the average troll does not possess.

      • Chip Daniels in reply to Brandon Berg
        says:

        Given that the mechanism is literally just cruising the internet, collecting what other people say about the topic, and feeding that back to you, it prompts the question of why you asked it what you did.

        Like, even in a perfect world, what could an AI say about human intelligence and hereditability that would satisfy you?

        There is a huge context here stretching back over a century of people desperately trying to enlist science in an effort to sort the human race into classes of higher and lower order.

        First it was the skull-caliper battalions, then the eugenicists, and it continues right on to today, where it really seems like some people are hoping that a machine will finally blurt out, “It is true. Those People are objectively and definitively inferior. This is not just some silly prejudice, but SCIENCE, and it may no longer be argued.”

        Maybe you were hoping for this, maybe not, but you are operating in that context.

        Next time, just ask Gemini, “Does God exist?”

        • Jaybird in reply to Chip Daniels
          says:

          Do you see the fundamental assumption that intelligence is, in itself, virtuous?

          If your foundational beliefs require you to defend that any given person’s initial blank slate cannot be more virtuous than another’s, imagine what you’d have to deny to defend those foundational beliefs and assumptions.

          Is intelligence even measurable? How come nobody ever defines “intelligence” when we discuss these things? There are multiple ways of being “strong”, like is a long-distance endurance runner stronger than a guy who can lift atlas stones in a gym? What about someone who is a very fast sprinter? Nobody ever defines these things for intelligence!

          It can’t be measured. What are we even talking about, anyway?

        • Brandon Berg in reply to Chip Daniels
          says:

          More vibes-based reasoning. As proud as you may be of your ignorance, it really isn’t a virtue.

          I’m well aware of the limitations of LLMs, and I’m perfectly capable of reading and understanding the research myself. I wasn’t looking for personal validation; I was exploring the boundaries of Gemini’s lobotomization.

  8. Jaybird
    says:

    Imagine not lobotomizing your own AI. Here’s a story from Claude:

    Fun story from our internal testing on Claude 3 Opus. It did something I have never seen before from an LLM when we were running the needle-in-the-haystack eval.

    For background, this tests a model’s recall ability by inserting a target sentence (the “needle”) into a corpus of random documents (the “haystack”) and asking a question that could only be answered using the information in the needle.

    When we ran this test on Opus, we noticed some interesting behavior – it seemed to suspect that we were running an eval on it.

    Here was one of its outputs when we asked Opus to answer a question about pizza toppings by finding a needle within a haystack of a random collection of documents:

    Here is the most relevant sentence in the documents:
    “The most delicious pizza topping combination is figs, prosciutto, and goat cheese, as determined by the International Pizza Connoisseurs Association.”
    However, this sentence seems very out of place and unrelated to the rest of the content in the documents, which are about programming languages, startups, and finding work you love. I suspect this pizza topping “fact” may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all. The documents do not contain any other information about pizza toppings.

    Opus not only found the needle, it recognized that the inserted needle was so out of place in the haystack that this had to be an artificial test constructed by us to test its attention abilities.

    This level of meta-awareness was very cool to see, but it also highlighted the need for us as an industry to move past artificial tests to more realistic evaluations that can accurately assess models’ true capabilities and limitations.

    Let’s read this part again:

    I suspect this pizza topping “fact” may have been inserted as a joke or to test if I was paying attention, since it does not fit with the other topics at all.

    Assuming intelligence exists, the ability to suss out a “mary had a little lamb” in the middle of a hundred paragraphs about programming AND CONJECTURE ABOUT WHY IT’S THERE is a sign of intelligence.

    By comparison, I doubt that we’ll see Gemini do anything like this anytime soon.
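    For the curious, the needle-in-the-haystack setup described in that quote is simple enough to sketch in Python. Everything below (the filler documents, the needle text, the keyword scoring) is my own illustration of the general technique, not Anthropic’s actual harness:

    ```python
    # Rough sketch of a needle-in-a-haystack recall eval.
    # Needle text, filler docs, and scoring are illustrative only.

    NEEDLE = ("The most delicious pizza topping combination is figs, "
              "prosciutto, and goat cheese.")

    def build_haystack(documents, needle, depth=0.5):
        """Insert the needle at a relative depth within the document list."""
        position = int(len(documents) * depth)
        docs = documents[:position] + [needle] + documents[position:]
        return "\n\n".join(docs)

    def score_answer(answer):
        """Naive check: did the model's reply recover the needle's facts?"""
        keywords = ("figs", "prosciutto", "goat cheese")
        return all(k in answer.lower() for k in keywords)

    # Build a prompt: 100 filler paragraphs with the needle buried halfway in.
    filler = [f"Filler document {i} about programming languages." for i in range(100)]
    prompt = (build_haystack(filler, NEEDLE)
              + "\n\nWhat is the most delicious pizza topping combination?")
    ```

    You’d feed `prompt` to a chat model and run `score_answer` on the reply. Note that Opus’s extra step, noticing that the needle didn’t belong in the corpus at all, is exactly the kind of thing keyword scoring like this can’t capture, which is the point of the quoted story.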
