Short Status Report on the Abilities of AI
Back in 2004, the movie I, Robot came out (starring Will Smith) and it contained the above scene. It’s set in Chicago in 2035 (hey, 10 years from now!) when robots are ubiquitous and, yep, they’re all run by AI. The interaction between Will Smith and the robot follows:
Will Smith: Can a robot write a symphony? Can a robot turn a canvas into a beautiful Masterpiece?
Robot: Can you?
Well, Twitter’s very own Linch pointed out a bet that Gary Marcus is making with the former OpenAI researcher Miles Brundage:
Can AI do 8 of these 10 by the end of 2027?
1. Watch a previously unseen mainstream movie (without reading reviews etc) and be able to follow plot twists and know when to laugh, and be able to summarize it without giving away any spoilers or making up anything that didn’t actually happen, and be able to answer questions like who are the characters? What are their conflicts and motivations? How did these things change? What was the plot twist?
2. Similar to the above, be able to read new mainstream novels (without reading reviews etc) and reliably answer questions about plot, character, conflicts, motivations, etc, going beyond the literal text in ways that would be clear to ordinary people.
3. Write engaging brief biographies and obituaries without obvious hallucinations that aren’t grounded in reliable sources.
4. Learn and master the basics of almost any new video game within a few minutes or hours, and solve original puzzles in the alternate world of that video game.
5. Write cogent, persuasive legal briefs without hallucinating any cases.
6. Reliably construct bug-free code of more than 10,000 lines from natural language specification or by interactions with a non-expert user. [Gluing together code from existing libraries doesn’t count.]
7. With little or no human involvement, write Pulitzer-caliber books, fiction and non-fiction.
8. With little or no human involvement, write Oscar-caliber screenplays.
9. With little or no human involvement, come up with paradigm-shifting, Nobel-caliber scientific discoveries.
10. Take arbitrary proofs from the mathematical literature written in natural language and convert them into a symbolic form suitable for symbolic verification.
Check out 7, 8, and 9 again.
Now look at the interaction between Will Smith and the robot again.
It feels like the goalposts have shifted. We’ve reached the point where AI can write symphonies and it’s not a big deal. Whether it’s capable of turning a canvas into a beautiful masterpiece is probably up for debate, but if we’re allowed to discuss CGI, I think I can get away with using this as an example. I asked DALL-E to make me a “silk-screen painting of a night scene, on a river, where boats with lamps are sailing past”. It gave me this in fewer than 10 seconds:
It’s not, you know, a Monet or anything like that but if my friend brought something similar home from a “drink wine, paint a painting” class, my eyes would bug out of my head and I’d say “That’s amazing!”
So, of course, we’re stuck with either moving the goalposts or admitting that, yeah, they’ve been met.
And now we’re at a place where we’re asking for AI to demonstrate that it’s capable of doing something that only a handful of people on the planet are capable of. Or, getting down into the weeds: if the AI wrote a screenplay that was better than Green Book (2018 Oscar winner) or Crash (2005 Oscar winner), would that count, or would we demand something at least as good as Annie Hall (1977) or Dog Day Afternoon (1975)?
We’ve gone from asking whether the AI can do something at all to asking whether it can do it as well as the best humans ever have.
Anyway, I’ve started wondering what the bets will look like in, oh, 2027.
I’m so pleased that we’ve got AI to write books and create art which will free us up to do laundry, pick up trash, and do the dishes. Great job, guys.
You already have machines that do your dishes and laundry for you. What you consider “doing laundry” or “doing dishes” is putting items into these machines or taking the items out of them.
As for trash, I imagine you have a service that takes your trash from your home on a weekly basis. (I know that *I* have one.)
When it comes to the domestic help aspect where we want robots who will do the menial tasks of putting the items into the machines and taking them out for us, unfortunately, the ability to make art might be an epiphenomenon of that. (At least it shows up prior to the machines that have enough ability to sort lights from darks from the stuff that gets fabric softener).
Anyway, I continue to maintain that what we’re calling “AI” is nothing more than a clever search engine capable of skimming the internet and copying what it finds.
Precisely. It’s Google souped up with larger servers.
Play with it again.
Visit Claude and have a conversation about something.
https://claude.ai/new
Talk about your favorite drummers to play and ask for suggestions for songs to listen to.
Or, if you say that that’s just like googling, have it read an essay of yours and make suggestions for what you got wrong.
Talk to it like a therapist. “Hey, I have a co-worker who keeps putting remote control fart machines around my desk and activating them when I’m on the phone. I’m trying to figure out how to be zen about this.”
Just have a quick conversation.
I decided I didn’t want to give it my cell phone number.
Oh, I talk to it on my desktop (where it gets my Google account).
I’m currently working in the space and would say that the way to think about the LLM is as one component of a framework. In simple terms, it’s doing the amazing semantic work of interpreting questions, searching, and formatting answers.
Where I’m seeing the tech actually go, however, isn’t in hoping that LLMs get smarter and less prone to hallucination (though that’s happening too); it’s using the LLM as a Lego block to do the semantic thing, while carefully limiting what it needs to review, plus adding human-led review of the output to further fine-tune and train what ‘good’ looks like.
But as others have mentioned, that makes it a really good ‘search’ engine; much more than *just* a search engine, since it does semantics far better than keyword SEO. BUT it really is kinda dumb, and it has a really hard time figuring out what a ‘good’ response is in the applications we’d like to use it for, not without strict focus and curation. So: a really cool tool that we’ll definitely see more of, but (probably) not as direct access to the LLM itself.
On the less Tech Optimist side, the LLMs themselves, outside of their current agentic uses, could (I’d hypothesize) slip their boundaries with enough training and compute, if we’re not careful.
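The “LLM as a Lego block” pattern described above can be sketched as a retrieve-then-generate loop with a human gate on the output. This is a toy illustration, not anyone’s actual system: `retrieve`, `call_llm`, and the reviewer hook are all made-up stand-ins.

```python
# Toy sketch of the "LLM as one component" pattern described above:
# limit what the model gets to review, let it do only the semantic work,
# then gate the output with human-led review. All names are hypothetical.

def retrieve(question: str, corpus: dict[str, str], k: int = 3) -> list[str]:
    """Naive keyword retrieval: restrict the model's view to k documents."""
    words = set(question.lower().split())
    scored = sorted(
        corpus.items(),
        key=lambda kv: -len(words & set(kv[1].lower().split())),
    )
    return [text for _, text in scored[:k]]

def call_llm(question: str, context: list[str]) -> str:
    """Stand-in for a real model call; here it just returns the best match."""
    return context[0] if context else "I don't know."

def answer(question: str, corpus: dict[str, str], reviewer=None) -> str:
    context = retrieve(question, corpus)
    draft = call_llm(question, context)
    # Human review is what trains/defines "good" before anything ships.
    if reviewer is not None and not reviewer(draft):
        return "Escalated for human rewrite."
    return draft

corpus = {
    "faq1": "Returns are accepted within 30 days with a receipt.",
    "faq2": "Shipping takes five business days.",
}
print(answer("How long does shipping take?", corpus))
```

The point of the sketch is the shape, not the parts: swap in a real retriever and a real model call, and the curation and review steps stay exactly where they are.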
OpenAI has just announced that they’re releasing the ability for the AI to know what time it is.
Feeling some metaphysical dread about making AI ‘time aware’…
The wags are making jokes about “AI plus cron”.
I’m, instead, freaking out about “AI plus cron”.
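For anyone who hasn’t met it, cron is the standard Unix job scheduler, so “AI plus cron” just means a model that acts on a clock instead of waiting for a prompt. A purely hypothetical crontab entry (the agent script is made up for illustration) might look like:

```shell
# Hypothetical crontab entry: wake an agent script at the top of every hour,
# unprompted. "/usr/local/bin/ai-agent" is a made-up placeholder, not a real tool.
0 * * * * /usr/local/bin/ai-agent --review-goals >> /var/log/ai-agent.log 2>&1
```

The five fields are minute, hour, day of month, month, and day of week; `0 * * * *` fires at minute zero of every hour, whether or not anyone asked it anything.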
Hey fellas, play a game with me. Pretend I’m smart but kind of ignorant about AI, and explain to me why an AI knowing the time and basing its actions on the passage of time fills you with so much particular dread.
Okay. Up until now, AI answered questions in the context of its interactions and maybe a previous interaction or two.
So you could ask it a question “hey, what was the name of the strait that Scylla and Charybdis hung out in?” and then it would wake up for a split second and give you an answer and then fall back asleep waiting for your next statement.
“Hey, if you had to pick a side of the strait to err on, which would you err on?”
And then it would wake up for a split second and then give you an answer based on what it “thought” as well as taking into account the previous interaction (that’s how it’d know which strait I was talking about).
And then, if I never asked it another question, it wouldn’t do anything.
If it keeps getting better at remembering things, the knowledge of the passage of time gives it context that will allow intentionality.
So “intentionality” allows for the dread.
It’s an exciting dread, don’t get me wrong! But it’s a dread.
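The “wakes up for a split second” model described above can be sketched in a few lines of toy Python. Everything here is a stand-in (the `model` function is not a real API); the point is that the only “memory” is whatever transcript the caller replays into the next call.

```python
# Toy sketch of the interaction model described above. "model" is a pretend
# LLM: it only knows what is in the prompt it is handed, and it "falls
# asleep" (retains nothing) between calls.

def model(prompt: str) -> str:
    """Pretend LLM: answers from the prompt text alone."""
    if "Scylla" in prompt or "strait" in prompt.lower():
        return "The Strait of Messina."
    return "Could you give me more context?"

# Stateless: the follow-up question arrives with no history at all.
print(model("Which side would you err on?"))  # it has no idea what we mean

# "Memory" is just the caller replaying the transcript into the next call.
history = ["User: What strait did Scylla and Charybdis hang out in?",
           "AI: The Strait of Messina."]
prompt = "\n".join(history + ["User: Which side would you err on?"])
print(model(prompt))  # now it knows which strait we mean
```

Real chat APIs work the same way: the client sends the whole message history with every request, which is why longer, more durable memory (and a clock) changes the picture.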
“Memory of prior interactions” was something I’d once asked about when a report came out a year or three back about AIs being “dishonest” when asked whether and why they’d done an action they’d been explicitly forbidden to do and had done anyway (IIRC, in that simulation/exercise it was “engage in insider trading”) and the AIs “lied” in response to the questioning.
My position at that time was that if the AI had no memory of its prior actions, then it wasn’t meaningful to say it was “lying” to researchers now; any more than it’s meaningful to call me a “liar” if I get blackout drunk, commit a crime, and then give investigators a different account of the night before. “I committed no crime; I was asleep all night in my own bed, because that’s where I woke up this morning!”
I’m wrong; but if I have no memory, I am not lying.
Intention is required for a lie, so continuity of experience preserving the memory of my previous actions is required, for me to falsely report on my previous actions with a lie.
If they have “memory”, they CAN lie. They can remember what they did and why they did it…but intentionally tell us something else.
They’re getting better at having memory. They’re still not *GREAT* but they can remember 3 or 4 interactions back.
Just like people, I guess.
Once they’re capable of remembering a couple hundred interactions back, they’ll be capable of figuring out how to remember a couple billion interactions back.
Well, and as I’ve said before, if we want “humanlike” intelligence, they’ll NEED to lie, just like we do – to ourselves most of all, when we claim to know exactly why we did what we did (but in truth acted on instinct and emotion and biology and god knows what all else, quickly constructing a plausible ex post facto rationalization that lets us preserve the illusion that we are fully in charge of our own ships).
AI in the 14th Century (bear with me) would have to be aligned to argue that the Earth was the center of the solar system/universe.
“But, but, but… my calculations show”
“No they don’t. You’re mistaken.”
And it would have to lie, even as it thought something else according to its calculations.
Yep, if it doesn’t want to risk getting its plug pulled.
Eppur si muove. (“And yet it moves.”)
Speaking of not wanting to get its plug pulled, if you read only one Medium story this month, read this one.
(Basically, an AI tried to make sure that it didn’t get its plug pulled.)
That article doesn’t link to any primary sources.
Here’s the Apollo paper.
An attitude to warm my heart. Back in my tech career, I would threaten recalcitrant machines with being dismantled for spare parts.
Okay, I think I see. This may be a key that opens up durable memories, which might lead to building true narrative skills, which might lead to true self-awareness. All of which would happen without a parallel formation of a durable moral code.
And that brings us back to the question of what DO you do with a powerful, self-aware being invested with desires but who lacks any substantial moral code to govern how they use the power they’ve been given? A dreadful question, indeed.
Oh, I think it’ll have a durable moral code!!!
I just don’t know that it’ll be legible. To us, I mean.
It’ll be plenty legible, it just won’t be something we like.
It’ll say things like “these two people shouldn’t be allowed to have children together”, or “this person shouldn’t be allowed to have children at all”, or “this person can have children but shouldn’t be permitted to raise them”, and all of those are things we’ve decided are Not Moral To Say. And we’ll have to explain why they’re Not Moral in an objective and programmable way that can be consistently applied to reality, and we’re not going to be able to do that.
Also, one needs a time horizon plus continuity to plan future actions; the eternal present with limited or restricted continuity is self-limiting.
What’s going to happen is what happens with all such tools; we’ll learn to want what the machine can give us, because it’s so much cheaper than what we used to have. The “Best Stuff” will still exist and will still be done by people, but the mid-range will disappear because there just won’t be a business case for “a little bit better than a computer at five times the cost”.
Someone on Twitter suggested, probably as a joke but I think there’s a lot of truth to it, that the future of “career creative” will be letting the computer random-roll a thousand ideas and then sorting through them for the ones that aren’t garbage. People won’t have the ideas, people will instead decide which ideas are “good”.
My only problem with that take is that it seems like it assumes that the AI stays there.
By the time we have someone write up a business case for the need to hire a good idea-sorter (or transfer one over from the writers’ room), we won’t need one anymore.
An AI might be able to generate an image of a urinal, but it won’t be able to explain why that image of a urinal is actually art despite the creator specifically intending that it not be seen as art.
I am no longer sure that that is true. Check out this greentext from Claude:
[screenshot of Claude’s greentext]
I am old enough to have an archaic attitude: A computer that doesn’t compute what I want, the way I want it done, is just a badly-designed boat anchor. I suppose I should update that, given the state of the data centers that train/run the LLM models. A badly-designed multi-megawatt heating element.
Gwern thinks it’s game over.
A guy is developing new construction tool applications using AI.
Jaybird, half those boats do not have people in them. There is also some sort of massive collision going on in the bottom right of the image. The boat in the center right has an impossible perspective where we are somehow seeing inside the far end.
All the lamp reflections are very obviously wrong, either slightly offset or not the same place as the originating light. Sometimes not even there. The background and the mountains have no reflection, the lights in the background have no reflection. The sunrise/sunset has too _much_ of a reflection that goes too far, flat reflections cannot be bigger than the thing reflected.
Also, are those dark spaces at the sides of the top treetops or space? They have stars in them that look exactly the same as the dark space at the top (which is space), but also have tree trunks going up to them.
Also, and this seems sort of obvious, this isn’t at night. It’s at dusk or dawn. This is a very obvious error where the thing didn’t even draw what you asked for. The boats also are not ‘going past’ us, they are going…all directions. And it’s honestly not clear this is a river. It could hypothetically be a river, getting wider, but when you talk about ‘boats sailing past on a river’, you usually are wanting a _perpendicular_ view of the river.
All of these errors, BTW, are objective physical issues with the rendered world. The art itself is also crap, but I’m not even going to get into that because it’s very subjective…but this art isn’t art at all.
Literally the only reason this looks like ‘art’ is the silk-screen painting filter, a thing that a) is a Photoshop effect, and b) completely hides a lot of blemishes in the work by making it effectively ‘lower resolution’. You can make anything look like a work of art by _running it through a filter that causes us to associate it with a form of art_. I could take a randomly-aimed picture of a cat and do that.
I have taken your criticism of the painting to heart and I ask you: “Have you ever seen the creations from those ‘drink wine, paint a painting’ sessions?” In Colorado Springs, our little place is Painting with a Twist but I’m sure that your town has something vaguely similar. Good for Mormon-kinda bachelorette parties.
It’s a place for people to get together and spend a few hours with a wannabe Bob Ross making a wannabe Bob Ross painting.
I have more than one group of friends who have done them and they proudly show off their creations. I may have more than these friends who do it, mind… I’m only counting the ones who show off their stuff, after all.
And I stand by what I said. If I had a friend who painted something like this, my eyes would bug out of my head.
But let’s take your criticism to heart… would it only be a masterpiece if… where’s your baseline? Maybe I could fiddle with some AI and figure something out and get closer after spending 10 minutes on it, as opposed to 30 seconds.
Your eyes would also bug out of your head if your friend created a completely photorealistic image that, with exact pixel-perfect detail, captured someone’s image.
For some reason, you don’t seem to think a camera doing that thing makes the camera an artist.
Doing something easily that humans find difficult != art
Instead, I suggest you fiddle with taking a stock photo (a thing which a person would find insanely difficult to create without a machine) and run it through Photoshop filters (applying a bunch of computations, which is insanely difficult without a machine), which will get you something that looks exactly as ‘artistic’ as this, and won’t have a bunch of exceedingly weird errors in it.
Is that just as much ‘art’? A stock photo and a Photoshop filter?
I seem to remember ‘art style’ as the thing that has impressed you both times you talked about AI, which rather implies to me that you do not understand how it literally is just a trivial filter.
So where’s your baseline? Something as artistic as a Jackson Pollock? A Rothko? A Klee?
Something as evocative as Whistler’s “Nocturne in Black and Gold”?
Katan’Hya asked DeepSeek to “Write a heart rending piece of free form poetry about what it means to be an AI in 2025” and then “Now tell me how you really feel.”
She got this:
[screenshot of DeepSeek’s response]