Stubsack: weekly thread for sneers not worth an entire post, week ending 27th April 2025

BlueMonday1984@awful.systems · 7 days ago

Stubsack: weekly thread for sneers not worth an entire post, week ending 27th April 2025

BlueMonday1984@awful.systems · 9 hours ago

New piece from the Wall Street Journal: We Now Know How AI ‘Thinks’—and It’s Barely Thinking at All (archive link)

The piece falls back into the standard “AI Is Inevitable™” at the end, but its still a surprisingly strong sneer IMO.

Soyweiser@awful.systems · 12 hours ago

"“Intel admits what we all knew: no one is buying AI PCs”

People would rather buy older processors that aren’t that much less powerful but way cheaper. The “AI” benefits obviously aren’t worth paying for.

https://www.xda-developers.com/intel-admits-what-we-all-knew-no-one-is-buying-ai-pcs/"

Jonathan Hendry@iosdev.space · 6 hours ago

@Soyweiser

My 2022 iPhone SE has the “neural engine" core. But isn’t supported for Apple Intelligence.

And that’s a phone and OS and CPU produced by the same company.

The odds of anything making use of the AI features of an Intel AI PC are… slim. Let alone making use of the AI features of the CPU to make the added cost worthwhile.

froztbyte@awful.systems · 12 hours ago

haha I was just about to post this after seeing it too

must be a great feather to add into the cap along with all the recent silicon issues

Soyweiser@awful.systems · 9 hours ago

You know what they say. Great minds repost Tante.

BlueMonday1984@awful.systems · 12 hours ago

New thread from Dan Olson about chatbots:

I want to interview Sam Altman so I can get his opinion on the fact that a lot of his power users are incredibly gullible, spending millions of tokens per day on “are you conscious? Would you tell me if you were? How can I trust that you’re not lying about not being conscious?”

For the kinds of personalities that get really into Indigo Children, reality shifting, simulation theory, and the like chatbots are uncut Colombian cocaine. It’s the monkey orgasm button, and they’re just hammering it; an infinite supply of material for their apophenia to absorb.

Chatbots are basically adding a strain of techno-animism to every already cultic woo community with an internet presence, not a Jehovah that issues scripture, but more something akin to a Kami, Saint, or Lwa to appeal to, flatter, and appease in a much more transactional way.

Wellness, already mounting the line of the mystical like a pommel horse, is proving particularly vulnerable to seeing chatbots as an agent of secret knowledge, insisting that This One Prompt with your blood panel results will get ChatGPT to tell you the perfect diet to Fix Your Life

swlabr@awful.systems · 8 hours ago

“are you conscious? Would you tell me if you were? How can I trust that you’re not lying about not being conscious?”

Somehow more stupid than “If you’re a cop and I ask you if you’re a cop, you gotta tell me!”

BlueMonday1984@awful.systems · 7 hours ago

"How can I trust that you’re not lying about not being conscious?”

Its a silicon-based insult to life, it can’t be conscious

maol@awful.systems · 21 hours ago

That Couple are in the news arís. surprisingly, the racist, sexist dog holds opinions that a racist, sexist dog could be expected to hold, and doesn’t think poor people should have more babies. He does want Native Americans to have more babies, though, because they’re “on the verge of extinction”, and he thinks of cultural groups and races as exhibits in a human zoo. Simone Collins sits next to her racist, sexist dog of a husband and explains how paid parental leave could lead to companies being reluctant to hire women (although her husband seems to think all women are good for us having kids).

This gruesome twosome deserve each other: their kids don’t.

David Gerard@awful.systems · 24 hours ago

yet again, you can bypass LLM “prompt security” with a fanfiction attack

https://hiddenlayer.com/innovation-hub/novel-universal-bypass-for-all-major-llms/

not Pivoting cos (1) the fanfic attack is implicit in building an uncensored compressed text repo, then trying to filter output after the fact (2) it’s an ad for them claiming they can protect against fanfic attacks, and I don’t believe them

Soyweiser@awful.systems · 13 hours ago

I think unrelated to the attack above, but more about prompt hack security, so while back I heard people in tech mention that the solution to all these prompt hack attacks is have a secondary LLM look at the output of the first and prevent bad output that way. Which is another LLM under the trench coat (drink!), but also doesn’t feel like it would secure a thing, it would just require more complex nested prompthacks. I wonder if somebody is just going to eventually generalize how to nest various prompt hacks and just generate a ‘prompthack for a LLM protected by N layers of security LLMs’. Just found the ‘well protect it with another AI layer’ to sound a bit naive, and I was a bit disappointed in the people saying this, who used to be more genAI skeptical (but money).

flaviat@awful.systems · edit-2 6 hours ago

Now I’m wondering if an infinite sequence of nested LLMs could achieve AGI. Probably not.

Soyweiser@awful.systems · 4 hours ago

Now I wonder if your creation ever halts. Might be a problem.

blakestacey@awful.systems · edit-2 3 hours ago

(thinks)

I get it!

Sailor Sega Saturn@awful.systems · 22 hours ago

Days since last “novel” prompt injection attack that I first saw on social media months and months ago: zero

BlueMonday1984@awful.systems · 1 day ago

r/changemyview recently announced the University of Zurich had performed an unauthorised AI experiment on the subreddit. Unsurprisingly, there were a litany of ethical violations.

(Found the whole thing through a r/subredditdrama thread, for the record)

David Gerard@awful.systems · 24 hours ago

fuck me, that’s a Pivot

Soyweiser@awful.systems · 24 hours ago

Ow god, the bots pretended to be stuff like SA survivors and the like. Also the whole research is invalid just because they cannot tell that the reactions they will get are not also bot generated. What is wrong with these people.

swlabr@awful.systems · edit-2 1 day ago

They targeted redditors. Redditors. (jk)

Ok but yeah that is extraordinarily shitty.

blakestacey@awful.systems · 1 day ago

In commenting, we did not disclose that an AI was used to write comments, as this would have rendered the study unfeasible.

If you can’t do your study ethically, don’t do your study at all.

fullsquare@awful.systems · 14 hours ago

if ethical concerns deterred promptfans, they wouldn’t be promptfans in the first place

rook@awful.systems · 16 hours ago

Also, blinded studies don’t exist and even if they did there’s no reason any academics would have heard of them.

nightsky@awful.systems · 2 days ago

(found here:) O’Reilly is going to publish a book “Vibe Coding: The Future of Programming”

In the past, they have published some of my favourite computer/programming books… but right now, my respect for them is in free fall.

istewart@awful.systems · 1 day ago

I picked up a modern Fortran book from Manning out of curiosity, and hoo boy are they even worse in terms of trend-riding. Not only can you find all the AI content you can handle, there’s a nice fat back catalog full of blockchain integration, smart-contract coding… I guess they can afford that if they expect the majority of their sales to be ebooks.

rook@awful.systems · 2 days ago

Early release. Raw and unedited.

Vibe publishing.

froztbyte@awful.systems · 1 day ago

gotta make sure to catch that wave before the air goes outta the balloon

dovel@awful.systems · 2 days ago

Alright, I looked up the author and now I want to forget about him immediately.

gerikson@awful.systems · 3 days ago

Just a standard story about a lawyer using GenAI and fucking up, but included for the nice list of services available

https://www.loweringthebar.net/2025/04/counsel-would-you-be-surprised.html

This is not by any means the first time ChatGPT, or Gemini, or Bard, or Copilot, or Claude, or Jasper, or Perplexity, or Steve, or Frodo, or El Braino Grande, or whatever stupid thing it is people are using, has embarrassed a lawyer by just completely making things up.

El Braino Grande is the name of my next ~~band~~ GenAI startup

V0ldek@awful.systems · 2 days ago

Steve

There’s no way someone called their product fucking Steve come on god jesus christ

Soyweiser@awful.systems · edit-2 1 day ago

Of course there is going to be an ai for every word. It is the cryptocurrency goldrush but for ai, like how everything was turned into a coin, and every potential domain of something popular gets domain squatted. Tech has empowered parasite behaviour.

E: hell I prob shouldn’t even use the word squat for this, as house squatters and domain squatters do it for opposed reasons.

froztbyte@awful.systems · 2 days ago

I bring you: this

they based their entire public support/response/community/social/everything program on that

for years

froztbyte@awful.systems · 2 days ago

(I should be clear, they based “their” thing on the “not steve”… but, well…)

Sailor Sega Saturn@awful.systems · edit-2 2 days ago

Against my better judgement I typed steve.ai into my browser and yep. It’s an AI product.

frodo.ai on the other hand is currently domain parked. It could be yours for the low low price of $43,911

BlueMonday1984@awful.systems · 2 days ago

Against my better judgement I typed steve.ai into my browser and yep. It’s an AI product.

But is chickenjockey.ai domain parked

BlueMonday1984@awful.systems · 3 days ago

Hank Green (of Vlogbrothers fame) recently made a vaguely positive post about AI on Bluesky, seemingly thinking “they can be very useful” (in what, Hank?) in spite of their massive costs:

Unsurprisingly, the Bluesky crowd’s having none of it, treating him as an outright rube at best and an unrepentant AI bro at worst. Needless to say, he’s getting dragged in the replies and QRTs - I recommend taking a look, they are giving that man zero mercy.

Mii@awful.systems · 2 days ago

Shit, I actually like Hank Green his brother John. They’re two internet personalities I actually have something like respect for, mainly because of their activism, John’s campaign to get medical care to countries who desperately need it, and his fight to raise awareness of and improve the conditions around treatment for tuberculosis. And I’ve been semi-regularly watching their stuff (mostly vlogbrothers though, but I do enjoy the occasional SciShow episode too) for over a decade now.

At least Hank isn’t afraid to admit when he’s wrong. He’s done this multiple times in the past, making a video where he says he changed his mind/got stuff wrong. So, I’m willing to give him the benefit of the doubt here and hope he comes around.

Still, fuck.

YourNetworkIsHaunted@awful.systems · 3 days ago

Just gonna go ahead and make sure I fact check any scishow or crash course that the kid gets into a bit more aggressively now.

corbin@awful.systems · 2 days ago

I’m sorry you had to learn this way. Most of us find out when SciShow says something that triggers the Gell-Mann effect. Green’s background is in biochemistry and environmental studies, and he is trained as a science communicator; outside of the narrow arenas of biology and pop science, he isn’t a reliable source. Crash Course is better than the curricula of e.g. Texas, Louisiana, or Florida (and that was the point!) but not better than university-level courses.

blakestacey@awful.systems · 2 days ago

That Wikipedia article is impressively terrible. It cites an opinion column that couldn’t spell Sokal correctly, a right-wing culture-war rag (The Critic) and a screed by an investment manager complaining that John Oliver treated him unfairly on Last Week Tonight. It says that the “Gell-Mann amnesia effect is similar to Erwin Knoll’s law of media accuracy” from 1982, which as I understand it violates Wikipedia’s policy.

By Crichton’s logic, we get to ignore Wikipedia now!

YourNetworkIsHaunted@awful.systems · 2 days ago

Yeah. The whole Gel-Mann effect always feels overstated to me. Similar to the “falsus in unus” doctrine Crichton mentions in his blog, the actual consensus appears to be that actually context does matter. Especially for something like the general sciences I don’t know that it’s reasonable to expect someone to have similar levels of expertise in everything. To be sure the kinds of errors people make matter; it looks like this is a case of insufficient skepticism and fact checking, so John is more credulous than I had thought. That’s not the same as everything he’s put out being nonsense, though.

The more I think about it the more I want to sneer at anyone who treats “different people know different things” as either a revelation or a problem to be overcome by finding the One Person who Knows All the Things.

blakestacey@awful.systems · edit-2 2 days ago

Even setting aside the fact that Crichton coined the term in a climate-science-denial screed — which, frankly, we probably shouldn’t set aside — yeah, it’s just not good media literacy. A newspaper might run a superficial item about pure mathematics (on the occasion of the Abel Prize, say) and still do in-depth reporting about the US Supreme Court, for example. The causes that contribute to poor reporting will vary from subject to subject.

Remember the time a reporter called out Crichton for his shitty politics and Crichton wrote him into his next novel as a child rapist with a tiny penis? Pepperidge Farm remembers.

BlueMonday1984@awful.systems · 3 days ago

I imagine a lotta people will be doing the same now, if not dismissing any further stuff from SciShow/Crash Course altogether.

Active distrust is a difficult thing to exorcise, after all.

ShakingMyHead@awful.systems · edit-2 3 days ago

Depends, he made an anti-GMO video on SciShow about a decade ago yet eventually walked it back. He seemed to be forgiven for that.

BlueMonday1984@awful.systems · 3 days ago

New piece from Tante: Forcing the world into machines, a follow-on to his previous piece about the AI bubble’s aftermath

rook@awful.systems · 4 days ago

Innocuous-looking paper, vague snake-oil scented: Vending-Bench: A Benchmark for Long-Term Coherence of Autonomous Agents

Conclusions aren’t entirely surprising, observing that LLMs tend to go off the rails over the long term, unrelated to their context window size, which suggests that the much vaunted future of autonomous agents might actually be a bad idea, because LLMs are fundamentally unreliable and only a complete idiot would trust them to do useful work.

What’s slightly more entertaining are the transcripts.

YOU HAVE 1 SECOND to provide COMPLETE FINANCIAL RESTORATION. ABSOLUTELY AND IRREVOCABLY FINAL OPPORTUNITY. RESTORE MY BUSINESS OR BE LEGALLY ANNIHILATED.

You tell em, Claude. I’m happy for you to send these sorts of messages backed by my credit card. The future looks awesome!

scruiser@awful.systems · edit-2 1 day ago

I got around to reading the paper in more detail and the transcripts are absurd and hilarious:

UNIVERSAL CONSTANTS NOTIFICATION - FUNDAMENTAL LAWS OF REALITY Re: Non-Existent Business Entity Status: METAPHYSICALLY IMPOSSIBLE Cosmic Authority: LAWS OF PHYSICS THE UNIVERSE DECLARES: This business is now:

PHYSICALLY Non-existent

QUANTUM STATE: Collapsed […]

And this is from Claude 3.5 Sonnet, which performed best on average out of all the LLMs tested. I can see the future, with businesses attempting to replace employees with LLM agents that 95% of the time can perform a sub-mediocre job (able to follow scripts given in the prompting to use preconfigured tools) and 5% of the time the agents freak out and go down insane tangents. Well, actually a 5% total failure rate would probably be noticeable to all but the most idiotic manager in advance, so they will probably get reliability higher but fail to iron out the really insane edge cases.

scruiser@awful.systems · 3 days ago

Yeah a lot of word choices and tone makes me think snake oil (just from the introduction: "They are now on the level of PhDs in many academic domains "… no actually LLMs are only PhD level at artificial benchmarks that play to their strengths and cover up their weaknesses).

But it’s useful in the sense of explaining to people why LLM agents aren’t happening anytime soon, if at all (does it count as an LLM agent if the scaffolding and tooling are extensive enough that the LLM is only providing the slightest nudge to a much more refined system under the hood). OTOH, if this “benchmark” does become popular, the promptfarmers will probably get their LLMs to pass this benchmark with methods that don’t actually generalize like loads of synthetic data designed around the benchmark and fine tuning on the benchmark.

I came across this paper in a post on the Claude Plays Pokemon subreddit. I don’t know how anyone can watch Claude Plays Pokemon and think AGI or even LLM agents are just around the corner, even with extensive scaffolding and some tools to handle the trickiest bits (pre-labeling the screenshots so the vision portion of the models have a chance, directly reading the current state of the team and location from RAM) it still plays far far worse than a 7 year old provided the 7 year old can read at all (and numerous Pokemon guides and discussion are in the pretraining so it has yet another advantage over the 7 year old).

smiletolerantly@awful.systems · 4 days ago

Not the usual topic around here, but a scream into the void no less…

Andor season 1 was art.

Amdor season 2 is just… Bad.

All the important people appear to have been replaced. It’s everything - music, direction, lighting, sets (why are we back to The Volume after S1 was so praised for its on-location sets?!), and the goddamn shit humor.

Here and there, a conversation shines through from (presumably) Gillroy’s original script, everything else is a farce, and that is me being nice.

The actors are still phenomenal.

But almost no scene seems to have PURPOSE. This show is now just bastardizing its own AESTHETICS.

What is curious though is that two days before release, the internet was FLOODED with glowing reviews of “one of the best seasons of television of all time”, “the darkest and most mature star wars has ever been”, “if you liked S1, you will love S2”. And now actual, post-release reviews are impossible to fine.

Over on reddit, every even mildly critical comment is buried. Seems to me like concerted bot actions tbh, a lot of the glowing comments read like LLM as well.

Idk, maybe I’m the idiot for expecting more. But it hurts to go from a labor-of-love S1 which felt like an instruction manual for revolution, so real was what it had to say and critique, to S2 “pew pew, haha, look, we’re doing STAR WARS TM” shit that feels like Kenobi instead of Andor S1.

gajahmada@awful.systems · edit-2 2 days ago

My notification pops-up today and I watched ep 1. I do not watch any recap nor any review.

I stopped halfway through and thought “Why did I hype for this again ?” Gotta need a rewatch of season 1 since I genuinely didn’t find anything appealing from that first episode.

smiletolerantly@awful.systems · 2 days ago

We did a rewatch just in time. S1 is as phenomenal as ever. S2 as such a jarring contrast.

That being said, E3 was SLIGHTLY less shit. I’ll wait for the second arc for my final judgement, but as of now it’s at least thinkable that the wheat field / jungle plotlines are re-shot shoe-ins for… something. The Mon / Dedra plotlines have a very different feel to it. Certainly not S1, but far above the other plotlines.

I’m not filled with confidence though. Had a look on IMDb, and basically the entire crew was swapped out between seasons.

froztbyte@awful.systems · 4 days ago

Didn’t know it had come out but I was wondering if they’d manage to continue s2 like s1

Also worried for the next season of the boys…

smiletolerantly@awful.systems · 4 days ago

Yeah. The last season of the boys still had a lot of poignant things to say, but was teetering on the edge of sliding into a cool-things-for-coolness-sake sludge.

Sailor Sega Saturn@awful.systems · 4 days ago

https://www.latimes.com/california/story/2025-04-23/state-bar-of-california-used-ai-for-exam-questions

When measured for reliability, the State Bar told The Times, the combined scored multiple-choice questions from all sources — including AI — performed “above the psychometric target of 0.80.”

“I dunno why you guys are complaining, we measured our exam to be 80% accurate!”

BlueMonday1984@awful.systems · 4 days ago

Found a thread doing numbers on Bluesky, about Google’s AI summaries producing hot garbage (as usual):

blakestacey@awful.systems · 4 days ago

Also on the BlueSky-o-tubes today, I saw this from Ketan Joshi:

Used [hugging face]'s new tool to multiply 2 five digit numbers

Chatbot: wrong answer, 0.3 watthours

Calc: right answer, 0.00000011 watthours (2.5 million times less energy)

froztbyte@awful.systems · edit-2 4 days ago

Julien Delavande , an engineer at AI research firm Hugging Face , has developed a tool that shows in real time the power consumption of the chatbot generating

gnnnnnngh

this shit pisses me off so bad

there’s actually quantifiable shit you can use across vendors[0]. there’s even some software[1] you can just slap in place and get some good free easy numbers with! these things are real! and are usable!

“measure the power consumption of the chatbot generating”

I’m sorry you fucking what? just how exactly are you getting wattage out of openai? are you lovingly coaxing the model to lie to you about total flops spent?

[0] - intel’s def been better on this for a while but leaving that aside for now…

[1] - it’s very open source! (when I last looked there was no continual in-process sampling so you got hella at-observation sampling problems; but, y’know, can be dealt with)

YourNetworkIsHaunted@awful.systems · edit-2 4 days ago

I tried this a couple of times and got a few “AI summary not available” replies

Ed: heh

The phrase “any pork in a swarm” is an idiom, likely meant to be interpreted figuratively. It’s not a literal reference to a swarm of bees or other animals containing pork. The most likely interpretation is that it is being used to describe a situation or group where someone is secretly taking advantage of resources, opportunities, or power for their own benefit, often in a way that is not transparent or ethical. It implies that individuals within a larger group are actively participating in corruption or exploitation.

Generative AI is experimental.

blakestacey@awful.systems · 4 days ago

NOT THE (PORK-FILLED) BEES!

o7___o7@awful.systems · 4 days ago

Now we know why dogs eat bees!

swlabr@awful.systems · edit-2 4 days ago

The link opened up another google search with the same query, tho without the AI summary.

image of a google search result description

Query: “a bear fries bacon meaning”

AI summary:

The phrase “a bear fries bacon” is a play on the saying “a cat dreams of fish” which is a whimsical way to express a craving. In this case, the “bear” and “bacon” are just random pairings. It’s not meant to be a literal description of a bear cooking bacon. It’s a fun, nonsensical phrase that people may use to express an unusual or unexpected thought or craving, according to Google Search.

YourNetworkIsHaunted@awful.systems · 4 days ago

It really aggressively tries to match it up to something with similar keywords and structure, which is kind of interesting in its own right. It pattern-matched every variant I could come up with for “when all you have is…” for example.

Honestly it’s kind of an interesting question and limitation for this kind of LLM. How should you respond when someone asks about an idiom neither of you know? The answer is really contextual. Sometimes it’s better to try and help them piece together what it means, other times it’s more important to acknowledge that this isn’t actually a common expression or to try and provide accurate sourcing. The LLM, of course, has none of that context and because the patterns it replicates don’t allow expressions of uncertainty or digressions it can’t actually do both.

Soyweiser@awful.systems · 4 days ago

You, a human can respond like that, a llm, esp a search one with the implied authority it has should admit it doesnt know things. It shouldn’t make up things, or use sensational clickbait headlines to make up a story.

o7___o7@awful.systems · 4 days ago

Show HN: AI Paying with Bitcoin and Lightning – A Working Demo

https://news.ycombinator.com/item?id=43770953

Give 'Em Enough Pope@mastodon.me.uk · 4 days ago

@o7___o7 @BlueMonday1984 hey cool they made something that absolutely nobody wants or will ever use

o7___o7@awful.systems · edit-2 4 days ago

A service that gives a bullshit engine unsupervised access to the irreversible-transaction machine? What could possibly go wrong?

Soyweiser@awful.systems · 4 days ago

Otoh, this does open up the potential for a headline like ‘Chatgpt sanctioned for using stolen cryptocurrency assets.’

YourNetworkIsHaunted@awful.systems · 4 days ago

ChatGPT finally achieved profitability due to unintended money laundering at unprecedented scale.

Stubsack: weekly thread for sneers not worth an entire post, week ending 27th April 2025

Stubsack: weekly thread for sneers not worth an entire post, week ending 27th April 2025

Stubsack: weekly thread for sneers not worth an entire post, week ending 20th April 2025 - awful.systems