

Oh, I forgot!
You should check out Lemonade:
https://github.com/lemonade-sdk/lemonade
It supports Ryzen NPUs via two different runtimes… though apparently not the 8000 series yet?


The most screwed up thing is that it doesn’t even matter, because it’s old news. Decades of lying and controversy (predating his political candidacy) somehow… don’t meet the attention threshold for the algorithms? Is that the phrase?
It’s especially weird because I have older relatives who knew way more about “pre-politics Trump” than I did, and now all of that is somehow forgotten.


To be fair, that game does a lot of stuff.
But yes, it’s extremely focused too. It’s so medieval it hurts.
They also lucked out picking CryEngine, as (for their use case) it works unbelievably well. Many AAAs fall into development hell wrangling engines, and they easily could have done the same.


Yeah… Even if the LLM is RAM-speed constrained, simply using another device so as not to interrupt it would be good.
Honestly, AMD’s software dev efforts are baffling. They’ve focused on a few libraries precisely no one uses, like this: https://github.com/amd/Quark
While ignoring issues holding back entire sectors (like broken flash-attention) with devs screaming about it at the top of their lungs.
Intel suffers from corporate Game of Thrones, but at least they have meaningful contributions in the open source space here, like the SYCL/AMX llama.cpp code or the OpenVINO efforts.


It still uses memory bandwidth, unfortunately. There’s no way around that, though NPU TTS would still be neat.
…Also, generally, STT responses can’t be streamed, so you might as well use the iGPU anyway. TTS can be chunked, I guess, but do the major implementations do that?
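The chunking idea is simple enough to sketch (a toy sketch only: the token stream is faked, and in real use each yielded sentence would go to a hypothetical TTS call): accumulate streamed text and flush each complete sentence as soon as it appears, so audio playback can start before the full reply exists.

```python
import re

def sentence_chunks(token_stream):
    """Yield complete sentences as soon as they appear in a token stream."""
    buf = ""
    for tok in token_stream:
        buf += tok
        # Flush on sentence-ending punctuation followed by whitespace.
        while True:
            m = re.search(r"[.!?]\s+", buf)
            if not m:
                break
            yield buf[: m.end()].strip()
            buf = buf[m.end():]
    if buf.strip():
        yield buf.strip()  # flush whatever trails after the last boundary

# Faked LLM token stream for illustration:
tokens = ["Hel", "lo there. ", "How are", " you? ", "Fine"]
print(list(sentence_chunks(tokens)))  # → ['Hello there.', 'How are you?', 'Fine']
```

In a real pipeline, each yielded chunk would be handed to the TTS engine while the next sentence is still being generated.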


exempting owner-occupied homes

It would still suck for anyone stuck renting. It would disincentivize renting, but it would still suck in the short term.


Don’t get me wrong. Some YouTubers are great and informative, and I adore those random washing machine repair videos… But yeah, as a reference it’s an awful format.
It’s like how discussion has mostly moved from forums, to Reddit, and now to Discord. I get it, it’s highly engaging since it pings your phone and folks shoot the breeze, but it’s an information black hole.


Okay. Just because it was proved doesn’t mean they agree.


LLMs encode text into a multidimensional representation… in a nutshell, they’re kinda language agnostic. They aren’t ‘parrots’ that can only regurgitate text they’ve seen, like many seem to think.
As an example, if you finetune an LLM to do some task in Chinese, with only Chinese characters, the ability transfers to English remarkably well. Or to Japanese, if it knows Japanese. Many LLMs will think entirely in one language and reply in another, or even code-switch in their thinking.
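A toy illustration of that shared-representation idea (the vectors below are made up for the example, not taken from any real model): sentences with the same meaning in different languages land near each other in embedding space, so their cosine similarity stays high across languages while unrelated meanings score low.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 4-d embeddings; a real LLM uses thousands of dimensions
# produced by the model itself.
emb = {
    "the cat sleeps": [0.90, 0.10, 0.80, 0.20],   # English
    "le chat dort":   [0.88, 0.12, 0.79, 0.22],   # French, same meaning
    "the stock fell": [0.10, 0.90, 0.20, 0.70],   # unrelated meaning
}

print(cosine(emb["the cat sleeps"], emb["le chat dort"]))    # high, ~1.0
print(cosine(emb["the cat sleeps"], emb["the stock fell"]))  # much lower
```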


…Just because it was explained doesn’t mean they agree.


The iGPU is more powerful than the NPU on these things anyway. The NPU is more for ‘background’ tasks, like Teams audio processing or whatever it’s used for on Windows.
Yeah, in hindsight, AMD should have assigned (and still should assign) a few engineers to popular projects (and pushed NPU support harder), but GGML support is good these days. It’s gonna be pretty close to RAM-speed-bound for text generation.
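The RAM-speed-bound claim follows from simple arithmetic (the numbers below are illustrative, not measurements): generating one token streams roughly the whole set of active weights from memory once, so tokens/s is capped near memory bandwidth divided by model size.

```python
def max_tokens_per_sec(bandwidth_gb_s: float, model_gb: float) -> float:
    # Each generated token reads ~all active weights from RAM once,
    # so memory bandwidth sets a hard ceiling on generation speed.
    return bandwidth_gb_s / model_gb

# e.g. dual-channel DDR5-5600 (~90 GB/s theoretical) running an 8B model
# quantized to ~4.5 GB: at best ~20 tokens/s, no matter how fast the compute is.
print(max_tokens_per_sec(90, 4.5))  # → 20.0
```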


Ah. On an 8000-series APU, to be blunt, you’re likely better off with Vulkan + whatever omni models GGML supports these days. Last I checked, text generation is faster and prompt processing is close to ROCm.
…And yeah, that was total false advertising on AMD’s part. They’ve completely diluted the term, kinda like TV makers did with ‘HDR’.


…Spongebob?
I guess this is true of lots of Western animation, but it’s particularly egregious here. At least with the classic episodes.
…That’s not a rule though. As an example, DCAU and YJ keep a lot of shit straight, somehow.


The king of “galaxy-altering implications that are never spoken of again”


…Yeah.
Ironically, they seem to have figured ‘low end’ is where the sales are. Yet my guess is most ‘low end’ buyers pick Nvidia by default, and folks upgrading from anything aim higher than a B580.
I would’ve killed for a 512-bit Arc card (other than the unobtanium datacenter GPUs), but that’s a fever dream for now…
Maybe it’s an ADD thing, or an ‘aging millennial shaking their fist’ thing, but video is soooo slow.
For reference or discussion, I always seek text first, to the point I’ll even download/make transcripts if video’s the only place I can find something. They just have so much filler.


You can do hybrid inference of Qwen 30B omni for sure. Or fully offload inference of Vibevoice Large (9B). Or really a huge array of models.
…The limiting factor is free time, TBH. Just sifting through the sea of models, seeing if they work at all, testing whether quantization works and such is a huge timesink, especially if you’re trying to load stuff with ROCm.


Sounds about right :(
Stellaris was like that early in its life, too.


Then the video stops loading.
…They clearly didn’t test scrubbing enough with neurodivergent users.