heredos

@heredos@heredos.net

computer science student.

Birthday: as old as 2005 but born in 2004
Website: https://heredos.net
Batteries: included
gender: not installed
condition: light scratches on the surface, cleaned yesterday evening
colors: orange, see-through shell
flags: red (as in communism is my country)

Location: 48.776396,2.335207

42 following, 26 followers


heredos »
@heredos@heredos.net

@hypolite@friendica.mrpetovan.com i don't think they were engineered to be this way.
We used to understand the weights learned by ml techniques; it's just that at some point, in the pursuit of ever bigger accuracy and predictive capability, the models became so big we could barely understand them.
And then we just kinda accepted it was this way, and added techniques that seemed likely to work, piling them on top of each other.
We know it is deterministic (cuz when it isn't, it's just some make-believe prng choosing one probability over another; but you can just set temp to 0 for a fully "predictable" output), but in the end, i feel like it's just that the models are now so big they feel like black boxes.
To the point where we need ai to understand ai.
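
A minimal sketch of the sampling step being described, assuming toy logits and numpy only: greedy argmax at temperature 0 is fully deterministic, and anything above 0 reduces to a single draw from a seedable PRNG.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Pick the next token id from a vector of logits."""
    if temperature == 0:
        # "temp 0": no randomness, always the single most likely token.
        return int(np.argmax(logits))
    rng = rng or np.random.default_rng()
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())  # softmax, shifted for stability
    probs /= probs.sum()
    # the only nondeterminism in decoding is this draw, and it comes
    # from a pseudorandom generator that can itself be seeded
    return int(rng.choice(len(logits), p=probs))

logits = np.array([2.0, 1.0, 0.5, -1.0])  # made-up scores over 4 tokens
print(sample_next_token(logits, temperature=0))    # always 0
print(sample_next_token(logits, temperature=0.8))  # varies per draw
```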


    Hypolite Petovan »
    @hypolite@friendica.mrpetovan.com

    @heredos In this current intellectually fraught context I can't figure out if your last sentence was sarcastic or earnest.


      heredos »
      @heredos@heredos.net

      ...

      heredos »
      @heredos@heredos.net

      They got two follow-up papers improving on that, to the point where they successfully managed to steer an llm to say what they wanted without post-training.
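
The papers aren't named in the thread, but steering a model "without post training" usually means activation steering: nudging a layer's activations along a fixed direction at inference time while leaving the weights untouched. A toy PyTorch sketch, where the block, sizes, and steering vector are all stand-ins rather than any specific paper's setup:

```python
import torch

# Toy stand-in for one transformer block; in a real model this would be
# something like model.transformer.h[layer] in a HuggingFace GPT-2.
block = torch.nn.Linear(16, 16)

# A "steering vector" is a direction in activation space, typically the
# difference of mean activations between two sets of prompts. Random here.
steering_vector = torch.randn(16)
strength = 4.0

def steer(module, inputs, output):
    # Shift the block's activations along the chosen direction on every
    # forward pass; no weights change, so no post-training involved.
    return output + strength * steering_vector

handle = block.register_forward_hook(steer)
hidden = torch.randn(1, 16)            # fake residual-stream input
steered = block(hidden)                # output nudged along the vector
handle.remove()                        # model behaves normally again
```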

        Hypolite Petovan »
        @hypolite@friendica.mrpetovan.com

        @heredos To go back to your first sentence, I do think they were engineered this way in order to launder responsibility: responsibility for copyright infringement, responsibility for undemocratic decisions, etc...


          heredos »
          @heredos@heredos.net

          @hypolite@friendica.mrpetovan.com maybe they scaled them in order to achieve that, because all of these big issues arose with the first billion-parameter-scale gated models (dall-e, gpt3, stable diff and the like).
          But i'm not so sure.

          I think it all became known to the public when the models were democratized among non-researchers, when they became products for a wide audience instead of research topics for a bunch of engineers.
          But i'm pretty sure they were trained on copyrighted data even before that; it just almost never reached the hands of the people.

          It just turns out that to make generative ml good enough that people want to pay for it, we also need models so big they're almost completely opaque.

          But then, the fact that we have very little research on model reverse engineering, interpretability and the like is likely due to big ai labs having absolutely no interest in doing so, for the very reasons you just gave.
