Executing programs inside transformers with exponentially faster inference (percepta.ai)

bonoboTP 14 minutes ago [-]
This shows the downside of using AI to write up your project. I see the eloquent sentences, but don't get the message.

> This works, but the actual execution happened outside the model. The model specified the computation, then waited for an external system to carry it out.

> Our transformer also emits a program, but instead of pausing for an external tool, it executes that program itself, step by step, within the same transformer.

What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?

Why is it good that it's "inside" the model? Just making it more elegant and nice? The tool was already "inside" the overall hybrid system. What's the actual problem?
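For what it's worth, the distinction the quoted passage draws can be sketched as code. In the usual tool-use setup the model only *specifies* a computation and a host loop executes it externally; the post claims its transformer performs the execution itself. A minimal sketch of the external loop (all names here are hypothetical, not from the article):

```python
def tool_use_loop(model_step, tools, prompt):
    """Standard tool-use: the model emits a call, the HOST executes it."""
    context = prompt
    while True:
        out = model_step(context)  # model only specifies the computation
        if out["type"] == "tool_call":
            # Execution happens OUTSIDE the model, in the host's tool table.
            result = tools[out["name"]](*out["args"])
            context += f"\n[tool result: {result}]"
        else:
            return out["text"]

# Stand-in "model" for illustration: asks for one addition, then answers.
def fake_model(context):
    if "[tool result" not in context:
        return {"type": "tool_call", "name": "add", "args": (2, 3)}
    return {"type": "text",
            "text": context.split("[tool result: ")[1].rstrip("]")}

tools = {"add": lambda a, b: a + b}
print(tool_use_loop(fake_model, tools, "what is 2+3?"))  # prints 5
```

The article's claim, as I read it, is that the same forward passes that emit the program also carry out each step, so no external `tools` table or host loop is involved.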

MattPalmer1086 12 minutes ago [-]
Interesting... But why? What is the benefit, other than increasing our understanding of model architectures?

Our brains can also simulate Turing machines, slowly. We automated that with computers, which are faster and more reliable. So why not let a model use external tools that are much faster and more reliable, just as we do?

andy12_ 20 hours ago [-]
This seems like a really interesting path for interpretability, especially if a big chunk of a model's behavior occurs pseudo-symbolically. This is an idea I had thought about (integrating tools into the main computation path of a model), but I never imagined it could be done efficiently with just a vanilla transformer.

Truly, attention is all you need (I guess).

mirekrusin 25 minutes ago [-]
This is brilliant, game-changing-level stuff.

Hey, also give it access to a dump of its weights and a way to propose updates, so it can see and tinker with its brain directly.

behehebd 34 minutes ago [-]
Is this genius? Or just a new binary executable format? Can't tell.

koolala 53 minutes ago [-]
I'd like to see this combined with reinforcement learning to optimize models to think computationally. Generating ideas with hypothetical results and then running them in the same thought. Their solution sounded like a lot of tokens though.

ndxone 13 minutes ago [-]
The big question is how efficient this is compared to executing assembly on a CPU.

galsapir 18 hours ago [-]
One of the most interesting pieces I've read recently. Not sure I agree with all the statements there (e.g. that without execution the system has no comprehension), but extremely cool.

pennomi 18 hours ago [-]
It makes sense that a next token predictor could execute assembly code. This is fascinating work, especially with the memory implementation.
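The intuition here can be made concrete: executing assembly is just repeatedly mapping the current machine state to the next one, which has the same shape as next-token prediction. A rough analogy only (this toy three-register machine and its instruction set are invented for illustration, not the paper's mechanism):

```python
# Toy illustration: running a program is repeated next-state prediction.
# Instruction set and state layout are hypothetical.

def step(state):
    """Map the current machine state to the next one (one 'token' of execution)."""
    pc, regs, mem = state["pc"], state["regs"], state["mem"]
    op, *args = state["prog"][pc]
    if op == "LOAD":       # LOAD reg, addr  -> reg = mem[addr]
        regs[args[0]] = mem[args[1]]
        pc += 1
    elif op == "ADD":      # ADD reg, reg2   -> reg += reg2
        regs[args[0]] += regs[args[1]]
        pc += 1
    elif op == "STORE":    # STORE reg, addr -> mem[addr] = reg
        mem[args[1]] = regs[args[0]]
        pc += 1
    elif op == "HALT":
        pc = -1
    state["pc"] = pc
    return state

def run(prog, mem):
    state = {"pc": 0, "regs": {"r0": 0, "r1": 0}, "mem": mem, "prog": prog}
    while state["pc"] != -1:
        state = step(state)  # each iteration "predicts" the next state
    return state["mem"]

prog = [("LOAD", "r0", 0), ("LOAD", "r1", 1),
        ("ADD", "r0", "r1"), ("STORE", "r0", 2), ("HALT",)]
print(run(prog, [2, 3, 0]))  # memory after execution: [2, 3, 5]
```

If a transformer can reliably produce each next state from the previous one, it can in principle carry the whole execution; the hard part the post tackles is doing that efficiently, including the memory.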

ThouYS 18 minutes ago [-]
what!