I trained some research models using the existing PyTorch/XLA on TPUs, and it was a mess of undocumented behavior and bugs (silently hanging after 8 hours of training!).
If anyone is trying to use PyTorch on TPU before TorchTPU is released, you can check out the training pipeline that I ended up building to support my research: https://github.com/aklein4/easy-torch-tpu
Reubend 8 hours ago [-]
Sounds good, but my main question is: is this a fork, or a new backend they're building in (like MPS)?
musebox35 3 hours ago [-]
I attended the related session at Next’26 yesterday. From my understanding, it is a new backend, and they will release the TorchTPU source on GitHub in one or two months. It will not support all ops initially, but they are moving fast. In the meantime, torchax is mature enough to run torch models on TPUs by translating to JAX.
noracists 4 hours ago [-]
Very excited for this.
MASNeo 1 hour ago [-]
Now all that’s missing is an actual chip that can be purchased. Any ideas?