TorchCodec 0.14: HDR Video Decoding for CPU and CUDA, and Fast Wav Decoder (github.com)
57 minutes ago [-]
scott_s 5 days ago [-]
For disclosure, I've worked on TorchCodec. I'm happy to answer any questions!
weitendorf 54 minutes ago [-]
> TorchCodec now has a dedicated WavDecoder for decoding WAV files. It bypasses FFmpeg entirely and reads WAV data directly, resulting in significantly faster decoding.

I've been working in this area recently and am very keen to use this given the claimed performance benefits, but I tried all your links and didn't see any actual performance numbers. Do you have any to share?

IMO a fair performance benchmark for those not tied to the full PyTorch stack would have both ffmpeg and the WAV already loaded into memory before execution. Given that torchcodec relies on the user-supplied ffmpeg installation, I suspect ffmpeg is not already in memory in that setup, at least not by default.

I understand why Meta wouldn't want to do this (then you are inevitably distributing exploitable security vulnerabilities in PyTorch, because ffmpeg will probably always have them), but I've been statically linking ffmpeg and keeping the binary in memory while still using separate processes for different batches of audio, with I/O through UDS between the parent and ffmpeg; the parent then does VAD on the PCM on CPU before any further inference. My implementation for static linking is similar to the pattern in https://github.com/amenzhinsky/go-memexec#static-binary - it would be interesting to see if this is possible in the PyTorch/Python ecosystem, or maybe it's already been done.
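
To make the "WAV already loaded into memory" baseline concrete, here is a minimal sketch that decodes 16-bit PCM WAV bytes without FFmpeg and times only the decode step. It uses Python's standard-library `wave` module, not TorchCodec's WavDecoder, and the helper names (`make_wav_bytes`, `decode_wav_bytes`, `time_decode`) and the synthetic ramp signal are made up for illustration, so treat any timing it prints as a rough reference point rather than a TorchCodec number.

```python
import io
import struct
import time
import wave


def make_wav_bytes(num_frames=16000, sample_rate=16000):
    """Build a mono 16-bit PCM WAV entirely in memory (synthetic test data)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        # A repeating ramp in [-1000, 999], packed as little-endian int16.
        samples = [(i % 2000) - 1000 for i in range(num_frames)]
        w.writeframes(struct.pack(f"<{num_frames}h", *samples))
    return buf.getvalue()


def decode_wav_bytes(data):
    """Decode 16-bit PCM WAV bytes into (samples, sample_rate)."""
    with wave.open(io.BytesIO(data), "rb") as r:
        if r.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        num_frames = r.getnframes()
        count = num_frames * r.getnchannels()
        frames = r.readframes(num_frames)
        return struct.unpack(f"<{count}h", frames), r.getframerate()


def time_decode(data, repeats=50):
    """Return the best wall-clock time in seconds over `repeats` decodes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decode_wav_bytes(data)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    wav = make_wav_bytes()  # the file is already in memory before timing starts
    samples, rate = decode_wav_bytes(wav)
    print(len(samples), rate, f"{time_decode(wav):.6f}s")
```

Because the bytes are built before the timer starts, disk I/O and process startup are excluded, which is the comparison I'd want to see against an ffmpeg process that is already resident.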

antixk 5 hours ago [-]
Hi, in the past I have used NVVideoCodec and VPI for GPU-accelerated decoding and processing. What would be torchcodec's appeal here? VPI already provides a zero-copy interface with PyTorch.

Thanks!

hmaarrfk 11 hours ago [-]
What version of ffmpeg does this use? Last I tried, the torch tools used a really outdated version of ffmpeg at the time of their release.
alphatozeta 7 hours ago [-]
It's really fast and the performance is great, but it's really unfortunate that it requires torch>=2.11. Too many NVIDIA libraries are still using 2.10 or an alpha version of 2.11 that doesn't have the C++ methods used by torchcodec's underlying C++ code, like use_blob and a few others. I had to fall back to ffmpeg-python, unfortunately.
Reubend 12 hours ago [-]
The WAV file decoding perf improvement is also very welcome!