UPD: I found no evidence that it supports tensor cores, so it's going to be many times slower than implementations that do.