Pezy and the other Japanese native chips are first and foremost about HPC. The world may have picked up AI in the last 2 years, but the Japanese chipmakers are still thinking primarily about HPC, with AI as just one HPC workload.
These Pezy chips are also made for large clusters. There is a whole system design around the chips that wasn't presented here. The Pezy-SC2, for instance, was built around liquid immersion cooling. I am not sure you could ever buy an air-cooled version.
numpad0 3 hours ago [-]
It's unfortunate that they don't sell them on open markets. There are a few of these accelerators that could threaten NVIDIA's monopoly if prices (and manufacturing costs!) were right.
WithinReason 50 minutes ago [-]
The hardware is the easy part of accelerating NN training. Nvidia's software and infrastructure are so well designed and established that no competitor can threaten them even if they gave the hardware away for free.
DrNosferatu 35 minutes ago [-]
> if they give away the hardware for free.
Seriously doubt that: free hardware (or tens of bucks) would galvanize the community and achieve huge support - look at the Raspberry Pi project's original prices and the consequences.
saagarjha 47 minutes ago [-]
I don't know about well designed but it's definitely established.
CoastalCoder 44 minutes ago [-]
Could you elaborate?
I've only done a little work on CUDA, but I was pretty impressed with it and with their NSys tools.
I'm curious what you wish was different.
saagarjha 33 minutes ago [-]
I actually really hate CUDA's programming model and feel like it's too low-level to actually get any productive work done. I don't really blame Nvidia, because they basically invented the programmable GPU and it wouldn't be fair to expect them to also come up with the perfect programming model right out of the gate, but at this point it's pretty clear that having independent threads work on their own programs makes no sense. High-performance code requires scheduling across multiple threads in a way that makes no sense if you are coming from CPUs.
Of course, one might mention that GPUs are nothing like CPUs–but the programming model works super hard to try to hide this. So it's not really well designed in my book. I actually quite like the compilers that people are designing these days to write block-level code, because I feel like it better represents the work people want to do and then you pick which way you want it lowered.
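To make the contrast concrete, here is a minimal sketch (mine, not from the thread) of the per-thread model being criticized: a toy saxpy kernel where each thread independently picks one index and runs "its own program", while anything cooperative has to be scheduled across threads by hand. The names (saxpy_kernel, n, a, x, y) are illustrative only.

    // Toy CUDA saxpy: illustrates the per-thread (SIMT) programming model.
    #include <cuda_runtime.h>
    #include <cstdio>

    __global__ void saxpy_kernel(int n, float a, const float *x, float *y) {
        // Each thread computes a single element "on its own".
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        const int n = 1 << 20;
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));   // unified memory, visible to host and device
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // 256 threads per block; the grid covers all n elements.
        saxpy_kernel<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
        cudaDeviceSynchronize();

        printf("y[0] = %f\n", y[0]);   // expect 4.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }

Block-level compilers (Triton is the usual example) instead let you write the whole tile's computation at once and decide separately how it gets lowered onto threads, which is the style the parent comment prefers.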
As for Nsight (Systems), it is…ok, I guess? It's fine for games and the like, but for HPC or AI it doesn't really surface the information that you would want. People who are running their GPUs really hard know they have kernels running all the time and what the performance characteristics of them are. Nsight Compute is the thing that tells you that, but it's kind of a mediocre profiler (some of this may be limitations of hardware performance counters), and to use it effectively you basically have to read a bunch of blog posts instead of official documentation.
Despite not having used it much, my impression is that Nvidia's "moat" is that they have good networking libraries, that they are pretty good (relatively) at making sure all their tools work, and that they have had consistent investment in this for a decade.
WithinReason 46 minutes ago [-]
Who has better software than Nvidia for NN training? Meaning the least amount of friction getting a new network to train.
saagarjha 42 minutes ago [-]
Just because their tools are the best doesn't mean they are designed well.
KeplerBoy 46 minutes ago [-]
It's not all about NNs and AI. Take a look at the Top500: a lot of people are doing classical HPC work on Nvidia GPUs, which are increasingly not designed for this. Unfortunately the HPC market is just a lot smaller than the AI bubble.
rwmj 36 minutes ago [-]
If the hardware isn't available at all, we'll never find out if the software moat could be overcome.
Great article documenting PEZY. It's incredible how close they are to NVidia despite being a very small team.
To me, this looks like a win.
Governments are there to finance projects like this, which let the country build skillsets that wouldn't otherwise exist because other countries offer better solutions in the global market.
eru 16 minutes ago [-]
Governments are terrible at picking winners.
actionfromafar 4 minutes ago [-]
Everyone is, and what survives, survives.
But what governments often can do is break out of the local optima that cluster around the quarterly economy, take moonshot chances, and find paths otherwise never taken. Hopefully one of these paths is great.
The difficult thing becomes deciding when to pull the plug. Is ITER a good thing or not? (Results-wise, it is, but for the money? Who can tell, really.)
sylware 31 minutes ago [-]
Last time I heard about these, it was for supercomputers: nearly as fast as, or even faster than, the alternatives, with a massive energy consumption advantage.