Just Ask for Generalization (2021) (evjang.com)
xg15 15 days ago [-]
(2021), but still very interesting. The "post-overfitting" training strategy in particular is unexpected.
dev_hugepages 14 days ago [-]
This is describing the double descent phenomenon (https://en.wikipedia.org/wiki/Double_descent).
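For reference, double descent is easy to reproduce in a few lines with min-norm random-feature regression: test error rises as the model approaches the interpolation threshold (number of features ≈ number of training points), then falls again as the model grows past it. A minimal, self-contained sketch — the random-feature setup below is illustrative, not taken from the article:

    import numpy as np

    rng = np.random.default_rng(0)
    n_train, n_test, d = 100, 1000, 20

    # Ground-truth linear teacher with noisy labels.
    w_true = rng.normal(size=d)

    def make_data(n):
        X = rng.normal(size=(n, d))
        y = X @ w_true + 0.5 * rng.normal(size=n)
        return X, y

    X_tr, y_tr = make_data(n_train)
    X_te, y_te = make_data(n_test)

    def features(X, W):
        # Fixed random ReLU features: phi(x) = max(Wx, 0).
        return np.maximum(X @ W.T, 0.0)

    for p in [10, 50, 90, 100, 110, 200, 1000, 5000]:
        W = rng.normal(size=(p, d)) / np.sqrt(d)
        F_tr, F_te = features(X_tr, W), features(X_te, W)
        # pinv gives the minimum-norm least-squares fit, which also
        # handles the over-parameterized case p > n_train.
        beta = np.linalg.pinv(F_tr) @ y_tr
        err = np.mean((F_te @ beta - y_te) ** 2)
        print(f"p={p:5d}  test MSE={err:.3f}")

Test error should peak near p ≈ n_train and then decrease again for p much larger than n_train — the second "descent".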
luckystarr 14 days ago [-]
I vaguely remember this being observed when training GPT-3 (probably?) as well: they just kept training, and the error went up and then came back down again, like a phase transition in the model.
esafak 14 days ago [-]
The low sample efficiency of RL is well explained.