Discussion about this post

Raunak:

Interesting read.

For the part about retraining models on entirely synthetic data collapsing into a Dirac delta function: don't model parameters get initialized with some randomness at the start of each training run? So even if the previous run produces a homogeneous dataset, what guarantees that the next run will too, given that some noise is injected between training runs?
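
A minimal sketch of why initialization noise doesn't rescue the chain, using a toy Gaussian "model" rather than the post's actual setup: the fitted parameters converge to the sample statistics no matter where training starts, and the run-to-run sampling noise shrinks the variance on average rather than restoring it.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_gaussian(data):
    """MLE fit of a Gaussian. The fitted (mu, sigma) depend only on the
    data, not on where the optimizer was initialized: random init changes
    the starting point of training, not the optimum it converges to."""
    return data.mean(), data.std()

# Generation 0 trains on "real" data; every later generation trains
# only on samples drawn from the previous generation's fitted model.
n = 20  # small sample size makes the drift visible quickly
data = rng.normal(0.0, 1.0, n)
for gen in range(201):
    mu, sigma = fit_gaussian(data)
    if gen % 40 == 0:
        print(f"gen {gen:3d}: sigma = {sigma:.4f}")
    data = rng.normal(mu, sigma, n)  # fresh sampling noise each run

# sigma follows a multiplicative random walk whose log has negative
# drift (by Jensen's inequality), so it tends toward 0: the noise
# between runs perturbs the walk but cannot restore lost variance.
```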

Re the data processing inequality: these bounds are pretty hard to reason about. How do we know that the practical maximum quality of model we can extract from our currently available data isn't already way above "superhuman" intelligence?
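
For reference, the inequality being invoked, stated here from standard information theory rather than from the post:

```latex
% Data processing inequality: for a Markov chain X -> Y -> Z
% (say X = world, Y = training corpus, Z = trained model),
I(X; Z) \le I(X; Y)
% i.e. no amount of training can put more information about the world
% into the model than the corpus already carries. The bound caps what
% is extractable, but says nothing about whether that cap sits below
% or far above "superhuman" performance.
```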
