The Puzzling Success of Overparameterization: Lottery Tickets or Escape Dimensions?
Lotteries and tickets are often used as a didactical analogy to explain the success of overparameterized neural networks: “larger networks succeed because they more likely contain a well-initialized subnetwork that can learn the task in isolation, much like buying more tickets increases the chances of winning a lottery.”
This explanation is intuitive but misleading: it suggests that subnetworks can be treated in isolation from the rest of the network. Following this reasoning leads to interpreti...
Read more at infoscience.epfl.ch