Even the most common neural network architectures have a lot of parameters. ResNet50, a frequently used baseline model, has roughly 25 million. This means that during training, we perform a search in a 25-million-dimensional parameter space.
To put this number in perspective, let's take a look at a cube in this space. An n-dimensional cube has 2ⁿ vertices, so in 25 million dimensions, we are talking about 2^25,000,000 points. In a search grid, this entire cube would be just a single cell. For comparison, the number of atoms in the observable universe is estimated to be around 10⁸². It is safe to say that the magnitude of this problem is incomprehensible to us humans.
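To get a feel for just how large 2^25,000,000 is, we can count its decimal digits with a logarithm instead of constructing the number itself (a rough back-of-the-envelope sketch, not tied to any particular model implementation):

```python
import math

# Vertices of a 25-million-dimensional cube: 2**25_000_000.
# Taking log10 gives the number of decimal digits without
# ever materializing the astronomically large integer.
dims = 25_000_000
digits = math.floor(dims * math.log10(2)) + 1

print(f"2^{dims:,} has {digits:,} decimal digits")
# ~7.5 million digits, versus the mere 83 digits of 10^82,
# the estimated number of atoms in the observable universe.
```

So the vertex count of this single cube is not just larger than the number of atoms in the universe; even writing it out in full would take millions of pages.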