“
Yann LeCun's strategy provides a good example of a much more general notion: the exploitation of innate knowledge. Convolutional neural networks learn better and faster than other types of neural networks because they do not learn everything. They incorporate, in their very architecture, a strong hypothesis: what I learn in one place can be generalized everywhere else.
The main problem with image recognition is invariance: I have to recognize an object, whatever its position and size, even if it moves to the right or left, farther or closer. It is a challenge, but it is also a very strong constraint: I can expect the very same clues to help me recognize a face anywhere in space. By replicating the same algorithm everywhere, convolutional networks effectively exploit this constraint: they integrate it into their very structure. Innately, prior to any learning, the system already “knows” this key property of the visual world. It does not learn invariance, but assumes it a priori and uses it to reduce the learning space: clever indeed!
”
Stanislas Dehaene (How We Learn: Why Brains Learn Better Than Any Machine . . . for Now)
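The weight sharing Dehaene describes can be made concrete with a small illustration (not from the book): a convolution slides one and the same kernel over every position of the image, so translation invariance is built into the architecture rather than learned. The sketch below, assuming a 28×28 input and a single 3×3 kernel, also compares the parameter count against a dense layer that would have to learn a separate weight for every input–output pair.

```python
# A minimal sketch of convolutional weight sharing: the identical kernel
# weights are reused at every image location, encoding the assumption that
# "what I learn in one place can be generalized everywhere else."
import numpy as np

def conv2d_valid(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide one shared kernel over the image ('valid' padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same weights are applied at every (i, j) position.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.normal(size=(28, 28))
kernel = rng.normal(size=(3, 3))          # 9 shared parameters in total

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)                  # (26, 26)

# Contrast with a fully connected layer producing the same output size,
# which would need one weight per input-output pair.
dense_params = image.size * feature_map.size
print(f"conv parameters:  {kernel.size}")       # 9
print(f"dense parameters: {dense_params}")      # 529984
```

The comparison is the point of the passage: by assuming invariance a priori, the convolutional architecture shrinks the space of weights to be learned from hundreds of thousands down to a handful.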