Neural Architecture Search (my thesis, slides)
For non-expert readers who are curious about Neural Architecture Search (NAS), I propose the following analogy:
Suppose that God's apprentices would like to have a personal chef and decide to create one. To do so, they need to assemble parts of the human body into a human puppet, which will then be trained to cook.
Among all possible layouts of the body parts, they try a first idea and place the hands on top of the head.
After several hours of training, they observe, quite embarrassed, that their puppet is incapable of cooking even the simplest recipes. One apprentice suggests that this failure might be linked to the puppet's inability to see its actions while using its hands. Taking this remark into account, they decide to revise the puppet's assembly and restart the training...
Now replace the action "assemble the parts of the human body" with "construct the architecture of a neural network", and the apprentices become a computer program written by researchers in the NAS field.
In this analogy, we recognize a well-known optimization process called trial and error, which in NAS amounts to repeating the following steps: build an architecture, train it, and evaluate its performance. However, this procedure separates the search for an architecture from the training of the architecture itself, and it is by construction time-consuming because it requires training many architectures.
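To make this loop concrete, here is a minimal sketch of trial-and-error NAS in Python. The search space, the random sampling strategy, and the train_and_evaluate stand-in are all illustrative assumptions rather than a specific method from the literature; in a real system the evaluation step would actually train each candidate network, which is exactly what makes this approach so expensive.

```python
import random

# Illustrative (hypothetical) search space: one operation to pick per block.
SEARCH_SPACE = {
    "block1": ["conv3x3", "conv5x5", "maxpool"],
    "block2": ["conv3x3", "conv5x5", "maxpool"],
    "block3": ["conv3x3", "conv5x5", "maxpool"],
}

def sample_architecture():
    """Assemble a candidate: randomly pick one operation per block."""
    return {block: random.choice(ops) for block, ops in SEARCH_SPACE.items()}

def train_and_evaluate(architecture):
    """Stand-in for the costly step: a real implementation would build the
    network, train it, and return its validation accuracy."""
    return random.random()  # placeholder score

best_arch, best_score = None, float("-inf")
for _ in range(100):                  # 100 rounds of trial and error
    arch = sample_architecture()      # build an architecture
    score = train_and_evaluate(arch)  # train it and evaluate its performance
    if score > best_score:            # keep the best architecture found so far
        best_arch, best_score = arch, score
```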
An alternative would be to jointly optimize the architecture and its weights, that is, we would construct the chef and train it to cook at the same time.
Such NAS techniques exist and are called one-shot methods. They owe their name to the fact that, in contrast to the trial-and-error procedure above, only one architecture is trained during the entire search.
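As an illustration of the one-shot idea, the sketch below follows a DARTS-style continuous relaxation: each layer computes a softmax-weighted sum of candidate operations, and the architecture parameters (alpha) are learned by gradient descent together with the operations' own weights. The class and parameter names are my own, and this is a simplified sketch of the principle, not a complete one-shot method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One layer of a one-shot supernetwork: its output is a softmax-weighted
    sum of several candidate operations (DARTS-style continuous relaxation)."""

    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),  # candidate 1: 3x3 conv
            nn.Conv2d(channels, channels, 5, padding=2),  # candidate 2: 5x5 conv
            nn.Identity(),                                 # candidate 3: skip connection
        ])
        # one architecture parameter per candidate operation
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))
```

Because the alpha parameters and the convolution weights are updated within the same training run, the search and the training happen at once: only one (super)network is ever trained, and the preferred operations can be read off the learned alpha values at the end.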