I don't. This is so obscenely flawed in obvious ways. The energy to train the model was used only for model training, while the energy used by the baby went to a myriad of tasks, image recognition among them, and the baby can presumably apply the knowledge gained in novel ways. Not only can a baby identify a cat and a dog, but it can also say out loud what the difference is, fire neurons to operate its musculoskeletal system (albeit poorly), and perhaps even no longer shit its pants. Apples and oranges. Is model performance getting more impressive every day? Definitely. Has anyone actually demonstrated "AI"? Still nope.
The context of this thread is the cost of training brains and models on comparable tasks, not whether the model is comparable to a human in every way.
If you want to be pedantic, then the visual cortex is 6% of the human brain, but then you also have to argue that AlexNet is horribly inefficient to train. So you cut the brain cost to 6% and the model cost to 1%. They're still within an order of magnitude (favoring the model), which I'd say is pretty close in terms of energy usage.
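
To put rough numbers on that scaling, here is a minimal sketch. The 6% and 1% fractions are the ones from this comment; the function names are made up for illustration, and the starting energies are placeholders because the actual figures aren't restated here. All it shows is that scaling the two sides by 6% and 1% moves the brain-to-model ratio by a factor of 6, which is less than one order of magnitude.

```python
import math

# Fractions taken from the comment above; everything else is a placeholder.
BRAIN_FRACTION = 0.06   # share of the brain attributed to the visual cortex
MODEL_FRACTION = 0.01   # share of the model's training cost kept after the efficiency argument


def ratio_shift(brain_fraction=BRAIN_FRACTION, model_fraction=MODEL_FRACTION):
    """Factor by which the brain/model energy ratio changes after scaling both sides."""
    return brain_fraction / model_fraction


def adjusted_ratio(brain_energy, model_energy,
                   brain_fraction=BRAIN_FRACTION, model_fraction=MODEL_FRACTION):
    """Brain/model energy ratio after applying both fractions.

    brain_energy and model_energy must be in the same units; the values
    passed in below are hypothetical, not measurements.
    """
    return (brain_energy * brain_fraction) / (model_energy * model_fraction)


def orders_of_magnitude(ratio):
    """Size of a ratio in powers of ten, ignoring which side it favors."""
    return abs(math.log10(ratio))


if __name__ == "__main__":
    shift = ratio_shift()
    print(f"ratio shifts by a factor of {shift:.1f}")
    print(f"that is {orders_of_magnitude(shift):.2f} orders of magnitude")
    # Placeholder: assume the two costs started out equal.
    print(f"adjusted brain/model ratio: {adjusted_ratio(1.0, 1.0):.1f}")
```

Whether the final comparison stays within an order of magnitude depends on the starting ratio, which you'd have to plug in from the figures earlier in the thread.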