Photograph: Pete Souza/The White House

“The picture above is funny. But for me it is also one of those examples that make me sad about the outlook for AI and for Computer Vision. What would it take for a computer to understand this image as you or I do? I challenge you to think explicitly of all the pieces of knowledge that have to fall in place for it to make sense. … I hate to say it but the state of CV and AI is pathetic when we consider the task ahead, and when we think about how we can ever go from here to there. The road ahead is long, uncertain and unclear. … In any case, we are very, very far and this depresses me. What is the way forward?”

-- Andrej Karpathy, Director of AI and Autopilot Vision at Tesla

The photograph above was taken in 2010, and the comment was made in 2012. AI technology has advanced significantly since then, and I wonder whether the comment still holds true today.

Please review the relevant literature and write an essay on how to make AI understand the picture above. Keep in mind that, in his blog, Karpathy gives a long (but not exhaustive) list of things an algorithm must understand to get the joke. You might find the list useful for your essay. Your analysis should be holistic and should include both a review of existing work and possible future directions.