Artificial intelligence has always struck me as a fittingly modest name, one that emphasizes the artifice over the intelligence. Watson, a question-answering system, has recently been playing Jeopardy against humans to test the “DeepQA hypothesis”:
The DeepQA hypothesis is that by complementing classic knowledge-based approaches with recent advances in NLP, Information Retrieval, and Machine Learning to interpret and reason over huge volumes of widely accessible naturally encoded knowledge (or “unstructured knowledge”) we can build effective and adaptable open-domain QA systems. While they may not be able to formally prove an answer is correct in purely logical terms, they can build confidence based on a combination of reasoning methods that operate directly on a combination of the raw natural language, automatically extracted entities, relations, and available structured and semi-structured knowledge from, for example, the Semantic Web.
As a researcher, I’m excited at the milestone this represents.
Each research area mentioned above has progressed markedly in the last 10-15 years to make the experiment plausible, let alone successful. But as a researcher, I’m also a little disappointed by what seem to be the recurring lessons from Deep Blue, the Netflix challenge, and now Watson. The takeaway always seems to be: use more data and process it with a combination of more algorithms. I’m not suggesting that is a simple thing to do; putting that much machinery together is both thankless and incredibly hard. Watson may reflect the strength of ensemble methods and the challenge of open-domain problems. But it becomes difficult to distinguish advances in AI from advances in computational scale. I’m not sure this argument has been made better recently than by Garry Kasparov in his description of a “technology-rich and innovation-poor modern world”. Now that processing power is abundant, is there no better approach to open problems than the combination of solutions individually deemed unsatisfactory? Maybe not.
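To make the ensemble point concrete, here is a toy sketch of why combining individually unsatisfactory solutions can still pay off. The labels, error rates, and predictors below are entirely made up for illustration; nothing here is meant to resemble how Watson actually combines its components.

```python
import random

random.seed(0)

N = 2000
# Synthetic ground-truth labels (0 or 1) for a toy binary task.
truth = [random.randint(0, 1) for _ in range(N)]

def noisy_predictor(labels, error_rate, rng):
    # A "weak" predictor: flips each true label independently
    # with the given error probability.
    return [y ^ (rng.random() < error_rate) for y in labels]

# Three weak predictors, each wrong ~30% of the time, with
# independent error patterns (separate RNG streams).
rngs = [random.Random(i) for i in range(3)]
predictors = [noisy_predictor(truth, 0.3, r) for r in rngs]

def majority_vote(preds):
    # Elementwise majority over the three predictors.
    return [1 if sum(votes) >= 2 else 0 for votes in zip(*preds)]

def accuracy(pred, labels):
    return sum(p == t for p, t in zip(pred, labels)) / len(labels)

individual = [accuracy(p, truth) for p in predictors]
ensemble = accuracy(majority_vote(predictors), truth)
```

With independent 30%-error predictors, the majority vote is wrong only when at least two of the three err at once, so the ensemble lands noticeably above any individual predictor. That independence assumption is exactly what is hard to engineer in practice, which is part of why "just combine more algorithms" is less simple than it sounds.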
In contrast, Cees Snoek documents recent progress in visual search. While declaring victory may be premature, the progress is both convincing and encouraging. It is also founded on advances that reflect both innovation (e.g., improved local image descriptors and visual representations, and advances in kernel methods) and emergent technology (e.g., GPUs) for processing yet more data.
I don’t disagree that big data is unreasonably effective and is changing the nature of both the problems we face and the solutions we propose. But does the focus on more data, more algorithms, and more cores come at the expense of new algorithms and representations? It certainly doesn’t need to. And I’m guessing Watson’s descendants will need those AI innovations every bit as much as they will need more data.