Is IBM’s AI demonstration enough for a quantum killer app?
IBM reports results of running a machine learning classifier on two qubits of its "IBM Q" quantum computer. The researchers haven't yet found the "quantum advantage" over classical computing, but are hopeful they're going in the right direction. Is their approach of massive compute enough to dispel deep learning's fascination with hierarchies of representation?
Last week's issue of Nature magazine contained some intriguing work by IBM and MIT concerning how to implement machine learning on a quantum computer.
The work suggests aspects of machine learning where quantum could actually have a measurable advantage over classical, meaning electronic, computers.
Whether that adds up to a "killer app" for quantum is less certain. It's not enough to do something in quantum computing that's hard to do in classical computing; it has to be something that's worth doing.
Researchers at IBM's T.J. Watson Research Center, including Vojtěch Havlíček, Antonio D. Córcoles, Kristan Temme, Abhinav Kandala, Jerry M. Chow and Jay M. Gambetta, teamed up with Aram W. Harrow of MIT's Center for Theoretical Physics to author the Nature paper, titled "Supervised learning with quantum-enhanced feature spaces." There is a separate article of supplementary material as well that is definitely worth reading. (Subscription to Nature, or individual article purchase, is required to read the articles.)
Using two qubits of the IBM Q machine, they built a classifier, a program that learns how to assign data to different categories by picking out patterns in the data. They found that they could compute a more complex function than conventional computers if they built their classifiers using two of what are called "Hadamard gates," a transformation of the data that is akin to a Fourier transform.
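To make the Hadamard gate a little more concrete, here is a minimal sketch in plain Python of what the gate does to a two-qubit register. It is an illustration only, not the circuit from the paper, and the helper names (`kron`, `apply_gate`) are made up for this example.

```python
import math

# Single-qubit Hadamard gate written as a 2x2 matrix.
S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]

def kron(a, b):
    """Kronecker (tensor) product of two matrices stored as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def apply_gate(matrix, state):
    """Multiply a gate matrix by a state vector of amplitudes."""
    return [sum(m * s for m, s in zip(row, state)) for row in matrix]

# A Hadamard on each of two qubits is the 4x4 matrix H (x) H.
H2 = kron(H, H)

# Start in |00>: all amplitude on the first basis state.
ket00 = [1.0, 0.0, 0.0, 0.0]

# One application spreads the amplitude evenly over all four basis states.
superposition = apply_gate(H2, ket00)

# A second application undoes the first, since the gate is its own inverse.
back = apply_gate(H2, superposition)
```

The even spread over every basis state, and the fact that applying the gate twice returns the original state, are the properties that make the Hadamard behave like a very small Fourier transform.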
The version of machine learning that they pursue in this case is not deep learning. It's what has traditionally been called a "shallow" network, a quantum version of the "support vector machine," or SVM, that was introduced in the 1990s by Vladimir Vapnik.
The SVM has a single "kernel" of weights that transform input data into a "feature map" so that the data can be decisively separated and put into distinct buckets by category of thing. Havlíček and colleagues went looking for a feature map that is hard to compute on a classical computer. They found some, they report, that require the multiple Hadamard gates mentioned above.
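The idea of a feature map that makes data separable can be shown with a toy example. The sketch below is a hedged illustration, not the paper's method: it uses the classic XOR pattern, a hand-picked map that adds the product of the two inputs as a third feature, and a simple perceptron standing in for the SVM's margin-maximizing solver. All names here are hypothetical.

```python
def feature_map(x):
    """Lift a 2-D point into 3-D by adding the product of its coordinates."""
    x1, x2 = x
    return (x1, x2, x1 * x2)

# XOR-style data: no straight line in the plane separates the two classes.
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [-1, 1, 1, -1]

def train_perceptron(samples, targets, epochs=20):
    """Plain perceptron: nudge the weights whenever a sample is misclassified."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def count_correct(samples, targets):
    """Train on the samples and count how many are then classified correctly."""
    w, b = train_perceptron(samples, targets)
    return sum(predict(w, b, x) == y for x, y in zip(samples, targets))

raw_correct = count_correct(points, labels)
mapped_correct = count_correct([feature_map(p) for p in points], labels)
```

On the raw inputs the linear classifier cannot get all four points right, while in the mapped space a single plane separates them. The quantum proposal is the same move with a far harder-to-compute map.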
The question is whether anyone wants a single, extremely complex feature map. The field of deep learning has for many years now spent a lot of effort to assert the inferiority of the SVM approach, and of similar kernel approaches, in favor of deep neural networks such as convolutional neural networks (CNNs), or recurrent neural networks (RNNs).
The reason, as explained in 2013 by Yoshua Bengio of the University of Montreal's MILA institute and colleagues, is that deep networks afford hierarchies of representation. The whole point of deep learning is that being constrained by computational limits forces a discipline upon the deep network that makes it produce abstractions that lead to meaningful generalization.
As Bengio writes, "The concepts that are useful for describing the world around us can be defined in terms of other concepts, in a hierarchy, with more abstract concepts higher in the hierarchy, defined in terms of less abstract ones."
Intelligence, in terms of deep learning forms of machine learning, comes out of constraint. Constraint forces levels of abstraction that lead to more sophisticated representations of data. The IBM researchers are also seeking to construct a representation, only in their case it is a single, very hard-to-compute feature map.
That very hard-to-compute feature map will have to contend with a continued fascination with the phenomenon of depth in deep networks. For example, Stanford University's Ben Poole and colleagues have in recent years explored the geometry of what happens to feature vectors as they traverse a deep neural network. They found that the "manifold," the geometric representation of the data, changes shape in intriguing ways as it goes from order to chaos to order again along the length of a deep neural network.
All of which is to say, the current age is fascinated by depth in a deep neural network, and by what happens to signals as they are transformed in a multi-stage process. The field is gripped by the question of why a computer arrives at hierarchies of representation, not just that it can successfully classify.
The IBM classifier might also have some interesting things to say about representations and how they form, though that doesn't seem to be the focus of the current report. Havlíček and colleagues acknowledge that they haven't yet found the quantum advantage they want, "because we minimized the scope of the problem based on our current hardware capabilities, using only two qubits of quantum computing capacity, which can be simulated on a classical computer."
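The point that two qubits can be simulated classically is easy to demonstrate. The sketch below tracks all four amplitudes of a two-qubit state in plain Python and computes a kernel value as the squared overlap between two encoded data points. It is loosely modeled on the kind of circuit described here (Hadamard layers interleaved with data-dependent phases). The specific phase formula and every function name are assumptions chosen for illustration, so treat this as a sketch rather than the authors' implementation.

```python
import cmath
import math

def hadamard2(state):
    """Apply a Hadamard gate to each of two qubits (a 4-point Walsh-Hadamard transform)."""
    a, b, c, d = state
    return [(a + b + c + d) / 2, (a - b + c - d) / 2,
            (a + b - c - d) / 2, (a - b - c + d) / 2]

def phase_layer(state, x):
    """Multiply each basis amplitude by a data-dependent phase (assumed form)."""
    x1, x2 = x
    out = []
    for index, amp in enumerate(state):
        z1 = 1 - 2 * ((index >> 1) & 1)  # +1 if first qubit is 0, else -1
        z2 = 1 - 2 * (index & 1)         # +1 if second qubit is 0, else -1
        angle = x1 * z1 + x2 * z2 + (math.pi - x1) * (math.pi - x2) * z1 * z2
        out.append(amp * cmath.exp(1j * angle))
    return out

def feature_state(x):
    """Encode a 2-D data point as a two-qubit state: two rounds of Hadamards plus phases."""
    state = [1.0, 0.0, 0.0, 0.0]  # start in |00>
    for _ in range(2):
        state = phase_layer(hadamard2(state), x)
    return state

def quantum_kernel(x, z):
    """Squared magnitude of the overlap between the two encoded states."""
    overlap = sum(a.conjugate() * b for a, b in zip(feature_state(x), feature_state(z)))
    return abs(overlap) ** 2

k_same = quantum_kernel((0.3, 1.2), (0.3, 1.2))
k_diff = quantum_kernel((0.3, 1.2), (2.0, 0.5))
```

With only four amplitudes to track, a laptop handles this instantly. The amplitude count doubles with every added qubit, which is where the hoped-for advantage would have to come from.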
However, they insist a future classifier will "classify far more complex datasets than anything a classical computer could handle," and is hence, "a promising path forward."
In an accompanying piece in Nature, Maria Schuld, with the Quantum Research Group at the School of Chemistry and Physics in the University of KwaZulu-Natal in Durban, South Africa, describes how her research group arrived at a similar discovery to Havlíček and colleagues. In her description of their work, Schuld asks, "Would these techniques be good enough to beat almost 30 years of classical methods?"
If so, it would mean "the desperate search for a 'killer application' for quantum computers would be over," she observes, before adding, "But the answer to this question is probably more complicated."
Will IBM have a quantum advantage in the next decade? Let me know what you think in the comments section.