
NEWS FEATURE

THE LEARNING MACHINES

Using massive amounts of data to recognize photos and speech, deep-learning computers are taking a big step towards true artificial intelligence.

BY NICOLA JONES

Three years ago, researchers at the secretive Google X lab in Mountain View, California, extracted some 10 million still images from YouTube videos and fed them into Google Brain — a network of 1,000 computers programmed to soak up the world much as a human toddler does. After three days looking for recurring patterns, Google Brain decided, all on its own, that there were certain repeating categories it could identify: human faces, human bodies and … cats¹.

Google Brain’s discovery that the Internet is full of cat videos provoked a flurry of jokes from journalists. But it was also a landmark in the resurgence of deep learning: a three-decade-old technique in which massive amounts of data and processing power help computers to crack messy problems that humans solve almost intuitively, from recognizing faces to understanding language.

Deep learning itself is a revival of an even older idea for computing: neural networks. These systems, loosely inspired by the densely interconnected neurons of the brain, mimic human learning by changing the strength of simulated neural connections on the basis of experience. Google Brain, with about 1 million simulated neurons and 1 billion simulated connections, was ten times larger than any deep neural network before it. Project founder Andrew Ng, now director of the Artificial Intelligence Laboratory at Stanford University in California, has gone on to make deep-learning systems ten times larger again.

Such advances make for exciting times in artificial intelligence (AI) — the often-frustrating attempt to get computers to think like humans. In the past few years, companies such as Google, Apple and IBM have been aggressively snapping up start-up companies and researchers with deep-learning expertise. For everyday consumers, the results include software better able to sort through photos, understand spoken commands and translate text from foreign languages. For scientists and industry, deep-learning computers can search for potential drug candidates, map real neural networks in the brain or predict the functions of proteins.

“AI has gone from failure to failure, with bits of progress. This could be another leapfrog,” says Yann LeCun, director of the Center for Data Science at New York University and a deep-learning pioneer.

“Over the next few years we’ll see a feeding frenzy. Lots of people will jump on the deep-learning bandwagon,” agrees Jitendra Malik, who studies computer image recognition at the University of California, Berkeley. But in the long term, deep learning may not win the day; some researchers are pursuing other techniques that show promise. “I’m agnostic,” says Malik. “Over time people will decide what works best in different domains.”

INSPIRED BY THE BRAIN
Back in the 1950s, when computers were new, the first generation of AI researchers eagerly predicted that fully fledged AI was right around the corner. But that optimism faded as researchers began to grasp the vast complexity of real-world knowledge — particularly when it came to perceptual problems such as what makes a face a human face, rather than a mask or a monkey face. Hundreds of researchers and graduate students spent decades hand-coding rules about all the different features that computers needed to identify objects. “Coming up with features is difficult, time consuming and requires expert knowledge,” says Ng. “You have to ask if there’s a better way.”

In the 1980s, one better way seemed to be deep learning in neural networks. These systems promised to learn their own rules from scratch, and offered the pleasing symmetry of using brain-inspired mechanics to achieve brain-like function. The strategy called for simulated neurons to be organized into several layers. Give such a system a picture and the first layer of learning will simply notice all the dark and light pixels. The next layer might realize that some of these pixels form edges; the next might distinguish between horizontal and vertical lines. Eventually, a layer might recognize eyes, and might realize that two eyes are usually present in a human face (see ‘Facial recognition’).

[FIGURE — FACIAL RECOGNITION: Deep-learning neural networks use layers of increasingly complex rules to categorize complicated shapes such as faces. Layer 1: the computer identifies pixels of light and dark. Layer 2: the computer learns to identify edges and simple shapes. Layer 3: the computer learns to identify more complex shapes and objects. Layer 4: the computer learns which shapes and objects can be used to define a human face. Images: Andrew Ng]

[NATURE.COM: Learn about another approach to brain-like computers: go.nature.com/fktnso]
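To make the layering concrete, here is a minimal sketch in Python/NumPy. It is our illustration only, with invented layer sizes, random weights and a fake image; it is not code from Google Brain or any lab mentioned here.

```python
# Illustrative sketch: a tiny feed-forward network in the spirit of the
# layered systems described above. All sizes, weights and data are made up.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, weights, bias):
    # One simulated layer: a weighted sum of inputs squashed by a
    # nonlinearity. "Learning" means adjusting the weights and biases.
    return np.tanh(x @ weights + bias)

# A 64-pixel toy "image" flows through three layers. After training, early
# layers tend to respond to edges and later ones to whole shapes; with the
# untrained random weights used here, the features are meaningless.
sizes = [64, 32, 16, 2]
params = [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes, sizes[1:])]

activation = rng.random(64)          # stand-in for pixel brightnesses
for weights, bias in params:
    activation = layer(activation, weights, bias)
print(activation)                    # the network's raw output scores
```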
The first deep-learning programs did not perform any better than simpler systems, says Malik. Plus, they were tricky to work with. “Neural nets were always a delicate art to manage. There is some black magic involved,” he says. The networks needed a rich stream of examples to learn from — like a baby gathering information about the world. In the 1980s and 1990s, there was not much digital information available, and it took too long for computers to crunch through what did exist. Applications were rare. One of the few was a technique — developed by LeCun — that is now used by banks to read handwritten cheques.

By the 2000s, however, advocates such as LeCun and his former supervisor, computer scientist Geoffrey Hinton of the University of Toronto in Canada, were convinced that increases in computing power and an explosion of digital data meant that it was time for a renewed push. “We wanted to show the world that these deep neural networks were really useful and could really help,” says George Dahl, a current student of Hinton’s.

As a start, Hinton, Dahl and several others tackled the difficult but commercially important task of speech recognition. In 2009, the researchers reported² that after training on a classic data set — three hours of taped and transcribed speech — their deep-learning neural network had broken the record for accuracy in turning the spoken word into typed text, a record that had not shifted much in a decade with the standard, rules-based approach. The achievement caught the attention of major players in the smartphone market, says Dahl, who took the technique to Microsoft during an internship. “In a couple of years they all switched to deep learning.” For example, the iPhone’s voice-activated digital assistant, Siri, relies on deep learning.

GIANT LEAP
When Google adopted deep-learning-based speech recognition in its Android smartphone operating system, it achieved a 25% reduction in word errors. “That’s the kind of drop you expect to take ten years to achieve,” says Hinton — a reflection of just how difficult it has been to make progress in this area. “That’s like ten breakthroughs all together.”
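A note on how a “25% reduction in word errors” is scored: speech recognizers are conventionally judged by word error rate, the minimum number of word substitutions, insertions and deletions separating the system’s transcript from a human reference, divided by the reference length. The sketch below is our illustration of that standard metric, not Google’s evaluation code.

```python
# Illustrative sketch: word error rate (WER) via edit distance over words.
def word_error_rate(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # Classic dynamic-programming edit distance, computed over whole words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i              # cost of deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j              # cost of inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, substitution)
    return d[-1][-1] / len(ref)

# A 25% relative reduction means, say, a WER of 0.20 falling to 0.15.
print(word_error_rate("recognize speech using deep learning",
                      "wreck a nice speech using deep learning"))  # 0.6
```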
Meanwhile, Ng had convinced Google to let him use its data and computers on what became Google Brain. The project’s ability to spot cats was a compelling (but not, on its own, commercially viable) demonstration of unsupervised learning — the most difficult learning task, because the input comes without any explanatory information such as names, titles or categories. But Ng soon became troubled that few researchers outside Google had the tools to work on deep learning. “After many of my talks,” he says, “depressed graduate students would come up to me and say: ‘I don’t have 1,000 computers lying around, can I even research this?’”

So back at Stanford, Ng started developing bigger, cheaper deep-learning networks using graphics processing units (GPUs) — the super-fast chips developed for home-computer gaming³. Others were doing the same. “For about US$100,000 in hardware, we can build an 11-billion-connection network, with 64 GPUs,” says Ng.

VICTORIOUS MACHINE
But winning over computer-vision scientists would take more: they wanted to see gains on standardized tests. Malik remembers that Hinton asked him: “You’re a sceptic. What would convince you?” Malik replied that a victory in the internationally renowned ImageNet competition might do the trick.

In that competition, teams train computer programs on a data set of about 1 million images that have each been manually labelled with a category. After training, the programs are tested by getting them to suggest labels for similar images that they have never seen before. They are given five guesses for each test image; if the right answer is not one of those five, the test counts as an error. Past winners had typically erred about 25% of the time. In 2012, Hinton’s lab entered the first ever competitor to use deep learning. It had an error rate of just 15% (ref. 4).
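That five-guess rule, often called top-5 error, is simple to state in code. The sketch below is our illustration with random stand-in scores, not the competition’s official scorer; with 100 fake categories, random guessing hovers near 95% error, which puts the past winners’ 25% and Hinton’s 15% in context.

```python
# Illustrative sketch: "top-5" error with fabricated scores and labels.
import numpy as np

def top5_error(scores, true_labels):
    # scores: (n_images, n_classes) model confidences per image;
    # true_labels: (n_images,) index of the correct class for each image.
    top5 = np.argsort(scores, axis=1)[:, -5:]        # five best guesses
    hits = (top5 == true_labels[:, None]).any(axis=1)
    return 1.0 - hits.mean()                         # fraction of misses

rng = np.random.default_rng(0)
scores = rng.random((1000, 100))                     # 1,000 fake test images
labels = rng.integers(0, 100, size=1000)
print(f"top-5 error: {top5_error(scores, labels):.1%}")
```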

“Deep learning stomped on everything else,” says LeCun, who was not part of that team. The win landed Hinton a part-time job at Google, and the company used the program to update its Google+ photo-search software in May 2013.

Malik was won over. “In science you have to be swayed by empirical evidence, and this was clear evidence,” he says. Since then, he has adapted the technique to beat the record in another visual-recognition competition⁵. Many others have followed: in 2013, all entrants to the ImageNet competition used deep learning.

With triumphs in hand for image and speech recognition, there is now increasing interest in applying deep learning to natural-language understanding — comprehending human discourse well enough to rephrase or answer questions, for example — and to translation from one language to another. Again, these are currently done using hand-coded rules and statistical analysis of known text. The state of the art of such techniques can be seen in software such as Google Translate, which can produce results that are comprehensible (if sometimes comical) but nowhere near as good as a smooth human translation. “Deep learning will have a chance to do something much better than the current practice here,” says crowd-sourcing expert Luis von Ahn, whose company Duolingo, based in Pittsburgh, Pennsylvania, relies on humans, not computers, to translate text. “The one thing everyone agrees on is that it’s time to try something different.”

DEEP SCIENCE
In the meantime, deep learning has been proving useful for a variety of scientific tasks. “Deep nets are really good at finding patterns in data sets,” says Hinton. In 2012, the pharmaceutical company Merck offered a prize to whoever could beat its best programs for helping to predict useful drug candidates. The task was to trawl through database entries on more than 30,000 small molecules, each of which had thousands of numerical chemical-property descriptors, and to try to predict how each one acted on 15 different target molecules. Dahl and his colleagues won $22,000 with a deep-learning system. “We improved on Merck’s baseline by about 15%,” he says.
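The shape of that task is worth spelling out: a large table with one row per molecule and one column per descriptor, plus 15 activity values to predict for each molecule. The sketch below uses fabricated data at a fraction of the real scale and fits a plain ridge-regression baseline; it stands in for the kind of conventional model a deep network would compete against, not for Dahl’s winning system.

```python
# Illustrative sketch: multitask prediction in the shape of the Merck task,
# with fabricated, scaled-down data (the real set had >30,000 molecules and
# thousands of descriptors).
import numpy as np

rng = np.random.default_rng(0)
n_molecules, n_descriptors, n_targets = 3_000, 200, 15

X = rng.normal(size=(n_molecules, n_descriptors))        # descriptor table
true_w = rng.normal(size=(n_descriptors, n_targets))
Y = X @ true_w + rng.normal(scale=5.0, size=(n_molecules, n_targets))

# One ridge-regression solve covers all 15 targets at once, because Y holds
# one column of activities per target molecule.
lam = 10.0
w = np.linalg.solve(X.T @ X + lam * np.eye(n_descriptors), X.T @ Y)
rmse = np.sqrt(np.mean((X @ w - Y) ** 2, axis=0))
print("per-target RMSE:", np.round(rmse, 2))             # 15 numbers
```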

Biologists and computational researchers including Sebastian Seung of the Massachusetts Institute of Technology in Cambridge are using deep learning to help them to analyse three-dimensional images of brain slices. Such images contain a tangle of lines that represent the connections between neurons; these need to be identified so that they can be mapped and counted. In the past, undergraduates have been enlisted to trace out the lines, but automating the process is the only way to deal with the billions of connections that are expected to turn up as such projects continue. Deep learning seems to be the best way to automate the work. Seung is currently using a deep-learning program to map neurons in a large chunk of the retina, then forwarding the results to be proofread by volunteers in a crowd-sourced online game called EyeWire.

William Stafford Noble, a computer scientist at the University of Washington in Seattle, has used deep learning to teach a program to look at a string of amino acids and predict the structure of the resulting protein — whether various portions will form a helix or a loop, for example, or how easy it will be for a solvent to sneak into gaps in the structure. Noble has so far trained his program on one small data set; over the coming months he will move on to the Protein Data Bank, a global repository that currently contains nearly 100,000 structures.

For computer scientists, deep learning could earn big profits: Dahl is thinking about start-up opportunities, and LeCun was hired last month to head a new AI department at Facebook. The technique also holds the promise of practical success for AI. “Deep learning happens to have the property that if you feed it more data it gets better and better,” notes Ng. “Deep-learning algorithms aren’t the only ones like that, but they’re arguably the best — certainly the easiest. That’s why it has huge promise for the future.”

Not all researchers are so committed to the idea. Oren Etzioni, director of the Allen Institute for Artificial Intelligence in Seattle, which launched last September with the aim of developing AI, says he will not be using the brain for inspiration. “It’s like when we invented flight,” he says; the most successful designs for aeroplanes were not modelled on bird biology. Etzioni’s specific goal is to invent a computer that, when given a stack of scanned textbooks, can pass standardized elementary-school science tests (ramping up eventually to pre-university exams). To pass the tests, a computer must be able to read and understand diagrams and text. How the Allen Institute will make that happen is as yet undecided — but for Etzioni, neural networks and deep learning are not at the top of the list.

One competing idea is to rely on a computer that can reason on the basis of inputted facts, rather than trying to learn its own facts from scratch. So it might be programmed with assertions such as ‘all girls are people’. Then, when it is presented with a text that mentions a girl, the computer could deduce that the girl in question is a person. Thousands, if not millions, of such facts are required to cover even ordinary, common-sense knowledge about the world. But it is roughly what went into IBM’s Watson computer, which famously won a match of the television game show Jeopardy against top human competitors in 2011. Even so, IBM’s Watson Solutions has an experimental interest in deep learning for improving pattern recognition, says Rob High, chief technology officer for the company, which is based in Austin, Texas.
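A toy version of that fact-based reasoning shows the flavour. The assertions and lookup below are invented for illustration; they are nothing like the scale or sophistication of Watson or whatever the Allen Institute eventually builds.

```python
# Illustrative sketch: deduction from hand-entered "is-a" facts rather than
# from statistics learned from data.
IS_A = {                      # hand-coded assertions: key is a kind of value
    "girl": "person",
    "person": "mammal",
    "mammal": "animal",
}

def deduced_categories(noun):
    # Chain through the is-a facts to collect everything the noun must be.
    found = []
    while noun in IS_A:
        noun = IS_A[noun]
        found.append(noun)
    return found

print(deduced_categories("girl"))   # ['person', 'mammal', 'animal']
```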
Google, too, is hedging its bets. Although its latest advances in picture tagging are based on Hinton’s deep-learning networks, it has other departments with a wider remit. In December 2012, it hired futurist Ray Kurzweil to pursue various ways for computers to learn from experience — using techniques including but not limited to deep learning. Last May, Google acquired a quantum computer made by D-Wave in Burnaby, Canada (see Nature 498, 286–288; 2013). This computer holds promise for non-AI tasks such as difficult mathematical computations — although it could, theoretically, be applied to deep learning.

Despite its successes, deep learning is still in its infancy. “It’s part of the future,” says Dahl. “In a way it’s amazing we’ve done so much with so little.” And, he adds, “we’ve barely begun”. ■

Nicola Jones is a freelance reporter based near Vancouver, Canada.

1. Le, Q. V. et al. Preprint at http://arxiv.org/abs/1112.6209 (2011).
2. Mohamed, A. et al. 2011 IEEE Int. Conf. Acoustics Speech Signal Process. http://dx.doi.org/10.1109/ICASSP.2011.5947494 (2011).
3. Coates, A. et al. J. Machine Learn. Res. Workshop Conf. Proc. 28, 1337–1345 (2013).
4. Krizhevsky, A., Sutskever, I. & Hinton, G. E. in Advances in Neural Information Processing Systems 25 (2012); available at go.nature.com/ibace6.
5. Girshick, R., Donahue, J., Darrell, T. & Malik, J. Preprint at http://arxiv.org/abs/1311.2524 (2013).
