

TechFest Rick Rashid March 6, 2012

ANNOUNCER: Ladies and gentlemen, please welcome Microsoft's Chief Research Officer, Rick Rashid. (Applause.)

RICK RASHID: I would like to welcome all of you to sunny Seattle. Unfortunately, when I got here an hour ago it was snowing, so I won't be able to do that. But I will welcome you to TechFest 2012.

This is, for me at least, one of my favorite times of the year, because it's an opportunity for us to get so many of our researchers from around the world here in one place, showing off a lot of their ideas and a lot of the technologies they've been developing, and having a chance to interact with our product groups, and of course having a chance to interact with you.

So, let me get moving on this. Now, as the video just showed, I came to Microsoft 20 years ago, actually 20-1/2 now to be precise, in September of 1991. And I came to create Microsoft Research, to really start a fundamental research lab in the context of a software company, which is Microsoft. Now, the idea of Microsoft Research really started with a memo that Nathan Myhrvold, who is one of the people featured in that video, wrote to the board of directors of Microsoft in 1990, really saying that he felt it was important that Microsoft make fundamental investments in science in order for it to be able to thrive and grow for the long-term future.

Now, it was unusual in a sense to be making that kind of an argument in the context of a company like Microsoft back then, because Microsoft was a very small company. We only had a few thousand employees. We were just crossing over $1 billion in sales. And so it wasn't really clear that a company of that size would necessarily want to invest and make that kind of commitment to long-term fundamental basic research.

But, I think it's a statement about Microsoft, I think it's a statement about our vision as a company that we do tend to look at the long-term. We try to play the long game. And so the decision was made by the board in 1990 to start Microsoft Research. They reached out. They hired me, brought me in in September of 1991. And we've been building the lab ever since then.

Now, through that whole period, we've always had a single mission. We've had the advantage that we've always had the same person running the organization—that's me. And I've always had the same mission. I haven't been messing with my mission statement. And this is the mission here. The key thing is to expand the state of the art in the areas that we do research. That's really our most important activity, because if we're not doing that, if we're not really pushing the frontiers, then we're not really going to be that valuable as an organization to a company like Microsoft.

Obviously, when we have great ideas, when things make sense, then that's the second part of our mission statement, which is, we work really hard to get those ideas into our products. You'll see more about that in a bit.

And then, finally, these are overarching values that we bring to the company that's really embodied in that third point, which is we're really here to make sure that Microsoft will be here 10 years from now, 15 years from now, or 20 years from now. If you think back 20 years ago when Microsoft Research was started, very few of the companies that were Microsoft's peers at that time still exist today. The technology industry is a constantly changing industry. You need to constantly be able to change. And having a fundamental basic research group like Microsoft Research gives Microsoft that ability to be agile and to change.

Now, we've grown from a very small group to what is really now one of the largest basic research groups in the field of computer science. You see on the slide here all the different locations that we have. We have six significant locations for doing basic research. Here in Redmond, Microsoft Research Redmond. Our second largest lab is in Beijing. Our third largest lab is in Cambridge, England. We have Bangalore, India. We have New England, and Mountain View, California. But we also have other groups around the world as well. We have a small group doing research in Santa Barbara. We have joint research arrangements with INRIA in Paris. We have advanced technology teams that are doing advanced development work in Germany, in Egypt, and in Israel. So, that gives you a sense of the reach and the breadth of the organization as it's grown.

We have about 850 Ph.D. researchers at Microsoft Research. To put that in perspective, that means about 1 percent of all Microsoft employees are Ph.D. researchers doing fundamental basic research, which is a pretty significant fraction of the corporation.

We cover a very broad range of research, and you'll see some of that today as you look around the exhibits, but we cover not just all the traditional areas of computer science; we even reach out into other areas of science that computer science is increasingly bleeding into, whether that be biology, chemistry, physics, or the environmental sciences. Increasingly we're part of those communities as well, because the entire field of computer science has, in a sense, merged into those areas.

One of the reasons we have a global organization, and you saw that from the previous slide, is because we are really about hiring the best and the brightest people from around the world. In some sense, people often ask me, how do you decide what you're doing research on, and how do you decide what are the key areas that you're going to invest in? And I don't invest in specific research projects. I don't invest in specific areas. I invest in people. We try to hire the best and brightest people wherever we can find them, because that's what really drives long-term research. That's what really drives the innovations and the breakthroughs that are important to the company.

The impact we have really stretches across all of Microsoft's products, and I think this slide gives you kind of a sense of it. Really, there isn't a product that Microsoft produces that doesn't have either technology from Microsoft Research in it, or wasn't built with technology from Microsoft Research. The impact we've had across our entire product line has been enormous.

When I was a university professor before coming to Microsoft, one of the reasons I came to Microsoft to start Microsoft Research was to have that kind of an impact. It was to make sure that the ideas, the technologies, the artifacts that we created in computer science research would have the ability to change the world and impact people. And we do that through our products and the things that we create.

One way of thinking about research is to think about it using what we sometimes refer to as a four-quadrant diagram, which I got from Peter Lee who heads up our research lab here in Redmond. It gives you a sense of the kinds of research that we do. Over in the lower left-hand corner there, we do a lot of work that really enhances the mission of our product teams, that makes those products better in fundamental ways. So, if you use a Windows Phone, you use the text input facility there, the fact that it is so smart about the things that you're texting and can recognize what you're doing, that comes out of research work that we've done. If you use Bing, our ability to have Bing be competitive with Google in the marketplace and really exceed it in a number of areas has really come from the long-term investments we've made in technology that is now embedded in our Bing product. And, increasingly, we're taking Bing in new directions based on our research.

We also create technologies that sustain our products. Again, I mentioned some of the technologies we've created that really enhance the way we build our products. We've added new proof tools that let us prove properties of our programs so that we can eliminate large categories of bugs. We've added fuzz testing to analyze our software systems and find problems that would be very difficult to find any other way. And we've also taken the technologies we've created and enhanced our products in significant ways.

So, if you use Bing Translator, which is really an exploding service for us, it's up 600 percent just in the last year in terms of access, and the queries through our translation facility are just going through the roof. That's technology that's based on long-term investments in fundamental basic research, really understanding how to be able to do translation. And, in fact, one of the booths you'll see when you go out is work that we're doing to bring the entire community in, the language community, so that we can translate all languages from one to another, and make that kind of a democratic and egalitarian world out there where it isn't just one organization deciding what languages get translated, but in fact the large community of people and the hundreds and hundreds of languages that are out there will be able to be translated in the future.

I think in the demos that you'll be seeing we'll be emphasizing a couple of key areas -- I wanted to highlight those and give you a couple of demonstrations, so that you'll have a chance to see some of the things that you'll be looking at in the demo booth later today.

One of the trends, one of the things that's happening -- and certainly Microsoft Research has been leading the drive in that area -- is this notion that increasingly the virtual and the physical worlds are merging. Part of this is happening because we're giving computers the same senses that we have. We're giving them the ability to see. We're giving them the ability to hear and to understand. We're giving them the ability to speak, to touch, to feel, to know where they are, to sense motion. Those are the kinds of things that we think of as unique to humans, and yet now our computers are capable of doing those same things. As that happens, we're changing the kinds of applications that computers can be used for, and we're changing the way in which we interact with them. You see that with some of the technologies we've been introducing, such as Microsoft Kinect, where suddenly we've given the computer the ability to see in three dimensions in real time, to recognize what a person is doing and how they're interacting with the world. And that's just changing the way people perceive the relationship between a computer and an individual. So, you'll see a number of demonstrations of that type today.

I'm going to bring on stage Dr. Frank Soong. He's going to talk about some work that we've been doing in our lab, specifically to address the problem of how do we make our computers speak, and in particular, how can we make them speak in multiple languages with a single voice.

So, Frank, show us what you've been working on.

FRANK SOONG: Thank you, Rick. So, in the next few minutes I'm going to tell you about TTS, or text-to-speech synthesis. To build an English text-to-speech system, we just go out, find an English speaker, collect speech data, and then train the TTS. Easy and simple. Similarly, we can do that for any other language. But how about when you have a monolingual speaker: you like his English TTS output, but unfortunately he's monolingual. Can we train a different-language TTS for his voice? So, this is really the challenge. A typical scenario is that you have mixed-coded or multi-language text to be read out, or to be said, by the TTS system.

For example, a car navigation TTS for an American driving in Beijing, if he's brave enough to do that. These are typical mixed-coded driving instructions: the major instructions, turn left, turn right, are in English, and the key terms, landmarks, and street names are in Chinese. But can we do that? There are really serious challenges. So, what we want is a mixed-lingual, or multi-lingual, TTS, but we still want the whole output to be seamless, to be spoken in a consistent, single voice.

Hopefully there are no gaps, no artifacts, no glitches between transitions. Of course, you could take the easy way out and find a truly fluent bilingual speaker, record both Chinese and English, train the two TTS systems, and merge them together. But, of course, it's not always easy. Maybe you like his English but you don't like his Chinese, or the other way around. So the challenge is really: can we use monolingual data, say English data, to train a Chinese TTS? And the whole thing becomes even more challenging because there are so many possible combinations -- pairs, or even triples, of languages in one text.

So, let me just show you what we have done. We start with a monolingual English female speaker's TTS, and then we use that training data to train the corresponding Chinese TTS and mix the two together in the output.

So, this is the driving instruction that I showed you earlier. [Audio of TTS system reading mixed-coded directions.] I hope that convinces you that the voice is consistent. So, from Chinese to English, or English to Chinese, it's still one consistent female voice.

So, the idea, or the algorithm -- how can we do it? We start with a reference Chinese speaker. From the Chinese speaker's sentence, we construct the trajectory: the fundamental frequency, the gain or loudness, and the short-time spectral information. But that's for the reference Chinese speaker, and we need to warp, or equalize, the difference between the monolingual English speaker and this reference Chinese speaker. So we warp the trajectory toward that target English monolingual speaker, and once we've warped it, we can build this so-called sausage.

We break the English database into very small pieces -- in this case, five milliseconds apiece -- and then collect all the pieces that are closest to the trajectory of the warped Chinese sentence. So, we form a sausage-like network. Within that network, we do an optimal search to find the truly best concatenation of this sequence of tiles. Once we can make one sentence, we can do that for many, many training sentences, and once we have those training sentences, we can train our Chinese TTS. And so, using the same technology, we tried it on our boss, Rick.
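
(For illustration, here is a minimal Python sketch of the tiling-and-search step Frank describes: slice a speech database into short tiles, keep the candidates closest to each frame of the warped trajectory to form the "sausage," and pick the cheapest concatenation with a Viterbi-style dynamic program. The five-millisecond tile length comes from the talk; the features, distance measure, and random data are placeholder assumptions, not the actual system.)

```python
import numpy as np

FRAME_MS = 5  # tile length in milliseconds, as described on stage

def nearest_tiles(target_frame, database, k=10):
    """Return indices of the k database tiles closest to one trajectory frame."""
    dists = np.linalg.norm(database - target_frame, axis=1)
    return np.argsort(dists)[:k]

def tile_search(warped_trajectory, database, k=10, join_weight=1.0):
    """Form the 'sausage' of candidate tiles per frame, then pick the cheapest
    concatenation with a Viterbi-style dynamic program (target + join costs)."""
    sausage = [nearest_tiles(frame, database, k) for frame in warped_trajectory]

    # cost[i][j]: best cumulative cost ending in candidate j at frame i
    cost = [np.linalg.norm(database[sausage[0]] - warped_trajectory[0], axis=1)]
    back = []
    for i in range(1, len(sausage)):
        target = np.linalg.norm(database[sausage[i]] - warped_trajectory[i], axis=1)
        # join cost: mismatch between every previous candidate and every current one
        join = np.linalg.norm(database[sausage[i]][None, :, :]
                              - database[sausage[i - 1]][:, None, :], axis=2)
        total = cost[-1][:, None] + join_weight * join + target[None, :]
        back.append(np.argmin(total, axis=0))
        cost.append(np.min(total, axis=0))

    # trace back the optimal sequence of tiles
    path = [int(np.argmin(cost[-1]))]
    for b in reversed(back):
        path.append(int(b[path[-1]]))
    path.reverse()
    return [sausage[i][j] for i, j in enumerate(path)]

# Hypothetical data: 40-dimensional spectral features per 5 ms tile.
english_tiles = np.random.randn(5000, 40)   # stand-in for the English database
warped_chinese = np.random.randn(200, 40)   # stand-in for one warped Chinese sentence
print(len(tile_search(warped_chinese, english_tiles)), "tiles selected")
```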

RICK RASHID: Okay. So, what are you going to make me do now?

FRANK SOONG: I know you are more than monolingual, but how about Spanish?

RICK RASHID: I don't speak Spanish, I'm afraid.

FRANK SOONG: Well, all right. So, here is the English. Let's see how the TTS speaks in Rick's voice, but in Spanish. [Audio of Rick's TTS voice speaking in Spanish.] Or another question, asking how to take the train from Madrid to Barcelona. [Audio of Rick's TTS voice speaking in Spanish.] That's all trained from about one hour of Rick's English recordings. And so --

RICK RASHID: This is like putting words in your boss's mouth.

FRANK SOONG: Yes, basically. Can I ask for a raise?

RICK RASHID: Only in Spanish.

FRANK SOONG: So, next time we'll make a Chinese one. Actually, we do have a Chinese one for you, too.

RICK RASHID: All right. I'll have to try that next time I'm in China.

FRANK SOONG: So, to make the whole thing even more virtual, we also tried Rick's boss, Craig Mundie. He was too busy, so we only got one hour while he was staying in Beijing to do video and audio recording of his English. I'm pretty sure that he doesn't speak Chinese -- probably some other language, I think. So, using that data, we constructed Craig's English TTS, plus his talking head, or 3D avatar. So, here it is.

(Video segment of Craig’s avatar speaking in English.)

All right. Here comes the Mandarin.

(Video segment of Craig’s avatar speaking in Mandarin.)

So, to summarize: with this technology -- multi-lingual TTS from only monolingual data from a speaker -- we can enable quite a few application scenarios. One is learning a foreign language: to motivate the user or learner, we can synthesize the target language in his own voice, to show that if you try harder you can achieve this kind of level. Another is speech-to-speech translation: for a monolingual speaker traveling in a foreign country, we do recognition, followed by translation, followed by the final TTS output in a different language, but still in his own voice. Or any kind of mixed-coded, multi-language text can be read out. So, here is the URL, and welcome to our booth. We'll show you the real-time demo, and we have a machine ready for running mixed-text input in real time.
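
(As a rough sketch of the speech-to-speech scenario just described -- recognition, then translation, then synthesis in the traveler's own cross-lingually trained voice -- the Python below only shows the shape of the pipeline. The three component functions are hypothetical stubs, not Microsoft APIs, and the example sentence echoes the Madrid-to-Barcelona question from the demo.)

```python
from dataclasses import dataclass

@dataclass
class VoiceModel:
    """Cross-lingual TTS voice trained from one speaker's monolingual recordings."""
    speaker: str
    source_language: str

def recognize(audio: bytes, language: str) -> str:
    # Hypothetical recognizer stub: the traveler's speech -> text.
    return "how do I take the train from Madrid to Barcelona"

def translate(text: str, source: str, target: str) -> str:
    # Hypothetical machine-translation stub (e.g. English -> Spanish).
    return "como tomo el tren de Madrid a Barcelona"

def synthesize(text: str, voice: VoiceModel, language: str) -> bytes:
    # Hypothetical multilingual TTS stub: speak `text` in `language`,
    # keeping the timbre of `voice.speaker`.
    return f"[{voice.speaker}'s voice, {language}]: {text}".encode()

def speech_to_speech(audio: bytes, voice: VoiceModel, target_language: str) -> bytes:
    """Recognition -> translation -> synthesis, keeping the speaker's own voice."""
    text = recognize(audio, voice.source_language)
    translated = translate(text, voice.source_language, target_language)
    return synthesize(translated, voice, target_language)

print(speech_to_speech(b"<audio>", VoiceModel("Rick", "en"), "es"))
```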

RICK RASHID: Great. Thanks, Frank. Thanks very much. (Applause.)

So another area, another trend that's going on right now that we're highlighting today at TechFest is what some people talk about in terms of big data, or the cloud: this idea that now that we have enormous amounts of storage, petabytes of storage, and can bring 100,000 machines into a single data center, we can build data systems that allow us to really understand the vast amounts of knowledge we can collect, in ways that we weren't able to before. They really give us new insights, whether it's insights into our businesses, insights into ourselves, into our biology, insights into astronomy, or insights into the planet and the environment. So, it's really, again, a way of thinking about solving the world's problems that's new and different and is really enabled by technologies like Windows Azure.

So, I'm going to bring on stage Dr. Drew Purves. Drew heads up our Computational Ecology and Environmental Sciences Group at Microsoft Research Cambridge. Drew is going to talk about some of the work that they've been doing in terms of building tools that really can analyze and process huge amounts of information in novel ways. And of course, one of the most -- one of the biggest sources of information is the planet itself.

Drew.

DREW PURVES: Thank you, Rick.

Hi, everyone. I'm very excited to be able to show you something called FetchClimate!, which has come out of my group in Cambridge. FetchClimate! allows experts and non-experts alike to very easily extract complex climate information. Climate has a special meaning: it describes the typical pattern of weather that you can expect to experience at different times and places around the world. People often say climate is what you expect, weather is what you get. As we all know, weather is highly unpredictable -- I also had to make it through the snow this morning. But climate is very predictable. And the good news is there are huge amounts of climate data available. The bad news is that it's extremely difficult for even the experts to extract useful information from those data, which is why we built FetchClimate! It's an intelligent, automatic, and very quick service that lets people extract that complex information about the climate, either with just a few lines of code from inside a program, or with a few clicks of the mouse through a browser, and that's what we're looking at here.

So, just to show you the sorts of things we can do, since we're in Redmond, we can go and find Redmond on the map. This is running on top of Bing Maps. Before I came out to Redmond from Cambridge, I just clicked on the map and hit FETCH, and what this is doing, for the period 1961 to 1990 -- although we could choose other periods -- is taking the average temperature at that location. So, what's happened there is the Silverlight application running in the browser has sent a message to a service running over Azure. That service peruses a number of alternative data sets. It runs the query over all of the data sets that are suitable for that query. It calculates the uncertainty associated with each answer. And then by default it returns the answer that has the lowest uncertainty.

So, there's really quite a lot going on mathematically, a lot of calculations, but Azure lets us do that in a way where those calculations can be farmed out in parallel and we're able to reconcile those data sets, which at first glance are mutually incompatible, and treat them in a common currency and return the answer.
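
(A minimal sketch of the selection logic just described, assuming a hypothetical set of gridded data sources rather than the actual FetchClimate service or its API: each suitable dataset is queried in parallel, an uncertainty estimate comes back with each answer, and the answer with the lowest uncertainty is returned by default.)

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ClimateAnswer:
    value: float        # e.g. mean March temperature, in degrees C
    uncertainty: float  # uncertainty estimate attached to that value
    source: str         # provenance: which dataset produced the answer

# Hypothetical stand-ins for gridded climate datasets; each returns an estimate
# plus an uncertainty for a (lat, lon, years, variable) query.
DATASETS: Dict[str, Callable[..., ClimateAnswer]] = {
    "reanalysis_a":   lambda lat, lon, years, var: ClimateAnswer(7.4, 0.9, "reanalysis_a"),
    "station_grid_b": lambda lat, lon, years, var: ClimateAnswer(7.2, 0.4, "station_grid_b"),
}

def fetch(lat: float, lon: float, years: range, variable: str) -> ClimateAnswer:
    """Query every suitable dataset in parallel and return the answer with the
    lowest uncertainty -- the default behaviour described above."""
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(lambda ds: ds(lat, lon, years, variable),
                                DATASETS.values()))
    return min(answers, key=lambda a: a.uncertainty)

if __name__ == "__main__":
    best = fetch(47.67, -122.12, range(1961, 1991), "temperature_march")
    print(best)  # ClimateAnswer(value=7.2, uncertainty=0.4, source='station_grid_b')
```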

So, you see, we can hover our mouse over here and find out that the average temperature in Redmond is 10.7 degrees C. That's fairly mild. We can do better than that. Since I was coming out in March, I can ask, well, what's the average temperature in March in Redmond, and again that FETCH will happen and we'll get the answer. We find out that in Redmond the average temperature in March is about 7.2 degrees C, and that's what it felt like to me when I got up this morning: fairly chilly, not freezing. Now, we can run a number of these points in parallel if we want to; again, Azure will simply farm those calculations out. We can see the color scheme here matches the temperature. So, although it's 7.2 degrees where we just looked, it gets colder, and all the way up here the average temperature in March is about -2.8, which is a lot colder.

I'm just going to clear the regions now and do what's called a grid search. I can just hold down control and define a region over here. That puts down a grid, which will also be farmed out in parallel, again for the same period of years, 1961 to 1990, in March, and then we're actually visualizing the data on top of Bing Maps with another prototype tool from my group called Dynamic Data Display. So, you can see it's very easy to create a map of temperatures like that -- the kind of thing that, at the moment, would take even experts quite a while to produce, and ordinary people wouldn't have a hope of even beginning.

You can imagine, for instance, if you're planning where to put a hotel in the area, you could use it in that kind of scenario. If you want to share that information so that others can see the same data, it's literally as simple as copying the URL. All of the information is contained in the URL: the Silverlight application downloads, the query is populated, the calculation runs again, the answer comes back, and we can visualize it.
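
(The sharing step boils down to serializing the whole query into the URL, so that opening the link repopulates and re-runs it. The Python below illustrates that general pattern; the endpoint and parameter names are hypothetical, not the real FetchClimate URL scheme.)

```python
from urllib.parse import urlencode, parse_qs, urlparse

# Hypothetical endpoint and parameter names, for illustration only.
BASE_URL = "https://example.org/fetchclimate"

def query_to_url(variable, lat_min, lon_min, lat_max, lon_max, year_start, year_end):
    """Serialize the whole query so that the link alone can repopulate and re-run it."""
    params = {
        "v": variable,
        "bbox": f"{lat_min},{lon_min},{lat_max},{lon_max}",
        "years": f"{year_start}-{year_end}",
    }
    return f"{BASE_URL}?{urlencode(params)}"

def url_to_query(url):
    """Invert the encoding: a client opening the link parses it and re-runs the fetch."""
    query = parse_qs(urlparse(url).query)
    return {key: values[0] for key, values in query.items()}

link = query_to_url("temperature", 47.4, -122.4, 47.9, -121.9, 1961, 1990)
print(link)            # the string you would paste into an e-mail
print(url_to_query(link))
```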

So, we could, for instance, include that URL in an e-mail and just say: dear such-and-such, I really don't think that's a good place to put a hotel, have you checked out the temperatures? And just include that link in the e-mail.

We can also download the data. So, I can download it to my desktop, just give it a random name for now, and if I go and open that, you can see all the information that's included. So, here's our query, all the query information that we put in. We also get provenance back, so it tells us which data sets were used to fulfill the query. And we get the uncertainty. That's something that even the experts find really hard to place on these kinds of data, but it comes as a standard part of a FetchClimate! query.

So, let's look at how we could use FetchClimate! in a kind of planning scenario. I'm just going to clear that region. We can imagine that we're the Chinese government, let's say, and we're wondering where to place our wind farms. So, we can easily define a grid over a particular region, and now I'm just going to choose wind speed for this period, the whole year, et cetera. Hit FETCH. And we can easily see the wind speed over here is about 1-point-something meters per second, while down here it's more like 3. And, as you all know, the energy goes as the square of the wind speed, so that means you could expect to get nine times as much energy out of your wind farm if you placed it there.

On a slightly more trivial level, you can think about planning a holiday. And if, like me, you're from Britain, or indeed Seattle, you tend to crave sunshine. So we can just put a grid over the map now, choose sunshine fraction, and hit FETCH. We can run much higher-resolution grids here. Everything is editable, so the query is configurable in a number of important ways. Notice we've got some weasel wording here that says, please note retrieval times can vary. And then we can see that Spain, for example, at 62 percent of available sunshine, is over twice as sunny as Britain, which helps to explain why Spain is such a popular holiday destination for people from Britain. We perhaps didn't need FetchClimate! to tell us that, but I hope it illustrates the kind of thing it can do.

On a much more serious note, we can go to Africa. You've probably heard about droughts in Africa, and crops, and so on. I can define a region here in Ethiopia and do a different kind of FETCH. Instead of spatial variation, we can actually do a yearly time series between 1950 and 2009 -- I happen to know that the data stop prior to 1950. And I'm going to choose precipitation. So what this is going to do is: FetchClimate! is now calculating, for that exact region of the map, the total rainfall in each year from 1950 to 2009. I can hover over the map and see what certainly looks like a worrying pattern of declining rainfall over 60 years. You can visualize that here as well -- this is also Dynamic Data Display running.
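
(A small sketch of what that time-series query boils down to once the yearly totals have come back: fit a simple linear trend over 1950 to 2009 to see whether rainfall is declining. The numbers below are synthetic placeholders; only the method is the point.)

```python
import numpy as np

# Synthetic yearly precipitation totals (mm) for the selected region, 1950-2009.
years = np.arange(1950, 2010)
rng = np.random.default_rng(0)
rainfall = 900 - 2.5 * (years - 1950) + rng.normal(0, 60, size=years.size)

# Least-squares linear trend: the slope is in mm per year.
slope, intercept = np.polyfit(years, rainfall, deg=1)
print(f"trend: {slope:+.1f} mm/year")  # a negative slope indicates declining rainfall
```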

So, FetchClimate! is live -- do search for it on Bing. It takes two or three links to get through; you'll get an information page, but you can get to this Silverlight application. All you have to do is hold down CTRL to swipe out regions, so give it a go and let us know what you think.

Before I come back to Rick, I've actually prepared something that I thought might be interesting or useful for him. So, Rick, the pattern of dots here on the map I think is probably quite familiar to you. This is --

RICK RASHID: I think I've seen that before.

DREW PURVES: Yes. I think we even saw it a minute ago, didn't we? So that's each of the Microsoft Research labs, and what you can do is hover over here and have a look at a typical year's seasonality of rainfall. I think what you'll find is that it tells you Seattle is the wettest lab most of the time. However, in the middle of the summer, Bangalore and Beijing are really wet, and, perhaps no surprise, the Cambridge lab, where I'm from, gets a constant drizzle all year round.

RICK RASHID: Yes, I don't think we actually had this available when we decided where to put the labs. So, now I'm thinking about Spain.

DREW PURVES: So, I'll send you that URL in an e-mail, and now you can use that when you're planning your next visit.

RICK RASHID: Perfect. Thanks, Drew. Thanks very much.

DREW PURVES: Thanks everyone. (Applause.)

RICK RASHID: Again, these are demos you'll get a chance to look at a little bit more when you go out to the floor. And it gives you a sense, again, of the ability now to apply mathematics and large-scale computation, to process huge amounts of data in parallel, and to bring that together to give people real intelligence about whatever it is they need to be looking at -- in this case, climate.

Well, Microsoft Research is really a unique asset. We are the number one institution in the world in terms of publishing basic research in computer science. We get more best paper awards than any other organization in the world. There's never been anything quite like Microsoft Research in terms of the level of engagement we have with the academic community and the research communities around the world as well. We're global. We work with the universities, we work with research labs, we work with governments around the world, we work with other corporations. We work hard to impact our products and get cutting-edge research into our products, and really give the company agility.

I mean, the reason you have a basic research lab, in some sense, isn't really just the stream of technology that comes out; you get that, and that's fabulous, and that's an important asset to have. It's also because you want to survive. When things change -- if you've got a new competitor, if you've got a new technology, if the business climate changes in a fundamental way -- you want the intellectual capital, the people, the treasure chest of technologies available that allows you as a company to change rapidly when change is critical. And Microsoft Research has done that for Microsoft over the years.

It's one of the reasons we're still here as a company, and it's one of the reasons that I think we'll continue to be here for a long time to come. We really fuel the future, not just of Microsoft, but really of the technology field. We're increasing our knowledge about the field of computer science in a very fundamental way. And the reality is, you know, I say if 20 years from now we were still the way we are now, I would actually be really happy.

But if, as an organization, we're continuing to push the state of the art, if we still have the same set of values in terms of the way we do our work and how we do it, the way we work with our product teams, our determination both to push the state of the art and to make sure that it has an impact on people's lives, if we're still doing that 20 years from now, I'll be awfully darned happy, assuming I'm still alive.

All right. Thank you very much. (Applause.)

END