Empirical Software Engineering at Microsoft Research
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
PERVASIVE BEHAVIOR INTERVENTIONS Using Mobile Devices for Overcoming Barriers for Physical Activity
PERVASIVE BEHAVIOR INTERVENTIONS Using Mobile Devices for Overcoming Barriers for Physical Activity Vom Fachbereich Elektrotechnik und Informationstechnik der Technischen Universität Darmstadt zur Erlangung des akademischen Grades eines Doktor-Ingenieurs (Dr.-Ing.) genehmigte Dissertation von DIPL.-INF. UNIV. TIM ALEXANDER DUTZ Geboren am 20. Juli 1978 in Darmstadt Vorsitz: Prof. Dr. techn. Heinz Koeppl Referent: Prof. Dr.-Ing. habil. Ralf Steinmetz Korreferent: Prof. Dr. rer. nat. Rainer Malaka Tag der Einreichung: 14. September 2016 Tag der Disputation: 28. November 2016 Hochschulkennziffer D17 Darmstadt 2017 Dieses Dokument wird bereitgestellt von tuprints, E-Publishing-Service der Technischen Universität Darmstadt. http://tuprints.ulb.tu-darmstadt.de [email protected] Bitte zitieren Sie dieses Dokument als: URN: urn:nbn:de:tuda-tuprints-61270 URL: http://tuprints.ulb.tu-darmstadt.de/id/eprint/6127 Die Veröffentlichung steht unter folgender Creative Commons Lizenz: International 4.0 – Namensnennung, nicht kommerziell, keine Bearbeitung https://creativecommons.org/licenses/by-nc-nd/4.0/ Für meine Eltern Abstract Extensive cohort studies show that physical inactivity is likely to have negative consequences for one’s health. The World Health Organization thus recommends a minimum of thirty minutes of medium- intensity physical activity per day, an amount that can easily be reached by doing some brisk walking or leisure cycling. Recently, a Taiwanese-American team of scientists was able to prove that even less effort is required for positive health effects and that as little as fifteen minutes of physical activity per day will increase one’s life expectancy by up to three years on the average. However, simply spreading this knowledge is not sufficient. -
Competition-Based User Expertise Score Estimation
Competition-based User Expertise Score Estimation Jing Liu†*, Young-In Song‡, Chin-Yew Lin‡ † ‡ Harbin Institute of Technology Microsoft Research Asia No. 92, West Da-Zhi St, Nangang Dist. Building 2, No. 5 Dan Ling St, Haidian Dist. Harbin, China 150001 Beijing, China 100190 [email protected] {yosong, cyl}@microsoft.com ABSTRACT covered by existing web pages. With the explosive growth of web 2.0 sites, community question and answering services (denoted as In this paper, we consider the problem of estimating the relative 1 2 expertise score of users in community question and answering CQA) such as Yahoo! Answers and Baidu Zhidao , have become services (CQA). Previous approaches typically only utilize the important services where people can use natural language rather explicit question answering relationship between askers and an- than keywords to ask questions and seek advice or opinions from swerers and apply link analysis to address this problem. The im- real people who have relevant knowledge or experiences. CQA plicit pairwise comparison between two users that is implied in services provide another way to satisfy a user’s information needs the best answer selection is ignored. Given a question and answer- that cannot be met by traditional search engines. Users are the ing thread, it’s likely that the expertise score of the best answerer unique source of knowledge in CQA sites and all users from ex- is higher than the asker’s and all other non-best answerers’. The perts to novices can generate content arbitrarily. Therefore, it is goal of this paper is to explore such pairwise comparisons inferred desirable to have a system that can automatically estimate the user from best answer selections to estimate the relative expertise expertise score and identify experts who can provide good quality scores of users. -
The Evaluation of Rating Systems in Online Free-For-All Games
The Evaluation of Rating Systems in Online Free-for-All Games Arman Dehpanah Muheeb Faizan Ghori Jonathan Gemmell Bamshad Mobasher School of Computing School of Computing School of Computing School of Computing DePaul University DePaul University DePaul University DePaul University Chicago, USA Chicago, USA Chicago, USA Chicago, USA [email protected] [email protected] [email protected] [email protected] Abstract—Online competitive games have become increasingly might also be hampered by the inclusion of new players since popular. To ensure an exciting and competitive environment, these the system does not possess any knowledge of these players. games routinely attempt to match players with similar skill levels. In this paper, we consider six evaluation metrics. We include Matching players is often accomplished through a rating system. There has been an increasing amount of research on developing traditional metrics such as accuracy, mean absolute error, such rating systems. However, less attention has been given to the and Kendall’s rank correlation coefficient. We further include evaluation metrics of these systems. In this paper, we present an metrics adapted from the domain of information retrieval, exhaustive analysis of six metrics for evaluating rating systems in including mean reciprocal rank (MRR), average precision online competitive games. We compare traditional metrics such as (AP), and normalized discounted cumulative gain (NDCG). accuracy. We then introduce other metrics adapted from the field of information retrieval. We evaluate these metrics against several We analyze the ability of these metrics to capture meaningful well-known rating systems on a large real-world dataset of over insights when they are used to evaluate the performance of 100,000 free-for-all matches. -
Application and Further Development of Trueskill™ Ranking in Sports
TVE-F 19019 Examensarbete 15 hp Juni 2019 Application and Further Development of TrueSkill™ Ranking in Sports Julia Ibstedt Elsa Rådahl Erik Turesson Magdalena vande Voorde Abstract Application and Further Development of TrueSkill™ Ranking in Sports Julia Ibstedt, Elsa Rådahl, Erik Turesson, Magdalena vande Voorde Teknisk- naturvetenskaplig fakultet UTH-enheten The aim of this study was to explore the ranking model TrueSkill™ developed by Microsoft, applying it on various sports and Besöksadress: constructing extensions to the model. Two different inference Ångströmlaboratoriet Lägerhyddsvägen 1 methods for TrueSkill was constructed using Gibbs sampling and Hus 4, Plan 0 message passing. Additionally, the sequential method using Gibbs sampling was successfully extended into a batch method, in order Postadress: to eliminate game order dependency and creating a fairer, although Box 536 751 21 Uppsala computationally heavier, ranking system. All methods were further implemented with extensions for taking home team advantage, score Telefon: difference and finally a combination of the two into 018 – 471 30 03 consideration. The methods were applied on football (Premier Telefax: League), ice hockey (NHL), and tennis (ATP Tour) and evaluated on 018 – 471 30 00 the accuracy of their predictions before each game. Hemsida: On football, the extensions improved the prediction accuracy from http://www.teknat.uu.se/student 55.79% to 58.95% for the sequential methods, while the vanilla Gibbs batch method reached the accuracy of 57.37%. Altogether, the extensions improved the performance of the vanilla methods when applied on all data sets. The home team advantage performed better than the score difference on both football and ice hockey, while the combination of the two reached the highest accuracy. -
Thesis Template
Tailoring a Psychophysiologically Driven Rating System MASTER DISSERTATION Harryharasuthan Vasantharajah MASTER IN COMPUTER ENGINEERING SUPERVISOR Sergi Bermúdez I Badia TAILORING A PSYCHOPHYSIOLOGICALLY DRIVEN RATING SYSTEM Harryharasuthan Vasantharajah B.Sc. (Hons) Supervised by Sergi Bermúdez I Badia, Ph.D. Submitted in fulfillment of the requirements for the degree of Masters in Computer Engineering. Faculty of Exact Sciences and Engineering University of Madeira 2019 Abstract Humans have always been interested in ways to measure and compare their performances to establish who is best at a particular activity. The first Olympic Games, for instance, were carried out in 776 BC, and it was a defining moment in history where ranking based competitive activities managed to reach the general populous. Every competition must face the issue of how to evaluate and rank competitors, and often rules are required to account for many different aspects such as variations in conditions, the ability to cheat, and, of course, the value of entertainment. Nowadays, measurements are performed out through various rating systems, which considers the outcomes of the activity to rate the participants. However, they do not seem to address the psychological aspects of an individual in a competition. This dissertation employs several psychophysiological assessment instruments intending to facilitate the acquisition of skill level rating in competitive gaming. To do so, an exergame that uses non-conventional inputs, such as body tracking to prevent input biases, was developed. The sample size of this study is ten, and the participants were put on a round-robin tournament to provide equal intervals between games for each player. After analyzing the outcome of the competition, it revealed some critical insights on the psychophysiological instruments; Especially the significance of Flow in terms of the prolificacy of a player. -
Ranking (Trueskill) Map #1 #1 Player Map #2 Player #2 Player Vs Player #3 Map Map #3 Map #4 Map #5 #4 Player Ranking Systems ~ 30 Mins Talk
Math for Game Programmers: Ranking Systems; Elo, TrueSkill and Your Own Mario Izquierdo Sr. Software Engineer at Ranking (TrueSkill) map #1 #1 player map #2 player #2 player vs player #3 map map #3 map #4 map #5 #4 player Ranking Systems ~ 30 mins talk • Elo • TrueSkill • Practical Considerations Elo Rating System Árpád Élő (1903 - 1992) • Physics profesor and master chess player. • Elo's system constituted an improvement on the previous Harkness System. • Elo's system was adopted by the FIDE (World Chess Federation) in 1970. • Published "The Rating of Chessplayers, Past and Present” in 1978. • Fun fact: Up until the mid-80’s, Elo himself made the rating calculations! Elo Rating System: Normal Distribution Assumption: Chess performance is a normally distributed random variable. Using some simplifications (i.e. constant standard deviation) makes easy to calculate the Expected score of a match (probability of win) for two given player skill levels. Elo Rating System: Normal Distribution “Slime Curve” In the eyes of ELO, you are all “slime people” Elo Rating System: Normal Distribution “Slime Curve” Elo Rating System: Formula After a given match, rating points are transferred between players: RatingDiff = (Score - Expected) * K-factor Where: Score is 0 = loss, 0.5 = draw, 1 = win Expected is 0 to 1, the probability of winning K-factor is a constant for maximum change (update “speed”) Elo Rating System: Formula After a given match, rating points are transferred between players: RatingDiff = (Score - Expected) * K-factor Much of the trick is in figuring out what the Expected result of a game is. The original ELO system uses the following formula (from the Normal dist.): Expected[A] = 1/(1+10^(Rating[B-A]/400)) Elo Rating System: Formula After a given match, rating points are transferred between players: RatingDiff = (Score - Expected) * K-factor Much of the trick is in figuring out what the Expected result of a game is. -
Trueskill 2: an Improved Bayesian Skill Rating System
TrueSkill 2: An improved Bayesian skill rating system Tom Minka Ryan Cleven Yordan Zaykov Microsoft Research The Coalition Microsoft Research March 22, 2018 Abstract Online multiplayer games, such as Gears of War and Halo, use skill-based matchmaking to give players fair and enjoyable matches. They depend on a skill rating system to infer accurate player skills from historical data. TrueSkill is a popular and effective skill rating system, working from only the winner and loser of each game. This paper presents an extension to TrueSkill that incorporates additional information that is readily available in online shooters, such as player experience, membership in a squad, the number of kills a player scored, tendency to quit, and skill in other game modes. This extension, which we call TrueSkill2, is shown to significantly improve the accuracy of skill ratings computed from Halo 5 matches. TrueSkill2 predicts historical match outcomes with 68% accuracy, compared to 52% accuracy for TrueSkill. 1 Introduction When a player wants to play an online multiplayer game, such as Halo or Gears of War, they join a queue of waiting players, and a matchmaking service decides who they will play with. The matchmaking service makes its decision based on several criteria, including geographic location and skill rating. Our goal is to improve the fairness of matches by improving the accuracy of the skill ratings flowing into the matchmaking service. The skill rating of a player is an estimate of their ability to win the next match, based on the results of their previous matches. A typical match result lists the players involved, their team assignments, the length of the match, how long each player played, and the final score of each team. -
Influence of Gameplay on Skill in Halo Reach
Influence of Gameplay on Skill in Halo Reach Jeff Huang Abstract University of Washington We study the question of how skill develops in video [email protected] game through a rating called TrueSkill. In a previous paper [1] we used the skill ratings from 7 months of Thomas Zimmermann games from over 3 million players to look at how play Microsoft Corporation intensity, breaks in play, other titles played, and skill [email protected] change over time affect skill. In this paper, we briefly summarize our findings and discuss how we plan to Nachiappan Nagappan continue our research. Microsoft Corporation [email protected] Keywords Games, Analytics, Game Usage, Player Progression Bruce Phillips Microsoft Corporation ACM Classification Keywords [email protected] K.8.0 [Personal Computing]: Games. Chuck Harrison General Terms Microsoft Corporation Human Factors, Measurement. [email protected] Introduction In this paper we present a brief overview of player game characteristics in Halo Reach. We discuss play intensity and play patterns with respect to skill development. We present a general method of our analysis that can be applied to other games. We Copyright is held by the author/owner(s). conclude with some of our future plans for mining CHI’13, April 27 – May 2, 2013, Paris, France. game-player data. ACM 978-1-XXXX-XXXX-X/XX/XX. Analysis of Skill Data Skill in Halo Reach Other Titles Played. Players who did not play Halo 3 For our study, we selected a cohort of 3.2 million Halo previously were less skilled but gained skill at about the Step 1: Select a population of Reach players who started playing the game in its first same rate as everyone else in Halo Reach. -
Measuring Cooperative Behavior in Contemporary Multiplayer Games
MEASURING COOPERATIVE BEHAVIOR IN CONTEMPORARY MULTIPLAYER GAMES Martin Ashton Master of Science School of Computer Science McGill University Montreal, Quebec 2012-08-12 A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science Copyright c 2012 Martin Ashton DEDICATION This work is dedicated to my friends Pete, Dave, Kevin, Vanessa, Steve, Mike, Nadina, John, Chris, Max, Dan, Francine, Peter, Daniel, Michael, Dahlia, Sabrina, Michaela, Katrina, Diana, Stevie, Kathy, Bronson, Jess, Theo, Cynthia, Ben, Shayne, Paul, Andre, Fred, Justin, J.P., Aly, Carl, Udhay, Phil, Matt, Simon, Julien, and the rest of the Koalas. ii ACKNOWLEDGEMENTS This work was supported, in part, by the Natural Sciences and Engineering Council of Canada (NSERC), and Le Fonds de Recherche du Qu´ebec - Nature et Technologies (FQRNT). My particular thanks are extended to Prof. Clark Verbrugge for giving me the opportunity to research the exciting domain of modern video games, and to my parents for teaching me the importance of continuously exceeding my own limits. Lastly, I would like to thank the developers at Bungie, Blizzard, and Valve for making such excellent games. iii ABSTRACT Social aspects of multiplayer games are well known as contributors to game success, with online friendships and socialization expected to expand and strengthen a player-base. Understanding the nature of social behavior and determining the impact of cooperation on gameplay is thus important to game design. In this work, we make use of data exposed through in-game and web-based API’s of two contemporary multiplayer games, World of Warcraft and Halo: Reach. -
Model-Based Machine Learning
Model-Based Machine Learning Christopher M. Bishop Microsoft Research, Cambridge, CB3 0FB, U.K. [email protected] Several decades of research in the field of machine learning have resulted in a multitude of different algorithms for solving a broad range of problems. To tackle a new application a researcher typically tries to map their problem onto one of these existing methods, often influenced by their familiarity with specific algorithms and by the availability of corresponding software implementations. In this paper we describe an alternative methodology for applying machine learning, in which a bespoke solution is formulated for each new application. The solution is expressed through a compact modelling language, and the corresponding custom machine learning code is then generated automatically. This model-based approach offers several major advantages, including the opportunity to create highly tailored models for specific scenarios, as well as rapid prototyping and comparison of a range of alternative models. Furthermore, newcomers to the field of machine learning don’t have to learn about the huge range of traditional methods, but instead can focus their attention on understanding a single modelling environment. In this paper we show how probabilistic graphical models, coupled with efficient inference algorithms, provide a very flexible foundation for model-based machine learning, and we outline a large-scale commercial application of this framework involving tens of millions of users. We also describe the concept of probabilistic programming as a powerful software environment for model-based machine learning, and we discuss a specific probabilistic programming language called Infer.NET, which has been widely used in practical applications. -
Microsoft Research Cambridge
Welcome to Microsoft Research Cambridge 21st November Alex Butler Cambridge Wireless UX SIG Sensors & Devices Group “No Free Lunch: The Consumer as Product in a Data-Driven Economy” Microsoft Research Presence Mission • Advance the state of the art in Computer Science • Transfer technology to Microsoft business • Lead Microsoft into the future Microsoft Research Cambridge The Numbers STAFF HONOURS OUTPUTS 121 RESEARCHERS 1 KNIGHTHOOD HUNDREDS OF TIER PUBLICATIONS PER YEAR 100 INTERNS PER YEAR 1 TURING AWARD WINNER 45+ PATENTS FILED PER YEAR 17 R&D STAFF FROM OTHER 1 KYOTO PRIZE WINNER MICROSOFT R&D GROUPS 2 MARR PRIZE WINNERS 11 VISITING RESEARCHERS PER YEAR 2 ACM FELLOWS OVER 150 DAY VISITORS, 1 IEEE FELLOW SEMINAR SPEAKERS PER YEAR 3 ROYAL SOCIETY FELLOWS 1 ROYAL SOCIETY OF EDINBURGH FELLOW 4 ROYAL ACADEMY OF ENGINEERING FELLOWS 1 MACROBERT AWARD 1 ROOKE MEDAL 1 VON NEUMANN MEDAL 1 EADS GRAND PRIZE Research Areas Machine Learning & Perception Computer Vision. Machine Learning. Online Services & Advertising. Constraint Reasoning. Programming Principles & Tools Formal Methods. Programme Structures. Programming Systems. Constructive Security. Systems & Networking Distributed Systems. Networking. Networks, Economics & Algorithms. Operating Systems. Computer-Mediated Living Integrated Systems. Sensors & Devices. Socio-Digital Systems. i3D. Computational Science Biological Computation. Computational Ecology & Environmental Sciences. Research Connections Advanced Research Tools and Services Community, Intellectual Capital Development Earth, -
Microsoft Trademarks
Microsoft Trademarks March 2021 Microsoft Trademarks The following is a non-exhaustive list of trademarks owned by Microsoft Corporation and its affiliated companies. Their use is subject to the Microsoft Trademark & Brand Guidelines. To the extent a name, icon, or other brand asset does not appear in the list below, this does not constitute a waiver of any intellectual property rights that Microsoft Corporation and/or its subsidiaries may have established in them anywhere in the world. 0-9, A 343 Industries Alt Access AltspaceVR Access Designs AppLocker AppSource Arc AccountGuard As Dusk Falls Active Directory Avatar Famestar ActiveX Avowed Adera Azure Affirmed Design Azure Brainwave Azure Cosmos DB Afternoon Cyber Tea Azure Design Age of Empires Age of Mythology Azure PlayFab AI for Earth Azure Sentinel Aldhabi Azure Sphere B Bahnshrift Blue Dragon Banjo-Kazooie BlueTrack Technology Battletoads BlueTrack Technology Design Bing Bing Designs Bookdings Brainwave BitLocker BranchCache BitLocker To Go Broken Age BizTalk Brutal Legend Bleeding Edge Bloodforge C Calibri Continuum Cambria Convection Candara Corbel ClearType Cortana Clippy Cortana Design Clippy Design Crackdown Creeper Design Comic Sans Compulsion Games Crimson Alliance Conker Crimson Dragon ConnectR Crimson Omen Design Consolas Constantia Crimson Skies D DataSense Direct3D Delve DirectX Delve Designs Double Fine Drivatar Dynamics DeployR Dynamics 365 Desert Ranger Dynamics 365 Design DevelopR DigiGirlz E Edge Designs Excel Excel Designs ElectionGuard Empowering Us All Exchange