Google's Data Centers, Noo-Politics and the Architecture of Hegemony

Total Page:16

File Type:pdf, Size:1020Kb

Google's Data Centers, Noo-Politics and the Architecture of Hegemony Pearce, Thomas. “‘...So it really is a series of tubes.’ Google’s Data Centers, Noo-politics and the Architecture of Hegemony in Cyberspace ” Enquiry 10.1 (2013) 43-53. 13 volume 10|issue 1 “...So it really is a series of tubes.” Google’s Data Centers, Noo-politics and the Architecture of Hegemony in Cyberspace Thomas Pearce ABSTRACT “You walk through those doors and the world opens up to you. It’s a culture where information is free. A culture where the more you share and In recent years, the physical manifestation and infrastructure of the the more you ask the more you get.”1 informational age has increasingly drawn the attention both of the popular imagination and of architectural theorists. This paper focuses on an “This is the beating heart of the digital age — a physical location where aspect mostly overlooked to date, namely on its artistic representation. the scope, grandeur, and geekiness of the kingdom of bits become It provides a critical analysis of a series of data center photographs manifest.” published by Google in October 2012 under the name “Where the internet “As we leave the floor, I feel almost levitated by my peek inside Google’s lives”. The photographs are examined as carefully staged constructions inner sanctum.” 2 of a specific imagination of information technology that, transcending a purely aesthetical or corporate critique, has broad political, socio- “Google, I really love you. If you were a person I would leave my wife geographical and economical implications. A first analysis of their and marry you :)_” composition, digital manipulation and visual impact situates the images “no Google , no protection!” “Beautiful. Clean and organized but with Google you would expect within a recent photographic current of the so called “anthropogenic 3 Sublime”. The paper then zooms out to reframe the photographs as a nothing less.” continuation of the euphoric techno-utopian discourse that surrounded the “This is corporate propaganda that truly is a treat for the eyes”4 popular dissemination of the internet in the early nineteen-nineties. This discourse hailed the internet as an inherently moral and emancipatory “...So it really is a series of tubes.“5 vehicle that, because of the non-physical nature of cyberspace, would liberate its users from traditional hegemonic dispositifs based on (Some reactions to the data center images) techniques of physical coercion. Tracing the transition from bio-political (Foucault) to noo-political dispositifs (Lazarrato, Deleuze) and discussing 1. Transparency the inextricable connection between information technology, territorial On the 17th of October 2012, the world was in awe as Google went conflict and socio-geographic inequality, the article goes on to account transparent. On a website called “Where the internet lives”, Google for the demise of the dream of a “bodiless and moral internet”. Finally, released a series of over 80 images by photographer Connie Zhou, the data center images are re-read in more detail and discussed as offering the first ever glimpse into their top-secret data centers.6 part of the life-support system of a failed utopia - sustaining a popular yet reductionist understanding of the informational society and its key Up to that point, the Google data centers, accommodating the search players. engine’s ever growing number of servers, had been clouded in secrecy. An article, published earlier that year in Wired Magazine, described how, KEYWORDS: civilian-military-industrial complex, cybernetics, “paranoid about the competitors catching a glimpse of its gear”, Google cyberspace, data centers, hegemony, infrastructure, internet, noo- let the maintenance staff of an outsourced server farm work in complete politics, photography, sustainability darkness, only allowing the use of miner’s headlamps.7 In their own data centers, the company developed the most advanced customized server hardware, including highly efficient techniques for the demanding task of cooling its servers. Once disclosed, as critics like Greenpeace argued, these techniques could also help to sink the carbon footprint of other companies and thus enhance the sustainability of the entire sector. Facebook, one of the advocates of sharing know- ENQUIRY / VOLUME 10 ISSUE 1 2013 43 certain awe-inspiring quality. Presenting a highly aestheticizing gaze on the man-made environment, the images have a certain kinship to the work of photographers like Andreas Gursky, Edward Burtynsky or Bas Princen, who operate within a recent tradition of what could be called the “anthropogenic Sublime”. The Sublime was originally described by Edmund Burke in A Philosophical Enquiry into the Origin of Our Ideas of the Sublime and Beautiful (1756) as “whatever is fitted in any sort to excite the ideas of pain and danger... Whatever is in any sort terrible, or is conversant about terrible objects, or operates in a manner analogous to terror”.10 Figure 1: “Where the internet lives”, Photographer: Connie Zhou Figure 2: Bas Princen, Valley (China), 2008 how had made a point of publishing its data and allowing journalists and Whereas for Burke the Sublime was to be found in the vastness of nature architectural photographers into their own data centers.8 and the inevitability of the Divine, for the photographers in question the natural landscape of the romantic tradition - in its entirety cultivated and Then Google struck back, and its strategy was sophisticated: by not just exploited, known and broadcasted - has lost its aura of mystery and allowing a newspaper photographer to visit their facilities for a day but danger. Instead, the infrastructural, (post)industrial, and urban landscape instead hiring an architectural photographer, who spent multiple weeks has become the source of the Sublime. The dual emotion of danger and exploring and photographing9, Google proved that its data centers were attraction is now invoked by the “vertiginous dynamic of globalization”, a not only transparent, but above all, that they were beautiful. globalization that arguably has become our new “Divine”.11 2. The Anthropogenic Sublime Let us take as an example the series of photographs of stock exchanges Even when refraining from the quasi-religious enthusiasm of the quotes from around the world that Gursky produced in the late nineties. The opening this article, it is hard to deny that the photographs have a images show, from a bird’s eye perspective, vast and colorful seas of 44 ENQUIRY / VOLUME 10 ISSUE 1 2013 people, paper and screens. The depth of field stretches the entire image, completely mirrored around their central axes. Blogs and newspapers, showing every element in crisp detail. Gursky thereby refuses to establish which just previously had published the images and had shared in the hierarchy, bathing the image in an extensive sameness that does not excitement surrounding them, felt deceived and published illustrated seem to be confined by its frame but extends far beyond and deep into analyses of the manipulations.13 our own existence. It is the vastness and abstraction of man-made global capitalism gone out of control that is at once disturbing and aesthetic and Google was fast to react and in turn released some before-after images thus serves as the source of the Sublime. In these photographs, man and an interview with the photographer: himself becomes the “man-made” landscape. Figure 3: Andreas Gursky, Chicago Board of Trade II, 1999 Interestingly, Gursky has often been compared to the painter Caspar “Beauty is something with a lot of symmetry. It’s more about the lines of David Friedrich, the master of the romantic Sublime. Gursky, in turn, everything I photograph [...] My goal was to help people see the beauty regards the distinction between the painter and the photographer as of the data centers. I had a certain artistic license to interpret the scene, outdated, seeing both as producers of “images” in a broad sense of but Google didn’t want me to hide or obscure anything - any changes I the word.12 Any Photoshop user will confirm how similar the process of made to a scene were aesthetic.”14 digital post-production - with its brushes, stamps and sponges - is to the process of painting. Like a painter, the photographer can manipulate that It is not my intention here to do the work of Photoshop forensics and which is depicted and thereby amplify (his understanding of) the truth. uncover that the images are manipulated. Nor do I want to engage in the discussion on whether it is correct to use artistic license to manipulate Soon after Google’s data center photographs were published, critical these images. Rather, I want to concentrate here on the question what observers started noticing and blogging about the fact that they too kind of image is constructed through manipulation and how techniques were heavily manipulated. In the most blatant cases, the images were of manipulation and composition contribute to this constructed reality. ENQUIRY / VOLUME 10 ISSUE 1 2013 45 Because, in spite of the artist’s statement, composition is not just a He goes on to state: matter of aesthetics. “The instantaneous world of electric informational media involves all of 3. Soothing Symmetry us, all at once. No detachment or frame is possible.”15 Symmetric composition, for example, is much more than an aesthetic choice. The construction of many of the data center images is based on Following McLuhan’s argument, the data center images thus try to central perspective (achieved sometimes, but not exclusively, through achieve the impossible: to depict the “instantaneous world of electric mirroring). Describing an example of renaissance central perspective, informational media” (the “internet” didn’t exist at the time of writing) Marshall McLuhan dramatically states: as a framed, thus controlled and externalized central perspective. The human, the viewer, is placed outside of the framed image. In the few Figure 5: Fransesco di Giorgio, Ideal City, 1477 cases where a human appears in the data center images, he is neatly arranged to coincide with the central perspective’s vanishing point.
Recommended publications
  • “Where the Internet Lives” Data Centers As Cloud Infrastructure
    C HAPTER 3 “Where the Internet Lives” Data Centers as Cloud Infrastructure JN OEN IFER H LT AND PATRICK VONDERAU mblazoned with the headline “Transparency,” Google released dozens of Einterior and exterior glossy images of their data centers on the company’s website in 2012. Inviting the public to “come inside” and “see where the In- ternet lives,” Google proudly announced they would reveal “what we’re made of— inside and out” by offering virtual tours through photo galleries of the tech- nology, the people, and the places making up their data centers.1 Google’s tours showed the world a glimpse of these structures with a series of photographs showcasing “the physical Internet,” as the site characterized it. The pictures consisted mainly of slick, artful images of buildings, wires, pipes, servers, and dedicated workers who populate the centers. Apple has also put the infrastructure behind its cloud services on display for the digital audience by featuring a host of infographics, statistics, and polished inside views of the company’s “environmentally responsible” data center fa- cilities on its website.2 Facebook, in turn, features extensive photo and news coverage of its global physical infrastructure on dedicated Facebook pages, while Microsoft presents guided video tours of their server farms for free download on its corporate website.3 Even smaller data centers like those owned by Eu- ropean Internet service provider Bahnhof AB, located in Sweden, are increas- ingly on digital exhibit, with their corporate parents offering various images of server racks, cooling and power technology, or even their meeting rooms, Copyright © ${Date}.
    [Show full text]
  • Google's Hyperscale Data Centres and Infrastructure Ecosystem in Europe
    GOOGLE’S HYPERSCALE DATA CENTRES AND INFRASTRUCTURE ECOSYSTEM IN EUROPE Economic impact study CLIENT: GOOGLE SEPTEMBER 2019 AUTHORS Dr Bruno Basalisco, Managing economist, Head of Digital Economy service Martin Bo Westh Hansen, Managing economist, Head of Energy & Climate service Tuomas Haanperä, Managing economist Erik Dahlberg, Senior economist Morten May Hansen, Economist Joshua Brown, Analyst Laurids Leo Münier, Analyst 0 Malthe Faber Laursen, Analyst Helge Sigurd Næss-Schmidt, Partner EXECUTIVE SUMMARY Digital transformation is a defining challenge and opportunity for the European economy, provid- ing the means to reinvent and improve how firms, consumers, governments, and citizens interact and do business with each other. European consumers, firms, and society stand to benefit from the resulting innovative products, processes, services, and business models – contributing to EU productivity. To maximise these benefits, it is key that the private and public sector can rely on an advanced and efficient cloud value chain. Considerable literature exists on cloud solutions and their transformative impact across the econ- omy. This report contributes by focusing on the analysis of the cloud value chain, taking Google as a relevant case study. It assesses the economic impact of the Google European hyperscale data cen- tres and related infrastructures which, behind the scenes, underpin online services such as cloud solutions. Thus, the report is an applied analysis of these infrastructure layers “above the cloud” (upstream inputs to deliver cloud solutions) and quantifies Google’s European economic contribu- tion associated with these activities. Double-clicking the cloud Services, such as Google Cloud, are a key example of a set of solutions that can serve a variety of Eu- ropean business needs and thus support economic productivity.
    [Show full text]
  • Data Centers
    REPORT HIGHLIGHTS Technological innovations are rapidly changing our lives, our Heat sensing drones deployed after natural disasters to locate businesses, and our economy. Technology, no longer an isolated survivors and deliver lifesaving equipment can arrive at the scene business sector, is a facilitator enabling innovation, growth, and faster than first responders. Wearable technologies that we sport the strengthening of America’s traditional business sectors. From help us lead healthier lifestyles. Distance learning courses empower transportation and energy to finance and medicine, businesses children and adults to learn new skills or trades to keep up with rely on technology to interact with their customers, improve their the constantly evolving job market. Innovations in science, energy, services, and make their operations more globally competitive. manufacturing, health care, education, transportation and many Innovative technology is deeply integrated into the economy and other fields—and their jobs—are being powered by data centers. is the driving force behind the creation of new jobs in science, health care, education, transportation, and more. Technology has But the benefits of data centers go beyond powering America’s fundamentally transformed our economy—and is poised to fuel even cutting-edge innovations. The economic impact, direct and indirect, more growth in the future. is substantial. Overall, there were 6 million jobs in the U.S. technology industry last While being built, a typical data center employs 1,688 local workers, year, and we expect this to increase by 4.1% in 2017. Technology- provides $77.7 million in wages for those workers, produces $243.5 related jobs run the gamut—from transportation logistics and million in output along the local economy’s supply chain, and warehousing to programmers and radiologists.
    [Show full text]
  • Mapreduce Programming Oct 30, 2012
    15-440" MapReduce Programming Oct 30, 2012" Topics" n" Large-scale computing" l" Traditional high-performance computing (HPC)! l" Cluster computing! n" MapReduce" l" Definition! l" Examples! n" Implementation" n" Alternatives to MapReduce" n" Properties" – 1 –! Typical HPC Machine" Compute Nodes" Compute Nodes n" High end processor(s)" CPU CPU CPU • • • n" Lots of RAM" Mem Mem Mem Network" n" Specialized" Network n" Very high performance" Storage Server" • • • n" RAID-based disk array" Storage Server – 2 –! HPC Machine Example" Jaguar Supercomputer" n" 3rd fastest in world" Compute Nodes" n" 18,688 nodes in largest partition" n" 2X 2.6Ghz 6-core AMD Opteron" n" 16GB memory" n" Total: 2.3 petaflop / 300 TB memory" Network" n" 3D torus" l" Each node connected to 6 neighbors via 6.0 GB/s links! Storage Server" n" 10PB RAID-based disk array" – 3 –! HPC Programming Model" Application Programs Software Packages Machine-Dependent Programming Model Hardware n" Programs described at very low level" l" Specify detailed control of processing & communications! n" Rely on small number of software packages" l" Written by specialists! l" Limits classes of problems & solution methods! – 4 –! Bulk Synchronous Programming" Solving Problem Over Grid" n" E.g., finite-element computation" Partition into Regions" n" p regions for p processors" Map Region per Processor" n" Local computation sequential" n" Periodically communicate boundary values with neighbors" – 5 –! Typical HPC Operation" Characteristics" n" Long-lived processes" Message Passing" n" Make use
    [Show full text]
  • Envisioning the Cloud: the Next Computing Paradigm
    Marketspace® Point of View Envisioning the Cloud: The Next Computing Paradigm March 20, 2009 Jeffrey F. Rayport Andrew Heyward Research Associates Raj Beri Jake Samuelson Geordie McClelland Funding for this paper was provided by Google, Inc Table of Contents Executive Summary .................................................................................... i Introduction................................................................................................ 1 Understanding the cloud...................................................................... 3 The network is the (very big, very powerful) computer .................... 6 How the data center became possible ............................................. 8 The passengers are driving the bus ...................................................11 Where we go from here......................................................................12 Benefits and Opportunities in the Cloud ..............................................14 Anywhere/anytime access to software............................................14 Specialization and customization of applications...........................17 Collaboration .......................................................................................20 Processing power on demand...........................................................21 Storage as a universal service............................................................23 Cost savings in the cloud ....................................................................24 Enabling the Cloud
    [Show full text]
  • Alphabet 2020
    Alphabet Alphabet 2020 Annual 2020 Report 2020 Rev2_210419_YIR_Cover.indd 1-3 4/19/21 7:02 PM Alphabet Year in Review 2020 210414_YIR_Design.indd 1 4/15/21 3:57 PM From our CEO 2 Year in Review 210414_YIR_Design.indd 2 4/15/21 3:57 PM To our investors, You might expect a company’s year in review to open with big numbers: how many products we launched, how many consumers and businesses adopted those products, and how much revenue we generated in the process. And, yes, you will see some big numbers shared in the pages of this report and in future earnings calls as well, but 22-plus years in, Google is still not a conventional company (and we don’t intend to become one). And 2020 was anything but a conventional year. That’s why over the past 12 months we’ve measured our success by the people we’ve helped in moments that matter. Our success is in the researchers who used our technology to fight the spread of the coronavirus. It’s in job seekers like Rey Justo, who, after being laid off during the pandemic, earned a Google Career Certificate online and was hired into a great new career. And our success is in all the small businesses who used Google products to continue serving customers and keeping employees on payroll … in the students who kept learning virtually on Google Classroom … and in the grandparents who read bedtime stories to grandchildren from thousands of miles away over Google Meet. We’ve always believed that we only succeed when others do.
    [Show full text]
  • In the United States District Court for the Eastern District of Texas Marshall Division
    Case 2:18-cv-00549 Document 1 Filed 12/30/18 Page 1 of 40 PageID #: 1 IN THE UNITED STATES DISTRICT COURT FOR THE EASTERN DISTRICT OF TEXAS MARSHALL DIVISION UNILOC 2017 LLC § Plaintiff, § CIVIL ACTION NO. 2:18-cv-00549 § v. § § PATENT CASE GOOGLE LLC, § § Defendant. § JURY TRIAL DEMANDED § ORIGINAL COMPLAINT FOR PATENT INFRINGEMENT Plaintiff Uniloc 2017 LLC (“Uniloc”), as and for their complaint against defendant Google LLC (“Google”) allege as follows: THE PARTIES 1. Uniloc is a Delaware limited liability company having places of business at 620 Newport Center Drive, Newport Beach, California 92660 and 102 N. College Avenue, Suite 303, Tyler, Texas 75702. 2. Uniloc holds all substantial rights, title and interest in and to the asserted patent. 3. On information and belief, Google, a Delaware corporation with its principal office at 1600 Amphitheatre Parkway, Mountain View, CA 94043. Google offers its products and/or services, including those accused herein of infringement, to customers and potential customers located in Texas and in the judicial Eastern District of Texas. JURISDICTION 4. Uniloc brings this action for patent infringement under the patent laws of the United States, 35 U.S.C. § 271 et seq. This Court has subject matter jurisdiction pursuant to 28 U.S.C. §§ 1331 and 1338(a). Page 1 of 40 Case 2:18-cv-00549 Document 1 Filed 12/30/18 Page 2 of 40 PageID #: 2 5. This Court has personal jurisdiction over Google in this action because Google has committed acts within the Eastern District of Texas giving rise to this action and has established minimum contacts with this forum such that the exercise of jurisdiction over Google would not offend traditional notions of fair play and substantial justice.
    [Show full text]
  • Google's Approach to IT Security
    Google’s Approach to IT Security A Google White Paper Introduction . 3 Overview . 3 Google Corporate Security Policies . 3 Organizational Security . 4 Data Asset Management . 5 Access Control . 6 Physical and Environmental Security . 8 Infrastructure Security . 9 Systems Development and Maintenance . 11 Disaster Recovery and Business Continuity . 14 Summary . 14 WHITE PAPER: GOOGle’s APPROACH TO IT SECURITY The security controls that isolate Introduction data during processing in the cloud Google technologies that use cloud computing (including Gmail, Google Calendar, were developed alongside the core technology from the Google Docs, Google App Engine, Google Cloud Storage among others) provide famiiar, beginning . Security is thus a key easy to use products and services for business and personal/consumer settings . These component of each of our services enable users to access their data from Internet-capable devices . This common cloud computing elements . cloud computing environment allows CPU, memory and storage resources to be shared and utilized by many users while also offering security benefits . Google provides these cloud services in a manner drawn from its experience with operating its own business, as well as its core services like Google Search . Security is a design component of each of Google’s cloud computing elements, such as compartmentalization, server assignment, data storage, and processing . This paper will explain the ways Google creates a platform for offering its cloud products, covering topics like information security, physical security and operational security . The policies, procedures and technologies described in this paper are detailed as of the time of authorship . Some of the specifics may change over time as we regularly innovate with new features and products .
    [Show full text]
  • Why Google Dominates Advertising Markets Competition Policy Should Lean on the Principles of Financial Market Regulation
    Why Google Dominates Advertising Markets Competition Policy Should Lean on the Principles of Financial Market Regulation Dina Srinivasan* * Since leaving the industry, and authoring The Antitrust Case Against Face- book, I continue to research and write about the high-tech industry and competition, now as a fellow with Yale University’s antitrust initiative, the Thurman Arnold Pro- ject. Separately, I have advised and consulted on antitrust matters, including for news publishers whose interests are in conflict with Google’s. This Article is not squarely about antitrust, though it is about Google’s conduct in advertising markets, and the idea for writing a piece like this first germinated in 2014. At that time, Wall Street was up in arms about a book called FLASH BOYS by Wall Street chronicler Michael Lewis about speed, data, and alleged manipulation in financial markets. The controversy put high speed trading in the news, giving many of us in advertising pause to appre- ciate the parallels between our market and trading in financial markets. Since then, I have noted how problems related to speed and data can distort competition in other electronic trading markets, how lawmakers have monitored these markets for con- duct they frown upon in equities trading, but how advertising has largely remained off the same radar. This Article elaborates on these observations and curiosities. I am indebted to and thank the many journalists that painstakingly reported on industry conduct, the researchers and scholars whose work I cite, Fiona Scott Morton and Aus- tin Frerick at the Thurman Arnold Project for academic support, as well as Tom Fer- guson and the Institute for New Economic Thinking for helping to fund the research this project entailed.
    [Show full text]
  • Moving Toward 24X7 Carbon-Free Energy at Google Data Centers: Progress and Insights
    Moving toward 24x7 Carbon-Free Energy at Google Data Centers: Progress and Insights Introduction In recent years, Google has become the world’s largest corporate buyer of renewable energy. In 2017 alone, we purchased more than seven billion kilowatt-hours of electricity (roughly as much as is used yearly by the state of Rhode Island1) from solar and wind farms that were built specifically for Google. This enabled us tomatch 100% of our annual electricity consumption through direct purchases of renewable energy; we are the first company of our size to do so. Reaching our 100% renewable energy purchasing goal was an important milestone, and we will continue to increase our purchases of renewable energy as our operations grow. However, it is also just the beginning. It represents a head start toward achieving a much greater, longer-term challenge: sourcing carbon-free energy for our operations on a 24x7 basis. Meeting this challenge requires sourcing enough carbon-free energy to match our electricity consumption in all places, at all times. Such an approach looks markedly different from the status quo, which, despite our large-scale procurement of renewables, still involves carbon-based power. Each Google facility is connected to its regional power grid just like any other electricity consumer; the power mix in each region usually includes some carbon-free resources (e.g. wind, solar, hydro, nuclear), but also carbon-based resources like coal, natural gas, and oil. Accordingly, we rely on those carbon-based resources — particularly when wind speeds or sunlight fade, and also in places where there is limited access to carbon-free energy.
    [Show full text]
  • Networking Best Practices for Large Deployments Google, Inc
    Networking Best Practices for Large Deployments Google, Inc. 1600 Amphitheatre Parkway Mountain View, CA 94043 www.google.com Part number: NETBP_GAPPS_3.8 November 16, 2016 © Copyright 2016 Google, Inc. All rights reserved. Google, the Google logo, G Suite, Gmail, Google Docs, Google Calendar, Google Sites, Google+, Google Talk, Google Hangouts, Google Drive, Gmail are trademarks, registered trademarks, or service marks of Google Inc. All other trademarks are the property of their respective owners. Use of any Google solution is governed by the license agreement included in your original contract. Any intellectual property rights relating to the Google services are and shall remain the exclusive property of Google, Inc. and/or its subsidiaries (“Google”). You may not attempt to decipher, decompile, or develop source code for any Google product or service offering, or knowingly allow others to do so. Google documentation may not be sold, resold, licensed or sublicensed and may not be transferred without the prior written consent of Google. Your right to copy this manual is limited by copyright law. Making copies, adaptations, or compilation works, without prior written authorization of Google. is prohibited by law and constitutes a punishable violation of the law. No part of this manual may be reproduced in whole or in part without the express written consent of Google. Copyright © by Google Inc. Google provides this publication “as is” without warranty of any either express or implied, including but not limited to the implied warranties of merchantability or fitness for a particular purpose. Google Inc. may revise this publication from time to time without notice.
    [Show full text]
  • SIGMETRICS Tutorial: Mapreduce
    Tutorials, June 19th 2009 MapReduce The Programming Model and Practice Jerry Zhao Technical Lead Jelena Pjesivac-Grbovic Software Engineer Yet another MapReduce tutorial? Some tutorials you might have seen: Introduction to MapReduce Programming Model Hadoop Map/Reduce Programming Tutorial and more. What makes this one different: Some complex "realistic" MapReduce examples Brief discussion of trade-offs between alternatives Google MapReduce implementation internals, tuning tips About this presentation Edited collaboratively on Google Docs for domains Tutorial Overview MapReduce programming model Brief intro to MapReduce Use of MapReduce inside Google MapReduce programming examples MapReduce, similar and alternatives Implementation of Google MapReduce Dealing with failures Performance & scalability Usability What is MapReduce? A programming model for large-scale distributed data processing Simple, elegant concept Restricted, yet powerful programming construct Building block for other parallel programming tools Extensible for different applications Also an implementation of a system to execute such programs Take advantage of parallelism Tolerate failures and jitters Hide messy internals from users Provide tuning knobs for different applications Programming Model Inspired by Map/Reduce in functional programming languages, such as LISP from 1960's, but not equivalent MapReduce Execution Overview Tutorial Overview MapReduce programming model Brief intro to MapReduce Use of MapReduce inside Google MapReduce programming examples MapReduce, similar and alternatives Implementation of Google MapReduce Dealing with failures Performance & scalability Usability Use of MapReduce inside Google Stats for Month Aug.'04 Mar.'06 Sep.'07 Number of jobs 29,000 171,000 2,217,000 Avg. completion time (secs) 634 874 395 Machine years used 217 2,002 11,081 Map input data (TB) 3,288 52,254 403,152 Map output data (TB) 758 6,743 34,774 reduce output data (TB) 193 2,970 14,018 Avg.
    [Show full text]