Exploring Visual Content in Social Networks

FACULDADE DE ENGENHARIA DA UNIVERSIDADE DO PORTO

Mariana Afonso

WORKING VERSION
PREPARATION FOR THE MSC DISSERTATION

Supervisor: Prof. Dr. Luís Filipe Pinto de Almeida Teixeira

February 18, 2015

© Mariana Fernandez Afonso, 2015

Contents

1 Introduction
  1.1 Background
  1.2 Motivation
  1.3 Objectives
  1.4 Document Structure
2 Related Concepts and Background
  2.1 Social network mining
    2.1.1 Twitter
    2.1.2 Research directions and trends
    2.1.3 Opinion mining and sentiment analysis
    2.1.4 Cascade and Influence Maximization
    2.1.5 TweeProfiles
  2.2 Data mining techniques and algorithms
    2.2.1 Clustering
    2.2.2 Data Stream Clustering
    2.2.3 Cluster validity
  2.3 Content based image retrieval
    2.3.1 The semantic gap
    2.3.2 Reducing the semantic gap
  2.4 Image descriptors
    2.4.1 Low-level image features
    2.4.2 Mid-level image descriptors
  2.5 Performance Evaluation
    2.5.1 Image databases
  2.6 Visualization of image collections
  2.7 Discussion
3 Related Work
  3.1 Discussion
4 Work plan
  4.1 Methodology
  4.2 Development Tools
  4.3 Task description and planning
References

List of Figures

2.1 Clusters in Portugal; Content proportion 100%.
2.2 Clusters in Portugal; Content 50% + Spatial 50%.
2.3 Clusters in Portugal; Content 50% + Temporal 50%.
2.4 Clusters in Portugal; Content 50% + Social 50%.
2.5 Visualization of the clusters obtained by TweeProfiles using different weights for the different dimensions. Extracted from [1].
2.6 Example of the k-means clustering algorithm in a 2-D dataset.
2.7 Example of a dendrogram for hierarchical clustering, using either agglomerative or divisive methods. Extracted from [2].
2.8 Illustration of single-link clustering. Extracted from [2].
2.9 Illustration of complete-link clustering. Extracted from [2].
2.10 Illustration of average-link clustering. Extracted from [2].
2.11 Clusters found by DBSCAN in 3 databases. Extracted from [3].
2.12 Example of two clustering results (obtained by the k-means algorithm) in two different datasets.
2.13 A general model of CBIR systems. Extracted from [4].
2.14 Object-ontology model used in [5]. Extracted from [5].
2.15 CBIR with RF. Extracted from [6].
2.16 Schematic representation of the different types of image descriptors. Extracted from [7].
2.17 The RGB color model. Extracted from [8].
2.18 HSV color representation. Extracted from [8].
2.19 Example of color histograms for an example image.
2.20 Two images with similar color histograms. Extracted from [9].
2.21 Three different textures with the same distribution of black and white. Extracted from [10].
2.22 First level of a wavelet decomposition in 3 steps: low- and high-pass filtering in the horizontal direction, the same in the vertical direction, subsampling. Extracted from [11].
2.23 The basis of Hough transform line detection; (A) (x,y) point image space; (B) (m,c) parameter space. Extracted from [12].
2.24 For each octave, adjacent Gaussian images are subtracted to produce the difference-of-Gaussian images. After each octave, the images are downsampled by a factor of 2, and the process is repeated. Extracted from [13].
2.25 Detection of the maxima and minima of the difference-of-Gaussian images by comparing the neighbors. Extracted from [13].
2.26 Stages of keypoint selection. (a) The 233 × 189 pixel original image. (b) The initial 832 keypoint locations at maxima and minima of the difference-of-Gaussian function. Keypoints are displayed as vectors indicating scale, orientation, and location. (c) After applying a threshold on minimum contrast, 729 keypoints remain. (d) The final 536 keypoints that remain following an additional threshold on ratio of principal curvatures. Extracted from [13].
2.27 Example of a SIFT descriptor matching found in two different images of the same sight. Extracted from [14].
2.28 Left to right: The (discretised and cropped) Gaussian second order partial derivatives in y-direction and xy-direction, and our approximations thereof using box filters. The grey regions are equal to zero. Extracted from [15].
2.29 The descriptor entries of a sub-region represent the nature of the underlying intensity pattern. Left: In case of a homogeneous region, all values are relatively low. Middle: In presence of frequencies in x direction, the value of Σ|dx| is high, but all others remain low. If the intensity is gradually increasing in x direction, both values Σdx and Σ|dx| are high. Extracted from [15].
2.30 Process for Bag-of-Features Image Representation for CBIR. Extracted from [16].
2.31 Example of a construction of a three-level pyramid. Extracted from [17].
2.32 An analogy between image and text document in semantic granularity. Extracted from [18].
2.33 Examples of images from the Corel dataset.
2.34 The 100 objects (classes) from the Coil-100 image database. Extracted from [19].
2.35 Examples of images from the MIR Flickr dataset. Also listed are Creative Commons attribution license icons and the creators of the images. Extracted from [20].
2.36 A snapshot of two root-to-leaf branches of ImageNet: the top row is from the mammal subtree; the bottom row is from the vehicle subtree. For each synset, 9 randomly sampled images are presented. Extracted from [21].
2.37 Results obtained by AverageExplorer: the many informative modes and examples of images and patches within each mode. Extracted from [22].
2.38 Examples of average images edited and aligned compared to the unaligned versions, using the AverageExplorer interactive visualization tool. Extracted from [22].
2.39 A summary of 10 images extracted from a set of 2000 images of the Vatican computed by our algorithm. Extracted from [23].
4.1 Gantt diagram of the work plan.

Chapter 1

Introduction

1.1 Background

With the emergence of social media websites like Twitter, Facebook, Flickr and YouTube, a huge amount of information is produced every day by millions of users. Facebook alone reports 6 billion photo uploads per month, and YouTube sees 72 hours of video uploaded every minute [22]. We have experienced an explosion of visual data available online, shared mainly by users through social media websites.

The goal of this thesis is to find patterns in images shared via the social network Twitter. These patterns will be found using a technique called clustering, in which, given a set of unlabeled data, groups are formed based on the similarity of the content of the data itself. This means that each cluster will represent a pattern or concept, which could depict, for example, a location, an activity or a photographic trend (e.g. "selfies").

Image analysis has been a popular topic of research among scientists for decades. Within this large topic, there are more specific areas of research, such as image classification, image retrieval, image clustering and image annotation. All these areas use techniques and methods from image processing, computer vision, machine learning and pattern recognition. In this thesis, image clustering will be the main focus; nonetheless, it is important to understand and to have some background on the other fields as well.

Twitter is one of the most popular social networks available and is mainly used as a microblogging service. Twitter users share what are called tweets, which are short messages (maximum of 140 characters). These tweets can also include links and multimedia content such as images. In order to obtain data shared on Twitter, many crawlers (bots that automatically index the Web) have been developed. This project will make use of data collected by TwitterEcho [24].
After the image clusters have been found, a visualization tool will be developed. This should allow the user to explore the clusters obtained by visualizing relevant or representative images among the clusters. This system will be implemented using OpenCV, a popular open-source computer vision library. It is available for C++, Python, Java and Android.

1.2 Motivation

The information shared on social media websites can take many formats, such as text, images or video. Until recent years, research focused on the analysis and extraction of relevant information from the text content of the information produced. This is because text is simpler to analyze and categorize than the other two formats. However, ignoring the visual content shared on social media could be seen as a waste of important information for research.

Therefore, there is a need to develop more robust and more powerful algorithms for the analysis of images from large image collections. The big issue with visual content is that the features which can be extracted directly from the image, called low-level features, do not give information about the content or high-level concepts present in an image. For that reason, it is extremely difficult to compare images in a way that is understandable and acceptable to humans. Consequently, this is an area of great potential for research. The possible applications of such systems are numerous and could include safety, marketing and behavioral studies.
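
To make the notions of clustering and of low-level features more concrete, the following minimal sketch groups a handful of synthetic images by their color histograms using k-means. It is an illustration only, not the method adopted in this work: the images are artificial lists of RGB pixels, the choice of a color histogram as the feature and of a farthest-point initialization for k-means are assumptions made for the example, and all function names (color_histogram, kmeans, synthetic_image) are hypothetical. Plain Python is used so that the example is self-contained; a real implementation would more likely rely on OpenCV.

```python
RED, BLUE, GRAY = (220, 30, 30), (30, 30, 220), (128, 128, 128)

def synthetic_image(dominant, gray_pixels, size=64):
    """Stand-in for a real image: a flat list of RGB pixels (illustrative only)."""
    return [GRAY] * gray_pixels + [dominant] * (size - gray_pixels)

def color_histogram(pixels, bins=4):
    """Normalized per-channel histogram, a simple low-level color feature."""
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1.0
    return [count / len(pixels) for count in hist]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    """Basic k-means with a deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point that is farthest from every centroid chosen so far.
        centroids.append(max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Five mostly red and five mostly blue images, each with a varying share of gray.
images = ([synthetic_image(RED, g) for g in (0, 4, 8, 12, 16)] +
          [synthetic_image(BLUE, g) for g in (0, 4, 8, 12, 16)])
features = [color_histogram(image) for image in images]
labels = kmeans(features, k=2)
print(labels)  # -> [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The grouping obtained here reflects only the dominant color of each image. Two images with the same color distribution but entirely different subjects would fall into the same cluster, which is precisely the limitation of low-level features described above.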