RankBrain and the Era of Artificial Intelligence


Search Algorithms

Adam Bunn, Director of SEO and Content & Engagement
www.greenlightdigital.com | +44 (0)20 7253 7000
The Varnish Works, 3 Bravingtons Walk, King’s Cross, London, N1 9AJ

Last October’s RankBrain story was one of the splashiest of 2015. When I studied the impact RankBrain seemed to be having on the SERPs and the information available about it, including two different patents that seemed closely related to what RankBrain is supposed to do, I was reminded a lot of a previous update from Google called Hummingbird. You may remember it as 2013’s major algorithm update, which I wrote about in that year’s SEO Briefing.

Hummingbird marked the first time Google had ostensibly named its entire algorithm – but if you dig into what it was all about, you’ll see it was predominantly a change to the way Google processed search queries, with a few supporting tweaks to existing algorithms as well as changes to the way sites were indexed. All the existing components, such as PageRank, Panda, Penguin and so on – the ones which actually determine rankings – were still running.

There was a lot of panic and excitement about Hummingbird caused by the apparent change to the entire algorithm (even though the actual ranking algorithms barely changed at all), which was further spurred by the reports from Google that it affected 90% of queries (it was a change to the way they processed queries, so naturally it affected nearly all of them!).

In my estimation, the big change with Hummingbird was the introduction of a series of synonym databases and “substitution engines” that allowed Google to do a better job of returning relevant results for ambiguous natural language queries. A good example from one of the Hummingbird patents was handling queries such as “pizza places in London”. This would have been a difficult query for pre-Hummingbird Google to deal with, since “places” is an ambiguous term that humans understand only because of the context it sits within. But, post-Hummingbird, “places” would have been substituted with “restaurants”, resulting in more relevant results being returned. Effectively, Hummingbird allows Google to understand ambiguous parts of queries by their context.

But, here’s the thing: it does this with a sort of “brute force” approach that relies on trying out various known potential substitutions of any ambiguous word in context with the words either side of it, then the words one word away either side, then the words two words away, until an adequate substitution is found. Finally, the revised query is run alongside, or instead of, the original query. Ultimately, it isn’t a real “understanding” of the terms.

All of the context around Hummingbird’s query processing engine is important if you want to understand the RankBrain we know today, as it seems to be an evolution of Hummingbird’s bid to comprehend search terms. The big difference is that instead of revising some of the queries with the brute force approach described above, it revises them by guessing the intent of a query using a machine learning algorithm, which can learn from previous data about what searchers meant when making ambiguous queries, and apply those learnings to understand new ambiguous queries.

Compared to Hummingbird’s query processing, this makes RankBrain particularly adept at understanding queries that Google has never seen before. According to Google, RankBrain was rolled out gradually at the beginning of 2015, and helps with “a large fraction” of the queries it receives (15% was initially reported, but this later proved to be an unfounded assumption).

I won’t pretend to be an expert on machine learning but, from what we know, Google is using a branch of the science called “deep learning” that tries to create AI-like behaviour based on large data sets, and has probably been applying it to its products for a while. They’ve had a dedicated deep learning project called Google Brain since 2011, and Google’s capabilities in this field took big strides forward in 2013 and 2014 with the acquisition of DNNResearch Inc. and DeepMind Technologies respectively, with both companies focussed on deep learning.

A final point is that when RankBrain came out, Google publicly stated that it was “the third most important ranking factor”. Based on what I’ve put forward so far, you’d be right to conclude that it’s certainly not a “ranking factor” that you’d need to consider in the traditional sense, so I don’t quite know how to reconcile that statement. In my view it was almost certainly hyperbole designed to attract coverage on the company’s advances in artificial intelligence, or a twisted version of the truth. If I had to guess, I’d say that the “large fraction” of queries it affects means that it technically impacts the most top tier queries of any Google algorithm, having the third greatest reach in terms of ranking factors. Measured like this, algorithms we consider to have important ranking factors, such as Panda or Penguin, look insignificant in comparison as they usually only impact 2-3% of queries each time they’re rolled out. This kind of chicanery would provide a technically sound basis for Google’s claim, even though actual rankings are not impacted too much by RankBrain.

The Quality Update or “Phantom”

On 5th May 2015, a noticeable algorithm update took place. The update was initially named “Phantom” by the industry because there was no word from Google on its nature, or even that an update had happened. Our testing and analysis showed significant changes in the ranking impact of key user signals such as bounce rate. Later that month, Google confirmed it had changed the way site quality was being processed in its core algorithm, but provided no further detail on the changes.

NEWS WAVE

A comparatively minor update that seemingly benefitted sites that regularly updated their content – namely newspapers and magazines – was reported in June 2015. It corresponded with Google regaining access to the “Twitter Firehose” – a raw feed of tweets directly from Twitter – lending credence to the theory that Google was analysing trending topics and responding with fresh content, leading to a boost for news websites.

PENGUIN

Meanwhile, Penguin didn’t seem to run at all last year; there was no public announcement of an update and no notable shifts in rankings that pointed towards Penguin as the cause. We were all braced for a big update in October 2015 when Google stated the new Penguin would be coming “soon”, but they eventually postponed it to 2016 (and, of course, it still hasn’t happened as I write this). The last confirmed Penguin update was in December 2014.

PANDA

The industry headed into 2015 believing that Panda had been incorporated as a permanent “switched on” part of the algorithm. This marked a change from the traditional behaviour of a Panda update, which would typically be pushed out monthly. By April, Google had stated this wasn’t the case, and July saw the start of the first Panda update of the year – almost 10 months after the previous confirmed update. Google eventually stated it would roll out the update over the course of several months.

In August 2015, many sites saw gains they’d made in July reversed, leading some to speculate that Google had wound back the update.

Given the huge reduction in frequency, the long roll outs and the general silence on the subject of Panda, it certainly seems we’re in a post-Panda era – at least as far as Google’s public acknowledgement of the algorithm is concerned. Google may have found a way to score quality without Panda or, more likely considering it was always their intention, finally and quietly incorporated part or all of it into the main algorithm.
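The brute-force substitution that the article attributes to Hummingbird can be made concrete with a small Python sketch. It is purely illustrative and is not Google's implementation: the `SUBSTITUTIONS` table, the function names and the `max_distance` cut-off are all assumptions invented for this example. The only thing taken from the article is the search order: test each known substitution against the words immediately either side of the ambiguous word, then the words one step further out, and so on until one fits.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical substitution table (an assumption for this sketch):
# ambiguous word -> {candidate replacement: context words that support it}
SUBSTITUTIONS: Dict[str, Dict[str, FrozenSet[str]]] = {
    "places": {
        "restaurants": frozenset({"pizza", "sushi", "eat"}),
        "venues": frozenset({"wedding", "concert"}),
    },
}


def context_at_distance(words: List[str], index: int, distance: int) -> List[str]:
    """Return the words exactly `distance` positions left and right of `index`."""
    neighbours = []
    for pos in (index - distance, index + distance):
        if 0 <= pos < len(words):
            neighbours.append(words[pos])
    return neighbours


def substitute(words: List[str], index: int, max_distance: int = 3) -> Optional[str]:
    """Widen the context window step by step until a candidate is supported."""
    candidates = SUBSTITUTIONS.get(words[index])
    if not candidates:
        return None  # the word is not in the table, so it is treated as unambiguous
    for distance in range(1, max_distance + 1):
        neighbours = context_at_distance(words, index, distance)
        for candidate, support in candidates.items():
            if any(word in support for word in neighbours):
                return candidate
    return None  # no adequate substitution found within the window


def revise_query(query: str, max_distance: int = 3) -> str:
    """Return the query with every resolvable ambiguous word substituted."""
    words = query.lower().split()
    revised = list(words)
    for i in range(len(words)):
        replacement = substitute(words, i, max_distance)
        if replacement is not None:
            revised[i] = replacement
    return " ".join(revised)


print(revise_query("pizza places in London"))  # pizza restaurants in london
```

In this sketch the revised query would then be run alongside, or instead of, the original one, as the article describes. Nothing here "understands" the query; it only looks words up in a fixed table, which is the limitation the article contrasts with RankBrain's machine-learned approach.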