Semantic Analytics

Oct 2016 – NorthField Asia Research Seminar, Sydney

Dr Alex Johnston, Director of Client Technology

THOMSON REUTERS: WHO ARE WE?

Thomson Reuters is the world’s leading source of intelligent information for businesses and professionals. We are powered by the world’s most trusted news organization.

Recently Sold

F&R has now enjoyed 7 consecutive quarters of positive growth driven by investments in content, service and a new platform. Achieved 30% EBITDA margin target in 4Q15, an improvement of more than 400 basis points since 2013.

FINANCIAL & RISK: WHAT WE DO

We serve more than 40,000 customers and 400,000 end-users in more than 150 countries:

Driving Performance
• 2 million news stories per year
• 5,000+ investment firms and hedge funds supported worldwide

Enabling Connectivity
• $250 billion in bond trading supported daily
• $420 billion+ in FX trading per day

Managing Risk & Regulation
• 40,000+ regulatory alerts supplied to the world’s banks per year
• 2 million+ individuals and entities that can pose a potential risk to the international business community are tracked daily

• 11 million+ messaging interactions daily
• ~2.5 million price updates distributed per second to the financial markets

OUR CUSTOMERS’ VOICE: FINDING INFORMATION IS PAIN

“I spend over 10% of my time on Google looking for information that others may not have.”
Analyst, Large UK Hedge fund

“Data points get lost in translation. Amazon analyst puts it in, but Best-Buy analyst does not get it”
Director, $4B US L/S equity fund

“They all still do it manually”
Equities, Market data team, $20B+ hedge fund

“I spend a lot of my time reading research reports, gathering economic data, getting international company filings”
Senior Analyst, $8B US Value Fund

“Fundamental guys are still working with rudimentary tools compared to quants”
Head, Large US Broker Principal Investing

“The problem has gotten worse with more data and more information”
Director, Research/Tech, $10B Multi-strat

Activity
• Data – Assemble: 20%
• Data – Synthesize: 15%
• External Meetings: 10%
• Data – Interpret: 20%
• Financial Modeling: 15%
• Communicate Findings: 20%

Pain

Information overload
1. Increasing difficulty keeping up with available information to develop better insight or ideas
2. Cannot effectively surface company management views and intent through traditional methods/sources

Understanding relationships
3. Inability to better understand the relationships of a company with customers and suppliers
4. Inability to predict events or track catalysts that impact companies and industries

Integrating information
5. Cannot link and integrate internal research/data to external research and data sources
6. Unable to improve insights into a company or industry by mining new sources of unstructured data

Source: Customer meetings, TR internal analyst survey

THOMSON REUTERS CONTENT COVERAGE

NEWS & COMMENTARY: Commentary; Global and Domestic News; Newsletters; Significant Developments; Video
REFERENCE DATA: Index Constituents and Weightings; Industry Classifications; Security Identifiers; Terms and Conditions
SPECIALIZED DATA: Commodities Fundamentals; Deals & Transactions Intelligence; Mutual Fund Data (Lipper); Quantitative Analytics and Models; Commodities Research & Forecasts; Private Equity Data; Credit (CDS)
RISK & COMPLIANCE: Know Your Client (KYC); Operational Risk Management; Regulatory Risk Management
INTELLECTUAL PROPERTY: Intellectual Property; Copyrights; Patents/Applications; Trademarks
LEGAL DISPUTES: Arbitration; Administrative Case Law; Jury Verdicts; Tax Case Law; Court Dockets; Court Filings
COMPANY DATA: Broker Research; Business Classifications; Company News; Competitors; Corporate Actions; Debt & Syndicated Loans; Entity Risk (Corporate Structures); ESG Data (Ranking and Ratings); Estimates; Events & Transcripts; Fundamentals; M&A; Officers & Directors; Ownership & Bond Holdings; Private Company Data; Shareholder Activism Intelligence; StarMine® Scores; Transactions; Valuation
VALUE CHAIN DATA: Suppliers; Distributors; Network of relationships; Type, relevance and characteristics of relationships
MACRO-ECONOMIC DATA: Country Data; Economic Indicators and Polls; Industrial Activity
RISK & REGULATORY: KYC Org ID; People Screening; Regulatory Intelligence; Risk Screening
LAWS & REGULATIONS: Official California Code of Regulations; Bills (Legislation); Court Rules; Financial Regulations; Science Regulations; Tax Regulations; Statutes; Treaties Authority
MARKET DATA & PRICING: Equities; Commodities & Energy; Derivatives & Options; Fixed Income; Foreign Exchange; FX and Interest Rate Polls; Futures; Global Aggregates; Indexes and Benchmarks; Loan Pricing
SCIENTIFIC DATA: Biomarkers; Chemistry; Clinical Trials; Disease Reports; Drug Experimental Results; Drug Reports; Drugs / Compounds; Genomics; Zoological Records

• People
• Organization
• ...

TRIPLE (subject, predicate, object): Google, is succeeded by, Alphabet

SEMANTIC CONCEPTS

Example Entities

Account, Acquisition, Anniversary, Asset, Business Activity, City, Company, Continent, CorporateAction, Country, Currency, Document, Event, Editor, EmailAddress, EntertainmentAwardEvent, Facility, FaxNumber, Film, Fund, Industry, Holiday, IndustryTerm, Instrument, Journalist, LipperClassification, Location, MarketIndex, MedicalCondition, MedicalTreatment, Movie, MusicAlbum, MusicGroup, NaturalFeature, OperatingSystem, Organization, Person, Pharmaceutical Drug, PhoneNumber, PoliticalEvent, Position, Product, Project, ProgrammingLanguage, ProvinceOrState, PublishedMedium, Quote, RadioProgram, RadioStation, Region, Sentiment, SportsEvent, SportsGame, SportsLeague, Technology, Transaction, TVShow, TVStation, URL, WorldCheck

Example Relationships

Acquisition, Alliance, AnalystEarningsEstimate, AnalystRecommendation, ArmedAttack, ArmsPurchaseSale, Arrest, Bankruptcy, BonusSharesIssuance, BusinessRelation, Buybacks, CandidatePosition, CompanyAccountingChange, CompanyAffiliates, CompanyCompetitor, CompanyCustomer, CompanyEarningsAnnouncement, CompanyEarningsGuidance, CompanyEmployeesNumber, CompanyExpansion, CompanyForceMajeure, CompanyFounded, CompanyInvestigation, CompanyInvestment, CompanyLaborIssues, CompanyLayoffs, CompanyLegalIssues, CompanyListingChange, CompanyLocation, CompanyMeeting, CompanyNameChange, CompanyProduct, CompanyReorganization, CompanyRestatement, CompanyTechnology, CompanyTicker, CompanyUsingProduct, ConferenceCall, ContactDetails, Conviction, CreditRating, Deal, DebtFinancing, DelayedFiling, DiplomaticRelations, Dividend, EmploymentChange, EmploymentRelation, EnvironmentalIssue, EquityFinancing, Extinction, FamilyRelation, FDAPhase, IndicesChanges, Indictment, IPO, JointVenture, ManMadeDisaster, Merger, MilitaryAction, MovieRelease, MusicAlbumRelease, NaturalDisaster, PatentFiling, PatentIssuance, PersonAttributes, PersonCareer, PersonCommunication, PersonEducation, PersonEmailAddress, PersonLocation, PersonParty, PersonRelation, PersonTravel, PoliticalEndorsement, PoliticalRelationship, PollsResult, ProductIssues, ProductRecall, ProductRelease, Quotation, SecondaryIssuance, StockSplit, Trial, VotingResult

BIG OPEN – RELATIONSHIPS & SEMANTICS

TR Framework: Analytics
TR Capability: Stitching ‘The Graph’
Issue: 4. Inability to predict events or track catalysts that impact companies and industries
Benefits:
• Explore Relationships
• Analyse Risk Impacts

TR Framework: Tag
TR Capability: Intelligent Tagging
Issue: 6. Unable to improve insights into a company or industry by mining new sources of unstructured data
Benefits:
• Extract Concepts with AI
• Tag & Locate Semantically

TR Framework: Identify
TR Capability: PermID & Semantic Web*
Issue: 5. Cannot link and integrate internal research/data to external research and data sources
Benefits:
• Reduce Technology Cost
• Correlate Data easier

INFORMATION OVERLOAD: SEARCH IS BROKEN

According to a recent study by IDC, “The High Cost of Not Finding Information,” the average knowledge worker spends up to 2.5 hours per day searching for or gathering information or data. This includes searches, email queries and other related tasks, all of which add up to a massive amount of time spent trying to find information that already exists. This equates to approximately 400 hours per employee per year spent searching for or gathering information. Using these numbers, we can calculate that a firm such as Goldman Sachs, with approximately 32,000 employees earning on average $105,000 each, would be spending approximately $646 million per year on enterprise search.
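The $646 million figure can be reproduced with a short calculation. This is only a sketch: the text does not say how salary is converted to an hourly rate, so the 2,080 paid hours per year (52 weeks × 40 hours) is an assumption, as are the function and variable names.

```python
# Figures quoted in the text above.
HOURS_SEARCHING_PER_YEAR = 400   # per employee
EMPLOYEES = 32_000
AVERAGE_SALARY_USD = 105_000

# Assumption (not stated in the text): 52 weeks x 40 hours of paid time per year.
WORK_HOURS_PER_YEAR = 52 * 40


def annual_search_cost(employees, salary, search_hours,
                       work_hours=WORK_HOURS_PER_YEAR):
    """Salary cost of the hours all employees spend searching in one year."""
    hourly_rate = salary / work_hours
    return employees * hourly_rate * search_hours


cost = annual_search_cost(EMPLOYEES, AVERAGE_SALARY_USD, HOURS_SEARCHING_PER_YEAR)
print(f"${cost / 1e6:,.0f} million per year")  # prints "$646 million per year"
```

Under that assumption the hourly rate is about $50.48, so 400 hours cost roughly $20,192 per employee per year.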

What is the relationship between Bill Gates and Warren Buffett?

LINKED DATA: DISSECT RELATIONSHIPS

Graph finds the signal in the noise

Source:
• TR DataFusion
• TR DataLake
• 4 Steps (30%)

THE GRAPH – DISSECTING RISKS

Event (News) Impact on Portfolio

Risk (Slavery) in the Supply Chain

Songlka > Wal Mart

entities

Danaher > Volkswagen

Distance (Bacon number) between entities and types of relationship (supplier, parent company, location…) brings meaning and insight to information

THE GRAPH – EXPLORING INVESTMENT IDEAS

Liquidity Events

Stemcells and Microbot (private company) reveal reverse M&A merger

Historical officer Desmond O’Connell may have a liquidity event: of interest to Private Banking

Desmond also has investments in Abiomed & Serologicals

THE GRAPH – EXPLORING INVESTMENT IDEAS

Investment Exploration

Research indicates Infineon disfavors acquisition of International Rectifier - may shut down operation, affecting Intel who it supplies

Infineon also supplies Microsoft, which just announced layoffs

NXP Semiconductor is an alternative investment to Intel, without the same supply issues

CONNECTING GRAPHS AND CONTENT

Relationship Managers Investment Advisors Risk Managers Public Relations

Thomson Reuters Graph Panama Papers

Internal Analytics

Thematic Investing

8 billion relationships

SCORING GRAPH: QUANTIFYING RELATIONSHIPS

NEWS IMPACT ON PORTFOLIO

Powered by THOMSON REUTERS LABS

Relationship                 Weight
Tier 1 supplier              0.80
Subsidiary                   0.90
Competitor                   0.60
Customer                     0.20
Tier 2 supplier              0.50
Subsidiary of T1 supplier    0.75
Customer of T1 supplier      0.10
Supplier of competitor       0.15

[Diagram labels: News / Risk Event, Path (i), Path Length, Company i, Portfolio, 0.8]

Algorithms traverse edges to find the most relevant paths per use case. Path strength is a new metric.

WHAT DOES IT MEAN?

Tools that combine and link content
• Use PermID, ontology & tools to merge discrete datasets, e.g. news event with price
• Symbology and tooling support merging structured and unstructured content from any source

Relationship & Pattern Identification
• Visually traverse a graph of concepts to identify relationships between concepts (companies, industries, people, prices)
• Better: Use AI to crawl the semantic web and uncover relationships
• Graph algorithms can identify relationships for other algorithms to verify

Algorithm Variables & Metrics
• New variable types? News Impact, Supply Chain Risk, Officer Risk, Publicity Exposure/Impact
• New metrics? Semantic Distance, ‘The Bacon Number’, Impact / Penetration
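A minimal sketch of two of the metrics named on these slides. Two assumptions are made that the deck does not state: semantic distance (the 'Bacon number') is taken to be the hop count of the shortest path, and path strength is taken to be the product of the relationship weights along a path. The toy graph and company names are hypothetical; the weights are the ones listed in the scoring table.

```python
from collections import deque

# Relationship weights as listed in the scoring table.
WEIGHTS = {
    "tier1_supplier": 0.80,
    "subsidiary": 0.90,
    "competitor": 0.60,
    "customer": 0.20,
}

# Hypothetical toy graph: node -> list of (neighbour, relationship type).
GRAPH = {
    "Portfolio Co": [("Supplier A", "tier1_supplier"), ("Rival B", "competitor")],
    "Supplier A": [("Subsidiary C", "subsidiary")],
    "Rival B": [("Customer D", "customer")],
    "Subsidiary C": [],
    "Customer D": [],
}


def semantic_distance(graph, start, goal):
    """Fewest hops from start to goal (breadth-first search), or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == goal:
            return hops
        for neighbour, _relationship in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None


def path_strength(graph, path, weights=WEIGHTS):
    """Product of edge weights along a path (assumed scoring rule)."""
    strength = 1.0
    for source, target in zip(path, path[1:]):
        relationship = next(r for n, r in graph[source] if n == target)
        strength *= weights[relationship]
    return strength


print(semantic_distance(GRAPH, "Portfolio Co", "Subsidiary C"))
print(path_strength(GRAPH, ["Portfolio Co", "Supplier A", "Subsidiary C"]))
```

A multiplicative rule makes longer or weaker chains score lower, but it does not reproduce the two-hop weights in the table exactly (0.80 × 0.90 = 0.72, against a listed 0.75 for a subsidiary of a Tier 1 supplier), so the scoring used in the deck presumably differs.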

USE CASES ARE BROAD-BASED AND DIVERSE

Graph tech is rapidly moving into professional scenarios
• “25% of enterprises will use graph db by 2017” - Forrester
• “Graph analysis is possibly the single most effective competitive differentiator for organizations pursuing data-driven operations and decisions.” - Gartner

“Don’t just give me what I asked for – tell me what I need to know.”
