
STAFF

Editor: Cathy Martin
Publications Portfolio Managers: Carrie Clark, Kimberly Sperka
Publications Operations Project Specialist: Christine Anthony
Publisher: Robin Baldwin
Publications Marketing Project Specialist: Meghan O’Dell
Senior Advertising Coordinator: Debbie Sims
Production & Design: Carmen Flores-Garvey

Circulation: ComputingEdge (ISSN 2469-7087) is published monthly by the IEEE Computer Society. IEEE Headquarters, Three Park Avenue, 17th Floor, New York, NY 10016-5997; IEEE Computer Society Publications Office, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; voice +1 714 821 8380; fax +1 714 821 4010; IEEE Computer Society Headquarters, 2001 L Street NW, Suite 700, Washington, DC 20036. Postmaster: Send address changes to ComputingEdge-IEEE Membership Processing Dept., 445 Hoes Lane, Piscataway, NJ 08855. Periodicals Postage Paid at New York, New York, and at additional mailing offices. Printed in USA. Editorial: Unless otherwise stated, bylined articles, as well as product and service descriptions, reflect the author’s or firm’s opinion. Inclusion in ComputingEdge does not necessarily constitute endorsement by the IEEE or the Computer Society. All submissions are subject to editing for style, clarity, and space. Reuse Rights and Reprint Permissions: Educational or personal use of this material is permitted without fee, provided such use: 1) is not made for profit; 2) includes this notice and a full citation to the original work on the first page of the copy; and 3) does not imply IEEE endorsement of any third-party products or services. Authors and their companies are permitted to post the accepted version of IEEE-copyrighted material on their own Web servers without permission, provided that the IEEE copyright notice and a full citation to the original work appear on the first screen of the posted copy. An accepted manuscript is a version which has been revised by the author to incorporate review suggestions, but not the published version with copy-editing, proofreading, and formatting added by IEEE. For more information, please go to: http://www.ieee.org/publications_standards/publications/rights/paperversionpolicy.html. 
Permission to reprint/republish this material for commercial, advertising, or promotional purposes or for creating new collective works for resale or redistribution must be obtained from IEEE by writing to the IEEE Intellectual Property Rights Office, 445 Hoes Lane, Piscataway, NJ 08854-4141 or [email protected]. Copyright © 2019 IEEE. All rights reserved.

Abstracting and Library Use: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy for private use of patrons, provided the per-copy fee indicated in the code at the bottom of the first page is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Unsubscribe: If you no longer wish to receive this ComputingEdge mailing, please email IEEE Computer Society Customer Service at [email protected] and type “unsubscribe ComputingEdge” in your subject line.

IEEE prohibits discrimination, harassment, and bullying. For more information, visit www.ieee.org/web/aboutus/whatis/policies/p9-26.html.

IEEE Computer Society Magazine Editors in Chief

Computer: David Alan Grier (Interim), Djaghe LLC
IEEE Security & Privacy: David Nicol, University of Illinois at Urbana-Champaign
Computing in Science & Engineering: Jim X. Chen, George Mason University
IEEE Micro: Lizy Kurian John, University of Texas, Austin
IEEE Software: Ipek Ozkaya, Software Engineering Institute
IEEE Intelligent Systems: V.S. Subrahmanian, Dartmouth College
IEEE MultiMedia: Shu-Ching Chen, Florida International University
IEEE Internet Computing: George Pallis, University of Cyprus
IEEE Computer Graphics and Applications: Torsten Möller, University of Vienna
IEEE Annals of the History of Computing: Gerardo Con Diaz, University of California, Davis
IT Professional: Irena Bojanova, NIST
IEEE Pervasive Computing: Marc Langheinrich, University of Lugano

MAY 2019 • VOLUME 5, NUMBER 5

Internet of Things

Semantic Enablement in IoT Service Layers—Standard Progress and Challenges
KOMAL GILANI, JAHO KIM, JAESEUNG SONG, DALE SEED, AND CHONGGANG WANG

Toward a Machine Intelligence Layer for Diverse Industrial IoT Use Cases
JAN HÖLLER, VLASIOS TSIATSIS, AND CATHERINE MULLIGAN

Blockchain

Self-Managing Real Estate
NATHAN SHEDROF

Blockchain in Developing Countries
NIR KSHETRI AND JEFFREY VOAS

Social Media

Emoji: Lingua Franca or Passing Fancy?
GEORGE HURLBURT

The Online Trolling Ecosystem
HAL BERGHEL AND DANIEL BERLEANT

Careers

CareerVis: Hierarchical Visualization of Career Pathway Data
MINGRAN LI, WENJIE WU, JUNHAN ZHAO, KEYUAN ZHOU, DAVID PERKIS, TIMOTHY N. BOND, KEVIN MUMFORD, DAVID HUMMELS, AND YINGJIE VICTOR CHEN

Departments

Magazine Roundup
Editor’s Note: Managing IoT Diversity
Conference Calendar

Subscribe to ComputingEdge for free at www.computer.org/computingedge.

CS FOCUS

Magazine Roundup

The IEEE Computer Society’s lineup of 12 peer-reviewed technical magazines covers cutting-edge topics ranging from software design and computer graphics to Internet computing and security, from scientific applications and machine intelligence to visualization and microchip design. Here are highlights from recent issues.

Published by the IEEE Computer Society, May 2019. 2469-7087/19/$33.00 © 2019 IEEE

Computer

Hybrid Vehicular Crowdsourcing with Driverless Cars: Challenges and a Solution
Although vehicular crowdsourcing represents an emerging technology to assist many smart city applications, maintaining sensing data quality is still a challenge. This article from the December 2018 issue of Computer considers the challenges and offers a potential solution for a hybrid scenario involving both driverless cars and human-controlled vehicles, within the limited task budget.

Computing in Science & Engineering

Evidence-Based Detection of Advanced Persistent Threats
This article from the November/December 2018 issue of Computing in Science & Engineering presents an approach to the automation of cybersecurity operations centers with cognitive assistants that capture and automatically apply the expertise employed by cybersecurity analysts when they investigate advanced persistent threats. The goal is to significantly increase the probability of detecting intrusion activity while drastically reducing the workload of the operators.

IEEE Annals of the History of Computing

Oral History of Dov Frohman
Dov Frohman is an Israeli electrical engineer and businessman. In 1970, he invented the Electrically Programmable Read-Only Memory (EPROM), a key enabling technology for rapid development of microprocessor-based systems, from personal computers to industrial controls. Intel founder Gordon Moore called it “as important in the development of the microcomputer industry as the microprocessor itself.” Frohman was also responsible for establishing Intel’s R&D and manufacturing presence in Israel, one of Intel’s most productive and advanced design centers. Frohman is a nationally known figure in Israel. Read more in the October–December 2018 issue of IEEE Annals of the History of Computing.

IEEE Computer Graphics and Applications

Graphoto: Aesthetically Pleasing Charts for Casual Information Visualization
Graphoto is a framework that automatically generates a photo or adjusts an existing one to match a line graph. Since aesthetics is an important element in visualizing personal data, Graphoto provides users with aesthetically pleasing displays for casual line graph information visualization. More specifically, after creating a line graph of the input data, a photo that resembles the input data on the line graph is selected from a photo archive. The authors of this article from the November/December 2018 issue of IEEE Computer Graphics and Applications present a user study to show the effectiveness of Graphoto in terms of data interpretation and aesthetics.

IEEE Intelligent Systems

A Multimodal Approach for the Safeguarding and Transmission of Intangible Cultural Heritage: The Case of i-Treasures
Intangible cultural heritage (ICH) creations include music, dance, singing, theater, human skills, and craftsmanship. These cultural expressions are usually transmitted orally or using gestures and are modified over a period of time, through a process of collective recreation. As the world becomes more interconnected and many cultures come into contact, local communities run the risk of losing important elements of their ICH, while young people find it difficult to maintain the connection with the cultural heritage treasured by their elders. In this article from the November/December 2018 issue of IEEE Intelligent Systems, the authors present a novel holistic approach for the safeguarding and transmission of ICH that goes beyond the mere digitization of ICH content.

IEEE Internet Computing

Considering Jurisdiction When Assessing End-to-End Network Neutrality
Existing solutions designed to assess end-to-end neutrality violations do not consider the normative jurisdictions. The authors of this article from the November/December 2018 issue of IEEE Internet Computing argue that jurisdiction-aware violation detection can be achieved through further steps that can be added to current solutions. As a proof-of-concept, they propose a prototype to expose and discuss the challenges and open issues that need to be faced to consider the normative jurisdiction when assessing end-to-end network neutrality.

IEEE Micro

Image Recognition Accelerator Design Using In-Memory Processing
This article from the January/February 2019 issue of IEEE Micro proposes a hardware accelerator design, called object recognition and classification hardware accelerator on resistive devices, which processes object recognition tasks inside emerging nonvolatile memory. The in-memory processing dramatically lowers the overhead of data movement, improving overall system efficiency. The proposed design accelerates key subtasks of image recognition, including text, face, pedestrian, and vehicle recognition. The evaluation shows significant improvements on performance and energy efficiency as compared to state-of-the-art processors and accelerators.

IEEE MultiMedia

Social Relationship Labeling Based on Multimodal Behaviors and Social Interactions
This article from the October–December 2018 issue of IEEE MultiMedia addresses the social relationship labeling problem by exploiting users’ multi-modal behaviors and abundant social interactions on Twitter. This can effectively alleviate the problem of lacking specific information of following relationships. The experimental results demonstrate the effectiveness of the designed features and different classifiers.

IEEE Pervasive Computing

Pervasive Agriculture: IoT-Enabled Greenhouse for Plant Growth Control
The authors of this article from the October–December 2018 issue of IEEE Pervasive Computing present an Internet of Things (IoT) deployment in a tomato greenhouse in Russia. The IoT-enabling technologies in this deployment are a wireless sensor network, cloud computing, and artificial intelligence. They are to help in monitoring and controlling both plants and greenhouse conditions, as well as predicting the growth rate of tomatoes.

IEEE Security & Privacy

The Good, the Bad, and the Ugly: Two Decades of E-Voting in Brazil
Brazil pioneered the adoption of nationwide electronic voting 20 years ago. However, today its system is outdated in terms of recent properties. The authors of this article from the November/December 2018 issue of IEEE Security & Privacy discuss the system’s organization and transparency mechanisms in the context of security requirements derived from a conventional election.

IEEE Software

Spotify Guilds: How to Succeed with Knowledge Sharing in Large-Scale Agile Organizations
The new generation of software companies has revolutionized the way companies are designed. While bottom-up governance and team autonomy improve motivation, performance, and innovation, managing agile development at scale is a challenge. In this article from the March/April 2019 issue of IEEE Software, the authors describe how Spotify cultivates guilds to help the company share knowledge, align, and make collective decisions.

IT Professional

Autonomous Cars: Social and Economic Implications
One of the major issues with autonomous cars is their future impact on society, as well as on the research community, academia, and industry. As interest in autonomous car technology grows, the social and economic implications of this technology will affect various stakeholders, including its commercialization. In this article from the November/December 2018 issue of IT Professional, the authors critically review and analyze both the economic and social implications of the autonomous car. The significance of these implications will play an important role in the future of autonomous cars among consumers.



EDITOR’S NOTE

Managing IoT Diversity

The Internet of Things (IoT) pervades many industries, domains, and facets of our lives. Yet, these connected objects are often not standardized and don’t work across different applications. This lack of interoperability is a key challenge in increasing adoption and effectiveness of IoT systems. Two articles in this issue of ComputingEdge present innovations for managing this heterogeneous IoT ecosystem.

The authors of IEEE Internet Computing’s “Semantic Enablement in IoT Service Layers—Standard Progress and Challenges” argue that standardizing a set of common functions across IoT applications would reduce the development cost of IoT devices. They propose a semantic-enabled IoT service-layer platform based on oneM2M standards. IEEE Intelligent Systems’ “Toward a Machine Intelligence Layer for Diverse Industrial IoT Use Cases” offers guidelines to help designers create scalable and replicable IoT systems.

Blockchain technology is being employed in a flood of diverse new applications. In Computer’s “Self-Managing Real Estate,” the author explains how blockchain records could soon be used for buying a house. IT Professional’s “Blockchain in Developing Countries” details ways that blockchain could help fight corruption and promote stability in developing countries.

Next, two articles address phenomena that unite otherwise varied social media platforms. IT Professional’s “Emoji: Lingua Franca or Passing Fancy?” evaluates how people use emojis in digital communication. Computer’s “The Online Trolling Ecosystem” laments the ubiquity of disinformation on social media and calls for a renewed effort to battle its spread.

When it comes to careers, diversity is important. “CareerVis: Hierarchical Visualization of Career Pathway Data,” from IEEE Computer Graphics and Applications, presents a tool for helping young adults explore their many career options.


DEPARTMENT: Standards

Semantic Enablement in IoT Service Layers—Standard Progress and Challenges

Komal Gilani, Sejong University
Jaho Kim, Korea Electronics Technology Institute
JaeSeung Song, Sejong University
Dale Seed, InterDigital Communications
Chonggang Wang, InterDigital Communications

We believe that applying semantic technologies to IoT service layer platforms can improve data accessibility, data discoverability, and the ability to extract knowledge about the data. Therefore, this article shows how semantic technologies can be leveraged by IoT service layer platforms.

IEEE Internet Computing, July/August 2018. Published by the IEEE Computer Society. 1089-7801/18/$33.00 USD ©2018 IEEE

The Internet of Things (IoT) is growing significantly, with the aim of creating a connected world by providing numerous opportunities for many industrial sectors and domains such as smart cities, smart factories, and smart homes. Currently, however, IoT applications in these domains are not interoperable with each other. The heterogeneous nature of these applications provides justification for defining a standard way of abstracting vertical data models [1, 2]. IoT data is usually collected from various sources such as sensory devices and/or crowd sensing. The data is often stored in IoT platforms as resources based on different data models. This collection of data can vary in quality and context. Accordingly, a semantic approach (for example, the one used in the Semantic Web [3]) can provide great agility toward resource representation, sharing information, and inferring new knowledge from data in the IoT on a global scale [4].

Standardizing a set of common functions (such as registration and discovery) across IoT applications and devices would reduce the development cost of IoT devices. This IoT service layer enables application development independent of the underlying network communication and protocols (such as HyperText Transfer Protocol [HTTP] and Constrained Application Protocol [CoAP]) by abstracting different network technologies [5]. As most IoT service layer platforms simply store IoT data in a non-semantic-aware fashion, the meaning of the data cannot be conveyed to IoT applications. Therefore, they are unable to understand the context of the data. Meaningful use of any IoT data requires knowledge about its context, such as its geolocation, its units, and its producer. We believe that applying semantic technologies to IoT service layer platforms can improve data accessibility, data discoverability, and the ability to extract knowledge about the data. Therefore, this article shows how semantic technologies can be leveraged by IoT service layer platforms.

Since the way of storing and managing data in IoT platforms is different compared to the Web, this article shows mechanisms of using a data modelling language such as Resource Description Framework (RDF) to semantically describe IoT data, methods of associating IoT data with this semantic metadata, and methods of handling semantic queries to discover meaningful data from IoT platforms. For this purpose, we have designed a semantic-enabled IoT service-layer platform based on oneM2M global IoT standards supporting semantic features such as annotation and discovery.

SEMANTIC TECHNOLOGIES AND RELATED STANDARDS

This section explains why semantic technologies are required for IoT service layer platforms [6] and gives an overview of core semantic technologies [7]. Semantic technologies can play a critical role in data and knowledge management for context-awareness in IoT service platforms.

Ontology

Ontology represents concepts as objects that have properties and relationships with other objects. An ontology describes linguistic artifacts using a shared vocabulary of basic concepts about a piece of reality. It helps to support semantic exchange and context-driven communications among people and machines by defining shared and common theories [8].

RDF and RDF Schema

RDF is a standard model and language that represents the ontological level of facts about a resource or an individual, for example, types of individuals and their relations, respectively [9]. RDF Schema provides a vocabulary for structuring RDF resources and describing relationships among resources. This includes the modelling of classes (rdfs:Class), the rdf:type property that provides the links of instances to a class, and the rdfs:subClassOf property, which allows the specification of class hierarchies.

OWL

As an ontology language, RDF and RDFS have limited expressiveness, as they have difficulties describing cardinality constraints (e.g., Parking Garage A has more than 10 unoccupied parking spots). Therefore, OWL was introduced to provide greater expressiveness and even support ontological reasoning. OWL offers different sublanguages with different levels of expressiveness and related properties regarding reasoning completeness and time complexity.

SPARQL

SPARQL is a query language for interacting with a triple store to process stored RDF triples. SPARQL can support ontological reasoning and semantic discovery. The triple store typically provides an interface to receive SPARQL query requests from a user and to send responses back to the user. Now the question is whether and how these technologies can be leveraged in an IoT service layer platform to support semantic interoperability.
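To make the roles of RDF triples, rdf:type, rdfs:subClassOf, and triple-pattern queries more concrete, the following is a minimal sketch in plain Python. It is a toy in-memory triple store, not an RDF library and not a SPARQL engine: the class `TripleStore`, its methods, and the `ex:` names are hypothetical and introduced only for illustration. The `match` method mimics a single SPARQL-style triple pattern, with terms beginning with `?` acting as variables.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


class TripleStore:
    """Toy triple store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.add((subject, predicate, obj))

    def superclasses(self, cls: str) -> Set[str]:
        """Return cls plus every class reachable through rdfs:subClassOf."""
        seen = {cls}
        stack = [cls]
        while stack:
            current = stack.pop()
            for s, p, o in self.triples:
                if s == current and p == RDFS_SUBCLASS_OF and o not in seen:
                    seen.add(o)
                    stack.append(o)
        return seen

    def infer_types(self) -> None:
        """Add the rdf:type triples implied by the class hierarchy."""
        inferred: Set[Triple] = set()
        for s, p, o in self.triples:
            if p == RDF_TYPE:
                for parent in self.superclasses(o):
                    inferred.add((s, RDF_TYPE, parent))
        self.triples |= inferred

    def match(self, pattern: Triple) -> List[Dict[str, str]]:
        """Match one triple pattern; terms starting with '?' are variables."""
        results: List[Dict[str, str]] = []
        for triple in self.triples:
            binding: Dict[str, str] = {}
            matched = True
            for term, value in zip(pattern, triple):
                if term.startswith("?"):
                    if binding.get(term, value) != value:
                        matched = False
                        break
                    binding[term] = value
                elif term != value:
                    matched = False
                    break
            if matched:
                results.append(binding)
        return results


store = TripleStore()
store.add("ex:TemperatureSensor", RDFS_SUBCLASS_OF, "ex:Sensor")
store.add("ex:sensorA", RDF_TYPE, "ex:TemperatureSensor")
store.add("ex:sensorA", "ex:hasUnit", "ex:Celsius")
store.infer_types()

# Roughly analogous to: SELECT ?x WHERE { ?x rdf:type ex:Sensor }
sensors = sorted(b["?x"] for b in store.match(("?x", RDF_TYPE, "ex:Sensor")))
print(sensors)  # ['ex:sensorA']
```

The query finds `ex:sensorA` even though it was only declared as an `ex:TemperatureSensor`, because the subclass relation is materialized first. A real deployment would rely on an RDF store and a standards-compliant SPARQL endpoint rather than a hand-written matcher.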


SEMANTIC-ENABLED IOT SERVICE LAYER A common IoT service layer platform is required by the IoT market to facilitate multi-industry IoT applications. The oneM2M Global Initiative is an international partnership project to de- velop a globally acceptable IoT service-layer standard. The common service layer specified by oneM2M can be embedded into various IoT entities such as end devices, gateways. and serv- ers.10 It provides various IoT common service functionalities such as device registration, group management, and security and privacy. The oneM2M service layer provides a means for connecting various IoT devices regardless of their access technologies, collecting data from these devices, and managing the collected data. Through its semantic capabilities, it also supports the annotation of semantic descriptions to oneM2M resources. Figure 1 shows the high-level design of a semantic-enabled IoT service layer platform. In order to support semantic features, IoT service-layer platforms have to support at least three basic features as follows:

• Semantic annotation: To achieve data interoperability, the service layer should first be able to describe the meaning of resources/data.11 IoT service-layer resources (i.e., data sets) can be annotated with semantic information using standardized ontologies and data structures.
• Semantic query and discovery: The platform can support queries from IoT applications based on a semantic query language. When a semantic query is received, the platform executes the query by retrieving semantic information for the targeted resources and processing the discovery query.
• Semantic mashup: Like a traditional Web mashup, a semantic mashup is used to compose a virtual IoT resource from more than one IoT resource; the member resources can themselves be existing virtual resources.

Figure 1. Semantic capabilities in an IoT service layer. [Diagram: an IoT application discovers resources using semantics through the IoT service layer, which offers semantic annotation, semantic query and discovery, semantic mashup, and inference and reasoning on top of a resource repository and a semantic/ontology repository. In the IoT networks, things (sensor A, actuator B) are represented as resources, and a mashup of A and B yields a virtual sensor.]

To provide semantic services to users properly, it is necessary to define common vocabularies, standardized data formats, and description rules that can eventually solve the interoperability challenges caused by heterogeneous IoT data. The standard RDF language can be used to describe the semantic information. The annotated semantic metadata is then stored by the platform in a new resource designed to accommodate semantic information in an RDF/RDFS format. The metadata can also be stored in a triple store/ontology repository.
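To make the idea of RDF triples held in a triple store more concrete, the following is a minimal sketch in Python. It is an illustration only, not a oneM2M or RDF library API: the `TripleStore` class, its methods, and all `ex:` terms are hypothetical names chosen for this example.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


class TripleStore:
    """Toy in-memory triple store (illustration only, not a oneM2M API)."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add(Triple(subject, predicate, obj))

    def match(self, subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return stored triples that fit the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self._triples
            if subject in (None, t.subject)
            and predicate in (None, t.predicate)
            and obj in (None, t.obj)
        )


# Hypothetical vocabulary: the "ex:" terms are made up for this sketch.
store = TripleStore()
store.add("ex:sensor1", "rdf:type", "ex:TemperatureSensor")
store.add("ex:sensor1", "ex:hasUnit", "ex:Celsius")
store.add("ex:sensor2", "rdf:type", "ex:TemperatureSensor")
store.add("ex:sensor2", "ex:hasUnit", "ex:Fahrenheit")

# Which subjects report their unit as Celsius?
celsius_sensors = [t.subject for t in store.match(predicate="ex:hasUnit", obj="ex:Celsius")]
```

A real deployment would rely on an RDF store with a SPARQL endpoint; the wildcard match above only stands in for the simplest kind of lookup such a store performs.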


SEMANTICS IN ONEM2M STANDARDS

In this section, we describe how the semantic IoT features mentioned in the previous section can be realized in an IoT service-layer platform.

oneM2M Resources for Storing Semantic Information

The two basic logical entities that play a major role in the oneM2M system are an Application Entity (AE) and a Common Service Entity (CSE). In the oneM2M architecture, both CSEs and AEs can reside within different nodes, such as an Infrastructure Node (IN) for a server platform, a Middle Node (MN) for a gateway, and an Application Service Node (ASN) and an Application Dedicated Node (ADN) for a constrained device.10 The AE is the logical entity that provides the application's business logic. It is used for hosting sensors and applications; an AE that resides in an Application Dedicated Node is called an ADN-AE. The IN-CSE entity, on the other hand, is hosted on a server. The CSE functionality is provided for utilization by various AE resources.

oneM2M adopted a resource-based data model, in which all services are represented as resources. A resource can be uniquely addressed by a Uniform Resource Identifier (URI) and manipulated via create, retrieve, update, delete, and notify operations (CRUD+N).

To enable semantic technologies, the oneM2M service layer defines a semantic descriptor resource, as highlighted in Figure 2. This resource is responsible for storing semantic information related to its parent resource and potentially sub-resources. It is created inside an existing container resource or AE resource of a CSE in the oneM2M resource structure. The contents of this resource can be provided based on ontologies. To facilitate semantic information management, the semantic descriptor resource contains various attributes: ontologyRef for the URI of an ontology, descriptorRepresentation to indicate the format of the semantic information, relatedSemantics to contain the URIs of other related descriptor resources, and descriptor for the semantic information itself. A subscription resource can be added as a child resource by any CSE/AE that expects to receive automatic notifications on the changes of a resource.
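The resource model described here can be sketched as a small URI-addressed store. The following Python example is a rough illustration under stated assumptions: the four attribute names come from the text, while the `SemanticDescriptor` and `ResourceTree` classes, the URI paths, the attribute values, and the callback-based notification are all hypothetical simplifications rather than the oneM2M API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SemanticDescriptor:
    """Attribute names follow the text; the Python class itself is hypothetical."""
    ontologyRef: str                # URI of the ontology used
    descriptorRepresentation: str   # format of the semantic information
    descriptor: str                 # the semantic information itself
    relatedSemantics: List[str] = field(default_factory=list)  # URIs of related descriptors


class ResourceTree:
    """Toy URI-addressed store with create/retrieve/update/delete plus notify."""

    def __init__(self) -> None:
        self._resources: Dict[str, object] = {}
        self._subscribers: Dict[str, List[Callable[[str, str], None]]] = {}

    def subscribe(self, uri: str, callback: Callable[[str, str], None]) -> None:
        # Stand-in for adding a subscription child resource.
        self._subscribers.setdefault(uri, []).append(callback)

    def _notify(self, uri: str, operation: str) -> None:
        for callback in self._subscribers.get(uri, []):
            callback(uri, operation)

    def create(self, uri: str, resource: object) -> None:
        if uri in self._resources:
            raise KeyError(f"{uri} already exists")
        self._resources[uri] = resource
        self._notify(uri, "create")

    def retrieve(self, uri: str) -> object:
        return self._resources[uri]

    def update(self, uri: str, resource: object) -> None:
        if uri not in self._resources:
            raise KeyError(uri)
        self._resources[uri] = resource
        self._notify(uri, "update")

    def delete(self, uri: str) -> None:
        del self._resources[uri]
        self._notify(uri, "delete")


tree = ResourceTree()
events = []
uri = "/in-cse/adn-ae-1/temperature/descriptor"  # made-up path
tree.subscribe(uri, lambda u, op: events.append((op, u)))
tree.create(uri, SemanticDescriptor(
    ontologyRef="http://example.org/ontology",        # placeholder URI
    descriptorRepresentation="application/n-triples",  # placeholder value
    descriptor="<ex:sensor1> <ex:hasUnit> <ex:Celsius> .",
))
descriptor = tree.retrieve(uri)
descriptor.relatedSemantics.append("/in-cse/adn-ae-2/temperature/descriptor")
tree.update(uri, descriptor)
```

After these calls, `events` records one create and one update notification for the descriptor, which mirrors the "N" in CRUD+N in a very reduced form.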

Figure 2. oneM2M resource structure.

Consider an example of the semantic information management process, in which two sensors measure temperature in different units and a smartphone application makes a discovery request for relevant semantic information.12 These two sensors are represented as ADN-AE-1 and ADN-AE-2 in an IoT platform server and periodically store measured temperature values in the server. Each measured temperature value is stored in a resource for that sensor reading, with a semantic descriptor as its child resource. The semantic descriptor resource stores the semantic information about the temperature sensor reading and the measured value. Once the semantic descriptor resource is created, the smartphone application (i.e., ADN-AE-3) sends a semantic discovery request, which contains a semantic filter, to the IN-CSE. The IN-CSE then uses the semantic filter to discover the desired resources. After this, ADN-AE-3 receives a response in the form of unique resource identifiers. Based on the returned list of unique resource identifiers, ADN-AE-3 can make another request to the IN-CSE to retrieve one or more semantic descriptor resources.

oneM2M Base Ontology

In general, information and operations in each IoT system can be described by ontologies, which provide a vocabulary with a structure. These ontologies (with OWL representations) can be used to support interoperability between different systems via ontology integration or mapping.12 For this purpose, oneM2M has defined its own ontology called the oneM2M Base Ontology. Various external ontologies from other IoT systems can be mapped to the oneM2M Base Ontology (e.g., by sub-classing and equivalence) so that interworking between the oneM2M system and external systems can be achieved. The oneM2M Base Ontology contains Classes (i.e., sets of individuals) and Properties (i.e., relationships and links between individuals), but no instances, since the Base Ontology only supports a semantic description of these entities in the oneM2M architecture.
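Mapping by sub-classing and equivalence can be pictured as following class links from an external ontology into the base ontology. The sketch below, in Python, is only an illustration: the `ext:` and `base:` class names are invented placeholders and are not taken from the oneM2M Base Ontology, and the function name is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple


def reachable_classes(cls: str,
                      sub_class_of: Dict[str, List[str]],
                      equivalent: Iterable[Tuple[str, str]]) -> Set[str]:
    """Collect every class that `cls` belongs to via subclass and equivalence links."""
    seen = {cls}
    stack = [cls]
    pairs = list(equivalent)
    while stack:
        current = stack.pop()
        # Subclass links point upward only.
        neighbours = set(sub_class_of.get(current, []))
        # Equivalence is symmetric, so follow it in both directions.
        for left, right in pairs:
            if left == current:
                neighbours.add(right)
            elif right == current:
                neighbours.add(left)
        for nxt in neighbours - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen


# Placeholder class names, not taken from any published ontology.
SUB_CLASS_OF = {
    "ext:Thermometer": ["ext:SensingDevice"],
    "base:Device": ["base:Thing"],
}
EQUIVALENT = [("ext:SensingDevice", "base:Device")]

classes = reachable_classes("ext:Thermometer", SUB_CLASS_OF, EQUIVALENT)
```

With this mapping in place, a query phrased in base-ontology terms (here `base:Device`) would also cover resources annotated with the external class `ext:Thermometer`.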

Semantic Annotation

Semantic annotation, which is the first step toward a semantic IoT system, is the process of adding semantic information to resources in oneM2M IoT platforms so that an annotated resource can be discovered semantically by heterogeneous IoT applications. In the oneM2M system, semantic information is represented using RDF/RDFS (or OWL) as RDF triples. Since the oneM2M system uses a hierarchical tree structure to store and manage its resources, semantic information is added as a special semantic resource. For this purpose, an IoT semantic annotator (IoT-SA) is introduced that runs within the oneM2M IoT system to automatically annotate semantic information for various resources representing sensors/devices registered to the oneM2M system, in the following five steps:

1. As inputs to the IoT-SA, users/admins select the IoT resource(s) to be annotated from the IoT platform and choose the ontology or ontologies to be used during this annotation.
2. The IoT-SA then parses the given ontology to retrieve its classes and properties. The IoT-SA also retrieves other resources having related semantic information from the platform as candidate resources to establish relationships. The related semantic information is retrieved from the relatedSemantics attribute, which contains URIs of other linked descriptor resource(s).
3. Users/admins repeat a process to define semantic information in a triple format (i.e., subject → predicate → object) based on the given classes and attributes/properties from the given ontology.
4. Selected resources and semantic information are then converted into the defined RDF format, and the IoT-SA uploads the encoded RDF triples to the semantic descriptor resource under the target resource.
5. The semantically annotated resources can now be discovered by IoT applications. The updated semantic information can also be seen by users/admins for other purposes.
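The annotation steps above can be sketched in a few lines of Python. This is a simplified illustration, not the IoT-SA implementation: the function names, the `platform` dictionary standing in for the resource tree, the `/descriptor` path segment, and the `http://example.org/` terms are all assumptions made for the example. The triples are written in N-Triples syntax (IRIs in angle brackets, literals in double quotes).

```python
from typing import Dict, Iterable, List, Tuple

# subject IRI, predicate IRI, object, and whether the object is a literal
Statement = Tuple[str, str, str, bool]


def to_ntriples(statements: Iterable[Statement]) -> str:
    """Serialize statements as N-Triples lines."""
    lines = []
    for subject, predicate, obj, is_literal in statements:
        if is_literal:
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            obj_term = f'"{escaped}"'
        else:
            obj_term = f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {obj_term} .")
    return "\n".join(lines)


def annotate(target_uri: str,
             ontology_properties: Iterable[str],
             statements: List[Statement],
             platform: Dict[str, str]) -> str:
    """Steps 2 to 4 in miniature: check against the ontology, encode, upload."""
    allowed = set(ontology_properties)
    for _, predicate, _, _ in statements:
        if predicate not in allowed:
            raise ValueError(f"{predicate} is not defined in the chosen ontology")
    descriptor_uri = f"{target_uri}/descriptor"  # hypothetical child path
    platform[descriptor_uri] = to_ntriples(statements)
    return descriptor_uri


EX = "http://example.org/"  # placeholder namespace
platform: Dict[str, str] = {}
descriptor_uri = annotate(
    "/in-cse/adn-ae-1/temperature",
    [EX + "hasUnit", EX + "hasValue"],
    [
        (EX + "sensor1", EX + "hasUnit", EX + "Celsius", False),
        (EX + "sensor1", EX + "hasValue", "21.5", True),
    ],
    platform,
)
```

Rejecting predicates that the chosen ontology does not define is a design choice of this sketch; it reflects step 3, where triples are built from the ontology's own classes and properties.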


Figure 3. Semantic resource discovery procedures in oneM2M. Once a query is targeted to a resource, the platform determines the scope of the semantic query and executes it against all semantic information. [Diagram: an ADN-AE (semantic IoT application) sends a SPARQL query statement to the IN-CSE server. The IN-CSE performs Semantic Graph Scoping (SGS) over the semantic descriptors (SD_1 to SD_n) attached to normal resources in its resource tree, executes the query on the resulting RDF data basis, and returns the results as a list of resources.]

Semantic Resource Discovery and Semantic Query

One of the key benefits of semantic descriptions is to enable semantic resource discovery. Semantic resource discovery is a capability that lets an IoT application discover resources based on certain specified characteristics of the resources it is interested in. Semantic resource discovery can be achieved by using a SPARQL query. Figure 3 shows the semantic discovery procedures in oneM2M. After a semantic query is executed in the oneM2M platform, an IoT application is notified of the discovered resources and can retrieve the desired resources based on the returned URIs. Specifically, a semantic filter is specified in oneM2M, which is formulated as a SPARQL query and contained in a semantic resource discovery request. An IoT application that wants to discover resources using semantics has to form a semantic query statement using SPARQL based on its needs.

When a SPARQL query is received targeting a specific resource (a.k.a. the target resource), the receiver (i.e., the IN-CSE in Figure 3) performs Semantic Graph Scoping (SGS) to decide the scope of the SPARQL query execution (i.e., to formulate an RDF basis for executing the SPARQL query). Semantic descriptors that are distributed across the IoT platform's resource structure are collected together to formulate a complete RDF data basis.
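The scoping-then-query procedure can be illustrated with a small Python sketch. It is a stand-in under explicit assumptions, not SPARQL and not the oneM2M procedure itself: scoping is reduced to "all descriptors at or below the target URI", the query is a list of triple patterns in which terms starting with `?` are variables, and every URI and `ex:` term is made up.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


def scope_graph(descriptors: Dict[str, List[Triple]], target_uri: str) -> List[Triple]:
    """Simplified graph scoping: merge descriptors found at or below the target."""
    prefix = target_uri.rstrip("/") + "/"
    graph: List[Triple] = []
    for uri in sorted(descriptors):
        if uri == target_uri or uri.startswith(prefix):
            graph.extend(descriptors[uri])
    return graph


def solve(patterns: List[Triple], graph: List[Triple],
          binding: Optional[Dict[str, str]] = None) -> List[Dict[str, str]]:
    """Match triple patterns against the graph; '?name' terms are variables."""
    binding = binding or {}
    if not patterns:
        return [binding]
    first, rest = patterns[0], patterns[1:]
    results: List[Dict[str, str]] = []
    for triple in graph:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or reject if it is already bound differently.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.extend(solve(rest, graph, candidate))
    return results


# Made-up descriptors spread over a resource tree.
DESCRIPTORS = {
    "/in-cse/ae-1/sd_1": [("ex:sensor1", "rdf:type", "ex:TemperatureSensor"),
                          ("ex:sensor1", "ex:hasUnit", "ex:Celsius")],
    "/in-cse/ae-2/sd_2": [("ex:sensor2", "rdf:type", "ex:TemperatureSensor"),
                          ("ex:sensor2", "ex:hasUnit", "ex:Fahrenheit")],
    "/other-cse/ae-9/sd_9": [("ex:sensor9", "rdf:type", "ex:TemperatureSensor"),
                             ("ex:sensor9", "ex:hasUnit", "ex:Celsius")],
}

graph = scope_graph(DESCRIPTORS, "/in-cse")
query = [("?s", "rdf:type", "ex:TemperatureSensor"),
         ("?s", "ex:hasUnit", "ex:Celsius")]
matches = [b["?s"] for b in solve(query, graph)]
```

Because the query targets `/in-cse`, the descriptor under `/other-cse` never enters the RDF basis, which is the effect scoping is meant to have. A real platform would return resource URIs rather than the bound subjects shown here.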
Semantic Mashup

Semantic mashup is a process to discover and collect data from more than one IoT data source and to apply relevant business logic to the collected data to generate meaningful mashup results. For example, consider a case where users are interested in a service called “weather comfort index,” which expresses a satisfaction level regarding weather conditions. The comfort levels can be calculated based on the temperature and humidity sensors deployed in a specific location together with additional weather conditions; this is a mashup process and can be provided as a mashup service by an IoT platform. oneM2M specifies a semantic mashup service, which is implemented via a set of mashup procedures as shown in Figure 4. To use a mashup service, an IoT application should first discover the corresponding SMJP (i.e., a Semantic Mashup Job Profile resource as defined in oneM2M) (Step 1). An SMJP describes the profile and necessary information required for a specific mashup service, such as input parameters, member resources, mashup function, and output parameters. The SMJP resource shall contain the child resources defined for it in the oneM2M functional architecture.13 Based on the profile described in the SMJP, Originators (e.g., AEs) can create corresponding semantic mashup instances, where semantic mashup results will be generated and stored. The Mashup Requestor may then retrieve the mashup result.


Figure 4. IoT semantic mashup procedures in oneM2M. Each specific mashup service is described by a Semantic Mashup Job Profile (SMJP), which defines all required elements (e.g., types of input parameters, types of member resources, mashup operations or business logic) in RDF triples by this mashup service. [Diagram: the IoT application performs (1) SMJP discovery and (2) SMI creation; the oneM2M semantic mashup function performs (3) data source identification among oneM2M resources R1 to R6 and (4) data collection and result generation; the IoT application then performs (5) mashup result retrieval.]

Based on the discovered SMJP, the next step is for the IoT application to create a Semantic Mashup Instance (SMI) resource, for example, by giving appropriate input parameters and member resources (Step 2). The SMI resource is used to contain input parameters, member resources, and any generated mashup results. In short, the SMJP provides guidance on how an SMI shall be created and how the mashup result shall be calculated. The third step is for the IoT platform (i.e., the CSE in oneM2M) to discover and collect original data from each member resource (e.g., via semantic resource discovery procedures) (Step 3). After the data is collected from the identified member resources (i.e., data sources), the IoT platform calculates the mashup result according to the business logic described in the SMJP (Step 4). The generated semantic mashup result is stored in the SMI, which can be retrieved by the IoT application or other entities (Step 5).
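The five mashup steps can be walked through with a short Python sketch. Everything here is an assumption made for illustration: the dictionary layout of the job profile and instance, the function names, the sample readings, and especially the comfort thresholds, which are arbitrary and not a oneM2M or meteorological definition.

```python
from typing import Dict


def toy_comfort(readings: Dict[str, float]) -> str:
    """Placeholder business logic; the thresholds are arbitrary."""
    temperate = 18.0 <= readings["temperature_c"] <= 26.0
    not_humid = 30.0 <= readings["humidity_pct"] <= 60.0
    return "comfortable" if temperate and not_humid else "uncomfortable"


# Step 1: the application discovers a job profile (SMJP) describing the mashup.
SMJP = {
    "inputs": ["location"],
    "member_types": ["temperature_c", "humidity_pct"],
    "function": toy_comfort,
}

# Resources held by the platform, keyed by (location, type); values are made up.
RESOURCES = {
    ("room-a", "temperature_c"): 22.0,
    ("room-a", "humidity_pct"): 45.0,
    ("room-b", "temperature_c"): 31.0,
    ("room-b", "humidity_pct"): 70.0,
}


def create_smi(smjp: dict, location: str) -> dict:
    """Step 2: instantiate the profile with concrete input parameters."""
    return {"profile": smjp, "inputs": {"location": location}, "result": None}


def run_mashup(smi: dict, resources: dict) -> None:
    """Steps 3 and 4: collect member data, then apply the profile's function."""
    location = smi["inputs"]["location"]
    collected = {
        kind: resources[(location, kind)]
        for kind in smi["profile"]["member_types"]
    }
    smi["result"] = smi["profile"]["function"](collected)


smi = create_smi(SMJP, "room-a")
run_mashup(smi, RESOURCES)
result = smi["result"]  # Step 5: the application retrieves the stored result.
```

Keeping the business logic inside the profile and the inputs and result inside the instance follows the division of roles described in the text: the profile says how to compute, the instance records one concrete computation.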

CONCLUSION

There is a strong need to resolve the interoperability issue in the IoT service layer using semantic technologies inspired by the Semantic Web. This article described a semantic-enabled IoT service layer architecture based on the oneM2M global IoT service layer standards. In this architecture, semantic descriptor resources are introduced to represent semantic information in RDF triples. The semantic descriptor resource allows an IoT service layer or an IoT application to annotate existing IoT resources/data with additional semantic information using selected ontologies and RDF/RDFS. The added semantic information is then leveraged for semantic filtering/discovery and semantic mashup.

The proposed semantic-enabled IoT architecture also supports a semantic repository to maintain all semantic information in a centralized triple store. A SPARQL query can then be executed directly on the triple store against the semantic information stored there. Future work includes advanced semantic annotation with other data models, information synchronization between the oneM2M service layer resource structure and the triple store, distributed semantic analytics and other functions, as well as interoperability with other standards.

ACKNOWLEDGMENT

This work was supported by the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. B0184-15-1003, No. B0184-15-1001).


REFERENCES

1. D. Bandyopadhyay and J. Sen, “Internet of Things: Applications and Challenges in Technology and Standardization,” Wireless Personal Communications, vol. 58, no. 1, 2011, pp. 49–69.
2. C. Perera et al., “Context Aware Computing for The Internet of Things: A Survey,” IEEE Communications Surveys & Tutorials, vol. 16, no. 1, 2014, pp. 414–454.
3. T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, vol. 284, no. 5, 2001.
4. A. Palavalli, D. Karri, and S. Pasupuleti, “Semantic Internet of Things,” IEEE Tenth International Conference on Semantic Computing (ICSC 16), 2016, pp. 91–95.
5. M. Palattella et al., “Standardized Protocol Stack for the Internet of (Important) Things,” IEEE Communications Surveys & Tutorials, vol. 15, no. 3, 2013, pp. 1389–1406.
6. P. Sethi and S. Sarangi, “Internet of Things: Architectures, Protocols, and Applications,” Journal of Electrical and Computer Engineering, 2017, pp. 1–25.
7. E. Kovacs et al., “Standards-Based Worldwide Semantic Interoperability for IoT,” IEEE Communications Magazine, vol. 54, no. 12, 2016, pp. 40–46.
8. A. Sheth, C. Henson, and S.S. Sahoo, “Semantic Sensor Web,” IEEE Internet Computing, vol. 12, no. 4, 2003, pp. 78–83.
9. M.N. Meenachi and S.M. Babi, “A Survey on Usage of Ontology in Different Domain,” International Journal of Applied Information Systems, vol. 4, no. 2, 2012, pp. 46–55.
10. J. Swetina et al., “Toward a Standardized Common M2M Service Layer Platform: Introduction to oneM2M,” IEEE Wireless Communications, vol. 21, no. 3, 2014, pp. 20–26.
11. E. Kovacs et al., “Standards-Based Worldwide Semantic Interoperability for IoT,” IEEE Communications Magazine, vol. 54, no. 12, 2016, pp. 40–46.
12. “Developer Guide: Implementing Semantics,” oneM2M, TR-0045, vol. v.1.0.0, November 2017.
13. “Functional Architecture,” oneM2M, TS-0001, vol. v.3.10.0, February 2018.

ABOUT THE AUTHORS

Komal Gilani is an MSc student at Sejong University. Her research interests include IoT semantics, smart cities, and industrial IoT. She has a BS in computer science from Arid Agriculture University Rawalpindi. Contact her at [email protected].

Jaeho Kim is a principal engineer at Korea Electronics Technology Institute. His research interests include IoT, smart cities, and digital twin technologies. He has a PhD in electrical and electronic engineering from Yonsei University. Contact him at [email protected].

JaeSeung Song is an associate professor at Sejong University. His research interests include IoT semantics, software testing, and industrial IoT. He has a PhD in computing from Imperial College London. Contact him at [email protected].

Dale Seed is a principal engineer at InterDigital Communications. His research interests include IoT protocols and services. He has an MS in computer science and engineering from the Pennsylvania State University. Contact him at [email protected].

Chonggang Wang is a principal engineer at InterDigital Communications. His research interests include IoT, blockchain and distributed ledger technologies, and AI-powered future networking and computing. He has a PhD in computer science from Beijing University of Posts and Telecommunications. Contact him at [email protected].

This article originally appeared in IEEE Internet Computing, vol. 22, no. 4, 2018.

INTERNET OF THINGS
Editor: Amit Sheth, Kno.e.sis, the Ohio Center of Excellence in Knowledge-Enabled Computing, [email protected]

Toward a Machine Intelligence Layer for Diverse Industrial IoT Use Cases

Jan Höller and Vlasios Tsiatsis, Ericsson
Catherine Mulligan, Imperial College London

he Internet of Things (IoT) has moved be- is no formal defi nition of machine intelligence, for yond the hype, and today we see promising the purposes of this article we defi ne it as the com- T bination of machine learning and artifi cial intel- applications materializing and industries trans- ligence technologies, which includes data analyt- forming through well-known digitalization as well ics, symbolic reasoning, and action planning. It is as servitization—that is, delivering a service as an expected that the value created by IoT lies mainly integral part of a product. This is evident in the in the services provided by machine intelligence increasing number of physical industry assets rep- rather than devices and connectivity services. As resented and manipulated in both the digital and a result, this article focuses on sharing our experi- physical worlds and in the fact that the business ence with the analysis of several IoT use cases and models for physical and digital assets are converg- a machine intelligence framework that combines ing toward service as opposed to product sales. knowledge of solution design for them. IoT can be used in many industry sectors with numerous benefi ts. Cost optimization and envi- Approach ronmental effi ciency are just two factors driving The main goal of building replicable solu- this expansion. Examples of IoT applications in- tions can be likened to the goal of formulating clude predictive maintenance and condition-based a reference architecture that encompasses and monitoring, which are mainly used in industrial encodes the knowledge of numerous solution ar- settings. However, the envisioned IoT applications chitectures. In turn, according to Nick Rozanski are so diverse and include such a broad spectrum and Eóin Woods, a solution architecture can be of technologies that system designers need design described as a set of architecture views or blue- support tools and guidelines. 
This article provides prints, each addressing the concerns of a specifi c a few of these design guidelines in the form of ar- stakeholder. 1 chitecture and design patterns to enable scalable In generating a reference architecture one typi- and replicable solutions rather than point solu- cally follows the design process for a single stake- tions stemming from point problems. As a result, holder concern that is typically expressed with we attempt to structure both the problem (use case one or more use cases. The process is then re- space) and the solution domain (architecture). peated for all possible stakeholder concerns. In IoT involves the instrumentation of physi- the end, all solution architectures are combined cal world assets or infrastructures (collectively in a union. called an entity of interest) with sensors, actua- In this article, we follow the reference archi- tors, and identifi cation devices to enable monitor- tecture process for a subset of stakeholders (end ing and control of these entities. The main high- users); therefore, the union of different solu- level building blocks are devices, connectivity, and tion architectures is a partial version of a refer- distributed machine intelligence. Although there ence architecture called the machine intelligence

64 1541-1672/17/$33.00 © 2017 IEEE IEEE INTELLIGENT SYSTEMS 18 May 2019 Published by thePublished IEEE Computer by the IEEE Society Computer Society 2469-7087/19/$33.00 © 2019 IEEE INTERNET OF THINGS Editor: Amit Sheth, Kno.e.sis—the Ohio Center of Excellence in Knowledge-Enabled Computing, [email protected]

Problem domain Solution domain Toward a Machine Use case patterns Blueprints MI framework

User use cases SLO and workflow Intelligence Layer Pattern #1 Blueprint #1.1 order management ...... Developer use Use case Design Pattern #2 cases taxonomy process for Diverse Industrial Blueprint #1.N1 Task and objective analysis Service provider Pattern #K ...... Data and use cases Task Multi-objective resource Controllers planning optimization IoT Use Cases processing

Insight Jan Höller and Vlasios Tsiatsis, Ericsson generation Catherine Mulligan, Imperial College London Knowledge Object management management

Figure 1. Structuring the problem and solution domains (MI: machine intelligence, SLO: service-level objective).

The Internet of Things (IoT) has moved beyond the hype, and today we see promising applications materializing and industries transforming through well-known digitalization as well as servitization, that is, delivering a service as an integral part of a product. This is evident in the increasing number of physical industry assets represented and manipulated in both the digital and physical worlds and in the fact that the business models for physical and digital assets are converging toward service as opposed to product sales.

IoT can be used in many industry sectors with numerous benefits. Cost optimization and environmental efficiency are just two factors driving this expansion. Examples of IoT applications include predictive maintenance and condition-based monitoring, which are mainly used in industrial settings. However, the envisioned IoT applications are so diverse and include such a broad spectrum of technologies that system designers need design support tools and guidelines. This article provides a few of these design guidelines in the form of architecture and design patterns to enable scalable and replicable solutions rather than point solutions stemming from point problems. As a result, we attempt to structure both the problem (use case space) and the solution domain (architecture).

IoT involves the instrumentation of physical world assets or infrastructures (collectively called an entity of interest) with sensors, actuators, and identification devices to enable monitoring and control of these entities. The main high-level building blocks are devices, connectivity, and distributed machine intelligence. Although there is no formal definition of machine intelligence, for the purposes of this article we define it as the combination of machine learning and artificial intelligence technologies, which includes data analytics, symbolic reasoning, and action planning. It is expected that the value created by IoT lies mainly in the services provided by machine intelligence rather than devices and connectivity services. As a result, this article focuses on sharing our experience with the analysis of several IoT use cases and a machine intelligence framework that combines knowledge of solution design for them.

Approach

The main goal of building replicable solutions can be likened to the goal of formulating a reference architecture that encompasses and encodes the knowledge of numerous solution architectures. In turn, according to Nick Rozanski and Eóin Woods, a solution architecture can be described as a set of architecture views or blueprints, each addressing the concerns of a specific stakeholder.1

In generating a reference architecture one typically follows the design process for a single stakeholder concern that is typically expressed with one or more use cases. The process is then repeated for all possible stakeholder concerns. In the end, all solution architectures are combined in a union.

In this article, we follow the reference architecture process for a subset of stakeholders (end users); therefore, the union of different solution architectures is a partial version of a reference architecture called the machine intelligence framework. Figure 1 illustrates this process, referred to as structuring the solution domain. Other types of stakeholders are developers and service providers. The resulting machine intelligence framework with a different stakeholder will be somewhat different from that of an end user. However, following the same methodology outlined here, frameworks for other stakeholders can be generated.

Because of the huge number of available use cases, with new ones arising all the time, following the design process to generate a single solution architecture is not tractable. Therefore, we also structure the problem domain or the potential use cases. For the purposes of this exercise, we limited the studied use case sources to oneM2M,2 the Industrial Internet Consortium (IIC) use cases and testbeds (www.iiconsortium.org), and National Institute of Standards and Technology big data.3 In the context of this article, structuring the problem domain (Figure 1) means that individual use cases are grouped into use case families or patterns. We have used the use cases' structure to limit the number of times we applied the design process for developing solution blueprints. As a result, instead of generating solution blueprints from each specific use case, we generated architecture blueprints for a representative use case from a family of use cases or a use case pattern. We then iterated the design process over all the identified use case patterns.

Structuring the Problem Domain: Use Case Patterns

We have studied about 100 use cases from different sources, which have varying degrees of detail and are classified or grouped according to market-related terms. The most typical of these are the (market) sector and (market) vertical that a use case belongs to. Although these terms are not well defined in the literature, we have followed the IIC definition and taxonomy.4 According to the IIC, a sector (such as healthcare) is a logical group of related verticals (for example, hospitals), and a vertical is a market in which vendors offer goods and services that meet a particular set of usage, technical, or regulatory requirements.

We focus our study on the end user as the main system stakeholder, as stated earlier. The end user formulates concerns that are concretely described in use cases. These are grouped logically into (use case) applications apart from their market classification into sector and vertical. Our assumption is that a sector contains multiple verticals, each containing multiple applications and multiple use case instances of different use case types. We refer to a use case instance as simply a "use case" (see Figure 2). In turn, each use case expresses a stakeholder concern requiring a desired (technical) system functionality or set of characteristics.

Although this classification is done mainly from a market perspective, we aimed to reshuffle the same set of use cases into another set of groups that emphasize the technical characteristics. We call these groups (technical) use case patterns. We also identified a set of technical characteristics based on which the different (technical) use case patterns can be described; however, for brevity we omit the technical characteristics from the pattern definitions. Through our research, we have identified seven use case patterns.
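As a minimal sketch of the reshuffling described above, the same set of use cases can be grouped once by market sector and once by technical pattern. The use case names and the healthcare/hospitals pair come from this article; the other sector, vertical, and application labels, the pattern assignments, and the names UseCase and group_by are illustrative assumptions, not part of the IIC taxonomy or of our study data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """One use case instance with its market and technical classification."""
    name: str
    sector: str       # market classification, level 1
    vertical: str     # market classification, level 2
    application: str  # logical grouping of use cases
    pattern: str      # technical use case pattern


# Hypothetical assignments, for illustration only.
USE_CASES = [
    UseCase("remote surgery", "healthcare", "hospitals",
            "telemedicine", "remote operations"),
    UseCase("pollution monitoring", "public sector", "cities",
            "environment", "massive monitoring"),
    UseCase("fleet management", "transportation", "road freight",
            "fleet operations", "logistics"),
    UseCase("pickup-delivery services", "retail", "e-commerce",
            "last-mile delivery", "logistics"),
]


def group_by(use_cases, key):
    """Group use case names by one classification attribute."""
    groups = defaultdict(list)
    for use_case in use_cases:
        groups[getattr(use_case, key)].append(use_case.name)
    return dict(groups)


by_sector = group_by(USE_CASES, "sector")    # market view
by_pattern = group_by(USE_CASES, "pattern")  # technical view
```

In this toy data set, two use cases from different sectors end up in the same technical pattern, which is the effect the pattern extraction aims for: one representative use case per pattern can then drive the design process.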

1541-1672/17/$33.00 © 2017 IEEE. Published by the IEEE Computer Society. IEEE Intelligent Systems, July/August 2017.

[Figure 2 diagram: the market classification (a sector contains verticals, a vertical contains applications) is related through technical classification, or pattern extraction, to the use case patterns: massive monitoring, asset management, logistics, remote operations, robots and autonomous machines, infrastructure monitor and control, and device swarms.]

Figure 2. Structuring Internet of Things sectors, verticals, applications, and use cases toward recurring use case patterns.

Massive monitoring. This use case pattern involves numerous sensors deployed across a large geographical area. Data is collected over a period of time for bulk batch or stream analysis. Data analysis aims to find trends, detect anomalies or abnormal situations, or simply learn the behavior of the monitored asset or phenomenon. Typical examples include environmental and climate monitoring and pollution monitoring.

Asset management. This pattern involves managing physical assets that are well-defined and confined. Example assets include a building, vehicle, piece of industrial machinery, turbine, or human patient. Typical management aspects are to optimize the asset's operation, perform diagnostics, or do predictive maintenance.

Logistics. A typical logistics scenario can be described as the process of coordination, management, and orchestration of a collection of tasks in a workflow to achieve a set of goals (such as time or cost optimization) by making efficient use of the available resources. Typical examples include fleet management, supply-chain optimization, and pickup-delivery services.

Remote operations. Remote operation generally refers to the control and operation of a system or equipment from a remote location either by humans or software. In this case, the remotely operated system or equipment cannot or is not designed to operate completely autonomously to accomplish the task. Typical examples include remote mining, remote-controlled vehicles and drones, and remote surgery.

Robots and autonomous machines. This use case pattern covers the operations and management of partially or fully autonomous systems such as robots, vehicles, and drones. Such systems are often described as cyberphysical systems (CPS), which integrate the dynamics of the physical processes with those of the software and networking. Typical examples include static or mobile factory floor robots and autonomous vehicles.

Infrastructure monitor and control. This use case pattern covers management of large-scale industrial or extended infrastructures that need to be monitored and controlled. Examples include a transportation infrastructure such as a national road network, a utility infrastructure such as the electric grid, oil and gas pipelines, and street lighting.

Device swarms. This use case pattern covers devices and systems that operate autonomously with a simple set of rules and no central intelligence, and form peer-to-peer groups to collectively reach a common goal. Typical examples include microgrid producers and consumers and zero-trust computing applications such as home automation.

Structuring the Solution Domain

The solution domain consists mainly of architectural blueprints that include a few main types of system components. This domain encompasses devices, connectivity, cloud and distributed computing, machine intelligence, and mechanisms for visualizing information and integrating applications into an enterprise environment. These blueprints typically concretely express the functional components of a resulting solution. However, a system also typically has a set of nonfunctional characteristics. Through our research we have identified a core set of functional and nonfunctional characteristics that can be grouped into different perspectives: data and information perspectives (multimodality of IoT data, the need for insight and analysis-driven functions, knowledge representation, cognition, and so on); control perspectives, meaning whether a control functionality exists (sensing and actuation in feedback loops, workflow/process-driven); and general characteristics (locality, timing criticality, safety, and security).

A Framework for Distributed Machine Intelligence for Industrial Use Cases

Our approach is to aggregate solution blueprints for the identified recurring use case patterns, thus arriving at our desired framework. As mentioned, our focus in this article is distributed machine intelligence functionality supporting the diversity of IoT use cases we have identified.

For the sake of brevity, we have left out two considerations that are important but secondary to the objective of this article. The first is the general need for distributed processing of machine intelligence logic, and the second is lifecycle management of the solution.

IoT data processing and decision making is generally a highly distributed capability. The need for distributed processing in IoT comes from different requirements.5 Data volumes, cost, performance and latency, autonomous local asset operation, robustness, and safety around IoT asset operation are the main requirements. IoT distributed processing extends beyond the datacenter to constrained IoT devices, and the resulting heterogeneity needs to be managed to meet service-level agreements for the applications.

Lifecycle management includes operational aspects such as the type of logic to deploy and location of the deployment, ensuring necessary robustness, and trust and security. In addition, the system should be adaptive and cognitive to handle changing external requirements or changing contexts. For instance, the system should ideally detect the need to change controller parameters that might be wrongly set or continuously do model training.

Machine Intelligence Functional Domains

The machine intelligence framework provides the functionality needed to realize the use case applications relevant to end users. As such, it processes IoT data from various sources, derives and executes control operations to manipulate the asset or infrastructure via actuators, and maintains a cognitive knowledge base related to the assets. An objective of the framework is to partition functionality into application-independent building blocks. These can then be interconnected to realize the use case applications according to a service-oriented paradigm, lending themselves to microservices implementations.6

The main functional domains, as shown in Figure 3, include the capability to manage data and relevant IoT resources (sensing, actuation, and identification) and process data and extract information (analytics or machine learning); various types of control and execution (controllers or planning tools); and capabilities to manage and represent knowledge relating to the assets and the system. Figure 3 shows the functional domains as well as some of the main interfaces between them.

[Figure 3 diagram: functional domains and the elements shown within them. SLO and workflow order management. Task and objective analysis. Controllers: PID, rule based, actuation dispatch. Task planning: temporal (long term, mid term, short term), plan selection. Multi-objective optimization: tiered (tier 1, tier 2, tier 3), KPI conflict arbitration, KPI tradeoffs. Data and resource processing: data collection, curation, annotation, data/event distribution, resource management. Insight generation: forecasting (long term, mid term, short term), fusion, model selection, model training, anomaly detection, classification. Knowledge management: namespace: real-world model, namespace: X, semantic interoperability. Object management: identification, localization, catalog.]

Figure 3. Framework for machine intelligence supporting a diversity of IoT use cases (KPI: key performance indicator; PID: proportional, integral, and derivative; SLO: service-level objective).

Data and Resource Processing

By data and resources we mean sensor data, actuator services, and their representations. Individual sensor data and actuator control is the raw fabric for interacting with the physical world. Sensor data includes individual data items and events and datastreams from a single sensor. Resources are abstractions of sensors and actuators in the system.

Massive-scale IoT deployments require some key considerations in the collection phase. Data can be received as an event stream with varying speed, volume, and dynamicity over time. Information creation, attribute validation, and verification are necessary steps in the data collection. Data can be received asynchronously or synchronously, depending on the type of application at hand.

Data management, curation, and resource management are crucial in IoT systems and comprise the following steps. First, data is collected and distributed. Then, data and resources are modeled to capture heterogeneity (structured, unstructured), high distribution, large size (number of data sources, streams, and actuator end points), and semantic annotation describing the meaning of the endpoint capabilities. Resources need the proper annotation, describing such attributes as meaning, origin, and quality. Also, transmitted data needs to be filtered to the application needs. IoT data, when needed, will require distributed storage, taking into account factors such as cost and storage capacity.

Real resources require appropriate abstract representations and management in an IoT system, for example, as a representation of a datastream or as an aggregation of sensor data. Resource abstraction implies a level of indirection requiring a resolution function that dynamically maps between the resource representation and the real resource.

Insight Generation

Forecasting is a key issue in the prominent IoT use case of predictive maintenance, which is used to determine the health of a piece of machinery and understand when any maintenance might be needed.

Forecasting involves predicting new outcomes based on previously known results. Depending on the IoT use case, different forecasting timeframes apply. For example, trajectory forecasting of moving objects can be real time, whereas machine degradation is more long term. Forecasting can be data driven or model driven depending on the problem requirements. Model training is a necessity and can be based on training sets or via reinforcement learning. Typical forecasting models can be statistical or neural network-based, Bayesian or non-Bayesian, linear or nonlinear, parametric or nonparametric, univariate or multivariate.

Sensor fusion is another technique. In general, fusion concerns combining data and information from diverse sources so that the resulting information is more accurate than if one had relied on a single source. An example is the localization of an object that can rely on a combination of ultra-wide band (UWB) transponders, camera detection, and contextual information sensed by the object itself, which when fused provide a much higher degree of location accuracy.

Knowledge Management

Knowledge management involves representing, modeling, structuring, and sharing knowledge about a physical asset or infrastructure. Knowledge is the collected set of data, inferred knowledge and insights, and control capabilities of the asset. The knowledge can further be structured so the different data, insights, and control capabilities can be directly mapped to the asset's real-world structure, thus becoming a proper digital representation of the asset.

Knowledge is generally of two types: declarative knowledge (also referred to as propositional knowledge) and procedural knowledge (also referred to as imperative knowledge). Declarative knowledge describes what an entity is and how it is structured, and is formally expressed using ontologies. Procedural knowledge describes how an entity behaves, for example, in response to stimuli; and the formal description format is typically via state machines.
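To make the distinction concrete, the following minimal sketch places the two knowledge types side by side for a single asset. It assumes a hypothetical pump; the dictionary standing in for an ontology, the state and stimulus names, and the functions next_state and replay are illustrative choices rather than part of the framework.

```python
from typing import Iterable

# Declarative knowledge: what the asset is and how it is structured.
# A plain dictionary stands in for an ontology here (an assumption).
PUMP_DESCRIPTION = {
    "type": "pump",
    "part_of": "cooling loop",
    "sensors": ["temperature", "vibration"],
}

# Procedural knowledge: how the asset behaves in response to stimuli,
# written as a state machine: (current state, stimulus) -> next state.
PUMP_TRANSITIONS = {
    ("idle", "start"): "running",
    ("running", "stop"): "idle",
    ("running", "overheat"): "fault",
    ("fault", "reset"): "idle",
}


def next_state(state: str, stimulus: str) -> str:
    """Return the state after one stimulus; undefined pairs change nothing."""
    return PUMP_TRANSITIONS.get((state, stimulus), state)


def replay(initial: str, stimuli: Iterable[str]) -> str:
    """Apply a sequence of stimuli and return the resulting state."""
    state = initial
    for stimulus in stimuli:
        state = next_state(state, stimulus)
    return state


final_state = replay("idle", ["start", "overheat"])
```

Keeping the transition table as data rather than code is one possible design choice: it lets the behavioral description be stored and shared in a knowledge base alongside the structural description of the asset.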

68 IEEE INTELLIGENT SYSTEMS 22 ComputingEdge May 2019 SLO and workflow order management

Knowledge is captured and made Object localization needs to be controlling operations based on input Task and objective analysis available in what can be referred to tailored to the IoT needs and de- from an a priori desired and defined as a knowledge base, which is man- ployment scenarios (such as indoor operational behavior. The use of dif- Controllers Task planning Multi-objective optimization ifested by a set of ontologies. Typi- or outdoor environments). Typical ferent controller types is based on func- Data and resource processing cal ontologies in IoT are not only for indoor localization technologies in- tional and nonfunctional characteris- PID Temporal Tiered KPI conflict arbitration Data the actual real-world model of the clude video or image processing, tics meeting application needs. Curation Long term Tier 1 KPI tradeoffs collection Rule based asset and expert knowledge but also Bluetooth beacons, use of Wi-Fi ac- Whereas many control systems in Mid term Tier 2 Plan selection Actuation knowledge about system and appli- cess points, or UWB ranging. Out- robotics and other real-world continu- dispatch Annotation Short term Tier 3 cation objectives, such as key per- doors, GPS-based localization is ous and industrial systems use propor- Data/event distribution formance indicators (KPIs), a work typically relied on. For any local- tional, integral, and derivative (PID) Resource order, task plans, and constraints of ization solution, the required ac- controls, other IoT use cases, such as management Insight generation the IoT system itself. The real-world curacy, size of area covered, and home automation, often use rule-based Forecasting Fusion model is typically a hierarchical or systems for event-driven control. Long term graph structure. 
A PID controller is a control loop Model selection Anomaly detection Knowledge management Mid term Across domains in IoT, seman- feedback mechanism using a math- Model training Namespace: Short term Classification real-world model tic interoperability is essential for Semantic interoperability ematical function that takes the de- Name space: achieving many business applica- viation between the desired state and 7 enables data and Object management X tions. Semantic interoperability the measure state as input for control. enables data and information to Proportional control means that pro- Identification Catalog Semantic interoperability be shared across domains and un- information to be shared portional feedback of the deviation Localization derstood by systems without need- is provided to determine the control ing manual interpretations on top across domains and value. A derivative part of the devia- Figure 3. Framework for machine intelligence supporting a diversity of IoT use cases (KPI: key performance indicator, of technical details or protocol and tion dampens the error. An integral PID: proportional, integral, and derivative; SLO: service-level objective). syntactic interoperability. Semantic understood by systems part of the deviation provides errors interoperability requires mapping to be removed over time. Examples methods that can be predefined or without needing manual include inverse kinematics for robot of indirection requiring a resolution via reinforcement learning. Typical sharing knowledge about a physical self-learning. The latter requires al- control and temperature control of function that dynamically maps be- forecasting models can be statistical asset or infrastructure. Knowledge gorithms that consider structural, interpretations on top a fluid system. 
This requires knowl- tween the resource representation or neural networks-based, Bayesian is the collected set of data, inferred terminological, and semantic differ- edge about the physical behavior and and the real resource. or non-Bayesian, linear or nonlinear, knowledge and insights, and control ences and similarities. of technical details or properties of the asset controlled. parametric or nonparametric, univar- capabilities of the asset. The knowl- Rule-based controllers are based Insight Generation iate or multivariate. edge can further be structured so the Object Management protocol and syntactic on a set of predefined rules that are Forecasting is a key issue in the Sensor fusion is another technique. different data, insights, and control Object management involves identify- trigger-action pairs, where a trigger prominent IoT use case of predictive In general, fusion concerns combin- capabilities can be directly mapped to ing, localizing, and cataloging physi- interoperability. is a condition and an action is a pre- maintenance, which is used to deter- ing data and information from di- the asset’s real-world structure, thus cal assets that are handled by the IoT defined workflow typically containing mine the health of a piece of machin- verse sources so that the resulting becoming a proper digital representa- system. This is important for some commands to the devices or related ery and understand when any mainte- information is more accurate than tion of the asset. types of use cases, such as logistics real timeliness of location must be services—for example, following the nance might be needed. if one had relied on a single source. Knowledge is generally of two involving transported goods or local- considered. simple logic of “if this, then that.” Forecasting involves predicting An example is the localization of types: declarative knowledge (also re- ization of tools on a factory floor. 
A catalog function can also be re- In sufficiently complex, dynamic, new outcomes based on previously an object that can rely on a combi- ferred to as propositional knowledge) Object identification is possible us- quired. This function works as a re- and nondeterministic situations one known results. Depending on the IoT nation of ultra-wide band (UWB) and procedural knowledge (also re- ing various techniques, such as tags pository of all assets of interest and can enhance the usability and main- use case, different forecasting time- transponders, camera detection, and ferred to as imperative knowledge). based on optical or radio technolo- includes other properties of the asset. tainability of both PID and rule- frames apply. For example, trajectory contextual information sensed by Declarative knowledge describes gies (for example, QR codes or RFID EPCIS is an example.8 based control systems by making forecasting of moving objects can be the object itself, and when fused pro- what an entity is and how it is struc- tags). The purpose is to uniquely them use task planning technologies real time, whereas machine degrada- vide a much higher degree of location tured and formally expressed using identify and name objects. A resolu- Controllers to help infer the actions to be taken. tion is more long term. Forecasting accuracy. ontologies. Procedural knowledge de- tion infrastructure is usually in place Control is a core automation point in can be data driven or model driven scribes how an entity behaves, for ex- to find information about the object. any IoT system involving actuators. Task Planning depending on the problem require- Knowledge Management ample, in response to stimuli; and the A prominent example is electronic Control software commands the assets’ Task planning can be defined as the ments. Model training is a necessity Knowledge management involves rep- formal description format is typically product code information services desired behavior. 
Common to all con- process of generating a sequence of ac- and can be based on training sets or resenting, modeling, structuring, and via state machines. (EPCIS).8 trollers is the deterministic behavior of tions with certain objectives. Planning

can be applied to a variety of problems such as route planning of autonomous vehicles, optimization of logistics flows, and automation of field personnel.

The planning problem is normally represented by three key elements: states, actions, and goals. The state identifies the model of the world, actions represent different operations that affect the system’s state, and goals are states to achieve or maintain. Deriving the task plan means taking the current state, the desired state, and the possible actions and from them generating a plan as a sequence of possible or proposed actions. A plan can also be a partially ordered list of tasks. One possible way to perform task planning is to use AI planners, where the world and the problem are modeled using a planning domain definition language (PDDL).

Multiobjective Optimization

Automating complex system operations by leveraging data-driven strategies designed to analyze alternatives under multiple conflicting views or KPIs is challenging. First, KPI evaluations are not always reliable and might be subject to changes over time; second, the costs incurred in adapting solutions under operation must be accounted for. In such cases, it is difficult to track how the underlying tradeoffs (such as return versus risk or throughput versus cost) will evolve over time, and decision-making preferences are hard to elicit and represent computationally. In the absence of clear preferences and priorities over the KPIs, general problem-solving strategies and architectures must be designed for automating general data-driven multiobjective optimization (MOO) systems under uncertainty.

MOO can play a key role in applications where conflict resolution is expected. For instance, in supply-chain control applications, the proposed system can monitor the profitability for the whole chain as well as the overall product shortage risk. Those two KPIs are clearly in conflict, as optimization at an extreme for one results in a risk for the other.

A key difference between task planning and optimization is that the latter does not assume that desired goal states will be input by the system stakeholders. This stems from the fact that it can be impossible for humans to cope with the underlying complexity of explicitly specifying goal states while simultaneously fulfilling all service-level objectives (SLOs). In such cases, it is possible to leverage simulation-based MOO to automatically explore the space of all candidate goal states that not only fulfill all SLOs but actually surpass them and deliver outstanding performance.

Service Level Objectives and Workflow Management

The end user’s interests in the system can be specified as a set of high-level, quantifiable performance metrics by SLOs and workflow orders. SLOs are translated into KPIs, which are deemed critical for verifying service execution and detecting deviations from SLOs. KPIs can further be broken down to needed insights and, together with workflow orders, the intentions or actions of the system. The insights and actions can then be used to define the needed sensor data and actuator controls. For instance, in a logistics use case, a workflow order can request that a number of products be delivered to a certain subset of retailers within a specified deadline to keep shortage risk under the agreed levels.

The KPIs and workflow orders encapsulate information that allows the extraction of inputs to task planning, controllers, and MOO, which also includes the necessary information from the insight generation functional domain. For task planning, the extracted inputs should correspond to goal states that can be used to compute an appropriate plan. For controllers, workflow orders might specify new set levels of parameters or rules. For multiobjective optimizers, workflow orders should specify a set of KPIs to be balanced by automated tradeoff analysis to comply with overall service objectives as well as mitigate conflicts.

IoT is about the digital representation of the physical world to enable the digitalization and servitization of physical assets or entities of interest. Since the application spread in today’s IoT is wide and is typically structured in market-oriented groups, a system designer needs IoT system design patterns to assist in designing for scalable and replicable solutions. The work presented here provides a generic blueprint for designers to jumpstart the design process of an unknown use case.
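The task-planning formulation described above (a current state, a goal state, and a set of possible actions that yield a sequence of actions) can be illustrated with a small sketch. This is not the authors' planner and it does not use PDDL; it is a plain breadth-first search over a hypothetical logistics domain, and every fact and action name (`truck_at_depot`, `load_truck`, and so on) is an assumption made up for illustration.

```python
from collections import deque

# Hypothetical logistics domain (illustrative assumption, not from the article).
# A state is a set of facts that are currently true.
# Each action maps to (preconditions, facts added, facts removed).
ACTIONS = {
    "load_truck": ({"truck_at_depot", "goods_at_depot"},
                   {"goods_on_truck"}, {"goods_at_depot"}),
    "drive_to_shop": ({"truck_at_depot"},
                      {"truck_at_shop"}, {"truck_at_depot"}),
    "unload_truck": ({"truck_at_shop", "goods_on_truck"},
                     {"goods_at_shop"}, {"goods_on_truck"}),
}


def plan(initial, goal, actions=ACTIONS):
    """Breadth-first search for a shortest action sequence that reaches the goal."""
    start = frozenset(initial)
    goal = frozenset(goal)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, steps = queue.popleft()
        if goal <= state:  # every goal fact holds in this state
            return steps
        for name, (pre, add, delete) in actions.items():
            if pre <= state:  # action is applicable
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [name]))
    return None  # no plan exists


steps = plan({"truck_at_depot", "goods_at_depot"}, {"goods_at_shop"})
print(steps)
```

Real AI planners use far more efficient search and richer domain descriptions; the sketch only shows how states, actions, and goals fit together.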

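The supply-chain conflict mentioned in the multiobjective optimization discussion (profitability versus product shortage risk) can be sketched as a Pareto filter: instead of asking for one goal state, the optimizer keeps every candidate that no other candidate beats on both KPIs. The candidate names and numbers below are invented for illustration, and the simple pairwise comparison is an assumption rather than the method used in the article.

```python
def dominates(a, b):
    """True if a is no worse than b on both KPIs and strictly better on one.

    Profit is to be maximized; shortage risk is to be minimized.
    """
    no_worse = (a["profit"] >= b["profit"]
                and a["shortage_risk"] <= b["shortage_risk"])
    better = (a["profit"] > b["profit"]
              or a["shortage_risk"] < b["shortage_risk"])
    return no_worse and better


def pareto_front(options):
    """Keep only the options that no other option dominates."""
    return [o for o in options
            if not any(dominates(p, o) for p in options)]


# Hypothetical stocking policies with made-up KPI values.
candidates = [
    {"name": "lean_stock", "profit": 120, "shortage_risk": 0.30},
    {"name": "balanced", "profit": 100, "shortage_risk": 0.10},
    {"name": "safety_stock", "profit": 70, "shortage_risk": 0.02},
    {"name": "wasteful", "profit": 60, "shortage_risk": 0.15},
]

front_names = [o["name"] for o in pareto_front(candidates)]
print(front_names)
```

Only `wasteful` is dropped, because `balanced` is better on both KPIs; the remaining three are genuine tradeoffs that a stakeholder or an automated tradeoff analysis would still have to choose among.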
References

1. N. Rozanski and E. Woods, Software Systems Architecture: Working with Stakeholders Using Viewpoints and Perspectives, 2nd ed., Addison-Wesley, 2011.
2. ETSI, oneM2M Use Case Collection, ETSI TR118501v1.0.0, tech. report, 2015; www.etsi.org/deliver/etsi_tr/118500_118599/118501/01.00.00_60/tr_118501v010000p.pdf.
3. Nat’l Inst. of Standards and Technology, Big Data Interoperability Framework: Volume 3, Use Cases and General Requirements, NIST Special Publication 1500-3, 2015; http://dx.doi.org/10.6028/NIST.SP.1500-3.
4. R.A. Martin and A. Soellinger, “The Emerging IIC Verticals Taxonomy Landscape,” IIC J. Innovation, June 2016; www.iiconsortium.org/news/joi-articles/2016-June-The-Emerging-IIC-Verticals-Taxonomy-Landscape.pdf.
5. OpenFog Consortium, OpenFog Reference Architecture for Fog Computing, tech. report OPFRA001.020817, 2017; www.openfogconsortium.org/ra.
6. J. Lewis and M. Fowler, “Microservices: A Definition of This New Architectural Term,” MartinFowler.com, 25 Mar. 2014; https://martinfowler.com/articles/microservices.html.
7. M. Serrano et al., “IoT Semantic Interoperability: Research Challenges, Best Practices, Recommendations and Next Steps,” European Research Cluster on the Internet of Things (IERC), 2015; www.internet-of-things-research.eu/pdf/IERC_Position_Paper_IoT_Semantic_Interoperability_Final.pdf.
8. EPC Information Services (EPCIS) Standard, release 1.2, GS1, 2016; http://www.gs1.org/sites/default/files/docs/epc/EPCIS-Standard-1.2-r-2016-09-29.pdf.

Jan Höller is a research fellow at Ericsson Research, Sweden. His research interests include industrial Internet of Things systems, digital transformation, and machine intelligence in autonomous systems. Höller has an MSc in engineering physics from Lund Institute of Technology. He serves on the Board of Directors of the IP for Smart Objects Alliance. Contact him at jan.holler@ericsson.com.

Vlasios Tsiatsis is a senior researcher at Ericsson Research, Sweden, and an Internet of Things architect. His research interests include IoT, cloud, and analytics security. Tsiatsis has a PhD in electrical engineering from the University of California, Los Angeles. He is a member of IEEE and ACM. Contact him at [email protected].

Cathy Mulligan is a research fellow at Imperial College London and vice chair for ETSI Industry Specification Group for Context Information Management. Her research interests include digital technologies and their impact on industrial structures. Mulligan has a PhD in engineering from the University of Cambridge. She is a member of IEEE, IET, and ACM. Contact her at [email protected].

This article originally appeared in IEEE Intelligent Systems, vol. 32, no. 4, 2017.

THE FUTURE TODAY

EDITOR: BRIAN DAVID JOHNSON, Frost and Sullivan; [email protected]

Self-Managing Real Estate

Nathan Shedroff, Seed Vault Ltd.

Blockchain is poised to make a sea change in just about every industry and business, even real estate.

Remember that folder with all the important papers related to your house? You know, the one with your mortgage and insurance documents, the foundation repair bill, the estimate to redo the electrical for that home theater, and the map showing that your neighbor’s driveway is actually [?] ft. on your side of the property line? Very soon, a new technology called blockchain might allow the house itself to track what happens to it, so that you (and subsequent owners) don’t miss anything. In fact, blockchain is poised to make a sea change in just about every industry and business.

Blockchain is an open, peer-to-peer (meaning shared) ledger of transactions. It’s accounting, but the ledger doesn’t live in a central place; it’s distributed and supported by everyone’s systems. So to authenticate a transaction, more than one node has to “see” the transaction occur and agree that it’s accurate. Only then is the transaction added to the ledger. Once the block (transaction page) is filled, it’s permanently recorded, and a new block is started. The block can’t be changed once it’s verified; it’s part of the permanent record. This means that every single transaction can be retrieved forever, creating a radical new kind of transparency. And everyone should be considering how such transparency will affect their business and their industry.

But back to your house. Blockchain technologies are governed by software code called smart contracts. Think of this code as self-running rules that automate all of the processes. If designed correctly, a business can practically manage itself based on the self-running code.

Therefore, blockchain technology could enable your house to manage its own transactions. It’s not going to call the plumber (yet), but it could manage all of those documents. Sometime in the next [?] years, rather than relying on a real estate agent or current owner to share the provenance of a property, you might be able to submit a query and get a report on every transaction for a house or condo: every repair, change of hands, valuation estimate, tax assessment, redistricting, construction document, and lien. You’ll no longer wonder whether that house is built on an old burial ground or whether the agent failed to mention the meth lab the past owners operated. The blockchain record will tell you, and you’ll be able to trust it. And the house will continue collecting and recording this information through a network of blockchain-enabled services, including identity, storage, and transaction tokens. Your home and its relationships will be recorded for all to see, which might be liberating or creepy, or both.

We don’t normally think of objects as having agency (being entitled to act and make decisions for themselves), though we all know they have a history. This first step is an extension of that history: making it visible to all. The next step, giving objects autonomy, is the subject for another column entirely.

NATHAN SHEDROFF is executive director of the non-profit Seed Vault Ltd., an independent bot blockchain community. Contact him at nathan@nathan.com.

This article originally appeared in Computer, vol. 51, no. 1, 2018.

May 2019 · Published by the IEEE Computer Society · 2469-7087/19/$33.00 © 2019 IEEE
Computer · Published by the IEEE Computer Society · 0018-9162/18/$33.00 © 2018 IEEE


COLUMN: IT TRENDS

Blockchain in Developing Countries

Nir Kshetri, University of North Carolina at Greensboro
Jeffrey Voas, IEEE Fellow

Editor: Jeffrey Voas, NIST; [email protected]

A large portion of the population in the developing world can benefit from blockchain technologies. According to the ICT Facts and Figures 2017 report, 42.9 percent of households in developing countries have Internet access.1 This percentage is rising quickly due to the increasing affordability and usability of smartphones. It can be argued that in many ways, blockchain has a much higher value proposition for the developing world than for the developed world. Why? Because blockchain has the potential to make up for a lack of effective formal institutions: rules, laws, regulations, and their enforcement. In this article, we will discuss key concerns regarding institutions in the developing world and evaluate the potential use of blockchain to address them.

PROPERTY RIGHTS

According to a 2011 UN report, weak governance led to corruption in land occupancy and administration in more than 61 countries. Corruption varied from small-scale bribes to the abuse of government power at the national, state, and local levels.2

Enforcement of property rights incentivizes investment and provides resources to avoid poverty. Agreed-upon property rights allow entrepreneurs to use the assets as collateral and thus increase their access to capital. However, a large proportion of the poor lack property rights. Around 90 percent of land is undocumented or unregistered in rural Africa. Likewise, a lack of land ownership remains among the barriers to entrepreneurship and economic development in India.3 One estimate suggests that more than 20 million rural families in India do not own land and millions more lack legal ownership of the land where they have built houses and worked.4 Landlessness is arguably a more powerful predictor of poverty in India than caste or illiteracy.4

In addition, according to the United States Agency for International Development (USAID), only 14 percent of Hondurans legally own their properties. Among those properties that are occupied legally, only 30 percent are registered.5 It is not uncommon for government officials to alter titles of registered properties, and there are cases where government officials have allocated properties with altered titles to themselves. Bureaucrats have reportedly altered titles and registered beachfront properties for themselves,6 and have allegedly accepted bribes in exchange for property titles. Citizens often lack access to records, and those records that are accessible might provide conflicting information. Property owners are often unable to defend themselves against infringement of property usage and mineral rights.7


Blockchain can reduce friction and conflict, as well as the costs associated with property registration. It is possible to do all or most of the processing using smartphones.8 Given this, it is encouraging that various initiatives have been undertaken. The US-based platform for real-estate registration, Bitland, announced the introduction of a blockchain-based land registry system in Ghana, where 78 percent of land is unregistered.9 There is a long backlog of land-dispute cases in Ghanaian courts.10 Bitland records transactions securely, with GPS coordinates, written descriptions, and satellite photos. This and similar processes are expected to guarantee property rights and reduce corrupt practices. As of mid-2016, 24 communities in Ghana had expressed interest in the project.9 Bitland is planning to expand to Nigeria in collaboration with the OPEC Fund for International Development.11

The bitcoin company BitFury and the Georgian government signed a deal to develop a system for registering land titles using blockchain.12 Currently, to buy or sell land in Georgia, the buyer and the seller must use a public registry. They pay between $50 and $200, depending on the speed with which they want the transaction notarized. This pilot blockchain project will move the registry process to blockchain. The cost for the buyer and the seller is now expected to be in the $0.05 to $0.10 range.13

In 2017, India’s Telangana and Andhra Pradesh states announced plans to use blockchain for land registry. Telangana started a land registry pilot project in the capital city of Hyderabad. It was reported in September 2017 that a complete rollout of the program in Hyderabad and nearby areas would take place within a year.14 In October 2017, the Andhra Pradesh government collaborated with a Swedish start-up, ChromaWay, to create a blockchain-based land registry system for the planned city of Amaravati.15

CONTROLLING CORRUPTION

Blockchain creates a tamper-proof digital ledger of transactions and shares the ledger, thus offering transparency. Cryptography allows parties to add to the ledger securely. It is extremely difficult, if not impossible, to change or remove data recorded on a ledger. With this feature, blockchain makes it possible to reduce or eliminate integrity violations such as fraud and corruption while also reducing transaction costs.

As an example, the use of fake export invoices to disguise cross-border capital flows has been pervasive in China. From April to September 2014, $10 billion worth of fake trade transactions were discovered.16 Major fraud cases occurred at the Qingdao port, where companies had used fake receipts to secure multiple loans against a single cargo of metal.17 The Qingdao incident involved 300,000 tons of alumina, 20,000 tons of copper, and 80,000 tons of aluminum ingots.16 As a result, Chinese banks charge higher interest rates and are less likely to offer collateral-based financing.17 Blockchain can thwart such scandals.

Blockchain also makes it possible to generate smart (“tagged”) property and control it with smart contracts.18 Examples of such properties include physical property (car, house, container of metal) as well as nonphysical property (shares in a company).19 Blockchain-based smart properties only undergo actions based on the information published in a smart contract.18 If property is being used as collateral, the smart contract might not allow the owner to extend the same property as collateral or security to another bank. Thus, the process of verifying collateral prior to the loan being made is greatly simplified for custodians.20 Here, a trusted trading system is created for smart properties, making credit more readily available and cheaper.19
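The tamper-evidence property described at the start of this section can be sketched with a toy hash chain: each block stores the hash of the previous block, so altering an old record breaks every later link. This is a simplified illustration under stated assumptions; it leaves out consensus, digital signatures, and networking, and the record fields (`parcel`, `owner`) are hypothetical.

```python
import hashlib
import json


def block_hash(index, record, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "record": record, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, record):
    """Add a record, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "record": record,
        "prev_hash": prev_hash,
        "hash": block_hash(index, record, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check every link back to the first block."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["record"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"parcel": "P-001", "owner": "Alice"})  # hypothetical title
append_block(chain, {"parcel": "P-001", "owner": "Bob"})    # hypothetical transfer
ok_before = is_valid(chain)

chain[0]["record"]["owner"] = "Mallory"  # an official quietly alters a title
ok_after = is_valid(chain)
print(ok_before, ok_after)
```

In a real deployment many independent nodes hold copies of the chain, which is what makes a detected mismatch hard to cover up; the sketch shows only the detection step.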

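The collateral rule described above, where a smart contract refuses to let the same property back a second loan, can be sketched as a small registry. This is ordinary Python rather than code for any real smart-contract platform, and the class, method names, and asset identifier are assumptions chosen for illustration.

```python
class CollateralRegistry:
    """Toy stand-in for a smart contract that tracks pledged assets."""

    def __init__(self):
        self._pledged_to = {}  # asset_id -> lender currently holding the pledge

    def pledge(self, asset_id, lender):
        """Record a pledge, refusing if the asset already backs another loan."""
        if asset_id in self._pledged_to:
            holder = self._pledged_to[asset_id]
            raise ValueError(f"{asset_id} is already pledged to {holder}")
        self._pledged_to[asset_id] = lender

    def release(self, asset_id, lender):
        """Free the asset once the lender holding the pledge releases it."""
        if self._pledged_to.get(asset_id) != lender:
            raise ValueError("only the current lender can release the pledge")
        del self._pledged_to[asset_id]

    def holder(self, asset_id):
        return self._pledged_to.get(asset_id)


registry = CollateralRegistry()
registry.pledge("cargo-42", "Bank A")  # hypothetical cargo of metal

try:
    registry.pledge("cargo-42", "Bank B")  # attempted double pledge
    second_pledge_accepted = True
except ValueError:
    second_pledge_accepted = False

registry.release("cargo-42", "Bank A")
registry.pledge("cargo-42", "Bank B")  # allowed once the first pledge is released
print(second_pledge_accepted, registry.holder("cargo-42"))
```

Because every lender consults the same shared record, the kind of multiple borrowing against a single cargo seen in the Qingdao case would be rejected at the second attempt.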
DISADVANTAGED GROUPS

Blockchain might also help refugees and displaced persons. Current systems that offer aid to refugees and displaced persons suffer from inefficiency, fraud, and gross misallocations of resources. For instance, fees and costs account for up to 3.5 percent of an aid transaction. Moreover, an estimated 30 percent of development funds fail to reach the intended recipients due to third-party theft, mismanagement, and other problems.21

Various blockchain-based solutions to such problems now exist. For example, blockchain can empower donors by ensuring that their donations reach the intended recipients. For instance, donors can buy electricity for South African schools using bitcoin. A blockchain-enabled smart meter makes it possible to send money directly to the meter, and there are no organizations involved to redistribute funds. Donors can also track the electricity being consumed by schools and calculate the amount of power their donations provide.22 This program was launched by South African bitcoin startup Bankymoon via a crowdfunding platform.23

The UN’s World Food Program (WFP) has used blockchain to help refugees. Money is paid directly to the merchants instead of the recipients. No banks are involved; beneficiaries receive goods directly from the merchants.24 In early 2017, WFP launched the first stage of what it calls Building Blocks, giving food and cash assistance to needy families in Pakistan’s Sindh province. An Internet-connected smartphone authenticates and records payments from the UN agency to food vendors, ensuring the recipients got the expected help, the merchants got paid, and the agency could keep a watchful eye on the money.

In May 2017, WFP began distributing food vouchers in Jordan’s refugee camps by delivering cryptographically unique coupons to participating camp supermarkets. Supermarket cashiers were equipped with iris scanners to identify the beneficiaries and settle payments (UN databases verify biometric data about refugees). Building Blocks’ ledger records the transactions on a private version of Ethereum (a blockchain platform). WFP reported that by October 2017, it had distributed $1.4 million in food vouchers to 10,500 Syrian refugees in Jordan.25 WFP expects blockchain to reduce its overhead costs from 3.5 percent to less than 1 percent and to hasten aid to remote or disaster-struck areas (where ATMs might not exist or banks are not functioning normally). Blockchain currency can even replace scarce local cash, allowing aid organizations, residents, and merchants to exchange money quickly and electronically.

SUMMARY

Blockchain will positively affect developing countries: it can help reduce fraud and corruption and increase legal property titles, which provides entrepreneurial initiatives to the world’s poorest. It can also help financial transactions take place more quickly and ensure that aid is distributed with a smaller chance of theft and fraud.

REFERENCES

1. ICT Facts and Figures 2017, report, International Telecommunication Union, 2017; www.itu.int/en/ITU-D/Statistics/Documents/facts/ICTFactsFigures2017.pdf.
2. Corruption Leading to Unequal Access, Use and Distribution of Land--UN Report, report, UN News, 2011; https://news.un.org/en/story/2011/12/397982-corruption-leading-unequal-access-use-and-distribution-land-un-report#.WEMpP33QCWl.
3. N. Kshetri, “Fostering Startup Ecosystems in India,” Asian Research Policy, vol. 7, no. 1, 2016, pp. 94–103.
4. T. Hanstand, “The Case for Land Reform in India,” Foreign Affairs, blog, 2013; www.foreignaffairs.com/articles/india/2013-02-19/untitled?cid=soc-twitter-in-snapshots-untitled-022013.
5. USAID Country Profile: Honduras, report, USAID, 2016; https://usaidlandtenure.net/wp-content/uploads/2016/09/USAID_Land_Tenure_Honduras_Profile_0.pdf.
6. T. Puiu, “How Bitcoin’s Blockchain Could Mark an End to Corruption,” ZME Science, blog, 2015; www.zmescience.com/research/technology/bitcoin-blockchain-corruption-04232.
7. J. Jeong, “Bitcoin, Blockchain, and Land,” The Global Anticorruption Blog, blog, 2016; https://globalanticorruptionblog.com/2016/01/08/bitcoin-blockchain-and-land-reform-can-an-incorruptible-technology-cure-corruption.
8. L. Shin, “Republic of Georgia to Pilot Land Titling on Blockchain with Economist Hernando De Soto, BitFury,” Forbes, blog, 2016; www.forbes.com/sites/laurashin/2016/04/21/republic-of-georgia-to-pilot-land-titling-on-blockchain-with-economist-hernando-de-soto-bitfury.
9. O. Ogundeji, “Land Registry Based on Blockchain for Africa,” IT Web Africa, blog, 2016; www.itwebafrica.com/enterprise-solutions/505-africa/236272-land-registry-based-on-blockchain-for-africa.
10. A. Jones, “How Blockchain Is Impacting Industry,” International Banker, blog, 2016; https://internationalbanker.com/finance/blockchain-impacting-industry.
11. “Bitland Partners with CCEDK to Improve Blockchain Land Registry in West Africa,” EconoTimes, blog, 2016; www.econotimes.com/Bitland-partners-with-CCEDK-to-improve-blockchain-land-registry-in-West-Africa-271517.
12. S. Higgins, “Republic of Georgia to Develop Blockchain Land Registry,” Coindesk, blog, 2016; www.coindesk.com/bitfury-working-with-georgian-government-on-blockchain-land-registry.
13. S. Higgins, “Survey: Blockchain Capital Markets Spending to Reach $1 Billion in 2016,” Coindesk, blog, 2016; www.coindesk.com/capital-markets-1-billion-2016-blockchain.
14. “Indian State Plans to Store Citizen Data on a Blockchain,” CCN, blog, 2017; www.ccn.com/indian-state-plans-blockchain-storage-citizen-data.
15. “Leveraging Blockchain for the Real Estate Industry,” Lawfuel, blog, 2017; www.lawfuel.com/blog/leveraging-blockchain-real-estate-industry.
16. S. Shengxia, “China Uncovers $10b Worth of Falsified Trade,” Global Times, blog, 2014; www.globaltimes.cn/content/883512.shtml.
17. P. Smyth, Blockchain Technology: 7 Ways Blockchain Technology Could Disrupt the Post-Trade Ecosystem, white paper, Kynetix, 2015; www.the-blockchain.com/docs/Seven%20ways%20the%20Blockchain%20can%20change%20the%20trade%20system.pdf.
18. K. Bheemaiah, “Block Chain 2.0: The Renaissance of Money,” Wired, blog, 2015; www.wired.com/insights/2015/01/block-chain-2-0.
19. A. Mizrahi, A Blockchain-Based Property Ownership Recording System, ChromaWay, 2016; https://chromaway.com/papers/A-blockchain-based-property-registry.pdf.
20. M.A. Calandra Jr. et al., Blockchain Technology, Finance and Securitization, blog, Alston & Bird, 2016; www.alston.com/-/media/files/insights/publications/2016/06/ifinance-and-financial-services--products-advisory/files/view-advisory-as-pdf/fileattachment/161075-blockchain-technology2.pdf.
21. B. Paynter, “How Blockchain Could Transform the Way International Aid Is Distributed,” Fast Company, blog, 2017; www.fastcompany.com/40457354/how-blockchain-could-transform-the-way-international-aid-is-distributed.
22. S. Higgins, “How Bitcoin Brought Electricity to a South African School,” Coindesk, blog, 2016; www.coindesk.com/south-african-primary-school-blockchain.
23. G. Mulligan, “5 African Crowdfunding Startups to Watch,” Disrupt Africa, blog, 2015; http://disrupt-africa.com/2015/11/5-african-crowdfunding-startups-to-watch.
24. N. Menezes, “UN Uses Ethereum to Distribute Funds to Jordanians,” BTCManager.com, blog, 2017; https://btcmanager.com/un-uses-ethereum-to-distribute-funds-to-jordanians.
25. J.I. Wong, “The UN Is Using Ethereum’s Technology to Fund Food for Thousands of Refugees,” Quartz, blog, 2017; https://qz.com/1118743/world-food-programmes-ethereum-based-blockchain-for-syrian-refugees-in-jordan.

ABOUT THE AUTHORS Nir Kshetri is a professor of management at the Bryan School of Business and Economics, University of North Carolina at Greensboro. Contact him at [email protected]. Jeffrey Voas is an IEEE Fellow. Contact him at [email protected].

This article originally appeared in IT Professional, vol. 20, no. 2, 2018.


COLUMN: IT TRENDS

Emoji: Lingua Franca or Passing Fancy?

George F. Hurlburt, STEMCorp, Inc.
Editor: Jeffrey Voas, NIST; [email protected]

Many express awe over the creative use of emoji. Others disdain the perceived dissolution of proper English. Yet others fully embrace the free expression of emotions that emoji enable, while some are infuriated by the wanton devolution of culture exemplified by such primitive drawings. Many, however, remain indifferent.

Emoji are not new. The humble emoji, as a pictogram (a pictorial representation of an object) or an ideogram (a symbolic representation of a more abstract concept), enjoys a rather long heritage. One could argue that symbolic visualization extends back to prehistoric cave drawings.1 Legendary tribal norms, however, were mostly conveyed by aural means. Starting around 3200 BC, specially selected and educated scribes began etching Egyptian hieroglyphics into stone depicting nobility, conquests, and mysticism. Around the same time, the Sumerian cuneiform emerged as a pictographic script. It morphed over centuries to a more symbolic form of expression. Chinese calligraphy originated around 1200 BC as pictographic script. Around the same time, early pictograms predated the Aztec culture in Mesoamerica with its distinctive illustrative style of writing. In medieval times, educated monks scribed illuminated manuscripts, combining symbolic visual artistry with the written word to preserve religious history on paper. The hybrid visual rebus, which mixes emoji-like illustration with words, often in the form of visual puzzles, also enjoyed growing popularity.

Along the way, symbolic alphabets eventually enabled printing. Once in print, linear strings of symbols rapidly led to universal literacy-based education. After Gutenberg in 1450, knowledge became reproducible, portable, and essential. Printing, with its very notion of linearity as reinforced by Newtonian physics, eventually led to production lines. Industrialized economies followed. Eventually, radio reopened aural space and television reopened visual kinetics. In a relatively short period of time, attention shifted from mass production to mass media, starting around 1900 and culminating in the dynamic World Wide Web by 1999. In response, post-modernism elevated consumerism to artfulness in the later 1900s.
The smiley face pin became an overnight cultural icon in 1963. Around 1982, emoticons, the use of fonts to form facsimiles of human expression, came into vogue. These font combinations injected emotion into otherwise dull texts. Influenced by Japanese graphics, Shigetaka Kurita first created the emoji in 1999. It burst quickly onto the Internet. Figure 1 loosely traces this long tail of visual language in human communications, leading to today’s comic-inspired emoji.


2469-7087/19/$33.00 © 2019 IEEE Published by the IEEE Computer Society May 2019 31 IT PROFESSIONAL

Figure 1. A conceptual timeline leading to the emergence of the emoji.

While emoji have a long heritage, they are also clearly a product of the digital age. They are now largely standardized into some 1,644 icons in the Unicode Emoji Version 11.0 released on June 5, 2018 (https://unicode.org/emoji/charts/full-emoji-list.html). Thus, they can be quickly produced via keystroke with no need for hand drawing. Using pictograms and ideograms, they frequently convey both thought and emotion. Emoji even follow loose syntax and grammatical rules. This suggests that some degree of competence, perhaps even emoji literacy, is needed to become a truly effective emoji communicator.2 This gives rise to the question: Might emoji become the new lingua franca of the Internet?

A NEW FORM OF EXPRESSION?
Some might agree that emoji is becoming the new universal language of Marshall McLuhan’s “global village.” For example, the Oxford Dictionary declared the “face with tears of joy” emoji as its “Word of the Year” in 2015 (https://en.oxforddictionaries.com/word-of-the-year/word-of-the-year-2015). This is in recognition of the widespread global acceptance of the emoji as a popular means of expressing ideas and sentiment in an otherwise dry world of emotionless technocratic prose. The fact that a robust Unicode standard exists for emoji further reinforces a sense of universality. The need for maximum compression in Tweets, social media, text messages, and other digital media strongly encourages an economy of characters needed to express basic concepts. Whereas alphabets provide a finite set of characters to express any idea, many characters must be combined to do so. Emoji, at 144 pixels and 18 bytes, easily replace costly words with far greater economy.
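The economy-of-characters argument can be checked roughly in code. The following is a minimal sketch, assuming UTF-8 as the encoding (the article does not name one) and using only Python's standard library. It looks up the standardized name of the “face with tears of joy” emoji and compares its encoded size with that of the equivalent English phrase. The helper name `describe` is purely illustrative, and the sketch does not attempt to reproduce the pixel and byte figures quoted above, which depend on how an emoji is stored and rendered.

```python
import unicodedata


def describe(symbol: str) -> dict:
    """Return code points, Unicode names, and UTF-8 size for a short string."""
    return {
        "code_points": [f"U+{ord(ch):04X}" for ch in symbol],
        "names": [unicodedata.name(ch, "UNKNOWN") for ch in symbol],
        "utf8_bytes": len(symbol.encode("utf-8")),  # assumption: UTF-8 transport
    }


emoji = "\U0001F602"               # the "face with tears of joy" emoji
phrase = "face with tears of joy"  # the same idea spelled out in English

emoji_info = describe(emoji)
phrase_bytes = len(phrase.encode("utf-8"))

print(emoji_info)
print(f"{phrase!r} takes {phrase_bytes} bytes in UTF-8")
```

Under this assumption the single code point is several times smaller than the spelled-out phrase, which is the kind of saving the paragraph describes.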

Advertisers, quick to pick up on trends, regularly target Internet users with hip emoji messages. The level of monetization even extends to the service economy where employees are encouraged to quite literally present a smiley face to their clients, much less to cope emotionally in an otherwise insensitive world.3 Emoji appear to have “staying power” as an enduring visual code. Below is a list of a number of useful emoji-related websites.

• Unicode Emoji Standard V 11.0: https://unicode.org/emoji/charts/full-emoji-list.html
• Real-Time Twitter Emoji Usage Tracker: http://emojitracker.com
• Real-Time iOS Emoji Usage Tracker: http://www.emojistats.org
• Emoji Encyclopedia: https://emojipedia.org
• MIT NLP & AI-Based Sentiment Analysis: https://deepmoji.mit.edu
• Popular Emoji Grams: https://emojisaurus.com
• Personalized Emojis: https://www.bitmoji.com
• Moby Dick in Emoji: http://www.emojidick.com
• Worldwide Use of Emoji: http://nlp.ffzg.hr/data/emoji-atlas
• Emoji Statistics: https://worldemojiday.com/statistics

Intended meanings of many emoji, however, can too easily be misconstrued. While the Unicode standard defines “core emoji,” many more (less well-defined) emoji continue to emerge daily worldwide. Soon a set of scientific emoji is poised to appear. This leads to a bit of a Tower of Babel situation, as emoji are often culturally or contextually dependent. In fact, cultures with varying economic descriptors as defined by the Hofstede Culture Index are liable to use emoji differently to describe their particular relationship to the world. For example, people from countries with high uncertainty-avoidance scores tend to disfavor emoji that express positive emotion.4 Moreover, the same emoji might carry different meanings as determined by the culture where it is being used.

While the Unicode standard for emoji tends to reinforce meaning, there are at least 17 different proprietary platform-based fonts in place that render the same Unicode emoji significantly differently. The Unicode site (https://unicode.org/emoji/charts/full-emoji-list.html) shows 11 different platform-based renderings of standard emoji. Thus, a given standard Unicode emoji can appear differently on iOS than it does on an Android device. This leads to statistically different interpretations of both sentiment and meaning when specific standard emoji codes cross platforms. Nonetheless, variation in interpretation also occurs within the same platform, although to a lesser degree.5

The rather generalized lack of commonality in emoji interpretation suggests that emoji are actually less than a universal form of expression. As noted, cultural influences, context, and symbolic variation can potentially compromise intended meaning. Worse, it would appear that emoji are less than a complete form of expression. Standardized emoji codes do not really exist for personal pronouns or most intransitive verbs. This limits the expressiveness of the language, while simultaneously opening the door for creativity in usage among various user cliques. It is the case, however, that volunteers using Amazon’s Mechanical Turk encoded the entire text of Melville’s Moby Dick into a book entitled Emoji Dick. Moby Dick's iconic first sentence, “Call me Ishmael,” was emoji encoded as follows:

Figure 2. The first sentence of Moby Dick in emoji.

While clearly a period novel, the use of a telephone (a nonexistent item in the time of the novel) induces a form of contextual irony. Likewise, Alice in Wonderland, a rebus-friendly text by the intent of author Lewis Carroll, has also been translated fully into emoji. In both cases, however, the level of effort necessary to successfully navigate these annotated texts exceeds the ability of most readers. Emoji datasets, while highly creative, become highly subjective, induce repetition, and become exceedingly difficult to contextualize.6 In other cases, multiple emoji must be creatively combined to suggest common items. For example, “sweetheart” might be written as a piece of candy next to a heart, hardly a literal translation.


Ultimately, emoji are technically oriented. As such, they are driven by advancing technology. Thus, as natural language processing (NLP) and artificial intelligence (AI) join forces to reinforce the effectiveness of vocal interaction, emoji might give way to vocalized inflections. Moreover, the number of bot-generated emoji could potentially overpower human users, much like spam often overwhelms the inbox. Both trends could signal a setback for emoji advocates. The notion of emoji as an emergent universal language seems to be limited at best. The use of emoji as a hybrid form of expression to augment regular text, however, appears to be a strong and growing possibility in a world that increasingly demands symbolic economy and some level of personalization. Together with otherwise impersonal texts, selective use of emoji sets the tone for satisfying communication. Emoji tend to defuse what otherwise might be considered offensive messages with a friendly salutation, closing, or strategically placed emoji intended to add a more conciliatory tone.

EMOJI IN A NETWORK AGE
Emoji represent a network phenomenon. An analysis of an early August 2018 snapshot of the frequency of emoji usage on Twitter using the website http://emojitracker.com reveals a clear power curve relationship. Figure 3 shows this plot in the form of a vertical bar graph.

Figure 3. Distribution of 846 popular emoji on Twitter in early August 2018.

In this figure, the emoji occupying the top position was the familiar “face with tears of joy.” This emoji was invoked 2,145,510,490 times. The emoji at the last-used position, number 846, was called only 132,848 times. It was an emoji for uppercase Latin letters. The top 10 emoji were: face with tears of joy, a single heart, the recycling symbol, face with hearts for eyes, a slimmer single heart, a sad crying face, a simple happy face, a face with a furrowed brow and frown, a double heart, and a kissing face (see Figure 4).

Figure 4. Top 10 emoji on Twitter in August 2018.
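The power curve relationship described for Figure 3 can be given a rough number using only the two counts quoted in the text. The sketch below assumes a simple Zipf-style model, count(rank) = c1 * rank^(-s), and solves for the exponent s that passes through the rank-1 and rank-846 counts. This is a two-point estimate offered for illustration, not a fit to the full emojitracker.com snapshot, since the intermediate ranks are not given in the article. The function names are illustrative.

```python
import math

# Counts quoted in the text for the early August 2018 Twitter snapshot.
top_count = 2_145_510_490  # rank 1: "face with tears of joy"
last_count = 132_848       # rank 846: last emoji in the distribution
last_rank = 846


def zipf_exponent(c1: float, cn: float, n: int) -> float:
    """Exponent s such that c1 * n**(-s) == cn (two-point estimate)."""
    return math.log(c1 / cn) / math.log(n)


s = zipf_exponent(top_count, last_count, last_rank)


def predicted_count(rank: int) -> float:
    """Count implied by the assumed power law at a given rank."""
    return top_count * rank ** (-s)


print(f"top-to-last ratio: {top_count / last_count:,.0f}")
print(f"two-point exponent s: {s:.3f}")
print(f"implied count at rank 10: {predicted_count(10):,.0f}")
```

An exponent above 1 under this model means usage is concentrated in a small number of top-ranked emoji, consistent with the steep bar graph the text describes. A proper estimate would regress log(count) on log(rank) over all 846 entries.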


It is interesting to note that the majority of the popular emoji are positive in nature, which is in keeping with most research on the use of emoji. Other research shows that applied network science techniques outperform state-of-the-art methods, including NLP for sentiment analysis.7

As noted, printing introduced a prevalent linear relationship that helped usher in an industrial age, enhancing the world’s economy. The advent of mass media, especially the Internet, awoke other sensitivities. The rise of the emoji as a popular means of visual expression suggests a return to age-honored visual space. Moreover, despite distinct cultural differences in usage, the world-wide emoji acceptance is itself significant. It represents a broad-based trend toward the reality of networked global sharing. Steeped in older linear technology models, many people fail to appreciate or perhaps even fear such openness. To some, emoji represent nothing short of a tragic fallback to primitive behaviors. Further social research along these attitudinal lines might help to further delineate growing protectionist movements in many nations.

As industrialization engaged, literacy-focused education became indispensable. Now formal education increasingly seeks creative online outlets, and traditional literacy-based instruction seems somehow outdated. Yet computer literacy continually gains credence. Importantly, the growing cost of formal higher education leaves many indebted well beyond any entry-level thresholds. Perhaps it is time to acknowledge the shift from book-borne portable personal knowledge to online networked general knowledge. Such a shift likely has a profound effect on future educational strategies. Here, new forms of digital literacy become prerequisite for future opportunity. The increased and sustained use of emoji might suggest new innovative research initiatives to help identify new educational vectors, perhaps even extending to mathematics.8

Finally, visualization is endemic. For example, most nations regulate driving behavior by varying shapes and color cues. Emoji represent only one form of the resurgence of visualization in the digital world. As a case in point, augmented reality and virtual reality are opening new perceptual doors. Graphical representation of data is also increasingly pressing. Networks of all types frequently involve large sparse matrices. The ability to visualize these diverse datasets becomes an increasingly critical skill. Conceptualizing and constructing such graphs require new mathematical insights and new means of depicting their hidden realities accurately and convincingly. More importantly, the ability to evaluate and interpret such visual representations on their merit is equally important for an informed citizenry. To this end, the ability to acquire visual literacy, including the use of emoji, becomes an increasingly important skill, not only for dedicated data scientists, but across virtually all the increasingly entwined domains of human knowledge.

REFERENCES
1. G. Hurlburt and J. Voas, “Storytelling: From Cave Art to Digital Media,” IT Professional, vol. 13, no. 7, 2011, pp. 4–7.
2. M. Danesi, The Semiotics of Emoji, Bloomsbury Academic, 2017.
3. L. Stark and K. Crawford, “The Conservatism of Emoji: Work, Affect and Communication,” Social Media + Society, vol. 1, no. 2, 2015; doi.org/10.1177/2056305115604853.
4. X. Lu et al., “Learning from the Ubiquitous Language: an Empirical Analysis of Emoji Usage of Smartphone Users,” Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp), 2015, pp. 770–780.
5. H. Miller, J. Thebault, and I. Johnson, “'Blissfully Happy' or 'Ready to Fight': Varying Interpretations of Emoji,” 10th International Conference on Web and Social Media (ICWSM), 2016, pp. 259–268.
6. W. Radford et al., “'Call me Ishmael': How do you Translate Emoji?,” Proceedings of Australasian Language Technology Association Workshop, 2016, pp. 150–154.
7. A. Illendula and R. Yedulla, “Learning Emoji Embedding using Emoji Co-occurrence Network Graph,” International Workshop on Emoji Understanding and Applications in Social Media, pending publication, 2018; https://arxiv.org/abs/1806.07785.
8. T. McCafery and P.G. Mathews, “An Emoji Is Worth a Thousand Variables,” The Mathematics Teacher, vol. 111, no. 2, October 2017, pp. 96–102.

ABOUT THE AUTHOR George Hurlburt is chief scientist at STEMCorp, a nonprofit that works to further economic development via adoption of network science and to advance autonomous technologies as useful tools for human use. He is engaged in dynamic, graph-based IoT architecture. Hurlburt is on the editorial board of IT Professional and is a member of the board of governors of the Southern Maryland Higher Education Center. Contact him at [email protected].

This article originally appeared in IT Professional, vol. 20, no. 5, 2018.

From the analytical engine to the supercomputer, from Pascal to von Neumann, from punched cards to CD-ROMs, IEEE Annals of the History of Computing covers the breadth of computer history. The quarterly publication is an active center for the collection and dissemination of information on historical projects and organizations, oral history activities, and international conferences. www.computer.org/annals


COMPSAC 2019
DATA DRIVEN INTELLIGENCE FOR A SMARTER WORLD

www.compsac.org
Hosted by Marquette University, Milwaukee, Wisconsin, USA
July 15–19

In the era of "big data" there is an unprecedented increase in the amount of data collected in data warehouses. Extracting meaning and knowledge from these data is crucial for governments and businesses to support their strategic and tactical decision making. Furthermore, artificial intelligence (AI) and machine learning (ML) make it possible for machines, processing large amounts of such data, to learn and execute tasks never before accomplished. Advances in big data-related technologies are increasing rapidly. For example, virtual assistants, smart cars, and smart home devices in the emerging Internet of Things world can, we think, make our lives easier. But despite perceived benefits of these technologies/methodologies, there are many challenges ahead. What will be the social, cultural, and economic challenges arising from these developments? What are the technical issues related, for example, to the privacy and security of data used by AI/ML systems? How might humans interact with, rely on, or even trust AI predictions or decisions emanating from these technologies? How can we prevent such data-driven intelligence from being used to make malicious decisions?

Authors are invited to submit original, unpublished research work, as well as industrial practice reports. Simultaneous submission to other publication venues is not permitted. All submissions must adhere to IEEE Publishing Policies, and all will be vetted through the IEEE CrossCheck portal. For full CFP and conference information, please visit the conference website at WWW.COMPSAC.ORG

IMPORTANT DATES
April 7, 2019: Paper notifications
April 15, 2019: Workshop papers due
May 1, 2019: Workshop paper notifications
May 17, 2019: Camera ready submissions and advance author registration due

ORGANIZING COMMITTEE
General Chairs: Jean-Luc Gaudiot, University of California, Irvine, USA; Vladimir Getov, University of Westminster, UK
Program Chairs in Chief: Morris Chang, University of South Florida, USA; Stelvio Cimato, University of Milan, Italy; Nariyoshi Yamai, Tokyo University of Agriculture & Technology, Japan
Workshop Chairs: Hong Va Leong, Hong Kong Polytechnic University, Hong Kong; Yuuichi Teranishi, National Institute of Information and Communications Technology, Japan; Ji-Jiang Yang, Tsinghua University, China
Local Organizing Committee Chair: Praveen Madiraju, Marquette University, USA
Standing Committee Chair: Sorel Reisman, California State University, USA
Standing Committee Vice Chair: Sheikh Iqbal Ahamed, Marquette University, USA

AFTERSHOCK

The Online Trolling Ecosystem

Hal Berghel, University of Nevada, Las Vegas Daniel Berleant, University of Arkansas at Little Rock

As trolling becomes inseparable from modern social media, a renewed effort is needed to unmask and abate the risks of this reality. A proposed taxonomy offers useful clarification.

The practice of using disinformation and misinformation to promote parochial agendas isn’t new. Both have been used by tyrants, demagogues, dictators, authoritarians, and manipulators of every stripe for millennia. One thing that’s new to our generation is the digital twist of Internet trolling. The effectiveness and increasing use of this tactic, highlighted in the 2016 US presidential election, justifies increased attention. An earlier Computer column encouraged such attention, and we elaborate here.

Disinformation and misinformation both involve the distribution of false information, but with differing objectives. Disinformation involves the intentional planting of false information to conceal truth or deceive the audience, especially by state actors, whereas misinformation is more generic and relaxed regarding intention, concealment, and source. For our purposes, we intend the definition of disinformation to include not just governments but also political groups, ideological movements, and other social entities. Disinformation is more pernicious, being necessarily both intentional and deceptive in its pursuit of social engineering goals. Although some trolling might be without willful deception (as in the case of mistaken “true believers”), disinformation is the more natural ally of trolling and is thus our focus.

The topic of disinformation is both complex and varied: it’s complex owing to its convoluted methods; it’s varied because of its different practitioners and contexts. It can be used to enlist support, confuse, de-legitimize, defame, intimidate, confound, escape detection or blame, avoid prosecution, and on and on. The public relations strategist uses disinformation in different ways than the tyrant owing to the latter’s assumed greater imperviousness to punishment or retribution. Similarly, the ideologue’s use of disinformation is different from that of the corrupt politician. Disinformation techniques and content vary with the purpose, targeted demographic, medium, and social networking platform.

38 May 2019 Published by the IEEE Computer Society 2469-7087/19/$33.00 © 2019 IEEE 44 COMPUTER PUBLISHED BY THE IEEE COMPUTER SOCIETY 0018-9162/18/$33.00 © 2018 IEEE

EDITORS
HAL BERGHEL University of Nevada, Las Vegas; [email protected]
ROBERT N. CHARETTE ITABHI Corp.; [email protected]
JOHN L. KING University of Michigan; [email protected]

These issues apply to trolling as well. Consequently, we’ve developed a partial taxonomy to better characterize trolling’s many manifestations. This is an appropriate time for a taxonomy, for trolling is mature enough now to reveal interesting patterns and suggest future trends and defenses.

ROOTS AND MISSING LINKS
Trolling is confirmation, in a sense, of a fundamental flaw in the notional roots of the modern Internet-enabled Web. Those roots are typified by, for example, Paul Otlet’s Mundaneum system, implemented in 1910 to collect and categorize all of the world’s important knowledge (www.mundaneum.org/en); H.G. Wells’s notion of a World Brain, outlined in a 1938 collection of essays and addresses with that title; and Vannevar Bush’s Memex system, described in his influential 1945 article “As We May Think.”2 Bush envisioned a collective memory system that would advance a knowledge explosion by serving up the corpus to anyone on demand through associative indexing and browser history-like “paths” not unlike the use of hypertext to organize the Web. As was customary in the early information age, Bush was driven by the simultaneous desire for ease of information access and avoidance of information overload. He wasn’t concerned about data reliability and source authentication.

As it turns out, this overly simplistic and naive view of the information access challenge has been perpetuated ever since on the Web. To wit, subsequent work on metadata standards, including the Dublin Core elements (http://dublincore.org/documents/dces; https://tools.ietf.org/html/rfc5013), completely ignores any measure of authenticity and reliability. The closest metadata elements would include oblique terms such as “provenance,” “conforms to,” and “is referenced by.” This deficiency has been carried forward in such subsequent document type definitions as the Open Source Metadata Framework and the Resource Description Framework. To overcome this deficiency, more user control is needed, perhaps a user-driven metadata insertion tool for elements like “suspect,” “disproved,” and “content warning,” or some sort of Bayesian trigger to deal with today’s fake news and alt-facts. Otherwise, the 21st century’s spin on Bush’s vision might progressively become “As We May Deceive.”

The study of disinformation, from an information-theoretic point of view, has thus far regrettably been at best occasional and informal. We have in mind, for example, contributions by David Martin and H. Michael Sweeney on disinformation3,4 and traits of disinformationists.5 While informative, especially with respect to the current political landscape, these works are largely anecdotal, lack examples, and aren’t directly related to trolling. Spy the Lie6 provides a practical guide, with examples, for detecting deception, including an analysis of behavioral cues that might betray the act. A rough equivalent for social media deceptions is sorely needed. Alas, self-published contributions on the Web, and those from the popular press, fail to do justice to the full impact of disinformation generally7 and trolling in particular.1

TROLLING AS AN IDEOLOGICAL WEAPON
Online trolling as a form of communication is readily weaponized. Its ease of use and accessibility to anyone with an Internet connection virtually eliminates entry barriers. Its appeal as a communication tactic to tyrants, demagogues, and manipulators of all kinds is obvious. It thus fits comfortably within such models as pathocracy (rule by the maladjusted, psychopaths, narcissists, and the like)8 and kakistocracy (rule by the least competent)9 as an effective tool of online manipulation, obfuscation, and deceit. It’s no surprise that trolling has become increasingly popular.

The relationship of trolling to disinformation and politics has reached a modern zenith owing to the current US administration’s relaxation of the norms and expectations of veridical communication and the Russian government’s embrace of trolling. That said, the White House’s proneness to misinformation and even outright disinformation is a symptom of a more general social problem, namely political emotionalism, in which facts are too often considered less of a foundation and more of a hindrance.10,11 That trend manifests itself in a tolerance of falsehoods under the guise of alt-facts, the inability to distinguish confirmable statements from beliefs and opinions, and an unreflective commitment to ideology-based and simplistic slogans, catch phrases, sound bites, formulas, and beliefs. Social scientists have developed theories of social dominance, authoritarianism, and instability that explain some of these characteristics in terms of group behavior, economics, and social hierarchy.11–14

COMPUTER, published by the IEEE Computer Society, August 2018. 0018-9162/18/$33.00 © 2018 IEEE


WHY DISINFORMATION? WHY TROLLING?

Disinformation generally and trolling specifically are expedient ways to manipulate public opinion. Authoritarians of all generations understood that sound and reasoned argument isn’t sufficient to exercise control over others. Something more powerful but short of force is needed. Such machinations, to be effective, must be carefully engineered and targeted, an objective often unachievable through reasoned public debate. If politicians were to rely on logical debate, free of manipulative rhetorical devices, public consensus might be influenced by the merits of the arguments themselves when interests, often authoritarian or domineering, wish to avoid this.

Carefully crafted disinformation campaigns and trolling efforts can be instrumental in achieving the desired effect. They can artificially polarize issues to exploit a human bias toward binary choices, seeing the world in black and white, big and small, rich and poor. This is related to what Hans Rosling calls the gap instinct.[15] Its appeal must follow in part from the cognitive simplicity of binary distinctions, much as we experience with true/false questions on exams. Other things being equal, cognitive effort is lower on true/false than multiple-choice questions because there’s less to think about.

Disinformationists and trolls seek to create a sense of extremes where the extreme they tout is cast in a more appealing way than the alternative. In order to force the information consumer to the desired extreme, they use lies, prevarications, untruths, alt-facts, unlikely theories, distortions, ad hominem attacks, and other rhetorical devices as part of a Machiavellian propaganda or “messaging” campaign to create the desired artificial duality in lieu of the more nuanced and reality-based presentation that would result from clear-headed analysis. Modern online disinformation and trolling campaigns functionally resemble phishing attacks in combining a modest amount of computing and networking skill to cloak the real goal and lure the target using perception management (manipulating the public into thinking they perceive something they don’t, or vice versa) and social engineering (motivating the public to do something they otherwise wouldn’t have done).

In his book Factfulness,[15] Rosling describes how evolutionary traits like hard-wired fast-response brains produce simplistic world views that discourage adequate reflection and deliberation for decision making. He identifies 10 evolutionary “instincts” that no longer serve humanity well in separating truth from predatory fiction. Such instincts should be critically discussed as part of college-level general education, if not in high school. Primary education should provide practical skill in BS detection, right along with the 3 Rs. Call it the 4th R: reality checking.

A TAXONOMY OF TROLLING

Online trolling has matured to the point that we can discern some evolutionary patterns and future directions. The value proposition is obvious from the 2016 US presidential election: low-cost, potentially high-impact voter manipulation through micro-targeting. Political scientists and others continue to study the degree to which trolling influenced the vote. UK-based Cambridge Analytica executive Mark Turnbull took credit for playing a key role in Donald Trump’s win,[16] and there’s now sufficient concern over the use of trolling by foreign governments to undermine US federal elections that, as part of the Mueller probe, the US Department of Justice indicted the Russian trolling factory, the Internet Research Agency, for 8 federal crimes[17] as well as 13 Russians and 3 Russian companies for attempting to subvert the 2016 election.[18]

One thing is certain: online trolling is here to stay. Even if federal legislation were passed to outlaw it, problems like reliable cyber-attribution[19] (at least that which is admissible in court) will provide trolls many avenues to circumvent whatever laws might be enacted.

So what’s the future of online trolling and its containment? We offer the following informal taxonomy as a means to focus our response.

Provocation trolling. To elicit a particular response, such as hostility, from participants of an online forum. For example, in the “Reactions” section of a Yahoo! article about a 20-year-old Guatemalan woman shot dead in Texas by a US border agent, many top comments seemed intended to spark a flame rather than shed light. For example, the first comment was “Medal of Honor!!!” (http://www.webcitation.org/710m5n0WF). Similarly, in an online discussion, blaming liberals or conservatives for a tragic or controversial incident will likely cause some offended readers to lunge for the bait.

Social-engineering trolling. To incite participants to activities they normally wouldn’t have undertaken: convince readers to join an organization, send a donation, observe a boycott, vote for/against a candidate, and so on.

Grooming trolling. Sending messages intended to insinuate the sender into the mind of the recipient as a slippery slope to further persuasion. Radical organizations are notorious for
using this variant of social-engineering trolling to recruit members: ISIS was widely noted for “fishing” for new members on Twitter this way, and US extremist groups are frequently noted for using this tactic.

Partisan trolling. To use social media surreptitiously to achieve political ends. Here’s where the heavyweights really get involved. For example, trolling has been exposed as an important component of Russia’s “firehose of falsehood” (see below) propaganda strategy, especially in the recent US presidential race.[20]

Firehose trolling. High-volume, rapid, continuous trolling without concern for consistency. Apparently a favorite of Russia, it focuses not on promoting a particular position or viewpoint but on divisiveness for its own sake. For example, according to Charles Clover, Aleksandr Dugin’s book The Foundations of Geopolitics is influential at the highest levels of the Russian government and “assigned as a textbook at the General Staff Academy and other military universities in Russia.”[21] (A good English translation of the entire book isn’t yet available.) Clover quotes Dugin as writing, “It is especially important to introduce geopolitical disorder into internal American activity, encouraging all kinds of separatism and ethnic, social and racial conflicts, actively supporting all dissident movements – extremist, racist, and sectarian groups, thus destabilizing internal political processes in the U.S.” Trolling is certainly well suited to this activity. And it can be tough to counter. Christopher Paul[22] recommends against trying “to fight the firehose of falsehood with the squirt gun of truth,” but fails to provide fully satisfying alternatives.

Ad hominem trolling. Defaming or discrediting individuals or groups to delegitimize their positions without engaging them on their merits. The following snippet from an exchange on an email list exemplifies this.

ML: [Controversial claim] Anybody who claims otherwise is ignorant, uninformed, or lying.

A naive respondent might be whiplashed at this point because a counterargument, reasoned or not, has already been pre-characterized as ignorant, uninformed, or a lie. The best response is probably to simply point out the rhetorical device used here, as respondent PD does next.

PD: Ooh, is this the choose-your-own-ad-hominem part of the show?

Yet even this response is hobbled because the discussion has now been diverted into a rhetorical cul-de-sac that saves ML from losing the argument.

Jam trolling. Disrupting a discussion or communication channel with high message volume (the trolling equivalent to a DoS attack). Technologically, automated trollbots will make this an increasing problem.

Sport trolling. Trolling for the self-gratification of the troll (just for the fun of it).

Snag trolling. Evoking responses to satisfy curiosity. One of the less toxic varieties, this nevertheless tends to divert and obscure.

Nuisance trolling. Derailing the thread of an online forum (blog, chatroom, and so on) for no other reason than to irritate other participants. A variant of sport trolling.

Diversion trolling. An insidious tactic for blocking legitimate communication by diverting a thread in a direction that’s misleading, irrelevant, false, and so on. Thus, a discussion about rising crime rates could be diverted by citing a small community that hasn’t had a murder in 20 years, or a discussion about falling crime rates could be diverted by mentioning a recent crime.

False-flag trolling. Pretending to be of a group or hold an opinion that the troll actually opposes, and presenting a message intended to make that group or opinion look bad. This is one of the harder forms of trolling to detect, because the writer could in theory really have the opinion claimed but not realize how his obnoxiousness is creating the opposite of the desired effect. For example, a type of robocall used in political campaigns pretends to support one candidate but is so annoying that it actually helps the opposing candidate.

Huckster trolling. The online world’s equivalent to street vendors. A typical example: “Loved your insightful post! Smash financial barriers with our personalized method. Click now to unlock YOUR potential!” Here’s where advertising meets trolling.

Amplification/relay trolling. This occurs when one trolling venue is used to amplify the message of some other source; for example, a politician using Twitter to repeat something reported on Fox & Friends or Morning Joe.

Rehearsal trolling. Baiting opponents to respond in order to reel in the “fish,” or victim, to practice arguing with. The more annoyed the respondent, the more energy that person will expend
providing the spirited practice the troll wants. The troll thus hones debate skills for uses like higher-stakes trolling later.

Proxy trolling. Using intermediary trolls to do the heavy lifting. De rigueur for large organizations, which hire people to do it.[23] One application is astroturfing: promoting a position, product, person, and so on for which there’s little awareness or support by making it look like that entity is widely approved of. Websites and organizations set up by special interests but given names like “Citizens for X” are standard examples. Proxy trolling provides rich opportunities for all manner of resource-rich, unscrupulous actors.

Faux-facts trolling. Deliberate spreading of fake news, alt-facts, and other lies under the guise of truth. To fight this type of trolling, refereeing organizations, typified by the well-regarded Snopes (https://www.snopes.com/about-snopes), are a socially valuable, even essential institution. We can expect large organizational trolls to sow chaos and confusion with fake fact-checking organizations of their own.

Insult trolling. Insults spark responses that drain the target’s energy. They also make the target look bad and are demoralizing.

PR trolling. Making the troll or the views the troll is promulgating look good rather than attacking others. For example, the troll could make a claim and unverifiably cite a brother-in-law “who was there.” But the most common example is to state approval of another text. It’s easy to upvote another troll’s message, or respond to a posting with “Right on!” or “Thank you for saying what so many know but are afraid to say.” This boosts persuasiveness via a bandwagon effect.

Chaff trolling. Sending messages that are essentially content free and thus vacuous. For example, on social media platform Quora someone claimed that a relative assigned to help guard former president Obama said that the president was “… fake as [expletive deleted].” One might well question if this relative really existed, and if he did, whether the quote was accurate. Yet consider also the word “fake”: here it carries little if any information about its subject but is an effective insult for the many unsavvy readers.

Wheat trolling. High-quality trolling using content that’s hard or impossible to refute, for example, a cleverly doctored photo or text incorporating seemingly well-sourced “facts.” Some lies contain their own logical inconsistencies; others smell bad only to a domain expert.

Satire trolling. Good satire cuts deep. It’s hard to create and even harder to generate automatically. Thus, effective as it is, satire trolling will likely remain a relatively small player in the trolling world.

TURING TROLLBOTS

A trollbot is simply an automated troll. Like a chatbot, it generates texts computationally. Unlike chatbot texts, trollbot output possesses markedly weaker requirements for coherence and continuity from its context. Consider, for example, a program that uses a simple bag-of-words algorithm to detect tweets or other posts critical of a particular position or public figure. It then posts replies randomly picked from a set of stock replies like “You tell’em baby!” and “That’s SO right.”

Informally, let’s refer to a trollbot that’s indistinguishable from a human troll as a Turing trollbot: one that has passed the trolling equivalent of the Turing test. A computer-controlled chatbot passes the traditional Turing test if and only if the human tester cannot distinguish the chatbot from a human. Compared to a chatbot, a trollbot has a much easier time passing; the weaker constraints on trolling make it so. Sure, there are human trolls for whom sophisticated trolling is an unsavory art form that would be hard to imitate, but a Turing trollbot need only mimic the lowest-common-denominator human troll to masquerade as a real person.

The concept of the Turing trollbot is increasingly recognized.[24] The hardest technical aspect of primitive Turing trollbot design is sneaking through smart filters like CAPTCHA. In fact, such trollbots could soon emerge as easily downloaded freeware apps. But primitive Turing trollbots are just a start. As we were writing this article, IBM unveiled its Debater system,[25] which successfully took on a college debate champion. This is a much greater challenge than deploying successful trollbots, which can be ever so much more efficient and economical than a paid human.

With armies of well-nigh undetectable trollbots on the horizon, what’s one to do against this threat? One approach is to simply ignore outright all controversial social media comments; that might protect individual readers. Another approach is mass immunization. The simplest way to ensure public health is for enough people to reply to suspected troll messages by shining a light on them. “Are you a troll?” might serve not just as a comment but as a warning and reminder to readers who otherwise might have overlooked the possibility. But one way or another, society must develop strategies to reduce trolling and trollbot effectiveness.
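To make concrete how little machinery the primitive trollbot described in this section would need, here is a minimal Python sketch of the two steps the column names: a bag-of-words check for posts critical of a target, and a random pick from a list of stock replies. The cue words, the target names, and every function name are hypothetical choices made for illustration; only the two stock replies come from the column. The last function is an added assumption rather than anything the column proposes: it shows one crude signal (the share of verbatim stock phrases among replies) that a detector might start from. Nothing here posts to any platform.

```python
import random
import re

# Hypothetical vocabulary: a real system would curate or learn these sets.
CRITICAL_CUES = {"corrupt", "liar", "failed", "disaster", "resign"}
TARGET_NAMES = {"senator", "smith"}  # hypothetical public figure

# The two stock replies quoted in the column.
STOCK_REPLIES = ["You tell'em baby!", "That's SO right."]


def bag_of_words(text):
    """Lower-case the text and return its set of word tokens (word order is discarded)."""
    return set(re.findall(r"[a-z']+", text.lower()))


def is_critical_of_target(post, min_cues=1):
    """True if the post mentions the target and contains at least min_cues critical cue words."""
    words = bag_of_words(post)
    return bool(words & TARGET_NAMES) and len(words & CRITICAL_CUES) >= min_cues


def canned_reply(post, rng=random):
    """Return a randomly chosen stock reply for a matching post, otherwise None."""
    if is_critical_of_target(post):
        return rng.choice(STOCK_REPLIES)
    return None


def share_of_stock_replies(replies):
    """Fraction of replies that are verbatim stock phrases (a crude detection signal)."""
    if not replies:
        return 0.0
    stock = {reply.lower() for reply in STOCK_REPLIES}
    return sum(r.strip().lower() in stock for r in replies) / len(replies)
```

The sketch ignores context, grammar, and meaning entirely, which is the point made above about the weaker coherence requirements on trollbot output. The same simplicity is what makes verbatim repetition a plausible, if easily evaded, starting signal for detection.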

Research is also needed to investigate the potential for automatic trolling detection software. What kinds of trolling are undetectable? What kinds have already been detected, and who are their sponsors? We also need to educate the public. An increasingly necessary goal of primary education is training people to approach social media statements with suspicion, especially when it comes to bias and misinformation. The Internet, through social media and fake news outlets, has saddled us with the biases of those seeking to manipulate others through new forms of information corruption such as source displacement/concealment, decontextualization, and the like. Where the traditional measures of networks were in terms of value,[26,27] a new and useful measure of networks is their potential for abuse.[28]

POLITICAL TROLLING

In addition to the computer and networking context, online trolling must be understood in a geopolitical context,[29,30] especially with respect to its utility in international competition and rivalry. For example, a measurable amount of the identified external political trolling used to influence the outcome of the 2016 US election appears to have been either sponsored or inspired by Russia. China certainly has the capability for effective political trolling as well. As time passes, more countries will inevitably engage in it as a useful and cost-effective way to project influence. Free societies are the most susceptible to political trolling because in those countries mass opinion is a strong driver of national policy.

Moreover, polarization and partisanship have been increasing for decades.[11,31–33] Trolling’s utility is related to the political divisiveness of the target society. As trolling and other ways of abusing social media and networks evolve, the current deficiencies in teaching disinformation tactics widely as an important civic skill will become more apparent. Our children, like all too many adults, lack the basic skills to look upon divisive, emotive communication critically. This is a severe educational shortcoming that promises to exact a considerable toll on democratic systems.

Society needs to understand why people troll. It seems to be one of many addictive behaviors mostly afflicting alienated young males and enabled by the anonymity and easy accessibility of the Internet, much like overindulging in online porn or videogames (https://www.quora.com/Whats-it-like-to-be-an-Internet-troll). But perhaps it’s not as important to understand the psychology underlying trolling as it is to avoid being manipulated by it. As Lee Edwin Coursey[34] advises,

“The next time you see a hyperbolic social media post that confirms your worst fears about people of a particular race, gender, religion, or political affiliation, your first reaction should be, ‘nice try, Russian troll,’ rather than ‘OMG I MUST REPOST THIS EVERYWHERE!!!’ Learn to take a breath and pause before you immediately like, retweet, or share divisive messages from obscure sources. Be especially wary of emotional manipulation. Most importantly, fact check yourself before spreading information designed to foment outrage and factionalism. Remember that the phrase ‘Russian disinformation campaign’ does not describe some outdated method from a bygone era, but instead represents an active, effective tool being used against you right now.”

The cognitive load for detection and prevention is considerable, even for a coalition of the willing to do so. There’s little cognitive load for tribalists because of illusory feelings of superiority, anosognosia (critical lack of self-awareness), and other cognitive biases. Part of the threat (and hence the value) of trolling is that so many independent-minded people don’t have the time and energy to check facts or verify claims, while tribalists and authoritarianist followers don’t feel the need.

As a consequence, trolling is convenient fodder for the gullible. It’s free, self-reinforcing propaganda that unifies true believers and confuses or obfuscates issues sufficiently to manipulate fence-sitters. The game changing potential lies with the latter (for example, the 40,000 votes in three states that effected the Electoral College outcome of the 2016 US presidential election). This is where trolls and other social media manipulators see the real payoff. It’s for this reason that so much trolling content tends to be shocking, distressing, offensive, and the like: it’s designed to arouse the passions of the recipient while not lending itself easily to deliberation. The more independent fence-sitters can thus be stimulated to action or opinion without benefit of the reflection that would call into question the validity of the message or stimulate thoughtful evaluation. Fact checking, introspection, and analysis work against the interests of trolls. In this way, trolling is similar to a military campaign where the goal is action without debate.

We might take a lesson from Winn Schwartau’s Time-Based Security Model
in this regard.[35] The model posits that a security system can be effective only when the time it takes to detect a security breach and mitigate against the threat is less than the time it takes for the security breach to achieve its objective. There’s a parallel when it comes to mitigating against the effects of abusive social media. For it to be effective, the detection time must be near zero because the reaction time required to re-tweet, forward, and so on is negligible. The parallel with trolling is that the troll is focused on achieving quick results before second thoughts might be raised.

It’s worth adding that trolling’s ability to promote division can also be used to nurture social reform and is thus a double-edged sword for authoritarian and totalitarian states. For that reason, such states must carefully monitor and control trolling and related digital media manipulation tools within their borders.

New though it is in the toolbox of Machiavellian kingpins and social misfits alike, the effectiveness of trolling ensures that it’ll continue to play an important role in future politics.

REFERENCES

1. H. Berghel, “Trolling Pathologies,” Computer, vol. 51, no. 3, 2018, pp. 66–69.
2. V. Bush, “As We May Think,” The Atlantic Monthly, vol. 176, no. 1, 1945, pp. 101–108.
3. D. Martin, “Thirteen Techniques for Truth Suppression,” http://www.brasscheck.com/martin.html.
4. H.M. Sweeney, “Twenty-Five Ways to Suppress Truth: The Rules of Disinformation,” Apr. 2000; http://whale.to/m/disin.html.
5. H.M. Sweeney, “Eight Traits of the Disinformationalist,” Apr. 2000; http://whale.to/b/sweeney.html.
6. P. Houston et al., Spy the Lie: Former CIA Officers Teach You How to Detect Deception, reprint ed., St. Martin’s Griffin, 2013.
7. H. Berghel, “Disinformatics: The Discipline behind Grand Deceptions,” Computer, vol. 51, no. 1, 2018, pp. 89–93.
8. E. Mika, “Who Goes Trump? Tyranny as a Triumph of Narcissism,” The Dangerous Case of Donald Trump: 27 Psychiatrists and Mental Health Experts Assess a President, B. Lee, ed., St. Martin’s Press, 2017, pp. 298–318.
9. E.J. Dionne Jr., N.J. Ornstein, and T.E. Mann, One Nation after Trump: A Guide for the Perplexed, the Disillusioned, the Desperate, and the Not-Yet Deported, St. Martin’s Press, 2017.
10. M. Stewart, “The 9.9 Percent Is the New American Aristocracy,” The Atlantic, June 2018; https://www.theatlantic.com/magazine/archive/2018/06/the-birth-of-a-new-american-aristocracy/559130.
11. P. Turchin, Ages of Discord: A Structural-Demographic Analysis of American History, Beresta Books, 2016.
12. T.W. Adorno et al., The Authoritarian Personality, Harper & Row, 1950.
13. B. Altemeyer, Right-Wing Authoritarianism, Univ. of Manitoba Press, 1981.
14. J. Duckitt and C. Sibley, “Right Wing Authoritarianism, Social Dominance Orientation and the Dimensions of Generalized Prejudice,” European J. of Personality, vol. 21, no. 2, 2007, pp. 113–130.
15. H. Rosling, O. Rosling, and A.R. Ronnlund, Factfulness: Ten Reasons We’re Wrong about the World – and Why Things Are Better than You Think, Flatiron Books, 2018.
16. E. Graham-Harrison and C. Cadwalladr, “Cambridge Analytica Execs Boast of Role in getting Donald Trump Elected,” The Guardian, 21 Mar. 2018; https://www.theguardian.com/uk-news/2018/mar/20/cambridge-analytica-execs-boast-of-role-in-getting-trump-elected.
17. M. Wheeler, “What Did Mueller Achieve with the Internet Research Agency Indictment?,” blog, 17 Feb. 2018; http://www.emptywheel.net/2018/02/17/what-did-mueller-achieve-with-the-internet-research-agency-indictment.
18. M. Apuzzo and S. LaFraniere, “13 Russians Indicted as Mueller Reveals Effort to Aid Trump Campaign,” The New York Times, 16 Feb. 2018; https://www.nytimes.com/2018/02/16/us/politics/russians-indicted-mueller-election-interference.html.
19. H. Berghel, “On the Problem of (Cyber) Attribution,” Computer, vol. 50, no. 3, 2017, pp. 84–89.
20. C. Paul and M. Matthews, “The Russian ‘Firehose of Falsehood’ Propaganda Model: Why It Might Work and Options to Counter It,” RAND Corp., 2016; https://www.rand.org/content/dam/rand/pubs/perspectives/PE100/PE198/RAND_PE198.pdf.
21. C. Clover, “The Unlikely Origins of Russia’s Manifest Destiny,” Foreign Policy, 27 July 2016; https://foreignpolicy.com/2016/07/27/geopolitics-russia-mackinder-eurasia-heartland-dugin-ukraine-eurasianism-manifest-destiny-putin.
22. S. Bennett, “Beyond the Headlines: RAND’s Christopher Paul Discusses the Russian ‘Firehose of Falsehood,’” blog, 13 Dec. 2016; https://www.rand.org/blog/2016/12/beyond-the-headlines-rands-christopher-paul-discusses.html.
23. S. Shuster and S. Ifraimova, “A Former Russian Troll Explains How to Spread Fake News,” Time, 14 Mar. 2018; http://time.com/5168202/russia-troll-internet-research-agency.
24. E. Ferrara et al., “The Rise of Social Bots,” Comm. ACM, vol. 59, no. 7, 2016, pp. 96–104.
25. C. Metz and S. Lohr, “IBM Unveils System That ‘Debates’ with Humans,” The New York Times, 18 June 2018; https://www.nytimes.com/2018/06/18/technology/ibm-debater-artificial-intelligence.html.
26. R. Metcalfe, “Metcalfe’s Law after 40 Years of Ethernet,” Computer, vol. 46, no. 12, 2013, pp. 26–31.
27. D.P. Reed, “That Sneaky Exponential – Beyond Metcalfe’s Law to the Power of Community Building,” 1999; https://www.deepplum.com/dpr/locus/gfn/reedslaw.html.

28. H. Berghel, “Weaponizing Twitter Litter: Abuse-Forming Networks and Social Media,” Computer, vol. 51, no. 4, 2018.
29. W. Blum, Killing Hope: US Military and CIA Interventions since World War II, updated and rev. ed., Zed Books, 2014.
30. S. Kinzer, Overthrow: America’s Century of Regime Change from Hawaii to Iraq, Times Books, 2007.
31. E. Klein, ed., “What Is Political Polarization?,” Vox, 15 May 2015; https://www.vox.com/cards/congressional-dysfunction/what-is-political-polarization.
32. E. Voeten, “Polarization and Inequality,” blog, Oct. 2011; http://themonkeycage.org/2011/10/polarization-and-inequality.
33. K.T. Poole, “The Polarization of the Congressional Parties,” Mar. 2015; https://legacy.voteview.com/political_polarization_2014.htm.
34. L.E. Coursey, “Russia’s Plan for World Domination—and America’s Unwitting Cooperation with It,” blog; http://www.leecoweb.com/russian_plan.
35. W. Schwartau, Time Based Security, Interpact Press, 1999.

HAL BERGHEL is an IEEE and ACM Fellow and a professor of computer science at the University of Nevada, Las Vegas. Contact him at hlb@computer.org.

DANIEL BERLEANT is a professor of information science at the University of Arkansas at Little Rock and author of the book The Human Race to the Future (4th ed., Lifeboat Foundation, 2017). Contact him at berleant@gmail.com.

This article originally appeared in Computer, vol. 51, no. 8, 2018.


DEPARTMENT: Applications

CareerVis: Hierarchical Visualization of Career Pathway Data

Mingran Li1, Wenjie Wu1, Junhan Zhao1, Keyuan Zhou1, David Perkis2, Timothy N. Bond2, Kevin Mumford2, David Hummels2, Yingjie Victor Chen1,2
1Department of Computer Graphics Technology, Purdue University
2Krannert School of Management, Purdue University

Editor: Mike Potel, [email protected]

We present our CareerVis system, an interactive visualization tool to aid career education for high school and freshman college students. In addition to its practical use, we believe our design approach has potential to inspire the design community to develop simple visualizations that convey complex information to novice users.

To help students prepare for success in college, career, and life, education stakeholders have a long-term commitment to developing curriculum that allows students to explore their college and career options as well as their aptitudes and employability (www.doe.in.gov/sites/default/files/standards/cte-family-and-consumer-sciences/cf-busfacs-preparingcc_7-11-14.pdf). Studies have shown that the right kinds of education can help people ease their transition into the job market.1 Students should be able to plan for college and career pathways that are suitable for their interests, abilities, and lifelong goals.2

Within this context, the community needs efficient tools to help students better prepare for their careers after graduation.3 Based on a synthesized dataset from Purdue graduates’ job placement survey data and a national survey database (www.onetonline.org; www.mynextmove.org), we developed CareerVis, a visualization system that aims to help young students comprehend the broad range of educational and occupational paths as part of the college-career selection process.

To improve the efficiency of decisions made by students, parents, and other career education stakeholders, we need to anticipate the following frequently asked questions:

IEEE Computer Graphics and Applications, November/December 2018. Published by the IEEE Computer Society. 0272-1716/19/$33.00 ©2019 IEEE
ComputingEdge, May 2019. Published by the IEEE Computer Society. 2469-7087/19/$33.00 ©2019 IEEE


1) Which major should I choose? Which occupation should I pursue?
2) What majors can help me pursue this occupation?
3) What occupations am I qualified for with this major?
4) What are the characteristics of these majors and occupations?

THE DATA AND MESSAGE

Our underlying dataset contained flow information with hierarchical structure and multidimensional characteristics, which could be decomposed into several typical data structures in information visualization design.4 Particularly, the dataset was composed of college majors, occupations, flows of students from majors to their first jobs, 12 numeric measurements for majors (e.g., GPA and SAT scores), and 12 measurements for occupations (e.g., salaries and future trends in globalization and automation). The dataset presented the following challenges on visualization: 1) three different types of data formed a more complex data structure; 2) the relatively large amount of data; 3) the data would be presented for the general public, who have little experience with reading and understanding visualization applications; and 4) the visualization would also be published on a relatively small screen (e.g., a tablet or smartphone).

The data is inherently hierarchical. Purdue University’s West Lafayette campus houses 145 departments within the 10 colleges: Agriculture, Education, Engineering, Health and Human Sciences, Liberal Arts, Management, Pharmacy, Science, Technology, and Veterinary Medicine. Students’ occupations were aggregated into 130 specific job positions and put into occupation groups by the standard occupational classification system (www.bls.gov/soc). Moreover, the data contains the proportions of the student body enrolled in each major and placed in each job.

There are multiple pathways for students to pursue their ideal occupations from majors. For instance, among students who received accounting degrees, 70% secured accountant and auditor positions, while 17% worked as financial analysts, and 1%–3% worked in each of 8 other occupations. Conversely, apart from the two majors under Management (Accounting and General Management), about 10% of accountants come from majors under six different colleges, which range from Agriculture to Science. There are about one thousand possible major-to-occupation pathways.
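As a minimal sketch of how such pathway percentages could be derived, the following Python snippet normalizes graduate counts per major into shares per occupation. The records and the names `flows` and `pathway_shares` are hypothetical; the counts only loosely echo the accounting example above and are not taken from the CareerVis dataset.

```python
from collections import defaultdict

# Hypothetical (major, occupation, graduate count) records; illustrative only.
flows = [
    ("Accounting", "Accountants and Auditors", 70),
    ("Accounting", "Financial Analysts", 17),
    ("Accounting", "Other occupations", 13),
    ("General Management", "Accountants and Auditors", 5),
    ("General Management", "Other occupations", 95),
]


def pathway_shares(records):
    """Return {major: {occupation: percent of that major's graduates}}."""
    # First pass: total graduates per major.
    totals = defaultdict(int)
    for major, _occupation, count in records:
        totals[major] += count
    # Second pass: convert each flow into a percentage of its major's total.
    shares = defaultdict(dict)
    for major, occupation, count in records:
        shares[major][occupation] = 100.0 * count / totals[major]
    return {major: dict(by_occupation) for major, by_occupation in shares.items()}


shares = pathway_shares(flows)
```

Normalizing in the other direction (per occupation instead of per major) would answer the reverse question of which majors feed a given job.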
College majors and occupations have many numerical measurements that can offer students insight into the requirements of certain majors (e.g., GPA, SAT scores required for enrollment), important measurements for jobs (e.g., salary and work hours), characteristics of certain majors and occupations (e.g., percentage of Indiana students, percentage of domestic/international students, and diversity status of minorities), and future trends that could be affected by automation and globalization. These characteristics are complementary descriptions of majors and jobs necessary to guide students’ decision making. People may find certain jobs are better suited for family-oriented employees since their percentage of married workers is higher than other professions (e.g., engineering and construction). Some job opportunities may increase (or decrease) with the development of automation and globalization. Some characteristics have single percentage values (e.g., the percentage of the workforce from various ethnic groups), while others have percentile values of 10%, 25%, median, 75%, and 90%, such as salary, GPA, and SAT scores.
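To make the five percentile values concrete, here is a small sketch that computes the 10th, 25th, 50th, 75th, and 90th percentiles of a list of values. It assumes linear interpolation between neighboring ranks, which is one common convention; the article does not say which method was used, and the salary figures are invented for illustration.

```python
def percentile(sorted_values, p):
    """Linearly interpolated p-th percentile (0 <= p <= 100) of sorted data."""
    if not sorted_values:
        raise ValueError("need at least one value")
    # Fractional rank into the sorted list.
    rank = (p / 100.0) * (len(sorted_values) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = rank - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def five_point_summary(values):
    """Return the 10/25/50/75/90 percentile values as a dict keyed by percentile."""
    ordered = sorted(values)
    return {p: percentile(ordered, p) for p in (10, 25, 50, 75, 90)}


# Hypothetical starting salaries in thousands of dollars.
salaries = [40, 42, 45, 48, 50, 52, 55, 58, 60, 65, 70]
summary = five_point_summary(salaries)
```

A summary of this shape carries enough information to draw a box plot whose whiskers end at the 10th and 90th percentiles.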
DESIGN EXPLORATION AND ITERATIONS

Our team conducted several iterative designs and involved users in the design processes. From more than ten design ideas [e.g., Figure 1(a)], we selected the one with the simplest and most intuitive form to develop. Figure 1(b) shows our first formal design in horizontal layout. Characteristics are visualized as heat maps. Flows are presented in a hierarchical Sankey diagram. Our testing showed that the majority of users fully comprehended the hierarchies between the colleges and majors, as well as the


Figure 1. CareerVis’s early design concepts and iteration: (a) several concept designs; (b) horizontal layout; (c) better solutions for characteristic plots of vertical design; (d) vertical layout.

occupational groups and occupations. However, the horizontal design presented several problems; for example, the interface layout produced unnecessary information overlap.

To better utilize screen space and represent characteristics, we developed another version. The main design idea of hierarchical flow remained but was rotated into a vertical presentation. To improve the design of the side characteristics, we brainstormed many solutions [Figure 1(c)]. Ultimately, we selected a relatively intuitive option that featured a direct relationship between line and value to show the characteristics of verbal, quantitative, and reasoning skills [Figure 1(d)]. The straight line is highlighted in bright red when the user hovers over a particular major or occupation. Additionally, we presented other characteristics with bar graphs. Based on interviews with 64 participants, 90% of users were more satisfied with the vertical design. We further enhanced this design to address some remaining problems with the implementation of interactions, mainly the lack of a comparison function and the fact that once a user clicked on a college or an occupation group, the block expanded suddenly, causing the user to lose visual momentum.

VISUAL COMPONENTS

Our resulting CareerVis user interface provides multiple views for data exploration [Figure 2]. The central flow view displays the hierarchical levels of colleges, majors, occupational groups, and occupations, as well as the one-to-one relationship between major and occupation. Each particular major or occupation features a complete description. All relevant characteristics of majors and occupations are presented by scatter plots, box plots, and bar charts.5 In the top view, the system provides a guide and a search function. The system was developed using D3 (d3js.org).

Figure 2(a) shows the breadth of occupations chosen by students after graduating from college. In the central section, our team combined three essential visual components:

1) the rectangular blocks of the Purdue colleges and majors on the left side, where lengths of blocks represent the number of students that have graduated;
2) the similar visual element of the occupational groups and occupations on the right side, where lengths represent the number of students in the occupational group; and
3) the connection paths in the central region.
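The length encoding in items 1 and 2 can be sketched as a proportional split of the available axis. The function name `block_lengths`, the pixel values, and the student counts below are all assumptions made for illustration; the article does not describe the actual layout code.

```python
def block_lengths(counts, total_length, gap=0.0):
    """Split total_length among blocks in proportion to their student counts.

    counts maps a block name (college or occupation group) to a student count;
    gap is the empty space left between neighboring blocks.
    """
    total = sum(counts.values())
    if not counts or total <= 0:
        raise ValueError("need at least one block with a positive count")
    # Space that remains for the blocks once the gaps are reserved.
    usable = total_length - gap * (len(counts) - 1)
    return {name: usable * count / total for name, count in counts.items()}


# Hypothetical graduate counts and a 410-pixel axis with 10-pixel gaps.
lengths = block_lengths({"Engineering": 300, "Science": 100}, total_length=410.0, gap=10.0)
```

The same function could be applied to each side of the flow diagram and, after a college is opened, again to the majors inside that college's block.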


Figure 2. Overview of CareerVis system with the following six sections: (a) career flow; (b) major quantitative/verbal; (c) occupation quantitative/verbal; (d) major characteristics; (e) occupation characteristics; (f) accordion of characteristic plots.

The paths connecting the two sides show the percentages of students who graduated with a particular college major and obtained a specific job position. The college/major and occupation/occupational group are embodied in the two content-rich side bars. Clicking on a college expands it into its majors.

Characteristics plots [Figures 2(b), 2(c), 2(d), and 2(e)] provide users with detailed information about the colleges, majors, occupational groups, and occupations. For the verbal and quantitative scores, we use scatter plots, one for the college side and another for the occupational side.
Dots in the plot represent majors or occupations. The color of a dot is based on the major/occupation’s requirement of verbal and quantitative skills. For characteristics like salary, SAT score, and GPA, we take advantage of their 10th, 25th, 50th, 75th, and 90th percentile data and use box plots for their visualization. Users can not only compare the differences between majors and occupations, but they can also view the data distribution within the major or occupation.

The accordion6 [Figure 2(f)] was introduced to represent many characteristics of majors and occupations. With the accordion, the system can contain numerous characteristics’ charts within a relatively small space. Users can expand the characteristics they want and fold those in which they have no interest. An ordinary PC or laptop screen can hold four characteristics charts at the same time. The most important characteristics are opened by default at the top of the list when users enter the system. If a user opens several charts that extend beyond the screen, they can scroll down the screen.
The system automatically adjusts the position of the central flow visualization and keeps it centered in the current window.

Color is used to encode the two basic skills, quantitative and verbal, required by each occupation and major. If one major/occupation requires more quantitative skills, its color is


bluer. Yellow means the major/occupation requires balanced quantitative and verbal skills, whereas red means the major/occupation requires more verbal skills. The saturation of the color represents the strength of the skills these majors/occupations require. The quantitative/verbal Cartesian coordinate system is mapped into a hue (blue to yellow to red) and saturation polar coordinate system. The radius (saturation) is computed from the distance of the quantitative/verbal point (X/Y) to the origin (0, 0). The hue is computed from the angle of the quantitative/verbal point to the X-axis (verbal). Colors are used consistently across all sections: the quantitative/verbal scatter plots, the characteristic graphs, and the center major/occupation flows.
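The polar mapping described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes scores in the range 0 to 1, measures the angle from the verbal axis, interpolates linearly in RGB between red, yellow, and blue anchor colors, and treats saturation as a blend toward white. The function name `skill_color` and all of these choices are assumptions.

```python
import math

RED = (1.0, 0.0, 0.0)      # mostly verbal
YELLOW = (1.0, 1.0, 0.0)   # balanced
BLUE = (0.0, 0.0, 1.0)     # mostly quantitative
WHITE = (1.0, 1.0, 1.0)


def _lerp(a, b, t):
    """Linear interpolation between two RGB tuples."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def skill_color(quantitative, verbal):
    """Map skill scores in [0, 1] to an (r, g, b) tuple with components in [0, 1]."""
    # Angle from the verbal axis: 0 deg = purely verbal, 90 deg = purely quantitative.
    angle = math.degrees(math.atan2(quantitative, verbal))
    t = angle / 90.0
    # Assumed hue ramp: red -> yellow over the first half, yellow -> blue over the second.
    if t <= 0.5:
        base = _lerp(RED, YELLOW, t / 0.5)
    else:
        base = _lerp(YELLOW, BLUE, (t - 0.5) / 0.5)
    # Radius (distance to the origin) drives saturation, capped at 1.
    saturation = min(1.0, math.hypot(quantitative, verbal) / math.sqrt(2.0))
    return _lerp(WHITE, base, saturation)
```

Under these assumptions, a major with low scores on both skills comes out pale, while a major with high scores on both comes out as a strong yellow.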

INTERACTION AND ANIMATION

Brushing and Linking
Brushing and linking allow users to locate a major or occupation from the side charts and verify the corresponding values.7,8 The system’s brushing and linking occur when users hover over a bar in the center diagram, and the corresponding side chart elements are highlighted. When users hover over the College of Engineering in the center diagram, for example, the circles of its majors in the corresponding scatter plot are highlighted by a thin black stroke and other circles fade [Figure 3(a)], and the bars of the College of Engineering are highlighted in every opened characteristic chart by a gray line. The values of each characteristic are shown at the top of each gray line [Figure 3(b)], which allows users to easily confirm the exact values. Brushing and linking are active within both levels of the chart, which means that when hovering over a college, occupation, or major with an open center chart, the corresponding elements are all highlighted in the charts of the corresponding sides.

Figure 3. Brushing and linking of (a) scatter plot; (b) other characteristics charts.


Focus
A focus state was added to the bars for when users hover over the parallel sets. When a bar is focused, the opacity of the paths linked to this bar increases, and the opacity of the other paths decreases to the extent that users can easily distinguish the focused paths from other paths and distributions. There are some differences in the interactions between the first and second hierarchies, depending on which one becomes focused, but the percentage of each path will be displayed.

Contexts and Details
The system constantly locates users and compares the current focus with related information.8 There are two categories of context maintenance. The first is used in the flow path of the Sankey diagram [Figure 4(a)] and the scatter plot [Figure 4(b)]; the second is applied to the bar set of the Sankey diagram [Figure 4(c)], box plots, and bar charts of the second-level display [Figure 4(d)]. For the flow path, as mentioned above, when a bar set is focused, the opacity of the connected paths increases, and the other paths become nearly transparent. We maintain the other paths as light background context elements. We use the same strategy of highlighting the selected circles and fading the others within the scatter plot. As for the two levels, we keep the first-level elements displayed when the second level is opened. In the center Sankey diagram, the unselected first-level bars are shrunk and faded. In the box plots and bar charts, the unselected first-level bars are shrunk and moved to the left or right based on their position relative to the selected bar.
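The focus behavior for flow paths can be sketched as a small function that assigns an opacity to every path depending on whether it touches the hovered bar. The function name `path_opacities` and the three opacity values are hypothetical, picked only to illustrate the idea of keeping unfocused paths as faint context.

```python
def path_opacities(paths, focused_bar, focus_opacity=0.9, context_opacity=0.1,
                   default_opacity=0.5):
    """Return {(source, target): opacity} for the current hover state.

    paths is an iterable of (source, target) name pairs; focused_bar is the
    name of the hovered bar, or None when nothing is hovered.
    """
    opacities = {}
    for source, target in paths:
        if focused_bar is None:
            # Nothing hovered: every path uses the resting opacity.
            opacities[(source, target)] = default_opacity
        elif focused_bar in (source, target):
            # Paths linked to the focused bar stand out.
            opacities[(source, target)] = focus_opacity
        else:
            # Other paths stay visible as light background context.
            opacities[(source, target)] = context_opacity
    return opacities


# Hypothetical paths between majors and occupation groups.
paths = [
    ("Mechanical Engineering", "Engineers"),
    ("Accounting", "Accountants and Auditors"),
]
hovered = path_opacities(paths, "Mechanical Engineering")
resting = path_opacities(paths, None)
```

Keeping the context opacity above zero is what preserves the overall flow distribution while a single bar is in focus.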

Figure 4. Zoom in for contextual details: (a) central flow of majors in one college to occupations in a group; (b) scatter plot highlighting focused majors and occupations; (c) context of other college and occupation groups; (d) detailed characteristics of occupations within the context of all other groups.


Animated Transition
The system’s animated transitions help users understand the relationships between the two levels.9 The rich interaction causes the system to change its visual elements significantly. To maintain the cognitive coupling of the user with the system, we reinforce visual momentum by incorporating animated transitions in both the center diagram and the characteristic charts.10 For the center Sankey diagram, when a college bar is opened, the college bar gradually expands to a settled length, and the second-level bars, which are the majors within this college, gradually expand and occupy the college bar area. At the same time, other college bars shrink to the same smaller size for readability [Figure 5(a)]. The occupation side applies the same rule. The transitional animations of the characteristic charts follow the same rule: the selected bar gradually expands and disappears, the corresponding second-level bars gradually appear and expand, and the other first-level bars shrink [Figure 5(b)]. All of these animations take 0.5 s, a value selected after user testing of different animation lengths.
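A gradual expansion of this kind amounts to interpolating a bar's length over the transition time. The sketch below uses the 0.5 s duration reported above; the smoothstep easing curve and the name `bar_length_at` are assumptions, since the article does not state which easing was used.

```python
DURATION_S = 0.5  # transition length reported in the text


def bar_length_at(start, end, elapsed_s, duration_s=DURATION_S):
    """Length of a bar elapsed_s seconds into a transition from start to end."""
    # Normalized time, clamped so the bar rests at its end points.
    t = min(max(elapsed_s / duration_s, 0.0), 1.0)
    # Smoothstep easing (an assumed choice): slow start, slow finish.
    eased = t * t * (3.0 - 2.0 * t)
    return start + (end - start) * eased


# A hypothetical college bar growing from 40 to 120 pixels.
halfway = bar_length_at(40.0, 120.0, 0.25)
```

Running the shrinking bars through the same function with the start and end lengths swapped keeps all elements moving in step.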

Figure 5. Animation design context details of the (a) central flow and (b) two-side characteristics.

General Information Query With the Central Flow
Assume a user is a mechanical engineering (ME) student who wants to find out which occupations they might enter in the future. The user should first open the College of Engineering, mouse over ME to see its connections to several occupation groups [Figure 6(a)], then open the major occupation group of Engineers [Figure 6(b)] to see the most

Figure 6. Flow of interaction in the center graph.

52November/December 2018ComputingEdge 102 www.computer.org/cgaMay 2019 38mcg06-li-2874514.3d (Style 4) 26-03-2019 19:52 38mcg06-li-2874514.3d (Style 4) 26-03-2019 19:52

IEEE COMPUTER GRAPHICS AND APPLICATIONS APPLICATIONS

Animated Transition frequent occupations from ME. The user can then close both sides to return to the initial flow view [Figure 6(c)]. The system’s animated transitions help the users understand the relationships between the two levels.9 The rich interaction causes the system to change its visual elements significantly. To maintain the cognitive coupling of the user with the system, we reinforce visual momentum by incorporating Characteristics animated transitions in both the center diagram and the characteristic charts.10 For the center Sankey diagram, when a college bar is opened, the college bar gradually expands to a settled length, and the Assuming a user is interested in the engineering major, they can compare the differences in verbal and second college bar levels, which are the majors within this college, gradually expand and occupy the quantitative skills required for the major. When they open the College of Engineering chart, several college bar area. At the same time, other college bars shrink to the same smaller size for readability converging points, 0.65–0.9 verbal and 0.6–0.9 quantitative, are highlighted. When hovering on ME, [Figure 5(a)]. The occupation side applies the same rule. The transitional animations of the characteristic the dots of 0.9 verbal and 0.9 quantitative are highlighted with a red stroke [Figure 7(a)]. Compared charts follow the same rule: the selected bar gradually expands and disappears, the corresponding with ME, computer engineering requires on average 0.65 verbal and 0.8 quantitative [Figure 7(b)]. second level bars gradually appear and expand, and the other first level bars shrink [Figure 5(b)]. All of these animations take 0.5 s, a value selected after user testing of different animation lengths.

Figure 7. Hypothetical user’s CareerVis interface flow with the corresponding scatter plots. Explore and compare the values of quantitative and verbal skills between different college majors/occupations (a, b, and c).

Moreover, when the user moves on to electrical engineering, they find that this major requires 0.75 verbal and 0.8 quantitative. They decide the best possible occupation for their major is Electrical Engineer based on these statistics [Figure 7(c)]. Furthermore, the user proceeds to the side box salary plot and notices that the median earnings for that occupation are $65k [Figure 8(a)]. They learn that an electrical engineer works an average of 43 hours per week [Figure 8(b)], and that the occupation’s value of automation sensitivity is 10 [Figure 8(c)], which means it is very sensitive to the development of automation. They can open up different characteristic tabs to develop their overall knowledge of the pathway. Instead of clicking on a college/occupation group in the center flow bar to see all detailed majors and occupations, the user can click on a box or a bar in the characteristic graphs to open up the college/occupation in the center graph.
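The central-flow query from the walk-through above (hover a major to see its occupation groups, then open a group to see the most frequent occupations) can be sketched as a two-level aggregation. This is only an illustration: the record layout, the function names, and every count below are hypothetical and are not taken from the Purdue placement data or the CareerVis code.

```python
from collections import Counter

# Hypothetical placement records:
# (college, major, occupation group, occupation, number of graduates).
# All names and counts are made up for illustration.
FLOWS = [
    ("Engineering", "ME", "Engineers", "Mechanical Engineer", 40),
    ("Engineering", "ME", "Engineers", "Industrial Engineer", 12),
    ("Engineering", "ME", "Managers", "Project Manager", 8),
    ("Engineering", "EE", "Engineers", "Electrical Engineer", 35),
    ("Science", "Physics", "Engineers", "Electrical Engineer", 5),
]


def group_flows(major):
    """Occupation side closed: graduates from one major to each occupation group."""
    totals = Counter()
    for _, m, group, _, n in FLOWS:
        if m == major:
            totals[group] += n
    return totals


def top_occupations(major, group, k=3):
    """Occupation group opened: the k most frequent occupations for the major."""
    totals = Counter()
    for _, m, g, occupation, n in FLOWS:
        if m == major and g == group:
            totals[occupation] += n
    return totals.most_common(k)


print(group_flows("ME"))
print(top_occupations("ME", "Engineers"))
```

Keeping the records at the finest level (major to occupation) and aggregating on demand means the same data can feed both the closed view and the opened view.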

Figure 8. Hypothetical user’s CareerVis interface flow, including the corresponding bar graphs and box plots. Characteristics of the focused element marked by gray bars (a, b, and c).

USER FEEDBACK
In our usability testing, we designed quantitative and qualitative questions to compare participants’ understandings and expectations of the relationship between college majors and occupations. There were 68 valid participants, all of whom were first-year students at Purdue University. We recorded their mouse activities during the experiments and counted as valid only the participants who spent more than one minute actively using our system; the average time spent was more than 30 min. As a result, the overall compatibility of participants’ selected majors and occupations increased from 79% to 91% after using our system. Additionally, 13 changed their primary major selection, and 30 changed their primary occupation selection. They further provided feedback about how they evaluated this application by ranking from best to worst various aspects of this application: attractiveness, clearness, colorfulness, helpfulness, difficulties, effectiveness, and efficiency. Their responses indicated that this application received the highest scores in being helpful, colorful, and attractive, with difficulty and efficiency being areas for improvement.

In 2017, the Indiana Department of Education proposed a set of learning objectives for the curriculum to prepare students for college and careers (www.doe.in.gov/sites/default/files/standards/cf-bus-facs-pcc-01-2016.pdf). Purdue curriculum specialists examined our CareerVis tool and found that the tool could help achieve these learning objectives and that the tool would be useful to integrate into the course curricula to improve college and career teaching and learning.
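To put the reported counts in proportion, the short calculation below converts them to shares of the participant pool. It assumes that the 13 and 30 participants who changed their selections are counted among the 68 valid participants; the article does not state these shares itself.

```python
# Figures reported in the usability test.
VALID_PARTICIPANTS = 68
CHANGED_MAJOR = 13
CHANGED_OCCUPATION = 30
COMPAT_BEFORE, COMPAT_AFTER = 0.79, 0.91


def share(count, total=VALID_PARTICIPANTS):
    """Fraction of valid participants (assumes count is a subset of total)."""
    return count / total


print(f"changed major: {share(CHANGED_MAJOR):.1%}")            # 19.1%
print(f"changed occupation: {share(CHANGED_OCCUPATION):.1%}")  # 44.1%
gain_points = (COMPAT_AFTER - COMPAT_BEFORE) * 100
print(f"compatibility gain: {gain_points:.0f} percentage points")  # 12
```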

CONCLUSION
Based on recent years’ job placement data from Purdue students, we designed and developed our visualization system, CareerVis, to represent the vast choices of education and pathways to the occupation. Using innovative yet simple visual forms, we believe we solved two big challenges in the visualization. First, our system is able to deal effectively with hundreds of majors and occupations, and a thousand possible pathways. Second, our system has proven to be easily comprehensible by a general audience, such as high school students, parents, and educators. Collaborating with experts in education, we are now working toward integrating this system with curriculum to address the Indiana Department of Education’s high school academic standards for Preparing for College and Careers.

ACKNOWLEDGMENTS
This work was supported by a grant to Purdue University from the Lilly Endowment Inc. The grant is part of the Lilly Endowment’s “Round III: Initiative to Promote Opportunities through Educational Collaborations.”

Access the CareerVis visualization system at: va.tech.purdue.edu/careerVis.

REFERENCES
1. F. A. Levy, The New Division of Labor: How Computers are Creating the Next Job Market. Princeton, NJ, USA: Princeton Univ. Press, 2005.
2. N. J. Evans, D. S. Forney, F. M. Guido, L. D. Patton, and K. A. Renn, Student Development in College: Theory, Research, and Practice. Hoboken, NJ, USA: Wiley, 2009.
3. W. A. Anderson, “Important events in career counseling: Client and counselor descriptions,” Career Dev. Quarter., vol. 48, pp. 251–263, 2000.
4. B. Lee, C. Plaisant, C. S. Parr, J. D. Fekete, and N. Henry, “Task taxonomy for graph visualization,” in Proc. AVI Workshop Beyond Time Errors: Novel Eval. Methods Inf. Vis., May 2006, pp. 1–5.
5. J. Heer, M. Bostock, and V. Ogievetsky, “A tour through the visualization zoo,” Queue, vol. 8, no. 5, p. 20, 2010.
6. A. A. Cooper, About Face: The Essentials of Interaction Design. Hoboken, NJ, USA: Wiley, 2014.
7. A. Buja, D. Cook, and D. F. Swayne, “Interactive high-dimensional data visualization,” J. Comput. Graph. Statist., vol. 5, no. 1, pp. 78–99, 1996.
8. J. S. Yi, Y. ah Kang, and J. Stasko, “Toward a deeper understanding of the role of interaction in information visualization,” IEEE Trans. Vis. Comput. Graph., vol. 13, no. 6, pp. 1224–1231, Nov./Dec. 2007.
9. J. Stasko and E. Zhang, “Focus+context display and navigation techniques for enhancing radial, space-filling hierarchy visualizations,” in Proc. IEEE Symp. Inf. Vis., 2000, pp. 57–65.
10. D. D. Woods, “Visual momentum: A concept to improve the cognitive coupling of person and computer,” Int. J. Man-Mach. Studies, vol. 21, no. 3, pp. 229–244, 1984.


ABOUT THE AUTHORS
Mingran Li is currently working toward the Ph.D. degree with the research focus on information visualization design at Purdue University, West Lafayette, IN, USA. Contact her at [email protected].

Wenjie Wu is a User Experience Designer. She received the Master of Science degree from Purdue University, West Lafayette, IN, USA. Contact her at wenjie.jessie. [email protected].

Junhan Zhao is currently working toward the Ph.D. degree at Polytechnic Institute, Purdue University, West Lafayette, IN, USA. Contact him at [email protected].

Keyuan Zhou is a Research Assistant with the Polytechnic Institute, Purdue University, West Lafayette, IN, USA. Contact him at [email protected].

David Perkis is Director of the Purdue Center for Economics, Purdue University, West Lafayette, IN, USA. Contact him at [email protected].

Timothy N. Bond is an Assistant Professor with the Economics Krannert School of Management, Purdue University, West Lafayette, IN, USA. Contact him at [email protected].

Kevin Mumford is Director of the Purdue University Research Center in Economics (PURCE), West Lafayette, IN, USA. Contact him at [email protected].

David Hummels is Dean and Professor of economics with Purdue University, West Lafayette, IN, USA. Contact him at [email protected].

Yingjie Victor Chen is an Associate Professor with the Department of Computer Graphics Technology, Purdue University, West Lafayette, IN, USA. Contact him at [email protected].

Contact department editor Mike Potel at [email protected].

This article originally appeared in IEEE Computer Graphics and Applications, vol. 38, no. 6, 2018.

Conference Calendar
Questions? Contact [email protected]

IEEE Computer Society conferences are valuable forums for learning on broad and dynamically shifting topics from within the computing profession. With over 200 conferences featuring leading experts and thought leaders, we have an event that is right for you.

Find a region: Africa ■ Australia ◆ North America ◗ Asia ▲ Europe ● South America ★

JUNE
2 June
• ICESS (IEEE Int’l Conf. on Embedded Software and Systems) ◗
• JCDL (ACM/IEEE Joint Conf. on Digital Libraries) ◗
3 June
• AIKE (IEEE Second Int’l Conf. on Artificial Intelligence and Knowledge Eng.) ●
5 June
• CBMS (IEEE 32nd Int’l Symposium on Computer-Based Medical Systems) ●
9 June
• WoWMoM (IEEE 20th Int’l Symposium on “A World of Wireless, Mobile and Multimedia Networks”) ◗
10 June
• ARITH (IEEE 26th Symposium on Computer Arithmetic) ▲
• ICHI (IEEE Int’l Conf. on Healthcare Informatics) ▲
• MDM (20th IEEE Int’l Conf. on Mobile Data Management) ▲
12 June
• ICIS (IEEE/ACIS 18th Int’l Conf. on Computer and Information Science) ▲
• SMARTCOMP (IEEE Int’l Conf. on Smart Computing) ◗
• WETICE (IEEE 28th Int’l Conf. on Enabling Technologies: Infrastructure for Collaborative Enterprises) ●
15 June
• CVPR (IEEE/CVF Conf. on Computer Vision and Pattern Recognition) ◗
16 June
• SASO (IEEE 13th Int’l Conf. on Self-Adaptive and Self-Organizing Systems) ●
17 June
• EuroS&P (IEEE European Symposium on Security and Privacy) ●
21 June
• CSCloud (6th IEEE Int’l Conf. on Cyber Security and Cloud Computing) ●
• EdgeCom (5th IEEE Int’l Conf. on Edge Computing and Scalable Cloud) ●
22 June
• ISCA (ACM/IEEE 46th Annual Int’l Symposium on Computer Architecture) ◗
24 June
• DSN (2019 49th Annual IEEE/IFIP Int’l Conf. on Dependable Systems and Networks) ◗
• ECMSM (IEEE Int’l Workshop of Electronics, Control, Measurement, Signals and their application to Mechatronics) ●
• IC2E (IEEE Int’l Conf. on Cloud Eng.) ●
• LICS (2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science) ◗
25 June
• CSF (IEEE 32nd Computer Security Foundations Symposium) ◗

JULY
8 July
• ICME (IEEE Int’l Conf. on Multimedia and Expo) ▲
• ICMEW (IEEE Int’l Conf. on Multimedia & Expo Workshops) ▲
• SERVICES (IEEE World Congress on Services) ●
• SNPD (20th IEEE/ACIS Int’l Conf. on Software Eng., Artificial Intelligence, Networking and Parallel/Distributed Computing) ▲
15 July
• ASAP (IEEE 30th Int’l Conf. on Application-specific Systems, Architectures and Processors) ◗
• CBI (IEEE 21st Int’l Conf. on Business Informatics) ▲
• COMPSAC (IEEE 43rd Annual Computer Software and Applications Conf.) ◗
• ICALT (19th IEEE Int’l Conf. on Advanced Learning Technologies) ★
• ISVLSI (IEEE Computer Society Annual Symposium on VLSI) ◗
23 July
• ICCI*CC (IEEE 18th Int’l Conf. on Cognitive Informatics & Cognitive Computing) ●
30 July
• IRI (IEEE 20th Int’l Conf. on Information Reuse and Integration for Data Science) ◗
• SMC-IT (IEEE Int’l Conf. on Space Mission Challenges for Information Technology) ◗

AUGUST
1 August
• CSE (IEEE Int’l Conf. on Computational Science and Eng.) ◗
• EUC (IEEE Int’l Conf. on Embedded and Ubiquitous Computing) ◗
5 August
• TrustCom (IEEE Int’l Conf. on Trust, Security and Privacy in Computing and Communications) ◆
8 August
• 2019 Cloud Summit ◗
9 August
• SmartIoT (IEEE Int’l Conf. on Smart Internet of Things) ▲
10 August
• HPCC (IEEE 21st Int’l Conf. on High Performance Computing and Communications) ▲
• SmartCity (IEEE 17th Int’l Conf. on Smart City) ▲
• DSS (IEEE 5th Int’l Conf. on Data Science and Systems) ▲
15 August
• NAS (IEEE Int’l Conf. on Networking, Architecture and Storage) ▲
18 August
• HCS (IEEE Hot Chips 31 Symposium) ◗
• RTCSA (IEEE 25th Int’l Conf. on Embedded and Real-Time Computing Systems and Applications) ▲
27 August
• ASONAM (IEEE/ACM Int’l Conf. on Advances in Social Networks Analysis and Mining) ◗

SEPTEMBER
13 September
• EWDTS (IEEE East-West Design & Test Symposium) ◗
19 September
• AVSS (16th IEEE Int’l Conf. on Advanced Video and Signal Based Surveillance) ▲
• ESEM (ACM/IEEE Int’l Symposium on Empirical Software Eng. and Measurement) ★

Learn more about IEEE Computer Society Conferences: www.computer.org/conferences


Mind your business.

Preserve your good name. IEEE Member Professional Liability Insurance Plan. Protect what you’ve worked so hard to achieve.

To learn more*, call 1-800-375-0775 or visit IEEEinsurance.com/IEEEPL

Canadian IEEE members can visit www.ieeeinsurance.com/canadapl for more information about the insurance program brokered by Marsh Canada Limited and underwritten by Certain Underwriters at Lloyd’s of London. * The IEEE Member Professional Liability Insurance Program with the Choice Platform is available to active IEEE members who reside in the U.S. IEEE members in Canada (excluding Quebec) have access to the IEEE Member Professional Liability Insurance Plan through Marsh Canada Limited. Coverage options may vary or may not be available in all states. Not all plan features will be available under all carriers or plan options. Program Administered by Mercer Health & Benefits Administration LLC In CA d/b/a Mercer Health & Benefits Insurance Services LLC AR Insurance License #100102691 | CA Insurance License #0G39709

85777 (5/19) Copyright 2019 Mercer LLC. All rights reserved.
