Learning Deep Latent Models for Text Sequences Lei Li Bytedance AI Lab

Total Page:16

File Type:pdf, Size:1020Kb

Learning Deep Latent Models for Text Sequences Lei Li Bytedance AI Lab 2020 Learning Deep Latent Models for Text Sequences Lei Li ByteDance AI Lab 4/29/2020 The Rise of New Media Platforms Toutiao Helo Douyin/Tiktok 2 Huge Demand for Automatic Content Generation Technologies • Automatic News Writing • Author writing assist tools – Title generation and text summarization • Automatic Creative Advertisement Design • Dialog Robots w/ response generation • Translation of content across multiple languages • Story Generation 3 4 Automated News Writing Xiaomingbot is deployed and constantly producing news on social media platforms (TopBuzz & Toutiao). 5 AI to Improve Writing Text generation to rescue! 6 Outline 1. Overview 2. Learning disentangled latent representation for text 3. Mirror-Generative NMT 4. Multimodal machine writing 5. Summary 7 Disentangled Latent Representation for Text VTM [R. Ye, W. Shi, H. Zhou, Z. Wei, Lei Li, ICLR20b] DSS-VAE [Y. Bao, H. Zhou, S. Huang, Lei Li, L. Mou, O. Vechtomova, X. Dai, J. Chen, ACL19c] Natural Language Descriptions name Sukiyaki eatType pub Sukiyaki is a Japanese food Japanese restaurant. It is a pub and it has a price average average cost and rating good good rating. It is area seattle based in seattle. 9 Data to Text Generation Data Table Sentence <key, value> Medical The blood pressure is higher than Reports normal and may expose to the risk of hypertension Style long dress Made of poplin, this long dress has Painting bamboo ink Fashion Product an ink painting of bamboo and Texture poplin Description feels fresh and smooth. Feel smooth Sia Kate Isobelle Furler (born Name: Sia Kate Isobelle Furler 18 December 1975) is an DoB: 12/18/1975 Person Australian singer, songwriter, Nationality: Australia Biography Occupation: Singer, voice actress and music Songwriter video director. 10 [1] The E2E Dataset: New Challenges For End-to-End Generation. https://github.com/tuetschek/e2e-dataset [2] Can Neural Generators for Dialogue Learn Sentence Planning and Discourse Structuring?. https://nlds.soe.ucsc.edu/sentence-planning- NLG Problem Setup • Inference: – Given: table data x, as key-position-value triples. – e.g. Name: Jim Green => (Name, 0, Jim), (Name, 1, Green) – Output: fluent, accurate and diverse text sequences y • Training: – N : pairs of table data and text. {⟨xi, yi⟩}i=1 – M : raw text corpus. {yj}j=1 M ≫ N 11 Why is Data-to-Text Hard? • Desired Properties: – Accuracy: semantically consistent with the content in the table – Diversity: Ability to generate infinite varying utterances • Scalability: real-time generation, latency, throughput (QPS) • Training: limited table-text pairs 12 Previous Idea: Templates [name] is a [food] restaurant. It is a [eatType] and it has a [price] cost and [rating] Sukiyaki is a Japanese rating. It is in [area]. restaurant. It is a pub and it has a name Sukiyaki average cost and eatType pub good rating. It is in food Japanese seattle. price average But manually creation of rating good templates are tedious area seattle 13 Our Motivation for Variational Template Machine Motivation 1: Continuous and Template disentangled representation for Sentence template and content Content Table Motivation 2: Incorporate raw text corpus to learn good q (template, Raw text content | representation. sentence) 14 VTM [R. Ye, W. Shi, H. Zhou, Z. Wei, Lei Li, ICLR20b] Variational Template Machine Input: triples of <field_name, table position, value> data x f p v K template {xk, xk , xk }k=1 variable 1. p(c|x) ∼ Neural Net content k k k variable c z maxpool(tanh(W ⋅ [xf , xp, xv ] + b)) 2. Sample , e.g. z ∼ p0(z) Gaussian text y 3. Decode y from [c, z] using another NN (e.g. Transformer) 15 VTM [R. Ye, W. Shi, H. Zhou, Z. Wei, Lei Li, ICLR20b] Training VTM Key idea: Disentangling table content and templates while data x template preserving as much variable information as possible! content Total loss = variable c z Reconstruction loss + text y Information-Preserving loss Variational Inference Instead of optimizing exact and table intractable expected likelihood, data x minimizing the (tractable) template variational lower bounds. variable −E log dz content �� = �(� �(�), �)�(�) variable c z ∫ ELBO E log KL p = − �(� �) �(� �(�), �) + [�(�|�)||�(�)] � = −E log �(� �, �)�(�)�(�)dzd� � ∬ text y ELBO E log r = − �(� �)�(� �) �(� �, �) +KL[�(�|�)| �(�)] + KL[�(�|�)||�(�)] 17 Preserving Content & Template 1. Content preserving loss 2 table lcp = �q(c|y) |c − f(x)| + DKL(q(c|y) ∥ p(c)) data x template 2. Template preserving loss of variable pairs content ltp = − �q(z|y)[log p(y˜ |z, x)] variable c z y˜ is the text sketch by removing table entry i.e. cross entropy of variational text y prediction from templates Preserving Template Ensure the template variable could recover the text sketch table Table data : data x x template {name[Loch Fyne], variable eatType[restaurant], food[French] content price[below $20]} c z variable Text y: Loch Fyne is a French restaurant catering to a budget of below $20. text y Text Sketch y˜: <ent> is a <ent> <ent> catering to a budget of <ent>. 19 Learning with Raw Corpus • Semi-supervised learning: “Back-translate” corpus to obtain pseudo-parallel pairs <table, text>, to enrich the learning Table Text name Sukiyaki eatType pub Sukiyaki is a Japanese restaurant. It is food Japanese price average a pub and it has a average cost and rating good good rating. It is in seattle. area seattle ? Known for its creative flavours, Holycrab's signatures are the Hokkien crab. q(<c,z>|y) 20 Evaluation Setup • Tasks – WIKI: generating short-bio from person profile. – SPNLG: generating restaurant description from attributes Train Valid Test Dataset table-text table-text table-text raw text raw text pairs pairs pairs WIKI 84k 842k 73k 43k 73k SPNLG 14k 150k 21k / 21k • Evaluation Metric: – Quality (Accuracy): BLEU score to ground-truth – Diversity: self-BLEU (lower is better) 21 VTM Produces High-quality and Diverse Text Ideal WIKI Ideal SPNLG 0.3 0.5 VTM T2S-beam T2S-beam 0.256 0.41 VTM T2S- T2S- 0.212 pretrain 0.32 pretrain BLEU 0.23 BLEU 0.168 0.14 0.124 Temp-KN 0.05 Temp-KN 0.08 0.3 0.48 0.66 0.84 1.02 0.6 0.705 0.81 0.915 1.02 Self-BLEU Self-BLEU VTM uses beam-search decoding. 22 VTM [Ye, …, Lei Li, ICLR20b] Raw data and loss terms are necessary Ablation results on Wiki-bio dataset ideal 0.26 VTM 0.22 w/o raw data ↑ 0.18 BLEU w/o 0.14 information-preserving losses 0.1 0.7 0.75 0.8 0.85 0.9 Self-BLEU↓ 23 Interpreting VTM Template variable project to 2D 24 VTM Generates Diverse Text Input Data Table Generated Text 1: John Ryder (8 August 1889 – 4 April 1977) was an Australian cricketer. 2: Jack Ryder (born August 9, 1889 in Victoria, Australia) was an Australian cricketer. 3: John Ryder, also known as the king of Collingwood (8 August 1889 – 4 April 1977) was an Australian cricketer. 25 Learning Disentangled Representation of Syntax and Semantics DSSVAE enables learning and syntactic semantic transferring sentence-writing styles style content Syntax provider Semantic content zsyn zsem There is an apple The dog is on the table behind the door DSSVAE x sentence There is a dog behind the door 26 DSS-VAE [Y. Bao, H. Zhou, S. Huang, Lei Li, L. Mou, O. Vechtomova, X. Dai, J. Chen, ACL19c] Impact • VTM and its extensions have been applied to multiple online systems on Toutiao including query suggestion generation, ads bid-word generation, etc. • Serving over 100million active users. • 10% of query suggestion phrases from the generation algorithm. 27 Part I Takeaway VTM DSSVAE table syntactic semantic data •Deepx latent models enable learning with bothtemplate table-text pairsstyle and content unpairedvariable text, with high zaccuracyz content• Disentangling approach forsyn modelsem variable compositionc z •Variational technique to speed up inference x text y sentence 28 Outline 1. Overview of Intelligent Information Assistant 2. Learning disentangled latent representation for text 3. Mirror-Generative NMT 4. Multimodal machine writing 5. Summary and Future Directions 29 Neural Machine Translation • Neural machine translation (NMT) systems are super good when you have large amount of parallel bilingual data src2tgt [p(y|x; θxy)] Chinese English tgt2src TM x ∼ �xy y ∼ �xy [p(x|y; θyx)] • BUT, very expensive/non-trivial to obtain – Low resource language pairs (e.g., English-to-Tamil) – Low resource domains (e.g., social network) • Large-scale mono-lingual data are not fully utilized 30 Existing approaches to exploit non- parallel data • There are two categories of methods using non-parallel data – Training ‣ Back-translation, Joint Back-translation, dual learning… – Decoding ‣ Interpolation w/ external LM … • Still not the best 31 So, what we expect? • A pair of relevant TMs so that they can directly boost each other in training src2tgt tgt2src [p[p(y(y|x|,xz;;θθ )])] [p[p(x(x|y|,yz;;θθyxyx))]] xyxy z(x, y) • A pair of relevant TM & LM src2tgt target LM [p[p(y(y|x|,xz;;θθ )])] [p[p(y(y|;z;θθy)y])] so that they can cooperate xyxy z(x, y) more effectively for better decoding We need a bridge 32 Integrating Four Language Skills with MGNMT MGNM target src2tg [p(y|z; θy)] [p(y|x, z; θxy)] x y z(x, y) y x source tgt2sr [p(x|z; θx)] [p(x|y, z; θyx)] 1. composing sentence in Source lang Benefits utilizing both 2. composing sentence in Target lang parallel bilingual data 3. translating from source to target and non- 4. translating from target to source parallel corpus 33 MGNMT [Z. Zheng, H. Zhou, S. Huang, L. Li, X. Dai, J. Chen, ICLR 2020a] y src2tgt TM src LM [p(y|x, z)] [p(x|z)] x MGNMT y z(x, y) tgt LM tgt2src TM [p(y|z)] [p(x|y, z)] x Published as a conference paper at ICLR 2020 y z z src2tgt TM src LM MGNMT [p(y|x, z)] [p(x|z)] target LM src2tgt TM [p(y|z; θy)] [p(y|x, z; θxy)] x y x y x y x MGNMT y z(x, y) z(x, y) GNMT MGNMT y x tgt LM tgt2src TM source LM tgt2src TM [p(x|z; θ )] [p(x|y, z; θ )] Figure 1: The [p(y|z)] [p(x|y, z)] x yx graphical model of MGNMT.
Recommended publications
  • China in 50 Dishes
    C H I N A I N 5 0 D I S H E S CHINA IN 50 DISHES Brought to you by CHINA IN 50 DISHES A 5,000 year-old food culture To declare a love of ‘Chinese food’ is a bit like remarking Chinese food Imported spices are generously used in the western areas you enjoy European cuisine. What does the latter mean? It experts have of Xinjiang and Gansu that sit on China’s ancient trade encompasses the pickle and rye diet of Scandinavia, the identified four routes with Europe, while yak fat and iron-rich offal are sauce-driven indulgences of French cuisine, the pastas of main schools of favoured by the nomadic farmers facing harsh climes on Italy, the pork heavy dishes of Bavaria as well as Irish stew Chinese cooking the Tibetan plains. and Spanish paella. Chinese cuisine is every bit as diverse termed the Four For a more handy simplification, Chinese food experts as the list above. “Great” Cuisines have identified four main schools of Chinese cooking of China – China, with its 1.4 billion people, has a topography as termed the Four “Great” Cuisines of China. They are Shandong, varied as the entire European continent and a comparable delineated by geographical location and comprise Sichuan, Jiangsu geographical scale. Its provinces and other administrative and Cantonese Shandong cuisine or lu cai , to represent northern cooking areas (together totalling more than 30) rival the European styles; Sichuan cuisine or chuan cai for the western Union’s membership in numerical terms. regions; Huaiyang cuisine to represent China’s eastern China’s current ‘continental’ scale was slowly pieced coast; and Cantonese cuisine or yue cai to represent the together through more than 5,000 years of feudal culinary traditions of the south.
    [Show full text]
  • Joint Civil Society Report Submitted to the Committee on the Elimination of Racial Discrimination
    Joint Civil Society Report Submitted to The Committee on the Elimination of Racial Discrimination for its Review at the 96th Session of the combined fourteenth to seventeenth periodic report of the People’s Republic of China (CERD/C/CHN/14-17) on its Implementation of the Convention on the Elimination of All Forms of Racial Discrimination Submitters: Network of Chinese Human Rights Defenders (CHRD) is a coalition of Chinese and international human rights non-governmental organizations. The network is dedicated to the promotion of human rights through peaceful efforts to push for democratic and rule of law reforms and to strengthen grassroots activism in China. [email protected] https://www.nchrd.org/ Equal Rights Initiative is a China-based NGO monitoring rights development in Western China. For the protection and security of its staff, specific identification information has been withheld. Date of Submission: July 16, 2018 Table of Contents I. Executive Summary Paras. 1-2 II. Recommendations Para. 3 III. Thematic Issues & Findings A. Legislation underpinning discriminatory counter-terrorism policies Paras. 4-7 [Articles 2 (c) and 4; List of Themes para. 8] B. Militarized policing, invasive surveillance, and constant monitoring Paras. 8-21 [Articles 3, 4, and 5 (a-b); LOT para. 22] C. Extrajudicial detention, forced disappearances, torture, and other abuses Paras. 22-28 in “Re-education” camps [Article 5 (a)(b)(d); LOT para. 21] D. Counter-terrorism used to justify arbitrary detention and discriminatory Paras. 29-34 punishment of ethnic minorities [Articles 4 and 5 (a)(b)(d); LOT paras. 6 and 21] E. Discrimination and restrictions on religious freedom Paras.
    [Show full text]
  • The Rural Video Influencers in China: on the New Edge of Urbanization
    THE RURAL VIDEO INFLUENCERS IN CHINA: ON THE NEW EDGE OF URBANIZATION A Thesis Presented to the Faculty of the Graduate School of Cornell University In Partial Fulfillment of the Requirements for the Degree of Master of Arts by Xinwen Zhang May 2020 © 2020 Xinwen Zhang ABSTRACT On the new media platforms in China, especially the video platforms, some rural content has become quite influential, which is, to some extent, inconsistent with people's impression of the vulnerable position of rural areas the urban-rural inequality. This thesis studied some of the most popular the rural content producers with the close reading of their videos, explaining how they frame themselves and their artworks and what kind of "rurality" is performed to the audiences. While the audience might assign them as rural figures, these people were on the edge of the urban and the rural as they had a shared history as some people who were from the rural areas, spent a period of their lives as migrant workers, and finally became video influencers performing some kind of rural lifestyle. BIOGRAPHICAL SKETCH Xinwen Zhang was born in Shenyang, China, and spent the first 18 years of her life there. Then, she went to Sun Yat-sen University and entered Boya college, which sets no determined major and emphasizes on text close reading to explore potential for students. She finally chose anthropology as the major and researched the breakfast stalls in New Phoenix Village. She went to the village and had close contact with the migrant workers running breakfast stalls and people at breakfast, mainly focusing on the population identification, identity cognition, and community construction of the migrant population as well as their dilemma.
    [Show full text]
  • Analysis of Bytedance with a Close Look on Douyin / Tiktok ——————————————————
    —————————————————— Analysis of ByteDance with a close look on Douyin / TikTok —————————————————— Author: Xiaoye SHI Supervisor: Prof. Dr. Didier SORNETTE A thesis submitted to the Department of Management, Technology and Economics (D-MTEC) of the Swiss Federal Institute of Technology Zurich (ETHZ) in partial fulfilment of the requirements for the degree of Master of Science in Management, Technology and Economics June 2019 Acknowledgements I would like to thank my supervisor Prof. Dr. Didier Sornette, who has kindly given me countless guidance and advice throughout the Master thesis. I appreciate the opportunity to work on this interesting topic under the Chair of Entrepreneurial Risks and is deeply grateful for all the help received along the process. I also want to express my deepest gratitude to my parents, who have been supportive and encouraging under all circumstances. Without them, I would not be able to become the person I am today. - 2 - Abstract In 2018, ByteDance, a young Internet company with only 6 years of history, broke out on various news headlines as the highest valued unicorn. With the acquisitions of musical.ly and Flipgram, the company’s flagship product Douyin strikes to develop its global presence under the name TikTok. This thesis analyzed Douyin’s historical growth and revenue model. As a main revenue driver, future user growth is predicted and calibrated by extending the methodology proposed in earlier studies by Cauwels and Sornette. We considered three growth scenarios – base, high and extreme, and estimated Douyin as well as ByteDance’s value based on comparable company analysis. ByteDance’s key performance metrics and multiples were compared with four other firms in the similar industry, Facebook, Weibo, Momo and iQIYI.
    [Show full text]
  • China (Includes Tibet, Hong Kong, and Macau) 2018 Human Rights Report
    CHINA (INCLUDES TIBET, HONG KONG, AND MACAU) 2018 HUMAN RIGHTS REPORT EXECUTIVE SUMMARY The People’s Republic of China (PRC) is an authoritarian state in which the Chinese Communist Party (CCP) is the paramount authority. CCP members hold almost all top government and security apparatus positions. Ultimate authority rests with the CCP Central Committee’s 25-member Political Bureau (Politburo) and its seven-member Standing Committee. Xi Jinping continued to hold the three most powerful positions as CCP general secretary, state president, and chairman of the Central Military Commission. Civilian authorities maintained control of security forces. During the year the government significantly intensified its campaign of mass detention of members of Muslim minority groups in the Xinjiang Uighur Autonomous Region (Xinjiang). Authorities were reported to have arbitrarily detained 800,000 to possibly more than two million Uighurs, ethnic Kazakhs, and other Muslims in internment camps designed to erase religious and ethnic identities. Government officials claimed the camps were needed to combat terrorism, separatism, and extremism. International media, human rights organizations, and former detainees reported security officials in the camps abused, tortured, and killed some detainees. Human rights issues included arbitrary or unlawful killings by the government; forced disappearances by the government; torture by the government; arbitrary detention by the government; harsh and life-threatening prison and detention conditions; political prisoners;
    [Show full text]
  • China's Propensity for Innovation in the 21St Century
    C O R P O R A T I O N STEVEN W. POPPER, MARJORY S. BLUMENTHAL, EUGENIU HAN, SALE LILLY, LYLE J. MORRIS, CAROLINE S. WAGNER, CHRISTOPHER A. EUSEBI, BRIAN CARLSON, ALICE SHIH China's Propensity for Innovation in the 21st Century Identifying Indicators of Future Outcomes For more information on this publication, visit www.rand.org/t/RRA208-1 Library of Congress Cataloging-in-Publication Data is available for this publication. ISBN: 978-1-9774-0596-8 Published by the RAND Corporation, Santa Monica, Calif. © Copyright 2020 RAND Corporation R® is a registered trademark. Cover: Blackboard/Adobe Stock; RomoloTavani/Getty Images Limited Print and Electronic Distribution Rights This document and trademark(s) contained herein are protected by law. This representation of RAND intellectual property is provided for noncommercial use only. Unauthorized posting of this publication online is prohibited. Permission is given to duplicate this document for personal use only, as long as it is unaltered and complete. Permission is required from RAND to reproduce, or reuse in another form, any of its research documents for commercial use. For information on reprint and linking permissions, please visit www.rand.org/pubs/permissions. The RAND Corporation is a research organization that develops solutions to public policy challenges to help make communities throughout the world safer and more secure, healthier and more prosperous. RAND is nonprofit, nonpartisan, and committed to the public interest. RAND’s publications do not necessarily reflect the opinions
    [Show full text]
  • Social Media
    中国媒体概览 China Media Overview Social Media © 2016 Beyond Summits 3 Summary: • 90s generation has become the main users of Chinese social media. 90s generation prefer to watch internet videos, while 80s and 70s prefer to shop online and read news respectively. • Chinese netizens open and use 6-10 media APPs on average every day. The most frequently used social media platforms are WeChat, QQ and Weibo. When they use social media platforms, they outweigh ‘media’ nature of the platforms to obtain news and information as well as share their lives instead of being ‘social’. • WeChat users are younger and more frequently browse the platform. The users not only acquire common socialization by WeChat, but obtain information and life convenience as well. • Most of the Weibo users are male whose ages are relatively older than that of WeChat. Geographically, the Weibo users concentrate in such areas as Shanghai, Zhejiang Province, Jiangsu Province and the Zhujiang Delta. © 2016 Beyond Summits 4 Summary: • Currently, various APPs developed by such Internet tycoons as Baidu, Alibaba and Tencent are taking leading positions in China’s mobile Internet market. • Mobile Social Platforms: Tencent’s two APPs outperformed their competitive APPs. • Mobile Entertainment Platforms: The top3 mobile video APPs are Tencent, Youku and IQIYI. And the top3 mobile music APPS are KUGOU music, QQ music and KUWO music. • Mobile News Platforms: The top5 APPs are Tencent, TOUTIAO, 163 news, SOHU news and IFENG news. • Mobile Tourism: In this October, a merger between QUNAR and CTRIP produced the greatest mobile tourism APP in China. • Mobile E-commerce: TAOBAO outperformed other APPs.
    [Show full text]
  • Chinese Tech Landscape Overview NSCAI Presentation
    Chinese Tech Landscape Overview NSCAI Presentation epic.org EPIC-19-09-11-NSCAI-FOIA-20200331-3rd-Production-pt9 000534 EP,c-,,,,_,,,_,,,,,, May 2019 "Core tech" vs. "tech enabled" businesses • Being regarded as a core-tech business is glamorous -- everyone wants to believe and talk about their technological capabilities as a moat. But there are few industries where that 's actually the case. o e.g. mass deployment of machine vision for medical diagnosis is not blocked by the tech. o There are relatively few "core tech businesses" that compete in markets where cutting edge technology is the primary axis of competition and barrier to entry (e.g. Intel, Nvidia, Waymo, ) • It is more useful to understand most of these companies as "tech-enabled businesses". o e.g. Facebook, Uber, Linkedin, and Airbnb derive their power from network effects. Amazon's e-commerce platform derives its power from heavy capex. epic.org EPIC-19-09-11-NSCAI-FOIA-20200331-3rd-Production-pt9 000535 EPIC-2019-001-000603 epic.org EPIC-19-09-11-NSCAI-FOIA-20200331-3rd-Production-pt9 000536 WeChat(6 ) 0 BAT (Baidu, Alibaba, Tencent) - The Big 3 Q. Search .. J.:11llll~ fJ ~ You have added ;}:JJ as your WeChat c. • Tencent ($504B Valuation): Social and gaming. Best known for Subscriptions , • -. El 1559]jl1,i,r fft.!!ff\25!11:itfi,P~OIJM8 creating WeChat. Also the largest gaming company in the world. lilLR:::lo &f GGV 996 .,,"'ttH_o+ ,, DJ> (34 mossagos] lW' Cloo nera ara 60% of all mobile time in China is spent on Tencent properties.
    [Show full text]
  • Rise and Fall: Audience Participation and Market Economy in Traditional
    Title Page Rise and Fall: Audience Participation and Market Economy in Traditional Chinese Opera by Yuanziyi Zhang Bachelor of Science, Illinois Wesleyan University, 2018 Submitted to the Graduate Faculty of the Dietrich School of Arts and Sciences in partial fulfillment of the requirements for the degree of Master of Arts University of Pittsburgh 2021 Committee Membership Page UNIVERSITY OF PITTSBURGH DIETRICH SCHOOL OF ARTS AND SCIENCES This thesis was presented by Yuanziyi Zhang It was defended on April 8, 2021 and approved by Dr. Kun Qian, Associate Professor, Department of East Asian Languages and Literatures Dr. Charles Exley, Associate Professor, Department of East Asian Languages and Literatures Dr. Song Shi, Visiting Assistant Professor, School of Computing and Information Thesis Advisor: Dr. Kun Qian, Associate Professor, Department of East Asian Languages and Literatures ii Copyright © by Yuanziyi Zhang 2021 iii Abstract Rise and Fall: Audience Participation and Market Economy in Traditional Chinese Opera Yuanziyi Zhang, MA University of Pittsburgh, 2021 Throughout Chinese history, traditional opera has encountered peaks and valleys. Although the status of the art and performers has been long controversial, the most significant factor that determines the destiny of traditional Chinese opera is actually the vitality of its market. My thesis seeks to tackle the role of market and audience participation in different stages of the development of Peking Opera, delineating the tripartite relationship between political interference, audience participation, and technological development. Within Henry Jenkins’ theoretical framework of media convergence, I investigate collective intelligence, participatory culture, and contemporary fan culture vis-à-vis digital media that has been contextualized in traditional Chinese opera and its economic impact.
    [Show full text]
  • Comparing Chinese Mobile Applications' Data and User Privacy
    INTERNET POLICY REVIEW Journal on internet regulation Volume 9 | Issue 3 Going global: Comparing Chinese mobile applications’ data and user privacy governance at home and abroad Lianrui Jia University of Toronto, Canada, [email protected] Lotus Ruan Citizen Lab, University of Toronto, Canada, [email protected] Published on 16 Sep 2020 | DOI: 10.14763/2020.3.1502 Abstract: We examine and compare data and privacy governance by four China-based mobile applications and their international versions: Baidu, Toutiao and its international version TopBuzz, Douyin and its international version TikTok, and WeChat. Together, these four applications represent popular Chinese apps branching into diverse overseas markets such as Europe, Brazil, North America, and Southeast Asia. We first present an overview of the ownership, functions, business models and strategies of the reviewed apps. To study the app’s interface design, we employ the walkthrough method to examine privacy features during the account registration and deletion stages in app usage. Lastly, we conducted content analysis of the terms of service and privacy policies to establish the app’s data collection, storage, transfer, use, and disclosure measures. Our analysis showed variations across apps and within the Chinese and international-facing versions in their data and privacy governance in app design and policies. Baidu has the most unsatisfactory data and privacy protection measures, while ByteDance’s TikTok/Douyin and TopBuzz/Toutiao offer more comprehensive user protection from different jurisdictions. Moreover, this paper highlights the role of platform owners (e.g., Google and Apple) in gatekeeping mobile app privacy standards and the role of the state in imposing a data protection framework on overseas versions of China-based mobile apps.
    [Show full text]
  • Artificial Intelligence Factory, Data Risk, and Vcs' Mediation: the Case of Bytedance, an AI-Powered Startup
    Journal of Risk and Financial Management Article Artificial Intelligence Factory, Data Risk, and VCs’ Mediation: The Case of ByteDance, an AI-Powered Startup Peiyi Jia 1 and Ciprian Stan 2,* 1 Robert J. Manning School of Business, University of Massachusetts Lowell, One University Avenue, Pulichino-Tong Business Center Suite 228, Lowell, MA 01854, USA; [email protected] 2 Rinker School of Business, Palm Beach Atlantic University, 901 S Flagler Drive, West Palm Beach, FL 33401, USA * Correspondence: [email protected] Abstract: The AI factory is an effective way of managing artificial intelligence (AI) processes, enabling broad AI deployment in a firm. The purpose of this study is to explore the role of the AI factory in an entrepreneurship context. How do AI-powered startups leverage AI to grow, and manage data risks? What is the role of venture capitalists in this process? We answer these research questions by conducting an in-depth study of an AI-powered startup: ByteDance. Our study extends both AI and entrepreneurship literature by showing that AI-powered startups adopt the AI factory approach to optimize scale, scope, and learning. Our discussion also emphasizes the critical role played by venture capitalists in assisting AI-powered startups in building AI factories and in reducing data risk. Keywords: artificial intelligence; scaling growth; AI factory; AI-powered startup; data risk Citation: Jia, Peiyi, and Ciprian Stan. 2021. Artificial Intelligence Factory, 1. Introduction Data Risk, and VCs’ Mediation: The Artificial intelligence can process, analyze, and interpret data much faster and at Case of ByteDance, an AI-Powered a scale unachievable by human capabilities, potentially leading to increased prosperity, Startup.
    [Show full text]
  • China's New Media Dilemma
    China’s New Media Dilemma: The Profit in Online Dissent LOUISA CHIANG June 2019 China’s New Media Dilemma: The Profit in Online Dissent JUNE 2019 ABOUT CIMA Contents The Center for International Media Assistance (CIMA), at the National Introduction. .1 Endowment for Democracy, works to strengthen the support, raise the How the State Tamed Legacy Media . 4 visibility, and improve the effectiveness of The Disruptive Power of China’s New Media. 6 independent media development throughout the world. The center provides information, The Business of Liberal Opinion in China. 9 builds networks, conducts research, and highlights the indispensable role The Political Economy of China’s New Media. 14 independent media play in the creation and Conclusion. 19 development of sustainable democracies. An important aspect of CIMA’s work is Endnotes . .20 to research ways to attract additional US private sector interest in and support for international media development. CIMA convenes working groups, discussions, and panels on a variety of topics in the field of media development and assistance. ABOUT THE AUTHOR The center also issues reports and Louisa Chiang, an independent researcher recommendations based on working group who has been published in The New York discussions and other investigations. These reports aim to provide policymakers, Review of Books, was previously a senior as well as donors and practitioners, with East Asia officer at the National Endowment ideas for bolstering the effectiveness of for Democracy and a US Commerce attaché media assistance. in Beijing. Center for International Media Assistance ACKNOWLEDGMENTS National Endowment for Democracy I am grateful to Nicholas Benequista, Mark Nelson, Shanthi 1025 F STREET, N.W., 8TH FLOOR Kalathil, Akram Keram, Lynn Lee, and Paul Rothman for their WASHINGTON, DC 20004 invaluable suggestions and input.
    [Show full text]