Reflections on Artificial Intelligence, Speech Technology

Total Page:16

File Type:pdf, Size:1020Kb

Reflections on Artificial Intelligence, Speech Technology Proceedings from FONETIK 2019 Stockholm, June 10–12, 2019 Reflections on artificial intelligence, speech technology, brain imaging and phonetics Björn Lindblom Phonetics laboratory, Stockholm University, Sweden [email protected] Whither speech research? his blackboard: ‘What I cannot create I Broadly defined, the field of spoken lan- do not understand!’ (Gleick 1993). Let guage research is undergoing rapid us interpret this quote as a requirement change as a result of several factors. to be met by all good science. Let us call Computer technology can be used to it the Feynman criterion. If we apply it simulate interactive human speech com- to speech research, what do we find? munication in spectacularly diverse and First let us consider some recent realistic ways. achievements in speech technology. Since the 90’s sophisticated meth- On an almost daily basis, we hear ods have become increasingly available about new, often spectacular, artificial for imaging the real-time processes that intelligence (AI) developments. For in- take place in the brain during speaking, stance, some car models now have auto- listening and learning to speak. pilots. Medical diagnoses and treatments It would seem that the present situa- can be drastically improved thanks to tion offers speech research yet another vast data sets and clever methods that opportunity to expand its experimental detect patterns in the data that escape toolbox and orient itself towards neuro- even the most experienced expert. Fur- science with a view to acquiring a more thermore, law breakers get more easily profound understanding of the many caught with the aid of clever facial forms of language use under normal as recognition algorithms. well as disordered conditions. Already And then there is AlphaGo - a pro- many investigators have done so and gram that has beaten the world’s top hu- have contributed significant new man players of Go - a much more com- knowledge (Guenther 2016). plex game than chess (Mozur 2017). Its Our field presents a diverse mosaic training consisted in learning not only of research interests and special exper- from records of human games but from tise. Has time come for unification? discovering, on its own, winning and How could such interdisciplinary inte- losing strategies by playing innumerable gration be brought about? times against itself. The subject matter of a discipline The list of potential benefits of AI – tends to defined by the questions it asks, and disadvantages - is long (Foer 2018, but there are of course other factors such Friedman 2019). as tradition, ineffective administration Part of the story is AI-based speech and obsolete educational programs that technology. create obstacles. Recently we learned that attempts to Those are some of the questions and improve communication for paralyzed issues that need to be urgently addressed patients are under way and show consid- and dealt with today. erable promise. One such project (Akbar et al 2019) used a Brain Computer Inter- Speech technology and AI face technique to reconstruct input There is an anecdote about Richard speech signals from a population of Feynman, the eminent physicist, who evoked neural activity in the human au- had the following sentence written on ditory cortex. 139 Proceedings from FONETIK 2019 Stockholm, June 10–12, 2019 The investigators used a deep neural deep neural networks learn to a great ex- network to make direct estimates of tent on their own by accumulating and speech synthesizer parameters. This comparing a massive number of exam- model achieved high subjective and ob- ples of successes and failures –in part jective intelligibility scores on a digit found in records of human data, in part recognition allowing the authors to con- from information generated by the algo- clude that reconstructing speech from rithm itself. the human auditory cortex offers a Such results take steps towards re- speech neuroprosthetic for direct com- moving a major hurdle in the modeling munication with a paralyzed patient’s of human cognition. This hurdle is cre- brain. The technique is said to work un- ated by the fact that humans ‘know more der both overt and covert conditions. than they can tell’ - Polanyi’s Paradox. If this becomes a characteristic also of The IBM debater project (2019) computers, it would imply a major This project has produced a computer breakthrough for AI. Machines would program that recently participated in a then have become even more human- debate on ‘whether preschools should be like. Are we already there? subsidized’. Its performance was based There is an extremely serious down- on fifteen minutes of preparation, devel- side to these developments. ‘Not know- oping an argument in favor of subsidies, ing exactly what deep learning algo- producing a text and then giving an oral rithms do’ is problematic. It feeds right presentation of it. A large human audi- into people’s worst fears about future ence attended the event and was found to computers evolving to surpass and dom- prefer the human participant, but the inate their human masters. simulated debater came close to win- The ‘secrecy’ of the modeling ning. should also disqualify them as scientific Listening to this AI debater one is tools. They are unquestionably great im- struck by its ability to ‘reason’ and pro- itators. But imitation is hardly enough. If duce fluent and natural sounding speech these frameworks do not allow us to (see references). The big news is not that study the observed behavior, what use the human debater had won the debate. would they be to linguists, phoneticians, It is that the IBM debater had given its speech pathologists and communication human competitor a good run for his engineers? money – despite the complexity of the Instead of helping us understand our task. I cannot blame those who respond own brains, AI projects would create an by thinking that these developments are additional problem perhaps even more nothing short of extraordinary. I agree. forbidding: Understanding artificial But we have to ask: Where do we brains. place them within general science? What are the implications for research on Explanation: Holy Grail of research speech and language learning? Exactly Reasoning from first principles goes how does the debater do it? It meets the back to Aristotle. We can illustrate this Feynman criterion, but is imitation time-honored method by taking a look at enough? the acoustic theory of speech production AI people admit that they do not (Fant 1960, Stevens 1998). know exactly how these systems work At first one might assume that the (McAfee & Brynjolfsson 2016). But term ‘formant’ refers to an empirical no- they know enough to attribute the suc- tion derived from observations and anal- cess of AlphaGo (and other projects) to yses of large corpora of recorded speech, the use of deep learning networks. These but it is not. Anyone who has struggled models make it possible to abandon with making formant frequency meas- what has been used so far, viz. explicitly urements from short-term spectra and specified, computational rules. Instead, spectrograms knows that voice quality, 140 Proceedings from FONETIK 2019 Stockholm, June 10–12, 2019 nasalization and a high fundamental fre- human speech. Rather they hide what quency often make that exercise diffi- they do. cult. Those and other factors also cause Consequently they leave a big piece errors for automatic methods such as of the scientific task undone and might LPC whose results must be checked force upon us the epiphenomenal task of manually for accuracy. It is true that in- understanding - not only our own brains verse filtering can provide solutions to - but also artificial brains and their such problems – provided that you know deeply embedded organization. what the shape of the residue glottal pulse is supposed to look like. Neurobiology of speech We love formants, but they can be Students of language and speech are not very elusive! the only ones trying to get used to new The ‘formant’ is an independently technology. Neuroscience - now attract- motivated concept deduced from phys- ing our attention - is itself in fact a cur- ics. It is not a product of a data-driven rent instructive example: The Human approach. Rather the acoustic theory of Brain Project (HBP) is a gigantic EU- speech rests on the following first prin- sponsored effort in which some 500 sci- ciples (Beranek 1954): Newton’s second entists from over 100 universities partic- law, Boyle’s law and the Conservation of ipate (2012). mass. They provide the foundation for To motivate such a large project, the the wave equation that can be solved for applicants argue that time has come to arbitrary vocal tract configurations bring medicine, neuroscience and com- (Flanagan 1965). When the vocal tract puting together in response to the great- shape is known its resonances (for- est challenge of the 21st century: the un- mants) can be identified and their fre- derstanding of the human brain. Their quencies can be calculated. application underscores that, despite sig- When someone like me (with a lib- nificant progress neuroscience in recent eral arts education) invokes first princi- years, the field still suffers from frag- ples, it is not an expression of ‘physics mentation. envy’, nor a wish to reduce everything to To open the door to new treatments physics. of brain diseases – more than 500 have [I would perhaps be willing to plead been identified so far – the application guilty of ‘science envy’ because science states that it will be necessary to show is undoubtedly a good thing to be striv- how the ‘parts fit together in a single ing towards]. multi-level system’. A two-way process There are no absolute explanations, is envisioned. New computing technolo- only deeper and deeper accounts. Indi- gies will be discovered as more is vidual fields choose their explanans learned about the brain, and conversely, principles at different levels.
Recommended publications
  • A Comprehensive Workplace Environment Based on a Deep
    International Journal on Advances in Software, vol 11 no 3 & 4, year 2018, http://www.iariajournals.org/software/ 358 A Comprehensive Workplace Environment based on a Deep Learning Architecture for Cognitive Systems Using a multi-tier layout comprising various cognitive channels, reflexes and reasoning Thorsten Gressling Veronika Thurner Munich University of Applied Sciences ARS Computer und Consulting GmbH Department of Computer Science and Mathematics Munich, Germany Munich, Germany e-mail: [email protected] e-mail: [email protected] Abstract—Many technical work places, such as laboratories or All these settings share a number of commonalities. For test beds, are the setting for well-defined processes requiring both one thing, within each of these working settings a human being high precision and extensive documentation, to ensure accuracy interacts extensively with technical devices, such as measuring and support accountability that often is required by law, science, instruments or sensors. For another thing, processing follows or both. In this type of scenario, it is desirable to delegate certain a well-defined routine, or even precisely specified interaction routine tasks, such as documentation or preparatory next steps, to protocols. Finally, to ensure that results are reproducible, some sort of automated assistant, in order to increase precision and reduce the required amount of manual labor in one fell the different steps and achieved results usually have to be swoop. At the same time, this automated assistant should be able documented extensively and in a precise way. to interact adequately with the human worker, to ensure that Especially in scenarios that execute a well-defined series of the human worker receives exactly the kind of support that is actions, it is desirable to delegate certain routine tasks, such required in a certain context.
    [Show full text]
  • Build Your Trust Advantage, Leadership in the Era of Data
    Global C-suite Study 20th Edition Build Your Trust Advantage Leadership in the era of data and AI everywhere This report is IBM’s fourth Global C-suite Study and the 20th Edition in the ongoing IBM CxO Study series developed by the IBM Institute for Business Value (IBV). We have now collected data and insights from more than 50,000 interviews dating back to 2003. This report was authored in collaboration with leading academics, futurists, and technology visionaries. In this report, we present our key findings of CxO insights, experiences, and sentiments based on analysis as described in the research methodology on page 44. Build Your Trust Advantage | 1 Build Your Trust Advantage Leadership in the era of data and AI everywhere Global C-suite Study 20th Edition Our latest study draws on input from 13,484 respondents across 6 C-suite roles, 20 industries, and 98 countries. 2,131 2,105 2,118 2,924 2,107 2,099 Chief Chief Chief Chief Chief Chief Executive Financial Human Information Marketing Operations Officers Officers Resources Officers Officers Officers Officers 3,363 Europe 1,910 Greater China 3,755 North America 858 Japan 915 Middle East and Africa 1,750 Asia Pacific 933 Latin America 2 | Global C-suite Study Table of contents Executive summary 3 Introduction 4 Chapter 1 Customers: How to win in the trust economy 8 Action guide 19 Chapter 2 Enterprises: How to build the human-tech partnership 20 Action guide 31 Chapter 3 Ecosystems: How to share data in the platform era 32 Action guide 41 Conclusion: Return on trust 42 Acknowledgments 43 Related IBV studies 43 Research methodology 44 Notes and sources 45 Build Your Trust Advantage | 3 Executive summary More than 13,000 C-suite executives worldwide their data scientists, to uncover insights from data.
    [Show full text]
  • A.I. Technologies Applied to Naval CAD/CAM/CAE
    A.I. Technologies Applied to Naval CAD/CAM/CAE Jesus A. Muñoz Herrero, SENER, Madrid/Spain, [email protected] Rodrigo Perez Fernandez, SENER, Madrid/Spain, [email protected] Abstract Artificial Intelligence is one of the most enabling technologies of digital transformation in the industry, but it is also one of the technologies that most rapidly spreads in our daily activity. Increasingly, elements and devices that integrate artificial intelligence features, appear in our everyday lives. These characteristics are different, depending on the devices that integrate them, or the aim they pursue. The methods and processes that are carried out in Marine Engineering cannot be left out of this technology, but the peculiarities of the profession and the people that take part in it must be taken into account. There are many aspects in which artificial intelligence can be applied in the field of our profession. The management and access to all the information necessary for the correct and efficient execution of a naval project is one of the aspects where this technology can have a very positive impact. Access all the rules, rules, design guides, good practices, lessons learned, etc., in a fast and intelligent way, understanding the natural language of the people, identifying the most appropriate to the process that is being carried out and above all. Learning as we go through the design, is one of the characteristics that will increase the application of this technology in the professional field. This article will describe the evolution of this technology and the current situation of it in the different areas of application.
    [Show full text]
  • Ilearn Goals
    Combining AI and collective intelligence to create a GPS for knowledge CRI Paris We experiment at frontiers of learning, life and digital From babies to lifelong learning LMD students Learning by doing Interdisciplinarity Sustainable Development goals A changing job market Too much! Fake news Recruting for skills on the digital job market Can we reinvent learning with artificial intelligence? What can you (almost) do with AI today? Deep Learning Classification Generation Who are these persons? www.thispersondoesnotexist.com Based on GAN (Generative Adversarial Networks) A.I. image generation From text to image From text to image Text generation Automatic Q&A generation from Wikipedia Debating with A.I. February 12th, 2019 IBM Project Debater vs. World Debating Champion Motion: “We should subsidise preschool” ● 15 mins to prepare arguments ● 4-minute opening statement ● 4-minute rebuttal ● 2-minute summary Predictive tools Pneumonia detection from thorax radiographies (2017) Deepfake A.I. lip reading Conversational interfaces May 2018 Google Duplex iLearn goals Map all learning resources available on the Internet Build learning profiles of our users Provide them maps of their own learnings Match learners with adapted resources Match learners with mentors or co-learners iLearn Knowledge maps + = + Collective Artificial intelligence intelligence Learning groups iLearn iLearn Tags an online Users improve document as a qualification Concepts extraction useful learning (concepts and resource difficulty) Artificial Collective Intelligence Intelligence
    [Show full text]
  • Comments from More Cowbell Unlimited
    June 10, 2019 Elham Tabassi National Institute of Standards and Technology 100 Bureau Drive, Stop 200 Gaithersburg, MD 20899 TM Enclosed: Technical Paper: FOCAL Information Warfare Defense Standard ​ (v1.0 minus Appendix) ​ Dear Ms. Tabassi, Thank you for the opportunity to submit comments in response to the National Institute of Standards and Technology’s (NIST) request for information on artificial intelligence (AI) standards. We assert that NIST should work collaboratively with Federal agencies and the private sector to develop a cross sector Information Warfare (IW) Defense Standard. The enclosed technical paper supports our assertion and TM​ describes our FOCAL IW Defense Standard ,​ which is available for anyone to use. ​ More Cowbell Unlimited, Inc. is a process mining and data science firm based in Portland OR. We are developing process technologies in support of national security and industry. Our mission is to help America remain a beacon of hope and strength on the world stage. Technological advancements are a double-edged sword. AI is a tool which promises great things for humanity, such reducing poverty and allowing creativity to flourish; however, there is a dark side which we believe must be the focal point of national security. Unsurprisingly, hunger for dominance and money are present in this discussion, too. Feeding large hordes of private information into an AI to create a “World Brain” is plausible and provides a vehicle to project power in various ways. One way to monetize and project power from this information is through advertisements. Another way--perhaps one we are already seeing-- is through IW. As the world becomes more reliant upon information, IW boosted with weaponized AI is a major threat.
    [Show full text]
  • Leading Edge Technologies from the Slide Rule to the Cloud
    Liquidtool Systems AG Leading Edge Technologies From the Slide Rule to the Cloud Rudolf Meyer 2021-03-24 Table of Contents 1 Introduction .............................................................................................................................. 2 2.1 Industry 0.0 – Industrial prerequisites ..................................................................... 3 2.2 Industry 1.0 – Industrial production .......................................................................... 3 2.3 Industry 2.0 – Industrial mass-production ............................................................. 4 2.4 Industry 3.0 – Industrial automation ........................................................................ 4 2.5 Industry 3.5 – Industrial globalization ...................................................................... 5 3 A closer look – Where do we stand now? ..................................................................... 6 3.1 Industry 4.0 – Industrial digitization ......................................................................... 7 3.2 Leading edge technologies for Industry 4.0 ....................................................... 10 4 Looking ahead – Industry 4.0, 5.0, 6.0, 7.0… ................................................................ 15 Impact on the labor market .............................................................................................. 15 Impending developments .................................................................................................16 4.1 Industry 5.0
    [Show full text]
  • Creating the Engine for Scientific Discovery
    www.nature.com/npjsba PERSPECTIVE OPEN Nobel Turing Challenge: creating the engine for scientific discovery ✉ Hiroaki Kitano 1 Scientific discovery has long been one of the central driving forces in our civilization. It uncovered the principles of the world we live in, and enabled us to invent new technologies reshaping our society, cure diseases, explore unknown new frontiers, and hopefully lead us to build a sustainable society. Accelerating the speed of scientific discovery is therefore one of the most important endeavors. This requires an in-depth understanding of not only the subject areas but also the nature of scientific discoveries themselves. In other words, the “science of science” needs to be established, and has to be implemented using artificial intelligence (AI) systems to be practically executable. At the same time, what may be implemented by “AI Scientists” may not resemble the scientific process conducted by human scientist. It may be an alternative form of science that will break the limitation of current scientific practice largely hampered by human cognitive limitation and sociological constraints. It could give rise to a human-AI hybrid form of science that shall bring systems biology and other sciences into the next stage. The Nobel Turing Challenge aims to develop a highly autonomous AI system that can perform top-level science, indistinguishable from the quality of that performed by the best human scientists, where some of the discoveries may be worthy of Nobel Prize level recognition and beyond. npj Systems Biology and Applications (2021) 7:29 ; https://doi.org/10.1038/s41540-021-00189-3 1234567890():,; NOBEL TURING CHALLENGE AS AN ULTIMATE GRAND EURISKO6,8.
    [Show full text]
  • Mediation and Artificial Intelligence: Notes on the Future of International Conflict Resolution Diplofoundation IMPRESSUM
    Mediation and artificial intelligence: Notes on the future of international conflict resolution DiploFoundation IMPRESSUM Layout and design: Viktor Mijatović, Aleksandar Nedeljkov Copy editing: Mary Murphy Author and researcher: Katharina E. Höne Except where otherwise noted, this work is licensed under http://creativecommons.org/licences/by-nc-nd/3.0/ 2 DiploFoundation, 7bis Avenue de la Paix, 1201 Geneva, Switzerland Publication date: November 2019 For further information, contact DiploFoundation at [email protected] For more information on Diplo’s AI Lab, go to https://www.diplomacy.edu/AI To download the electronic copy of this report, click on https://www.diplomacy.edu/sites/default/files/Mediation_and_AI.pdf 3 Table of Contents Acknowledgements ............................................................................................................................................................... 5 Executive summary ............................................................................................................................................................... 6 Introduction ............................................................................................................................................................................. 7 Mediation and AI: The broad picture ................................................................................................................................... 8 Understanding AI and related concepts ......................................................................................................................
    [Show full text]
  • Data for Peacebuilding and Prevention Ecosystem Mapping
    October 2020 DATA FOR PEACEBUILDING AND PREVENTION Ecosystem Mapping The State of Play and the Path to Creating a Com- munity of Practice Branka Panic ABOUT THE CENTER ON INTERNATIONAL COOPERATION The Center on International Cooperation (CIC) is a non-profit research center housed at New York University. Our vision is to advance effective multilateral action to prevent crises and build peace, justice, and inclusion. Our mission is to strengthen cooperative approaches among national governments, international organizations, and the wider policy community to advance peace, justice, and inclusion. Acknowledgments The NYU Center on International Cooperation is grateful for the support of the Netherlands Ministry of Foreign Affairs for this initiative. We also offer a special thanks to interviewees for their time and support at various stages of this research, and to an Expert Group convened to discuss the report findings: Bas Bijlsma, Christina Goodness, Gregor Reisch, Jacob Lefton, Lisa Schirch, Merve Hickok, Monica Nthiga, and Thierry van der Horst. CITATION: This report should be cited as: Branka Panic, Data for Peacebuilding and Prevention Ecosystem Mapping: The State of Play and the Path to Creating a Community of Practice (New York: NYU Center on International Cooperation, 2020). Contents Summary . 1 Terminology........................................................................ 4 Introduction: Objectives and Methodology ............................................. 5 Objectives.....................................................................
    [Show full text]
  • Towards Common-Sense Reasoning with Advanced NLP Architectures
    Towards Common-Sense Reasoning with Advanced NLP architectures Alex Movilă - November 2019 NLP in 2019 News – Scary advances - GPT-2 model makes press headlines February 2019: Researchers, scared by their own work, hold back “deep-fakes for text” AI OpenAI Trains Language Model, Mass Hysteria Ensues This New Storytelling AI Fools Humans 3 out of 5 Times With It’s Writing Artificial Intelligence Can Now Write Amazing Content - What Does That Mean For Humans? The first AI-generated textbook: "Lithium-Ion Batteries: A Machine-Generated Summary of Current Research" Oct 2019: Elon Musk-backed AI company releases its text-generating robot despite concerns it could be used to create fake news and spam GPT-2 generates convincing fake news Humans find GPT-2 outputs convincing. Our partners at Cornell University surveyed people to assign GPT-2 text a credibility score across model sizes. People gave the 1.5B model a “credibility score” of 6.91 out of 10. … extremist groups can use GPT-2 for misuse, specifically by fine-tuning GPT-2 models on four ideological positions: white supremacy, Marxism, jihadist Islamism, and anarchism. CTEC demonstrated that it’s possible to create models that can generate synthetic propaganda for these ideologies. GPT-2: 1.5B Release Google improved 10 percent of searches by understanding language context (with BERT) “how to catch a cow fishing?” Even though I had purposely used the word “fishing” to provide context, Google ignored that context and provided results related to cows. That was on October 1, 2019. Today, October 25, 2019 the same query results in search results that are full of striped bass and fishing related results.
    [Show full text]
  • AI) for Your Business
    5 Things You Should Know About Artificial Intelligence (AI) For Your Business The other day, as I drove into work I saw a mobile SaaS van. It wasn’t a technician driving around, in fact, it was not a mobile IT services company at all. It was all about Smoothies. There is a technological shift where the world is headed to XaaS (Replace the X with whatever you provide). Companies are headed to XaaS fast and they’re becoming extremely personalized. Some other examples are CaaS (Cookies as a Service) and Blue Apron (Meals as a Service). What about Stitch Fix (Clothing as a Service)? Stitch Fix is a REALLY interesting case study for those of us in technology. With a strong degree of accuracy, they can predict what clothing you’ll like, send you a collection of curated clothing items on a monthly basis, and have VERY FEW items returned. And, it’s all through a subscription model. Everybody talks about the Netflix AI that predicts the shows you should watch. And who can escape Siri & Alexa? It’s all based on data, predicting from that data, and learning from your successes and failures, hence AI. Artificial intelligence, though it’s advancing very quickly, is still in relatively early days. As you try to understand how your company can play a part in this AI evolution, here are some initial things to consider. DEFINE AI 1 AI means different things to different people. Customer support bots are being called AI but so is IBM’s Watson. Two very different things.
    [Show full text]
  • Francesca Bonin
    Curriculum vitae: Francesca Bonin Contacts [email protected] [email protected] http://researcher.watson.ibm.com/researcher/view.php?person=ie-FBONIN Highlights Research Scientist in Artificial Intelligence and NLP, orientated on Natural Language data. Currently leading a group of 5 researchers within IBM Research Europe in finding synergies between research goals and business goals for internal and external stakeholders. Helping in the definition of the internal team strategy. Education September 2011 – June 2016 PhD in Natural Language Processing School of Computer Science and Statistics, Trinity College Dublin (Ireland). Advisors: Dr. Carl Vogel and Prof. Nick Campbell. Focus: NLP, Dialogue Processing, Conversational Analysis, Multimodal Information Extraction, Social Signal Processing, Feature Engineering, Machine Learning. Publications: [1,2,3,4,5,6,9,10,12,13,14,16,17,18,19,20,21,22] September 2005- December 2007 M.Sc. in Computer Science for the Humanities. Major in Language Technologies University of Pisa (Italy). Thesis: Natural Language and Ontologies. Text Simplification for human machine interaction Grade: 110/110 with honors. Advisors: Prof. Alessandro Lenci, Raffaella Bernardi Focus: Natural Language Processing, Semantic Web, and Ontologies. Publications: [29] September 2001-June 2005 B.A. in Computer Science for the Humanities. Major in Linguistics University of Pisa (Italy). Thesis: A computational grammar for Italian functional analysis Grade: 110/110 with honors. Advisor: Prof. Alessandro Lenci. Focus: Natural Language Processing, Syntactic Parsing. Work experiences April 2018 – current IBM - Principal Investigator - Research Scientist IBM Ireland, Dublin, Ireland March 2021 Francesca Bonin Leading a group of 5 researchers towards project planning, stakeholder satisfaction and business definition.
    [Show full text]