ICML 2020 Workshop Book

Total Pages: 16

File Type: PDF, Size: 1020 KB

Schedule Highlights

Generated Mon Dec 21, 2020. Workshop organizers make last-minute changes to their schedules. Download this document again to get the latest changes, or use the ICML mobile application.

July 13, 2020

WiML D&I Chairs Remarks: Sinead Williamson and Rachel Thomas

July 17, 2020

Graph Representation Learning and Beyond (GRL+) - Veličković, Bronstein, Deac, Hamilton, Hamrick, Hashemi, Jegelka, Leskovec, Liao, Monti, Sun, Swersky, Ying, Žitnik
XXAI: Extending Explainable AI Beyond Deep Models and Classifiers - Samek, Holzinger, Fong, Moon, Mueller
Self-supervision in Audio and Speech - Ravanelli, Serdyuk, Hjelm, Ramabhadran, Parcollet
5th ICML Workshop on Human Interpretability in Machine Learning (WHI) - Weller, Xiang, Dhurandhar, Kim, Wei, Varshney, Bhatt
Law & Machine Learning - Castets-Renard, Cussat-Blanc, Risser
Learning with Missing Values - Josse, Frellsen, Mattei, Varoquaux
Healthcare Systems, Population Health, and the Role of Health-tech - Heaukulani, Palla, Heller, Prasad, Ghassemi
Challenges in Deploying and Monitoring Machine Learning Systems - Tosi, Korda, Lawrence
Workshop on AI for Autonomous Driving (AIAD) - Chao, McAllister, Gaidon, Li, Kreiss
MLRetrospectives: A Venue for Self-Reflection in ML Research - Forde, Dodge, Jaiswal, Lowe, Liu, Pineau, Bengio
Participatory Approaches to Machine Learning - Zhou, Madras, Raji, Milli, Kulynych, Zemel
Workshop on Continual Learning - Fayek, Chaudhry, Lopez-Paz, Belilovsky, Schwarz, Pickett, Aljundi, Ebrahimi, Dokania
Workshop on eXtreme Classification: Theory and Applications - Choromanska, Langford, Majzoubi, Prabhu
ICML 2020 Workshop on Computational Biology - Aghamirzaie, Anderson, Azizi, Diallo, Burdziak, Gallaher, Kundaje, Pe'er, Prabhakaran, Remita, Robertson-Tessi, Tansey, Vogt, Xie
Object-Oriented Learning: Perception, Representation, and Reasoning - Ahn, Kosiorek, Hamrick, van Steenkiste, Bengio
Theoretical Foundations of Reinforcement Learning - Brunskill, Lykouris, Simchowitz, Sun, Wang
ML Interpretability for Scientific Discovery - Venugopalan, Brenner, Linderman, Kim
Uncertainty and Robustness in Deep Learning Workshop (UDL) - Li, Lakshminarayanan, Hendrycks, Dietterich, Snoek
Beyond first order methods in machine learning systems - Berahas, Gholaminejad, Kyrillidis, Mahoney, Roosta

July 18, 2020

4th Lifelong Learning Workshop - Sodhani, Chandar, Ravindran, Precup
INNF+: Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models - Huang, Krueger, Van den Berg, Papamakarios, Cremer, Chen, J. Rezende
Inductive Biases, Invariances and Generalization in Reinforcement Learning - Goyal, Ke, Wang, Weber, Viola, Schölkopf, Bauer
Negative Dependence and Submodularity: Theory and Applications in Machine Learning - Mariet, Derezinski, Gartrell
Federated Learning for User Privacy and Data Confidentiality - Baracaldo, Choudhury, Joshi, Raskar, Wang, Yu
Machine Learning for Global Health - Belgrave, Hyland, Onu, Furnham, Mwebaze, Lawrence
Bridge Between Perception and Reasoning: Graph Neural Networks & Beyond - Tang, Song, Leskovec, Liao, Li, Fidler, Zemel, Salakhutdinov
Economics of privacy and data labor - Vasiloglou, Cummings, Weyl, Koutris, Young, Jia, Dao, Waggoner
7th ICML Workshop on Automated Machine Learning (AutoML 2020) - Hutter, Vanschoren, Lindauer, Weill, Eggensperger, Feurer
Real World Experiment Design and Active Learning - Bogunovic, Neiswanger, Yue
1st Workshop on Language in Reinforcement Learning (LaReL) - Nardelli, Luketina, Foerster, Zhong, Andreas, Grefenstette, Rocktäschel
Workshop on Learning in Artificial Open Worlds - Szlam, Hofmann, Salakhutdinov, Kuno, Guss, Srinet, Houghton
Incentives in Machine Learning - Faltings, Liu, Parkes, Radanovic, Song
Machine Learning for Media Discovery - Schmidt, Nieto, Gouyon, Raimond, Kinnaird, Lanckriet
2nd ICML Workshop on Human in the Loop Learning (HILL) - Zhang, Wang, Yu, Wu, Darrell

July 17, 2020: Graph Representation Learning and Beyond (GRL+)

Organizers: Petar Veličković, Michael M. Bronstein, Andreea Deac, Will Hamilton, Jessica Hamrick, Milad Hashemi, Stefanie Jegelka, Jure Leskovec, Renjie Liao, Federico Monti, Yizhou Sun, Kevin Swersky, Zhitao Ying, Marinka Žitnik

Thu Jul 16, 23:40 PM

Recent years have seen a surge in research on graph representation learning, including techniques for deep graph embeddings, generalizations of CNNs to graph-structured data, and neural message-passing approaches. These advances in graph neural networks and related techniques have led to new state-of-the-art results in numerous domains: chemical synthesis, 3D vision, recommender systems, question answering, continuous control, self-driving and social network analysis. Building on the successes of three related workshops from last year (at ICML, ICLR and NeurIPS), the primary goal for this workshop is to facilitate community building and to support the expansion of graph representation learning into more interdisciplinary projects with the natural and social sciences. With hundreds of new researchers beginning projects in this area, we hope to bring them together to consolidate this fast-growing area into a healthy and vibrant subfield. In particular, we aim to strongly promote novel and exciting applications of graph representation learning across the sciences, reflected in our choices of invited speakers.

Schedule

11:40 PM  Opening Remarks (Veličković, Deac)
12:00 AM  Invited Talk: Xavier Bresson (Bresson)
12:30 AM  Invited Talk: Thomas Kipf (Kipf)
01:00 AM  Q&A / Discussions / Coffee 1
01:30 AM  Virtual Poster Session #1
02:30 AM  Novel Applications: Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks (Mernyei)
02:40 AM  Novel Applications: Graph Neural Networks for Massive MIMO Detection (Scotti)
02:50 AM  Novel Applications: Embedding a random graph via GNN: Extended mean-field inference theory and RL applications to NP-Hard multi-robot/machine scheduling (Kang)
03:00 AM  Update: PyTorch Geometric (Fey)
03:10 AM  Update: Deep Graph Library (Karypis)
03:20 AM  Update: Open Graph Benchmark (Leskovec)
03:30 AM  Lunch Break
04:30 AM  Original Research: Get Rid of Suspended Animation: Deep Diffusive Neural Network for Graph Representation Learning (Zhang)
04:45 AM  Original Research: Learning Graph Models for Template-Free Retrosynthesis (Somnath)
05:00 AM  Original Research: Frequent Subgraph Mining by Walking in Order Embedding Space (Ying)
05:15 AM  Invited Talk: Kyle Cranmer (Cranmer)
05:45 AM  Invited Talk: Danai Koutra (Koutra)
06:15 AM  Q&A / Discussions / Coffee 2
06:45 AM  COVID-19 Applications: Navigating the Dynamics of Financial Embeddings over Time (Gogoglou)
06:55 AM  COVID-19 Applications: Integrating Logical Rules Into Neural Multi-Hop Reasoning for Drug Repurposing (Liu)
07:05 AM  COVID-19 Applications: Gaining insight into SARS-CoV-2 infection and COVID-19 severity using self-supervised edge features and Graph Neural Networks (Sehanobish)
07:15 AM  Invited Talk: Tina Eliassi-Rad (Eliassi-Rad)
07:45 AM  Invited Talk: Raquel Urtasun (Urtasun)
08:15 AM  Q&A / Discussions / Coffee 3
08:45 AM  Virtual Poster Session #2
09:45 AM  Closing Remarks (Veličković)

Posters (Sessions 1 and 2)

#34 / Sess. 1  Set2Graph: Learning Graphs From Sets (Serviansky)
#96 / Sess. 1  Active Learning on Graphs via Meta Learning (Madhawa)
#40 / Sess. 2  HNHN: Hypergraph Networks with Hyperedge Neurons (Dong)
#95 / Sess. 2  Graphein - a Python Library for Geometric Deep Learning and Network Analysis on Protein Structures (Jamasb)
#43 / Sess. 2  Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings (Castro)
#46 / Sess. 2  Discrete Planning with End-to-end Trained Neuro-algorithmic Policies (Vlastelica Pogancic)
#92 / Sess. 1  From Graph Low-Rank Global Attention to 2-FWL Approximation (Puny)
#90 / Sess. 1  Pointer Graph Networks (Veličković)
#50 / Sess. 1  Graph Convolutional Gaussian Processes for Link Prediction (Opolka)
#86 / Sess. 2  Graph Generation with Energy-Based Models (Liu)
#54 / Sess. 2  SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks (Fuchs)
#84 / Sess. 2  UniKER: A Unified Framework for Combining Embedding and Horn Rules for Knowledge Graph Inference (Cheng)
#83 / Sess. 2  Connecting Graph Convolutional Networks and Graph-Regularized PCA (Zhao)
#77 / Sess. 1  SIGN: Scalable Inception Graph Neural Networks (Frasca)
#76 / Sess. 1  Graph Clustering with Graph Neural Networks (Tsitsulin)
#10 / Sess. 2  Multi-Graph Neural Operator for Parametric Partial Differential Equations (Li)
#12 / Sess. 1  Deep Graph Contrastive Representation Learning (Zhu)
#15 / Sess. 2  Learning Distributed Representations of Graphs with Geo2DR (Scherer)
#18 / Sess. 1  Hierarchical Protein Function Prediction with Tail-GNNs (Spalević)
#19 / Sess. 1  Neural Bipartite Matching (Georgiev)
#20 / Sess. 1  Principal Neighbourhood Aggregation for Graph Nets (Corso)
#21 / Sess. 2  Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions (Maser)
#23 / Sess. 2  A Note on Over-Smoothing for Graph Neural Networks (Cai)
#24 / Sess. 2  Degree-Quant: Quantization-Aware Training for Graph Neural Networks (Tailor)
#26 / Sess. 2  Message Passing Query Embedding (Daza)
#27 / Sess. 2  Learning Graph Structure with A Finite-State Automaton Layer (Johnson)
#29 / Sess. 2  Few-shot link prediction via graph neural networks for Covid-19 drug-repurposing (Zheng)
#31 / Sess. …  (remaining poster entries are cut off in this excerpt)
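The GRL+ abstract above names neural message passing as one of the core techniques behind graph representation learning. As a minimal, self-contained sketch (not part of the workshop book, and independent of libraries such as PyTorch Geometric or the Deep Graph Library mentioned in the schedule), the following Python snippet illustrates one round of sum-aggregate message passing on a small graph; the function and variable names are hypothetical.

```python
# Illustrative sketch of one neural message-passing round (sum aggregation),
# assuming only NumPy. Not taken from the workshop book or any specific paper.
import numpy as np

def message_passing_layer(node_feats, edges, w_msg, w_self):
    """One message-passing round.

    node_feats : (num_nodes, d) array of node features
    edges      : list of (src, dst) directed edges
    w_msg      : (d, d) weights applied to the aggregated incoming messages
    w_self     : (d, d) weights applied to each node's own features
    """
    num_nodes, d = node_feats.shape
    # Each edge carries the source node's features as a "message";
    # messages arriving at the same destination are summed (permutation-invariant).
    aggregated = np.zeros((num_nodes, d))
    for src, dst in edges:
        aggregated[dst] += node_feats[src]
    # Update step: combine own features with aggregated messages, apply a ReLU.
    updated = node_feats @ w_self + aggregated @ w_msg
    return np.maximum(updated, 0.0)

# Tiny usage example on a 3-node path graph 0 - 1 - 2 (edges in both directions).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
w_msg = 0.1 * rng.normal(size=(4, 4))
w_self = 0.1 * rng.normal(size=(4, 4))
h = message_passing_layer(x, edges, w_msg, w_self)
print(h.shape)  # (3, 4)
```

A practical implementation would replace the Python loop with batched sparse-matrix operations, which is what the graph libraries presented in the "Update" slots of the schedule provide.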