
Review Article

Machine Learning for Condensed Matter Physics

Edwin A. Bedolla-Montiel¹, Luis Carlos Padierna¹ and Ramón Castañeda-Priego¹
¹ División de Ciencias e Ingenierías, Universidad de Guanajuato, Loma del Bosque 103, 37150 León, Mexico
E-mail: [email protected]

Abstract. Condensed Matter Physics (CMP) seeks to understand the microscopic interactions of matter at the quantum and atomistic levels, and describes how these interactions result in both mesoscopic and macroscopic properties. CMP overlaps with many other important branches of science, such as Statistical Physics and High-Performance Computing. With the advancements in modern Machine Learning (ML) technology, a keen interest in applying these algorithms to further CMP research has created a compelling new area of research at the intersection of both fields. In this review, we aim to explore the main areas within CMP which have successfully applied ML techniques to further research, such as the description and use of ML schemes for potential energy surfaces, the characterization of topological phases of matter in lattice systems, the prediction of phase transitions in off-lattice and atomistic simulations, the interpretation of ML theories with physics-inspired frameworks and the enhancement of simulation methods with ML algorithms. We also discuss in detail the main challenges and drawbacks of using ML methods on CMP problems, as well as some perspectives for future developments.

Keywords: machine learning, condensed matter physics

Submitted to: J. Phys.: Condens. Matter

arXiv:2005.14228v3 [physics.comp-ph] 14 Aug 2020

[Figure 1 diagram: Condensed Matter Physics + Machine Learning, branching into Soft Matter (materials modeling, colloidal systems, experimental setups, atomistic feature engineering, machine-learned potentials and free-energy surfaces, structural and dynamical properties, enhanced simulation methods), Hard Matter (phase transitions and critical parameters, topological phases of matter, classical and modern ML approaches, enhanced simulation methods) and physics-inspired Machine Learning theory (explanation of deep learning theory, renormalization group, restricted Boltzmann machines, interpretation of ML models).]
Figure 1. Schematic representation of the potential applications of ML to CMP discussed in this review.

1. Introduction

Currently, machine learning (ML) technology has seen widespread use in various aspects of modern society: automatic language translation, movie recommendations, face recognition in social media, fraud detection, and more everyday life activities [1] are all powered by a diverse application of ML methods. Tasks like human-like image classification performed by machines [2]; the effectiveness of understanding what a person means when writing a sentence within a context [3]; the ability to distinguish a face from a car in a picture [4]; and automated medical image analysis [5] are some of the most outstanding advancements that stem from the mainstream use of modern ML technology [6]. Inspired by these notable applications, scientists have successfully applied ML technology to scientific fields such as matter engineering [7], drug discovery [8] and protein structure predictions [9]. Influenced by the overwhelming effectiveness of ML technology in other scientific fields, physicists have also applied such methods to specific subfields within Physics, with the main objective of making progress in the most challenging research questions.

The motivating reasons for this surge of interest at the intersection between Condensed Matter Physics (CMP) and ML are vast. For instance, images, language and music exhibit power-law decaying correlations equivalent to those seen in classical or quantum many-body systems at their critical points [10]; very large data sets with high-dimensional inputs, such as those found in materials science [11], are the quintessential problem that modern ML methods are tailored to deal with. It seems as though ML is well suited to be applied to CMP research problems given all these deep connections between the two fields, but not without drawbacks. Such is the case of the encoding into a latent space of features that represent the atomistic structure found in soft matter and molecular systems [12], a challenge that is currently being addressed with modern natural language processing and object detection techniques. The challenge of encoding and feature selection, as well as other key challenges of ML applications to CMP, will be discussed in further detail later on.

In this work, we review the contributions of ML methods to Physics with the aim of providing a starting point for both computer scientists and physicists interested in getting a better understanding of the interaction between these two research fields. For easier navigation of the topics covered in this review, the diagram shown in figure 1 displays a schematic representation of the outline, but we shall summarize them briefly here. We start the review with a very brief overview of both CMP and ML for the sake of self-containment, to help reduce the gap between both areas. In particular, the overview of CMP is meant to emphasize the difference between its two main branches, i.e., Hard and Soft Matter, because these areas have a fundamental physical difference that will later influence the way ML techniques are applied to their respective types of problems. We then continue by exploring some of the research inquiries within hard condensed matter physics. Hard Matter has seen most of the ML applications, for instance in phase transition detection and critical phenomena for lattice models. We continue the review by analyzing how both classical ML and Deep Learning (DL) techniques have been applied to obtain detailed descriptions of strongly correlated, frustrated and out-of-equilibrium systems.
We investigate how standard simulation methods, like Monte Carlo, have also seen enhancements from using ML models. Moving forward, we examine a new research area dubbed physics-inspired ML theory, an area where the most rigorous and fundamental physics frameworks have been applied to ML to obtain a deeper insight into the learning mechanisms of ML models. We discuss how, in this engaging research area, a robust understanding of the learning mechanism behind Restricted Boltzmann Machines has been obtained. We move on to reviewing how common theoretical physics frameworks, such as the Renormalization Group, have been used with interesting results to give insights into the learning mechanisms of the most common models in DL. The review continues with the exploration of materials modeling, Soft Matter and related fields, which have recently seen a strong surge of ML applications, mainly in the subfields of materials science and colloidal systems, e.g., proteins and complex molecular structures. Enhanced intelligent modeling, atomistic feature engineering, and machine-learned potential and free-energy surfaces have been the primary topics of research, with the identification of phase transitions as well as the determination of the phases of matter being a close second. As one has to be aware that every technique has some limitations, the review is closed with some of the main challenges and drawbacks of ML applications to CMP, in addition to some perspectives and outlooks about current challenges.

Before we continue with the review, we would like to mention that many other notable applications within related fields of CMP have been omitted due to lack of depth of knowledge in such fields and space constraints in this short review. Such applications include graph-based ML force fields [13] and the deep learning architecture SchNet used to model atomistic systems [14]; deep learning methodologies used to predict the ground-state energy of an electron, effectively solving the Schrödinger equation for various potentials [15]; the use of ML techniques for many-body quantum states [16, 17]; the use of graph convolutional neural networks to directly learn material properties from the connection of atoms in the crystal [18]; and the utilization of supervised and unsupervised ML techniques to approximate density functionals, as well as providing some insights into how these techniques might achieve such approximations [19, 20]. Lastly, we would like to point the reader interested in applications of ML to a wider range of areas within Physics to the recent review by Carleo et al [21], which discusses in detail some of the most relevant works and research areas.

2. Overview of Machine Learning

In this section, common ML concepts, including those used in this review, are described following a specific hierarchy composed of category, problem, model, and method or architecture. Figure 2 illustrates the relationship between these concepts. The interested reader can get further information on this partial taxonomy by following the references provided in the figure caption.

Machine Learning is a research field within Artificial Intelligence aimed at constructing programs that use example data or past experience to solve a given problem [22]. ML algorithms can be divided into the categories of supervised, unsupervised, reinforcement and deep learning [47, 22, 6]: in supervised learning the aim is to learn a mapping from the input data to an output whose correct values are provided by a supervisor; in unsupervised learning, there is no such supervisor and the goal is to find the regularities or useful properties of the structure in the input; in reinforcement learning, the learner is an agent that takes actions in an environment and receives a reward or penalty for its actions in trying to solve a problem, the output is a sequence of actions, and the main purpose is to learn from past good sequences to generate a policy that maximizes the reward; deep learning is a special kind of ML, designed to deal with high-dimensional environments. DL may be regarded as a transverse category, since it has been used to overcome the limitations of traditional algorithms of supervised [48], unsupervised [41], and reinforcement learning [31]. DL is focused on learning nested concepts, e.g., the image of a person, in terms of simpler concepts, such as the corners, contours or textures within the image [6].

[Figure 2 diagram: Machine Learning, divided into the categories Deep Learning, Supervised Learning, Unsupervised Learning and Reinforcement Learning; representative problems (classification, regression, data imputation, clustering, novelty detection, ranking, path tracking, optimal control, games); models (kernel methods, neural networks); and methods and architectures (SVC, SVR, LS-SVM, SV clustering, kernel PCA, SV ranking, KLSPI, SVM-A2C, KBR, DKNs, DEK, TWSVC, MLP, LSTM, RBF networks, AE, RBM, GAN, ART, RAN, CNN, Deep Q-Network, DBM).]

Figure 2. Partial taxonomy of Machine Learning methods and architectures. (i) A detailed description of other models not presented in the figure, such as graphical or Bayesian models, can be consulted in [22]. (ii) For further information regarding the ML categories please refer to [6] and [23]. (iii) Examples of methods for classification, regression and data imputation are discussed in [24], [25], and [26], respectively. (iv) Examples of methods for clustering, novelty detection and ranking can be found in [27], [28], and [29], respectively. (v) Examples of methods for path tracking, optimal control and games are introduced in [30], [31], [32], respectively. (vi) For further information regarding kernel methods and neural network models please refer to [33] and [34], respectively. (vii) KLSPI - Kernel-based least squares policy iteration [35]. (viii) SVM-A2C - Advantage Actor-Critic with Support Vector Machine Classification [36]. (ix) KBR - Kernel-based relational reinforcement learning [37]. (x) DKN - Deep Kernel Networks [38]. (xi) DEK - Deep Embedding Kernel [39]. (xii) TWSVC - Twin Support Vector Clustering [40]. (xiii) Some architectures have been used in more than one category; for instance, variants of Convolutional Neural Networks (CNN) were employed both in supervised and unsupervised problems [41], and Multilayer Perceptrons (MLP) have been used for supervised and reinforcement learning [33, 42]. (xiv) LSTM - Long short-term memories and RBF - Radial Basis Function Networks [34]. (xv) AE - Autoencoders and RBM - Restricted Boltzmann Machines [23]. (xvi) GAN - Generative Adversarial Networks [43]. (xvii) ART - Adaptive Resonance Theory Networks [44] and RAN - Resource Allocation Networks [45]. (xviii) Deep Q-Network [32] and DBM - Deep Boltzmann Machines [46].

The DL approach enhances previous ML models, such as neural networks and kernel methods, in several aspects, for instance by extending the capability of dealing with large-scale data, and by providing new architectures, operators, and learning methods for the solution of problems with higher precision [32, 49]. As a result, DL methods are able to deal with huge amounts of heterogeneous information, as well as to learn directly from raw data, without the user-dependent preprocessing step (feature extraction) required by classical ML methods, in which the main, filtered features of the raw data are supplied as the training vectors. Although Figure 2 presents several ML concepts, there exist other important models, like Decision Trees and Gaussian Processes, and methods such as boosting, LASSO, ridge regression, k-means, etc. [50, 51, 52], that are not included in this brief overview in order to focus on those concepts that frequently appear in the rest of the manuscript. Specifically, we will restrict ourselves to the description of Neural Networks (feed-forward, autoencoders, restricted Boltzmann machines and convolutional) and kernel methods (support vector machines). These methods and architectures are now described, with the exception of restricted Boltzmann machines, which, because of their relevance, are presented in subsequent sections. Useful references are included for readers interested in obtaining a deeper insight.

Artificial Neural Networks (ANNs) are ML models easily found in science and engineering applications. ANNs attempt to simulate the decision process of neurons of biological central nervous systems [53]. Many ANNs are computational graphs [54, 34, 55].
In these graphs, the nodes compute activation functions (sigmoidal, radial basis, and any other Tauber-Wiener function [56]) according to the input data provided by the connecting edges. The learning process consists of assigning weights to the edges and updating them according to a learning algorithm—such as backpropagation, Levenberg-Marquardt, Stochastic Gradient Descent, Natural Gradient Descent, etc. [57, 58, 59]. The tasks that ANNs can solve depend on the architecture, which specifies how data is processed and stored. Information can be stored either in an external memory [60, 61] or as an implicit memory codified into the weights of the network, the latter being the most common approach. The number of current ANN architectures is vast, but a project that tries to track them can be consulted in Ref. [62].

Multilayer feed-forward Neural Networks (FFNNs) are one of the simplest ANN architectures, where neurons are organized in layers [63]. The default version of FFNNs assumes that all neurons in one layer are connected to those of the next layer. Neurons of the input layer receive the data to be processed and then pass it to the next layer; successive layers feed into one another in the forward direction, from input to output. The key point is that the transfer of information between layers is controlled by nonlinear activation functions evaluated by the neurons. A neat illustration of this architecture and further details about its variants can be consulted in [34], sect. 1.2.2. Two examples of these networks are the Multilayer Perceptrons and Radial Basis Function Networks. When the learning process in FFNNs is supervised, in addition to the training data—X ⊂
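As a concrete illustration of the feed-forward architecture described above, the following minimal sketch implements the forward pass of a small fully connected network in NumPy. The layer sizes, the sigmoidal activation and the random weights are illustrative choices only, not a prescription taken from the references above.

```python
import numpy as np

def sigmoid(z):
    # Nonlinear activation evaluated by the neurons of a layer
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Forward pass of a fully connected feed-forward network:
    each layer applies an affine map followed by a nonlinear
    activation, and its output feeds the next layer."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Toy network: 4 inputs -> 3 hidden neurons -> 1 output
rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 4)), rng.normal(size=(1, 3))]
biases = [np.zeros(3), np.zeros(1)]
print(forward(rng.normal(size=4), weights, biases))
```

In a supervised setting, the weights and biases would then be adjusted by one of the learning algorithms mentioned above, e.g. backpropagation with stochastic gradient descent, so that the network output matches the supervisor-provided targets.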

3. Overview of Condensed Matter Physics

Understanding the intricate phenomena of all types of matter and their properties is the driving force of condensed matter physicists. From quantum theory all the way to materials science, CMP encompasses a large diversity of subfields within physics, as it deals with different time and length scales depending on both the molecular details and the type of matter being analyzed. To study such systems, theoretical, computational and experimental physicists collaborate to gain a deeper insight into the behavior of matter in and out of equilibrium. We shall briefly discuss the two main areas within CMP, namely hard matter and soft matter, their essential models, as well as the central problems these research areas focus on. More information about these research areas can be found in standard textbooks, see, e.g., [87, 88, 89], and reviews [90, 91].

3.1. Soft Condensed Matter

Fluids, polymers, colloids, and many more are the principal types of matter studied by Soft Condensed Matter, or Soft Matter (SM). SM has the peculiar property that it responds easily to external forces: it changes its flow when a small shear force is applied, or when thermal fluctuations act on it, thus changing its properties as a whole. We are surrounded by these materials in our everyday life, from glass windows and toothpaste to hair-styling gel, among others. These materials can show diverse properties, and when some external force is applied they will exhibit a different set of properties, but in all cases SM shows the fascinating property of self-organization of its molecules [92]. Proteins and bio-molecules are also good examples of SM: they change their structure when subjected to thermal fluctuations, while also showing a pattern of self-assembly [93, 94], as is the case in most types of SM. All in all, SM is a subfield of CMP that gained a lot of attention when first introduced in the 1970s by Nobel prize laureate Pierre-Gilles de Gennes [95], because it has proven to be of great importance to science in general, with very useful applications, as well as pushing the boundaries of theoretical, experimental and computational Physics.

In order to explore all these properties and changes in a SM system, a model that describes most of these types of matter is needed. Most fluids and colloidal dispersions show a repulsive behavior, which can be modeled very well with the hard-sphere model [96]. A system modeled as a hard-sphere dispersion only shows an infinite repulsive interaction when the interparticle separation is less than the particle diameter, and zero otherwise; there is no attractive interaction included in this model. The hard-sphere model is a very simple one, which has nevertheless exhibited intriguing and unexpected equilibrium and non-equilibrium states [97, 98, 99]. There are some interesting properties that define the hard-sphere model in SM. First of all, the internal energy of any allowed system configuration is always zero. Furthermore, forces between particles and variations in the free energy—defined from classical thermodynamics as F = E − TS, with E being the internal energy, T the temperature and S the entropy—are both determined entirely by the entropy. Moreover, the entropy in a hard-sphere system depends directly on the total volume occupied by the hard spheres, commonly expressed as the volume fraction, φ. When φ is small, the system shows the properties of an ideal gas, but as φ starts to increase, particles interact with each other and their motion is restricted by collisions with other nearby particles. Even more interesting is the fact that the phase transitions of hard-sphere systems are completely determined by φ [100]. In general, a SM system does not need to be composed only of hard spheres; it can also be built with non-spherical molecules, for example rods, sphero-cylinders—cylinders with semi-circular caps—and hard disks, just to mention a few examples. Nevertheless, all these systems experience a large diversity of phase transitions [101, 102, 103].

The description of phase transitions and critical phenomena is one of the main challenges in all of CMP. When a thermodynamic system changes from one set of uniform physical properties to another, the system undergoes a phase transition.
In SM, phase transitions are one of the most important phenomena that can be analyzed, given the diversity of phases that could appear in unexpected circumstances [104, 105]. In particular, a gas-liquid phase transition is defined by a temperature, known as the critical temperature, Tc, and a density, called the critical density, ρc. These quantities are also known as the critical point of the system, meaning that two phases can coexist when the system is held at these special values. When neither of these thermodynamic quantities can be employed, an order parameter is more than sufficient to determine the phase transitions of a system. Order parameters are special quantities that measure the degree of order across a phase transition; they take different values for the different phases in the system [106]. For instance, when studying the liquid-gas transition in a fluid, a specified order parameter can take the value of zero in the gaseous phase, but it will take a non-zero value in the liquid state. In this scenario, the density difference between both phases (ρl − ρg)—where ρl is the density in the liquid state and ρg in the gas state—can be a useful order parameter.

When traversing the phase diagram of a SM system, one might encounter the so-called topological defects. A topological defect in ordered systems appears when the order parameter space is continuous everywhere except at isolated points, lines or surfaces [107]. This can be the case of a system which is stuck in-between two phases, and there is no continuous distortion or movement of the particles that can return the system to an ordered phase. One such example is that of nematic liquid crystals [108]. The nematic phase is observed in a hard-rod system when all the rods align in a certain direction. Hard rods are allowed to move and to rotate freely; furthermore, neighboring hard rods tend to align in the same direction. But if it so happens that some hard rods are aligned in one direction and the rest in another, then the system contains topological defects [109]. These defects depend on the symmetry of the order parameter of the phases, as well as on the topological properties of the space in which the transformation is happening. Topological defects are of the utmost importance in a large variety of systems in SM; e.g., they determine the strength of the nematic ordering mentioned before.

Nevertheless, these types of systems and problems are not so easily solved. How can we tell when a system will be prone to topological defects? Is it possible to determine the phases that a system will come across given its configuration? Even more challenging questions keep originating when these types of exotic matter appear, and theoretical, experimental or even computational frameworks alone are not enough to tackle these problems. Physicists are always open to new ways to approach such challenges, and ML is slowly starting to become a powerful tool to deal with these complex systems; in this review we shall discuss some of the ways ML has been applied to such problems, as well as comment on some of the problems that can arise from using such techniques. However, it is important to note that SM also experiences thermodynamic transitions to solid-like phases. For example, in the case of the hard-sphere model, when the packing fraction φ approaches the value of φ ≈ 0.492 [110, 111], it undergoes a first-order liquid-solid transition [112, 113].
This thermodynamic behavior can also be studied using theoretical and experimental techniques employed when one deals with Hard Matter. Thus, at some point, SM also lies at the boundary with Hard Matter.
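As a small numerical illustration of the role of φ in the hard-sphere model, the sketch below computes the packing fraction of a monodisperse system and compares it with the freezing value φ ≈ 0.492 quoted above; the particle number, diameter and box size are arbitrary illustrative values, not taken from the references.

```python
import numpy as np

def packing_fraction(n_particles, diameter, box_length):
    # phi = N * (pi / 6) * sigma^3 / V for N monodisperse hard spheres
    volume = box_length ** 3
    return n_particles * (np.pi / 6.0) * diameter ** 3 / volume

# Illustrative values only: 1000 spheres of unit diameter in a cubic box
phi = packing_fraction(n_particles=1000, diameter=1.0, box_length=10.2)
print(f"phi = {phi:.3f}")
print("beyond the freezing point" if phi > 0.492 else "fluid regime")
```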

3.2. Hard Condensed Matter

When physicists deal with materials that show structural rigidity, such as metals, insulators or semiconductors, the properties of these materials are the main focus of Hard Condensed Matter, or Hard Matter (HM). In contrast to SM systems, HM typically deals with systems on smaller scales, i.e., matter that is governed by atomic and quantum-mechanical interactions. Entropy no longer dominates the free energy of a HM system; that role now falls to the internal energy. One particular example of interest in HM is the phenomenon of superconductivity. A material is considered a type I superconductor if all electrical resistance is absent and it is a perfect diamagnet [114]—i.e., all magnetic flux fields are excluded from the system; a type II superconductor is not completely diamagnetic because it does not exhibit a complete Meissner effect [115]. The phenomenon of superconductivity is described by both quantum and statistical mechanical interactions. This means that superconductivity can be seen as a macroscopic manifestation of the laws of quantum mechanics. As was the case with SM, we are also surrounded by HM materials; for instance, semiconductors are the principal components in microelectronics [116], which in turn are the components of every digital device we use, ranging from computers to smart cellphones. The applications that stem from HM are too numerous to enumerate in this short review, and the theoretical, experimental and computational frameworks that have derived from HM are nothing short of revolutionary; we refer the reader who might be interested in some of these applications to a technical perspective article [117], and to a more thorough book on some of the modern aspects of HM, as well as its applications to technology [118].

In HM, we also need a reference model that can help explain, at the basic level, some of the investigated materials. Many other materials can use a similar reference model, but with different assumptions and physical quantities. Such is the case of lattice systems and, in particular, the Ising model [119]. The Ising model is the simplest model that can explain ferromagnetic properties, in addition to accounting for quantum interactions. It is defined on a lattice, much like a crystalline solid. At each site of the lattice there is a discrete variable σk that takes one of two values, σk ∈ {−1, +1}. The energy of a lattice configuration is given by the following Hamiltonian

$$H(\sigma) = -\sum_{\langle i,j \rangle} J_{ij}\,\sigma_i \sigma_j - \mu \sum_{j} h_j \sigma_j, \qquad (1)$$

where Jij is a pairwise interaction between lattice sites; the notation ⟨i, j⟩ indicates that i and j are nearest-neighbor lattice sites; µ is the magnetic moment and hj is an external magnetic field. As originally solved by Ising in 1924 [120], when the system lies on a one-dimensional lattice it shows no phase transition. Later, Lars Onsager proved in 1944 [121] that when the Ising model lies on a two-dimensional lattice a continuous phase transition is observed. For dimensions larger than two, different theoretical [122, 123] and computational [124] frameworks have been proposed.
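As a concrete reading of Eq. (1), the following sketch evaluates the energy of a spin configuration on a two-dimensional square lattice, assuming the common simplifications of a uniform coupling Jij = J, a uniform field hj = h and periodic boundary conditions; these assumptions are illustrative choices, not part of the general model.

```python
import numpy as np

def ising_energy(spins, J=1.0, h=0.0, mu=1.0):
    """Energy of Eq. (1) on a 2D square lattice, assuming a uniform
    coupling J, a uniform external field h and periodic boundaries."""
    # Count each nearest-neighbour bond once by shifting the lattice
    # along the two axes.
    interaction = -J * np.sum(spins * np.roll(spins, 1, axis=0)
                              + spins * np.roll(spins, 1, axis=1))
    field = -mu * h * np.sum(spins)
    return interaction + field

rng = np.random.default_rng(1)
spins = rng.choice([-1, 1], size=(16, 16))
print(ising_energy(spins))
```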

While the Ising model has seen some very useful applications, it is not the only model that has shaped the research interests within CMP, as it cannot possibly contain and explain all phenomena seen in HM. Generalized models, such as the n-vector model [125] or the Potts model [126], are extensively studied in HM and have been successfully applied to explain even more intricate phenomena [127, 128]. In general, all these models show phase transitions, in the same way as SM systems. This research area grew even more when the Berezinsky-Kosterlitz-Thouless (BKT) phase transition was discovered [129, 130, 131, 132]. The BKT transition was unveiled in two-dimensional lattice systems, such as the XY model—which is a special case of the n-vector model. The BKT transition is a phase transition that is driven by topological excitations and does not break any symmetries in the system [133]. This type of phase transition was quite abstract in the 1970s when first introduced, but it turned out to be a great theoretical framework that was later observed in exotic materials [134, 135], which in turn showed puzzling properties. Such was the impact of the BKT transition in CMP that it would later be awarded a Nobel prize in Physics in 2016 [136].

However, the BKT transition is a complicated transition to study, and hard condensed matter physicists are always trying to find better ways to look for these kinds of unique aspects in materials; it is an elusive and complicated challenge. In the same spirit as in SM, ML has already been put to the test on this task: if it is possible to re-formulate the problem at hand as a ML problem, it can then be solved by standard ML techniques. But, as mentioned before, the task of re-formulation and feature selection is not simple; there is currently no standard way to approach it in CMP applications, contrary to most ML models and applications. We shall examine these applications and their potential challenges, as well as their drawbacks, in more detail in the following sections.

4. Hard Matter and Machine Learning

In this section, we review the main ML techniques and models applied to CMP, mainly to HM. These applications might constitute a paradigm of ML applications to CMP due to the results obtained, which are generally acceptable. The structure of HM data is similar to that found in computer vision applications of ML, among others. Furthermore, these schemes have also exhibited most of the major shortcomings of ML, such as feature selection and the training and tuning of ML models.

4.1. Phase transitions and critical parameters

One of the most prominent uses of NNs was on the two-dimensional ferromagnetic Ising model, which has a Hamiltonian similar to Eq. (1), with the aim of identifying the critical temperature, Tc, at which the system leaves an ordered phase—the ferromagnetic phase—and enters a disordered phase—the paramagnetic phase. Carrasquilla and Melko [137] introduced a new paradigm by essentially formulating the problem of

phase transitions as a supervised classification problem, where the ferromagnetic phase constitutes one class and the paramagnetic phase another; a ML algorithm is then used to discriminate between the two under specific conditions. By sampling configurations using standard Monte Carlo simulations for different system sizes, and then feeding this raw data to a FFNN, an estimate of the correlation-length critical exponent ν is obtained directly from the output of the last layer of the trained NN.

Finite-size scaling theory [138] is then employed to obtain an estimate of Tc. Even if the geometry of the lattice is modified, the power of abstraction provided by the NN [34] is enough to predict the critical parameters, even when the network was trained on a completely different geometry; this makes it a very convenient tool for identifying critical parameters of systems that do not have a well-defined order parameter [139].
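A minimal sketch of this supervised formulation is given below, using scikit-learn's MLPClassifier as the feed-forward network. Here synthetic configurations stand in for the Monte Carlo samples, and the lattice size, labels and network width are illustrative choices; this shows the pipeline rather than reproducing Ref. [137].

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
L = 16  # linear lattice size

def fake_configs(n, p_flip):
    # Stand-in for Monte Carlo samples: aligned lattices with a
    # fraction p_flip of randomly flipped spins (ordered for small
    # p_flip, disordered for p_flip near 0.5).
    base = np.ones((n, L * L))
    mask = rng.random((n, L * L)) < p_flip
    return np.where(mask, -base, base)

X = np.vstack([fake_configs(500, 0.05), fake_configs(500, 0.5)])
y = np.array([0] * 500 + [1] * 500)   # 0 = ferromagnetic, 1 = paramagnetic

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```

In the actual methodology, the classifier output as a function of temperature and system size is what feeds the finite-size scaling analysis used to locate Tc.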

4.2. Topological phases of matter

Having shown that ML is able to identify phases of matter in simple models, such as those with a Hamiltonian similar to Eq. (1), researchers have become interested in trying out these algorithms on more complex systems, and a lot of work has been put into identifying topological phases of matter using ML algorithms for strongly correlated topological systems [140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150]. For instance, Zhang et al [151] have applied FFNNs as well as CNNs to find the BKT phase transition and to construct a complete phase diagram of the XY and generalized XY models, with very promising results. In the case of the FFNN approach, Zhang et al found that a simple NN architecture consisting of one input layer, one hidden layer with four neurons and one output layer is more than enough to obtain the critical exponent ν for the XY model. We can compare this to the results obtained by Kim and Kim [152], who report a similar behavior but for the Ising model. In their work, analyzing a tractable model of a NN, they found that a NN architecture consisting of one input layer, one hidden layer with two neurons and one output layer is sufficient to obtain the critical exponent ν for the Ising model. If a shallow architecture is able to obtain good approximations for the critical exponent ν, then perhaps such models can be further simplified into more robust and analytically tractable ones. Such is the case of the Support Vector Machine methodology developed by Giannetti et al [153].

It is important to keep track of the model complexity, especially for complicated models, to avoid overfitting and to reduce the amount of data needed to train the model. In the case of the NN methodologies exposed before, we can argue that such models are easier to train if there are enough data samples in the data set; when this is not the case, overfitting will occur, yielding low-accuracy results. NNs are also faster to train because these models can be implemented on accelerated computing platforms, such as Graphics Processing Units (GPUs). On the other hand, SVMs cannot be readily accelerated with such techniques, but they are less prone to overfitting because, in cases where the number of support vectors—the subset of the data set needed to construct the decision function and thus make a prediction—is smaller than the total number of samples, the model complexity is low enough that a smaller data set might be enough to obtain desirable approximations. Still, SVMs need to be adjusted according to the task, so hyperparameter tuning must always be part of the data processing pipeline, which in turn might make it take longer to obtain results. If this is the case, then simpler methods—such as logistic regression or kernel ridge regression [154]—could potentially be better suited for the task of approximating critical exponents, because these models are simple to implement, need fewer data samples, and most implementations are numerically stable and robust; at the very least, these simple models could provide a starting point for such approximations. New and interesting methodologies are always arising in order to solve the problem of phase transitions.
Kenta et al [155] developed a generalized version of the methodology proposed by Carrasquilla and Melko [137] for a variety of models, namely, the Ising model, the q-state Potts model [156] and the q-state clock model. Instead of using the raw data obtained from Monte Carlo simulations as proposed in [137], Kenta et al considered the configuration of a specific long-range spatial correlation function for each model, effectively exploiting feature engineering and hence providing meaningful data to the NN, which now contains relevant information about the systems at hand. This strategy has the advantage of generalization for the already trained NN; it can thus be readily applied to similar systems without further manipulation of the physical model or the production of new simulation data. The strategy is also quite general in the sense that it can be applied to multi-component systems, with similar precision to standard theoretical methods.

Finally, we remark on some of the advances in topological electronic phases. The most fundamental quantity to characterize these states is the so-called topological invariant, whose value determines the topological class of the system. In particular, interfaces between systems with different topological invariants show topologically protected excitations, resilient towards perturbations respecting the symmetry class of the system. Mapping topological invariants using NNs, as performed in [157], shows that supervised learning is an efficient methodology to characterize the local topology of a system. DL has also seen some useful applications to these types of systems, e.g., for random quantum systems [158, 159, 160].
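To make the feature-engineering idea above concrete, the sketch below computes a simple axis-averaged two-point spin correlation function from a configuration and uses it as the input vector in place of the raw spins. The estimator is an illustrative choice and not the exact correlator employed in Ref. [155].

```python
import numpy as np

def correlation_function(spins, max_r):
    """Axis-averaged two-point spin correlation C(r) = <s_0 s_r> for
    r = 1..max_r, with periodic boundaries; an illustrative engineered
    feature vector for a 2D configuration."""
    c = []
    for r in range(1, max_r + 1):
        c_r = 0.5 * (np.mean(spins * np.roll(spins, r, axis=0))
                     + np.mean(spins * np.roll(spins, r, axis=1)))
        c.append(c_r)
    return np.array(c)

rng = np.random.default_rng(2)
spins = rng.choice([-1, 1], size=(32, 32))
features = correlation_function(spins, max_r=8)  # fed to the NN instead of raw spins
print(features.shape, features[:3])
```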

4.3. Classical and modern ML approaches

The amount of success that ML has seen in the area of Lattice Field Theory and Spin Systems has given it a special place in the toolbox of physicists, providing a different perspective on a given problem based on different ML models, each with its own advantages and drawbacks. It is especially important to identify the tools that have seen the most use in this topic, the main one being FFNNs, given their generalization and abstraction capabilities, for which they have become the workhorse of most of the applications in this field—particularly when the problem can be formulated

as a supervised learning problem. For more classical approaches, SVMs have also been applied to the phase transition problem [148, 153] with promising results, given their ease of use and the physical interpretation of their results, which we shall discuss in detail in section 5.4. It is useful to compare the SVM method against the NN approach presented in the previous subsection, especially given the fact that both techniques can yield very similar results; equivalently, we can ask ourselves, when should we use one technique or the other? As mentioned in the previous subsection, SVMs are a great tool when the number of samples in the data set is low; this is because SVMs only need a small subset, i.e., the support vectors, to effectively predict some value. The main disadvantage of SVMs is their time complexity when training, as it completely depends on the implementation used and the data set. For example, the most popular implementation of a SVM solver is LIBSVM [82], which scales between O(M × N²) and O(M × N³), where M is the number of features and N is the number of samples in the data set. SVMs depend strongly on the hyperparameters used, so in the case of an incorrectly tuned SVM the training time could be O(N³) in the worst-case scenario, while also yielding poor results. On the other hand, if the user is careful in tuning and in selecting a good kernel, as presented in [153], one can solve a CMP problem with very high precision; in this case, some physical insight into the problem can be helpful in providing good parameters for the SVM. If we now turn our attention to NNs and their training time complexity, it normally scales as O(N³), provided the NN uses an efficient linear algebra and matrix multiplication implementation. If we also account for the backpropagation operations, which scale linearly with the number of layers in the network, we can see that NNs are significantly more difficult to train than SVMs. Research is being carried out to understand why NNs are hard to train in theory and yet are efficiently trained in practice [161, 162]. The reason for this is that modern numerical linear algebra implementations are efficient, and they can be accelerated with modern computing technologies, such as GPUs. This makes NNs a great tool when the number of data samples is large, while training can also happen online, i.e., feeding the network with newly created samples. In short, both techniques give similar results provided there are enough representative data samples for the problem. A simple criterion for choosing between both methodologies depends greatly on the data set and the problem, but starting off with SVMs can give a good approximation, and if even more precision is needed, or if the data set contains a large number of features, then NNs might be a better choice for the problem at hand.

Many other classical techniques—specifically the ones that stem from unsupervised learning methods—have become prominent algorithms. For instance, one such technique is the nonlinear dimensionality reduction algorithm t-distributed Stochastic Neighbor Embedding (t-SNE) [163], which Zhang et al [151] used to map high-dimensional features onto lower-dimensional spaces in order to identify three distinct phases in the site percolation model.
In this methodology, the idea is to use different site configurations, handled as a single array of values from the lattice, and use t-SNE to map this high-dimensional space into a two-dimensional one, such that, when enough iterations are performed, three distinct clusters appear: one for each ordered phase (the percolating and non-percolating phases) and one for the transition region between them. Although t-SNE is mostly a visualization tool, by employing other unsupervised learning techniques, such as k-means [154] or spectral clustering [164], together with the results obtained from t-SNE, one could possibly obtain a good approximation of the transition point in the percolation model. Another prime example is Principal Component Analysis (PCA) [165, 154], which has proved to be one of the best feature extraction algorithms that can be used in Spin Systems [166, 167], as it can be readily applied to raw configurations to build different features from the information in the system, which can then be compared to physical observables like structure factors and order parameters. For instance, Wei [167] established a methodology to build a matrix M with different site configurations from the Ising model—using a simplified version of the Hamiltonian from Eq. (1)—and then perform PCA on M. This yields a set of new component vectors that describe the high-dimensional space of M in a new, lower-dimensional space. These new vectors are used as a basis set to project the original samples onto this space, and it is in this new space that clustering techniques like k-means can be used in order to find a phase transition; a minimal sketch of this pipeline is given at the end of this subsection.

DL has also been wielded as a powerful technique to understand quantum phase transitions [168] in the form of adversarial NNs [169], and to provide a thorough description of many-body quantum systems [170] by employing Variational Autoencoders [171, 6]; CNNs [6] have been applied, for instance, to learn quantum topological invariants [172] or to solve frustrated many-body models [173]. One of the main reasons for DL to be such a strong asset is that it can handle complex data representations and achieve good accuracy, provided these models have enough data to learn from. The main drawback of these models is precisely the fact that the data must be representative of the task at hand; the data must also preserve the invariants of the system, if it has any, such as invariance to translations and rotations. These types of issues are mostly solved with data augmentation techniques, such as adding some white noise to the data samples or creating additional mirrored samples, although other such modifications can be done. But again, this implies that the user of such models must know the problem well enough to perform such adjustments for the sake of stable training and good results.
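The sketch below illustrates the PCA-plus-clustering pipeline described above: flattened configurations form the rows of a matrix M, PCA projects them onto the leading components, and k-means separates the clusters. Random arrays stand in for Monte Carlo samples, so the sizes and parameters are illustrative only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
L = 16

# Rows of M are flattened spin configurations (random stand-ins for
# Monte Carlo samples at different temperatures).
ordered = np.sign(rng.normal(0.9, 0.5, size=(300, L * L)))
disordered = rng.choice([-1.0, 1.0], size=(300, L * L))
M = np.vstack([ordered, disordered])

# Project onto the leading principal components ...
projected = PCA(n_components=2).fit_transform(M)

# ... and cluster in the low-dimensional space to separate the phases.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(projected)
print(np.bincount(labels))
```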

4.4. Enhanced Simulation Methods

One of the standard computer simulation techniques in HM, as well as in SM, is the Monte Carlo (MC) method, an unbiased probabilistic method that enables the sampling of configurations from a classical or quantum many-body system [174, 175, 176, 177]. Throughout the decades, specific modifications have been made to the MC method in order to enhance it, creating better, more rigorous methods that can be used in all types of simulations—from observing phase transitions to computing quantum path

integrals. It is in this situation that ML has something to offer in order to improve the original MC method; such is the case of the Self-learning Monte Carlo (SLMC) method developed by Li et al [178]. In this novel scheme, the classical MC method is used to obtain configurations from a Hamiltonian similar to Eq. (1)—these configurations will serve as the training data—; we call this set of configurations {s}, drawn from a state space S, where each sample has a probability density given by some function w: S → ℝ⁺. Then a ML method is employed—linear regression, in this case—to learn the rules that guide the

configuration updates in the simulation, i.e., a parametrized effective model w̃θ which can approximate the original model and obtain w̃θ ≈ w, with w̃θ: S̃ → ℝ⁺ and S̃ ⊇ S. This enables the MC method to learn by itself under which conditions a given configuration is updated, without external manipulation or guidance, efficiently sampling configurations for the most general possible model. Demonstrated on a generalized version of the Hamiltonian given by Eq. (1), the SLMC method is extremely efficient when producing configurations from the system, lowering the autocorrelation time of the updates. Li et al discussed that, although a ten-fold speedup over the original MC method is obtained, the SLMC still suffers from the quintessential problem of the original MC method—when approaching the thermodynamic limit, the method heavily slows down, rendering it useless. Almost simultaneously, Huang and Wang [179] developed a similar scheme, but using restricted Boltzmann Machines (RBMs). This interesting approach consists of building special types of RBMs that can use the physical properties of the system being studied. The strategy by Huang and Wang lowers the autocorrelation time as well as raising the acceptance ratio of sampled configurations. The SLMC method has sparked interesting new research [180, 181, 182, 183, 184]; recently, Nagai, Okumura and Tanaka [185] have shown that the SLMC is even more promising by coupling it with Behler-Parrinello NNs [186], making it more general and less biased towards specific physical systems.

Despite its advantages, the SLMC method still has some shortcomings and drawbacks. As with every other ML model, good and representative training data is required to train the model; for a complex model, i.e., one that contains a large number of parameters to be fitted, more data samples are needed to avoid overfitting. This might become a problem if the original MC method used to sample configurations is already slow. Another important drawback is that the ML model used to approximate the probability density for the configuration set must be robust enough to handle the problem at hand, so some intervention from the user is required in order to choose a good model; but this is not a simple task, because it would imply that the dynamics of the system must be known from the start, which might remove all the advantages of using ML models for automated computation of the problem. To alleviate some of these disadvantages, Bojesen [187] proposed a reinforcement learning approach, defined as the Policy-guided Monte Carlo (PGMC) method. In this methodology, the focus is on an automatic approach to target the proposal generation for each configuration, which can be done by tuning the policy for making updates. Showing

promising results in some models, the PGMC proves to be a better candidate for automatic MC sampling, although it also has its own shortcomings, primarily the fact that if the available information is inadequate, the training might become unstable and the probability density will be biased, ultimately breaking ergodicity. Nevertheless, these methods prove that ML models can be used in CMP to obtain some gains in computational efficiency and simplicity, making them a potentially good tool for more challenging problems.
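For reference, the sketch below shows the plain single-spin-flip Metropolis loop for the Ising model of Eq. (1), assuming a uniform coupling J and zero field; in an SLMC-like scheme, the energy difference would instead be evaluated with a learned effective model, with an acceptance correction restoring exact sampling. This is an illustration of the sampling loop that such ML surrogates plug into, not the algorithm of Ref. [178] itself.

```python
import numpy as np

def metropolis_sweep(spins, beta, J=1.0, rng=None):
    """One sweep of single-spin-flip Metropolis updates for the 2D
    Ising model with uniform coupling J, zero field and periodic
    boundaries. In a self-learning scheme, dE would instead come
    from a learned effective model."""
    if rng is None:
        rng = np.random.default_rng()
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        neighbours = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
                      + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2.0 * J * spins[i, j] * neighbours  # energy cost of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

rng = np.random.default_rng(4)
spins = rng.choice([-1, 1], size=(16, 16))
for _ in range(100):
    spins = metropolis_sweep(spins, beta=0.5, rng=rng)
print("magnetization per spin:", spins.mean())
```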

4.5. Outlook

Current research is geared towards enhancing mainstream methodologies in CMP with ML, enabling physicists to obtain high-precision results while also providing some insight into the learning mechanisms used by the ML algorithms employed. Another current area of research in Spin Systems is that of figuring out which ML techniques are more suitable to understand out-of-equilibrium systems [188], or even more complex frustrated models [189], just to name a few. Challenges such as representative data set generation and feature selection must also be overcome within this area if we wish to see these methodologies become more popular. This matter is even more important for out-of-equilibrium systems, because it is not a simple task to generate useful training data sets for them.

5. Physics-inspired Machine Learning Theory

5.1. Explanation of Deep Learning Theory

The gain in knowledge between ML and CMP is not one-sided, as one might think. ML has also benefited greatly from the thorough and rigorous theoretical frameworks that CMP possesses. Despite their huge success in human-like tasks—e.g., the classification of images [2]—, ML, and more importantly DL, models are not always able to explain why this success is even possible in the first place, because these models might not reach the algorithmic transparency of classical ML models, e.g., linear models. Schmidt et al [11] mention that representative datasets, a good knowledge of the training process, and a comprehensive validation of the model can usually overcome this obstacle. Other classical algorithms, such as MC computations, are not always analytically tractable either, so we might argue that this problem is not unique to ML models. With this in mind, an interpretation of the models used can be done to some extent, and even more so with the help of theoretical frameworks, such as those found within Physics. In 2016, Zhang et al [190] took it upon themselves to run detailed experiments that showed the lack of a complete understanding of current learning theory from a theoretical perspective; on the practical side of things, DL is still very prosperous in its results. Since then, physicists have tried to give insights [191] into this topic, creating a fruitful research area with extremely interesting cross-fertilization between both communities. For a more thorough and complete analysis of the contributions to the explanation and interpretability of DL until now, the reader can refer to the recent review on the topic by Fan, Xiong and Wang [192].

5.2. Renormalization Group

The lack of theoretical understanding of the learning theory behind most ML algorithms seemed like a good challenge for condensed matter physicists experienced in manipulating theoretical frameworks such as the Renormalization Group (RG) [122, 123]. The RG is a mathematical framework that allows a very rigorous and systematic investigation of a many-body system by working with short-distance fluctuations and building up this information to understand the long-distance interactions of the whole system. In 2014, Mehta and Schwab [193] provided such an insight by creating a mapping between the Variational Renormalization Group and the DL workflow that is common nowadays. This turned out to be an interesting take on the problem, resulting in a very rigorous framework by which one could explain the learning theory that Deep NNs follow when data is fed to them. In the same spirit, Li and Wang [194] proposed a new scheme, this time in a more systematic way, by means of hierarchical transformations of the data from an abstract space—the space of features present in the data—to a latent space. It turns out that this approach has an exact and tractable likelihood, facilitating unbiased training with a step-by-step scheme; they proved this through a practical example using Monte Carlo configurations sampled from the one-dimensional and two-dimensional Ising models. More research regarding a deep link between RG and DL has been carried out since then; for instance, Koch-Janusz and Ringel [195] showed that even though the association between RG and DL is strong, most of the practical methodology to achieve good results from it is not so readily available. By using feed-forward NNs, they proved that a RG flow is obtainable without discarding the important degrees of freedom that are sometimes integrated out when following the RG framework. This seems like a promising and enriching area of research, as new insights are constantly developed [196, 197, 198].

5.3. Restricted Boltzmann Machines

Instead of giving a complete explanation of a learning theory, CMP has also been able to give thorough explanations of specific ML algorithms; such is the case of the Restricted Boltzmann Machine (RBM) [199]. This type of model is a very interesting case of the inspiration drawn from statistical physics onto ML. In the RBM, a NN is trained with a given data set, and the model learns complex internal representations of the data; after the model has been trained, it outputs an approximation of the probability distribution of the input data. The resulting approximate probability distribution is then linked to a certain ML task, be it classification [200], dimensionality reduction [201], feature selection/engineering [202], and many more. The underlying structure of the model is not a conventional FFNN; instead, it is a type of model that is built using graphs [203]. The principal mechanism by which the RBM learns features is by computing an energy function E(h, v) from the hidden, h, and visible, v, units that form a configuration; after computing this energy function, one evaluates the Boltzmann factor of the energy to obtain a probability for the given configuration, $P(h, v) = e^{-E(h,v)}/Z$. From classical statistical mechanics we know that the partition function $Z = \sum_{h,v} e^{-E(h,v)}$ is the normalizing constant needed to ensure that 0 ≤ P(h, v) ≤ 1. It is this link between the Boltzmann description and the energy function of the RBM that makes it an ideal candidate to be analyzed within the statistical mechanics framework.

Interestingly enough, RBMs are extremely versatile, despite being a ML method for which the training mechanisms are not as fast as in more modern ML algorithms, so special technical schemes have been developed for the task. This is possibly the greatest drawback of using RBMs: the training procedure is not unique, and the most common procedure to train such models is the so-called contrastive divergence algorithm [204]. This algorithm can be tricky to use effectively and requires a certain amount of practical experience to decide how to tune the corresponding hyperparameters of the model. A good practical guide to training RBMs is the one by Hinton [205], where special care is taken to inform the reader of the uses and shortcomings of the training procedure for RBMs.

We shall discuss some of the attempts at providing insight into the training mechanisms of RBMs by using theoretical frameworks. For example, Huang and Toyoizumi [206] introduced a mean-field theory approach to perform the training using a message-passing-based method that evaluates the partition function, together with its gradients, without requiring statistical sampling—which is one of the drawbacks of traditional learning schemes. Further work on understanding the thermodynamics of the model was carried out [207] with encouraging results, showing that the learning mechanism in general is completely tractable, albeit complex when the model enters a non-linear regime. The RBM can also be seen as the inverse Ising model [208], where instead of obtaining physical quantities like order parameters and critical exponents from system configurations, one has the physical observables as input and intends to obtain insights into the system itself. With this theoretical framework in mind, the first steps to produce rigorous explanations of the algorithm were taken by Decelle et al
[209] by studying the linear regime of the training dynamics using spectral analysis. RBMs are rich and resourceful models that have been applied to a number of hard tasks, for instance the modeling of protein families from sequential data [210, 211]. Another such complex task is that of continuous speech recognition, for which a special type of RBM was employed with promising results [212]. Within the realm of CMP, these models have been used to construct accurate ground-state wave functions of strongly interacting and entangled quantum spins, as well as of fermionic models on lattices [213]. RBMs are currently among the most flexible algorithms, despite the disadvantages of their training mechanisms. These models are particularly well suited to study problems within CMP and quantum many-body physics [214]. With further research efforts to make RBMs scalable to larger data sets and to speed up their training mechanisms, it might be possible for RBMs to become a standard tool in ML applications to CMP problems.
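As a concrete illustration of the energy-based description given above, the following sketch evaluates E(h, v) for a binary RBM with the standard bilinear energy and normalizes by an exhaustively enumerated partition function. The tiny layer sizes and random weights are illustrative, and exact enumeration of Z is only feasible for such toy sizes; in practice Z is intractable, which is precisely why schemes like contrastive divergence are needed.

```python
import numpy as np
from itertools import product

def rbm_energy(v, h, W, b, c):
    # Standard bilinear RBM energy: E(v, h) = -b.v - c.h - v.W.h
    return -(b @ v + c @ h + v @ W @ h)

rng = np.random.default_rng(5)
n_v, n_h = 4, 3                     # toy sizes so Z can be enumerated exactly
W = rng.normal(scale=0.1, size=(n_v, n_h))
b, c = np.zeros(n_v), np.zeros(n_h)

def states(n):
    # All binary configurations of n units
    return product([0, 1], repeat=n)

# Partition function Z = sum over all (v, h) of exp(-E(v, h))
Z = sum(np.exp(-rbm_energy(np.array(v), np.array(h), W, b, c))
        for v in states(n_v) for h in states(n_h))

v, h = np.array([1, 0, 1, 1]), np.array([0, 1, 0])
print("P(v, h) =", np.exp(-rbm_energy(v, h, W, b, c)) / Z)
```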

5.4. Interpretation of Machine Learning models
Despite the success they have achieved, ML algorithms are prone to a lack of interpretability, resulting in so-called black box algorithms that cannot provide a full, rigorous explanation of the learning mechanisms and features that lead to their outputs. However, for a condensed matter physicist, it is more important to understand the core mechanism by which the ML model obtained the desired output, so that it can later be linked to a useful physical model. As an example, we can take the model with the Hamiltonian described by Eq. (1) and train a FFNN on simulation data, following the methodology of Carrasquilla and Melko [137]; once the NN has been trained, the critical exponent is obtained and used to compute the critical temperature, but the question remains: what enabled the NN to obtain such information? Is there a mechanism to extract meaningful physical knowledge from the weights and layers of the NN? To answer these types of questions, researchers have had to resort to arduous extraction and analysis of the underlying model architectures. For example, Suchsland and Wessel [215] trained a shallow FFNN on two-dimensional configurations from Hamiltonians similar to Eq. (1), then proceeded to comprehensively analyze the weight matrices obtained from the training scheme, while also performing a meticulous study of the reasons why activation layers and their respective neurons fire a response when required; they found that both elements (training weights and activation layers) play a crucial role in understanding how the NN is able to output the physical observables to which we give a meaning. They also determined that the learning mechanism corresponds to a domain-specific understanding of the system, i.e., in the case of the Ising model, the NN effectively learned the structure of the surrounding neighbors of a particular spin; with this information it then proceeded to compute the magnetization order parameter as part of the computations involved in the training and prediction strategies. Another approach is to use classical ML algorithms like SVMs, given that these models provide a rigorous mathematical formulation of the mechanism by which they learn from given features in the data. Ponte and Melko [148], as well as Gianetti et al. [153], exploited these models and used the underlying decision function of the SVMs to map the Ising model's order parameters, obtaining a robust way to extract the critical exponents of the system. Within this approach, one does not need to provide a meticulous explanation of the learning mechanism of the model, since this has already been done in its own theoretical formulation; instead, an adjustment of the physical model with respect to the ML algorithm needs to be performed in order to successfully achieve a link between both frameworks. Nonetheless, this approach has the disadvantage that it can be hard to map the ML method to the physical model, as it requires a deep and experienced understanding of the physical system, while simultaneously needing a comprehensive awareness of the theoretical formulation of the ML algorithm being used.
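As a rough illustration of the kind of weight analysis described above, and not the actual setup of Refs. [137, 215], the following sketch trains a small FFNN to separate synthetically generated "ordered" and "disordered" spin configurations and then reads out the first-layer weight matrix, which is the object one would examine to understand which spin patterns drive the classification. The lattice size, label construction and network architecture are all illustrative assumptions.

import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
L, n_samples = 8, 400          # 8x8 lattice, 400 configurations per class

# Disordered: random spins. Ordered: mostly aligned spins with a few flips.
disordered = rng.choice([-1, 1], size=(n_samples, L * L))
m = rng.choice([-1, 1], size=(n_samples, 1))       # overall magnetization sign
flips = rng.random((n_samples, L * L)) < 0.05      # 5% flipped spins
ordered = m * np.where(flips, -1, 1)

X = np.vstack([disordered, ordered]).astype(float)
y = np.array([0] * n_samples + [1] * n_samples)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X, y)

# coefs_[0] is the (input x hidden) weight matrix; its structure is what one
# would analyze to understand how the network encodes the order parameter.
W_input = clf.coefs_[0]
print("first-layer weights shape:", W_input.shape)
print("mean |weight| per hidden unit:", np.abs(W_input).mean(axis=0))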

5.5. Outlook
The approach of illustrating the training scheme, as well as the loss-function landscape, of a ML algorithm with CMP has proven to be both intriguing and effective. Baity-Jesi et al. [216] used the theory of glassy dynamics to give an interesting interpretation of the training dynamics in DL models. Similar work was performed by Geiger et al. [217], but this time with the theory of jamming transitions, disclosing why poor minima of the loss function in a FFNN are not encountered in the overparametrized regime, i.e., when there exist more parameters to fit in a NN than training samples. Dynamical explanations of training schemes have the advantage of providing precise frameworks that have been understood in the physics community for a long time, but they are an intricate solution, as they need a scrupulous analysis of the model, leaving little room for generalization to other ML models. It is true that FFNNs, as well as deep NNs, are among the least understood ML models, but they are not the only ones that need a carefully detailed framework; perhaps in future research we will see how other ML models benefit from all these powerful techniques developed so far by the CMP community.

6. Materials Modeling

6.1. Enhanced Simulation Methods
Molecular dynamics (MD) is a classical simulation technique that integrates Newton's equations of motion at the microscale to simulate the dynamical evolution of atoms and molecules [218]. It is, along with the MC method, a fundamental tool in computer simulations within CMP, specifically in HM, SM and Quantum Chemistry. The powerful method of MD enables the scientist to obtain physical observables through a controlled and simulated environment, by-passing limitations that most experimental setups have to deal with. MD has become a cornerstone method to model complex structures, such as proteins [219] and chemical structures [220], just to name a few. But such complex structures involve large numbers of particles (with their many degrees of freedom) as well as complicated interactions, all of which must be handled in order to obtain meaningful results out of the simulations. This creates a difficult challenge to overcome, as efficient exploration of the phase space is a difficult task, even with MD simulation code accelerated with modern hardware. In order to reduce the computational burden of modeling such many-body systems, special enhanced sampling methods have been developed [221], but not even these schemes can alleviate the arduousness of modeling complicated systems. ML is seen as a promising candidate to surpass modern solutions in MD simulations, as shown primarily by Sidky and Whitmer [222]. In their novel approach, the authors employed FFNNs to learn the free energies of relevant chemical processes, effectively enhancing MD simulations with ML and creating a comprehensive method to obtain high-precision information from simulations that was not previously readily available. Another interesting procedure to enhance MD simulations was the automated discovery of features that could potentially be fed into intelligent models, making it possible to create a systematic and automated workflow for materials research. Sultan and Pande [223] developed such techniques by training several classical ML methods (Support Vector Machines, Logistic Regression, and FFNNs), showing that, given a MD framework, one could set up a complete, self-adjustable scheme to obtain the system's free energy, entropy, structural information and many more observables. Another powerful tool to study many-body systems is Density Functional Theory [224] (DFT), which is a quantum mechanical framework for the investigation of the electronic structure of atoms. We refer the reader to useful references for more information on DFT [225, 226]. Although a very robust tool, its main disadvantage is that the performance of DFT results depends on the approximations and functionals used; it is also computationally demanding for large systems. The use of NNs to enhance these computations has shown favorable results, for instance, in the work by Custódio, Filletti and França [227], where an approximation of a density functional by means of a NN reduces the computational cost, while simultaneously obtaining accurate results for a vast range of fermionic systems. For similar applications to DFT, we would like to refer the reader to a review where a number of different methods are used to solve the Schrödinger equation and related problems of constructing functionals for DFT [228]. Although in its early stages, the enhancement of classical methods with ML is an intriguing area of research.
When no rigorous foundation is needed, ML models are able to exceed human performance in the modeling of atomistic simulations, making them a suitable candidate to become the mainstream method in the future. Of course, there is a long path ahead to fully embrace these methods, as a complete simulation strategy is not yet easily available; further research on the coupling between molecular simulations and ML schemes is the primary way to achieve this.
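As a toy illustration of the idea of learning free energies with a network, in the spirit of the approach of Ref. [222] but without its actual machinery (which couples the network to the sampling itself), the sketch below fits a small regressor to noisy samples of a hypothetical double-well free-energy profile along a single collective variable. The profile, noise level and network size are arbitrary assumptions.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)
x = rng.uniform(-2.0, 2.0, size=(2000, 1))     # collective variable samples
F = (x ** 2 - 1.0) ** 2                        # hypothetical double-well F(x)
F_noisy = F + rng.normal(scale=0.05, size=F.shape)

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
model.fit(x, F_noisy.ravel())

x_test = np.linspace(-2, 2, 9).reshape(-1, 1)
print(np.c_[x_test, model.predict(x_test)])    # learned F(x) on a coarse grid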

6.2. Machine-learned potentials
In this new day and age, computer simulations are a standard technique for research in CMP and its sub-fields. These simulations produce enormous quantities of data depending on the sub-field in which they are employed; applied soft matter and related fields, for example, have produced one of the largest databases, the Protein Data Bank [229], which contains free theoretical and experimental molecular information as a by-product of research in specific topics. This explosion of data available to the public has attracted scientists to look into the ML methodologies that were created for the sole purpose of dealing with the so-called big data outbreak [230] of recent years, effectively uniting both fields in an attempt to use such well-tested intelligent algorithms in the understanding and automation of computer-aided models. One such task is that of constructing complex potential energy surfaces (PESs) with the experimental and computational data available [12]. This has turned out to be a successful application to materials science, physical chemistry and computational chemistry, as it allows scientists to obtain high-quality, high-precision representations of atomistic interactions of all kinds of materials and compounds. In atomistic simulations and experiments, scientists are always interested in obtaining a good representation of PESs because it guarantees the full recovery of the potential energy (and, as a consequence, the full atomistic interactions) of a many-body system. The baseline technique that provides such information is the coupling of MD with DFT, which turns out to be a very computationally demanding scheme, even for the simplest cases. This constituted the starting point of a long-standing effort to use ML models that could compute such complicated functions, but not without the usual challenges of good data representation, stable training and good hyperparameter tuning. The first attempt was made by Blank et al [231], who proposed to fit FFNNs with information produced by low-dimensional models of a CO molecule chemisorbed on a Ni(111) surface. They obtained promising results, such as faster evaluation of the potential energy by means of the NN model compared to the original empirical model. The data sets used by Blank et al comprised between two and twelve features, or degrees of freedom, making it simple to train the NN models. But this is not always the case (in fact, one could argue that it is never the case), as systems like proteins consist of thousands of particles and electronic densities, hence a large number of degrees of freedom needs to be computed simultaneously [232]. Another challenge for these types of ML potentials is that of the descriptors, or features, needed to feed an intelligent model. Data fed into ML models should account for all possible variations and transformations seen in real physical models to extend the generalization of the ML schemes and avoid bias in them. It took more than ten years after this pioneering work for Behler and Parrinello [186] to take the challenge to a new level by proposing a similar methodology employing FFNNs, now advocating a more generalized approach to computing the PES of an atomistic system by creating what they called atom-centered symmetry functions.
We will describe this method in detail, as it is an important achievement in the application of ML to CMP and materials modeling. The main purpose of the Behler-Parrinello approach is to represent the total energy $E$ of the system as a sum of atomic contributions $E_i$, such that $E = \sum_i E_i$. This is done by approximating $E$ with a NN. The input of the NN should reflect information of the system from the contribution of all corresponding atoms, while also preserving important information about the local energetic environment. In this sense, symmetry functions are used as features, transforming the positions into other values that contain more useful information about the system. In order to define the energetically relevant local environment, a cutoff function $f_c$ of the interatomic distance $R_{ij}$ is employed, which

has the form
$$ f_c(R_{ij}) = \begin{cases} 0.5 \left[ \cos\left( \dfrac{\pi R_{ij}}{R_c} \right) + 1 \right] & \text{for } R_{ij} \leq R_c, \\ 0 & \text{for } R_{ij} > R_c. \end{cases} \qquad (2) $$
Next, radial symmetry functions are constructed as a sum of Gaussian functions with parameters $\eta$ and $R_s$,

$$ G_i^1 = \sum_{j \neq i}^{\text{all}} e^{-\eta (R_{ij} - R_s)^2} f_c(R_{ij}). \qquad (3) $$
Finally, angular terms are constructed for all triplets of atoms by summing the cosine values of the angles $\theta_{ijk}$ centered at atom $i$, with $\cos\theta_{ijk} = \mathbf{R}_{ij} \cdot \mathbf{R}_{ik} / (R_{ij} R_{ik})$ and $\mathbf{R}_{ij} = \mathbf{R}_i - \mathbf{R}_j$; the following symmetry function is constructed

$$ G_i^2 = 2^{1-\zeta} \sum_{j,k \neq i}^{\text{all}} (1 + \lambda \cos\theta_{ijk})^{\zeta} \, e^{-\eta (R_{ij}^2 + R_{ik}^2 + R_{jk}^2)} f_c(R_{ij}) f_c(R_{ik}) f_c(R_{jk}), \qquad (4) $$
with parameters $\lambda (= +1, -1)$, $\eta$, $\zeta$. These parameters, along with the ones defined in Eq. (3), need to be found before these symmetry functions can be used as input to the NN. The parameter fitting relies on DFT calculations for the system to be studied: the root mean squared error (RMSE) between the fit and the reference values is monitored, and new DFT calculations are added to the training set whenever the RMSE exceeds a chosen threshold. With this information, training and testing data sets are constructed and then used to train the NN, which in turn yields an approximation for $E$. This approximation is then used to compute different observables and quantities of the system (a minimal numerical sketch of the symmetry functions in Eqs. (2)-(4) is given at the end of this subsection). This methodology introduced a paradigm shift, because it was no longer a proof-of-concept that ML algorithms could be used in these research fields, and many other attempts have been carried out since then. Following the footsteps of Behler and Parrinello, Bartók et al [233] proposed a similar methodology, but with a completely different ML scheme (in this case Gaussian Processes [51]), which in turn needed new atomistic descriptors. We shall discuss atomistic descriptors in more detail in the next section. The advantages of such techniques in all the sub-fields of CMP are manifold, the main one being that simulations are now faster and more precise. Another advantage of such methods is the fact that we can leverage the retention of the learned features of the NN: by employing transfer learning [234], the model does not need to be trained again to compute the same interactions it has already learned. For the reader interested in more detailed discussions on transfer learning we point out some reviews [235, 236] that give enough information on the current advancements on this topic. Nevertheless, this approach has one significant drawback, which is the fact that most NNs cannot learn sequential tasks, which in turn creates a phenomenon called catastrophic forgetting [237, 238, 239, 240]. When catastrophic forgetting occurs, the NN

abruptly loses most of the information learned from a previous task, and is unable to use it in the new task at hand. Transfer learning is a ML technique that needs meticulous manipulation from the user, involving fine tuning (a training process where some of the learned information is kept and some is learned again), followed by joint training, where new information is learned using both previous and newly acquired information obtained from fine tuning. Research is currently being carried out [241, 242] to overcome this disadvantage of NNs and shared learning. Machine-learned potentials have also been successfully applied to explaining complex interactions, like the van der Waals interaction in water molecules and the reason this interaction results in the unique properties seen in water [243]. Another interesting application was the simulation of solid-liquid interfaces with MD [244], an application that was not previously thought to be computationally simple or even tractable. One of the remaining challenges in this area of research is a paradoxical one: these ML potentials require very large data sets that are computationally demanding to produce in order to train the intelligent model. One possible approach to alleviate this issue is to build and maintain a decentralized data bank similar to the Protein Data Bank (which could become a cornerstone of modern computational materials science and chemistry), enabling anyone interested in this type of research to extract insights from the acquired data without needing to run large-scale simulations or re-train complicated NN architectures to do so. This is an interesting research field at the junction of many other specialized sub-fields, and one that is growing fast thanks to the computing power currently available, setting up the pathway for new and interesting intelligent materials modeling, which could surely become the future of materials science.
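The following is the minimal numerical sketch of the atom-centered symmetry functions of Eqs. (2)-(4) promised above, evaluated for a single atom in a small cluster. The cluster geometry and all parameter values (R_c, eta, R_s, zeta, lambda) are illustrative placeholders rather than values from the cited works; in practice the parameters are chosen so that the functions resolve the relevant local environments before the NN is trained.

import numpy as np

def f_c(R, Rc):
    """Cutoff function of Eq. (2)."""
    return np.where(R <= Rc, 0.5 * (np.cos(np.pi * R / Rc) + 1.0), 0.0)

def G1(i, pos, Rc=6.0, eta=0.5, Rs=0.0):
    """Radial symmetry function of Eq. (3) for atom i."""
    Rij = np.linalg.norm(pos - pos[i], axis=1)
    R = Rij[np.arange(len(pos)) != i]
    return np.sum(np.exp(-eta * (R - Rs) ** 2) * f_c(R, Rc))

def G2(i, pos, Rc=6.0, eta=0.1, zeta=1.0, lam=1.0):
    """Angular symmetry function of Eq. (4) for atom i."""
    total = 0.0
    n = len(pos)
    for j in range(n):
        for k in range(n):
            if i in (j, k) or j == k:
                continue
            Rij_v, Rik_v = pos[i] - pos[j], pos[i] - pos[k]
            Rjk_v = pos[j] - pos[k]
            Rij, Rik, Rjk = map(np.linalg.norm, (Rij_v, Rik_v, Rjk_v))
            cos_t = np.dot(Rij_v, Rik_v) / (Rij * Rik)
            total += ((1.0 + lam * cos_t) ** zeta
                      * np.exp(-eta * (Rij ** 2 + Rik ** 2 + Rjk ** 2))
                      * f_c(Rij, Rc) * f_c(Rik, Rc) * f_c(Rjk, Rc))
    return 2.0 ** (1.0 - zeta) * total

# Illustrative four-atom cluster (positions in arbitrary units).
positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [0.0, 1.2, 0.0], [0.9, 0.9, 0.8]])
print("G1 =", G1(0, positions), " G2 =", G2(0, positions))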

6.3. Atomistic Feature Engineering
So far, we have discussed the different ways in which scientists have been able to apply ML technology to different sub-fields of CMP, but lattice systems seem to prevail, with most of the reported research dealing with these systems; why is that? One of the fundamental parts of using a ML method is that of data input with specific features; for instance, images are described by pixels in a two-dimensional grid, where the pixels are the features that numerically represent the image, and these pixels are then fed to a ML model, e.g., a classifier; another clear example are numerical inputs that can be readily used in regression tasks, and then used as a prediction model. We wish to obtain a similar representation for physical systems, so as to ease the use of ML models in condensed matter systems. Recall that the intrinsic features of lattice systems are spins ($\sigma_i$); these spins do not change location, they stay put and only change their spin value, taken from a finite, discrete set of values. We can easily see a link between images and pixels on the one hand, and a lattice system and its spins on the other; both can be described by a set of discrete features whose locations do not change as the system evolves. But we cannot say the same about atomistic and off-lattice systems, thus we cannot use the same representation as for lattice systems, because atomistic systems are described primarily by the constant movement of their atoms and particles, and their many degrees of freedom; for instance, for a three-dimensional system, one has 3N translational degrees of freedom, with N being the total number of particles. By using the raw positional coordinates, velocities or accelerations, we might not be able to convey the true features that the system possesses; and even if it were possible, they could be conveyed in a biased way, e.g., by defining an arbitrary coordinate system or by not being invariant to translations and rotations. In order to overcome this limitation, a special mechanism needs to be developed to feed these data to the ML algorithms. It is this encoding of raw atomistic data into a set of special features that has seen a strong surge in recent years when dealing with materials of any kind. Atomistic descriptors have been studied heavily throughout the years [245, 246, 247, 248, 249, 250, 251] because they are the gateway to applying ML methodologies to atomistic systems. The reader is referred to these reviews [12, 252] for more information on the topic. Descriptors are a systematic way of encoding physical information into a set of feature vectors that can then be fed to a ML algorithm. These encodings need to have the following properties: a) the descriptors must be invariant to any type of point symmetry, for example, translations, rotations and permutations; b) the computation of the descriptor must be fast, as it needs to be evaluated for every possible structure in the system; and c) the descriptors have to be differentiable with respect to the atomic positions to enable the computation of analytical gradients for the forces. Computing these descriptors essentially corresponds to what is called feature selection in the ML literature, and most of the time these procedures are carried out in an automated way in a normal ML workflow. Once a particular descriptor has been chosen, all the ML methodology that we have seen applied to lattice systems can be employed on atomistic and off-lattice systems.
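As a simple illustration of property a) above, the sketch below encodes a particle configuration as a normalized histogram of interparticle distances, which is invariant under translations, rotations and particle permutations (it does not satisfy property c), since a hard histogram is not differentiable; smooth radial functions such as Eq. (3) are used for that purpose). The bin edges, cutoff and random configuration are illustrative assumptions.

import numpy as np

def distance_histogram(positions, r_max=3.0, n_bins=12):
    """Encode a configuration as a permutation/rotation/translation-invariant
    feature vector built from all pairwise distances up to r_max."""
    n = len(positions)
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    pair_distances = d[np.triu_indices(n, k=1)]
    hist, _ = np.histogram(pair_distances, bins=n_bins, range=(0.0, r_max))
    return hist / hist.sum()

rng = np.random.default_rng(3)
config = rng.uniform(0.0, 3.0, size=(32, 3))      # 32 particles in a box

# The descriptor is unchanged by a rigid rotation plus a translation:
theta = 0.7
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0, 0.0, 1.0]])
moved = config @ Rz.T + np.array([5.0, -2.0, 1.0])
print(np.allclose(distance_histogram(config), distance_histogram(moved)))  # True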

6.4. Outlook
ML-enhanced materials modeling is a strong research area, with a focus on streamlining computations and automating material discoveries. Descriptors that require less data should be a primary research direction, as data demands are currently one of the main shortcomings. A standard framework and benchmark data sets could also help further this area, while also looking forward to an automated workflow.

7. Soft Matter

At this stage, most of the ML applications we have discussed have primarily been focused on HM and atomistic many-body systems. This is because, for such systems, ML methods can be applied using data sets that need little feature engineering, even with the disadvantages already mentioned. Off-lattice and SM systems, along with other atomistic many-body systems, on the other hand, need thorough feature engineering and a different approach to training ML models.

The applications we will review in this section reflect this inherent challenge of finding representative features that could enable ML models to be used in tasks common to all CMP, e.g., phase transitions and critical phenomena.

7.1. Phase Transitions
Phase transitions are the cornerstone of ML applications to lattice systems, but with the use of atomistic descriptors these applications can easily be extended to off-lattice systems, which are usually found in SM. Jadrich, Lindquist and Truskett [253, 254] developed the idea of using the set of distances between particles as a descriptor for the system, and applied unsupervised learning (in particular, Principal Component Analysis) to these descriptors. They wanted to test whether a ML model was able to detect a phase transition in hard-disk and hard-sphere systems. When the unsupervised method was applied, the output of the model was later found to be the positional bond order parameter of the system [255, 256], which effectively determines whether the system has undergone a phase transition. In other words, the ML model was able to automatically compute the order parameter without having been told to do so. A similar approach to colloidal systems was taken by Boattini and coworkers [257]. Instead of using inter-particle distances, they computed the orientational bond order parameters [258, 259]; they then used a DL unsupervised method (a FFNN-based autoencoder) to obtain a latent representation of the computed order parameters. With all these preprocessing steps, they fed the encoded data to a clustering algorithm and obtained detailed descriptions of all the phases of matter in the system at hand. This methodology is systematic and lends itself nicely to an automated form of phase classification in colloidal systems.
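In the spirit of the distance-based descriptors of Refs. [253, 254], although with synthetic stand-in configurations rather than hard-disk simulation data, the sketch below encodes two-dimensional configurations by their sorted pairwise distances and feeds them to PCA; the leading principal component separates fluid-like from crystal-like samples. All sizes and parameters are illustrative assumptions.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(4)

def sorted_distances(positions, n_keep=20):
    """Descriptor: the n_keep smallest pairwise distances of a configuration."""
    d = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    pair = np.sort(d[np.triu_indices(len(positions), k=1)])
    return pair[:n_keep]

def fluid_config(n=36, box=6.0):
    return rng.uniform(0.0, box, size=(n, 2))

def crystal_config(n_side=6, a=1.0, noise=0.02):
    xy = np.array([[i * a, j * a] for i in range(n_side) for j in range(n_side)])
    return xy + rng.normal(scale=noise, size=xy.shape)

X = np.array([sorted_distances(fluid_config()) for _ in range(200)]
             + [sorted_distances(crystal_config()) for _ in range(200)])
pc1 = PCA(n_components=1).fit_transform(X).ravel()
print("mean PC1, fluid  :", pc1[:200].mean())
print("mean PC1, crystal:", pc1[200:].mean())   # the two groups separate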

7.2. Colloidal systems
Liquid crystals are a state of matter that has both the properties of a liquid and of a solid crystal [260]. In the simplest case, liquid crystals can be studied using one of the reference models of SM, i.e., hard rods, or similar anisotropic particles [261]. These types of systems exhibit properties similar to those present in HM, such as topological defects. The main issue with using the same methodology is that building a data set with useful and representative features in systems made of liquid crystal molecules is not trivial, mainly because positions and other physical observables can take any continuous value instead of a specific value from a given set. In the work by Walters, Wei and Chen [262], a time-series approach using Recurrent NNs was employed, using the basic descriptors of a system of liquid crystals, namely the orientation angle and positions of all the particles in the system. By using long short-term memory networks [263, 264] (LSTM), they showed that off-lattice simulation data can be used as input to identify several topological defects. This is a great step towards the use of mainstream ML techniques in SM, as one of the main drawbacks, which is feature selection and data set building, is resolved by using this

kind of method. Although building a data set from computer simulations does not pose much of a problem, it might do so for experimental setups. Similarly, Terao [265] used CNNs to analyze liquid crystals, as well as different types of solid-like structures, by computing special descriptors for the system. In particular, the approach presented in that work can be seen as a generalization of the one typically used for HM, which is to transform the off-lattice data into a common structure normally used in computer vision. By extracting subsystems with a few particles from the total system, a conversion to grayscale images of size l × l, in the case of two-dimensional systems, was performed by using the following expression,

$$ u_{i,j} = C \sum_{n=1}^{m_p} \exp\left( -\frac{|\mathbf{r}_n - \mathbf{R}_{ij}|^2}{a^2} \right), $$

where $\mathbf{r}_n$ is the position of the $n$-th particle in the subsystem, of size $L_s$ and total number of particles $m_p$; $\mathbf{R}_{ij}$ is the position vector of the center of pixel $(i, j)$; and the constant $C$ is determined such that the set $\{u_{i,j}\}$ is rescaled to the interval [0, 1] (a minimal numerical sketch of this encoding is given at the end of this subsection). The advantage of this method is its generalization potential, due to the fact that no physical input is needed in order for the technique to work, so a large range of systems could be studied under the same methodology. One possible disadvantage is the fact that some systems might need to conserve certain symmetries, so this should be taken into account when constructing the data set. Polymers are materials that consist of large molecules composed of smaller subunits. These materials are of great interest in SM because most polymers are surrounded by a continuum, turning them into colloidal systems [266]. Polymers are also found in biological systems, e.g., the deoxyribonucleic acid (DNA) found in almost all living organisms. Even though these systems have been studied extensively throughout the years, computational and experimental issues always arise. It is in these situations where ML might be a suitable approach. One such example is the design of a computational method for understanding and optimizing the properties of complex physical systems by Menon et al [267]. In that work, physical and statistical modeling are integrated into a hierarchical machine learning framework, together with simple experiments that probe the underlying forces which combine to determine changes in slurry rheology due to dispersants. One of the main contributions of this work is that it shows how physical and chemical information can aid in reducing the number of samples in a data set needed to do efficient ML modeling. If data sets are constructed using experimental data and some insight about the problem, the use of ML techniques might be easier to achieve. An issue presented in the work by Menon et al concerns how data sets might be subject to correlations between system variables and responses of the system. This could be a potential issue when training some ML models, affecting training stability and the quality of the results. A more recent approach to polymers is the work by Xu and Jiang [268], where a ML methodology is used for studying the swelling of polymers immersed in solvents. In this work, the importance of using chemical and physical descriptors from the beginning to construct the ML framework is highlighted. By constructing and using these descriptors, molecular representations are used, along with data augmentation, to study the relationships between molecular fragments and swelling degrees with the help of PCA. Various other applications are constantly arising, such as the discovery and design of new types of polymers with ML methods [269]. As ML methods help with complex representations of data, they could potentially be used to accelerate the discovery of innovative materials within polymer science.
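The following is the minimal numerical sketch of the pixel encoding given above: a two-dimensional subsystem of particle positions is mapped to an l × l grayscale image whose intensities are sums of Gaussians centered at each particle. The subsystem size, image size and width a used here are illustrative values, not those of Ref. [265].

import numpy as np

def positions_to_image(positions, Ls=5.0, l=32, a=0.3):
    """Map 2D particle positions in a box of side Ls to an l x l image."""
    edges = np.linspace(0.0, Ls, l + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    Xc, Yc = np.meshgrid(centers, centers, indexing="ij")   # pixel centers R_ij
    u = np.zeros((l, l))
    for r in positions:                                     # sum over particles
        u += np.exp(-((Xc - r[0]) ** 2 + (Yc - r[1]) ** 2) / a ** 2)
    # choose C so that the intensities are rescaled to [0, 1]
    return u / u.max() if u.max() > 0 else u

rng = np.random.default_rng(5)
particles = rng.uniform(0.0, 5.0, size=(40, 2))
image = positions_to_image(particles)
print(image.shape, image.min(), image.max())   # (32, 32), values in [0, 1]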

7.3. Properties of equilibrium and non-equilibrium systems
The structural and dynamical properties of equilibrium and non-equilibrium systems, such as complex liquids, glasses, gels, granular materials, and many more, are extensively studied under the branch of colloidal SM [270, 271]. Common tools, such as computer simulations, are used to study such properties, but for some systems these tools are so computationally demanding that special facilities, such as high-performance computing (HPC) clusters, are needed. Even if in the current era we are able to use this kind of computational resources, we are still limited by the amount of computation needed for large, physically representative systems. Studying simple reference models, Moradzadeh and Aluru [272] were able to build a DL autoencoder framework to compute structural properties of these simple systems. Using MD and long-running simulations, they collected several position snapshots for a diverse range of Lennard-Jones systems and, using the autoencoder as a denoising framework, they computed the time-averaged radial distribution function (RDF). This was achieved using a data set built with three main features: the thermodynamic state, composed of the temperature and density, (T, ρ), at each time step, and the RDF snapshot at that exact moment. Both time and data are efficiently reduced with this methodology, as only 100 samples are needed once the network is trained, although a large data set is employed to do such training. Non-equilibrium systems have also seen some applications of ML. Employing SVMs, Schoenholz et al [273] found that structure is important to glassy dynamics in three dimensions. The results were obtained by performing MD simulations of a bidisperse Kob-Andersen Lennard-Jones glass [274] in three dimensions at different densities ρ and temperatures T above its dynamical glass transition temperature. At each density, a training set of 6,000 particles was selected, taken from a MD trajectory at the lowest T studied, to build a hyperplane in a space of local structural features; the softness $S_i$ of particle $i$ is then defined with respect to this hyperplane, with $S_i > 0$ if $i$ lies on the soft side of the hyperplane and $S_i < 0$ otherwise. With this method, it is shown that there is structure hidden in the disorder of glassy liquids, and this structure can be quantified by softness, computed through a ML framework. New ML applications to design and create glass materials are steadily appearing, as presented in the review by Han et al [275], where challenges and limitations of these applications to glassy systems are also discussed.
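A rough sketch of a softness-like quantity in the spirit of Ref. [273] is given below: a linear SVM is trained on per-particle features labeled by whether the particle rearranged, and the (unnormalized) signed distance to the resulting hyperplane plays the role of S_i. The features and labels here are synthetic placeholders, not the structure functions of the original work.

import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(6)
n, d = 2000, 10
X = rng.normal(size=(n, d))                       # per-particle structural features
w_true = rng.normal(size=d)                       # hidden rule generating the labels
y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(int)   # "rearranged" labels

svm = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
softness = svm.decision_function(X)               # signed distance to the hyperplane
print("fraction with S_i > 0 among rearranging particles:",
      (softness[y == 1] > 0).mean())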

7.4. Experimental setups and Machine Learning
Experiments are a fundamental part of CMP, providing results that can later be compared with theoretical and computational results. Experimental setups have also been enhanced with ML models in different aspects of SM. In the work by Li et al [276], collective dynamics, measured through piezoelectric relaxation studies, are explored to identify the onset of a structural phase transition in nanometer-scale volumes. Aided by k-means clustering on raw experimental data, Li et al were able to automatically determine phase diagrams in systems where identification of the underpinning physical mechanisms is difficult to achieve and microscopic degrees of freedom are generally unavailable for observation. Although these results are more precise than those obtained with traditional tools, experimental setups might still be difficult and expensive to perform each time, so ML methods might be able to alleviate these shortcomings by requiring less data or performing more efficiently than other techniques. We have touched several times on the fact that data, and the number of samples needed, are one of the main shortcomings for the application of ML methods to CMP. This is even more of a challenge in experimental setups, where large quantities of data samples might not be attainable for a given system. The end-to-end framework developed by Minor et al [277] shows that this problem can be overcome. The collection of physical data of a liquid crystalline mesophase was performed using an experimental setup, i.e., a mechanical quench experiment. The ML pipeline consisted of real-time object detection using the YOLOv2 [278] architecture to detect topological defects in the system. Standardization and image enhancement techniques were needed to mimic real-world inaccuracies in the training data set. The model achieved spatial and number resolution comparable to the human annotations against which the pipeline was compared, with a significant improvement in the time spent on each frame, thus resulting in a dramatic increase in time resolution (more frames analyzed).
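To illustrate the clustering idea behind the approach of Ref. [276], with synthetic relaxation curves standing in for the actual piezoresponse measurements, the sketch below clusters response curves from two regions with different (hypothetical) relaxation rates; the resulting labels act as a crude proxy for a phase map.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)
t = np.linspace(0.0, 1.0, 50)

def relaxation_curve(rate):
    """Noisy exponential relaxation measured at 50 time points."""
    return np.exp(-rate * t) + 0.02 * rng.normal(size=t.size)

# 100 grid points in "phase A" (slow relaxation), 100 in "phase B" (fast).
curves = np.array([relaxation_curve(1.0) for _ in range(100)]
                  + [relaxation_curve(8.0) for _ in range(100)])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(curves)
print("cluster assignments for the two regions:",
      labels[:100].mean(), labels[100:].mean())   # ~0 and ~1 (or swapped)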

7.5. Outlook
As we have discussed so far, the fundamental step in applying ML to SM and off-lattice systems is to define and use a set of meaningful descriptors, but the choice of descriptors is large as well as system-dependent. In the case of experimental setups, the efficient use of data samples is necessary. In order to fully embrace ML in SM and colloidal systems, a more automated approach will need to be developed, posing a new and fascinating research area. Recently, DeFever et al [279] developed a generalized approach that does not use hand-made descriptors; instead, a complete DL workflow is used with raw coordinates from simulations, and it is able to obtain promising results about phase transitions, both in and out of thermodynamic equilibrium, in

SM systems. This scheme is a great step forward towards a fully automated strategy for off-lattice systems, as it reduces the amount of bias a scientist might introduce while using certain descriptors for given systems. The only downside to employing these types of DL methodologies is the large amount of data needed, so perhaps in future research we will see approaches where one can use a small amount of data to obtain similar results. These disadvantages seem to disappear in end-to-end frameworks such as the one described by Minor et al [277]; still, the amount of data might be an obstacle when applying ML models. This is discussed in detail in the next section.

8. Challenges and Perspectives

At this point of the review, we have discussed some of the major applications of ML to CMP. We have also touched on other useful insights provided by theoretical frameworks on ML, as well as discussing important points on various shortcomings and drawbacks of using ML models in diverse applications. In this section, however, we would like to expand on such discussions and address the main challenges that we see in the current climate of this research area.

8.1. Benchmark data sets and performance metrics
In order to test new algorithms in ML, a benchmark data set is needed. This data set serves the purpose of assessing the performance of a particular algorithm by means of a defined performance metric. One such example is the extremely popular MNIST data set [280], which is currently extensively used to test new ML algorithms. The most common task to solve on this data set is to check whether the proposed ML algorithm can successfully recognize all ten classes of handwritten digits, in a supervised learning fashion. Thus, the performance metric is the accuracy, which corresponds to the total number of correct predictions over the total number of samples in the data set. This means that if a proposed NN architecture scores a 97% accuracy on the MNIST data set, the architecture correctly classifies, on average, 97 out of every 100 handwritten digits. Although MNIST has been widely used, it is not the only option, and most researchers agree that this data set should be replaced by a more challenging one, for instance the Fashion-MNIST [281]. With this in mind, the main point we wish to convey is that there is an important need for standardized and well-tested benchmark data sets in the current research area of ML and CMP, along with their respective performance metrics. Benchmark data sets would be used to try and test new ML models, while verifying whether such models can provide the expected results and achieve a good score on the respective performance metric. We can look at the example of computing the critical temperature of an Ising model without the external field shown in Eq. (1). In this example, we have seen several possible ML models (NNs, SVMs, DL approaches, Reinforcement Learning approaches) that can achieve good results on this problem, so we might agree that such a problem can be taken as a benchmark problem. The Ising model is a simple one, which can be computed with standard MC methods, and a standard benchmark data set is fairly easy to obtain. This is not the case, however, for other CMP systems and problems, for example, the case of machine-learned potentials. As discussed in Ref. [12], the construction of such models currently requires very large reference data sets from electronic structure calculations, making the construction computationally very demanding and time consuming. This means that it is neither simple nor easy to test newly created ML models that tackle this problem, and it implies that one needs to spend countless computing hours to obtain representative data samples just to test whether the proposed ML method can perform well in the simplest cases. For other CMP systems it is even more complicated to produce data sets, in particular for those that come from experimental setups. Experiments can be very costly; obtaining clean, labeled and representative data is not something that is currently being addressed completely. The solution to this problem is to generate well-tested standard data sets for each problem in CMP. Although this is not a simple task, some efforts have been made in current research areas. Related to materials modeling and CMP, MoleculeNet by Wu et al [282] is a contribution that curates multiple public datasets, establishes metrics for evaluation, and offers high-quality open-source implementations of multiple previously proposed molecular featurization and learning algorithms.
Another effort, although in the realm of quantum chemistry, is the QM9 data set [283] for molecular structures and their properties. Not many attempts are seen in the realm of SM, where one could test the simplest cases, namely, the hard-sphere model, binary hard-sphere mixtures and some other simple liquid benchmarks. This suggests a different approach to solving the problem of performance metrics and benchmark data sets, by means of physical correctness and standard results drawn from physical theories. For instance, when testing new molecular dynamics algorithms, the Lennard-Jones potential [284] is a well-established benchmark, given that all its properties are well known and studied. Instead of having to compute these values each time one wishes to test a new ML model, or having to read and extract the data set from the literature, a joint effort to build a common and curated data bank might help scientists to quickly test their newly proposed algorithms. A final approach to solving this problem could be the reuse and sharing of data sets created for a given problem. While solving some problem within CMP, the data set created can be shared through an open service, e.g., Zenodo [285], where other scientists can easily access such information and use it in an open science spirit [286]. Benchmarks are essential for the prevalence and proliferation of newly proposed ML algorithms applied to CMP, and providing them should be one of the main challenges to overcome.

8.2. Feature selection and Training of ML models
Another key challenge in the application of ML to CMP is that of feature selection. ML models rely on a data set that is representative of the system itself; without interpretation capabilities, models might not provide physical insights or meaning. If a data set is not correctly labeled or structured, training can be unstable and different issues can arise. In mainstream ML, various feature selection techniques are available, along with some useful metrics [287]. Feature selection strongly depends on the data set and problem alike, so most feature selection methods cannot be transferred directly to CMP problems. For this to be employed, physics criteria and insight into the problem are mandatory. For example, it requires the expertise of a SM physicist to distinguish between a nematic phase and a smectic phase in a liquid crystal, or to identify the symmetries of a given system that need to be conserved when using a ML model. In this review, we have seen some of these shortcomings when talking about atomistic descriptors and other applications. Unfortunately, this is a challenge that is not so easily resolved, as it requires experience with both the problem itself and the ML model to be used. One such example that we have already covered is the description of hard spheres and hard disks using the distances between particles shown in Ref. [253]. This feature selection is not a standard one in ML; it comes directly from the Physics of the problem at hand, and is an important contribution to feature selection in ML and CMP. Research of this kind might be needed even more than creating new ML models, so that more CMP systems can be tested with more feature selection techniques. Another possible approach to this problem can be the automation of feature selection by means of ML models; for instance, using a modification of the one proposed by Bojensen [187]. It is interesting to see that most automation methodologies use Reinforcement Learning, which might indicate that it is well suited to solve some of these drawbacks. We now discuss the issue of training and using ML models in general. Handling and training ML models well, so as to obtain good results, is not so simple at first. Nowadays, there are many standard techniques used to train ML models, such as regularization, early stopping and many others; for a thorough and detailed explanation of these techniques we refer the reader to the excellent review by Mehta et al [288], especially written for physicists. The main point here is that training ML models has its difficulties, and most of the time it is not easy to decide which of these techniques are actually needed for the training to be stable enough. Dealing with overfitting and underfitting is one of the main issues in training ML models, especially in CMP applications, because of the possible lack of data samples and the difficulty of choosing the correct ML algorithm. Moreover, tuning the hyperparameters of ML models requires time and patience from the user, as it can be a time-consuming task, especially when approaching the problem with a ML model for which there is little knowledge of how to use it correctly. There are solutions to this problem, such as Bayesian optimization [289], a robust tool which can help tune hyperparameters in an automated fashion, with good results.
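As a small illustration of automated hyperparameter tuning with cross-validation, the sketch below runs a plain grid search over two SVM hyperparameters; the grid search is used here as a simple stand-in for the Bayesian optimization mentioned above, and the synthetic data set and grid values are arbitrary assumptions.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic classification data standing in for a labeled CMP data set.
X, y = make_classification(n_samples=400, n_features=12, random_state=0)

param_grid = {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
print("cross-validated accuracy:", search.best_score_)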
Fortunately, more and more research is being carried out to alleviate these problems in ML, for instance, the DropConnect approach [290], which greatly helps with overfitting. In general, most ML models are not readily applicable to CMP problems, mostly because of the lack of good data representation and the need for model tuning. There is a diverse set of possible solutions to these problems brought directly from ML, but these need to be tested on CMP problems to ensure such techniques are viable. New research should be carried out in this direction to enhance current and recently proposed ML applications to CMP, and to make it easier for physicists to employ ML models.

8.3. Replicability of results
It is a known fact that current ML research faces a reproducibility crisis [291]: even with detailed descriptions from the original paper, an independent implementation of a model often does not yield the same results as those claimed in the original work. One of the main problems behind this phenomenon is the fact that most research papers do not provide the code that was used to obtain the results presented in the work. Even if the code is provided, many of the results claimed cannot be obtained, due to the stochasticity of certain algorithms; for example, when using dropout [292], a technique that reduces overfitting by randomly turning on or off certain parts of the NN architecture. The intrinsic randomness of most ML models is another key issue in solving the replicability problem. Some researchers deal with this problem by sharing the trained models along with the code used. This can greatly alleviate the problem, but it remains a rare practice in current ML research. Nonetheless, important efforts are being carried out to solve this problem. OpenML [293] is an initiative to collect and centralize ML models, data sets, pipelines and, most importantly, reproducible results. Another great example is DeepDIVA [294], an infrastructure designed to enable quick and intuitive setup of reproducible experiments with a large range of useful analysis functionality. Current research in this area [295, 296, 297] is necessary to raise awareness and encourage practices that can lead to reproducible science. Code and data sharing is not a common practice within CMP. Some researchers believe that the original paper must suffice to reproduce the results claimed; this might be true for most of the proposed methodologies and techniques, but not so much for ML applications to CMP. In order for ML to work with CMP, reproducibility is a key factor that needs to be present from the beginning. By focusing on sharing the full training pipeline, the data set used to train a given ML model, and the code used, ML can be quite successful in CMP and, most importantly, it can be implemented by anyone who wishes to use such tools. Until now, there has been no established approach to this in the current climate of ML and CMP, so it is a great opportunity to start implementing these practices of code, data and model sharing to ensure reproducibility of results.

8.4. Perspectives
ML and CMP have created a very enriching and prosperous joint research area. Modern ML techniques have been applied to CMP with outstanding results, and Physics has given fruitful insights into the theoretical background of ML. But in order to embrace and fully adopt ML tools into standard physics workflows, some challenges need to be overcome. Currently, there is no standard way to couple ML and Physics simulation software suites; the main approach is to use available software and hand-craft code that can use the strengths of both parts. Physicists have seen the great advantage of standard simulation software suites like LAMMPS [298] or ESPResSo [299], but these are not meant to be used with ML workflows. A universal coupling scheme might be able to simplify research in this exciting scientific area. Quantum systems have also seen a great deal of advances with ML applications [10], thus it might be interesting to see whether the same schemes can be modified and applied to larger-scale systems, such as soft materials, as we have seen is possible in the topic of phase transitions. New and interesting theoretical insights have also been proposed to explain ML by applying not only CMP, but other Physics frameworks, like the AdS/CFT correspondence [300]. Physics is full of these rich and rigorous frameworks, and new research in the area might be able to give a full and thorough explanation of the theory of Deep Learning and its learning mechanisms. All in all, ML and CMP are just starting to convey ideas to each other, creating a fertile research area where the work done is aimed towards an automated, faster and simplified workflow that will enable condensed matter physicists to unravel the deep and intricate features of many-body systems.

Acknowledgments

This work was financially supported by Conacyt grant 287067.

References

[1] LeCun Y, Bengio Y and Hinton G 2015 Nature 521 436–444 [2] Alom M Z, Taha T M, Yakopcic C, Westberg S, Sidike P, Nasrin M S, Van Esesn B C, Awwal A A S and Asari V K 2018 The history began from alexnet: A comprehensive survey on deep learning approaches (Preprint 1803.01164) [3] Young T, Hazarika D, Poria S and Cambria E 2018 IEEE Computational Intelligence Magazine 13 55–75 [4] Voulodimos A, Doulamis N, Doulamis A and Protopapadakis E 2018 Computational Intelligence and Neuroscience 2018 [5] Litjens G, Kooi T, Bejnordi B E, Setio A A A, Ciompi F, Ghafoorian M, Van Der Laak J A, Van Ginneken B and Sánchez C I 2017 Medical image analysis 42 60–88 [6] Goodfellow I, Bengio Y and Courville A 2016 Deep Learning (MIT Press) URL http://www.deeplearningbook.org

[7] Sanchez-Lengeling B and Aspuru-Guzik A 2018 Science 361 360–365 ISSN 0036- 8075 (Preprint https://science.sciencemag.org/content/361/6400/360.full.pdf) URL https://science.sciencemag.org/content/361/6400/360 [8] Vamathevan J, Clark D, Czodrowski P, Dunham I, Ferran E, Lee G, Li B, Madabhushi A, Shah P, Spitzer M et al. 2019 Nature Reviews Drug Discovery 18 463–477 [9] Senior A W, Evans R, Jumper J, Kirkpatrick J, Sifre L, Green T, Qin C, Z´ıdekA,ˇ Nelson A W, Bridgland A et al. 2020 Nature 1–5 [10] Carrasquilla J 2020 Machine learning for quantum matter (Preprint 2003.11040) [11] Schmidt J, Marques M R, Botti S and Marques M A 2019 npj Computational Materials 5 1–36 [12] Behler J 2016 Journal of Chemical Physics 145 ISSN 00219606 [13] Webb M A, Delannoy J Y and de Pablo J J 2019 Journal of Chemical Theory and Computation 15 1199–1208 ISSN 1549-9618 publisher: American Chemical Society URL https://doi.org/ 10.1021/acs.jctc.8b00920 [14] Sch¨uttK T, Sauceda H E, Kindermans P J, Tkatchenko A and M¨ullerK R 2018 The Journal of Chemical Physics 148 241722 ISSN 0021-9606 publisher: American Institute of Physics URL https://aip.scitation.org/doi/full/10.1063/1.5019779 [15] Mills K, Spanner M and Tamblyn I 2017 Physical Review A 96 042113 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevA.96.042113 [16] Carleo G and Troyer M 2017 Science 355 602–606 ISSN 0036-8075, 1095-9203 publisher: American Association for the Advancement of Science Section: Research Articles URL https: //science.sciencemag.org/content/355/6325/602 [17] Glasser I, Pancotti N, August M, Rodriguez I D and Cirac J I 2018 Physical Review X 8 011006 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevX. 8.011006 [18] Xie T and Grossman J C 2018 Physical Review Letters 120 145301 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevLett.120.145301 [19] Snyder J C, Rupp M, Hansen K, M¨ullerK R and Burke K 2012 Physical Review Letters 108 253002 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/ PhysRevLett.108.253002 [20] Li L, Snyder J C, Pelaschier I M, Huang J, Niranjan U N, Duncan P, Rupp M, M¨uller K R and Burke K 2016 International Journal of Quantum Chemistry 116 819–833 ISSN 1097-461X eprint: https://onlinelibrary.wiley.com/doi/pdf/10.1002/qua.25040 URL https: //onlinelibrary.wiley.com/doi/abs/10.1002/qua.25040 [21] Carleo G, Cirac I, Cranmer K, Daudet L, Schuld M, Tishby N, Vogt-Maranto L and Zdeborov´a L 2019 Reviews of Modern Physics 91 045002 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/RevModPhys.91.045002 [22] Ethem A 2014 Introduction to Machine Learning 3rd ed (Cambridge: MIT Press) [23] Alom M Z, Taha T M, Yakopcic C, Westberg S, Sidike P, Nasrin M S, Hasan M, Van Essen B C, Awwal A A and Asari V K 2019 Electronics 8 292 [24] Vapnik V 1998 Statistical Learning Theory 1st ed (New York: John Wiley and Sons, Inc) [25] Smola A J and Sch¨olkopf B 2004 Statistics and computing 14 199–222 [26] Zhang Y and Liu Y 2009 IEEE Signal Processing Letters 16 414–417 [27] Ben-Hur A, Horn D, Siegelmann H T and Vapnik V 2001 Journal of machine learning research 2 125–137 [28] Hoffmann H 2007 Pattern recognition 40 863–874 [29] Tan S, Zheng F, Liu L, Han J and Shao L 2016 IEEE Transactions on Circuits and Systems for Video Technology 28 356–363 [30] Liu J, Huang Z, Xu X, Zhang X, Sun S and Li D 2020 IEEE Transactions on Systems, Man, and Cybernetics: Systems [31] Nguyen T T, Nguyen N D and 
Nahavandi S 2020 IEEE Transactions on Cybernetics 1–14 ISSN 2168-2275

[32] Mnih V, Kavukcuoglu K, Silver D, Rusu A A, Veness J, Bellemare M G, Graves A, Riedmiller M, Fidjeland A K, Ostrovski G et al. 2015 nature 518 529–533 [33] Shawe-Taylor J, Cristianini N et al. 2004 Kernel methods for pattern analysis (Cambridge university press) [34] Aggarwal C C 2018 Neural Networks and Deep Learning: A Textbook 1st ed (Springer Publishing Company, Incorporated) ISBN 3319944622 [35] Xu X, Hu D and Lu X 2007 IEEE Transactions on Neural Networks 18 973–992 [36] An Y, Ding S, Shi S and Li J 2018 Pattern Recognition Letters 111 30–35 [37] Driessens K, Ramon J and G¨artnerT 2006 Machine Learning 64 91–119 [38] Jiu M and Sahbi H 2019 Pattern Recognition 88 447 – 457 ISSN 0031-3203 URL http: //www.sciencedirect.com/science/article/pii/S0031320318304266 [39] Le L and Xie Y 2019 Neurocomputing 339 292–302 [40] Bai L, Shao Y H, Wang Z and Li C N 2019 Knowledge-Based Systems 163 227–240 [41] Sanakoyeu A, Bautista M A and Ommer B 2018 Pattern Recognition 78 331 – 343 ISSN 0031-3203 URL http://www.sciencedirect.com/science/article/pii/S0031320318300293 [42] Tesauro G 1992 Practical issues in temporal difference learning Advances in neural information processing systems pp 259–266 [43] Liu H, Zhou J, Xu Y, Zheng Y, Peng X and Jiang W 2018 Neurocomputing 315 412–424 [44] da Silva L E B, Elnabarawy I and Wunsch II D C 2019 Neural Networks 120 167–203 [45] Tan A H, Lu N and Xiao D 2008 IEEE Transactions on Neural Networks 19 230–244 [46] Salakhutdinov R and Hinton G 2009 Deep boltzmann machines Artificial intelligence and statistics pp 448–455 [47] Bishop C 2006 Pattern Recognition and Machine Learning Information Science and Statistics (Springer) ISBN 9780387310732 URL https://books.google.com.mx/books?id= qWPwnQEACAAJ [48] Hu Y, Luo S, Han L, Pan L and Zhang T 2020 Artificial Intelligence in Medicine 102 101764 ISSN 0933-3657 URL http://www.sciencedirect.com/science/article/pii/ S093336571830366X [49] Hinton G E and Salakhutdinov R R 2006 science 313 504–507 [50] Quinlan J R 1986 Machine learning 1 81–106 [51] Williams C K and Rasmussen C E 2006 Gaussian processes for machine learning 2nd ed (MIT press Cambridge, MA) [52] Roth V 2004 IEEE transactions on neural networks 15 16–28 [53] Graupe D 2019 Principles of Artificial Neural Networks: bacis designs to deep learning 4th ed (Singapore: World Scientific) [54] Aggarwal C 2020 Linear Algebra and Optimization for Machine Learning: A Textbook (Springer International Publishing) ISBN 9783030403447 URL https://books.google.com.mx/books? id=EdTkDwAAQBAJ [55] Guresen E and Kayakutlu G 2011 Procedia Computer Science 3 426 – 433 ISSN 1877- 0509 world Conference on Information Technology URL http://www.sciencedirect.com/ science/article/pii/S1877050910004461 [56] Chen T and Chen H 1995 IEEE Transactions on Neural Networks 6 911–917 [57] Shrestha A and Mahmood A 2019 IEEE Access 7 53040–53065 [58] Bottou L 2012 Stochastic gradient descent tricks Neural Networks: Tricks of the Trade: Second Edition ed Montavon G, Orr G B and M¨ullerK R (Berlin, Heidelberg: Springer Berlin Heidelberg) pp 421–436 ISBN 978-3-642-35289-8 URL https://doi.org/10.1007/ 978-3-642-35289-8{_}25 [59] Pascanu R and Bengio Y 2013 Revisiting natural gradient for deep networks (Preprint 1301.3584) [60] Graves A, Wayne G, Reynolds M, Harley T, Danihelka I, Grabska-Barwi´nska A, Colmenarejo S G, Grefenstette E, Ramalho T, Agapiou J et al. 
2016 Nature 538 471–476 [61] Quan Z, Zeng W, Li X, Liu Y, Yu Y and Yang W 2019 IEEE Transactions on Neural Networks

and Learning Systems 31 813–826 [62] Van Veen F and Leijnen S 2019 Asimov institute: The neural network zoo available at: URL https://www.asimovinstitute.org/neural-network-zoo [63] Rosenblatt F 1958 Psychological review 65 386–408 [64] Hornik K, Stinchcombe M, White H et al. 1989 Neural networks 2 359–366 [65] Jain A K, Mao J and Mohiuddin K M 1996 Computer 29 31–44 [66] Frommberger L 2010 Qualitative spatial abstraction in reinforcement learning (Springer Science & Business Media) [67] Bourlard H and Kamp Y 1988 Biological cybernetics 59 291–294 [68] Snoek J, Adams R P and Larochelle H 2012 Journal of Machine Learning Research 13 2567–2588 [69] Blaschke T, Olivecrona M, Engkvist O, Bajorath J and Chen H 2018 Molecular Informatics 37 1700123 (Preprint https://onlinelibrary.wiley.com/doi/pdf/10.1002/minf.201700123) URL https://onlinelibrary.wiley.com/doi/abs/10.1002/minf.201700123 [70] Talwar D, Mongia A, Sengupta D and Majumdar A 2018 Scientific reports 8 1–11 [71] Vellido A, Lisboa P J and Vaughan J 1999 Expert Systems with applications 17 51–70 [72] Hsieh T J, Hsiao H F and Yeh W C 2011 Applied soft computing 11 2510–2525 [73] Bohanec M, BorˇstnarM K and Robnik-Sikonjaˇ M 2017 Expert Systems with Applications 71 416–428 [74] Montavon G, Samek W and M¨ullerK R 2018 Digital Signal Processing 73 1–15 [75] Padierna L C, Carpio M, Rojas-Dom´ınguezA, Puga H and Fraire H 2018 Pattern Recognition 84 211 – 225 ISSN 0031-3203 URL http://www.sciencedirect.com/science/article/pii/ S0031320318302280 [76] Rojas-Dom´ınguezA, Padierna L C, Valadez J M C, Puga-Soberanes H J and Fraire H J 2017 IEEE Access 6 7164–7176 [77] Vapnik V N 1999 IEEE transactions on neural networks 10 988–999 [78] Maldonado S and L´opez J 2014 Information Sciences 268 328–341 [79] Xu Y, Yang Z and Pan X 2016 IEEE transactions on neural networks and learning systems 28 359–370 [80] Tsang I W, Kwok J T and Cheung P M 2005 Journal of Machine Learning Research 6 363–392 URL https://www.jmlr.org/papers/v6/tsang05a.html [81] Sadrfaridpour E, Razzaghi T and Safro I 2019 Machine Learning 108 1879–1917 [82] Chang C C and Lin C J 2011 ACM Transactions on Intelligent Systems and Technology 2(3) 27:1–27:27 software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm [83] Fan R E, Chang K W, Hsieh C J, Wang X R and Lin C J 2008 Journal of Machine Learning Research 9 1871–1874 [84] Shalev-Shwartz S, Singer Y, Srebro N and Cotter A 2011 Mathematical programming 127 3–30 [85] Nandan M, Khargonekar P P and Talathi S S 2014 J. Mach. Learn. Res. 15 59–98 ISSN 1532-4435 [86] Bohn B, Griebel M and Rieger C 2019 Journal of Machine Learning Research 20 1–32 URL http://jmlr.org/papers/v20/17-621.html [87] Chaikin P M, Lubensky T C and Witten T A 1995 Principles of condensed matter physics vol 10 (Cambridge university press Cambridge) [88] Anderson P W 2018 Basic notions of condensed matter physics (CRC Press) [89] Girvin S M and Yang K 2019 Modern Condensed Matter Physics (Cambridge University Press) [90] Kohn W 1999 Rev. Mod. Phys. 71(2) S59–S77 [91] Lubensky T 1997 Solid State Communications 102 187 – 197 ISSN 0038-1098 highlights in Condensed Matter Physics and Materials Science [92] Witten T A 1999 Rev. Mod. Phys. 71(2) S367–S373 [93] Yan H, Park S H, Finkelstein G, Reif J H and LaBean T H 2003 Science 301 1882–1884 ISSN 0036-8075 [94] Zhang S, Marini D M, Hwang W and Santoso S 2002 Current Opinion in Chemical Biology 6 865 – 871 ISSN 1367-5931 Machine Learning for Condensed Matter Physics 40

[95] De Gennes P 2005 Soft Matter 1(1) 16–16 [96] Russel W, Russel W, Saville D and Schowalter W 1991 Colloidal Dispersions Cambridge Monographs on Mechanics (Cambridge University Press) ISBN 9780521426008 URL https: //books.google.com.mx/books?id=3shp8Kl6YoUC [97] Eberle A P R, Wagner N J and Casta˜nedaPriego R 2011 Phys. Rev. Lett. 106(10) 105704 URL https://link.aps.org/doi/10.1103/PhysRevLett.106.105704 [98] Eberle A P, Casta˜neda-PriegoR, Kim J M and Wagner N J 2012 Langmuir 28 1866–1878 [99] Valadez-P´erezN E, Benavides A L, Sch¨oll-Paschinger E and Casta˜neda-PriegoR 2012 The Journal of chemical physics 137 084905 [100] Cheng Z, Chaikin P, Russel W, Meyer W, Zhu J, Rogers R and Ottewill R 2001 Materials & Design 22 529–534 [101] McGrother S C, Gil-Villegas A and Jackson G 1998 Molecular Physics 95 657– 673 (Preprint https://doi.org/10.1080/00268979809483199) URL https://doi.org/10. 1080/00268979809483199 [102] McGrother S C, Gil-Villegas A and Jackson G 1996 Journal of Physics: Condensed Matter 8 9649–9655 [103] Bleil S, von Gr¨unberg H H, Dobnikar J, Casta˜neda-PriegoR and Bechinger C 2006 Europhysics Letters (EPL) 73 450–456 [104] Anderson V J and Lekkerkerker H N 2002 Nature 416 811–815 [105] Zinn-Justin J 2007 Phase Transitions and Renormalization Group Oxford Graduate Texts (OUP Oxford) ISBN 9780199227198 URL https://books.google.com.mx/books?id=mUMVDAAAQBAJ [106] Binney J, Binney J, Binney M, Dowrick N, Fisher A, Newman M, Fisher B and Newman L 1992 The Theory of Critical Phenomena: An Introduction to the Renormalization Group Oxford Science Publ (Clarendon Press) ISBN 9780198513933 URL https://books.google.com.mx/ books?id=lvZcGJI3X9cC [107] Mermin N D 1979 Rev. Mod. Phys. 51(3) 591–648 [108] Chuang I, Yurke B, Pargellis A N and Turok N 1993 Phys. Rev. E 47(5) 3343–3356 [109] Gil-Villegas A, McGrother S C and Jackson G 1997 Chemical Physics Letters 269 441 – 447 ISSN 0009-2614 URL http://www.sciencedirect.com/science/article/pii/ S0009261497003072 [110] Alder B J and Wainwright T E 1957 The Journal of Chemical Physics 27 1208–1209 ISSN 0021- 9606 publisher: American Institute of Physics URL https://aip.scitation.org/doi/10. 1063/1.1743957 [111] Fern´andezL A, Mart´ın-Mayor V, Seoane B and Verrocchio P 2012 Physical Review Letters 108 165701 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/ PhysRevLett.108.165701 [112] Hoover W G and Ree F H 1968 The Journal of Chemical Physics 49 3609–3617 ISSN 0021-9606 publisher: American Institute of Physics URL https://aip.scitation.org/doi/10.1063/ 1.1670641 [113] Robles M, L´opez de Haro M and Santos A 2014 The Journal of Chemical Physics 140 136101 ISSN 0021-9606 publisher: American Institute of Physics URL https://aip.scitation.org/ doi/10.1063/1.4870524 [114] Annett J F 2004 Superconductivity, superfluids, and condensates Oxford mas- ter series in condensed matter physics (Oxford University Press) ISBN 9780198507550,0198507550,0198507569,9780198507567 [115] Tinkham M 1996 Introduction to superconductivity 2nd ed International series in pure and (McGraw Hill) ISBN 0070648786,9780070648784 [116] Santi E, Eskandari S, Tian B and Peng K 2018 6 - power electronic modules Power Electronics Handbook (Fourth Edition) ed Rashid M H (Butterworth-Heinemann) pp 157 – 173 fourth edition ed ISBN 978-0-12-811407-0 URL http://www.sciencedirect.com/science/article/ pii/B9780128114070000064 Machine Learning for Condensed Matter Physics 41

[117] Yeh N C 2008 AAPPS Bulletin 18 11 [118] Council N R 2007 Condensed-Matter and Materials Physics: The Science of the World Around Us (Washington, DC: The National Academies Press) ISBN 978-0-309-10969-7 [119] Cipra B A 1987 The American Mathematical Monthly 94 937–959 (Preprint https://doi. org/10.1080/00029890.1987.12000742) URL https://doi.org/10.1080/00029890.1987. 12000742 [120] Ising E 1925 Z. Phys. 31 253–258 [121] Onsager L 1944 Phys. Rev. 65(3-4) 117–149 [122] Wilson K G 1971 Phys. Rev. B 4(9) 3174–3183 [123] Wilson K G 1971 Phys. Rev. B 4(9) 3184–3205 URL https://link.aps.org/doi/10.1103/ PhysRevB.4.3184 [124] Ferrenberg A M and Landau D P 1991 Phys. Rev. B 44(10) 5081–5091 URL https://link. aps.org/doi/10.1103/PhysRevB.44.5081 [125] Stanley H E 1968 Phys. Rev. Lett. 20(12) 589–592 URL https://link.aps.org/doi/10.1103/ PhysRevLett.20.589 [126] Ashkin J and Teller E 1943 Phys. Rev. 64(5-6) 178–184 URL https://link.aps.org/doi/10. 1103/PhysRev.64.178 [127] Kunz H and Zumbach G 1992 Phys. Rev. B 46(2) 662–673 URL https://link.aps.org/doi/ 10.1103/PhysRevB.46.662 [128] Straley J P and Fisher M E 1973 Journal of Physics A: Mathematical, Nuclear and General 6 1310–1326 URL https://doi.org/10.1088/0305-4470/6/9/007 [129] Berezinsky V 1971 Sov. Phys. JETP 32 493–500 URL https://inspirehep.net/literature/ 61186 [130] Berezinsky V 1972 Sov. Phys. JETP 34 610 URL https://inspirehep.net/literature/ 1716846 [131] Kosterlitz J M and Thouless D J 1973 Journal of Physics C: Solid State Physics 6 1181–1203 [132] Kosterlitz J M 1974 Journal of Physics C: Solid State Physics 7 1046–1060 [133] Chen J, Liao H J, Xie H D, Han X J, Huang R Z, Cheng S, Wei Z C, Xie Z Y and Xiang T 2017 Chinese Physics Letters 34 050503 [134] Yong J, Lemberger T R, Benfatto L, Ilin K and Siegel M 2013 Phys. Rev. B 87(18) 184505 URL https://link.aps.org/doi/10.1103/PhysRevB.87.184505 [135] Lin Y H, Nelson J and Goldman A M 2012 Phys. Rev. Lett. 109(1) 017002 URL https: //link.aps.org/doi/10.1103/PhysRevLett.109.017002 [136] Gibney E and Castelvecchi D 2016 Nature News 538 18 [137] Carrasquilla J and Melko R G 2017 Nature Physics 13 431–434 ISSN 1745-2473 [138] Nightingale P 1982 Journal of Applied Physics 53 7927–7932 [139] Kogut J B 1979 Rev. Mod. Phys. 51(4) 659–713 [140] Deng D L, Li X and Das Sarma S 2017 Phys. Rev. B 96(19) 195145 [141] Broecker P, Carrasquilla J, Melko R G and Trebst S 2017 Scientific reports 7 1–10 [142] Tanaka A and Tomiya A 2017 Journal of the Physical Society of Japan 86 063001 [143] Zhang Y, Melko R G and Kim E A 2017 Phys. Rev. B 96(24) 245119 [144] Rodriguez-Nieva J F and Scheurer M S 2019 Nature Physics 15 790–795 [145] Beach M J S, Golubeva A and Melko R G 2018 Phys. Rev. B 97(4) 045207 URL https: //link.aps.org/doi/10.1103/PhysRevB.97.045207 [146] Melko R G and Gingras M J P 2004 Journal of Physics: Condensed Matter 16 R1277–R1319 URL https://doi.org/10.1088/0953-8984/16/43/r02 [147] Ch’ng K, Carrasquilla J, Melko R G and Khatami E 2017 Phys. Rev. X 7(3) 031038 URL https://link.aps.org/doi/10.1103/PhysRevX.7.031038 [148] Ponte P and Melko R G 2017 Phys. Rev. B 96(20) 205146 URL https://link.aps.org/doi/ 10.1103/PhysRevB.96.205146 [149] Morningstar A and Melko R G 2017 J. Mach. Learn. Res. 18 5975–5991 ISSN 1532-4435 Machine Learning for Condensed Matter Physics 42

[150] Efthymiou S, Beach M J S and Melko R G 2019 Phys. Rev. B 99(7) 075113 URL https: //link.aps.org/doi/10.1103/PhysRevB.99.075113 [151] Zhang W, Liu J and Wei T C 2019 Physical Review E 99 1–16 ISSN 24700053 (Preprint arXiv:1804.02709v2) [152] Kim D and Kim D H 2018 Phys. Rev. E 98(2) 022138 [153] Giannetti C, Lucini B and Vadacchino D 2019 Nuclear Physics B 944 114639 ISSN 05503213 [154] Friedman J, Hastie T and Tibshirani R 2001 The elements of statistical learning vol 1 (Springer series in statistics New York) [155] Kenta S, Hiroyuki M, Yutaka O and Kuan L H 2020 Scientific Reports (Nature Publisher Group) 10 [156] Wu F Y 1982 Rev. Mod. Phys. 54(1) 235–268 URL https://link.aps.org/doi/10.1103/ RevModPhys.54.235 [157] Carvalho D, Garc´ıa-Mart´ınezN A, Lado J L and Fern´andez-RossierJ 2018 Phys. Rev. B 97(11) 115453 URL https://link.aps.org/doi/10.1103/PhysRevB.97.115453 [158] Ohtsuki T and Ohtsuki T 2016 Journal of the Physical Society of Japan 85 123706 (Preprint https://doi.org/10.7566/JPSJ.85.123706) URL https://doi.org/10.7566/ JPSJ.85.123706 [159] Ohtsuki T and Ohtsuki T 2017 Journal of the Physical Society of Japan 86 044708 (Preprint https://doi.org/10.7566/JPSJ.86.044708) URL https://doi.org/10.7566/ JPSJ.86.044708 [160] Ohtsuki T and Mano T 2020 Journal of the Physical Society of Japan 89 022001 (Preprint https: //doi.org/10.7566/JPSJ.89.022001) URL https://doi.org/10.7566/JPSJ.89.022001 [161] Livni R, Shalev-Shwartz S and Shamir O 2014 On the Computational Efficiency of Training Neural Networks Advances in Neural Information Processing Systems 27 ed Ghahramani Z, Welling M, Cortes C, Lawrence N D and Weinberger K Q (Curran Associates, Inc.) pp 855–863 URL http://papers.nips.cc/paper/ 5267-on-the-computational-efficiency-of-training-neural-networks.pdf [162] Song L, Vempala S, Wilmes J and Xie B 2017 On the Complexity of Learning Neural Networks Advances in Neural Information Processing Systems 30 ed Guyon I, Luxburg U V, Bengio S, Wallach H, Fergus R, Vishwanathan S and Gar- nett R (Curran Associates, Inc.) pp 5514–5522 URL http://papers.nips.cc/paper/ 7135-on-the-complexity-of-learning-neural-networks.pdf [163] Maaten L v d and Hinton G 2008 Journal of machine learning research 9 2579–2605 [164] Ng A Y, Jordan M I and Weiss Y 2001 On spectral clustering: Analysis and an algorithm ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS (MIT Press) pp 849– 856 [165] FRS K P 1901 The London, Edinburgh, and Dublin and Journal of Science 2 559–572 [166] Wetzel S J 2017 Phys. Rev. E 96(2) 022140 URL https://link.aps.org/doi/10.1103/ PhysRevE.96.022140 [167] Wang L 2016 Phys. Rev. B 94(19) 195105 URL https://link.aps.org/doi/10.1103/ PhysRevB.94.195105 [168] Huembeli P, Dauphin A and Wittek P 2018 Phys. Rev. B 97(13) 134109 [169] Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A and Bengio Y 2014 Generative adversarial nets Advances in neural information processing systems pp 2672–2680 [170] Rocchetto A, Grant E, Strelchuk S, Carleo G and Severini S 2018 npj Quantum Information 4 1–7 [171] Kingma D P and Welling M 2013 arXiv preprint arXiv:1312.6114 [172] Zhang P, Shen H and Zhai H 2018 Phys. Rev. Lett. 120(6) 066401 URL https://link.aps.org/ doi/10.1103/PhysRevLett.120.066401 Machine Learning for Condensed Matter Physics 43

[173] Liang X, Liu W Y, Lin P Z, Guo G C, Zhang Y S and He L 2018 Phys. Rev. B 98(10) 104426 URL https://link.aps.org/doi/10.1103/PhysRevB.98.104426 [174] Baumg¨artnerA, Burkitt A, Ceperley D, De Raedt H, Ferrenberg A, Heermann D, Herrmann H, Landau D, Levesque D, von der Linden W et al. 2012 The Monte Carlo method in condensed matter physics vol 71 (Springer Science & Business Media) [175] Newman M and Barkema G 1999 Monte Carlo methods in statistical physics chapter 1-4 (Oxford University Press: New York, USA) [176] Gubernatis J E 2003 The monte carlo method in the physical sciences: celebrating the 50th anniversary of the metropolis algorithm The Monte Carlo Method in the Physical Sciences vol 690 [177] Landau D P and Binder K 2014 A guide to Monte Carlo simulations in statistical physics (Cambridge university press) [178] Liu J, Qi Y, Meng Z Y and Fu L 2017 Physical Review B 95 1–5 ISSN 24699969 [179] Huang L and Wang L 2017 Phys. Rev. B 95(3) 035105 URL https://link.aps.org/doi/10. 1103/PhysRevB.95.035105 [180] Liu J, Shen H, Qi Y, Meng Z Y and Fu L 2017 Phys. Rev. B 95(24) 241104 URL https: //link.aps.org/doi/10.1103/PhysRevB.95.241104 [181] Nagai Y, Shen H, Qi Y, Liu J and Fu L 2017 Phys. Rev. B 96(16) 161102 URL https: //link.aps.org/doi/10.1103/PhysRevB.96.161102 [182] Shen H, Liu J and Fu L 2018 Phys. Rev. B 97(20) 205140 URL https://link.aps.org/doi/ 10.1103/PhysRevB.97.205140 [183] Chen C, Xu X Y, Liu J, Batrouni G, Scalettar R and Meng Z Y 2018 Phys. Rev. B 98(4) 041102 URL https://link.aps.org/doi/10.1103/PhysRevB.98.041102 [184] Pilati S, Inack E M and Pieri P 2019 Phys. Rev. E 100(4) 043301 URL https://link.aps.org/ doi/10.1103/PhysRevE.100.043301 [185] Nagai Y, Okumura M and Tanaka A 2020 Phys. Rev. B 101(11) 115111 URL https://link. aps.org/doi/10.1103/PhysRevB.101.115111 [186] Behler J and Parrinello M 2007 Phys. Rev. Lett. 98(14) 146401 [187] Bojesen T A 2018 Physical Review E 98 063303 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevE.98.063303 [188] Venderley J, Khemani V and Kim E A 2018 Phys. Rev. Lett. 120(25) 257204 URL https: //link.aps.org/doi/10.1103/PhysRevLett.120.257204 [189] Choo K, Neupert T and Carleo G 2019 Phys. Rev. B 100(12) 125124 URL https://link.aps. org/doi/10.1103/PhysRevB.100.125124 [190] Zhang C, Bengio S, Hardt M, Recht B and Vinyals O 2016 Understanding deep learning requires rethinking generalization (Preprint 1611.03530) [191] Lin H W, Tegmark M and Rolnick D 2017 Journal of Statistical Physics 168 1223–1247 [192] Fan F, Xiong J and Wang G 2020 On interpretability of artificial neural networks (Preprint 2001.02522) [193] Mehta P and Schwab D J 2014 An exact mapping between the Variational Renormalization Group and Deep Learning (Preprint 1410.3831) URL http://arxiv.org/abs/1410.3831 [194] Li S H and Wang L 2018 Phys. Rev. Lett. 121(26) 260601 URL https://link.aps.org/doi/ 10.1103/PhysRevLett.121.260601 [195] Koch-Janusz M and Ringel Z 2018 Nature Physics 14 578–582 [196] Lenggenhager P M, G¨okmenD E, Ringel Z, Huber S D and Koch-Janusz M 2020 Phys. Rev. 
X 10(1) 011037 URL https://link.aps.org/doi/10.1103/PhysRevX.10.011037 [197] de Melllo Koch E, de Mello Koch A, Kastanos N and Cheng L 2020 Short sighted deep learning (Preprint 2002.02664) [198] Hu H Y, Li S H, Wang L and You Y Z 2019 Machine learning holographic mapping by neural network renormalization group (Preprint 1903.00804) [199] Smolensky P 1986 Information processing in dynamical systems: Foundations of harmony theory Machine Learning for Condensed Matter Physics 44

Tech. rep. Colorado Univ at Boulder Dept of Computer Science [200] Larochelle H and Bengio Y 2008 Classification using discriminative restricted boltzmann machines Proceedings of the 25th International Conference on Machine Learning ICML ’08 (New York, NY, USA: Association for Computing Machinery) p 536–543 ISBN 9781605582054 URL https://doi.org/10.1145/1390156.1390224 [201] Hinton G E and Salakhutdinov R R 2006 Science 313 504–507 ISSN 0036-8075 (Preprint https: //science.sciencemag.org/content/313/5786/504.full.pdf) URL https://science. sciencemag.org/content/313/5786/504 [202] Coates A, Ng A and Lee H 2011 An analysis of single-layer networks in unsupervised feature learning Proceedings of the fourteenth international conference on artificial intelligence and statistics pp 215–223 [203] Fischer A and Igel C 2014 Pattern Recognition 47 25–39 [204] Hinton G E 2002 Neural Computation 14 1771–1800 ISSN 0899-7667 publisher: MIT Press URL https://doi.org/10.1162/089976602760128018 [205] Hinton G E 2012 A Practical Guide to Training Restricted Boltzmann Machines Neural Networks: Tricks of the Trade: Second Edition Lecture Notes in Computer Science ed Montavon G, Orr G B and M¨ullerK R (Berlin, Heidelberg: Springer) pp 599–619 ISBN 978-3-642-35289-8 URL https://doi.org/10.1007/978-3-642-35289-8{{_}}32 [206] Huang H and Toyoizumi T 2015 Phys. Rev. E 91(5) 050101 [207] Decelle A, Fissore G and Furtlehner C 2018 Journal of Statistical Physics 172 1576–1608 [208] Nguyen H C, Zecchina R and Berg J 2017 Advances in Physics 66 197–261 (Preprint https: //doi.org/10.1080/00018732.2017.1341604) URL https://doi.org/10.1080/00018732. 2017.1341604 [209] Decelle A, Fissore G and Furtlehner C 2017 EPL (Europhysics Letters) 119 60001 URL https://doi.org/10.1209/0295-5075/119/60001 [210] Tubiana J, Cocco S and Monasson R 2019 Elife 8 e39397 [211] Tubiana J, Cocco S and Monasson R 2019 Neural Computation 31 1671–1717 pMID: 31260391 (Preprint https://doi.org/10.1162/neco{_}a{_}01210) URL https://doi.org/10.1162/ neco{_}a{_}01210 [212] Dahl G, Ranzato M, Mohamed A r and Hinton G E 2010 Phone recognition with the mean- covariance restricted boltzmann machine Advances in neural information processing systems pp 469–477 [213] Nomura Y, Darmawan A S, Yamaji Y and Imada M 2017 Phys. Rev. B 96(20) 205152 URL https://link.aps.org/doi/10.1103/PhysRevB.96.205152 [214] Melko R G, Carleo G, Carrasquilla J and Cirac J I 2019 Nature Physics 15 887–892 [215] Suchsland P and Wessel S 2018 Phys. Rev. B 97(17) 174435 URL https://link.aps.org/doi/ 10.1103/PhysRevB.97.174435 [216] Baity-Jesi M, Sagun L, Geiger M, Spigler S, Arous G B, Cammarota C, LeCun Y, Wyart M and Biroli G 2019 Journal of Statistical Mechanics: Theory and Experiment 2019 124013 [217] Geiger M, Spigler S, d’Ascoli S, Sagun L, Baity-Jesi M, Biroli G and Wyart M 2019 Phys. Rev. 
E 100(1) 012115 URL https://link.aps.org/doi/10.1103/PhysRevE.100.012115 [218] Allen M P and Tildesley D J 2017 Computer simulation of liquids (Oxford university press) [219] Lindahl E and Sansom M S 2008 Current Opinion in Structural Biology 18 425 – 431 ISSN 0959- 440X membranes / Engineering and design URL http://www.sciencedirect.com/science/ article/pii/S0959440X08000304 [220] Badia A, Lennox R B and Reven L 2000 Accounts of Chemical Research 33 475–481 pMID: 10913236 [221] Sidky H, Col´onY J, Helfferich J, Sikora B J, Bezik C, Chu W, Giberti F, Guo A Z, Jiang X, Lequieu J, Li J, Moller J, Quevillon M J, Rahimi M, Ramezani-Dakhel H, Rathee V S, Reid D R, Sevgen E, Thapar V, Webb M A, Whitmer J K and de Pablo J J 2018 The Journal of Chemical Physics 148 044104 Machine Learning for Condensed Matter Physics 45

[222] Sidky H and Whitmer J K 2018 The Journal of Chemical Physics 148 104111 [223] Sultan M M and Pande V S 2018 The Journal of Chemical Physics 149 094106 [224] Marx D and Hutter J 2009 Ab initio molecular dynamics: basic theory and advanced methods (Cambridge University Press) [225] Koch W and Holthausen M 2015 A Chemist’s Guide to Density Functional Theory (Wiley) ISBN 9783527802814 [226] Parr R G 1980 Density functional theory of atoms and molecules Horizons of quantum chemistry (Springer) pp 5–15 [227] Cust´odioC A, Filletti E´ R and Fran¸caV V 2019 Scientific reports 9 1–7 [228] Manzhos S 2020 Machine Learning: Science and Technology 1 013002 URL https://doi.org/ 10.1088/2632-2153/ab7d30 [229] Berman H, Henrick K, Nakamura H and Markley J L 2006 Nucleic Acids Research 35 D301–D303 ISSN 0305-1048 [230] Hilbert M and L´opez P 2011 Science 332 60–65 ISSN 0036-8075 (Preprint https://science. sciencemag.org/content/332/6025/60.full.pdf) URL https://science.sciencemag. org/content/332/6025/60 [231] Blank T B, Brown S D, Calhoun A W and Doren D J 1995 The Journal of Chemical Physics 103 4129–4137 [232] Ozboyaci M, Kokh D B, Corni S and Wade R C 2016 Quarterly Reviews of Biophysics 49 e4 [233] Bart´okA P, Payne M C, Kondor R and Cs´anyi G 2010 Phys. Rev. Lett. 104(13) 136403 [234] Pan S J and Yang Q 2010 IEEE Transactions on Knowledge and Data Engineering 22 1345–1359 [235] Torrey L and Shavlik J 2010 Transfer Learning iSBN: 9781605667669 Library Catalog: www.igi- global.com Pages: 242-264 Publisher: IGI Global URL www.igi-global.com/chapter/ transfer-learning/36988 [236] Weiss K, Khoshgoftaar T M and Wang D 2016 Journal of Big Data 3 9 ISSN 2196-1115 URL https://doi.org/10.1186/s40537-016-0043-6 [237] French R M 1999 Trends in Cognitive Sciences 3 128–135 ISSN 1364-6613 URL http://www. sciencedirect.com/science/article/pii/S1364661399012942 [238] McCloskey M and Cohen N J 1989 Catastrophic Interference in Connectionist Networks: The Sequential Learning Problem Psychology of Learning and Motivation vol 24 ed Bower G H (Academic Press) pp 109–165 URL http://www.sciencedirect.com/science/article/pii/ S0079742108605368 [239] Kumaran D, Hassabis D and McClelland J L 2016 Trends in Cognitive Sciences 20 512–534 ISSN 1364-6613 URL http://www.sciencedirect.com/science/article/pii/ S1364661316300432 [240] McClelland J L, McNaughton B L and O’Reilly R C 1995 Psychological Review 102 419–457 ISSN 1939-1471(Electronic),0033-295X(Print) place: US Publisher: American Psychological Association [241] Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu A A, Milan K, Quan J, Ramalho T, Grabska-Barwinska A, Hassabis D, Clopath C, Kumaran D and Hadsell R 2017 Proceedings of the National Academy of Sciences 114 3521–3526 ISSN 0027-8424 publisher: National Academy of Sciences eprint: https://www.pnas.org/content/114/13/3521.full.pdf URL https://www.pnas.org/content/114/13/3521 [242] Li Z and Hoiem D 2018 IEEE Transactions on Pattern Analysis and Machine Intelligence 40 2935–2947 ISSN 1939-3539 conference Name: IEEE Transactions on Pattern Analysis and Machine Intelligence [243] Morawietz T, Singraber A, Dellago C and Behler J 2016 Proceedings of the National Academy of Sciences 113 8368–8373 ISSN 0027-8424 [244] Natarajan S K and Behler J 2016 Phys. Chem. Chem. Phys. 18(41) 28704–28725 [245] Bart´okA P, Kondor R and Cs´anyi G 2013 Phys. Rev. B 87(18) 184115 [246] Rupp M, Tkatchenko A, M¨ullerK R and von Lilienfeld O A 2012 Phys. Rev. Lett. 
108(5) 058301 Machine Learning for Condensed Matter Physics 46

[247] Thompson A, Swiler L, Trott C, Foiles S and Tucker G 2015 Journal of Computational Physics 285 316 – 330 ISSN 0021-9991 URL http://www.sciencedirect.com/science/article/ pii/S0021999114008353 [248] Geiger P and Dellago C 2013 The Journal of Chemical Physics 139 164105 [249] Balabin R M and Lomakina E I 2011 Phys. Chem. Chem. Phys. 13(24) 11710–11718 [250] Sch¨uttK T, Glawe H, Brockherde F, Sanna A, M¨ullerK R and Gross E K U 2014 Phys. Rev. B 89(20) 205118 URL https://link.aps.org/doi/10.1103/PhysRevB.89.205118 [251] Faber F, Lindmaa A, von Lilienfeld O A and Armiento R 2015 International Journal of Quantum Chemistry 115 1094–1101 (Preprint https://onlinelibrary.wiley.com/doi/pdf/10.1002/ qua.24917) URL https://onlinelibrary.wiley.com/doi/abs/10.1002/qua.24917 [252] Ceriotti M 2019 The Journal of Chemical Physics 150 150901 [253] Jadrich R B, Lindquist B A and Truskett T M 2018 The Journal of Chemical Physics 149 194109 [254] Jadrich R B, Lindquist B A, Pi˜nerosW D, Banerjee D and Truskett T M 2018 The Journal of Chemical Physics 149 194110 [255] Mak C H 2006 Phys. Rev. E 73(6) 065104 URL https://link.aps.org/doi/10.1103/ PhysRevE.73.065104 [256] Bagchi K, Andersen H C and Swope W 1996 Phys. Rev. Lett. 76(2) 255–258 URL https: //link.aps.org/doi/10.1103/PhysRevLett.76.255 [257] Boattini E, Dijkstra M and Filion L 2019 The Journal of Chemical Physics 151 154901 [258] Steinhardt P J, Nelson D R and Ronchetti M 1983 Phys. Rev. B 28(2) 784–805 URL https: //link.aps.org/doi/10.1103/PhysRevB.28.784 [259] Lechner W and Dellago C 2008 The Journal of Chemical Physics 129 114707 [260] Chandrasekhar S, Chandrasekhar S and Press C U 1992 Liquid Crystals Cambridge monographs on physics (Cambridge University Press) ISBN 9780521417471 URL https://books.google. com.mx/books?id=GYDxM1fPArMC [261] Sluckin T, Dunmur D and Stegemeyer H 2004 Crystals That Flow: Classic Papers from the History of Liquid Crystals Liquid Crystals Book Series (Taylor & Francis) ISBN 9780415257893 [262] Walters M, Wei Q and Chen J Z Y 2019 Physical Review E 99 062701 publisher: American Physical Society URL https://link.aps.org/doi/10.1103/PhysRevE.99.062701 [263] Hochreiter S and Schmidhuber J 1997 Neural computation 9 1735–80 [264] Greff K, Srivastava R K, Koutn´ıkJ, Steunebrink B R and Schmidhuber J 2017 IEEE Transactions on Neural Networks and Learning Systems 28 2222–2232 ISSN 2162-2388 [265] Terao T 2020 Soft Materials 0 1–13 [266] de Gennes P, Gennes P and Press C U 1979 Scaling Concepts in Polymer Physics (Cornell University Press) ISBN 9780801412035 URL https://books.google.com.mx/books?id= ApzfJ2LYwGUC [267] Menon A, Gupta C, M Perkins K, L DeCost B, Budwal N, T Rios R, Zhang K, P´oczosB and R Washburn N 2017 Molecular Systems Design & Engineering 2 263–273 publisher: Royal Society of Chemistry URL https://pubs.rsc.org/en/content/articlelanding/2017/me/ c7me00027h [268] Xu Q and Jiang J 2020 ACS Applied Polymer Materials Publisher: American Chemical Society URL https://doi.org/10.1021/acsapm.0c00586 [269] Wu S, Kondo Y, Kakimoto M a, Yang B, Yamada H, Kuwajima I, Lambard G, Hongo K, Xu Y, Shiomi J, Schick C, Morikawa J and Yoshida R 2019 npj Computational Materials 5 1–11 ISSN 2057-3960 number: 1 Publisher: Nature Publishing Group URL https://www.nature. 
com/articles/s41524-019-0203-2 [270] Hamley I 2013 Introduction to Soft Matter: Synthetic and Biological Self-Assembling Materials (Wiley) ISBN 9781118681428 URL https://books.google.com.mx/books?id=Qd794-DI49wC [271] Jones R, R Jones P and Jones R 2002 Soft Condensed Matter Oxford Master Series in Physics (OUP Oxford) ISBN 9780198505891 URL https://books.google.com.mx/books?id=Hl_ HBPUvoNsC Machine Learning for Condensed Matter Physics 47

[272] Moradzadeh A and Aluru N R 2019 The Journal of Physical Chemistry Letters 10 7568– 7576 publisher: American Chemical Society URL https://doi.org/10.1021/acs.jpclett. 9b02820 [273] Schoenholz S S, Cubuk E D, Sussman D M, Kaxiras E and Liu A J 2016 Nature Physics 12 469–471 ISSN 1745-2481 number: 5 Publisher: Nature Publishing Group URL https: //www.nature.com/articles/nphys3644 [274] Kob W and Andersen H C 1994 Phys. Rev. Lett. 73(10) 1376–1379 URL https://link.aps. org/doi/10.1103/PhysRevLett.73.1376 [275] Liu H, Fu Z, Yang K, Xu X and Bauchy M 2019 Journal of Non-Crystalline Solids 119419 ISSN 0022-3093 URL http://www.sciencedirect.com/science/article/pii/ S0022309319302662 [276] Li L, Yang Y, Zhang D, Ye Z G, Jesse S, Kalinin S V and Vasudevan R K 2018 Science Advances 4 eaap8672 ISSN 2375-2548 publisher: American Association for the Advancement of Science Section: Research Article URL https://advances.sciencemag.org/content/4/3/eaap8672 [277] Minor E N, Howard S D, Green A A S, Glaser M A, Park C S and Clark N A 2020 Soft Matter 16 1751–1759 ISSN 1744-6848 publisher: The Royal Society of Chemistry URL https://pubs.rsc.org/en/content/articlelanding/2020/sm/c9sm01979k [278] Redmon J and Farhadi A 2016 Yolo9000: Better, faster, stronger (Preprint 1612.08242) [279] DeFever R S, Targonski C, Hall S W, Smith M C and Sarupria S 2019 Chem. Sci. 10(32) 7503– 7515 [280] Lecun Y, Bottou L, Bengio Y and Haffner P 1998 Proceedings of the IEEE 86 2278–2324 ISSN 1558-2256 conference Name: Proceedings of the IEEE [281] Xiao H, Rasul K and Vollgraf R 2017 arXiv:1708.07747 [cs, stat] ArXiv: 1708.07747 version: 2 URL http://arxiv.org/abs/1708.07747 [282] Wu Z, Ramsundar B, N Feinberg E, Gomes J, Geniesse C, S Pappu A, Leswing K and Pande V 2018 Chemical Science 9 513–530 publisher: Royal Society of Chemistry URL https://pubs.rsc.org/en/content/articlelanding/2018/sc/c7sc02664a [283] Ramakrishnan R, Dral P O, Rupp M and von Lilienfeld O A 2014 Scientific Data 1 140022 ISSN 2052-4463 number: 1 Publisher: Nature Publishing Group URL https://www.nature.com/ articles/sdata201422 [284] Verlet L 1967 Physical Review 159 98–103 publisher: American Physical Society URL https: //link.aps.org/doi/10.1103/PhysRev.159.98 [285] Zenodo - Research. Shared. URL https://zenodo.org/ [286] Allen C and Mehler D M A 2019 PLOS Biology 17 e3000246 ISSN 1545-7885 publisher: Public Library of Science URL https://journals.plos.org/plosbiology/article?id=10.1371/ journal.pbio.3000246 [287] Cai J, Luo J, Wang S and Yang S 2018 Neurocomputing 300 70–79 ISSN 0925-2312 URL http://www.sciencedirect.com/science/article/pii/S0925231218302911 [288] Mehta P, Bukov M, Wang C H, Day A G R, Richardson C, Fisher C K and Schwab D J 2019 Physics Reports 810 1–124 ISSN 0370-1573 URL http://www.sciencedirect.com/science/ article/pii/S0370157319300766 [289] Snoek J, Larochelle H and Adams R P 2012 Practical Bayesian Optimiza- tion of Machine Learning Algorithms Advances in Neural Information Process- ing Systems 25 ed Pereira F, Burges C J C, Bottou L and Weinberger K Q (Curran Associates, Inc.) 
pp 2951–2959 URL http://papers.nips.cc/paper/ 4522-practical-bayesian-optimization-of-machine-learning-algorithms.pdf [290] Wan L, Zeiler M, Zhang S, Cun Y L and Fergus R 2013 Regularization of Neural Networks using DropConnect International Conference on Machine Learning pp 1058–1066 iSSN: 1938-7228 Section: Machine Learning URL http://proceedings.mlr.press/v28/wan13.html [291] Hutson M 2018 Science 359 725–726 ISSN 0036-8075, 1095-9203 publisher: American Association for the Advancement of Science Section: In Depth URL https://science.sciencemag.org/ Machine Learning for Condensed Matter Physics 48

content/359/6377/725 [292] Srivastava N, Hinton G, Krizhevsky A, Sutskever I and Salakhutdinov R 2014 The Journal of Machine Learning Research 15 1929–1958 ISSN 1532-4435 [293] Vanschoren J, van Rijn J N, Bischl B and Torgo L 2013 SIGKDD Explorations 15 49–60 URL http://doi.acm.org/10.1145/2641190.2641198 [294] Alberti M, Pondenkandath V, W¨ursch M, Ingold R and Liwicki M 2018 DeepDIVA: A Highly-Functional Python Framework for Reproducible Experiments 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR) pp 423–428 [295] Forde J Z and Paganini M 2019 arXiv:1904.10922 [cs, stat] ArXiv: 1904.10922 URL http: //arxiv.org/abs/1904.10922 [296] Allen C and Mehler D M A 2019 PLOS Biology 17 1–14 URL https://doi.org/10.1371/ journal.pbio.3000246 [297] Samuel S, L¨offlerF and K¨onig-RiesB 2020 arXiv:2006.12117 [cs, stat] ArXiv: 2006.12117 URL http://arxiv.org/abs/2006.12117 [298] Plimpton S 1993 Fast parallel algorithms for short-range molecular dynamics Tech. rep. Sandia National Labs., Albuquerque, NM (United States) URL https://lammps.sandia.gov/index. html [299] Weik F, Weeber R, Szuttor K, Breitsprecher K, de Graaf J, Kuron M, Landsgesell J, Menke H, Sean D and Holm C 2019 The European Physical Journal Special Topics 227 1789–1816 [300] Hashimoto K, Sugishita S, Tanaka A and Tomiya A 2018 Phys. Rev. D 98(4) 046019 URL https://link.aps.org/doi/10.1103/PhysRevD.98.046019