Deep Learning on the 2-Dimensional Ising Model to Extract the Crossover

www.nature.com/scientificreports

OPEN Deep learning on the 2‑dimensional Ising model to extract the crossover region with a variational autoencoder Nicholas Walker1*, Ka‑Ming Tam1,2 & Mark Jarrell1,2

The 2-dimensional Ising model on a square lattice is investigated with a variational autoencoder in the non-vanishing feld case for the purpose of extracting the crossover region between the ferromagnetic and paramagnetic phases. The encoded latent variable space is found to provide suitable metrics for tracking the order and disorder in the Ising confgurations that extends to the extraction of a crossover region in a way that is consistent with expectations. The extracted results achieve an exceptional prediction for the critical point as well as agreement with previously published results on the confgurational magnetizations of the model. The performance of this method provides encouragement for the use of machine learning to extract meaningful structural information from complex physical systems where little a priori data is available.

Machine learning (ML) and consequently data science as a whole have seen rapid development over the last decade or so, due largely to considerable advances in implementations and hardware that have made computa- tions more accessible. Conceptually, the ML approach can be regarded as a data modeling approach employing algorithms that eschew explicit instructions in favor of strategies based around pattern extraction and inference driven by statistical analysis. Tis presents a colossal opportunity for modern scientifc investigations, particularly numerical studies, as they naturally involve large data sets and complex systems where obvious explicit instructions for analysis can be elusive. Conventional approaches ofen neglect possible nuance in the structure of the data in favor of rather simple measurements that are ofen untenable for sufciently complex problems. Some ML methods such as inference methods have been routinely applied to certain physical problems, such as the maximum likelihood method and the maximum entropy method 1,2, but applications which utilizing ML methods have only recently attracted attention in the physical sciences, particularly for the study of interacting systems on both classical and quantum scales3. Tere is a unique opportunity to take advantage of the advances in ML algorithms and implementations to provide interesting new approaches to understanding physical data and even perhaps improve upon existing numerical methods 4. Outstanding problems involving the predictions of transition points and phase diagrams are also of great interest for treatment with ML methods. In order to utilize ML approaches for studying phase transitions, one must assume that there is some pattern change in the measured data across the phase transition. Fortunately, this is in fact exactly what happens in most phase transitions. Te widely adopted Lindemann parameter, for example, is essentially a measure of the deviations of atomic positions in the system from equilibrium positions and is ofen used to characterize the melting of a crystal structure 5. Similar form of pattern changes in the positions of the constituent atoms are ofen present in molecular systems in general. Perhaps more importantly, for some sufciently complex systems, their phase transitions do not have obvious order parameters, ofen prohibiting the detection of such pattern changes using conventional methods. Tis is not a hypothetical situation, indeed hidden orderings for some interesting materials, such as heavy fermion materials and cuprate superconductors, have been proposed for long time 6–8. Other systems may not even exhibit a true phase transition, but rather a crossover region where there is no singularity across diferent phases that can be difcult to characterize with conventional methods. A conventional phase transition can be identifed in two ways, with the frst being a singularity in a derivative of the free energy

1Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803, USA. 2Center for Computation and Technology, Louisiana State University, Baton Rouge, LA 70803, USA. *email: nwalk19@ lsu.edu

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 1 Vol.:(0123456789) www.nature.com/scientificreports/

as proposed by Ehrenfest and the second being a broken symmetry exhibited by an order parameter as proposed by Landau. Unlike a conventional phase transition, a crossover is not identifed by a singularity in the free energy. Tere is also no broken symmetry in such a situation and thus no order parameter is associated with a crossover. Te order parameter and singularity in the free energy are presumably sharp and obvious features which can be rather easily identifed. Te absence of such features clearly present a challenging situation in the prediction of a crossover region by ML. ML is a new route of studying these systems by searching for hidden patterns in the measured data where readily applicable a priori information is in short supply. A viable ML method for detecting a crossover will fnd its use in many interesting systems related to the quantum phase transition9. While the quantum phase transition is a second order phase transition controlled by non-thermal parameters at zero temperature, all experiments and most numerical simulations are conducted at fnite albeit low temperatures for practical reasons. As a consequence of said thermal conditions, quantum critical points at low temperatures behave as crossover phenomenona. It is widely believed that many interesting materials, particularly high temperature cuprate superconductors, harbor a quantum critical point. An ML approach for detecting the crossover phenomenon can thus be an important tool for studying quantum critical points. Work has been done on various problems to characterize phase transitions in physical systems using ML methods, including the Ising model in the vanishing feld case 3,10–20. Tis work will use a similar approach to those seen in these papers, but will focus on the crossover regions that are introduced in the non-vanishing feld case of the 2-dimensional Ising model instead of seeking only the exactly known transition point in the vanishing feld case 21. Tis is a somewhat more difcult problem, as there is no explicit transition to be found, but it remains an interesting problem nonetheless and possibly carries much greater implications for crossover regions in more complicated problems. Te Ising model itself is a mathematical model for ferromagnetism that is ofen explored in the feld of statistical mechanics in physics to describe magnetic phenomena22. Originally, the Ising model was developed to investigate magnetic phenomena, as mentioned earlier. With the discovery of electron spins, the model was designed to determine whether or not local interactions between magnetic spins could induce a large fraction of the electronic spins in a material to align in order to produce a macroscopic net magnetic moment. It is expressed in the form of a multidimensional array of spins si that represent a discrete arrangement of magnetic dipole 22 moments of atomic spins . Te spins are restricted to spin-up or spin-down alignments such that si 1, 1 . ∈ {− + } Te spins interact with their nearest neighbors with an interaction strength given by Jij for neighbors si and sj . Te spins can additionally interact with an applied external magnetic feld Hi (where the magnetic dipole moment µ has been absorbed). Te full Hamiltonian describing the system is thus expressed as H Jijsisj Hisi =− − (1) i,j i � � where i, j indicates a sum over adjacent spins. For Jij > 0 , the interaction between the spins is ferromagnetic, for Jij < 0 , the interaction between the spins is antiferromagnetic, and for Jij 0 , the spins are noninteracting. Fur- = thermore, if Hi > 0 , the spin at site i tends to prefer spin-up alignment, if Hi < 0 , the spin at site i tends to prefer spin-down alignment, and if Hi 0 , there is no external magnetic feld infuence on the spin at site i. Te model = has seen extensive use in investigating magnetic phenomena in condensed matter phhysics 24–29. Additionally, the model can be equivalently expressed in the form of the lattice gas model, described by the following Hamiltonian H 4J ninj µ ni =− − (2) i,j i � � where the external feld strength H is reinterpreted as the chemical potential µ , J retains its role as the interaction strength, and ni 0, 1 represents the lattice site occupancy. Te original Ising Hamiltonian can be recovered ∈{ } using the relation σi 2ni 1 up to a constant. Tis model describes a multidimensional array of lattice sites = − which can be either occupied or unoccupied by a hard shell atom, disallowing occupancy greater than one. Te frst term is then interpreted as a short-range attractive interaction term while the second is the fow of atoms between the system and the reservoir. Tis is a simple model of density fuctuation and the liquid-gas transformations used primarily in chemistry, albeit ofen with modifcations 30,31. Additionally, modifed versions of the lattice gas models have been applied to binding behavior in biology 32–34. Typically, the model is studied in the case of Jij J 1 and the vanishing feld case Hi H 0 is of particu- = = = = lar interest for dimension d 2 since a phase transition is exhibited as the critical temperature is crossed. For ≥ two dimensions, the critical temperature can be identifed by exploiting Kramers-Wannier duality symmetry 35–37. At low temperatures with a vanishing feld, the physics of the Ising model is dominated by the nearest-neighbor interactions, which for a ferromagnetic model means that adjacent spins tend to align with one another. How- ever, as the temperature is increased, the thermal fuctuations will eventually overpower the interactions such that the magnetic ordering is destroyed and the orientations of the spins can be considered independent of one another. Tis is called a paramagnet. In such a case, if an external magnetic feld were to be applied, the paramagnet would respond to it and tend to align with it, though for high temperatures, a sufciently strong external feld will be required to overcome the thermal fuctuations. Since the magnetization smoothly decreases to zero with increasing temperature in the presence of an external magnetic feld, there is no phase transition where the magnetization abruptly vanishes. Instead, the region in which the system goes from an ordered to a disordered state is referred to as the crossover region. Generically, a crossover refers to when a system undergoes a change in phase without encountering a canonical phase transition characterized by a critical point as there are no discontinuities in derivatives of the free energy (as determined by Ehrenfest classifcation) or symmetry-breaking mechanisms (as determined by

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 2 Vol:.(1234567890) www.nature.com/scientificreports/

Landau classifcation). A well known example is the BEC-BCS crossover in an ultracold Fermi gas in which tun- ing the interaction strength (the s-wave scattering length) causes the system to crossover from a Bose-Einstein- condensate state to a Bardeen-Cooper-Schriefer state38. Additionally, the Kondo Efect is important in certain metallic compounds with dilute concentrations of magnetic impurities that cross over from a weakly-coupled Fermi liquid phase to a local Fermi liquid phase upon reducing the temperature below some threshold 39. Fur- thermore, examples of strong crossover phenomena have also been recently discovered in classical models of statistical mechanics such as the Blume-Capel model and the random-feld Ising model40,41. Te organization of this work is as follows. Te next section details the data science and ML methods explored in this work. In sect. 3, the results of the analysis of the 2-dimensional square Ising model are reported. Section 4 concludes this work with a discussion of the interpretation, implications, and greater impacts of these fndings. Methods Te Ising confgurations are generated using a standard Monte Carlo algorithm written in Python using the NumPy library42,43. Te algorithm was also optimized to be parallel using the Dask library and select subroutines were compiled at run-time for efciency using the JIT compiler provided by the Numba library44,45. Te Monte Carlo moves used are called spin-fips. A single spin fip attempt consists of fipping the spin of a single lattice site, calculating the resulting change in energy E , and then using that change in energy to defne the Metropolis criterion exp( �E ) . If a randomly generated number is smaller than said Metropolis criterion, the confgura- − T tion resulting from the spin-fip is accepted as the new confguration. Te data analyzed in this work consists of 1,024 square Ising confgurations of side length 32 with periodic boundary conditions across 65 external feld strengths and 65 temperatures respectively uniformly taken from 2, 2 and [1, 5]. Te interaction energies [− ] were set to unity such that Jij J 1 . Each sample was equilibrated with 8,192 Monte Carlo updates before = = data collection began. Data was then collected at an interval of 8 Monte Carlo updates for each sample up to a sample count of 1,024. At the end of each data collection step, a replica exchange Markov chain Monte Carlo move was performed across the full temperature range for each set of Ising confgurations that shared the same external feld strength46–48. Tis allows for more robust sampling of the ensemble across the temperature range by allowing high-temperature states to be available at low temperatures as well as the inverse. Additionally, this helps to prevent samples on the vanishing feld line from relaxing into either positive or negative magnetization states since two states at temperatures close to one another opposite spins would be very likely to swap. In this work, the Ising spins were rescaled such that a spin-down atomic spins carry the value 0 and spin-up atomic spins carry the value 1, which is a standard setup for binary-valued features in data science. Physically, this would be interpreted as the lattice gas model as described in the prior section. Te goal is to map the raw Ising confgurations to a small set of descriptors that can discriminate between the samples using a structural criterion inferred by an ML algorithm. Tis application is referred to as representation learning and is ofen presented as dimensionality reduction. Tere are many methods in the feld of unsupervised ML that seek to achieve such data dimensionality reduction49,50 however, such methods do not respect the multidimensional structure of the input data, so a deep neural network will be used instead to accomplish the data dimensionality reduction in the form of a self-supervised variational autoencoder (VAE)51. Such a neural network is composed of three main components, an encoder network, a decoder network, and a sampling function. Te encoder and decoder neural networks are implemented as deep convolutional neural networks (CNN) in order to preserve the spatially dependent 2-dimensional structure of the Ising confgurations 52. Te general idea of a VAE is to encode confgurations into a latent variable space composed of the parameters for a chosen prior distribution. A multidimensional Gaussian distribution was used for this work. Random variables from these distributions can then be decoded to recover the original input confgurations. In this way, VAEs are both generative models and latent variable models. Assuming a model is sufciently trained, new sample data can be generated through traversing the latent space input to the decoder network. Te purpose for using a VAE in this manner is to extract a low-dimensional representation of the Ising confgurations that are otherwise unwieldy to compare directly in a meaningful manner without a priori knowl- edge of the important derived measurements from statistical physics used to accomplish the same tasks. Te motivation then for using a VAE to encode and decode the Ising confgurations lies in the desire to automate the parameterization of the Ising confgurations without conventional methods from statistical physics, preferring instead to allow the neural network to learn and discover the important features itself directly from the structures of the confgurations. Te latent representations of the confgurations will be small sets of descriptors for the confgurations that can be used to discriminate between them by relying on the assumption that proximities between latent representations of the Ising confgurations in the latent space are notions of structural similarity between the confgurations in their original 2-dimensional lattice representations. In this way, the VAE is is used as an alternative to conventional statistical mechanics algorithms to accomplish the same task of characterizing the structural features of input confgurations. Te encoder CNN uses four convolutional layers with kernel shapes of (3, 3) following the input layer with kernel strides of (2, 2) and increasing flter counts by a factor of 4. Zero-padding is used to ensure that the entire input is reached with convolutions. Furthermore, each convolutional layer uses scaled exponential linear unit activation functions (SELU) and LeCun normal initializations as well as kernels of shape (3, 3)53. Te output of the fnal convolutional layer is then fattened, feeding into two dense layers of eight neurons respectively rep- 2 resenting the latent variables that correspond to the means µi and logarithmic variances log σi of multivariate Gaussian distributions using linear activations. A random variable zi is drawn from the distribution such that 1 2 zi µi exp log σ N0,1 , where N0,1 is the standard normal distribution. Te logarithmic variance is used in = + [ 2 i ] favor of the standard deviation directly in the interest of maintaining numerical stability. Te random variable zi is then used as the input layer for the decoder CNN, where zi is mapped to a dense layer that is then reshaped

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 3 Vol.:(0123456789) www.nature.com/scientificreports/

Figure 1. A diagram depicting the structure of the VAE where X is the input Ising confguration, E(X) is the encoder network, µ and σ are the latent means and standard deviations, z is the random Gaussian sample µ σ X from the distribution described by and , D(z) is the decoder network, and ˆ is the reconstructed Ising confguration.

Figure 2. A diagram depicting the convolution operation for a single kernel of shape (3, 3) with a stride of (2, 2) acting on an input of shape (4, 4) with zero-padding denoted by the striped input region to produce an output feature map of shape (2, 2). Each stride is color coded such that each entry in the output is the sum of the products of the kernel weights and input entries over the subvolume corresponding to the same color. Since the stride is less than the kernel size, the subvolumes overlap.

to match the structure of the output from the fnal convolutional layer in the encoder CNN. From there, the decoder CNN is simply the reverse of the encoder network in structure, albeit with convolutional transpose layers in favor of standard convolutional layers. Te fnal output layer from the decoder network is thus a reproduction of the original input confgurations to the encoder network using a sigmoid activation function. Te structures of the VAE as a whole is shown in Fig. 1 as well as an example of a convolution operation that composes the bulk of the operations in the encoder and decoder networks is shown in Fig. 2. Te loss term consists of two separate components. Te frst is the standard reconstruction loss, which was implemented using the binary crossentropy between the encoder input and decoder output in this work. Other choices for the reconstruction loss are still valid, however, such as mean squared error or mean absolute error. Te second loss term is a Kullback-Liebler divergence term which acts as a regularizer to ensure the latent variables µi an σi faithfully represent multivariate Gaussian parameters. Te combination of the reconstruction loss and the Kullback–Leibler divergence is called the tractable evidence lower bound, ofen referred to as ELBO. In this work, the Kullback–Leibler term was decomposed in a manner similar to a β total correlation VAE ( β − − TCVAE) network, separating it into three parts describing the index-code mutual information, total correlation, and dimension-wise Kulback-Leibler divergence 54. Minibatch stratifed sampling was also employed during training54. Te specifc parameters of the decomposition used were α 1 and β 8. = = = Te Nesterov-accelerated Adaptive Moment Estimation (Nadam) optimizer was used to optimize the loss, though many other choices are available 55. It was found that the adaptive nature of the Nadam optimizer more efciently arrived at minimizing the loss during training of the β TCVAE model than other optimizers. Te −

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 4 Vol:.(1234567890) www.nature.com/scientificreports/

specifc parameters used for the Nadam optimizer were β1 0.9 , β2 0.999 , a schedule decay of 0.4, and the = = default epsilon provided by the Keras library. A learning rate of 0.00001 was chosen. Training was performed over 16 epochs with a batch size of 845 and the samples were shufed before training started. A callback was used to reduce the learning rate on a loss plateau with a patience of 8 epochs. Afer ftting the β TCVAE model, the latent encodings of the Ising confgurations were extracted for further − analysis. Principal component analysis (PCA) was used on the latent means and standard deviations indepen- dently to produce linear transformations of the Gaussian parameters that more clearly discriminate between the samples using the scikit-learn package in an attempt to further disentangle the representations provided by the β -TCVAE49. Tis is done by diagonalizing the covariance matrix of the original features to fnd a set of independent orthogonal projections that describe the most statistically varied linear combinations of the original feature space49. Te PCA projections are then interpreted for the 2-dimensional Ising model. Te motivation for using the principal components (PC) of the latent variables instead of the raw latent variables is to more efectively capture measurements that are both statistically independent due to the orthogonality constraint and also explain the most variance possible in the latent space under said constraint. Given that the latent representations characterize the structure of the Ising confgurations, the principal components of the latent representations allow for more efective discrimination between the diferent structural characteristics of the confgurations than the raw latent variables do. Te β TCVAE model used in this work was implemented using the Keras ML library − with TensorFlow as a backend56,57. Results All of the plots in this section were generated with the MatPlotLib package using a perceptually uniform colormap58. In each plot, the coloration for a square sector on the diagram represents the average value of the measurement at that sector on the diagram, with lightness corresponding to magnitude. Te latent means µi contain only one PC with noticeable statistical signifcance as it explains 77.1% of the total statistical variances between the µi encodings of the Ising confgurations while the rest explained less than 3 4% each. Tis PC will be denoted with ν0 . Tese results refect the accomplishments of prior published works . By comparing ν0 depicted in Fig. 3 to the calculated magnetizations m of the Ising confgurations in Fig. 4, it is readily apparent that ν0 is rather faithfully representing the magnetizations of the Ising confgurations. Tere are some inaccuracies in the intermediate magnetizations produced by a relationship resembing a sigmoid between ν0 and m, but a very clear discrimination between the ferromagnetic spin-up and ferromagnetic spin- down confgurations is shown. Since the magnetizations act as the order parameter for the 2-dimensional Ising model, this shows that the extraction of a reasonable representation of the order parameter is possible with a VAE. It is important to note that since the magnetization is a linear feature of the Ising confgurations, a much simpler linear model would be sufcient for extracting the magnetization. Te latent standard deviations σi show much more interesting behavior, however. Two PCs of the σi encodings, denoted as τ0 and τ1 , are investigated with respect to the external feld strengths and the temperatures. By comparing τ0 depicted in Fig. 5 to the calculated energies E of the Ising confgurations shown in Fig. 6, it is clear that τ0 exhibits a strong discrimination between the low to intermediate energy regions and the highest energy region characterized by a cone starting at the vanishing feld critical point approximated at TC 2.25 ≈ that extends symmetrically to include more external feld values with rising temperature, which is rather similar to the critical point predicted using a dense autoencoder15. Tis is in efect capturing the concretely paramagnetic samples and the relative error in the estimation of the critical temperature is acceptable with a 0.85% overestimate 2 21 error with respect to the exact value of TC 2.27 . Given that the paramagnetic samples are essen- = ln 1 √2 ≈ tially noise due to entropic contributions from [thermal+ ] fuctuations destroying any order that would otherwise be present, it makes sense that these would be easy to discriminate from the rest of the samples using a β TCVAE − model. Tis is because the samples with ν0 values corresponding to nearly zero magnetizations and rather high values for τ0 will resemble Gaussian noise with no notable order preference, which is indeed refected in the raw data. In this way, it seems that ν0 is suitable for tracking the ferromagnetic ordering while τ0 is suitable for characterizing the paramagnetic disorder. Te behavior of τ1 shown in Fig. 7 is even more interesting, as it is not simply discriminating samples with intermediate energies from the rest of the data set. If this were true, then some samples at temperatures below the critical point at non-zero external feld strengths would be included, as is readily apparent in the energies shown in Fig. 6. Rather, there is another cone shape as was seen with τ0 , albeit much wider and with the the samples represented strongly by τ0 omitted. In efect, it would appear as if τ1 is capturing regions in the diagram with intermediate structural disorder as opposed to the highly disordered structures captured by τ0 . Interestingly, τ1 bears a rather strong resemblance to the specifc heat capacity C depicted in Fig. 8. It is worth noting that there is a slight asymmetry between the spin up and the spin down confgurations in τ1 , but it has negligible efects on the relevant analysis. Te distribution of the error between the true and β TCVAE predicted values for the Ising model spins is − shown in Fig. 9. Te distribution is very sharply centered around and reasonably symmetric about zero, showing suitable spin prediction accuracy without a considerable bias towards one spin over the other. Te distribution of the absolute errors between the true and β TCVAE predicted values is shown in Fig. 10, showing that the − bulk of the predictions exhibit very little error. Te distribution of the Kullback–Leibler divergences is depicted in Fig. 11 and is well-behaved with few outliers.

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 5 Vol.:(0123456789) www.nature.com/scientificreports/

Figure 3. Te ensemble average ν0 with respect to the external feld strengths and temperatures.

Figure 4. Te ensemble average magnetization m with respect to the external feld strengths and temperatures.

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 6 Vol:.(1234567890) www.nature.com/scientificreports/

Figure 5. Te ensemble average τ0 with respect to the external feld strengths and temperatures.

Figure 6. Te ensemble average energy E with respect to the external feld strengths and temperatures.

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 7 Vol.:(0123456789) www.nature.com/scientificreports/

Figure 7. Te ensemble average τ1 with respect to the external feld strengths and temperatures.

Figure 8. Te ensemble Ising specifc heat C with respect to the external feld strengths and temperatures.

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 8 Vol:.(1234567890) www.nature.com/scientificreports/

Figure 9. Te distribution of errors in the β TCVAE Ising spin predictions. −

Figure 10. Te distribution of absolute errors in the β TCVAE Ising spin predictions. −

Figure 11. Te distribution of Kullback–Leibler divergences of the β TCVAE model latent encodings. −

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 9 Vol.:(0123456789) www.nature.com/scientificreports/

Discussion In essence, using a VAE to extract structural information from raw Ising confgurations exposes interesting derived descriptors of the confgurations that can be used to not only identify a transition point, but also a crossover region amongst other regions of interest. Te crux of this analysis is in the interpretation of the extracted feature space as represented by the latent variables. Tis is done by studying the behavior of the latent variable mappings of the Ising confgurations with respect to the external magnetic felds and temperatures. Considering that ν0 refects the magnetization for the 2-dimensional Ising model, this means it can be readily interpreted as an indicator for the ferromagnetic ordering exhibited by the confgurations. By contrast, τ0 and τ1 can be interpreted as an indicator of paramagnetic disorder that also provides a suitable estimate of the transition temperature. Te extracted region from τ1 can readily be interpreted as the crossover region, as these confgurations exhibit order preferences alongside a signifcant amount of noise brought on by the entropic contributions from the thermal fuctuations at higher temperatures. As would be expected of the crossover region, it shifs to higher temperatures with increasing external magnetic feld strengths. Tese results potentially carry broad implications for the path towards formulating a generalized order parameter alongside a notion of a crossover region with minimal a priori information through the use of ML methods, which would allow for the investigation of many interesting complex systems in condensed matter physics and materials science. Te advantage of the present method is in its capability of capturing the crossovers. Tis opens a new avenue for the study of quantum critical points from the data obtained at low but fnite temperatures that instead exhibits crossover regions. Examples of these include data from large scale numerical Quantum Monte Carlo simulations for heavy fermion materials and high temperature superconducting cuprates for which quantum critical points are believed to play crucial roles for their interesting properties 60–61. Tere are many opportunities beyond investigating more complex systems by introducing improvements to this method beyond the scope of this work. For instance, fnite-size scaling is an important approach towards addressing limitations presented by fnite-sized systems for investigation critical phenomena 62. Establishing correspondence between the VAE encodings of diferent system sizes is a challenging proposition, as diferent VAE structures will need to be trained for each system size, which in turn may require diferent hyperparameters and training iteration counts to provide similar results. Consequently, numerical difculties can arise when perform- ing fnite-size scaling analysis, as the variation of predicted properties with respect to system size may be difcult to isolate from the systemic variation due to diferent neural networks being used to extract said properties. Nevertheless, this would be a signifcant step towards improving VAE characterization of critical phenomena.

Received: 27 December 2019; Accepted: 6 July 2020

References 1. Gubernatis, J. E., Jarrell, M., Silver, R. N. & Sivia, D. S. Quantum monte carlo simulations and maximum entropy: Dynamics from imaginary-time data. Phys. Rev. B 44, 6011–6029. https://doi.org/10.1103/PhysRevB.44.6011 (1991). 2. Jarrell, M. & Gubernatis, J. Bayesian inference and the analytic continuation of imaginary-time quantum monte carlo data. Phys. Rep. 269, 133–195. https://doi.org/10.1016/0370-1573(95)00074-7 (1996). 3. Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. Nat. Phys. 13, 431–434 (2017). 4. Huang, L. & Wang, L. Accelerated monte carlo simulations with restricted boltzmann machines. Phys. Rev. B 95, 035105. https:// doi.org/10.1103/PhysRevB.95.035105 (2017). 5. Lindemann, F. Te calculation of molecular vibration frequencies. Physik. Z. 11, 609–615 (1910). 6. Varma, C. M. & Zhu, L. Helicity order: Hidden order parameter in uru2si2 . Phys. Rev. Lett. 96, 036405. https://doi.org/10.1103/ PhysRevLett.96.036405 (2006). 7. Chakravarty, S., Laughlin, R. B., Morr, D. K. & Nayak, C. Hidden order in the cuprates. Phys. Rev. B 63, 094503. https://doi. org/10.1103/PhysRevB.63.094503 (2001). 8. Chandra, P., Coleman, P., Mydosh, J. A. & Tripathi, V. Nature 417, (2002). 9. Vojta, M. Quantum phase transitions. Rep. Prog. Phys. 66, 2069–2110. https://doi.org/10.1088/0034-4885/66/12/r01 (2003). 10. Wang, L. Discovering phase transitions with unsupervised learning. Phys. Rev. B 94, 195105. https://doi.org/10.1103/PhysR evB.94.195105 (2016). 11. Pilania, G., Gubernatis, J. E. & Lookman, T. Structure classifcation and melting temperature prediction in octet ab solids via machine learning. Phys. Rev. B 91, 214302. https://doi.org/10.1103/PhysRevB.91.214302 (2015). 12. Walker, N., Tam, K.-M., Novak, B. & Jarrell, M. Identifying structural changes with unsupervised machine learning methods. Phys. Rev. E 98, 053305. https://doi.org/10.1103/PhysRevE.98.053305 (2018). 13. Hu, W., Singh, R. R. P. & Scalettar, R. T. Discovering phases, phase transitions, and crossovers through unsupervised machine learning: A critical examination. Phys. Rev. E 95, 062122. https://doi.org/10.1103/PhysRevE.95.062122 (2017). 14. Wetzel, S. J. Unsupervised learning of phase transitions: From principal component analysis to variational autoencoders. Phys. Rev. E 96, 022140. https://doi.org/10.1103/PhysRevE.96.022140 (2017). 15. Alexandrou, C., Athenodorou, A., Chrysostomou, C. & Paul, S. Unsupervised identifcation of the phase transition on the 2D-Ising model. arXiv e-printsarXiv:1903.03506 (2019). 16. Wetzel, S. J. & Scherzer, M. Machine learning of explicit order parameters: from the Ising model to SU(2) lattice gauge theory. Phys. Rev. 96, 184410. https://doi.org/10.1103/PhysRevB.96.184410 (2017). 17. Wang, L. Discovering phase transitions with unsupervised learning. Phys. Rev. 94, 195105. https://doi.org/10.1103/PhysR evB.94.195105 (2016). 18. Kim, D. & Kim, D.-H. Smallest neural network to learn the Ising criticality. Phys. Rev. E 98, 022138. https://doi.org/10.1103/PhysR evE.98.022138 (2018). 19. Torlai, G. & Melko, R. G. Learning thermodynamics with boltzmann machines. Phys. Rev. B 94, 165134 (2016). 20. Morningstar, A. & Melko, R. G. Deep learning the ising model near criticality. J. Mach. Learn. Res. 18, 5975–5991 (2017). 21. Onsager, L. Crystal statistics. i. a two-dimensional model with an order-disorder transition. Phys. Rev. 65, 117–149. https://doi. org/10.1103/PhysRev.65.117 (1944). 22. Chaikin, P. M. & Lubensky, T. C. Principles of condensed matter physics (Cambridge University Press, Cambridge, 1995). 23. Joy, P. A., Kumar, P. S. A. & Date, S. K. Te relationship between feld-cooled and zero-feld-cooled susceptibilities of some ordered magnetic systems. J. Phys. Condens. Matter 10, 11049–11054. https://doi.org/10.1088/0953-8984/10/48/024 (1998).

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 10 Vol:.(1234567890) www.nature.com/scientificreports/

24. Nordblad, P., Lundgren, L. & Sandlund, L. A link between the relaxation of the zero feld cooled and the thermoremanent magnetizations in spin glasses. J. Magn. Magn. Mater. 54–57, 185–186. https://doi.org/10.1016/0304-8853(86)90543-3 (1986). 25. Montroll, E. W., Potts, R. B. & Ward, J. C. Correlations and spontaneous magnetization of the two-dimensional ising model. J. Math. Phys. 4, 308–322. https://doi.org/10.1063/1.1703955 (1963). 26. Singh, S. P. Spinodal theory: A common rupturing mechanism in spinodal dewetting and surface directed phase separation (some technological aspects: spatial correlations and the signifcance of dipole-quadrupole interaction in spinodal dewetting). Adv. Condens. Matter. Phys. 2011. https://doi.org/10.1155/2011/526397 (2011). 27. Magnus, F. et al. Long-range magnetic interactions and proximity efects in an amorphous exchange-spring magnet. Nat. Commun. 7, 1–7 (2016). 28. Singh, S. P. Revisiting 2d lattice based spin fip-fop ising model: magnetic properties of a thin flm and its temperature dependence. Eur. J. Phys. Educ. 5, 8–19 (2017). 29. Huang, R. & Gujrati, P. D. Phase transitions of antiferromagnetic ising spins on the zigzag surface of an asymmetrical husimi lattice. R. S. Open Sci. 6, 181500 (2019). 30. Hirschfelder, J. O., Curtiss, C. F. & Bird., R. B. Molecular theory of gases and liquids. J. Polym. Sci. 17, 116–116. https://doi. org/10.1002/pol.1955.120178311 (1955). 31. Titov, S. V. & Tovbin, Y. K. A molecular model of water based on the lattice gas model. Russ. J. Phys. Chem. A 85, 194–201. https ://doi.org/10.1134/S0036024411020336 (2011). 32. Shi, Y. & Duke, T. Cooperative model of bacterial sensing. Phys. Rev. E 58, 6399–6406. https://doi.org/10.1103/PhysR evE.58.6399 (1998). 33. Bai, F. et al. Conformational spread as a mechanism for cooperativity in the bacterial fagellar switch. Science 327, 685–689. https ://doi.org/10.1126/science.1182105 (2010). 34. Vtyurina, N. N. et al. Hysteresis in dna compaction by dps is described by an ising model. Proc. Nat. Acad. Sci. 113, 4982–4987. https://doi.org/10.1073/pnas.1521241113 (2016). 35. Baxter, R. J. Exactly solved models in statistical mechanics (1982). 36. Kramers, H. A. & Wannier, G. H. Statistics of the two-dimensional ferromagnet. part ii. Phys. Rev. 60, 263–276. https://doi. org/10.1103/PhysRev.60.263 (1941). 37. Wannier, G. H. Te statistical problem in cooperative phenomena. Rev. Mod. Phys. 17, 50–60. https://doi.org/10.1103/RevMo dPhys.17.50 (1945). 38. Nozieres, P. & Schmitt-Rink, S. Bose condensation in an attractive fermion gas: from weak to strong coupling superconductivity. J. Low Temp. Phys. 59, 195–211 (1985). 39. Kondo, J. Resistance minimum in dilute magnetic alloys. Prog. Teor. Phys. 32, 37–49. https://doi.org/10.1143/PTP.32.37 (1964). 40. Fytas, N. G. et al. Universality from disorder in the random-bond blume-capel model. Phys. Rev. E 97, 040102. https://doi. org/10.1103/PhysRevE.97.040102 (2018). 41. Fytas, N. G. & Martín-Mayor, V. Universality in the three-dimensional random-feld ising model. Phys. Rev. Lett. 110, 227201. https://doi.org/10.1103/PhysRevLett.110.227201 (2013). 42. Van Rossum, G. & Drake, F. L. Python 3 Reference Manual (CreateSpace, Scotts Valley, CA, 2009). 43. van der Walt, S., Colbert, S. C. & Varoquaux, G. Te numpy array: a structure for efcient numerical computation. Comput. Sci. Eng. 13, 22–30. https://doi.org/10.1109/MCSE.2011.37 (2011). 44. Dask Development Team. Dask: Library for dynamic task scheduling. https://dask.org (2016). 45. Lam, S. K., Pitrou, A. & Seibert, S. Numba: A llvm-based python jit compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, LLVM ’15, 7:1–7:6, https://doi.org/10.1145/28331 57.28331 62 (ACM, New York, NY, USA, 2015). 46. Swendsen, R. H. & Wang, J.-S. Replica monte carlo simulation of spin-glasses. Phys. Rev. Lett. 57, 2607–2609. https://doi. org/10.1103/PhysRevLett.57.2607 (1986). 47. Hukushima, K. & Nemoto, K. Exchange monte carlo method and application to spin glass simulations. J. Phys. Soc. Jpn. 65, 1604–1608. https://doi.org/10.1143/JPSJ.65.1604 (1996). 48. Marinari, E. & Parisi, G. Simulated tempering: a new monte carlo scheme. EPL 19, 451–458. https://doi.org/10.1209/0295- 5075/19/6/002 (1992). 49. Pearson, K. Liii. on lines and planes of closest ft to systems of points in space. Philos. Mag. 2, 559–572. https://doi.org/10.1080/14786 440109462720 (1901). 50. van der Maaten, L. & Hinton, G. Visualizing data using t-sne. J. Mach. Learn. Res. 9, 2579–2605 (2011). 51. Kingma, D. P. & Welling, M. Auto-Encoding Variational Bayes. arXiv e-printsarXiv:1312.6114 (2013). 52. Zhang, W., Itoh, K., Tanida, J. & Ichioka, Y. Parallel distributed processing model with local space-invariant interconnections and its optical architecture. Appl. Opt. 29, 4790–4797. https://doi.org/10.1364/AO.29.004790 (1990). 53. Klambauer, G., Unterthiner, T., Mayr, A. & Hochreiter, S. Self-normalizing neural networks. CoRRarXiv:abs/1706.02515 (2017). 54. Chen, T. Q., Li, X., Grosse, R. B. & Duvenaud, D. K. Isolating sources of disentanglement in variational autoencoders. CoRRarXiv :abs/1802.04942 (2018). 55. Ruder, S. An overview of gradient descent optimization algorithms. arXiv e-printsarXiv:1609.04747 (2016). 56. Chollet, F. et al. Keras. https://keras.io (2015). 57. Abadi, M. et al. TensorFlow: large-scale machine learning on heterogeneous systems (2015). Sofware available from tensorfow. org. 58. Hunter, J. D. Matplotlib: a 2d graphics environment. Comput. Sci. Eng. 9, 90–95. https://doi.org/10.1109/MCSE.2007.55 (2007). 59. Coleman, P. & Schofeld, A. J. Quantum criticality. Nature 433, 226 (2005). 60. Varma, C., Nussinov, Z. & van Saarloos, W. Singular or non-fermi liquids. Phys. Rep. 361, 267–417. https://doi.org/10.1016/S0370 -1573(01)00060-6 (2002). 61. Vidhyadhiraja, N. S., Macridin, A., Şen, C., Jarrell, M. & Ma, M. Quantum critical point at fnite doping in the 2d hubbard model: a dynamical cluster quantum monte carlo study. Phys. Rev. Lett. 102, 206407. https://doi.org/10.1103/PhysRevLett.102.206407 (2009). 62. Cardy, J. Finite-size Scaling. Current physics (North-Holland, 1988). Acknowledgements Tis work is funded by the NSF EPSCoR CIMM project under award OIA-1541079. Additional support (MJ) was provided by NSF Materials Teory grant DMR-1728457. An award of computer time was provided by the INCITE program. Tis research also used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Ofce of Science User Facility supported under Contract DE-AC05-00OR22725. Author contributions N.W. collected the simulation data, analyzed the results, prepared the fgures, and wrote the bulk of the manuscript. K.-M.T. wrote considerable parts of the introduction. M.J. provided valuable guidance and conversations regarding the physical signifcance of the work. All authors reviewed the manuscript.

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 11 Vol.:(0123456789) www.nature.com/scientificreports/

Competing interests Te authors declare no competing interests. Additional information Correspondence and requests for materials should be addressed to N.W. Reprints and permissions information is available at www.nature.com/reprints. Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional afliations. Open Access Tis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. Te images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Scientific Reports | (2020) 10:13047 | https://doi.org/10.1038/s41598-020-69848-5 12 Vol:.(1234567890)