arXiv:2103.08285v2 [quant-ph] 27 May 2021

Efficient bit encoding of neural networks for Fock states

Oliver Kaestle∗ and Alexander Carmele
Technische Universität Berlin, Institut für Theoretische Physik, Nichtlineare Optik und Quantenelektronik, Hardenbergstraße 36, 10623 Berlin, Germany
(Dated: May 28, 2021)

We present a bit encoding scheme for a highly efficient and scalable representation of bosonic Fock number states in the restricted Boltzmann machine neural network architecture. In contrast to common density matrix implementations, the complexity of the neural network scales only with the number of bit-encoded neurons rather than the maximum boson number. Crucially, in the high occupation regime its information compression efficiency is shown to surpass even maximally optimized density matrix implementations, where a projector method is used to access the sparsest Hilbert space representation available.

∗ [email protected]

I. INTRODUCTION

In recent breakthroughs, artificial neural networks have been successfully utilized for the description of quantum states [1–8] and Markovian dynamics of open quantum systems [9–12]. In particular, the restricted Boltzmann machine (RBM) neural network architecture has been established as a natural and highly efficient representation of the density matrix for spin systems and small molecular quantum systems [1, 13–20], as it enables a one-to-one mapping of spins to artificial neurons and allows for direct access to a stationary state via iterative application of the variational principle [21, 22]. While the implementation of periodic spin systems with symmetries of translational invariance results in high numerical performance and fast convergence times [9–12, 23–25], adaptive sampling strategies for the input system configurations have been shown to render accurate calculations of asymmetric open spin systems feasible as well [26].

In this article, we expand the representational power of the RBM architecture towards hybrid spin systems comprising bosonic Fock number states. To this end, a bit encoding scheme is applied to the Fock state basis, enabling a direct mapping of bosonic number states to the visible neuron layer without modification to the underlying neural network structure itself. Strikingly, we find that in the regime of high Fock state occupation numbers, the bit-encoded neural network information compression efficiency surpasses even a maximally optimized density matrix representation in stationary state, where a projector method is employed to access the sparsest Hilbert space representation available. We demonstrate the accuracy of the presented neural encoding of Fock number states by calculating the stationary boson number statistics of the generic one-atom laser model [27–31] and providing a comparison of benchmark calculations. Moreover, we demonstrate the method's scalability potential into the large boson number regime where the neural network information compression becomes most efficient. Aside from the goal of advancing the paradigm of the RBM architecture as a universally applicable neural network for the simulation of open quantum systems, specific applications of the presented method include, e.g., neural network realizations of boson sampling algorithms [32, 33] or recent attempts to quantify quantum coherence via Fock state superposition [34].

The article is organized as follows: The investigated model system is introduced in Sec. II. In Sec. III, we derive the bit encoding scheme for Fock number states to enable a direct mapping to the visible neuron layer of the RBM. Afterwards, details on the implementation and the training procedure of the neural network are provided in Sec. IV. In Sec. V, the information compression efficiency of the bit-encoded RBM is compared to both a regular and a highly optimized density matrix implementation with respect to the required Fock state basis dimension. To this end, we compare the scaling of complexity for the considered model system, featuring a highly sparse Hilbert space in stationary state which can be truncated by making use of a Heisenberg projector method for maximum efficiency. Yet, we find that the bit-encoded neural network still outperforms the competing approach in the regime of high Fock state occupations with respect to compression efficiency. Finally, in Sec. VI we demonstrate the accuracy of the presented method by calculating the stationary boson number statistics, before a confirmation of the method's scalability potential for an accurate depiction of large Fock state occupations is provided in Sec. VII. Lastly, we summarize our findings in Sec. VIII.

II. MODEL

To quantify the achievable information compression in systems comprising bosonic degrees of freedom via the presented bit-encoded neural network approach, we consider the paradigmatic open Jaynes-Cummings model, describing a realization of a one-atom laser via the interaction of a single spin system with a bosonic cavity mode [30]. In the rotating wave and dipole approximation, the corresponding system Hamiltonian is given by [27–29, 31]

$$ H/\hbar = \omega_0 \sigma^+\sigma^- + \omega_c c^\dagger c + g_0\left(\sigma^+ c + \sigma^- c^\dagger\right), \tag{1} $$

with Pauli spin operators $\sigma^\pm$ and bosonic creation and annihilation operators $c^\dagger$, $c$. Here, $\omega_0$ and $\omega_c$ correspond to the spin and cavity mode frequencies and $g_0$ denotes the coupling amplitude between the system and cavity mode. In addition, the spin-1/2 system is incoherently driven at rate $\Gamma$, combined with an incoherent decay of the bosonic mode occupation at rate $\kappa$. The resulting time evolution dynamics for the density operator is prescribed by

$$ \dot\rho = \mathcal{L}\rho = -i\left[H/\hbar, \rho\right] + \mathcal{D}[\sqrt{\kappa/2}\,c]\rho + \mathcal{D}[\sqrt{\Gamma/2}\,\sigma^+]\rho, \tag{2} $$

where we have introduced the Lindblad dissipators [35, 36]

$$ \mathcal{D}[\sqrt{\kappa/2}\,c]\rho = \frac{\kappa}{2}\left(2c\rho c^\dagger - \{c^\dagger c, \rho\}\right), \tag{3a} $$

$$ \mathcal{D}[\sqrt{\Gamma/2}\,\sigma^+]\rho = \frac{\Gamma}{2}\left(2\sigma^+\rho\sigma^- - \{\sigma^-\sigma^+, \rho\}\right), \tag{3b} $$

imposing incoherent excitation and dissipation on the spin system and the cavity mode, respectively. In the following calculations, we choose the parameters $g_0 = 0.2\,\mathrm{ps}^{-1}$, $\Gamma = 0.4\,\mathrm{ps}^{-1}$, $\omega_0 = \omega_c$ and varying bosonic decay rates $\kappa$. Moreover, we are only interested in the stationary state reached at time $t_s$, where $\dot\rho(t_s) = \mathcal{L}\rho(t_s) = 0$ within numerical precision.

The corresponding system density matrix $\rho$ consists of $d^2$ elements, with $d = 2^N n_\beta^{\max}$ for a system comprising $N$ spins and a single bosonic mode with maximum occupation number $n_\beta^{\max}$. In case of the here considered Jaynes-Cummings model, we have $N = 1$. Due to the self-adjointness of the density matrix, only $d(d+1)/2$ of its elements must be determined for a complete system description. Our model choice is motivated by the high sparsity of the stationary state density matrix: Using a Heisenberg projector method for maximum optimization, the full Hilbert space can be projected onto a subspace spanned by only $2(d-1)$ nonzero elements, completely describing the deterministic density matrix $\rho(t_s)$ in stationary state [35, 37]. In the following, we present a neural bit encoding scheme of Fock states based on the restricted Boltzmann machine (RBM) neural network architecture. Here, the deterministic density matrix $\rho$ is estimated by a probabilistic neural density operator $\rho_\vartheta$, which is fully described by a set of variational parameters $\vartheta$. In the high boson number regime, the presented method is shown to yield a drastic reduction of complexity with respect to the deterministic density matrix representation, surpassing even the compression efficiency of the maximally optimized description.

III. NEURAL ENCODING OF FOCK STATES

The RBM neural network architecture can be employed to create a probabilistic model of the density matrix and is composed of binary neurons, meaning that each neuron in the network can take on one of two possible configurations. Recently, it has been shown to enable a highly favorable and efficient description of open spin systems via a one-to-one mapping of spins to binary neurons, establishing a natural representation of the systems' degrees of freedom [1, 9–14, 23, 26, 38–42]. The $2^{2N}$ density matrix elements $\langle\sigma_1,\dots,\sigma_N|\rho|\eta_1,\dots,\eta_N\rangle$ for a system of $N$ spins $\sigma_n, \eta_n = \{-1, 1\}$ are constituted by a model distribution referred to as neural density operator which is optimized by iterative variation of a set of network parameters. This neural network realization of the density matrix is certainly a great achievement; however, its potential has not yet been fully exploited. In order to further expand the representational power of the RBM, in the following we present a highly efficient and scalable mapping of Fock number states to the artificial neurons by subjecting the bosonic Fock state basis to a bit encoding scheme [43, 44]:

The fundamental idea is to decompose the Fock state occupation number into a string of bits, which is then directly mapped onto the visible binary neurons of the RBM. To derive a general framework for hybrid systems comprising both spins and bosonic degrees of freedom, we consider $N$ spin-1/2 systems and a single bosonic mode, corresponding to density matrix elements $\langle\sigma_1,\dots,\sigma_N; n_\beta^\sigma|\rho|\eta_1,\dots,\eta_N; n_\beta^\eta\rangle$ where $\sigma_n, \eta_n = \{-1, 1\}$ again denote the left and right spin configurations and $n_\beta^\sigma, n_\beta^\eta \in \mathbb{N}_0$ correspond to the left and right number occupation of the bosonic mode. The Fock state occupations $n_\beta^\sigma$, $n_\beta^\eta$ are each decomposed into $N_\beta$ bits $(\beta_1^\sigma,\dots,\beta_{N_\beta}^\sigma)$ and $(\beta_1^\eta,\dots,\beta_{N_\beta}^\eta)$, following the encoding rule

$$ n_\beta = \sum_{i=1}^{N_\beta} 2^{i-1}\,\delta_{\beta_i,1}, \tag{4} $$

i.e., allowing for the representation of $n_\beta = \{0, 1, \dots, 2^{N_\beta}-1\}$ indistinguishable bosons on each side. In this bit-encoded format, the Fock state basis can be directly mapped onto the binary neurons of the RBM analogous to the spin-1/2 systems and without any modification to the neural network architecture itself. Naturally, the regime of representable Fock state occupations is limited by the number of employed bits. For instance, utilizing a total of $N_\beta = 4$ artificial neurons as bits corresponds to $2^4$ possible Fock state configurations in total, with the Fock occupation number given by

$$ n_\beta = 2^0\,\delta_{\beta_1,1} + 2^1\,\delta_{\beta_2,1} + 2^2\,\delta_{\beta_3,1} + 2^3\,\delta_{\beta_4,1}. \tag{5} $$

FIG. 1. RBM realization of the neural density operator, featuring a visible layer storing the configuration of $N$ spin-1/2 systems (orange) and the bosonic Fock state occupation bit-encoded in $N_\beta$ neurons (blue), two hidden layers (green) and an ancillary mixing layer (red) with variational training parameters $\vartheta = (a, b, c, W, U)$.

Fig. 1 shows a sketch of the resulting bit-encoded RBM: The neural network features a visible layer of $2(N + N_\beta)$ sites $\boldsymbol{\sigma} = (\sigma_1,\dots,\sigma_N,\beta_1^\sigma,\dots,\beta_{N_\beta}^\sigma)$ and $\boldsymbol{\eta} = (\eta_1,\dots,\eta_N,\beta_1^\eta,\dots,\beta_{N_\beta}^\eta)$ representing the full configuration of the left and right side of the density matrix and consisting of $N$ spin-1/2 systems (orange shapes) and the bosonic mode occupation encoded in $N_\beta$ bits (blue shapes). In addition, the network comprises two auxiliary hidden layers with $M$ sites $h^\sigma$ and $h^\eta$ each (green shapes), connecting the visible sites of each side, and an ancillary mixing layer of $K$ neurons $h^\mu$ connecting the left and right side of the density matrix (red shapes). Tracing out the hidden and ancillary degrees of freedom, the elements of the neural density operator read [10–12, 26, 38]

$$
\begin{aligned}
\rho_\vartheta(\boldsymbol{\sigma},\boldsymbol{\eta}) = {} & 8\exp\left[\sum_{i=1}^{N}\left(a_i\sigma_i + a_i^*\eta_i\right) + \sum_{i=N+1}^{N+N_\beta}\left(a_i\beta^\sigma_{i-N} + a_i^*\beta^\eta_{i-N}\right)\right] \\
& \times \prod_{m=1}^{M}\cosh\left(b_m + \sum_{i=1}^{N}W_{mi}\sigma_i + \sum_{i=N+1}^{N+N_\beta}W_{mi}\beta^\sigma_{i-N}\right)\cosh\left(b_m^* + \sum_{i=1}^{N}W_{mi}^*\eta_i + \sum_{i=N+1}^{N+N_\beta}W_{mi}^*\beta^\eta_{i-N}\right) \\
& \times \prod_{k=1}^{K}\cosh\left[c_k + c_k^* + \sum_{i=1}^{N}\left(U_{ki}\sigma_i + U_{ki}^*\eta_i\right) + \sum_{i=N+1}^{N+N_\beta}\left(U_{ki}\beta^\sigma_{i-N} + U_{ki}^*\beta^\eta_{i-N}\right)\right],
\end{aligned} \tag{6}
$$

where $\vartheta = (a, b, c, W, U)$ denotes a set of complex training parameters split up into real and imaginary parts, yielding a total of $2(N+N_\beta) + 2M + K + 2M(N+N_\beta) + 2K(N+N_\beta)$ elements. These variational parameters constitute the network's degrees of freedom, consisting of biases $a$ for visible sites, $b$ for hidden neurons and $c$ for the mixing layer, and of complex weights $W$ and $U$ connecting the visible neurons $(\boldsymbol{\sigma}, \boldsymbol{\eta})$ to the hidden layers $h^\sigma$, $h^\eta$ and to the ancillary mixing layer $h^\mu$, respectively [see Fig. 1].
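The encoding rule of Eq. (4) and the parameter count stated for $\vartheta$ can be illustrated with a minimal Python sketch. The function names `encode_fock`, `decode_fock` and `rbm_parameter_count` are hypothetical and not taken from the article. Representing an unset bit by the neuron value $-1$ is an assumption made here by analogy with the spin values $\{-1, 1\}$; the text only fixes that a set bit corresponds to $\beta_i = 1$.

```python
def encode_fock(n, n_bits):
    """Return (beta_1, ..., beta_Nbeta) of Eq. (4) as +1/-1 neuron values."""
    if not 0 <= n <= 2 ** n_bits - 1:
        raise ValueError("occupation not representable with n_bits bits")
    # Bit i (1-based) carries the weight 2**(i-1); +1 marks a set bit.
    # Assumption: an unset bit is stored as -1, like a spin-down neuron.
    return [1 if (n >> i) & 1 else -1 for i in range(n_bits)]


def decode_fock(bits):
    """Invert encode_fock: n = sum_i 2**(i-1) * delta(beta_i, 1)."""
    return sum(2 ** i for i, b in enumerate(bits) if b == 1)


def rbm_parameter_count(n_spins, n_bits, m_hidden, k_mix):
    """Number of real parameters in theta = (a, b, c, W, U) as given below Eq. (6)."""
    v = n_spins + n_bits  # visible sites per side
    return 2 * v + 2 * m_hidden + k_mix + 2 * m_hidden * v + 2 * k_mix * v


if __name__ == "__main__":
    print(encode_fock(5, 4))                # [1, -1, 1, -1], i.e. 5 = 2**0 + 2**2
    print(decode_fock([1, -1, 1, -1]))      # 5
    print(rbm_parameter_count(1, 5, 6, 6))  # 174 for N = 1, N_beta = 5, M = K = 6
```

With $N_\beta = 4$ the sketch covers the $2^4$ occupations $0,\dots,15$ of Eq. (5); adding one bit doubles the representable range while the parameter count grows only polynomially in $N_\beta$.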
To dimension with increasing system size, an exact mapping propose a new sample, a random selection rule is em- of the density matrix becomes increasingly expensive ployed where the left and right configuration of each spin when considering large Fock state numbers. The arti- σ1,...,σN , η1,...,ηN is flipped at 50% probability each. ficial neural network ansatz approaches this problem Afterwards, new random Fock number configurations by approximating the unknown density matrix ρ by σ σ η η β1 ,...,βNβ ,β1 ,...,βNβ are drawn based on the new the neural density operator ρϑ [Eq. (6)] via iterative spin configuration. Specifically, only combinations of optimization of the parameters ϑ. To this end, configu- spin configurations and boson numbers that have a ration space is efficiently compressed via application of nonzero stationary state contribution are taken into the Metropolis algorithm [45], where a sequence of Ns consideration as samples. The acceptance probability of samples of left and right density matrix configurations, a newly drawn sample is chosen as i.e., visible neuron configurations of the RBM, is drawn as input data rather than taking every possible density matrix configuration into account. The Metropolis algorithm is based on a Markov chain Monte Carlo p˜ϑ(σn+1, ηn+1) A(n +1,n) = min 1, (7) method, corresponding to a random walk in Hilbert p˜ (σ , η ) ϑ n n 4 where (σn, ηn) denotes the current and (σn+1, ηn+1) the dient is evaluated as [10, 26] newly proposed sample configuration. Ns † Employing the stochastic reconfiguration ap- ϑ C(ϑ) = 2Re p˜ϑ(σn, ηn) ˜ (σn, ηn) ∇ l L proach [49–51], the system observables and the ( n=1 X normalized occurrence probability of a given sam- Ns ρϑ(σm, ηm) ple configuration (σn, ηn) with n = 1,...,Ns are (σn, ηn, σm, ηm) Oϑl (σm, ηm) { } × L ρϑ(σn, ηn) approximated as statistical expectation values over the m=1 X Ns samples drawn during one iteration. 
As a result, the Ns normalized occurrence probability is given by p˜ϑ(σn, ηn)Oϑ (σn, ηn) − l "n=1 # X Ns 2 ˜† ˜ ρϑ(σn, ηn) p˜ϑ(σn, ηn) (σn, ηn) (σn, ηn) , (11) p˜ϑ(σn, ηn)= | | , (8) × " L L #) Ns 2 n=1 ρϑ(σn, ηn) X n=1 | | introducing the estimator of the Liouvillian P ρ (σ , η ) and diagonal observables can be estimated as statistical ˜(σ , η ):= (σ , η , σ , η ) ϑ m m , (12) n n n n m m ρ (σ , η ) averages X(σ, σ) X(σ, σ) q [10–12, 49–51] with L σ ,η L ϑ n n h i ≈ hh ii mXm and logarithmic derivatives stored in diagonal matrices N with elements s ρ (ξ, σ ) σ σ σ σ ξ ϑ n X( , ) q := q˜ϑ( n) X( n, ) , (9) ∂[ln ρϑ(σn, ηn)] hh ii ρϑ(σn, σn) n=1 ξ [Oϑl ]σnηn,σnηn = Oϑl (σn, ηn)= , (13) X X ∂ϑl which correspond to the neural density operator gradi- where we have introduced the normalized proba- ents with respect to all l elements of ϑ and for a given bility of diagonal system configurationsq ˜ϑ(σn) = sample configuration (σn, ηn). Ns ρϑ(σn, σn)/[ n=1 ρϑ(σn, σn)]. In this work, we focus on diagonal observables as figures of merit. As a re- V. NEURAL NETWORK EFFICIENCY GAIN sult, numericalP performance can be further increased by employing the probability amplitudeq ˜ϑ(σ) based only on diagonal samples, which considerably reduces the di- In a regular density matrix implementation, the num- mension of the relevant configuration subspace: During ber of required elements for a complete system descrip- each training iteration, Ns diagonal samples (σn, σn) are tion scales polynomially with the maximum boson num- max drawn to calculateq ˜ϑ(σ) for the estimation of diagonal ber nβ . For the here considered model [Eq. (2)], this max max max observables, and Ns unrestricted samples (σn, ηn) are corresponds to 2nβ (2nβ + 1)/2 elements, with nβ drawn to calculatep ˜ϑ(σ, η) for the training of the net- denoting the chosen bosonic occupation number limit dic- work. tated by the numerical implementation. 
In its maximally optimized stationary state representation, a linear scal- The training goal is to determine the steady state of the max ing via 2(2nβ 1) can be achieved. In contrast, in considered system, prescribed by the conditionρ ˙ = ρ = the presented bit-encoded− neural network the amount of L 0, with denoting the Liouvillian superoperator [35, 36]. variational parameters arising from bosonic degrees of L In order to optimize the parameter set ϑ to fulfill this freedom scales only with the number of bits N , with 2 β condition, we define a cost function C(ϑ)= ρϑ [10, nmax = 2Nβ 1, corresponding to a drastic decrease of kL k2 β 12]. Initially, the variational parameters are set to small complexity especially− in the limit of large boson numbers. (0) nonzero random values, ϑl [ 0.01, 0.01] 0 . Using In Fig. 2, we compare the number of parameters re- the standard stochastic gradient∈ − descent algorithm\{ } and quired for a complete and numerically convergent de- Ns sample system configurations as input training data, scription of the considered model system with respect to during each training iteration t t + 1 the parameters the average boson occupation number in stationary state → ϑ are updated by the rule nβ(ts) and plotted on a logarithmic scale. A lower value correspondsh i to a higher degree of information compression. The mean stationary Fock state occupation is tuned (t+1) (t) (t) by variation of the bosonic decay rate κ. In the neural ϑ = ϑ ν ϑ C[ϑ ], (10) l l − ∇ l network implementation, convergence is achieved once the number of employed bits Nβ is chosen sufficiently large and can be further improved by increasing the num- at a learning rate ν [48]. The required cost function gra- ber of samples per iteration Ns. Numerical convergence 5
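The sampling step described above can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' implementation. The names `random_params`, `neural_density` and `metropolis_chain` are hypothetical. The proposal flips every visible neuron (spins and bits alike) with probability 1/2, and the restriction to the subspace of nonzero steady state elements is omitted. The acceptance ratio uses $\tilde p_\vartheta \propto |\rho_\vartheta|^2$ from Eq. (8), so the normalization cancels in Eq. (7).

```python
import numpy as np


def random_params(n_visible, m_hidden, k_mix, rng, scale=0.01):
    """Small random complex parameters theta = (a, b, c, W, U)."""
    def cplx(*shape):
        return rng.uniform(-scale, scale, shape) + 1j * rng.uniform(-scale, scale, shape)
    return (cplx(n_visible), cplx(m_hidden), cplx(k_mix),
            cplx(m_hidden, n_visible), cplx(k_mix, n_visible))


def neural_density(params, v_sigma, v_eta):
    """Evaluate rho_theta(sigma, eta) of Eq. (6) for +1/-1 visible configurations."""
    a, b, c, W, U = params
    amplitude = np.exp(a @ v_sigma + np.conj(a) @ v_eta)
    left = np.prod(np.cosh(b + W @ v_sigma))
    right = np.prod(np.cosh(np.conj(b) + np.conj(W) @ v_eta))
    # c_k + c_k^* equals twice the real part of c_k.
    mixing = np.prod(np.cosh(2.0 * c.real + U @ v_sigma + np.conj(U) @ v_eta))
    return 8.0 * amplitude * left * right * mixing


def metropolis_chain(params, n_visible, n_samples, rng):
    """Draw (sigma, eta) samples, accepting with min(1, |rho_new|^2 / |rho_old|^2)."""
    v_sigma = rng.choice([-1, 1], size=n_visible)
    v_eta = rng.choice([-1, 1], size=n_visible)
    weight = abs(neural_density(params, v_sigma, v_eta)) ** 2
    samples = []
    for _ in range(n_samples):
        # Assumption: simplified symmetric proposal, each neuron flipped at 50%.
        new_sigma = np.where(rng.random(n_visible) < 0.5, -v_sigma, v_sigma)
        new_eta = np.where(rng.random(n_visible) < 0.5, -v_eta, v_eta)
        new_weight = abs(neural_density(params, new_sigma, new_eta)) ** 2
        if rng.random() < min(1.0, new_weight / weight):
            v_sigma, v_eta, weight = new_sigma, new_eta, new_weight
        samples.append((v_sigma.copy(), v_eta.copy()))
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # N = 1 spin plus N_beta = 5 bits gives 6 visible neurons per side.
    theta = random_params(n_visible=6, m_hidden=6, k_mix=6, rng=rng)
    chain = metropolis_chain(theta, n_visible=6, n_samples=1000, rng=rng)
    print(len(chain))
```

Because Eq. (6) satisfies $\rho_\vartheta(\boldsymbol{\sigma}, \boldsymbol{\eta}) = \rho_\vartheta(\boldsymbol{\eta}, \boldsymbol{\sigma})^*$ term by term, the sampled model is Hermitian by construction, which offers a quick consistency check for any implementation.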

V. NEURAL NETWORK EFFICIENCY GAIN

In a regular density matrix implementation, the number of required elements for a complete system description scales polynomially with the maximum boson number $n_\beta^{\max}$. For the here considered model [Eq. (2)], this corresponds to $2n_\beta^{\max}(2n_\beta^{\max}+1)/2$ elements, with $n_\beta^{\max}$ denoting the chosen bosonic occupation number limit dictated by the numerical implementation. In its maximally optimized stationary state representation, a linear scaling via $2(2n_\beta^{\max}-1)$ can be achieved. In contrast, in the presented bit-encoded neural network the amount of variational parameters arising from bosonic degrees of freedom scales only with the number of bits $N_\beta$, with $n_\beta^{\max} = 2^{N_\beta}-1$, corresponding to a drastic decrease of complexity especially in the limit of large boson numbers.

In Fig. 2, we compare the number of parameters required for a complete and numerically convergent description of the considered model system with respect to the average boson occupation number in stationary state $\langle n_\beta(t_s)\rangle$ and plotted on a logarithmic scale. A lower value corresponds to a higher degree of information compression. The mean stationary Fock state occupation is tuned by variation of the bosonic decay rate $\kappa$. In the neural network implementation, convergence is achieved once the number of employed bits $N_\beta$ is chosen sufficiently large and can be further improved by increasing the number of samples per iteration $N_s$. Numerical convergence of the regular density matrix implementation is assumed if further expanding the maximum Fock state occupation $n_\beta^{\max}$ results in a relative deviation of less than 0.1% in $\langle n_\beta(t_s)\rangle$. With increasing degrees of freedom, dynamical Runge Kutta calculations typically require an increasingly small time discretization to achieve numerical convergence. In addition, the required number of elements scales polynomially, resulting in a polynomial increase in complexity for rising system sizes (solid dark blue line). Exploiting the sparsity of the stationary state density matrix to truncate the corresponding Hilbert space via application of a projector method, the density matrix implementation can be maximally optimized to scale linearly in the required number of parameters (dashed dark blue line). The number of variational RBM parameters defining the neural density operator scales with the number of employed bits $N_\beta$. While increasing the hidden layer sizes of course results in a less efficient compression, we note that in our experience numerical convergence of the network can be improved a lot more efficiently by increasing the bosonic degrees of freedom $N_\beta$ rather than the hidden layer dimensions $M$ and $K$. Therefore, the solid light blue line in Fig. 2 shows the required number of variational parameters to achieve a convergent estimation of the density matrix at fixed hidden layer densities $M/(N_\beta+1) = K/(N_\beta+1) = 1$, exhibiting a slow linear increase for rising Fock state basis dimensions.

FIG. 2. Required number of parameters for a complete and numerically convergent system description, plotted on a logarithmic scale with respect to the mean stationary state boson occupation. The RBM approach (solid light blue line) is compared to a regular density matrix implementation (solid dark blue line) and a highly optimized approach with a truncated Hilbert space featuring only nonzero steady state density matrix elements (dashed dark blue line). Inset: Cutout on a linear scale, showing the area where the RBM implementation becomes the most efficient.

As a main result of our study, Fig. 2 illustrates a much more efficient compression of system information by the RBM architecture with respect to the regular density matrix implementation. The inset shows a cutout on a linear scale, where the bit encoding of the bosonic degrees of freedom results in a stepwise increase of complexity (solid light blue line). The maximally optimized, linearly scaling density matrix implementation is comparably efficient and even undercuts the required number of variational RBM parameters in the low Fock state occupation regime. Strikingly, the neural network information compression becomes even more efficient above $\langle n_\beta(t_s)\rangle \approx 160$ (see inset). Given the already excellent Hilbert space compression achieved by the projector method in the maximally optimized density matrix approach, this is a remarkable result. In the following, we explicitly demonstrate the bit-encoded RBM's accuracy and scalability potential with regard to the regime of large Fock state basis dimensions.

VI. ACCURACY

As a proof of principle and to demonstrate the accuracy of the neural encoding of Fock states, we specifically calculate the stationary boson occupation number statistics $P_n(t_s)$ for the considered model system [Eq. (2)], with

$$ P_n(t) = \frac{1}{n!}\langle c^{\dagger n}c^n\rangle(t) - \frac{1}{n!}\sum_{m=1}^{n^{\max}}\frac{(n+m)!}{m!}\,P_{n+m}(t), \tag{14} $$

denoting the probability of measuring $n$ bosons in the system at a given time $t$, calculated up to the highest included bosonic correlation degree $n^{\max}$ [52, 53]. Here we choose a low bosonic decay rate $\kappa = 0.04\,\mathrm{ps}^{-1}$. In accordance with Fig. 2, we have chosen $N_\beta = 5$ bits and hidden layer densities $M/(N_\beta+1) = K/(N_\beta+1) = 1$ to achieve numerically convergent results. Calculations are performed at a learning rate $\nu = 0.01$ and using $N_s = 5000$ sample configurations per iteration. As a benchmark, we additionally calculate the system dynamics up to the steady state using a common density matrix implementation using identical parameters, $n_\beta^{\max} = 14$ and a time discretization $\Delta t = 0.02$ ps.

FIG. 3. Demonstration of the accuracy of the bit-encoded neural network implementation of Fock number states. (a) Expectation value for the stationary Fock state occupation number $\langle n_\beta(t_s)\rangle$ obtained from the RBM (solid blue line), compared to a calculation using fewer samples per iteration (solid grey line) and to the benchmark result (dashed line). (b) Steady state boson number statistics resulting from the RBM implementation (light blue bars) and compared to benchmark results (dark blue bars).

Fig. 3(a) shows the estimated stationary state expectation value of the Fock state occupation number $\langle n_\beta(t_s)\rangle$ with respect to the number of training iterations of the RBM (solid light blue line) and compared to the benchmark result $\langle n_\beta(t_s)\rangle \approx 4.56$ (dashed dark blue line), exhibiting excellent agreement after approximately 4000 iterations. The light oscillating behavior of the RBM results can be further reduced by increasing the number of samples per iteration $N_s$: Accordingly, a comparison RBM calculation using five times fewer samples per iteration exhibits increased variations (solid grey line). Fig. 3(b) depicts the steady state boson number statistics $P_{n_\beta}(t_s)$ [Eq. (14)] calculated via training of the neural network (light blue bars) and compared to benchmark results (dark blue bars). The two resulting statistics are in overall very good qualitative agreement, sharing their highest boson number probability at $n_\beta = 4$, with a Kullback-Leibler divergence of approximately 0.14 which can be further reduced by increasing the sample size $N_s$. It is noted, however, that the statistics resulting from the RBM implementation is prone to error accumulation for $n_\beta > 10$: The estimated occurrence probabilities feature statistical deviations arising from the Monte Carlo sampling procedure. These deviations are relatively small when considering the boson number observable $\langle n_\beta\rangle = \langle c^\dagger c\rangle$ and choosing a sufficiently large sample size $N_s$ [solid light blue line in Fig. 3(a)]. However, during the calculation of Eq. (14) the statistical error multiplies for each increasing correlation order $n$ of $\langle c^{\dagger n}c^n\rangle$, thus limiting high accuracy RBM calculations of the boson number statistics to the low boson number regime for the considered sample size.
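A deterministic reference for the boson number statistics can be sketched as follows. This sketch swaps in a different technique from the one described in the text: instead of propagating Eq. (2) in time up to the steady state, it computes the null space of the Liouvillian on the truncated Fock space directly. The names `jc_steady_state` and `kl_divergence` are hypothetical. Working in a frame rotating at the cavity frequency and the direction of the Kullback-Leibler divergence are assumptions made here. The output has not been checked against the value $\langle n_\beta(t_s)\rangle \approx 4.56$ reported above.

```python
import numpy as np


def jc_steady_state(n_max=14, g0=0.2, gamma=0.4, kappa=0.04, detuning=0.0):
    """Stationary state of Eq. (2) on a Fock space truncated at n_max bosons."""
    nb = n_max + 1
    c = np.diag(np.sqrt(np.arange(1, nb)), k=1).astype(complex)  # c|n> = sqrt(n)|n-1>
    sm = np.array([[0, 1], [0, 0]], dtype=complex)               # sigma^-, basis (down, up)
    C = np.kron(np.eye(2), c)
    Sm = np.kron(sm, np.eye(nb))
    Sp = Sm.conj().T
    # Assumption: rotating frame, detuning = omega_0 - omega_c (zero in the text).
    H = detuning * Sp @ Sm + g0 * (Sp @ C + Sm @ C.conj().T)
    # Jump operators reproducing the dissipators of Eqs. (3a) and (3b).
    jumps = [np.sqrt(kappa) * C, np.sqrt(gamma) * Sp]

    d = 2 * nb
    eye = np.eye(d)
    # Row-major vectorization: vec(A rho B) = kron(A, B.T) vec(rho).
    liouv = -1j * (np.kron(H, eye) - np.kron(eye, H.T))
    for J in jumps:
        JdJ = J.conj().T @ J
        liouv += np.kron(J, J.conj()) - 0.5 * np.kron(JdJ, eye) - 0.5 * np.kron(eye, JdJ.T)

    # The stationary state spans the null space of the Liouvillian.
    _, _, vh = np.linalg.svd(liouv)
    rho = vh[-1].conj().reshape(d, d)
    rho = rho / np.trace(rho)

    populations = np.real(np.diag(rho)).reshape(2, nb)  # rows: spin down, spin up
    p_n = populations.sum(axis=0)                       # boson number statistics P_n
    return p_n, populations[0].sum(), populations[1].sum()


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence sum_n p_n ln(p_n / q_n), skipping p_n ~ 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > eps
    return float(np.sum(p[mask] * np.log(p[mask] / np.maximum(q[mask], eps))))


if __name__ == "__main__":
    p_n, p_down, p_up = jc_steady_state()
    print("mean boson number:", float(np.arange(p_n.size) @ p_n))
    print("most probable boson number:", int(np.argmax(p_n)))
```

In the stationary state the pump and loss rates balance, $\kappa\langle n_\beta\rangle = \Gamma\,\langle\sigma^-\sigma^+\rangle$, which follows from Eqs. (1) to (3) because the Hamiltonian conserves the total excitation number. The identity also holds on the truncated space and gives a useful sanity check on the computed state.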
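VII. SCALABILITY

To access the high boson number regime, we calculate the considered model system [Eq. (2)] once more at a small bosonic decay rate $\kappa = 0.001\,\mathrm{ps}^{-1}$, resulting in $\langle n_\beta(t_s)\rangle \approx 199$ where the information compression efficiency of the RBM realization has been shown to surpass even a maximally optimized density matrix implementation (see Fig. 2). For training, we employ $N_\beta = 13$ bits and hidden layer densities $M/(N_\beta+1) = K/(N_\beta+1) = 1$ at a learning rate $\nu = 0.003$ and $N_s = 5000$ samples per iteration. Even though $\langle n_\beta(t_s)\rangle$ is located well below the maximum Fock state occupation $n_\beta^{\max} = 2^{N_\beta}-1$, choosing fewer bits $N_\beta$ yields non-converging results, underlining the network's need for sufficient degrees of freedom to facilitate effective training [19]. Thanks to the favorable scaling of the required number of variational parameters with increasing system sizes, calculations are still highly efficient in this regime.

FIG. 4. Demonstration of the scalability potential of the bit-encoded neural network in the large Fock number regime, showing the mean Fock occupation number $\langle n_\beta(t_s)\rangle$ over training iterations (solid light blue line). The inset shows the spin down (green bottom line) and spin up occupations (orange upper line) of the spin system over iterations. Dashed dark blue lines indicate corresponding benchmark results.

Fig. 4 shows the neural network results for the mean Fock state occupation number $\langle n_\beta(t_s)\rangle$ over the course of training iterations (solid blue line). Remarkably, already after approximately 400 iterations, it approaches the benchmark value $\langle n_\beta(t_s)\rangle \approx 199$ (dashed blue line). The inset shows the steady state spin up and spin down expectation values of the single spin system obtained from the RBM implementation (blue and orange lines) and in good agreement with their corresponding benchmark results (dashed blue lines). To conclude, the required number of neurons employed as bits $N_\beta$ to account for bosonic degrees of freedom exceeds the actual stationary boson occupation by far. However, the number of training iterations to achieve numerical convergence is drastically reduced with increasing neuron numbers. This can be explained by the decreased asymmetry of the spin-boson interaction [Eq. (1)] in the large boson number regime $n \gg 1$ where $\sqrt{n} \approx \sqrt{n+1}$, since the RBM architecture is known to achieve far higher levels of performance and convergence for the representation of systems with symmetries of translational invariance [26]. At the same time, the bit-encoded neural network performs more efficiently than even highly optimized common implementations where a projector method has been employed to access the sparsest Hilbert subspace available, underlining the performance of the bit-encoded neural network representation of Fock states in the high occupation regime.

VIII. CONCLUSION

We have presented a bit-encoded realization of Fock number states in the RBM neural network architecture, extending its applicability of high-performing approximate mappings of the density matrix to hybrid spin systems featuring bosonic degrees of freedom, further advancing the paradigm of a universally applicable neural network architecture for open quantum systems. Crucially, in the limit of large Fock state occupation numbers the RBM implementation requires far fewer parameters for a complete system description than common density matrix approaches and even surpasses the information compression efficiency of a maximally optimized implementation, where the corresponding Hilbert space has been truncated to the sparsest possible representation by application of a projector method. We have demonstrated the accuracy of the presented neural encoding of Fock states by calculating the stationary state boson number statistics of a model system, exhibiting good agreement with benchmark calculations. Moreover, to illustrate the scalability potential and performance of our method, we have calculated the mean stationary Fock state occupation in the high boson number regime, where the information compression of the neural network becomes the most efficient. Once numerical convergence is achieved by tuning the number of visible neurons in the network, it can be further improved, e.g., by increasing the number of samples per iteration or via application of adaptive sampling strategies [26].

ACKNOWLEDGMENTS

We thank Marten Richter for fruitful discussions. The authors acknowledge support from the Deutsche Forschungsgemeinschaft (DFG) through SFB 910 project B1 (Project No. 163436311).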

[1] G. Carleo and M. Troyer, Science 355, 602 (2017).
[2] D.-L. Deng, X. Li, and S. Das Sarma, Phys. Rev. B 96, 195145 (2017).
[3] D.-L. Deng, X. Li, and S. Das Sarma, Phys. Rev. X 7, 021021 (2017).
[4] I. Glasser, N. Pancotti, M. August, I. D. Rodriguez, and J. I. Cirac, Phys. Rev. X 8, 011006 (2018).
[5] G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, Nature Physics 14, 447 (2018).
[6] M. Schmitt and M. Heyl, Phys. Rev. Lett. 125, 100503 (2020).
[7] H. Burau and M. Heyl, arXiv:2009.04473 (2020).
[8] Y. Nomura, A. S. Darmawan, Y. Yamaji, and M. Imada, Phys. Rev. B 96, 205152 (2017).
[9] N. Yoshioka and R. Hamazaki, Phys. Rev. B 99, 214306 (2019).
[10] F. Vicentini, A. Biella, N. Regnault, and C. Ciuti, Phys. Rev. Lett. 122, 250503 (2019).
[11] M. J. Hartmann and G. Carleo, Phys. Rev. Lett. 122, 250502 (2019).
[12] A. Nagy and V. Savona, Phys. Rev. Lett. 122, 250501 (2019).
[13] G. Torlai and R. G. Melko, Phys. Rev. Lett. 120, 240503 (2018).
[14] R. G. Melko, G. Carleo, J. Carrasquilla, and J. I. Cirac, Nature Physics 15, 887 (2019).
[15] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Phys. Rev. X 8, 021050 (2018).
[16] C. Y. Hsieh, Q. Sun, S. Zhang, and C. K. Lee, npj Quantum Information 7, 19 (2021).
[17] R. Xia and S. Kais, Nature Communications 9, 4195 (2018).
[18] D. Alcalde Puente and I. M. Eremin, Phys. Rev. B 102, 195148 (2020).
[19] D. Sehayek, A. Golubeva, M. S. Albergo, B. Kulchytskyy, G. Torlai, and R. G. Melko, Phys. Rev. B 100, 195125 (2019).
[20] L. Huang and L. Wang, Phys. Rev. B 95, 035105 (2017).
[21] J. Cui, J. I. Cirac, and M. C. Bañuls, Phys. Rev. Lett. 114, 220601 (2015).
[22] H. Weimer, Phys. Rev. Lett. 114, 040402 (2015).
[23] K. Choo, G. Carleo, N. Regnault, and T. Neupert, Phys. Rev. Lett. 121, 167204 (2018).
[24] D. Yevick and R. Melko, Computer Physics Communications 258, 107518 (2021).
[25] T. Xiao, D. Bai, J. Fan, and G. Zeng, Phys. Rev. A 101, 032304 (2020).
[26] O. Kaestle and A. Carmele, Phys. Rev. B 103, 195420 (2021).
[27] B. W. Shore and P. L. Knight, Journal of Modern Optics 40, 1195 (1993).
[28] R. R. Puri and G. S. Agarwal, Phys. Rev. A 33, 3610 (1986).
[29] M. Richter, A. Carmele, A. Sitek, and A. Knorr, Phys. Rev. Lett. 103, 087407 (2009).
[30] S. Kreinberg, T. Grbešić, M. Strauß, A. Carmele, M. Emmerling, C. Schneider, S. Höfling, X. Porte, and S. Reitzenstein, Light: Science & Applications 7, 1 (2018).
[31] M. Gegg, A. Carmele, A. Knorr, and M. Richter, New Journal of Physics 20, 013006 (2018).
[32] A. Neville, C. Sparrow, R. Clifford, E. Johnston, P. M. Birchall, A. Montanaro, and A. Laing, Nature Physics 13, 1153 (2017).
[33] I. Agresti, N. Viggianiello, F. Flamini, N. Spagnolo, A. Crespi, R. Osellame, N. Wiebe, and F. Sciarrino, Phys. Rev. X 9, 011013 (2019).
[34] C. Lüders, M. Pukrop, E. Rozas, C. Schneider, S. Höfling, J. Sperling, S. Schumacher, and M. Aßmann, arXiv:2103.03033 (2021).
[35] H. P. Breuer and F. Petruccione, The theory of open quantum systems (Oxford University Press, Oxford, 2002).
[36] S. Mukamel, Principles of nonlinear optical spectroscopy (Oxford University Press, New York, 1995).
[37] E. Fick, Einführung in die Grundlagen der Quantentheorie (Aula-Verlag, Wiesbaden, 1988).
[38] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Rev. Mod. Phys. 91, 045002 (2019).
[39] T. Vieijra, C. Casert, J. Nys, W. De Neve, J. Haegeman, J. Ryckebusch, and F. Verstraete, Phys. Rev. Lett. 124, 097201 (2020).
[40] G. Carleo, Y. Nomura, and M. Imada, Nature Communications 9, 5322 (2018).
[41] S. Cheng, J. Chen, and L. Wang, Entropy 20, 583 (2018).
[42] E. Rrapaj and A. Roggero, Phys. Rev. E 103, 013302 (2021).
[43] S. C. Kuhn and M. Richter, Phys. Rev. B 99, 241301 (2019).
[44] S. C. Kuhn and M. Richter, Phys. Rev. B 101, 075302 (2020).
[45] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, The Journal of Chemical Physics 21, 1087 (1953).
[46] C. P. Robert and G. Casella, Monte Carlo Statistical Methods (Springer, New York, 2004).
[47] N. G. van Kampen, Stochastic Processes in Physics and Chemistry, third edition ed. (Elsevier, New York, 2007).
[48] M. Schuld and F. Petruccione, Supervised Learning with Quantum Computers (Springer, Cham, 2018).
[49] S. Sorella, Phys. Rev. Lett. 80, 4558 (1998).
[50] S. Sorella, M. Casula, and D. Rocca, The Journal of Chemical Physics 127, 014105 (2007).
[51] F. Becca and S. Sorella, Quantum Monte Carlo Approaches for Correlated Systems (Cambridge University Press, Cambridge, 2017).
[52] J. Kabuss, A. Carmele, T. Brandes, and A. Knorr, Phys. Rev. Lett. 109, 054301 (2012).
[53] J. Kabuss, A. Carmele, M. Richter, W. W. Chow, and A. Knorr, physica status solidi (b) 248, 872 (2011).