arXiv:2001.01273v2 [math.RA] 17 Sep 2020

Theory and applications of linearized multivariate skew polynomials

Umberto Martínez-Peñas*
Institute of Computer Science and Mathematics,
University of Neuchâtel, Switzerland

*[email protected]

Abstract

In this work, linearized multivariate skew polynomials with coefficients over division rings are introduced, which generalize univariate linearized polynomials, group algebras of finite groups of ring automorphisms, and algebras of derivations, among others. It is shown that their natural evaluation is connected to the remainder-based evaluation of free multivariate skew polynomials, and that they are right linear over division subrings called centralizers. It is shown that P-independence of evaluation points corresponds to right linear independence over the corresponding centralizers, according to the partition of the set of points into pair-wise disjoint conjugacy classes. Hence it is deduced that finitely generated P-closed sets correspond to lists of finite-dimensional right vector spaces over centralizers, and such finitely generated P-closed sets are the sets where Lagrange interpolation works as expected. It is also shown that products of free multivariate skew polynomials translate into coordinate-wise compositions (one per conjugacy class) of linearized multivariate skew polynomials, which in turn translate into matrix products over the corresponding centralizers. Several applications of these results are given. First, linearized multivariate Vandermonde matrices are introduced, which generalize multivariate Moore and Wronskian matrices; the previous results give an explicit method to determine their ranks in general. Such matrices also give rise to a novel version of linearized Reed-Muller codes, connected to the skew and linearized Reed-Muller codes introduced in this work. Finally, we introduce P-Galois extensions of division rings, which generalize (finite) Galois extensions of fields. Three Galois-theoretic results are generalized to such extensions: Artin's theorem on the dimension of the larger division ring over its subring, the Galois correspondence, and Hilbert's Theorem 90.

Keywords: Galois theory, Hilbert's Theorem 90, Lagrange interpolation, linearized polynomials, Moore matrices, Reed-Muller codes, skew polynomials, Vandermonde matrices.

MSC: 12E10, 12E15, 12F10, 16S36, 94B60.

1 Introduction

The concept of univariate skew polynomials was introduced, in its most general form, by Ore in [38]. They are defined as the most general polynomial ring, that is, a (commutative or non-commutative) left algebra over a division ring with a left basis of monomials 1, x, x^2, ..., whose product satisfies x^i x^j = x^{i+j}, for all non-negative integers i, j, and such that the degree of the product of two skew polynomials is the sum of their degrees. Here we are slightly bending the usual notion of algebra [8]. By left algebra, we mean a left vector space with a ring structure whose product is linear on the first component (rather than bilinear as in the commutative case).
A natural definition of evaluation of univariate skew polynomials, via Euclidean division, was introduced by Lam and Leroy in the works [22, 25]. Thanks to this concept of evaluation, Lam and Leroy introduced the concept of P-independence of evaluation points in [22, 24], which in turn gives rise to the concept of P-closed set (Definition 20), and P-basis (Definition 23) of a P-closed set. Intuitively, a finite set of evaluation points is P-independent if we may perform Lagrange interpolation over them, i.e., any set of values (of the right size) can be attained by evaluating some skew polynomial over such evaluation points (see Theorem 2). In [22, Theorem 23], it was shown that a set of evaluation points is P-independent if, and only if, the subsets in its partition into conjugacy classes (Definition 18) are each P-independent. Later in [25, Theorem 4.5], it was shown that a set of evaluation points, all from the same conjugacy class, is P-independent if, and only if, the exponents in the conjugacy relation are right linearly independent over the corresponding centralizer (Definition 10). With these two results, Lam and Leroy gave a simple explicit method to find the rank of matrices obtained by evaluating (univariate) skew polynomials, which generalize Vandermonde matrices [43] and are related to Moore matrices [33] and Wronskian matrices [15]. Later on, this method for finding the rank of such general Vandermonde matrices, built from evaluations of skew polynomials, was used in [29] to show that linearized Reed-Solomon codes (introduced in [29, Definition 31]) have maximum possible minimum sum-rank distance. Linearized Reed-Solomon codes are defined by evaluating certain operator polynomials that generalize classical univariate linearized polynomials over finite fields [28, Chapter 3]. Evaluations of such polynomials are tightly connected to Lam and Leroy's concept of evaluation for skew polynomials via a particular case of a result by Leroy [27, Theorem 2.8].
Since such operator polynomials are right linear over the corresponding centralizer, they can be seen as a linearization of skew polynomials. The generator matrices of linearized Reed-Solomon codes [29, page 604] are a linearized version of the skew Vandermonde matrices defined in [22, 25] and simultaneously recover as particular cases Vandermonde, Moore and Wronskian matrices. For this reason, linearized Reed-Solomon codes also recover as particular cases Reed-Solomon codes [40], which are MDS (maximum distance separable), and Gabidulin codes [10], which are MRD (maximum rank distance). These codes have numerous applications in error correction in telecommunications, repair in data storage or information-theoretical security, among others. Most notably, Reed-Solomon codes have been extensively used in practice, including CDs, DVDs, QR codes, satellite communications and the storage system RAID 6. In [31], free multivariate skew polynomials were introduced, following Ore's definition: They are the most general polynomial ring in several free variables (variables are not allowed to commute with each other) such that the product of two monomials consists of appending them and the degree of a product of two skew polynomials is the sum of their degrees. Thanks to the lack of relations between the variables, the concept of evaluation was extended in [31, Definition 9] due to the uniqueness of remainders in the Euclidean division [31, Lemma 5], which cannot be guaranteed for iterated skew polynomial rings (see [31, Remark 8] and [12, Example 3.7]) or if the variables are allowed to commute (see [31, Remark 7]). The concepts of conjugacy, P-independence and skew Vandermonde matrices were then extended to such a multivariate case in [31], leading to a skew Lagrange interpolation theorem [31, Theorem 4] and equating the rank of a skew Vandermonde matrix to the rank of the P-closed set generated by the corresponding evaluation points [31, Proposition 41].

In this work, we introduce a concept of multivariate polynomials on certain operators, as done in [29], which we will call linearized multivariate skew polynomials (Subsection 2.2), and we show that their natural evaluation is also tightly connected to the arithmetic evaluation (that is, based on Euclidean divisions) of free multivariate skew polynomials (Subsection 2.3). We will use this connection and skew Lagrange interpolation [31, Theorem 4] to extend the important results [22, Theorem 23] and [25, Theorem 4.5] to the multivariate case, finding an explicit representation of P-closed sets as a disjoint union or a list of right vector spaces over the corresponding centralizers (Theorems 4 and 5 in Section 3), and similarly for their P-bases. We will then show that compositions of linearized multivariate skew polynomials, seen as right linear maps, coincide with matrix products, and products of free multivariate skew polynomials can be mapped onto coordinate-wise compositions of linearized multivariate skew polynomials over pair-wise disjoint conjugacy classes (Section 4), which is hence equivalent to products of block-diagonal matrices. As a consequence, we deduce in Corollary 45 that quotients of free multivariate skew polynomial rings by the ideal of skew polynomials vanishing on a finite union of finitely generated conjugacy classes are semisimple rings [23, Definition (2.5)]. Moreover, they are simple rings [23, Definition (2.1)] in the case of a single conjugacy class (Corollary 44). We note that all of these results particularize to non-trivial results on the free conventional multivariate polynomial ring (where variables do not commute with each other but commute with constants) over an arbitrary division ring. The case where such division rings are fields (i.e., commutative) was extensively studied in [8], and the case of arbitrary division rings but where variables commute with each other was studied in [2].
However, our results in the general case are new to the best of our knowledge. The final two sections of this work constitute applications of the theory developed up to this point. In Subsection 5.2, we will define linearized multivariate Vandermonde matrices, connect them to skew multivariate Vandermonde matrices [31], and provide a simple explicit criterion to determine their ranks (Theorem 11 in Subsection 5.2) similar to that obtained by combining [22, Theorem 23] and [25, Theorem 4.5] in the univariate case. In Subsection 5.3, we introduce skew and linearized Reed-Muller codes, calculate their dimension and show a connection between their minimum skew and sum-rank distances (respectively) and their minimum Hamming distance. As we will show, skew Reed-Muller codes are similar to, but not exactly the same as, those introduced in [12], and linearized Reed-Muller codes recover the version of Reed-Muller codes in [4] as the particular case of a single conjugacy class. Finally, in Section 6 we introduce the concept of P-Galois extensions of division rings, which generalize Galois extensions of fields. We then generalize to these P-Galois extensions of division rings three classical results in Galois theory [3]. In Subsection 6.2, we generalize Artin's Theorem [3, Theorem 14], which calculates the dimension of the larger field over its subfield. In Subsection 6.3, we generalize the Galois correspondence [3, Theorem 16]. Finally, in Subsection 6.4, we generalize Hilbert's Theorem 90 [3, Theorem 21].

Notation

For a set A and positive integers m and n, A^{m×n} will denote the set of m × n matrices over A, and A^n will denote the set of column vectors of length n over A. That is, A^n = A^{n×1}. Given another set B, we denote by B^A the set of all maps A −→ B. Unless otherwise stated, F will denote a division ring, that is, a commutative or non-commutative ring with identity such that every non-zero element has another non-zero element that is both its left and right inverse. A field is a commutative division ring, and Fq

denotes the finite field of size q, where q is a prime power. On a ring R, we will denote by (A) ⊆ R the left ideal generated by a set A ⊆ R, and on a left vector space V over F, we will denote by ⟨B⟩^L ⊆ V the left F-linear vector space generated by a set B ⊆ V. We use the simplified notations (F1, F2, ..., Fn) = ({F1, F2, ..., Fn}), for F1, F2, ..., Fn ∈ R, and ⟨G1, G2, ..., Gn⟩^L = ⟨{G1, G2, ..., Gn}⟩^L, for G1, G2, ..., Gn ∈ V. Similarly for right vector spaces, where we denote ⟨B⟩^R. We denote by dim^L_F and dim^R_F the left and right dimensions over F. Rings are not assumed to be commutative, but all of them will be assumed to have multiplicative identity, and all ring morphisms map multiplicative identities to multiplicative identities.

2 Main definitions and the natural evaluation maps

In this section, we define linearized multivariate skew polynomials. We extend the notion of centralizers, defined in [25, Equation (3.1)] for the univariate case (which was in turn an extension of the classical notion of centralizers of non-commutative division rings), and we show that linearized multivariate skew polynomials are right linear over the corresponding centralizer. Finally, we show that the natural evaluation on linearized multivariate skew polynomials corresponds to evaluation as proposed in [31, Definition 9], based on remainders of Euclidean divisions.

2.1 Skew polynomials and skew evaluation

We start by revisiting the concepts of free multivariate skew polynomials from [31, Section 2]. These are the building blocks for defining skew polynomial rings with relations [31, Section 6], which are simply quotient rings of the free ring. For brevity, we will usually drop the term multivariate.

Definition 1 (Free multivariate skew polynomial rings [31]). Given a ring morphism σ : F −→ Fn×n, we say that δ : F −→ Fn is a σ-derivation if it is additive and

δ(ab)= σ(a)δ(b)+ δ(a)b, for all a,b ∈ F. Let x1, x2,...,xn be n pair-wise distinct letters, which we will call variables, and denote by M the free (non-commutative) monoid on such letters, whose elements are called monomials and where 1 denotes the empty string. The free (multivariate) skew polynomial ring over F in the variables x1, x2,...,xn with morphism σ and derivation δ is the left vector space F[x; σ, δ] with left basis M and product given by appending monomials and the rule

xa = σ(a)x + δ(a), (1) for a ∈ F, which are called constants. Here, we denote

    x = (x1, x2, ..., xn)^T ∈ M^n.

Therefore, (1) is a short form of the equations

    x_i a = ∑_{j=1}^{n} σ_{i,j}(a) x_j + δ_i(a),    (2)

for i = 1, 2, ..., n, where σ_{i,j} and δ_i denote the component functions of σ and δ, respectively. Each element F ∈ F[x; σ, δ] is called a free multivariate skew polynomial, or simply skew polynomial, and can be uniquely written as

    F = ∑_{m∈M} F_m m,    (3)

where F_m ∈ F, for m ∈ M, are called the coefficients of F, which are all zero except for a finite number of them. Define the degree of a monomial m ∈ M as its length as a string, and define the degree of a non-zero skew polynomial F ∈ F[x; σ, δ], denoted by deg(F), as the maximum degree of a monomial m ∈ M such that F_m ≠ 0. We also define deg(0) = −∞.

Following Ore's line of thought [38], it was shown in [31, Theorem 1] that pairs (σ, δ) as in the previous definition correspond bijectively, via the rule (1), with products in the left vector space R with basis M that turn R into a ring with unit 1, where products of monomials consist of appending them and where

    deg(F G) = deg(F) + deg(G),    (4)

for all F, G ∈ R. The notation R = F[x; σ, δ] emphasizes the ring structure on R given by the pair (σ, δ) via (1). If we denote by Id : F −→ Fn×n the ring morphism given by Id(a) = aI, for a ∈ F, where I ∈ Fn×n is the n × n identity matrix, then F[x; Id, 0] is the free conventional polynomial ring in the variables x1, x2, ..., xn (which do not commute with each other but commute with constants), as in [8, Section 0.11] and [23, Example (1.2)]. Note that this is the only skew polynomial ring where constants and variables commute. In other words, free multivariate skew polynomial rings are nothing but free multivariate polynomial rings where the commutativity axiom is dropped. Finally, an interesting point to make is that all of the results in this paper yield non-trivial results for free conventional polynomial rings over a division ring F by setting σ = Id and δ = 0. The free conventional polynomial ring with F being a field (i.e., commutative) was extensively studied in [8].
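The product defined by rule (1) can be made concrete with a minimal sketch (not code from the paper). It implements the special case of a diagonal morphism σ = diag(id, conj) with zero derivation over F = Q(√2), where elements of F are integer pairs (a, b) standing for a + b√2 and conj is the automorphism a + b√2 ↦ a − b√2; in this case x_i a = σ_i(a) x_i, so constants simply move left through monomials.

```python
# Minimal sketch (not from the paper): the product in F[x1, x2; sigma, 0] for
# the special case sigma = diag(id, conj), delta = 0, over F = Q(sqrt(2)).
# Elements of F are pairs (a, b) meaning a + b*sqrt(2).

def fadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

def fmul(u, v):  # (a + b*r)(c + d*r) = (ac + 2bd) + (ad + bc)*r, r = sqrt(2)
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

SIGMA = {1: lambda u: u,               # sigma_1 = identity
         2: lambda u: (u[0], -u[1])}   # sigma_2 = conj

def skew_mul(F, G):
    """Multiply skew polynomials, stored as dicts mapping monomials (tuples of
    variable indices, e.g. (1, 2) for x1*x2) to coefficients: monomials are
    appended, and constants move left by the rule x_i * a = sigma_i(a) * x_i."""
    H = {}
    for m1, c1 in F.items():
        for m2, c2 in G.items():
            c = c2
            for i in reversed(m1):     # push c2 to the left through m1
                c = SIGMA[i](c)
            m = m1 + m2
            H[m] = fadd(H.get(m, (0, 0)), fmul(c1, c))
    return {m: c for m, c in H.items() if c != (0, 0)}

# x2 * (sqrt(2) * x1) = -sqrt(2) * x2 * x1, since sigma_2(sqrt(2)) = -sqrt(2):
print(skew_mul({(2,): (1, 0)}, {(1,): (0, 1)}))  # {(2, 1): (0, -1)}
```

Degrees add under this product, as in (4), and associativity follows because pushing a constant through a product of monomials applies the σ_i letter by letter.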
Multivariate polynomial rings over general division rings, but where variables commute with each other, were studied in [2]. As mentioned in [31, Section 2], the free multivariate skew polynomial ring F[x; σ, δ] can also be characterized by a universal property using the rule (1). We now state this property, leaving its proof to the reader.

Lemma 2. Let R be a left F-algebra such that there exist elements y1, y2, ..., yn ∈ R satisfying

    y_i a = ∑_{j=1}^{n} σ_{i,j}(a) y_j + δ_i(a),

for i = 1, 2, ..., n, for all a ∈ F. Then there exists a unique left F-algebra morphism ϕ : F[x; σ, δ] −→ R such that ϕ(x_i) = y_i, for i = 1, 2, ..., n.

We conclude the subsection by giving a few examples of ring morphisms and derivations to show how we recover classical objects but also less classical ones.

Example 3 (Diagonal and triangular morphisms). A ring morphism σ : F −→ Fn×n satisfies that σ_{i,j}(a) = 0, for all a ∈ F and all i ≠ j if, and only if, there exist ring endomorphisms σ_i : F −→ F, for i = 1, 2, ..., n, such that

    σ(a) = diag(σ1(a), σ2(a), ..., σn(a)),

for all a ∈ F. It is trivial to check that the σ-derivations in this case are precisely those such that δ_i is a σ_i-derivation, for i = 1, 2, ..., n. In this case, we say that σ is a diagonal morphism and we denote it by σ = diag(σ1, σ2, ..., σn). Similarly for triangular morphisms.

Example 4 (Similar morphisms and derivations). It is easy to see that, given an invertible matrix A ∈ Fn×n and a ring morphism σ : F −→ Fn×n, the similar or conjugate map τ = AσA^{−1} : F −→ Fn×n, given by τ(a) = Aσ(a)A^{−1} for a ∈ F, is also a ring morphism. Furthermore, its derivations are of the form Aδ : F −→ Fn, for a σ-derivation δ : F −→ Fn. We say that a ring morphism σ : F −→ Fn×n is diagonalizable (resp. triangulable) if it is similar to a diagonal (resp. triangular) ring morphism. It was shown in [30, Theorem 2] that all ring morphisms σ : F −→ Fn×n are diagonalizable if F is a finite field. However, this is far from the case in general, as we will next show.

Example 5 (Wild example I). Let p be a prime number and let F = Fp(z) be the field of rational functions over Fp. The ring morphism σ : Fp(z) −→ Fp(z)^{2×2} given by

    σ(f(z)) = [[f(z), δ(f(z))], [0, f(z)]],

for f(z) ∈ Fp(z), where δ = d/dz is the usual standard derivation in Fp(z), is upper triangular but is not diagonalizable: Simply note that the subfield of Fp(z) fixed by σ is Fp(z^p), but there is no field endomorphism of Fp(z), other than the identity, leaving the elements in Fp(z^p) fixed.

Example 6 (Wild example II). Let F = F4(z) and let γ ∈ F4^* be a primitive element (F4^* = {1, γ, γ^2} and γ^3 = 1). Define the matrix

    Z = [[0, z], [γz, 0]] ∈ F4(z)^{2×2}.

Then Z is transcendental over F4, that is, there is no non-zero F ∈ F4[x] such that F(Z) = 0. Therefore F4[Z] ⊆ F4(z)^{2×2} is an integral domain. Moreover, any non-zero element in F4[Z] has a matrix inverse (that is, its inverse lies inside the ring F4(z)^{2×2}). Hence there exists a unique ring morphism σZ : F4(z) −→ F4(z)^{2×2} such that σZ(z) = Z. In general, it is given by

    σZ(f(z)) = [[τ(f(z)), γ∂(f(z))], [γ^2 ∂(f(z)), τ(f(z))]] ∈ F4[z]^{2×2},

where τ(f(z)) ∈ F4[z] and ∂(f(z)) ∈ F4[z] are formed by the even and odd terms in f(γ^2 z) ∈ F4[z], respectively, for f(z) ∈ F4[z], and extended uniquely to F4(z). The subfield of elements in F4(z) fixed by σZ is F4(z^6). Therefore, σZ is neither diagonalizable nor triangulable, since there is no field endomorphism or derivation of F4(z), or a combination of both, whose subfield of fixed and/or annihilated elements is F4(z^6). To see this, note that any derivation δ of F4(z) is of the form δ = δ(z) d/dz, where δ(z) ∈ F4(z) and d/dz is the usual standard derivation. For δ(z) ≠ 0, the subfield of F4(z) of elements annihilated by δ is F4(z^2), but again, no field endomorphism of F4(z^2), other than the identity, leaves the elements in F4(z^6) fixed.

Finally, it is worth showing how multiplication in F[x1, x2; σZ, 0] works. Letting k be a positive integer and setting a = z^{2k−1} and a = z^{2k} in (2) we have, respectively, that

    x1 z^{2k−1} = γ^{k−1} z^{2k−1} x2   and   x2 z^{2k−1} = γ^k z^{2k−1} x1,

    x1 z^{2k} = γ^k z^{2k} x1   and   x2 z^{2k} = γ^k z^{2k} x2.
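As a quick machine check of these relations (an illustrative sketch, not from the paper), one can verify Z^2 = γ z^2 I directly, from which Z^{2k−1} = (γz^2)^{k−1} Z and Z^{2k} = (γz^2)^k I, and hence the displayed identities, follow. Here F4 = {0, 1, γ, γ^2} is encoded as the integers 0–3 (the bit pairs of a + bγ), and each matrix entry is a dict mapping a power of z to its F4 coefficient.

```python
# Sanity check (sketch): Z = [[0, z], [gamma*z, 0]] satisfies Z^2 = gamma*z^2*I.
# F4 is encoded as 0..3; addition is XOR (characteristic 2).

def f4mul(u, v):
    u1, u0, v1, v0 = u >> 1, u & 1, v >> 1, v & 1
    t2, t1, t0 = u1 & v1, (u1 & v0) ^ (u0 & v1), u0 & v0
    t1, t0 = t1 ^ t2, t0 ^ t2          # reduce gamma^2 = gamma + 1
    return (t1 << 1) | t0

def pmul(p, q):                        # entries are dicts {power of z: F4 coeff}
    r = {}
    for k1, c1 in p.items():
        for k2, c2 in q.items():
            r[k1 + k2] = r.get(k1 + k2, 0) ^ f4mul(c1, c2)
    return {k: c for k, c in r.items() if c}

def padd(p, q):
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) ^ c
    return {k: c for k, c in r.items() if c}

def mmul(A, B):                        # 2x2 matrix product over F4[z]
    return [[padd(pmul(A[i][0], B[0][j]), pmul(A[i][1], B[1][j]))
             for j in range(2)] for i in range(2)]

GAMMA = 2
Z = [[{}, {1: 1}], [{1: GAMMA}, {}]]   # Z = [[0, z], [gamma*z, 0]]
Z2 = mmul(Z, Z)
print(Z2)                              # [[{2: 2}, {}], [{}, {2: 2}]] = gamma*z^2*I
```

Computing Z^3 = mmul(Z2, Z) gives top-right entry γz^3 and bottom-left entry γ^2 z^3, matching the case k = 2 of the first displayed identity.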

Although Examples 5 and 6 may seem pathological, the field Fq(z), where q is a power of a prime number (typically a power of 2), appears naturally in engineering applications, such as convolutional error-correcting codes [9]. Similarly, algebraic extensions of Fq(z) form the basis of algebraic-geometry codes [13].

The idea behind free skew polynomials as in Definition 1 is that they admit a natural arithmetic evaluation map, which we will call skew evaluation and which is guaranteed by the uniqueness of remainders in Euclidean division. The following definition is [31, Definition 9] and is consistent due to [31, Lemma 5].

Definition 7 (Skew evaluation [31]). For a = (a1, a2, ..., an) ∈ F^n and a skew polynomial F ∈ F[x; σ, δ], we define its evaluation, denoted by F(a) = E^S_a(F), as the unique element F(a) ∈ F such that F − F(a) ∈ (x1 − a1, x2 − a2, ..., xn − an). Given a set Ω ⊆ F^n, we define the skew evaluation map over Ω as the left linear map

    E^S_Ω : F[x; σ, δ] −→ F^Ω,

where f = E^S_Ω(F) ∈ F^Ω is given by f(a) = F(a), for all a ∈ Ω and all F ∈ F[x; σ, δ]. Note that the skew evaluation map depends on the pair (σ, δ). This will be the case with most of the objects defined from now on. However, we will not write such a dependency for brevity, unless it is necessary to avoid confusion.

The main motivation behind free multivariate skew polynomials is that Definition 7 is consistent. In contrast, if variables are allowed to commute (as considered, for instance, in [2]), then such a definition is not consistent unless F is a field, σ = Id and δ = 0 (see [31, Remark 7]). Definition 7 would not always be consistent either for iterated skew polynomials (see [31, Remark 8] and [12, Example 3.7]).

2.2 Linearized polynomials and linearized evaluation

We now turn to linearized (multivariate) skew polynomials. The idea is to turn skew polynomials into linear maps by giving an alternative evaluation map. Remarkably, both evaluation maps are related by a simple formula involving the conjugacy relation (Theorem 1), which was proven in a more general form for the univariate case in [27, Theorem 2.8] (see also [29, Lemma 24]). However, one major difference arises in the multivariate case. On the one hand, skew polynomials are evaluated on an n-dimensional affine point over F. On the other hand, for each representative of a conjugacy class (see Definition 18), linearized skew polynomials are evaluated on an element of the division ring F.

Definition 8 (Linearized multivariate skew polynomials). Given a ring morphism σ : F −→ Fn×n, a σ-derivation δ : F −→ Fn, a point a ∈ Fn and a monomial m ∈ M, we define the operator D_a^m : F −→ F recursively on m ∈ M as follows. We start by defining D_a^1 = Id. Next, if D_a^m is defined for m ∈ M, then we define

    D_a^{x m}(β) = (D_a^{x1 m}(β), D_a^{x2 m}(β), ..., D_a^{xn m}(β))^T = σ(D_a^m(β)) a + δ(D_a^m(β)) ∈ F^n,

for all β ∈ F. For convenience, we denote by D_a : F −→ F^n the operator given by

    D_a(β) = (D_a^{x1}(β), D_a^{x2}(β), ..., D_a^{xn}(β))^T = σ(β) a + δ(β) ∈ F^n,

for all β ∈ F. Hence, by definition, we have that

    D_a^{x m} = D_a ∘ D_a^m,    (5)

for all m ∈ M and all a ∈ F^n. We then define the left vector space of linearized (multivariate skew) polynomials F[D_a] over F, with variables x1, x2, ..., xn, morphism σ, derivation δ and conjugacy representative a, as the left vector space generated by the set of operators D_a^M = {D_a^m | m ∈ M}, which need not be a basis nor be in a one-to-one correspondence with M. We define the left F-linear (surjective) map

    φ_a : F[x; σ, δ] −→ F[D_a],  ∑_{m∈M} F_m m ↦ ∑_{m∈M} F_m D_a^m,    (6)

and we denote F^D = φ_a(F), for all F ∈ F[x; σ, δ], omitting the dependency on a for brevity. Observe that classical (univariate) linearized polynomials over finite fields [28, Chapter 3] as considered originally by Moore [33] and Ore [37] are the elements in the ring F[D_a] from Definition 8 whenever F is a finite field, n = 1, δ = 0 and a = 1. See Example 14 below. From the definition itself, linearized polynomials admit a natural evaluation, which we will call linearized evaluation.
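The recursion (5) and the map (6) can be made concrete with a small sketch (assumptions, not from the paper: F = Q(√2) represented by integer pairs (a, b) meaning a + b√2, n = 2, σ = diag(id, conj), δ = 0, and the arbitrarily chosen point a = (1, √2)).

```python
# Sketch of the operators D_a^m from Definition 8, via the recursion (5).
# F = Q(sqrt(2)) as pairs (a, b); sigma = diag(id, conj); delta = 0.

def fmul(u, v):
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def conj(u):                              # a + b*sqrt(2) -> a - b*sqrt(2)
    return (u[0], -u[1])

POINT = [(1, 0), (0, 1)]                  # the point a = (1, sqrt(2)) in F^2

def D(beta):                              # D_a(beta) = sigma(beta)*a + delta(beta)
    return [fmul(beta, POINT[0]), fmul(conj(beta), POINT[1])]

def Dm(m, beta):
    """D_a^m(beta) for a monomial m = (i1, ..., ik), meaning x_{i1}...x_{ik}:
    the rightmost letter acts first, by D_a^{x_i m'} = (D_a o D_a^{m'})_i."""
    for i in reversed(m):
        beta = D(beta)[i - 1]
    return beta

def lin_eval(F, beta):                    # evaluate F^D = sum_m F_m D_a^m at beta
    tot = (0, 0)
    for m, c in F.items():
        w = fmul(c, Dm(m, beta))
        tot = (tot[0] + w[0], tot[1] + w[1])
    return tot

print(Dm((2, 2), (1, 1)))                 # D_a^{x2 x2}(1 + sqrt(2)) = (-2, -2)
```

Note that `lin_eval` is exactly the linearized evaluation of φ_a(F) defined next, applied to the dict representation of F used above.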

Definition 9 (Linearized evaluation). Given a ∈ F^n, for a linearized polynomial F^D = ∑_{m∈M} F_m D_a^m ∈ F[D_a], we define its evaluation over β ∈ F as

    F^D(β) = ∑_{m∈M} F_m D_a^m(β) ∈ F.

Given a set Ω ⊆ F, we define the linearized evaluation map over Ω as the left linear map

    E^L_Ω : F[D_a] −→ F^Ω,

where f = E^L_Ω(F^D) ∈ F^Ω is given by f(β) = F^D(β), for all β ∈ Ω and all F^D ∈ F[D_a]. As announced earlier, linearized polynomials are right linear over certain division subrings of F, called centralizers, which were defined in [25, Equation (3.1)] in the univariate case.

Definition 10 (Centralizers). Given a ∈ F^n, we define its centralizer as

Ka = {β ∈ F | Da(β)= aβ}⊆ F. The proof of the following lemma is straightforward.

Lemma 11. For all a ∈ F^n, it holds that K_a ⊆ F is a division subring of F. Moreover, for F^D ∈ F[D_a], the map β ↦ F^D(β), for β ∈ F, is right linear over K_a. That is, for all β, γ ∈ F and all λ ∈ K_a, it holds that

    F^D(β + γ) = F^D(β) + F^D(γ)   and   F^D(βλ) = F^D(β)λ.
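Lemma 11 can be observed numerically in a toy case (a sketch, not from the paper, under the assumptions F = Q(√2), n = 1, σ = conj, δ = 0 and a = 1): then D_a = conj, the centralizer is K_1 = Q, and F^D(β) = β + conj(β) is right linear over Q but not over all of F.

```python
# Numerical illustration of Lemma 11 (and Proposition 12 below) for
# F = Q(sqrt(2)) as pairs (a, b), sigma = conj, delta = 0, a = 1.

def fmul(u, v):
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def conj(u):                          # a + b*sqrt(2) -> a - b*sqrt(2)
    return (u[0], -u[1])

def FD(beta):                         # F^D = D^1 + D^x, i.e. beta + conj(beta)
    c = conj(beta)
    return (beta[0] + c[0], beta[1] + c[1])

beta = (1, 1)                         # beta = 1 + sqrt(2)
lam = (3, 0)                          # lam = 3 lies in K_1 = Q
assert FD(fmul(beta, lam)) == fmul(FD(beta), lam)   # right K_1-linearity holds
r2 = (0, 1)                           # sqrt(2) does not lie in K_1 ...
assert FD(fmul(beta, r2)) != fmul(FD(beta), r2)     # ... and linearity fails
print(FD(beta), FD(fmul(beta, r2)))   # (2, 0) (4, 0)
```

The failure for λ = √2 is exactly the maximality statement of Proposition 12: no division subring larger than K_a can make all of F[D_a] right linear.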

Thus, we have provided a type of evaluation that turns skew polynomials into linear maps over a certain division subring. Interestingly, centralizers are the largest division subrings over which linearized polynomials are right linear. We will use this result later, for instance to establish P-Galois correspondences (see Subsection 6.3), since it allows us to recover K_a from F[D_a].

Proposition 12. For all a ∈ F^n, it holds that

    K_a = {λ ∈ F | F^D(βλ) = F^D(β)λ, ∀β ∈ F, ∀F^D ∈ F[D_a]}.

In other words, Ka is the largest division subring K of F such that every linearized polynomial in F[Da] is right linear over K.

Proof. The inclusion ⊆ is Lemma 11. For the reversed inclusion, choose β = 1 and F^D = D_a^{x_i}, for some i = 1, 2, ..., n. Then

    D_a^{x_i}(λ) = D_a^{x_i}(1) λ = a_i λ.

Thus D_a(λ) = aλ and hence λ ∈ K_a by definition.

In the next subsection, we connect both types of evaluation. We conclude this subsection with a few examples.

Example 13 (Group algebras). Assume that G is a finite group (commutative or not) of ring automorphisms of F generated by σ1, σ2, ..., σn. Consider σ = diag(σ1, σ2, ..., σn) (see Example 3), δ = 0 and a = 1. Note that all elements in G are of the form D_1^m = m(σ), where m ∈ M and m(σ) denotes the symbolic evaluation of m in (σ1, σ2, ..., σn) (for instance, m(σ) = σ1σ2 if m = x1x2). This is due to the fact that, since G is finite, there exists a positive integer m_i such that σ_i^{m_i} = Id (as the set {σ_i^m | m ∈ N} ⊆ G is finite), thus σ_i^{−1} = σ_i^{m_i−1} = (x_i^{m_i−1})(σ), for i = 1, 2, ..., n. Therefore, we have that F[D_1] = F[G] is the group algebra of G over F. Finally, observe that K_1 = F^G = {β ∈ F | τ(β) = β, ∀τ ∈ G} (see also Example 64).

Example 14 (Linearized polynomials over finite fields). Let the assumptions and notation be as in Example 13 above, and further let F = F_{q^m}, for a positive integer m and a prime power q. Set n = 1 and define σ = σ1 by σ(β) = β^{q^r}, for all β ∈ F_{q^m}, for some integer r ≥ 1 coprime with m. In such a case, F[D_1] recovers the classical ring of linearized polynomials over a finite field [33, 37] [28, Chapter 3]:

    F[D_1] ≅ L_{q^r} F_{q^m}[x] = {F_0 x + F_1 x^{q^r} + ··· + F_d x^{q^{rd}} ∈ F_{q^m}[x] | d ∈ N, F_0, F_1, ..., F_d ∈ F_{q^m}}.
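A small sketch (assuming the representation F_9 = F_3[i] with i^2 = 2, and q = 3, r = 1; not code from the paper) illustrates the isomorphism above: D_1^{x^k}(β) = β^{3^k}, so elements of F[D_1] act as classical q-polynomials, and the centralizer K_1 is the fixed field F_3 of the Frobenius.

```python
# Sketch: F_9 = F_3[i] with i^2 = 2, sigma = Frobenius beta -> beta^3,
# n = 1, delta = 0, a = 1. Elements are pairs (a, b) mod 3 meaning a + b*i.

def f9mul(u, v):
    (a, b), (c, d) = u, v
    return ((a * c + 2 * b * d) % 3, (a * d + b * c) % 3)

def frob(u):                       # the Frobenius beta -> beta^3
    return f9mul(f9mul(u, u), u)

F9 = [(a, b) for a in range(3) for b in range(3)]

# The centralizer K_1 is the fixed field of the Frobenius, namely F_3:
K1 = [u for u in F9 if frob(u) == u]
print(K1)                           # [(0, 0), (1, 0), (2, 0)]

def lin_eval(coeffs, beta):
    """Evaluate F^D = sum_k coeffs[k] * D_1^{x^k} at beta, i.e. the
    q-polynomial coeffs[0]*beta + coeffs[1]*beta^3 + coeffs[2]*beta^9 + ..."""
    tot, v = (0, 0), beta
    for c in coeffs:
        w = f9mul(c, v)
        tot = ((tot[0] + w[0]) % 3, (tot[1] + w[1]) % 3)
        v = frob(v)
    return tot

print(lin_eval([(1, 0), (0, 1)], (1, 1)))   # beta + i*beta^3 at 1+i: (2, 2)
```

Right K_1-linearity here is the familiar F_3-linearity of q-polynomials, since λ^3 = λ for λ ∈ F_3.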

Example 15 (Algebra of derivations). Assume that δ1, δ2, ..., δn : F −→ F are standard derivations (i.e., Id-derivations) and let ∇ = {m(δ) | m ∈ M}, where m(δ) is a symbolic evaluation as in the previous example. Consider also σ = Id and a = 0. Similarly to the previous example, it holds that D_0^m = m(δ), for all m ∈ M, thus F[D_0] = F[∇] is the algebra of derivatives ∇ over F. Finally, observe that K_0 = F^δ = {β ∈ F | ∂(β) = 0, ∀∂ ∈ ∇} (see also Example 65).

Example 16 (The case of σi-derivations). The previous example could be trivially extended to the case where δi is a σi-derivation, for i =1, 2,...,n.

2.3 Connecting both evaluations

We now give the main connection between skew evaluation and linearized evaluation. This result extends the last part of [27, Theorem 2.8] from the univariate to the multivariate case. See also

[29, Lemma 24]. It is worth giving a meaningful proof of the connection between both evaluations, which requires the concepts of norm and conjugacy, which we will use again later in the paper. The following formula for the skew evaluation of monomials was given in [31, Theorem 2] and motivates the definition of multivariate norms.

Lemma 17 (Multivariate norms [31]). Given a monomial m ∈ M and a point a ∈ F^n, denote by N_m(a) = E^S_a(m) ∈ F the evaluation of the skew monomial m at a. It holds that

    N_{x m}(a) = (N_{x1 m}(a), N_{x2 m}(a), ..., N_{xn m}(a))^T = σ(N_m(a)) a + δ(N_m(a)) ∈ F^n,    (7)

or in other words, it holds that

    N_{x m}(a) = D_a(N_m(a)).    (8)
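The recursion (7) is easy to implement. The following sketch (not from the paper; it assumes F = Q(√2) as integer pairs, n = 2, σ = diag(id, conj), δ = 0, and an arbitrary point a) computes multivariate norms and, with them, skew evaluations E^S_a(F) = ∑_m F_m N_m(a).

```python
# Sketch: multivariate norms N_m(a) via (7), and the skew evaluation of
# Definition 7 as E^S_a(F) = sum_m F_m N_m(a), over F = Q(sqrt(2)).

def fmul(u, v):
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

SIGMA = {1: lambda u: u, 2: lambda u: (u[0], -u[1])}   # diag(id, conj)

def norm(m, a):
    """N_m(a) for a monomial m = (i1, ..., ik): starting from N_1(a) = 1,
    prepend letters via N_{x_i m'}(a) = sigma_i(N_{m'}(a)) * a_i."""
    val = (1, 0)
    for i in reversed(m):
        val = fmul(SIGMA[i](val), a[i - 1])
    return val

def skew_eval(F, a):                  # F is a dict monomial -> coefficient
    tot = (0, 0)
    for m, c in F.items():
        w = fmul(c, norm(m, a))
        tot = (tot[0] + w[0], tot[1] + w[1])
    return tot

a = ((1, 1), (0, 1))                  # the point a = (1 + sqrt(2), sqrt(2))
print(norm((2, 1), a))                # sigma_2(a_1)*a_2 = (1 - sqrt(2))*sqrt(2) = (-2, 1)
```

Note that `skew_eval` is left linear in F but, unlike in the commutative case, it is not multiplicative on products of skew polynomials; the correct product rule is the subject of the conjugacy machinery below.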

By choosing n = 1 and δ = 0, the previous maps N_m recover the concept of norm (or "truncated norm"). For this reason, we will call N_m(a) the mth (multivariate) norm of a.

We next revisit the concept of conjugacy. The following definition is [31, Definition 11].

Definition 18 (Conjugacy [31]). Given a ∈ F^n and β ∈ F^*, we define the conjugate of a with respect to β, which is called exponent, as

    a^β = D_a(β) β^{−1} ∈ F^n.    (9)

We give the exponential notation a^β for simplicity and for consistency with previous notation (see [22, 24, 25, 26, 31]). Recall from [31, Lemma 12] that conjugacy defines an equivalence relation in F^n, thus a partition of F^n into conjugacy classes, which will be denoted by

    C(a) = {D_a(β) β^{−1} | β ∈ F^*} ⊆ F^n,    (10)

for a ∈ F^n. Linearized polynomials and centralizers over conjugate points can be connected easily as follows. The proof is straightforward.

Lemma 19. Let a, b ∈ F^n and γ ∈ F^* be such that b = a^γ. Then it holds that

    D_b(β) γ = D_a(βγ)   and   K_b = γ K_a γ^{−1},

for all β ∈ F. In particular, if F is commutative, then K_b = K_a.

We may now prove the connection between linearized and skew evaluations. This result can be seen as an explicit linearization of the map β ↦ N_m(a^β) that maps an exponent β to the mth norm of the conjugate a^β of a.

Theorem 1. Given a ∈ F^n, β ∈ F^* and F ∈ F[x; σ, δ], and denoting D = D_a, it holds that

    F(D(β) β^{−1}) = F^D(β) β^{−1}.

In particular, for all monomials m ∈ M, we have that

    N_m(D(β) β^{−1}) = D^m(β) β^{−1}.    (11)

Proof. By linearity, we only need to prove (11), for all m ∈ M. The case m = 1 is trivial. Assume now that it is true for a given m ∈ M. Combining Equations (5) and (8) with Lemma 19, and denoting b = a^β = D_a(β) β^{−1}, we have that

    N_{x m}(b) = D_b(N_m(b)) = D_a(N_m(b) β) β^{−1} = D_a(D_a^m(β)) β^{−1} = D_a^{x m}(β) β^{−1},

and we are done.
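Theorem 1 can be checked numerically in a univariate toy case (a sketch with exact rational arithmetic, not from the paper; the assumptions F = Q(√2), σ = conj, δ = 0, a = √2, β = 1 + √2 and the monomials m = x^k are arbitrary illustrative choices).

```python
# Numerical check of (11): N_m(D(beta)*beta^{-1}) = D^m(beta)*beta^{-1}.
from fractions import Fraction as Fr

def fmul(u, v):
    (a, b), (c, d) = u, v
    return (a * c + 2 * b * d, a * d + b * c)

def finv(u):                          # (a + b*r)^{-1} = (a - b*r)/(a^2 - 2b^2)
    (a, b) = u
    n = Fr(a * a - 2 * b * b)
    return (Fr(a) / n, Fr(-b) / n)

def conj(u):
    return (u[0], -u[1])

a = (0, 1)                            # a = sqrt(2)

def D(beta):                          # D_a(beta) = sigma(beta)*a = conj(beta)*a
    return fmul(conj(beta), a)

def Dm(k, beta):                      # D_a^{x^k}(beta): k-fold iteration of D_a
    for _ in range(k):
        beta = D(beta)
    return beta

def norm(k, b):                       # N_{x^k}(b), via N_{x^{j+1}} = conj(N_{x^j})*b
    v = (1, 0)
    for _ in range(k):
        v = fmul(conj(v), b)
    return v

beta = (1, 1)                         # beta = 1 + sqrt(2)
b = fmul(D(beta), finv(beta))         # the conjugate a^beta = 4 - 3*sqrt(2)
for k in range(6):
    assert norm(k, b) == fmul(Dm(k, beta), finv(beta))
print(b)                              # the exponent-beta conjugate of a
```

The point of the check is that the left-hand side uses only the norm recursion (7) at the conjugate point b = a^β, while the right-hand side uses only the operator recursion (5) at the original point a.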

3 Linearizing sets of roots and Lagrange interpolation

The structure of sets of roots plays a central role in the study of conventional polynomials. In particular, Lagrange interpolation behaves well when the evaluation points form a "basis" of some set of roots, meaning that they can be differentiated by taking "independent" values on different polynomials. This is also true for skew polynomials [31, Theorem 4] and leads to the concepts of P-closed sets, P-independence and P-bases, where P stands for polynomial. Such concepts were introduced by Lam and Leroy in [22, 24, 25] for n = 1, and in [31] for the multivariate case. By looking at Theorem 1, we see that after fixing a conjugacy representative, the set of roots of a skew polynomial in that conjugacy class corresponds to a certain right vector subspace of F. Furthermore, by Lemma 19, there is a simple way of changing the conjugacy representative. This suggests a linear structure of sets of roots on each conjugacy class separately. In this section, we will give such a linearized structure of the sets of roots of skew polynomials and linearized polynomials. In Section 3.1, we revisit the concepts of P-closed sets, P-independence, P-bases and skew Lagrange interpolation from [31]. In Section 3.2, we show that P-independence in one conjugacy class corresponds to right linear independence over the corresponding centralizer. In Section 3.3, we show that, in general, P-independent sets correspond simply to disjoint unions of right linearly independent elements over the different centralizers. We will also give descriptions in terms of lattices. The results in Subsections 3.2 and 3.3 extend the important results [25, Theorem 4.5] and [22, Theorem 23], respectively.

3.1 P-closed sets and skew Lagrange interpolation

We revisit the concepts of P-closedness and skew Lagrange interpolation from [31], all of which were previously introduced in [22, 24, 25] for n = 1. As in classical algebraic geometry, given a set A ⊆ F[x; σ, δ], we define its set of roots, or zero set for brevity, as

Z(A) = {a ∈ F^n | F(a) = 0, ∀F ∈ A}.

Conversely, given a set Ω ⊆ F^n, we define its associated ideal as

I(Ω) = {F ∈ F[x; σ, δ] | F(a) = 0, ∀a ∈ Ω},

which is a left ideal of F[x; σ, δ] by Definition 7. The following definition is [31, Definition 16].

Definition 20 (P-closed sets [31]). Given a subset Ω ⊆ F^n, we define its P-closure as Ω̄ = Z(I(Ω)), and we say that Ω is P-closed if Ω̄ = Ω.

P-closed sets are precisely the sets of roots of sets of skew polynomials [31, Proposition 15, Item 8]. Furthermore, the P-closure of a set Ω ⊆ F^n is the smallest P-closed set in F^n containing Ω [31, Lemma 17]. This naturally leads to the following concepts, given in [31, Definitions 22, 23 & 24], respectively.

Definition 21 (P-generators [31]). Given a P-closed set Ω ⊆ F^n, we say that G ⊆ Ω generates Ω if Ḡ = Ω, and G is then called a set of P-generators for Ω. We say that Ω is finitely generated if it has a finite set of P-generators.

Definition 22 (P-independence [31]). We say that a ∈ F^n is P-independent from Ω ⊆ F^n if it does not belong to Ω̄. A set Ω ⊆ F^n is called P-independent if every a ∈ Ω is P-independent from Ω \ {a}. P-dependent means not P-independent.

Definition 23 (P-bases [31]). Given a P-closed set Ω ⊆ F^n, we say that a subset B ⊆ Ω is a P-basis of Ω if it is P-independent and a set of P-generators of Ω.

P-bases are minimal sets of P-generators of a P-closed set [31, Proposition 25] and, for a finitely generated P-closed set, they also correspond to its maximal P-independent subsets [31, Lemma 36]. Another important result is the following, which combines [31, Corollary 26] and [31, Corollary 32].

Lemma 24 ([31]). If a P-closed set is finitely generated, then it admits a finite P-basis, and any two of its P-bases are finite and have the same number of elements.

Thus the following definition [31, Definition 33] is consistent.

Definition 25 (Ranks [31]). Given a finitely generated P-closed set Ω ⊆ F^n, we define its rank, denoted by Rk(Ω), as the size of any of its P-bases.

Moreover, P-closed subsets of finitely generated P-closed sets are in turn finitely generated [31, Corollary 37].

Lemma 26 ([31]). Let Ψ ⊆ Ω ⊆ F^n be P-closed sets. If Ω is finitely generated, then so is Ψ.

The main feature of P-bases of finitely generated P-closed sets is the following result on the existence and uniqueness of skew Lagrange interpolating polynomials, which is [31, Theorem 4].

Theorem 2 (Skew Lagrange interpolation [31]). Let Ω ⊆ F^n be a finitely generated P-closed set with finite P-basis B = {b_1, b_2, ..., b_M}, where M = Rk(Ω). The following hold:

1. If E^S_B(F) = E^S_B(G), then E^S_Ω(F) = E^S_Ω(G), for all F, G ∈ F[x; σ, δ].

2. For every a_1, a_2, ..., a_M ∈ F, there exists F ∈ F[x; σ, δ] such that deg(F) < M and F(b_i) = a_i, for all i = 1, 2, ..., M.

Definition 27 (Dual P-bases [31]). Given a finite P-basis B = {b_1, b_2, ..., b_M} of a P-closed set Ω ⊆ F^n, we say that a set of skew polynomials

B^* = {F_1, F_2, ..., F_M} ⊆ F[x; σ, δ]

is a dual P-basis of B if F_i(b_j) = δ_{i,j}, where δ_{i,j} denotes the Kronecker delta, for all i, j = 1, 2, ..., M.

Thus the following is an immediate consequence of skew Lagrange interpolation, and was given in [31, Corollary 31].

Corollary 28 ([31]). Any finite P-basis, with M = Rk(Ω) elements, of a P-closed set Ω admits a dual P-basis consisting of M skew polynomials of degree less than M. Moreover, any two dual P-bases of the same P-basis define the same skew polynomial functions over Ω.
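Skew Lagrange interpolation and dual P-bases can be tried out in a minimal computable instance. The sketch below is an illustrative assumption, not the paper's general setting: n = 1, F = ℚ(i) realized with Python complex numbers, σ = complex conjugation, δ = 0. In this toy case, B = {1, −1} turns out to be a P-basis of the conjugacy class of 1 (the norm-one elements of ℚ(i)).

```python
# Dual P-basis (Definition 27) and skew Lagrange interpolation (Theorem 2)
# in a toy univariate case: n = 1, F = Q(i) (Python complex), sigma =
# complex conjugation, delta = 0.  Illustrative assumptions only.

def sigma(z):
    return z.conjugate()

def norm(b, m):
    """Truncated norm N_m(b) = sigma^{m-1}(b) ... sigma(b) b, N_0 = 1."""
    out = 1
    for _ in range(m):
        out = sigma(out) * b
    return out

def ev(F, b):
    """Remainder-based skew evaluation F(b) = sum_i f_i N_i(b)."""
    return sum(f * norm(b, i) for i, f in enumerate(F))

# A dual P-basis of B = {1, -1}: F_1 = 1/2 + (1/2) x, F_2 = 1/2 - (1/2) x.
B = [1, -1]
F1 = [0.5, 0.5]
F2 = [0.5, -0.5]
for i, Fi in enumerate([F1, F2]):
    for j, bj in enumerate(B):
        assert abs(ev(Fi, bj) - (1 if i == j else 0)) < 1e-9

# Lagrange interpolation: F = a1*F1 + a2*F2 (left coefficients) has
# deg(F) < 2 = M and takes the prescribed values on B.
a1, a2 = 2 + 3j, 1 - 1j
F = [a1 * u + a2 * v for u, v in zip(F1, F2)]
assert abs(ev(F, 1) - a1) < 1e-9 and abs(ev(F, -1) - a2) < 1e-9

# Values on a P-basis determine values on the whole P-closure (item 1):
# x^2 - 1 vanishes on B and on every norm-one point, e.g. c = (-3+4i)/5.
c = (-3 + 4j) / 5
assert abs(ev([-1, 0, 1], c)) < 1e-9
print("dual P-basis and skew Lagrange interpolation verified")
```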

Another important consequence that we will use throughout the paper is the following left vector space isomorphism.

Corollary 29 ([31]). If {F_1, F_2, ..., F_M} is a dual P-basis of a finitely generated P-closed set Ω ⊆ F^n, then the natural projection map restricts to a left F-linear vector space isomorphism

⟨F_1, F_2, ..., F_M⟩^L_F ≅ F[x; σ, δ]/I(Ω).

Moreover, F_1, F_2, ..., F_M are left linearly independent over F, hence

dim^L_F(F[x; σ, δ]/I(Ω)) = Rk(Ω).

Finally, a powerful tool to relate conjugate points will be the so-called product rule, given first in [25, Theorem 2.7] in the univariate case, and in [31, Theorem 3] in general.

Theorem 3 (Product rule [31]). Given skew polynomials F, G ∈ F[x; σ, δ] and a ∈ F^n, if G(a) = 0, then (FG)(a) = 0, and if β = G(a) ≠ 0, then

(FG)(a) = F(a^β)G(a).
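The product rule can be checked numerically in the usual toy univariate case (illustrative assumptions: n = 1, F = ℚ(i) as Python complex numbers, σ = complex conjugation, δ = 0), with skew polynomials stored as coefficient lists.

```python
# Numerical check of the product rule (Theorem 3) in a toy univariate case:
# n = 1, F = Q(i) (Python complex), sigma = complex conjugation, delta = 0.
# Skew polynomials are coefficient lists [f_0, f_1, ...] with F = sum f_i x^i.

def sigma(z):
    return z.conjugate()

def sigma_pow(z, i):
    for _ in range(i):
        z = sigma(z)
    return z

def skew_mul(F, G):
    """Product in F[x; sigma]: the commutation rule x * c = sigma(c) * x."""
    out = [0] * (len(F) + len(G) - 1)
    for i, f in enumerate(F):
        for j, g in enumerate(G):
            out[i + j] += f * sigma_pow(g, i)
    return out

def norm(b, m):
    out = 1
    for _ in range(m):
        out = sigma(out) * b
    return out

def ev(F, b):
    """Remainder-based skew evaluation F(b) = sum_i f_i N_i(b)."""
    return sum(f * norm(b, i) for i, f in enumerate(F))

def conjugate(a, beta):
    return sigma(beta) * a / beta

# (FG)(a) = F(a^beta) G(a), with beta = G(a) != 0:
F = [1 + 1j, 2, 1j]
G = [3, 1 - 2j]
a = 2 + 1j
beta = ev(G, a)
assert beta != 0
lhs = ev(skew_mul(F, G), a)
rhs = ev(F, conjugate(a, beta)) * beta
assert abs(lhs - rhs) < 1e-9
print("product rule verified")
```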

3.2 Linearizing P-closed sets in one conjugacy class

In this subsection, we will give a linearized description of finitely generated P-closed sets that are generated by a subset of a single conjugacy class. As we will see, such finitely generated P-closed sets correspond to right linear subspaces of F over the corresponding centralizer. The results in this section extend [25, Theorem 4.5] from the univariate to the multivariate case.

The first important ingredient is the following equivalence between P-independence and right linear independence over a single conjugacy class.

Lemma 30. Let a, b_1, b_2, ..., b_M ∈ F^n and β_1, β_2, ..., β_M ∈ F^* be such that

b_i = a^{β_i} = D_a(β_i)β_i^{-1},

for i = 1, 2, ..., M. It holds that B = {b_1, b_2, ..., b_M} is P-independent if, and only if, B^D = {β_1, β_2, ..., β_M} is right linearly independent over K_a.

Proof. We first prove the direct implication, which is significantly easier. Assume that B is P-independent, but B^D is not right linearly independent over K_a. Let B^* = {F_1, F_2, ..., F_M} ⊆ F[x; σ, δ] be a dual P-basis of B (Definition 27), which exists by Corollary 28. We may assume without loss of generality that there exist λ_1, λ_2, ..., λ_{M−1} ∈ K_a such that

β_M = Σ_{i=1}^{M−1} β_i λ_i.

Therefore, by Lemma 11 and Theorem 1, denoting D = D_a, it holds that

β_M = F_M^D(β_M) = Σ_{i=1}^{M−1} F_M^D(β_i) λ_i = 0,

which is absurd since β_M ∈ F^* by hypothesis.

Conversely, assume that B^D is right linearly independent over K_a. We will prove by induction on M that B is P-independent. The case M = 1 is obvious since singleton sets are always

P-independent. Assume then that B′ = {b_1, b_2, ..., b_{M−1}} is P-independent but B is not P-independent. Then b_M ∈ B̄′, since otherwise B would be P-independent by [31, Lemma 36]. Let B′^* = {F_1, F_2, ..., F_{M−1}} be a dual P-basis of B′ (Corollary 28). Fix i = 1, 2, ..., M − 1 and define G_i = (x − b_i^{β_M})(β_M F_i) ∈ F[x; σ, δ]^n. It holds that G_i(b_j) = 0, for j = 1, 2, ..., M − 1, by Theorem 3. We deduce from b_M ∈ B̄′ and Theorem 2 that G_i(b_M) = 0. If F_i(b_M) ≠ 0, then

0 = G_i(b_M) = (b_M^{β_M F_i(b_M)} − b_i^{β_M}) β_M F_i(b_M) = D_{b_M}(β_M F_i(b_M)) − D_{b_i}(β_M) F_i(b_M),

by Theorem 3. By Lemma 19, we deduce that

0 = D_a(β_M F_i(b_M) β_M) β_M^{-1} F_i(b_M)^{-1} β_M^{-1} − D_a(β_M β_i) β_i^{-1} β_M^{-1}. (12)

Using the notation in Definition 18, a straightforward calculation (see [31, Lemma 12, Item 1]) shows that (12) is equivalent to

a^{β_M F_i(b_M) β_M} = a^{β_M β_i} ⟺ a^{β_i^{-1} F_i(b_M) β_M} = a,

which means that λ_i = β_i^{-1} F_i(b_M) β_M ∈ K_a, by Definition 10. Hence in all cases (F_i(b_M) = 0 or F_i(b_M) ≠ 0) we have that

F_i(b_M) = β_i λ_i β_M^{-1},

for some λ_i ∈ K_a. Next, if F = F_1 + F_2 + ··· + F_{M−1} ∈ F[x; σ, δ], we have by Definition 27 that

F(b_j) = 1,

for j = 1, 2, ..., M − 1. Since b_M ∈ B̄′, we deduce from Theorem 2 that F(b_M) = 1. Hence

1 = F(b_M) = Σ_{i=1}^{M−1} F_i(b_M) = Σ_{i=1}^{M−1} β_i λ_i β_M^{-1} ⟺ β_M = Σ_{i=1}^{M−1} β_i λ_i.

That is, β_M is right linearly dependent on β_1, β_2, ..., β_{M−1} over K_a, which is a contradiction.

The second important ingredient is to ensure that P-closed sets generated by a finite set inside a single conjugacy class remain contained in such a conjugacy class.

Lemma 31. Let G ⊆ F^n be a finite set. If b ∈ Ḡ, then b is conjugate to an element in G.

Proof. Let B = {b_1, b_2, ..., b_M} ⊆ G be a P-basis of Ḡ, which exists by [31, Proposition 25], and let {F_1, F_2, ..., F_M} be a dual P-basis of B (Corollary 28). There exists i = 1, 2, ..., M such that F_i(b) ≠ 0, since otherwise we deduce from Corollary 29 that F(b) = 0, for all F ∈ F[x; σ, δ], which is absurd. However, if G_i = (x − b_i)F_i ∈ F[x; σ, δ]^n, then G_i(b_j) = 0, for all j = 1, 2, ..., M, by Theorem 3. Since b ∈ B̄, we deduce from Theorem 2 and Theorem 3 that

0 = G_i(b) = (b^{F_i(b)} − b_i) F_i(b).

Therefore b^{F_i(b)} = b_i, hence b is conjugate to b_i ∈ G, and we are done.

We may now give the main result of this section, which gives a linearized description of P-closed sets generated by a finite subset of a single conjugacy class.

Theorem 4. Let a ∈ F^n. The following hold:

1. If G ⊆ C(a) is finite and Ω = Ḡ ⊆ F^n, then

Ω = {a^β | β ∈ Ω^D \ {0}} ⊆ C(a), (13)

for a finite-dimensional right vector space Ω^D ⊆ F over K_a.

2. Conversely, if Ω^D ⊆ F is a finite-dimensional right vector space over K_a, then Ω ⊆ C(a) given as in (13) is a finitely generated P-closed set.

Moreover, if Item 1 or 2 holds, then B^D is a right basis of Ω^D over K_a if, and only if, B = {a^β | β ∈ B^D} is a P-basis of Ω. In particular, we have that

Rk(Ω) = dim^R_{K_a}(Ω^D). (14)

Thus we deduce that the map Ω ↦ Ω^D is a bijection between finitely generated P-closed subsets of C(a) and finite-dimensional right vector subspaces of F over K_a.

Proof. Assume first the hypotheses in Item 1, and let B = {b_1, b_2, ..., b_M} ⊆ G be a minimal set of P-generators of Ω, hence a P-basis of Ω by [31, Proposition 25]. Let β_i ∈ F^* be such that b_i = a^{β_i}, for i = 1, 2, ..., M, which exist since G ⊆ C(a), and define

Ω^D = ⟨β_1, β_2, ..., β_M⟩^R_{K_a} ⊆ F.

First, we have that Ω ⊆ C(a) by Lemma 31. Now, the equality in (13) follows directly from the equivalence between P-independence and right linear independence inside the conjugacy class C(a) by Lemma 30 (analogously to the paragraph below), and Item 1 is proven.

Assume now the hypotheses in Item 2. Let B^D = {β_1, β_2, ..., β_M} be a right basis of Ω^D over K_a, and define b_i = a^{β_i} ∈ Ω, for i = 1, 2, ..., M. We will prove that Ω = B̄, for B = {b_1, b_2, ..., b_M}. Let b = a^β ∈ Ω, for some β ∈ Ω^D. By Lemma 30, we have that {b} ∪ B is P-dependent. Thus by [31, Lemma 36] we have that b ∈ B̄, and we conclude that Ω ⊆ B̄. Conversely, by Lemma 31, if b ∈ B̄, then b = a^β, for some β ∈ F^*. Again by Lemma 30, we have that β ∈ Ω^D, and we conclude that B̄ ⊆ Ω, and Item 2 is proven.

Finally, the claim on P-bases and right bases follows from Lemma 30 (analogously to the rest of this proof, see also [29, Corollary 27]), and we are done.

We may deduce the following important consequence. As we will show in Subsection 6.4, this consequence is a generalization of Hilbert's Theorem 90.

Corollary 32. Let a ∈ F^n. The conjugacy class C(a) ⊆ F^n is P-closed and finitely generated if, and only if, F is a finite-dimensional right vector space over K_a. In such a case, we have that

Rk(C(a)) = dim^R_{K_a}(F).

Proof. Assume that C(a) ⊆ F^n is a finitely generated P-closed set, and let Ω^D be as in Item 1 in Theorem 4, for Ω = C(a). Let B^D be a finite right basis of Ω^D. If β ∈ F^*, we have that a^β ∈ Ω, thus β is right linearly dependent on B^D by Lemma 30. Hence F = Ω^D, and F has finite right dimension over K_a. The converse is trivial from Item 2 in Theorem 4. The last equality in the corollary follows directly from (14).

We may also deduce the following important consequence. As we will show in Subsection 6.2, this consequence is a generalization of Artin's Theorem to prove Galois' Theorem. We will extend it to arbitrary finitely generated P-closed sets in Theorems 7 and 9, where we also consider the ring arithmetic of linearized polynomials.
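The Hilbert 90 flavor of Corollary 32 can be seen concretely in the toy case n = 1, F = ℚ(i) (Python complex numbers), σ = complex conjugation, δ = 0 and a = 1 (all illustrative assumptions). There K_1 is the fixed field ℚ, dim^R_ℚ(ℚ(i)) = 2, and C(1) is exactly the set of norm-one elements of ℚ(i), which is the classical Hilbert 90 statement for ℚ(i)/ℚ.

```python
# Illustration of Corollary 32 in a toy case: F = Q(i) (Python complex
# numbers), sigma = complex conjugation, delta = 0, a = 1.  Illustrative
# assumptions only.

def sigma(z):
    return z.conjugate()

def conjugate_of_one(beta):
    """1^beta = D_1(beta) * beta^{-1} = sigma(beta) / beta, a point of C(1)."""
    return sigma(beta) / beta

# Every conjugate of 1 is a root of x^2 - 1, whose skew evaluation at b is
# N_2(b) - 1 = sigma(b) * b - 1; i.e., every conjugate has norm one.
for beta in [1, 1j, 3 + 1j, 1 - 2j, 5 + 4j]:
    b = conjugate_of_one(beta)
    assert abs(sigma(b) * b - 1) < 1e-9

# Conversely (Hilbert 90), every norm-one u != -1 is a conjugate of 1:
# u = sigma(beta) / beta for beta = 1 + sigma(u).
for u in [1, (3 + 4j) / 5, (-5 + 12j) / 13]:
    assert abs(conjugate_of_one(1 + sigma(u)) - u) < 1e-9

print("Hilbert 90 checks passed: C(1) = norm-one elements of Q(i)")
```

By Corollary 32, Rk(C(1)) = dim^R_ℚ(ℚ(i)) = 2 in this toy case, matching the fact that {1, −1} = {1^1, 1^i} is a P-basis of C(1).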

Corollary 33. For all a ∈ F^n, it holds that the map in (6) restricts to a left F-linear vector space isomorphism

φ_a : F[x; σ, δ]/I(C(a)) −→ F[D_a]. (15)

In particular, F[D_a] is a finite-dimensional left vector space over F if, and only if, F is a finite-dimensional right vector space over K_a, and in such a case, we conclude that

dim^L_F(F[D_a]) = dim^L_F(F[x; σ, δ]/I(C(a))) = Rk(C(a)) = dim^R_{K_a}(F).

Proof. The fact that φ_a restricts to a left vector space isomorphism as in (15) follows from Theorem 1 and the definitions. In particular, if F has finite right dimension over K_a, then F[D_a] has finite left dimension over F by Theorem 4 and Corollary 29. The equalities at the end of the corollary then follow from Corollary 29, Theorem 4 and (15).

Conversely, assume that F has infinite right dimension over K_a. Assume also that the left dimension of F[x; σ, δ]/I(C(a)) over F is finite. We will now reach a contradiction. By Theorem 4, there exists a finitely generated P-closed set Ω ⊆ C(a) such that Rk(Ω) is strictly larger than the left dimension of F[x; σ, δ]/I(C(a)) over F. However, this contradicts the fact that the natural left linear projection map

F[x; σ, δ]/I(C(a)) −→ F[x; σ, δ]/I(Ω)

is surjective and Rk(Ω) = dim^L_F(F[x; σ, δ]/I(Ω)) by Corollary 29, and we are done.

We may also deduce that the set of P-closed subsets of a conjugacy class forms a lattice that is naturally isomorphic to the lattice of projective subspaces of the right projective space P^R_{K_a}(F) over K_a.

Corollary 34. Let a ∈ F^n, and define the sum of two finitely generated P-closed sets Ω_1, Ω_2 ⊆ C(a) as the P-closure Ω_1 + Ω_2 = Z(I(Ω_1 ∪ Ω_2)) ⊆ C(a). The collection of finitely generated P-closed subsets of C(a) forms a lattice with sums and intersections, isomorphic to the lattice of projective subspaces of the right projective space P^R_{K_a}(F) over K_a via the bijection

π^a : P^R_{K_a}(F) −→ C(a), [β] ↦ a^β,

where [β] = {βλ ∈ F^* | λ ∈ K_a^*} ∈ P^R_{K_a}(F). For any finitely generated P-closed subset Ω ⊆ C(a) and the finite-dimensional right vector space Ω^D ⊆ F over K_a as in Theorem 4, the bijection π^a restricts to a bijection

π_Ω : P^R_{K_a}(Ω^D) −→ Ω, [β] ↦ a^β,

that induces a lattice isomorphism with respect to the same operations.

3.3 Linearizing P-closed sets over several conjugacy classes

From the previous subsection (Theorem 4) and Lemma 26, we know that if a ∈ F^n and Ω ⊆ F^n is a finitely generated P-closed set, then Ω ∩ C(a) is a finitely generated P-closed set corresponding to a finite-dimensional right vector space over K_a. In this section, we show that any P-basis of Ω is partitioned by conjugacy classes in the same way as Ω itself, thus Rk(Ω) is the sum of the ranks Rk(Ω ∩ C(a)), running over pair-wise disjoint conjugacy classes. This extends the result [22, Theorem 23] from the univariate to the multivariate case. In particular, we will show in Corollary 36 that the lattice of finitely generated P-closed subsets of a finite union of conjugacy classes is isomorphic to the Cartesian product of the lattices of projective spaces over the corresponding centralizers. We start with the following lemma.

Lemma 35. Let B_1, B_2 ⊆ F^n be non-empty finite P-independent sets such that no element in B_1 is conjugate to an element in B_2. Then B = B_1 ∪ B_2 is P-independent.

Proof. Let B_1 = {b_1, b_2, ..., b_M} and B_2 = {c_1, c_2, ..., c_N}, where M, N > 0. We will prove the result by induction on k = M + N. The case k = 2 (M = N = 1) is trivial, since any set of two elements is P-independent. Assume that the lemma holds for certain k = M + N and we prove it for k + 1 = M + N + 1, where we may assume without loss of generality that N + 1 = #B_2. If the result does not hold for k + 1, we may assume that c_{N+1} ∈ B̄, where B = {b_1, b_2, ..., b_M, c_1, c_2, ..., c_N}. Since B is P-independent by the induction hypothesis, we may take one of its dual P-bases {F_1, F_2, ..., F_M, G_1, G_2, ..., G_N}. Also by hypothesis, we may take a dual P-basis {H_1, H_2, ..., H_{N+1}} of B_2.

First we prove that F_i(c_{N+1}) = 0, for all i = 1, 2, ..., M. Assume that it does not hold for certain i. It holds that G_i = (x − b_i)F_i ∈ I(B)^n, and since c_{N+1} ∈ B̄, then

0 = G_i(c_{N+1}) = (c_{N+1}^{F_i(c_{N+1})} − b_i) F_i(c_{N+1}),

hence c_{N+1} and b_i are conjugate, which is a contradiction. Next, define

F = H_{N+1} − Σ_{i=1}^{M} H_{N+1}(b_i) F_i.

It holds that F(b_i) = F(c_j) = 0, for all i = 1, 2, ..., M and all j = 1, 2, ..., N. That is, F ∈ I(B), and since c_{N+1} ∈ B̄, we have that F(c_{N+1}) = 0. In other words,

0 = F(c_{N+1}) = H_{N+1}(c_{N+1}) − Σ_{i=1}^{M} H_{N+1}(b_i) F_i(c_{N+1}) = 1 − 0 = 1,

which is absurd, and we are done.

We may now state and prove the main result of this section:

Theorem 5. If Ω ⊆ F^n is P-closed and finitely generated, then so is Ω ∩ C(a), for all a ∈ F^n. Conversely, if the sets Ω_i ⊆ C(a_i) are P-closed and finitely generated, for i = 1, 2, ..., ℓ, where a_1, a_2, ..., a_ℓ ∈ F^n are pair-wise non-conjugate, then Ω = Ω_1 ∪ Ω_2 ∪ ... ∪ Ω_ℓ is P-closed and finitely generated. In addition, if B_i is a P-basis of Ω_i, for i = 1, 2, ..., ℓ, then B = B_1 ∪ B_2 ∪ ... ∪ B_ℓ is a P-basis of Ω, and in particular we have that

Rk(Ω) = Rk(Ω_1) + Rk(Ω_2) + ··· + Rk(Ω_ℓ).

Proof. The first sentence follows from Theorem 4 and Lemma 26. Now let Ω_i ⊆ C(a_i) be P-closed sets with finite P-bases B_i, for i = 1, 2, ..., ℓ, as in the theorem, and define Ω = Ω_1 ∪ Ω_2 ∪ ... ∪ Ω_ℓ. First, B = B_1 ∪ B_2 ∪ ... ∪ B_ℓ is P-independent by Lemma 35, hence we are done if we prove that Ω = B̄. Since the inclusion Ω ⊆ B̄ is obvious, we only need to prove the reversed one.

Let b ∈ B̄. By Lemma 31, there exists j = 1, 2, ..., ℓ such that b is conjugate to an element in B_j. Define B′ = ∪_{i≠j} B_i and let {F_1, F_2, ..., F_M} be a dual P-basis of B′ = {b_1, b_2, ..., b_M}. Let F ∈ I(B_j) and define

G = F − Σ_{i=1}^{M} F(b_i) F_i.

It holds that F_i(c) = 0, for all c ∈ B_j, and therefore G ∈ I(B). Thus G(b) = 0. However, since b is not conjugate to any element in B′, it must hold that F_i(b) = 0, for all i = 1, 2, ..., M, by the proof of Lemma 31. Hence F(b) = 0 and b ∈ B̄_j = Ω_j ⊆ Ω.

We conclude by giving a lattice representation of finitely generated P-closed sets over several conjugacy classes, which follows directly from Theorem 5.

Corollary 36. Let a_1, a_2, ..., a_ℓ ∈ F^n be pair-wise non-conjugate. Then the lattice of finitely generated P-closed subsets of C(a_1) ∪ C(a_2) ∪ ... ∪ C(a_ℓ) is isomorphic to the Cartesian-product lattice

P^R_{K_{a_1}}(F) × P^R_{K_{a_2}}(F) × ··· × P^R_{K_{a_ℓ}}(F),

via the map

Ω = Ω_1 ∪ Ω_2 ∪ ... ∪ Ω_ℓ ↦ (Ω_1^D, Ω_2^D, ..., Ω_ℓ^D),

where Ω_i = Ω ∩ C(a_i), for i = 1, 2, ..., ℓ.

3.4 Linearizing Lagrange interpolation

In this short subsection, we rewrite Theorem 2 using linearized polynomials.

Theorem 6 (Linearized Lagrange interpolation). Let Ω ⊆ F^n be a finitely generated P-closed set, define Ω_i = Ω ∩ C(a_i) and let B_i^D = {β_1^{(i)}, β_2^{(i)}, ..., β_{M_i}^{(i)}} be a right basis of Ω_i^D (with notation as in Theorem 4), for i = 1, 2, ..., ℓ. Then, for all a_j^{(i)} ∈ F, for j = 1, 2, ..., M_i and for i = 1, 2, ..., ℓ, there exists F ∈ F[x; σ, δ] such that deg(F) < M = M_1 + M_2 + ··· + M_ℓ and F^{D_{a_i}}(β_j^{(i)}) = a_j^{(i)}, for all j = 1, 2, ..., M_i and all i = 1, 2, ..., ℓ.

Remark 37. It was proven by Amitsur in [1, Theorem 2] that, if δ : F −→ F is a standard derivation over the division ring F, and K_0 is the corresponding subring of constants, then for any right vector subspace Ω^D of F over K_0 of right dimension M, there exists a differential equation of order at most M whose solution space is precisely Ω^D. This result is recovered from Theorem 6 by setting n = 1, σ = Id, ℓ = 1 and a_1 = 0.
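Linearized Lagrange interpolation can be tried out in a minimal toy case (illustrative assumptions: n = 1, F = ℚ(i) as Python complex numbers, σ = complex conjugation, δ = 0, ℓ = 1 and a_1 = 1, so D_1(β) = σ(β) and K_1 = ℚ). Prescribing values on the right ℚ-basis {1, i} of ℚ(i) amounts to solving a 2 × 2 linear system for the coefficients f_0, f_1 of F = f_0 + f_1 x.

```python
# Linearized Lagrange interpolation (Theorem 6) in a toy case: n = 1,
# F = Q(i) (Python complex), sigma = complex conjugation, delta = 0,
# ell = 1, a_1 = 1.  Illustrative assumptions only.

def sigma(z):
    return z.conjugate()

def lin_ev(F, beta):
    """Linearized evaluation F^D(beta) = sum_i f_i D_1^i(beta), where
    D_1^i(beta) = sigma^i(beta)."""
    out = 0
    for f in F:
        out += f * beta
        beta = sigma(beta)
    return out

# Prescribe arbitrary values c1, c2 at the right Q-basis {1, i}:
c1, c2 = 2 - 1j, 3 + 5j

# Solving f0 + f1 = c1 and (f0 - f1) * i = c2 by hand gives:
f0 = (c1 - 1j * c2) / 2
f1 = (c1 + 1j * c2) / 2
F = [f0, f1]                       # deg(F) = 1 < M = 2

assert abs(lin_ev(F, 1) - c1) < 1e-9
assert abs(lin_ev(F, 1j) - c2) < 1e-9

# By right K_1-linearity (K_1 = Q), F^D is now determined on all of Q(i):
assert abs(lin_ev(F, 4 - 7j) - (4 * c1 - 7 * c2)) < 1e-9
print("linearized Lagrange interpolation verified")
```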

4 Skew and linearized polynomial arithmetic

In this section, we show that the natural product of skew polynomials, given either by the rule (1) or by the universal property in Lemma 2, corresponds with composition of linearized polynomials and conventional products of matrices, when considered over a single conjugacy class. These results are obtained in Subsection 4.1 and extend the well-known particular cases obtained when n = 1. In Subsection 4.2, we show that, when considering several conjugacy classes, the natural product of skew polynomials decomposes into coordinate-wise products over each conjugacy class. We conclude (Corollary 45) that the quotient of the free multivariate skew polynomial ring over the ideal of skew polynomials vanishing on a finite union of finitely generated conjugacy classes is a semisimple ring. Moreover, such quotients are simple rings in the case of a single conjugacy class (Corollary 44). Apart from its own interest, we will use these tools to give a Galois correspondence in Subsection 6.3.

4.1 A single conjugacy class: Map composition and matrix multiplication

We start by showing that skew polynomial multiplication over a single conjugacy class corresponds with composition of right K_a-linear maps in F[D_a], which we will denote from now on by ◦. We will implicitly consider F[D_a] as a left F-algebra with product ◦.

Theorem 7. Given F, G ∈ F[x; σ, δ] and a ∈ F^n, it holds that

(FG)^{D_a} = F^{D_a} ◦ G^{D_a}. (16)

In particular, the map given in (15),

φ_a : F[x; σ, δ]/I(C(a)) −→ F[D_a],

is a left F-algebra isomorphism.

Proof. For any β ∈ F, the reader may check the rule

D_a ◦ (βId) = σ(β)D_a + δ(β)Id, (17)

where the map δ(β)Id : F −→ F^n is defined by γ ↦ δ(β)γ. After untangling the definitions, (17) is only the short form of the equations

D_a^{x_i} ◦ (βId) = Σ_{j=1}^{n} σ_{i,j}(β) D_a^{x_j} + δ_i(β)Id ∈ F[D_a],

for i = 1, 2, ..., n. Since these equations are the defining property of the product of the free skew polynomial ring F[x; σ, δ], the theorem follows from its universal property (Lemma 2).

This map restricts to a left F[x; σ, δ]-linear module isomorphism between F[x; σ, δ]/I(Ω) and certain quotients of F[D_a], for finitely generated P-closed subsets Ω ⊆ C(a). To this end, we introduce left ideals of linearized polynomials vanishing on right K_a-linear subspaces of F.
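The correspondence (16) can be checked numerically under the same illustrative toy assumptions as before (n = 1, F = ℚ(i) as Python complex numbers, σ = complex conjugation, δ = 0, so D_a(β) = σ(β)a).

```python
# Check of Theorem 7, (FG)^{D_a} = F^{D_a} o G^{D_a}, in a toy univariate
# case: F = Q(i) (Python complex), sigma = complex conjugation, delta = 0.
# Illustrative assumptions only.

def sigma(z):
    return z.conjugate()

def sigma_pow(z, i):
    for _ in range(i):
        z = sigma(z)
    return z

def skew_mul(F, G):
    """Product in F[x; sigma]: the commutation rule x * c = sigma(c) * x."""
    out = [0] * (len(F) + len(G) - 1)
    for i, f in enumerate(F):
        for j, g in enumerate(G):
            out[i + j] += f * sigma_pow(g, i)
    return out

def lin_op(F, a):
    """The linearized polynomial F^{D_a} as a Python function."""
    def FD(beta):
        out = 0
        for f in F:
            out += f * beta
            beta = sigma(beta) * a   # step beta -> D_a(beta)
        return out
    return FD

a = 1 + 1j
F = [2, 1j, 1 - 1j]
G = [1j, 3]
FGD = lin_op(skew_mul(F, G), a)
FD, GD = lin_op(F, a), lin_op(G, a)
for beta in [1, 1j, 2 - 3j, 5 + 4j]:
    assert abs(FGD(beta) - FD(GD(beta))) < 1e-9
print("Theorem 7 verified on sample points")
```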

Definition 38. Let a ∈ F^n, and let Ω^D ⊆ F be a finite-dimensional right K_a-linear vector space. We define the ideal associated to Ω^D as

I(Ω^D) = {F^D ∈ F[D_a] | F^D(β) = 0, ∀β ∈ Ω^D}.

The following result is straightforward.

Proposition 39. With notation as in Definition 38, the set I(Ω^D) is a left ideal of F[D_a].

More interestingly, we have the following anticipated isomorphism. The proof is straightforward from Theorems 1, 4 and 7.

Corollary 40. Let a ∈ F^n. Let Ω ⊆ C(a) be a finitely generated P-closed set and let Ω^D ⊆ F be the corresponding right K_a-linear subspace, as in Theorem 4. The map φ_a in Theorem 7 satisfies that φ_a(I(Ω)) = I(Ω^D) and restricts to a natural left F[x; σ, δ]/I(C(a))-linear module isomorphism

φ_Ω : F[x; σ, δ]/I(Ω) −→ F[D_a]/I(Ω^D). (18)

In particular, we conclude that

dim^L_F(F[D_a]/I(Ω^D)) = dim^L_F(F[x; σ, δ]/I(Ω)) = Rk(Ω) = dim^R_{K_a}(Ω^D).

Remark 41. As shown in [31, Proposition 18], for a P-closed set Ω ⊆ F^n, it holds that I(Ω) is a two-sided ideal of F[x; σ, δ] if, and only if, Ω is closed under conjugacy. Hence, if Ω ⊆ C(a), then I(Ω) is a two-sided ideal if, and only if, Ω = C(a). Therefore, we deduce from Theorem 4 that I(Ω^D) is a two-sided ideal of F[D_a] if, and only if, Ω^D = F. In all other cases, the left modules in (18) are not rings.

Remark 42. Just as we did in Subsection 3.1, corresponding to [31, Section V], we could define I(B) = {F^D ∈ F[D_a] | F^D(β) = 0, ∀β ∈ B} and Z(A) = {β ∈ F | F^D(β) = 0, ∀F^D ∈ A}, for arbitrary sets B ⊆ F and A ⊆ F[D_a]. Basic rules as in [31, Prop. 15] still hold. However, the interest of considering closures as in Definition 20 is lost, since it is easy to see, from the results obtained so far, that the sets Z(A) correspond to right K_a-linear vector subspaces of F, and that

Z(I(B)) = ⟨B⟩^R_{K_a} ⊆ F.

Now we turn to matrix multiplication. For the rest of this subsection, fix a ∈ F^n. Let V ⊆ F be a right vector space over K_a and fix one of its ordered right bases β = (β_1, β_2, ..., β_M) ∈ F^M. Denote by µ_β : V^M −→ K_a^{M×M} the corresponding matrix-representation map, given by

$$ \mu_\beta(x) = \begin{pmatrix} x_1^1 & x_2^1 & \cdots & x_M^1 \\ x_1^2 & x_2^2 & \cdots & x_M^2 \\ \vdots & \vdots & \ddots & \vdots \\ x_1^M & x_2^M & \cdots & x_M^M \end{pmatrix}, \quad (19) $$

for x = (x_1, x_2, ..., x_M) ∈ V^M, where x_j^1, x_j^2, ..., x_j^M ∈ K_a are the unique scalars such that x_j = Σ_{i=1}^{M} β_i x_j^i ∈ F, for j = 1, 2, ..., M. Observe that µ_β is a right K_a-linear vector space isomorphism, and it is the identity map if M = 1 and β_1 = 1.

Definition 43. Given x, y ∈ V^M, we define their matrix product with respect to the basis β as

x ⋆ y = µ_β^{-1}(µ_β(x) µ_β(y)) ∈ V^M. (20)

The product ⋆ depends on the centralizer K_a ⊆ F (i.e., it depends on a) and on the ordered basis β, but we will not denote this dependence for simplicity. From the definitions, we note also that, if x = (x_1, x_2, ..., x_M) ∈ V^M and y = Σ_{i=1}^{M} β_i y^i ∈ V^M, with x_i ∈ V and y^i ∈ K_a^M, for i = 1, 2, ..., M, then

µ_β^{-1}(µ_β(x) µ_β(y)) = Σ_{i=1}^{M} x_i y^i ∈ V^M. (21)

We may now prove the following result.

Theorem 8. Let M be a positive integer with M ≤ dim^R_{K_a}(F), where dim^R_{K_a}(F) need not be finite. Let β_1, β_2, ..., β_M ∈ F be right linearly independent over K_a and let V = ⟨β_1, β_2, ..., β_M⟩^R_{K_a} ⊆ F. If the matrix product ⋆ is defined from the ordered basis β = (β_1, β_2, ..., β_M) ∈ F^M as in (20), then it holds that

(F^{D_a} ◦ G^{D_a})(β) = F^{D_a}(β) ⋆ G^{D_a}(β), (22)

for all F, G ∈ F[x; σ, δ] such that F^{D_a}(β), G^{D_a}(β) ∈ V^M (i.e., F^{D_a}(V) ⊆ V and G^{D_a}(V) ⊆ V). In particular, if F has finite right dimension over K_a and β_1, β_2, ..., β_M form one of its right bases, then (22) holds for any F, G ∈ F[x; σ, δ].

Proof. Let y = G^{D_a}(β) ∈ V^M and let y^i ∈ K_a^M, for i = 1, 2, ..., M, be the unique vectors such that y = Σ_{i=1}^{M} β_i y^i. It holds that

F^{D_a}(y) = Σ_{i=1}^{M} F^{D_a}(β_i) y^i = F^{D_a}(β) ⋆ y,

where the first equality follows from Lemma 11, and the second equality is (21), and we are done.

We conclude with the following consequence.

Corollary 44. If M = dim^R_{K_a}(F) < ∞ and β = (β_1, β_2, ..., β_M) ∈ F^M is an ordered right basis of F over K_a, then the evaluation map

E_β : F[D_a] −→ F^M, F^D ↦ F^D(β),

is a left F-algebra isomorphism, where the product in F^M is ⋆_β defined from β (Definition 43). In particular, the map

µ_β ◦ E_β : F[D_a] −→ K_a^{M×M}

is a ring isomorphism. In conclusion, we have the following chain of natural left F-algebra and ring isomorphisms, where we indicate the considered products in case of ambiguity,

F[x; σ, δ]/I(C(a)) ≅ (F[D_a], ◦) ≅ (F^M, ⋆_β) ≅ K_a^{M×M}.

In particular, the ring F[x; σ, δ]/I(C(a)) is a simple ring [23, Definition (2.1)] by [23, Theorem (3.1)]. That is, F[x; σ, δ]/I(C(a)) has no non-trivial two-sided ideals.
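Corollary 44 can be observed numerically in the toy case a = 1, F = ℚ(i) (Python complex numbers), σ = complex conjugation, δ = 0 (illustrative assumptions), where K_1 = ℚ, M = 2 and β = (1, i): each linearized polynomial becomes a 2 × 2 rational matrix, and the skew product becomes the ordinary matrix product.

```python
# Illustration of Corollary 44: F[D_a] is isomorphic to K_a^{M x M}.
# Toy case: F = Q(i) (Python complex), sigma = complex conjugation,
# delta = 0, a = 1, so K_1 = Q, M = 2, right basis beta = (1, i).
# Illustrative assumptions only.

def sigma(z):
    return z.conjugate()

def sigma_pow(z, i):
    for _ in range(i):
        z = sigma(z)
    return z

def skew_mul(F, G):
    """Product in F[x; sigma]: x * c = sigma(c) * x."""
    out = [0] * (len(F) + len(G) - 1)
    for i, f in enumerate(F):
        for j, g in enumerate(G):
            out[i + j] += f * sigma_pow(g, i)
    return out

def lin_ev(F, beta):
    """F^{D_1}(beta) = sum_i f_i sigma^i(beta), since D_1 = sigma."""
    out = 0
    for f in F:
        out += f * beta
        beta = sigma(beta)
    return out

def matrix_rep(F):
    """mu_beta(E_beta(F^{D_1})): the 2 x 2 rational matrix of F^{D_1} with
    respect to the right Q-basis (1, i); column j holds the Q-coordinates
    of the image of the j-th basis vector."""
    cols = [lin_ev(F, 1), lin_ev(F, 1j)]
    return [[cols[0].real, cols[1].real],
            [cols[0].imag, cols[1].imag]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

F = [1 + 2j, 3 - 1j]
G = [1j, 2 + 1j]
A = mat_mul(matrix_rep(F), matrix_rep(G))
B = matrix_rep(skew_mul(F, G))
assert all(abs(A[i][j] - B[i][j]) < 1e-9 for i in range(2) for j in range(2))
print("skew product corresponds to 2x2 matrix product over Q")
```

For instance, x itself (the polynomial [0, 1]) is represented by the diagonal matrix diag(1, −1) of σ with respect to (1, i).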

4.2 Product decompositions over several conjugacy classes

In this short subsection, we observe that, when a P-closed set contains elements of more than one conjugacy class, the skew polynomial product over the corresponding ideal decomposes as a coordinate-wise product over each conjugacy class.

Theorem 9. Let Ω ⊆ F^n be a finitely generated P-closed set and let Ω_i = Ω ∩ C(a_i) ≠ ∅, for i = 1, 2, ..., ℓ, where a_1, a_2, ..., a_ℓ ∈ F^n are pair-wise non-conjugate and Ω = Ω_1 ∪ Ω_2 ∪ ··· ∪ Ω_ℓ (see Theorem 5). With notation as in Definition 38, the maps

F[x; σ, δ]/I(Ω) −→ ⊕_{i=1}^{ℓ} F[x; σ, δ]/I(Ω_i) −→ ⊕_{i=1}^{ℓ} F[D_{a_i}]/I(Ω_i^D),
F + I(Ω) ↦ (F + I(Ω_i))_{i=1}^{ℓ} ↦ (F^{D_{a_i}} + I(Ω_i^D))_{i=1}^{ℓ}, (23)

are left F-algebra isomorphisms. In particular, we have that

Σ_{i=1}^{ℓ} dim^L_F(F[D_{a_i}]/I(Ω_i^D)) = Σ_{i=1}^{ℓ} dim^R_{K_{a_i}}(Ω_i^D) = Σ_{i=1}^{ℓ} Rk(Ω_i) = dim^L_F(F[x; σ, δ]/I(Ω)) = Rk(Ω).

Proof. It follows from Corollary 40 and the fact that

F[x; σ, δ]/I(Ω) ≅ ⊕_{i=1}^{ℓ} F[x; σ, δ]/I(Ω_i),

which follows from I(Ω) = I(Ω_1) ∩ I(Ω_2) ∩ ... ∩ I(Ω_ℓ) (see [31, Proposition 15]) and Theorem 5.

Similarly, we deduce the following result on coordinate-wise matrix multiplications by combining Corollary 44 and Theorem 9.

Corollary 45. With assumptions and notation as in Corollary 44 and Theorem 9, for Ω = C(a_1) ∪ C(a_2) ∪ ... ∪ C(a_ℓ) (recall that a_1, a_2, ..., a_ℓ are pair-wise non-conjugate), we have the following chain of natural left F-algebra and ring isomorphisms,

F[x; σ, δ]/I(Ω) ≅ ⊕_{i=1}^{ℓ} F[x; σ, δ]/I(C(a_i)) ≅ ⊕_{i=1}^{ℓ} (F[D_{a_i}], ◦) ≅ ⊕_{i=1}^{ℓ} (F^{M_i}, ⋆_{β_i}) ≅ ⊕_{i=1}^{ℓ} K_{a_i}^{M_i×M_i},

where M_i = dim^R_{K_{a_i}}(F) < ∞ and β_i ∈ F^{M_i} is an ordered right basis of F over K_{a_i}, for i = 1, 2, ..., ℓ. In particular, the ring F[x; σ, δ]/I(Ω) is semisimple [23, Definition (2.5)] by [23, (3.3)] and [23, (3.4)]. That is, every left submodule of a left module over F[x; σ, δ]/I(Ω) is a direct summand.

5 Generalizations of Vandermonde, Moore and Wronskian matrices

One of the main objectives behind the results on evaluations of univariate skew polynomials in [22, 25] was to generalize the concept of, and results on, classical Vandermonde [43], Moore [33, 37] and Wronskian [15, 34] matrices. A general method for calculating the rank of such matrices was obtained by combining [25, Theorem 4.5] and [22, Theorem 23], which amount to linearizing the concept of P-independence in the case n = 1, as done in Section 3 for the general case.

Multivariate skew Vandermonde matrices were defined in [31] using the skew evaluation of multivariate skew polynomials as in Definition 7. In this section, we give an analogous definition using linearized evaluations as in Definition 9. In Subsection 5.1, we revisit the results on multivariate skew Vandermonde matrices from [31], and in Subsection 5.2, we provide a linearization of such matrices and calculate their ranks as done in [25]. As we will show in the examples, the matrices defined in Subsection 5.2 simultaneously generalize multivariate versions of Vandermonde, Moore and Wronskian matrices.

5.1 Skew Vandermonde matrices

In this subsection, we revisit the concept of skew Vandermonde matrix, which was introduced in [22] for n = 1 and δ = 0, and in [25, Eq. (4.1)] in general for n = 1. The multivariate case was introduced in full generality in [31, Definition 40].

Definition 46 (Skew Vandermonde matrices [22, 25, 31]). Let N ⊆ M be a finite set of skew monomials and let B = {b_1, b_2, ..., b_M} ⊆ F^n. We define the corresponding skew Vandermonde matrix, denoted by V_N(B), as the |N| × M matrix over F whose rows are given by

(N_m(b_1), N_m(b_2), ..., N_m(b_M)) ∈ F^M,

for all m ∈ N (given certain ordering in N or, more generally, in M). If d is a positive integer, we define M_d as the set of monomials of degree less than d, and we denote

V_d(B) = V_{M_d}(B) ∈ F^{|M_d|×M}. (24)

The following result is [31, Prop. 41], and connects the rank of a skew Vandermonde matrix with the underlying P-closed set.

Proposition 47 ([31]). Given a finite set G ⊆ F^n with M elements, and Ω = Ḡ, it holds that

Rk(V_M(G)) = Rk(Ω).

Moreover, a subset B ⊆ G is a P-basis of Ω if, and only if, |B| = Rk(Ω) = Rk(V_{|B|}(B)).

Remark 48. The last statement implies that, applying Gaussian elimination to the matrix V_M(G), we may find the rank of Ω and at least one of its P-bases. This is an alternative method to partitioning G into conjugacy classes and finding a right basis on each conjugacy class, as implied by Theorems 4 and 5.

Skew Lagrange interpolation (Theorem 2) can be reinterpreted as the left invertibility of a skew Vandermonde matrix defined over a P-basis. The following result is [31, Corollary 42].

Corollary 49 ([31]). Let Ω ⊆ F^n be a finitely generated P-closed set with P-basis B = {b_1, b_2, ..., b_M}. There exists a solution to the linear system

(F_m)_{m∈M_M} V_M(B) = (a_1, a_2, ..., a_M), (25)

for any a_1, a_2, ..., a_M ∈ F (that is, V_M(B) is left invertible). For any solution, the skew polynomial F = Σ_{m∈M_M} F_m m satisfies that F(b_i) = a_i, for i = 1, 2, ..., M, and deg(F) < M.

Example 50 (Conventional Vandermonde matrices). Consider n = 1, σ = Id and δ = 0. In this case, N_i(b) = b^i, and for B = {b_1, b_2, ..., b_M} ⊆ F, the matrix (24) recovers the conventional Vandermonde matrix

$$ V_d(B) = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1 \\ b_1 & b_2 & b_3 & \cdots & b_M \\ b_1^2 & b_2^2 & b_3^2 & \cdots & b_M^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_1^{d-1} & b_2^{d-1} & b_3^{d-1} & \cdots & b_M^{d-1} \end{pmatrix} \in F^{d \times M}. \quad (26) $$

In such a case, Proposition 47 is the well-known fact that the rank of a conventional univariate Vandermonde matrix (26) is the number of pair-wise distinct evaluation points, that is,

Rk(V_d(B)) = |B|, (27)

whenever d ≥ |B|. This is because Rk(B̄) = |B| when F is a field, σ = Id and δ = 0. The exact same result holds if n > 1. For instance, let n = 2, M = 3 and d = 3, and denote b_1 = (a_1, b_1), b_2 = (a_2, b_2), b_3 = (a_3, b_3). Then the corresponding multivariate conventional Vandermonde matrix is given by

$$ V_3(B) = \begin{pmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ a_1^2 & a_2^2 & a_3^2 \\ a_1 b_1 & a_2 b_2 & a_3 b_3 \\ b_1^2 & b_2^2 & b_3^2 \end{pmatrix} \in F^{6 \times 3}. \quad (28) $$

In this case, it could happen that a_i = a_j for some i ≠ j, but if b_1, b_2 and b_3 are pair-wise distinct, then V_d(B) has rank 3, which coincides with |B|. However, if say a_1 = a_2 ≠ a_3 and b_1 = b_2, then the rank of V_d(B) is 2, which also coincides with |B|. Finally, observe that, if F is non-commutative, then we need to add the extra row (b_1a_1, b_2a_2, b_3a_3).

Example 51 ("Normed" Vandermonde matrices). A bit more generally, when n = 1, δ = 0 but σ : F −→ F is arbitrary, we obtain a form of Vandermonde matrices where, instead of powers, the entries are given by truncated norms using the morphism σ. That is, if B = {b_1, b_2, ..., b_M} ⊆ F, then such matrices are of the form

$$ V_d(B) = \begin{pmatrix} N_0(b_1) & N_0(b_2) & N_0(b_3) & \cdots & N_0(b_M) \\ N_1(b_1) & N_1(b_2) & N_1(b_3) & \cdots & N_1(b_M) \\ N_2(b_1) & N_2(b_2) & N_2(b_3) & \cdots & N_2(b_M) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ N_{d-1}(b_1) & N_{d-1}(b_2) & N_{d-1}(b_3) & \cdots & N_{d-1}(b_M) \end{pmatrix} \in F^{d \times M}, \quad (29) $$

for some positive integer d, where N_0(b) = 1 and N_i(b) = σ^{i-1}(b) ··· σ(b)b is the ith truncated norm of b with respect to σ, for b ∈ F and for i = 1, 2, ..., d − 1. These matrices are the ones originally introduced by Lam in [22], and extended to the case δ ≠ 0 in [25]. The general case is a natural generalization of these matrices based on multivariate norms as in Lemma 17. For instance, the "normed" version of (28), when δ = 0, would be

$$ V_3((a_1,b_1),(a_2,b_2),(a_3,b_3)) = \begin{pmatrix} 1 & 1 & 1 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ \sigma(a_1)a_1 & \sigma(a_2)a_2 & \sigma(a_3)a_3 \\ \sigma(a_1)b_1 & \sigma(a_2)b_2 & \sigma(a_3)b_3 \\ \sigma(b_1)a_1 & \sigma(b_2)a_2 & \sigma(b_3)a_3 \\ \sigma(b_1)b_1 & \sigma(b_2)b_2 & \sigma(b_3)b_3 \end{pmatrix} \in F^{7 \times 3}. \quad (30) $$

Observe how, in this case, we need to include both the rows (σ(a_1)b_1, σ(a_2)b_2, σ(a_3)b_3) and (σ(b_1)a_1, σ(b_2)a_2, σ(b_3)a_3), even if F is a field, when σ ≠ Id.
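The rank behavior in Proposition 47, computed conjugacy class by conjugacy class as in Theorems 4 and 5, can be observed numerically on a matrix of the form (29). The toy case below is an illustrative assumption: F = ℚ(i) (Python complex numbers), σ = complex conjugation, δ = 0. The points 1, −1 and (−3 + 4i)/5 lie in C(1), whose rank is dim^R_ℚ(ℚ(i)) = 2, while 2 lies in a different conjugacy class, so the expected rank on these four points is 2 + 1 = 3.

```python
# Rank of a "normed" skew Vandermonde matrix (29) in a toy case:
# F = Q(i) (Python complex), sigma = complex conjugation, delta = 0.
# Illustrative assumptions only.

def sigma(z):
    return z.conjugate()

def norm(b, m):
    """Truncated norm N_m(b) = sigma^{m-1}(b) ... sigma(b) b, N_0 = 1."""
    out = 1
    for _ in range(m):
        out = sigma(out) * b
    return out

def vandermonde(points, d):
    """d x M matrix with rows (N_m(b_1), ..., N_m(b_M)), m = 0, ..., d-1."""
    return [[norm(b, m) for b in points] for m in range(d)]

def rank(A, tol=1e-9):
    """Rank by Gaussian elimination with partial pivoting (complex floats)."""
    A = [row[:] for row in A]
    r = 0
    for c in range(len(A[0])):
        if r == len(A):
            break
        piv = max(range(r, len(A)), key=lambda i: abs(A[i][c]))
        if abs(A[piv][c]) < tol:
            continue
        A[r], A[piv] = A[piv], A[r]
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
    return r

G = [1, -1, (-3 + 4j) / 5, 2]      # three points of C(1) plus one of C(2)
V = vandermonde(G, len(G))
assert rank(V) == 3                 # 2 (from C(1)) + 1 (from C(2))
assert rank(vandermonde(G[:3], 3)) == 2   # three points of C(1) alone
print("Rk(V_4(G)) =", rank(V))
```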

5.2 Linearized Vandermonde matrices

The efforts that led to [22, Theorem 23] and [25, Theorem 4.5] successfully aimed at finding a simple way to compute the rank of skew Vandermonde matrices as in (29), analogous to Vandermonde's original criterion [43] (see Example 50). Thanks to our extensions of such fundamental results in Theorems 4 and 5, we will give analogous ways of finding the rank of general skew Vandermonde matrices as in (24). This will be done by first transforming skew Vandermonde matrices into what we call linearized Vandermonde matrices, of which the full general case when n = 1 was first considered recently in [29, p. 604], to the best of our knowledge. As a particular case, we obtain a result by Roth [41, Lemma 1] on the rank of multivariate Wronskian matrices. We start with the main definition.

Definition 52 (Linearized Vandermonde matrices). Let $\mathcal{N} \subseteq \mathcal{M}$ be a finite set of skew monomials. Let $a_1, a_2, \ldots, a_\ell \in F^n$ and let $B_i^D = \{\beta_1^{(i)}, \beta_2^{(i)}, \ldots, \beta_{M_i}^{(i)}\} \subseteq F$, for $i = 1, 2, \ldots, \ell$. Denote $a = (a_1, a_2, \ldots, a_\ell)$ and $B^D = (B_1^D, B_2^D, \ldots, B_\ell^D)$. We define the corresponding linearized Vandermonde matrix, denoted by $V_{\mathcal{N}}^D(a, B^D)$, as the $|\mathcal{N}| \times M$ matrix over F, where $M = M_1 + M_2 + \cdots + M_\ell$, whose rows are given by

\[
\left( D_{a_1}^{m}(B_1^D), D_{a_2}^{m}(B_2^D), \ldots, D_{a_\ell}^{m}(B_\ell^D) \right) \in F^M,
\]

for all $m \in \mathcal{N}$ (given a certain ordering of $\mathcal{N}$ or, more generally, of $\mathcal{M}$), where we define

\[
D_{a_i}^{m}(B_i^D) = \left( D_{a_i}^{m}(\beta_1^{(i)}), D_{a_i}^{m}(\beta_2^{(i)}), \ldots, D_{a_i}^{m}(\beta_{M_i}^{(i)}) \right) \in F^{M_i},
\]

for $i = 1, 2, \ldots, \ell$. If d is a positive integer, we denote

\[
V_d^D(a, B^D) = V_{\mathcal{M}_d}^D(a, B^D) \in F^{|\mathcal{M}_d| \times M}, \qquad (31)
\]

where $\mathcal{M}_d$ is as in Definition 46.

As was the case for skew polynomials and linearized polynomials, the point of linearized Vandermonde matrices is that they are indeed a linearized version of skew Vandermonde matrices. The following result is a rewriting of Theorem 1.

Theorem 10. Let the notation be as in Definition 52, and define $B = \{b_1, b_2, \ldots, b_M\} \subseteq F^n$ by

\[
b_j^{(i)} = a_i^{\beta_j^{(i)}} = D_{a_i}(\beta_j^{(i)}) \, (\beta_j^{(i)})^{-1}, \qquad (32)
\]

where $b_j^{(i)} = b_r$ and $r = M_1 + M_2 + \cdots + M_{i-1} + j$, for $j = 1, 2, \ldots, M_i$ and $i = 1, 2, \ldots, \ell$. Then, for any finite set of skew monomials $\mathcal{N} \subseteq \mathcal{M}$, it holds that

\[
V_{\mathcal{N}}^D(a, B^D) = V_{\mathcal{N}}(B) \begin{pmatrix}
\beta_1^{(1)} & 0 & \cdots & 0 \\
0 & \beta_2^{(1)} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \beta_{M_\ell}^{(\ell)}
\end{pmatrix}, \qquad (33)
\]

assuming that we use the same ordering of $\mathcal{N}$ to order the rows in $V_{\mathcal{N}}(B)$ and $V_{\mathcal{N}}^D(a, B^D)$.

The following is the main result of this section. It combines Proposition 47 and Theorems 4 and 5 in the same way that [22, Theorem 23] and [25, Theorem 4.5] can be combined in the case n = 1 to determine the rank of skew and linearized Vandermonde matrices.

Theorem 11. Let the notation be as in Definition 52 and Theorem 10. Assume moreover that $\beta_j^{(i)} \in F^*$, for $j = 1, 2, \ldots, M_i$ and for $i = 1, 2, \ldots, \ell$. Then it holds that

\[
\mathrm{Rk}(V_{\mathcal{N}}^D(a, B^D)) = \mathrm{Rk}(V_{\mathcal{N}}(B)).
\]

If the points $a_1, a_2, \ldots, a_\ell \in F^n$ are pair-wise non-conjugate, then this rank is

\[
\mathrm{Rk}(V_{\mathcal{N}}^D(a, B^D)) = \sum_{i=1}^{\ell} \dim_{K_{a_i}}^R \left( \langle \beta_1^{(i)}, \beta_2^{(i)}, \ldots, \beta_{M_i}^{(i)} \rangle_{K_{a_i}}^R \right).
\]

Remark 53. Note that, using Theorems 10 and 11, we may rewrite linearized Lagrange interpolation (Theorem 6) as the left invertibility of the corresponding linearized Vandermonde matrix, as in Corollary 49.

We next show how to recover and generalize Moore and Wronskian matrices, which are particular cases of linearized Vandermonde matrices.

Example 54 (Moore matrices). Moore matrices were first considered by Moore in [33], and then by Ore in [37, Eq. (15)] when he studied classical (univariate) linearized polynomials over finite fields. They are recovered from linearized Vandermonde matrices (31) by setting n = 1, δ = 0, a = 1 and letting σ : F → F be arbitrary. For any $B^D = \{\beta_1, \beta_2, \ldots, \beta_M\} \subseteq F$, the corresponding Moore matrix is given by the linearized Vandermonde matrix

\[
V_d^D(1, B^D) = \begin{pmatrix}
\beta_1 & \beta_2 & \beta_3 & \cdots & \beta_M \\
\sigma(\beta_1) & \sigma(\beta_2) & \sigma(\beta_3) & \cdots & \sigma(\beta_M) \\
\sigma^2(\beta_1) & \sigma^2(\beta_2) & \sigma^2(\beta_3) & \cdots & \sigma^2(\beta_M) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sigma^{d-1}(\beta_1) & \sigma^{d-1}(\beta_2) & \sigma^{d-1}(\beta_3) & \cdots & \sigma^{d-1}(\beta_M)
\end{pmatrix} \in F^{d \times M}. \qquad (34)
\]

Theorem 10 establishes the connection between Moore matrices and normed Vandermonde matrices (29) over the conjugacy class C(1) through the simple formula

\[
\begin{pmatrix}
\beta_1 & \beta_2 & \cdots & \beta_M \\
\sigma(\beta_1) & \sigma(\beta_2) & \cdots & \sigma(\beta_M) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma^{d-1}(\beta_1) & \sigma^{d-1}(\beta_2) & \cdots & \sigma^{d-1}(\beta_M)
\end{pmatrix}
\begin{pmatrix}
\beta_1^{-1} & 0 & \cdots & 0 \\
0 & \beta_2^{-1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \beta_M^{-1}
\end{pmatrix}
=
\begin{pmatrix}
N_0(\sigma(\beta_1)\beta_1^{-1}) & N_0(\sigma(\beta_2)\beta_2^{-1}) & \cdots & N_0(\sigma(\beta_M)\beta_M^{-1}) \\
N_1(\sigma(\beta_1)\beta_1^{-1}) & N_1(\sigma(\beta_2)\beta_2^{-1}) & \cdots & N_1(\sigma(\beta_M)\beta_M^{-1}) \\
\vdots & \vdots & \ddots & \vdots \\
N_{d-1}(\sigma(\beta_1)\beta_1^{-1}) & N_{d-1}(\sigma(\beta_2)\beta_2^{-1}) & \cdots & N_{d-1}(\sigma(\beta_M)\beta_M^{-1})
\end{pmatrix},
\]

which the reader can easily check by computing telescopic products. This identity was first obtained by Lam and Leroy in [25, Eq. (4.12)]. It is worth observing how this identity can be trivially extended to any conjugacy class, not only the conjugacy class C(1). For any $a \in F$, it holds that $D_a^{x^i}(\beta_j) = \sigma^i(\beta_j) N_i(a)$ and

\[
\begin{pmatrix}
\beta_1 & \cdots & \beta_M \\
\sigma(\beta_1) a & \cdots & \sigma(\beta_M) a \\
\vdots & \ddots & \vdots \\
\sigma^{d-1}(\beta_1) N_{d-1}(a) & \cdots & \sigma^{d-1}(\beta_M) N_{d-1}(a)
\end{pmatrix}
\begin{pmatrix}
\beta_1^{-1} & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \beta_M^{-1}
\end{pmatrix}
=
\begin{pmatrix}
N_0(\sigma(\beta_1) a \beta_1^{-1}) & \cdots & N_0(\sigma(\beta_M) a \beta_M^{-1}) \\
N_1(\sigma(\beta_1) a \beta_1^{-1}) & \cdots & N_1(\sigma(\beta_M) a \beta_M^{-1}) \\
\vdots & \ddots & \vdots \\
N_{d-1}(\sigma(\beta_1) a \beta_1^{-1}) & \cdots & N_{d-1}(\sigma(\beta_M) a \beta_M^{-1})
\end{pmatrix}. \qquad (35)
\]

In both cases (a = 1 or not), Theorem 11 states that the rank of a Moore matrix is

\[
\mathrm{Rk}(V_d^D(a, B^D)) = \dim_{K_a}^R \left( \langle B^D \rangle_{K_a}^R \right), \qquad (36)
\]

whenever $d \geq |B^D|$, where we note that $K_1 = \{\beta \in F \mid \sigma(\beta) = \beta\}$. This can be seen as a linearized form of (27).

It may seem that choosing a ≠ 1 does not provide any essential novelty when F is a field, since $K_a = K_1$, for all $a \in F^*$, and $K_0 = F$. However, the elements $a \in F^*$ play an essential role when chosen pair-wise non-conjugate, as highlighted by Theorem 11. Let $a_1, a_2, \ldots, a_\ell \in F^*$ be pair-wise non-conjugate and denote $a = (a_1, a_2, \ldots, a_\ell) \in (F^*)^\ell$. Then the linearized Vandermonde matrix $V_d^D(a, B^D)$ is given by

\[
V_d^D(a, B^D) = \left( V_d^D(a_1, B_1^D) \mid V_d^D(a_2, B_2^D) \mid \cdots \mid V_d^D(a_\ell, B_\ell^D) \right) \in F^{d \times \sum_{i=1}^{\ell} |B_i^D|}, \qquad (37)
\]

where, for $i = 1, 2, \ldots, \ell$, the matrix $V_d^D(a_i, B_i^D)$ is given by

\[
\begin{pmatrix}
\beta_1^{(i)} & \beta_2^{(i)} & \cdots & \beta_{M_i}^{(i)} \\
\sigma(\beta_1^{(i)}) a_i & \sigma(\beta_2^{(i)}) a_i & \cdots & \sigma(\beta_{M_i}^{(i)}) a_i \\
\vdots & \vdots & \ddots & \vdots \\
\sigma^{d-1}(\beta_1^{(i)}) N_{d-1}(a_i) & \sigma^{d-1}(\beta_2^{(i)}) N_{d-1}(a_i) & \cdots & \sigma^{d-1}(\beta_{M_i}^{(i)}) N_{d-1}(a_i)
\end{pmatrix} \in F^{d \times |B_i^D|}.
\]

Then Theorem 11 says that

\[
\mathrm{Rk}(V_d^D(a, B^D)) = \sum_{i=1}^{\ell} \dim_{K_{a_i}}^R \left( \langle B_i^D \rangle_{K_{a_i}}^R \right), \qquad (38)
\]

whenever $d \geq \sum_{i=1}^{\ell} |B_i^D|$. To the best of our knowledge, the matrix in (37), where ℓ > 1, was considered for the first time in [29, p. 604], where the case δ ≠ 0 is also considered. The result (38) was implicitly used in [29, Theorem 3] and [29, Theorem 4] to prove that linearized Reed-Solomon codes are maximum sum-rank distance codes. See Subsection 5.3 for more details on codes.

We now illustrate the multivariate case, which is the main novelty of Theorem 11. Let n = 2 and set σ = diag(σ_1, σ_2) and δ = 0, for two ring morphisms $\sigma_1, \sigma_2 : F \to F$. Let M = 3 and d = 3, and let $a = (a_1, a_2), b = (b_1, b_2) \in F^2$ and $\beta_1, \beta_2 \in F$. The linearized Vandermonde matrix $V_3((a, b), (\{\beta_1, \beta_2\}, \{\beta_1\})) \in F^{7 \times 3}$ is given by

\[
\begin{pmatrix}
\beta_1 & \beta_2 & \beta_1 \\
\sigma_1(\beta_1) a_1 & \sigma_1(\beta_2) a_1 & \sigma_1(\beta_1) b_1 \\
\sigma_2(\beta_1) a_2 & \sigma_2(\beta_2) a_2 & \sigma_2(\beta_1) b_2 \\
\sigma_1^2(\beta_1) \sigma_1(a_1) a_1 & \sigma_1^2(\beta_2) \sigma_1(a_1) a_1 & \sigma_1^2(\beta_1) \sigma_1(b_1) b_1 \\
\sigma_1(\sigma_2(\beta_1)) \sigma_1(a_2) a_1 & \sigma_1(\sigma_2(\beta_2)) \sigma_1(a_2) a_1 & \sigma_1(\sigma_2(\beta_1)) \sigma_1(b_2) b_1 \\
\sigma_2(\sigma_1(\beta_1)) \sigma_2(a_1) a_2 & \sigma_2(\sigma_1(\beta_2)) \sigma_2(a_1) a_2 & \sigma_2(\sigma_1(\beta_1)) \sigma_2(b_1) b_2 \\
\sigma_2^2(\beta_1) \sigma_2(a_2) a_2 & \sigma_2^2(\beta_2) \sigma_2(a_2) a_2 & \sigma_2^2(\beta_1) \sigma_2(b_2) b_2
\end{pmatrix}. \qquad (39)
\]

In this case, if a and b are non-conjugate, we also have that

\[
\mathrm{Rk}\left( V_3((a, b), (\{\beta_1, \beta_2\}, \{\beta_1\})) \right) = \dim_{K_a}^R \left( \langle \beta_1, \beta_2 \rangle_{K_a}^R \right) + \dim_{K_b}^R \left( \langle \beta_1 \rangle_{K_b}^R \right).
\]

Example 55 (Wronskian matrices). Wronskian matrices were first considered by Muir [34] (see [34, Bullet 194]), who attributed them to Hoene-Wroński due to a letter aimed at refuting Lagrange's use of infinite series [15]. Let n = 1, σ = Id, a = 0 and let δ : F → F be an arbitrary standard derivation of F. For any $B^D = \{\beta_1, \beta_2, \ldots, \beta_M\} \subseteq F$, the corresponding Wronskian matrix is given by the linearized Vandermonde matrix

\[
V_d^D(0, B^D) = \begin{pmatrix}
\beta_1 & \beta_2 & \beta_3 & \cdots & \beta_M \\
\delta(\beta_1) & \delta(\beta_2) & \delta(\beta_3) & \cdots & \delta(\beta_M) \\
\delta^2(\beta_1) & \delta^2(\beta_2) & \delta^2(\beta_3) & \cdots & \delta^2(\beta_M) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\delta^{d-1}(\beta_1) & \delta^{d-1}(\beta_2) & \delta^{d-1}(\beta_3) & \cdots & \delta^{d-1}(\beta_M)
\end{pmatrix} \in F^{d \times M}. \qquad (40)
\]

In this case, we may still find a "normed" form of the corresponding Vandermonde matrix. For $b \in F$, define $N_0(b) = 1$ and then define the $i$th differential truncated norm of b as

\[
N_i(b) = N_{i-1}(b) \, b + \delta(N_{i-1}(b)), \qquad (41)
\]

for $i = 1, 2, \ldots, d$. Differential truncated norms, defined exactly as in (41), were first considered by Jacobson in [16, Eq. (29)], and they coincide with the norms defined in Lemma 17 for the particular univariate case considered here. Now, Theorem 10 reads

\[
\begin{pmatrix}
\beta_1 & \beta_2 & \cdots & \beta_M \\
\delta(\beta_1) & \delta(\beta_2) & \cdots & \delta(\beta_M) \\
\vdots & \vdots & \ddots & \vdots \\
\delta^{d-1}(\beta_1) & \delta^{d-1}(\beta_2) & \cdots & \delta^{d-1}(\beta_M)
\end{pmatrix}
\begin{pmatrix}
\beta_1^{-1} & 0 & \cdots & 0 \\
0 & \beta_2^{-1} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \beta_M^{-1}
\end{pmatrix}
=
\begin{pmatrix}
N_0(\delta(\beta_1)\beta_1^{-1}) & \cdots & N_0(\delta(\beta_M)\beta_M^{-1}) \\
N_1(\delta(\beta_1)\beta_1^{-1}) & \cdots & N_1(\delta(\beta_M)\beta_M^{-1}) \\
\vdots & \ddots & \vdots \\
N_{d-1}(\delta(\beta_1)\beta_1^{-1}) & \cdots & N_{d-1}(\delta(\beta_M)\beta_M^{-1})
\end{pmatrix}. \qquad (42)
\]

This identity was first obtained by Lam and Leroy in [25, Eq. (4.8)]. Exactly as in Example 54, we may extend the identity (42) to any conjugacy class, not only C(0). Again, we may paste together Wronskian matrices defined over pair-wise distinct conjugacy classes, as done in (37) for the Moore case. Again, to the best of our knowledge, the first time such a pasting was considered was in [29, p. 604]. Theorem 11 states that the rank of the Wronskian matrix (40) is

\[
\mathrm{Rk}(V_d^D(0, B^D)) = \dim_{K_0}^R \left( \langle B^D \rangle_{K_0}^R \right), \qquad (43)
\]

whenever $d \geq |B^D|$, where we note that $K_0 = \{\beta \in F \mid \delta(\beta) = 0\}$. This result is essentially proven by Muir in [34, Bullets 199 & 200] when $B^D$ is a set of real differentiable functions. The general case was first considered by Jacobson in [16, Lemma 2], and stated in the form (43) by Lam and Leroy in [25, Theorem 4.9]. We note that [1, 25] further considered the case where σ ≠ Id and δ is a σ-derivation. Finally, by Theorem 11, the identity (43) can be extended to several pair-wise distinct conjugacy classes, as done in (38) for the Moore case.

In contrast with Example 54, the multivariate case of Wronskian matrices has been considered and applied before. It was used by Roth to prove his theorem on rational approximations to irrational algebraic numbers [41], solving a problem started by Liouville more than a century earlier and earning Roth a Fields Medal. We illustrate [41, Lemma 1] as stated by Roth. Let n > 1, $a = 0 \in F^n$ and σ = Id, and let $\delta : F \to F^n$ be arbitrary. Note that its $i$th component $\delta_i : F \to F$ is a standard derivation, for $i = 1, 2, \ldots, n$. If $m = x_{i_1} x_{i_2} \cdots x_{i_s} \in \mathcal{M}$ is a skew monomial in F[x; Id, δ], for some non-negative integer s (set m = 1 if s = 0), then we call

\[
D_0^{m} = \delta_{i_1} \circ \delta_{i_2} \circ \cdots \circ \delta_{i_s} : F \longrightarrow F
\]

a derivation of order $s = \deg(m)$. For a set $B^D \subseteq F$ of size M = d, the linearized Vandermonde matrix $V_d^D(0, B^D)$ has as rows the evaluations on $B^D$ of all derivations of orders $s = 0, 1, \ldots, d-1$. Then [41, Lemma 1] says that $B^D$ is a linearly independent set over $K_0 = \{\beta \in F \mid \delta(\beta) = 0\}$ if, and only if, every $d \times d$ submatrix of $V_d^D(0, B^D)$ is invertible, which is equivalent to $\mathrm{Rk}(V_d^D(0, B^D)) = d$. Thus such a result is recovered from Theorem 11 in the particular case considered by Roth. Observe that, in Roth's case (i.e., $F = \mathbb{Q}(z_1, z_2, \ldots, z_n)$ for algebraically independent variables $z_1, z_2, \ldots, z_n$), it holds that $\delta_i \circ \delta_j = \delta_j \circ \delta_i$, for all $i, j = 1, 2, \ldots, n$, whereas Theorem 11 works even if this is not the case and even if F is non-commutative.

Even further and more interestingly, we may extend Roth's result to the case of several pair-wise distinct conjugacy classes. For illustration purposes, we conclude with such an example, which is the differential version of (39). Let n = 2 and set σ = Id and $\delta = (\delta_1, \delta_2)^T$, for two standard derivations $\delta_1, \delta_2 : F \to F$. Let M = 3 and d = 3, and let $a = (a_1, a_2), b = (b_1, b_2) \in F^2$ and $\beta_1, \beta_2 \in F$. The linearized Vandermonde matrix $V_3^D((a, b), (\{\beta_1, \beta_2\}, \{\beta_1\})) \in F^{7 \times 3}$ is given in this case by

\[
\left( V_3^D(a, \beta_1), \; V_3^D(a, \beta_2), \; V_3^D(b, \beta_1) \right) \in F^{7 \times 3}, \qquad (44)
\]

where, for instance,

\[
V_3^D(a, \beta_1) = \begin{pmatrix}
\beta_1 \\
\delta_1(\beta_1) + \beta_1 a_1 \\
\delta_2(\beta_1) + \beta_1 a_2 \\
\delta_1^2(\beta_1) + 2\delta_1(\beta_1) a_1 + \beta_1(\delta_1(a_1) + a_1^2) \\
\delta_1(\delta_2(\beta_1)) + \delta_1(\beta_1) a_2 + \delta_2(\beta_1) a_1 + \beta_1(\delta_1(a_2) + a_2 a_1) \\
\delta_2(\delta_1(\beta_1)) + \delta_2(\beta_1) a_1 + \delta_1(\beta_1) a_2 + \beta_1(\delta_2(a_1) + a_1 a_2) \\
\delta_2^2(\beta_1) + 2\delta_2(\beta_1) a_2 + \beta_1(\delta_2(a_2) + a_2^2)
\end{pmatrix} \in F^{7 \times 1}. \qquad (45)
\]

In this case, if a and b are non-conjugate (i.e., there is no $\beta \in F^*$ such that $b_i - a_i = \delta_i(\beta)\beta^{-1}$, for i = 1, 2), then we have that

\[
\mathrm{Rk}\left( V_3^D((a, b), (\{\beta_1, \beta_2\}, \{\beta_1\})) \right) = \dim_{K_0}^R \left( \langle \beta_1, \beta_2 \rangle_{K_0}^R \right) + \dim_{K_0}^R \left( \langle \beta_1 \rangle_{K_0}^R \right),
\]

since now $K_a = K_b = K_0 = \{\beta \in F \mid \delta(\beta) = 0\}$, for any $a, b \in F^n$.
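The differential truncated norms of (41) are easy to compute symbolically. The sketch below works over the rational function field Q(z) with δ = d/dz, representing polynomials by coefficient lists; all function names are illustrative, not notation from the text.

```python
# Polynomials over Q as coefficient lists [c0, c1, ...], c_k the coefficient of z^k.
def pmul(p, q):
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def padd(p, q):
    m = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0) for k in range(m)]

def delta(p):
    """The standard derivation d/dz."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def diff_norm(i, b):
    """N_i(b) = N_{i-1}(b) b + delta(N_{i-1}(b)), with N_0(b) = 1, as in (41)."""
    n = [1]
    for _ in range(i):
        n = padd(pmul(n, b), delta(n))
    return n

z = [0, 1]
print(diff_norm(2, z))  # [1, 0, 1]   i.e. N_2(z) = z^2 + 1
print(diff_norm(3, z))  # [0, 3, 0, 1]   i.e. N_3(z) = z^3 + 3z
```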

5.3 Linear codes and their generator matrices

In Coding Theory, a full-rank matrix in $F^{d \times M}$ over a finite field F, with d ≤ M, defines via its row space a linear code of dimension d and length M. Rectangular Vandermonde matrices using conventional univariate polynomials, i.e., as in (26) with d ≤ M, define the linear codes known as Reed-Solomon codes [40]. Such codes have numerous applications in telecommunications, data storage and cryptography, among others, since they are MDS (their minimum Hamming distance attains the Singleton bound). The extensions of such codes obtained by using multivariate conventional Vandermonde matrices such as (28), univariate normed Vandermonde matrices (29) and univariate Moore matrices (34) form, respectively, Reed-Muller codes [35, 39], skew Reed-Solomon codes [5] and Gabidulin codes [10]. The last of these families, Gabidulin codes, is well known for being MRD (their minimum rank distance attains the Singleton bound).

Skew Reed-Solomon codes [5] are the most general linear codes obtainable via skew evaluation (Definition 7) of univariate skew polynomials. Similarly, the most general linear codes obtainable via linearized evaluation (Definition 9) of univariate linearized polynomials are linearized Reed-Solomon codes [29]. These latter codes simultaneously extend Reed-Solomon codes [40] and Gabidulin codes [10], and, more importantly, they are MSRD (their minimum sum-rank distance attains the Singleton bound).

In this subsection, we will define the most general linear codes obtainable via skew evaluation (Definition 7) and linearized evaluation (Definition 9) of multivariate skew and linearized polynomials. We will show the relation between them (Proposition 58), similar to the relation between skew and linearized Reed-Solomon codes [29, Proposition 33]. We will define skew metrics (Definition 59), extending the definition given in [29, Definition 9]. We will show the relations between skew metrics, sum-rank metrics and the Hamming metric (Proposition 60 and Theorem 12), and we will conclude that a number which is a lower bound on the minimum Hamming distance of all skew Reed-Muller codes (resp. linearized Reed-Muller codes) over the same skew polynomial ring, same P-closed set and of equal degree, is also a lower bound on their minimum skew distance (resp. minimum sum-rank distance).

We note that an alternative definition of skew Reed-Muller codes was recently given in [12] using iterated skew polynomial rings. As noted in [31], evaluations using iterated skew polynomial rings coincide with those of free multivariate skew polynomials in some cases [31, Remark 21], but they cannot be defined in all cases [31, Remark 8]. In addition, linearized Reed-Muller codes, skew metrics and their connections to the Hamming and sum-rank metrics were not provided in [12]. Recently, a linearized notion of Reed-Muller codes over finite extensions of fields was introduced in [4]. Since they are defined by conventional evaluations of polynomials in the Galois group algebra, such codes coincide with our notion of linearized Reed-Muller codes in Definition 57 defined over fields and a single conjugacy class, by Example 13 and [4, Remark 45]. Therefore, Proposition 58 provides a connection between the codes in [12, 4] whenever the corresponding particular cases intersect.

We start with the main definitions. We will fix a skew polynomial ring F[x; σ, δ] in n variables, and for a positive integer d, we will denote

\[
F[x; \sigma, \delta]_d = \{ F \in F[x; \sigma, \delta] \mid \deg(F) < d \}, \qquad (46)
\]

which is a left vector space over F of dimension $|\mathcal{M}_d|$, where $\mathcal{M}_d$ is as in Definition 46.
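Since $\mathcal{M}_d$ is the set of skew monomials of degree less than d in n free variables, the dimension $|\mathcal{M}_d|$ of F[x; σ, δ]_d is the number of words of length less than d over an alphabet of n letters, i.e. $1 + n + \cdots + n^{d-1}$. A one-line sketch (the function name is ours):

```python
def num_monomials(n, d):
    """|M_d|: number of words of length < d in the free monoid on x_1, ..., x_n."""
    return sum(n ** k for k in range(d))

print(num_monomials(1, 5))  # 5: the univariate case, monomials 1, x, ..., x^4
print(num_monomials(2, 3))  # 7: matches the 7 rows of the matrices in (39) and (45)
```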

Definition 56 (Skew Reed-Muller codes). Given a P-independent set $B = \{b_1, b_2, \ldots, b_M\} \subseteq F^n$ of size M, and given d ≥ 1, we define the skew Reed-Muller code of degree d over B as the left F-linear code

\[
\mathcal{C}_d^{\sigma,\delta}(B) = \{ (F(b_1), F(b_2), \ldots, F(b_M)) \mid F \in F[x; \sigma, \delta]_d \} \subseteq F^M.
\]

And now we define their linearized version.

Definition 57 (Linearized Reed-Muller codes). Let $a_1, a_2, \ldots, a_\ell \in F^n$ be pair-wise non-conjugate elements, and let $B_i^D = \{\beta_1^{(i)}, \beta_2^{(i)}, \ldots, \beta_{M_i}^{(i)}\} \subseteq F$ be right $K_{a_i}$-linearly independent sets, for $i = 1, 2, \ldots, \ell$. We define the linearized Reed-Muller code of degree d over $a = (a_1, a_2, \ldots, a_\ell)$ and $B^D = (B_1^D, B_2^D, \ldots, B_\ell^D)$ as the left F-linear code

\[
\mathcal{C}_d^{\sigma,\delta}(a, B^D) = \left\{ \left( F^{D_{a_i}}(\beta_1^{(i)}), F^{D_{a_i}}(\beta_2^{(i)}), \ldots, F^{D_{a_i}}(\beta_{M_i}^{(i)}) \right)_{i=1}^{\ell} \,\middle|\, F \in F[x; \sigma, \delta]_d \right\} \subseteq F^M,
\]

where $M = M_1 + M_2 + \cdots + M_\ell$.

Obviously, skew Reed-Solomon codes [5, Definition 7] and linearized Reed-Solomon codes [29, Definition 31] are the particular cases of skew and linearized Reed-Muller codes obtained by setting n = 1 in Definitions 56 and 57, respectively. The following result follows from the previous two definitions and Theorems 10 and 11.

Proposition 58. With notation as in Definitions 56 and 57 and as in Theorems 10 and 11, we have that $V_d(B) \in F^{|\mathcal{M}_d| \times M}$ is a generator matrix of $\mathcal{C}_d^{\sigma,\delta}(B) \subseteq F^M$, and $V_d^D(a, B^D) \in F^{|\mathcal{M}_d| \times M}$ is a generator matrix of $\mathcal{C}_d^{\sigma,\delta}(a, B^D) \subseteq F^M$ (possibly not full-rank in both cases).
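To illustrate the generator matrices of Proposition 58 and the rank formula (36) in the smallest interesting case, the sketch below builds a Moore matrix (34) over GF(4) with σ the Frobenius automorphism and checks that its rank over GF(4) equals the GF(2)-dimension of the span of $B^D$ — so the matrix generates a code of that dimension, possibly without being full-rank. The GF(4) encoding and all function names are illustrative assumptions, not from the text.

```python
# GF(4) encoded as 0..3: addition is XOR; multiplication via discrete logs.
EXP, LOG, INV = [1, 2, 3], {1: 0, 2: 1, 3: 2}, {1: 1, 2: 3, 3: 2}

def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[(LOG[x] + LOG[y]) % 3]

def frob(x):  # sigma(x) = x^2
    return mul(x, x)

def rank_gf4(rows):
    """Row rank over GF(4) by Gaussian elimination."""
    M = [r[:] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(rk, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        M[rk] = [mul(INV[M[rk][c]], x) for x in M[rk]]
        for i in range(len(M)):
            if i != rk and M[i][c]:
                f = M[i][c]
                M[i] = [x ^ mul(f, y) for x, y in zip(M[i], M[rk])]
        rk += 1
    return rk

def gf2_dim(elems):
    """Dimension over GF(2) of the span, using the 2-bit vector encoding of GF(4)."""
    basis = []
    for x in elems:
        for b in basis:
            x = min(x, x ^ b)
        if x:
            basis.append(x)
    return len(basis)

B = [1, 2, 3]            # 1, a, a+1: GF(2)-linearly dependent, since a+1 = a XOR 1
moore, row = [], B[:]
for _ in range(3):       # d = 3 >= |B^D|
    moore.append(row)
    row = [frob(x) for x in row]

print(rank_gf4(moore), gf2_dim(B))  # 2 2: rank < |B|, matching (36)
```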

In particular, if the relation (32) holds, then we also have that

\[
\mathcal{C}_d^{\sigma,\delta}(a, B^D) = \mathcal{C}_d^{\sigma,\delta}(B) \cdot \begin{pmatrix}
\beta_1^{(1)} & 0 & \cdots & 0 \\
0 & \beta_2^{(1)} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \beta_{M_\ell}^{(\ell)}
\end{pmatrix}. \qquad (47)
\]

We observe that classical Reed-Muller codes [35, 39] are recovered from both skew and linearized Reed-Muller codes by letting F be a finite field and setting σ = Id and δ = 0, thus $M_1 = M_2 = \cdots = M_\ell = 1$ and ℓ = M, and further setting $\beta_1^{(1)} = \beta_1^{(2)} = \cdots = \beta_1^{(\ell)} = 1$. Finding the exact dimension and minimum distance of such codes is left as an open problem.

We anticipate that the right metrics for skew and linearized Reed-Muller codes are the corresponding skew and sum-rank metrics, defined exactly as in [29, Definition 11] and [29, Definition 25], respectively, but using multivariate skew polynomials. We now develop these ideas briefly, starting with a definition of skew metrics that trivially extends [29, Definition 9].

Definition 59 (Skew metrics). For $F \in F[x; \sigma, \delta]_M$, we define its (σ, δ)-skew polynomial weight, or just skew weight for simplicity, over the P-closed set $\Omega \subseteq F^n$ generated by B as

\[
\mathrm{wt}_\Omega^{\sigma,\delta}(F) = \mathrm{Rk}(\Omega) - \mathrm{Rk}(Z_\Omega(F)) = M - \mathrm{Rk}(Z_\Omega(F)),
\]

where $Z_\Omega(F) = Z(F) \cap \Omega$ is the P-closed set of zeros of F in Ω. Observe that $0 \leq \mathrm{wt}_\Omega^{\sigma,\delta}(F) \leq M$.

Now, with notation as in Definition 7, for an arbitrary $f \in F^B$, there exists $F \in F[x; \sigma, \delta]_M$ such that $f = E_B^S(F)$ by Theorem 2. We then define

\[
\mathrm{wt}_B^{\sigma,\delta}(f) = \mathrm{wt}_\Omega^{\sigma,\delta}(F).
\]

Finally, we define the skew metrics $d_\Omega^{\sigma,\delta} : F[x; \sigma, \delta]_M^2 \to \mathbb{N}$ and $d_B^{\sigma,\delta} : (F^B)^2 \to \mathbb{N}$ given by

\[
d_\Omega^{\sigma,\delta}(F, G) = \mathrm{wt}_\Omega^{\sigma,\delta}(F - G),
\]

for all $F, G \in F[x; \sigma, \delta]_M$, and analogously for $d_B^{\sigma,\delta}$, using instead $f = E_B^S(F)$ and $g = E_B^S(G)$.

Observe that the previous definition is consistent: if $F_1, F_2 \in F[x; \sigma, \delta]_M$ are both such that $f = E_B^S(F_1) = E_B^S(F_2)$, then $E_\Omega^S(F_1) = E_\Omega^S(F_2)$ by Theorem 2, hence $\mathrm{wt}_\Omega^{\sigma,\delta}(F_1) = \mathrm{wt}_\Omega^{\sigma,\delta}(F_2)$. We will not prove that skew metrics actually satisfy the axioms of a metric, since we will show in Theorem 12 that they are a particular case of sum-rank metrics (Definition 61).

We have defined skew metrics in the vector space $F^B$. However, they can be trivially translated into metrics in $F^M$ by an ordering of B. Similarly, if we define the Hamming weight in $F^B$ as

\[
\mathrm{wt}_H(f) = |\{ b \in B \mid f(b) \neq 0 \}|,
\]

for all $f \in F^B$, then we have the following result, which extends [29, Proposition 13] and [29, Proposition 14]. The proof follows exactly the same lines.

Proposition 60. Given another P-basis A of the P-closed set $\Omega \subseteq F^n$ generated by B, we define the corresponding change-of-P-basis map as

\[
\pi_{B,A} : F^B \longrightarrow F^A, \qquad E_B^S(F) \mapsto E_A^S(F),
\]

for all $F \in F[x; \sigma, \delta]$, which is well defined by Theorem 2. Then it holds that

\[
\mathrm{wt}_B^{\sigma,\delta}(f) = \mathrm{wt}_A^{\sigma,\delta}(\pi_{B,A}(f)),
\]

for all $f \in F^B$. That is, $\pi_{B,A}$ is a left F-linear isometry. Finally, it holds that

\[
\mathrm{wt}_B^{\sigma,\delta}(f) = \min\{ \mathrm{wt}_H(\pi_{B,A}(f)) \mid A \text{ is a P-basis of } \Omega \} \leq \mathrm{wt}_H(f),
\]

for all $f \in F^B$.

Observe that, given a positive integer d, we have the following relation between skew Reed-Muller codes:

\[
\mathcal{C}_d^{\sigma,\delta}(A) = \pi_{B,A}(\mathcal{C}_d^{\sigma,\delta}(B)), \qquad (48)
\]

where A is any other P-basis of the P-closed set generated by B. Together with Proposition 60, we deduce that a number that is a lower bound on the minimum Hamming distance of all skew Reed-Muller codes, for a fixed skew polynomial ring F[x; σ, δ], P-closed set Ω and positive integer d, is also a lower bound on their minimum skew distances.

Finally, we may deduce the same for the minimum sum-rank distance of linearized Reed-Muller codes. We now revisit the general definition of the sum-rank metric from [29, Definition 25].

Definition 61 (Sum-rank metrics [29]). Let $K_1, K_2, \ldots, K_\ell$ be division subrings of F. Given positive integers $M_1, M_2, \ldots, M_\ell$ and $M = M_1 + M_2 + \cdots + M_\ell$, we define the sum-rank weight in $F^M$ with lengths $(M_1, M_2, \ldots, M_\ell)$ and division subrings $(K_1, K_2, \ldots, K_\ell)$ as

\[
\mathrm{wt}_{SR}(c) = \mathrm{wt}_R(c^{(1)}) + \mathrm{wt}_R(c^{(2)}) + \cdots + \mathrm{wt}_R(c^{(\ell)}),
\]

where $c = (c^{(1)}, c^{(2)}, \ldots, c^{(\ell)}) \in F^M$ and $c^{(i)} \in F^{M_i}$, and where

\[
\mathrm{wt}_R(c^{(i)}) = \dim_{K_i}^R \left( \langle c_1^{(i)}, c_2^{(i)}, \ldots, c_{M_i}^{(i)} \rangle_{K_i}^R \right),
\]

for $i = 1, 2, \ldots, \ell$. We define the associated metric $d_{SR} : (F^M)^2 \to \mathbb{N}$ as

\[
d_{SR}(c, d) = \mathrm{wt}_{SR}(c - d),
\]

for all $c, d \in F^M$.

Then the following theorem holds, which is a generalization of [29, Theorem 3]. The proof follows exactly the same lines (see also [29, Theorem 2]).

Theorem 12. If the relation (32) holds, and we denote by $\mathrm{wt}_{SR}$ the sum-rank weight with lengths $(M_1, M_2, \ldots, M_\ell)$ and division subrings $(K_{a_1}, K_{a_2}, \ldots, K_{a_\ell})$, then we have that

\[
\mathrm{wt}_{SR}(c) = \mathrm{wt}_B^{\sigma,\delta}(f),
\]

where $c = (c^{(1)}, c^{(2)}, \ldots, c^{(\ell)}) \in F^M$ and $c^{(i)} \in F^{M_i}$, and where $f \in F^B$ is given by

\[
c_j^{(i)} = f(b_j^{(i)}) \, \beta_j^{(i)}, \qquad (49)
\]

for $j = 1, 2, \ldots, M_i$ and $i = 1, 2, \ldots, \ell$.
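Definition 61 is straightforward to evaluate when F = GF(4) and all $K_i$ = GF(2): identifying GF(4) with GF(2)², each block's rank weight is the GF(2)-dimension of the span of its entries. A minimal sketch follows; the integer encoding and names are our illustrative assumptions, and since GF(4) is commutative, left and right spans coincide.

```python
def gf2_dim(elems):
    """Dimension over GF(2) of the span of GF(4) elements encoded as 2-bit ints."""
    basis = []
    for x in elems:
        for b in basis:
            x = min(x, x ^ b)
        if x:
            basis.append(x)
    return len(basis)

def sum_rank_weight(blocks):
    """wt_SR(c) = sum of the rank weights of the blocks c^(1), ..., c^(l)."""
    return sum(gf2_dim(list(block)) for block in blocks)

# c = (c^(1), c^(2)) with c^(1) = (1, a, a+1) and c^(2) = (a, a):
c = [(1, 2, 3), (2, 2)]
print(sum_rank_weight(c))  # 3 = 2 + 1
```

When every block has length 1, each rank weight is 0 or 1 and wt_SR reduces to the Hamming weight, mirroring the recovery of classical Reed-Muller codes above.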

Observe that the map that transforms skew metrics into sum-rank metrics (49) in the previous theorem is the same map that transforms skew Reed-Muller codes into linearized Reed-Muller codes (Proposition 58). Similarly to the case of skew Reed-Muller codes (48), for a positive integer d, we have the following relation between linearized Reed-Muller codes:

\[
\mathcal{C}_d^{\sigma,\delta}(a, A^D) = \mathcal{C}_d^{\sigma,\delta}(a, B^D) A, \qquad (50)
\]

where $A^D = (A_1^D, A_2^D, \ldots, A_\ell^D)$, $A_i^D = \{\alpha_1^{(i)}, \alpha_2^{(i)}, \ldots, \alpha_{M_i}^{(i)}\}$,

\[
A = \begin{pmatrix}
A_1 & 0 & \cdots & 0 \\
0 & A_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & A_\ell
\end{pmatrix} \in F^{M \times M},
\]

and where $A_i \in K_{a_i}^{M_i \times M_i}$ is the unique invertible matrix over $K_{a_i}$ such that

\[
(\alpha_1^{(i)}, \alpha_2^{(i)}, \ldots, \alpha_{M_i}^{(i)}) = (\beta_1^{(i)}, \beta_2^{(i)}, \ldots, \beta_{M_i}^{(i)}) A_i,
\]

for $i = 1, 2, \ldots, \ell$.

In conclusion, we have established a dictionary that translates skew metrics and skew Reed-Muller codes into sum-rank metrics and linearized Reed-Muller codes. As was the case for skew Reed-Muller codes, we deduce that a number that is a lower bound on the minimum Hamming distance of all linearized Reed-Muller codes, for a fixed skew polynomial ring F[x; σ, δ], P-closed set Ω and positive integer d, is also a lower bound on their minimum sum-rank distances.

Remark 62. To alleviate the study of skew and linearized Reed-Muller codes over a finite field $\mathbb{F}_q$, the results in [30] may be useful. As [30, Theorem 5] shows, every free multivariate skew polynomial ring over a finite field F is isomorphic, by a left F-algebra isomorphism that preserves degrees and skew evaluations, to a free multivariate skew polynomial ring F[x; σ, δ] such that σ = diag(σ_1, σ_2, …, σ_n) (as in Example 3) and δ = 0. Due to (47), such a simplification carries over to both skew and linearized Reed-Muller codes.

6 Generalizations of Galois-theoretic results

In this section, we will define P-Galois extensions of division rings (Subsection 6.1), which recover classical Galois extensions of fields. We will rephrase and extend Corollaries 33, 44 and 32 so that we are able to recover and generalize, respectively, three important results in (finite) Galois theory: Artin's Theorem (Subsection 6.2), the Galois correspondence (Subsection 6.3) and Hilbert's Theorem 90 (Subsection 6.4).

Galois' original presentation of his results [11] lacked the modern formality found in most textbooks. In his lecture notes [3], Artin provided what is nowadays one of the main formal approaches to stating and proving Galois' results. One of his main results is usually known as Artin's Theorem [3, Theorem 14] (Corollary 70 below), which we will show is a particular case of Corollary 33 (see Subsection 6.2). Similarly, we will show that Corollary 44 recovers the classical Galois correspondence [3, Theorem 16]. Furthermore, Corollary 32 (see Subsection 6.4) extends Hilbert's Theorem 90 [14, 20], recovering also Noether's version [36], which is valid over any (finite) Galois extension of fields (not only cyclic) and was also collected by Artin in [3, Theorem 21]. We also note that our general version of Hilbert's Theorem 90 works over any finite extension of division rings that we define as P-Galois, which covers more cases than the alternative version for division rings obtained in [26].

The classical Galois-theoretic results concerning (finite or infinite) groups of field automorphisms could be stated in terms of Homological Algebra; see [42]. However, the skew polynomial language will provide explicit formulations (in the finite case) that are self-contained given the results in this paper.

6.1 P-Galois extensions

In [19, Section VII-5], Jacobson defines Galois extensions of division rings as those where the division subring is the set of fixed elements of a finite group of automorphisms of the larger division ring. We use the same idea to further generalize this concept using centralizers given by skew polynomial rings (Definition 10).

Definition 63 (P-Galois extensions). Given a division ring F and one of its division subrings $K \subseteq F$, we say that the pair $K \subseteq F$ is a P-Galois extension if there exists a positive integer n, a ring morphism $\sigma : F \to F^{n \times n}$, a σ-derivation $\delta : F \to F^n$ and a point $a \in F^n$, such that $K = K_a$ is a centralizer with respect to the skew polynomial ring F[x; σ, δ] as in Definition 10. Unless otherwise stated, we will assume that the skew polynomial ring F[x; σ, δ] is known from the context and will not be specified explicitly. We say that a P-Galois extension $K \subseteq F$ as in the previous paragraph is finite if the right dimension of F over K is finite.

The term P-Galois stands for polynomially Galois and is chosen to be in accordance with the terms in Subsection 3.1. The rest of the subsection is devoted to showing some examples.

Example 64 (Classical Galois extensions). Assume that G is a finite group (commutative or not) of ring automorphisms of F generated by $\sigma_1, \sigma_2, \ldots, \sigma_n$. Then we recover the concept of (finite) Galois extension $K \subseteq F$ by choosing σ = diag(σ_1, σ_2, …, σ_n), δ = 0 and a = 1 in Definition 63. This is simply because an element of F is fixed by all automorphisms in G if, and only if, it is fixed by $\sigma_1, \sigma_2, \ldots, \sigma_n$, which holds if, and only if, it belongs to $K_1$. See also Example 13. Note also that here F could be commutative, as in classical Galois theory [3], or not, as considered by Jacobson in [19, Section VII-5].
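For the smallest non-trivial instance of Example 64, take F = GF(4) and G generated by the Frobenius automorphism σ(β) = β². The sketch below recovers the centralizer K₁ as the fixed subfield GF(2); the integer encoding and function names are illustrative assumptions, not notation from the text.

```python
# GF(4) encoded as 0..3 (representing 0, 1, a, a+1); multiplication via discrete logs.
EXP, LOG = [1, 2, 3], {1: 0, 2: 1, 3: 2}

def mul(x, y):
    return 0 if x == 0 or y == 0 else EXP[(LOG[x] + LOG[y]) % 3]

def frob(x):  # sigma(x) = x^2 generates G = Gal(GF(4)/GF(2)), with |G| = 2
    return mul(x, x)

K1 = [b for b in range(4) if frob(b) == b]
print(K1)  # [0, 1]: the centralizer K_1 is the fixed subfield GF(2)
```

Consistent with Artin's Theorem as recovered in Subsection 6.2, dim over K₁ of GF(4) is 2 = |G|.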

Example 65 (The case of derivations). Assume that δ1,δ2,...,δn : F −→ F are standard derivations (i.e. Id-derivations) such that ∇ = {m(δ) | m ∈ M}, defined as in Example 15, is a finite set. Choosing also σ = Id and a = 0 in Definition 63, the P-Galois extension K0 ⊆ F recovers as particular cases those considered in [16] based on standard derivations. See also Example 15.

Example 66 (The case of σi-derivations). The previous example could be trivially extended to the case where δi is a σi-derivation, for i =1, 2,...,n.

Example 67 (Mixed cases). Let $\sigma_1, \sigma_2, \ldots, \sigma_{n_1} : F \to F$ be ring endomorphisms, let $\delta_1, \delta_2, \ldots, \delta_{n_2} : F \to F$ be standard derivations, and let $n = n_1 + n_2$. Choosing $a \in F^n$ such that $a_i = 1$, for $i = 1, 2, \ldots, n_1$, and $a_i = 0$, for $i = n_1 + 1, n_1 + 2, \ldots, n$, we have that $K \subseteq F$ is the division subring of elements of F that are both fixed by $\sigma_1, \sigma_2, \ldots, \sigma_{n_1}$ and annihilated by the derivations $\delta_1, \delta_2, \ldots, \delta_{n_2}$. In other words,

K = {β ∈ F | σi(β)= β,δj (β)=0, 1 ≤ i ≤ n1, 1 ≤ j ≤ n2}.

However, note that this case can be recovered from the previous example, by considering $\sigma_i$-derivations, for $i = 1, 2, \ldots, n_1$, and setting a = 0.

Example 68 (Wild example II). Let the definitions and notations be as in Example 6. The field extension $\mathbb{F}_4(z^6) \subsetneq \mathbb{F}_4(z)$, where z is transcendental over $\mathbb{F}_4$, is a finite P-Galois extension of fields, since $\mathbb{F}_4(z^6) = K_1$ for the bivariate skew polynomial ring $F[x_1, x_2; \sigma_Z, 0]$, and the dimension of $\mathbb{F}_4(z)$ over $\mathbb{F}_4(z^6)$ is 6. The latter fact is simply because $x^6 + z^6 \in (\mathbb{F}_4(z^6))[x]$ is the minimal polynomial of z over $\mathbb{F}_4(z^6)$ and has degree 6. However, as shown in Example 6, the extension $\mathbb{F}_4(z^6) \subsetneq \mathbb{F}_4(z)$ is not of the form of Examples 64, 65, 66 or 67 (we already knew that $\mathbb{F}_4(z^6) \subsetneq \mathbb{F}_4(z)$ is not a classical Galois extension because it is not separable, since $d/dx(x^6 + z^6) = 0$).
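The inseparability claim at the end of Example 68 can be checked mechanically: in characteristic 2, the formal derivative of x⁶ + z⁶ with respect to x kills every term, since all x-degrees are even. A tiny sketch tracking only x-coefficient parities (the constant z⁶ is encoded as the nonzero scalar 1; the function name is ours):

```python
def ddx_char2(p):
    """Formal derivative in x over a field of characteristic 2, with p a list of
    x-coefficients reduced to their parity (0 or 1)."""
    return [(k * c) % 2 for k, c in enumerate(p)][1:] or [0]

p = [1, 0, 0, 0, 0, 0, 1]   # x^6 + z^6, with the scalar z^6 encoded as 1
print(ddx_char2(p))          # [0, 0, 0, 0, 0, 0]: the derivative vanishes
```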

6.2 Generalizing Artin's Theorem

The following result follows from Corollary 33 and gives a sufficient criterion for finiteness and an upper bound on the right dimension of a P-Galois extension.

Theorem 13. A P-Galois extension $K \subseteq F$ as in Definition 63 is finite if, and only if, $F[\mathcal{D}_a]$ has finite left dimension over F, in which case the latter dimension coincides with $\dim_{K_a}^R(F)$. In particular, $K \subseteq F$ is finite if the set of right K-linear maps

\[
\mathcal{D}_a^{\mathcal{M}} = \{ D_a^{m} : F \longrightarrow F \mid m \in \mathcal{M} \} \qquad (51)
\]

is finite, and in such a case, we have that

\[
\dim_K^R(F) \leq |\mathcal{D}_a^{\mathcal{M}}|. \qquad (52)
\]

We now show that Artin's Theorem [3, Theorem 14] is a particular case of Theorem 13, when choosing fields and parameters as in Example 64. Furthermore, in this case equality holds in (52), which is due to the following straightforward result (see [3, Theorem 12]), commonly known as Dedekind's lemma.

Lemma 69 (Dedekind [3]). Let F be a field and let $\sigma_1, \sigma_2, \ldots, \sigma_n : F \to F$ be n pair-wise distinct automorphisms of F. Then $\sigma_1, \sigma_2, \ldots, \sigma_n$ are linearly independent over F.

This lemma implies that $\dim_F(F[G]) = |G|$, where G is a finite group generated by pair-wise distinct field automorphisms $\sigma_1, \sigma_2, \ldots, \sigma_n : F \to F$. With this fact, the original Artin's Theorem [3, Theorem 14] follows immediately from Theorem 13, by choosing fields and parameters as in Example 64.

Corollary 70 (Artin's Theorem [3]). If F is a field and G is a finite group of automorphisms of F, then

\[
\dim_K(F) = |G|, \qquad (53)
\]

where K is the subfield of elements fixed by G. In particular, the extension $K \subseteq F$ is finite.

In the case when F is non-commutative in Example 64, the equality (53) does not hold in general. In such a case, it still holds that $\dim_K^R(F) = \dim_K^L(F) \leq |G| < \infty$, as implied by (52), but this dimension equals what is known as the reduced order of G, which can be strictly smaller than |G| (the order of G) if F is non-commutative. See [19, Section VII-5] for more details.

We may similarly give a result for the case of derivations by choosing parameters as in Examples 65 and 66.

Corollary 71. Let σi : F −→ F be a ring endomorphism and let δi : F −→ F be a σi-derivation, for i =1, 2,...,n. If ∇ = {m(δ) | m ∈ M}, defined as in Examples 65 and 66, is a finite set and K ⊆ F is the division subring of constants of ∇, that is, K = {β ∈ F | D(β)=0, ∀D ∈ ∇}, then R dimK (F) ≤ |∇|. (54) In particular, the extension K ⊆ F is finite.

The true dimension dim_K^R(F) in the setting of Corollary 71 was found by Jacobson in a special case in [16] (see also [6, Theorem 2, p. A.V.103]), where equality in (54) is also attained. We leave as an open problem to explicitly find the true dimension dim_K^R(F) for general finite P-Galois extensions of division rings.

6.3 A P-Galois correspondence

We now turn to P-Galois correspondences. Given a finite Galois extension of fields K ⊆ F, the Fundamental Theorem of Galois Theory establishes a correspondence between Galois extensions K′ ⊆ F such that K ⊆ K′ and the subgroups of the Galois group of K ⊆ F. Artin was one of the first to formally write this result in [3, Theorem 16]. We next extend this theorem to general finite P-Galois extensions in terms of linearized polynomials. We note that a general Galois correspondence (known as the Jacobson-Bourbaki Theorem) between finite extensions of division rings and certain algebras was established in [7, 17, 18], but without obtaining the explicit structure of skew and linearized polynomials.

Theorem 14. Let a, b ∈ F^n. If F[D_a] = F[D_b], then K_a = K_b. Conversely, if K_a = K_b and K_a ⊆ F is a finite P-Galois extension, then F[D_a] = F[D_b].

Proof. First, if F[D_a] = F[D_b], then K_a = K_b by Proposition 12. We now prove the reversed implication, under the assumption that m = dim_{K_a}^R(F) < ∞. Denote K = K_a = K_b and let β = (β_1, β_2, ..., β_m) ∈ F^m be an ordered right basis of F over K. By Corollary 44, the maps μ_β ◦ E_β^L : F[D_a] −→ K^{m×m} and μ_β ◦ E_β^L : F[D_b] −→ K^{m×m} are ring isomorphisms. Thus we have a ring isomorphism

ψ = (E_β^L)^{−1} ◦ μ_β^{−1} ◦ μ_β ◦ E_β^L : F[D_a] −→ F[D_b].

This map is obviously ψ = (E_β^L)^{−1} ◦ E_β^L, where the domain of E_β^L is F[D_a] in the non-inverted case, and it is F[D_b] in the inverted case. However, if F^{D_a} ∈ F[D_a] and G^{D_b} = ψ(F^{D_a}) ∈ F[D_b], then by definition of ψ, we have that

E_β^L(F^{D_a}) = E_β^L(G^{D_b}) ∈ F^m,

hence F^{D_a} and G^{D_b} are the same right K-linear map from F to F, since they coincide on the ordered right basis β of F over K. In other words, F^{D_a} = G^{D_b}, therefore ψ is the identity and we deduce that F[D_a] = F[D_b], as desired.

Remark 72. It is important to observe that F[D_a] = F[D_b] does not mean that D_a = D_b. Hence it is not true that, for any F ∈ F[x; σ, δ], it holds that F^{D_a} = F^{D_b}. See, for instance, Example 74 below.

Example 73 (The classical Galois correspondence). Let the setting be as in Example 64. For a ∈ F^n, define its support as

Supp(a) = {i ∈ {1, 2, ..., n} | a_i ≠ 0}.

Denote I = Supp(a) ⊆ {1, 2, ..., n} and let H_I be the subgroup of G generated by σ_i, for i ∈ I. Then, it is immediate to check that

K_a = F^{H_I} = {β ∈ F | τ(β) = β, ∀τ ∈ H_I}, and

F[D_a] = F[H_I] = { ∑_{τ ∈ H_I} F_τ τ | F_τ ∈ F, ∀τ ∈ H_I }.

That is, K_a and F[D_a] are, respectively, the subfield of elements fixed by H_I and the group algebra of H_I over F. We know that the group algebras of the subgroups of G are in bijective correspondence with the subgroups themselves, as implied, for instance, by Dedekind's lemma (Lemma 69). Hence Theorem 14 recovers Galois' original correspondence [3, Theorem 16]. For a full correspondence with all subgroups of G, we may choose G = {σ_1, σ_2, ..., σ_n}.

It is important to note that this example shows how to recover Galois' original correspondence [3, Theorem 16] if we define a finite Galois extension K ⊆ F as an extension where K is the subfield of fixed elements of a finite group of automorphisms of F. However, as is well known, an alternative useful way of seeing finite Galois extensions is as finite, normal and separable extensions, as stated by Artin in [3, Theorem 16]. Our results do not imply the normal and/or separable properties.

Example 74 (The case of derivations). Let the setting be as in Example 65. If we choose σ = Id, then for any a ∈ F^n, we have that

K_a = F^δ = {β ∈ F | δ_i(β) = 0, ∀i = 1, 2, ..., n}, and

F[D_a] = F[∇] = { ∑_{m ∈ M} F_m m(δ) | F_m ∈ F, ∀m ∈ M },

with notation as in Example 65. Therefore, this is a case where K_a = K_b and F[D_a] = F[D_b], for all a, b ∈ F^n. However, it may happen that D_a ≠ D_b. Take for instance a = 0 and b = 1. Then, D_a^{x_i} = δ_i, while D_b^{x_i} = δ_i + 1, for i = 1, 2, ..., n. We see that in both cases we obtain the polynomial ring in the operators δ_1, δ_2, ..., δ_n, but where D_b corresponds to the left basis δ_1 + 1, δ_2 + 1, ..., δ_n + 1.
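To ground the subgroup–subfield side of Example 73 computationally, the sketch below takes F = GF(2^6) with G generated by the Frobenius σ : x ↦ x^2, and checks that the elements fixed by the subgroup generated by σ^2 (that is, by x ↦ x^4) form the subfield GF(4), with exactly 4 elements. The realization of GF(64) as GF(2)[a]/(a^6 + a + 1) and the helper names are our own illustration choices, not from the text.

```python
# GF(64) = GF(2)[a]/(a^6 + a + 1); elements are bitmasks 0..63 (bit k = coeff of a^k).
def gmul(x, y):
    r = 0
    for _ in range(6):
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 64:
            x ^= 0b1000011  # reduce by a^6 + a + 1
    return r

def gpow(x, k):
    r = 1
    for _ in range(k):
        r = gmul(r, x)
    return r

# sigma^2 is x -> x^4; its fixed elements form the fixed field of the subgroup <sigma^2>.
fixed = [x for x in range(64) if gpow(x, 4) == x]

assert len(fixed) == 4  # the fixed field is GF(4) inside GF(64)
assert all(gmul(x, y) in fixed for x in fixed for y in fixed)  # closed under products
```

Choosing σ^3 (x ↦ x^8) instead cuts out GF(8), in line with the subgroup–subfield correspondence recovered by Theorem 14.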

6.4 Generalizing Hilbert's Theorem 90

A general version of Hilbert's Theorem 90 can be obtained immediately from Corollary 32. Note that it holds for arbitrary division rings, and not only for fields.

Theorem 15. Let a ∈ F^n and assume that F is a finite-dimensional right vector space over K_a. For b = (b_1, b_2, ..., b_n) ∈ F^n, there exists β ∈ F^∗ such that

b = D_a(β)β^{−1},

if and only if b_1, b_2, ..., b_n ∈ F satisfy the following Noether equations:

F(b) = 0, for all F ∈ I(C(a)).

Proof. If F is a finite-dimensional right vector space over K_a, then C(a) is P-closed and finitely generated by Corollary 32. Thus the result follows from the definition of P-closed sets and conjugacy.

In order to make Theorem 15 more manageable in concrete examples, it is preferable to pick a particular set of generators of I(C(a)). Fortunately, we may actually take a finite set of such generators from a finite set of monomials, thanks to the following lemma.

Lemma 75. Let a ∈ F^n and assume that F is a finite-dimensional right vector space over K_a. Let N ⊆ M be a minimal finite set such that D_a^N is a left basis of F[D_a]. The set N exists by Corollary 33. Then the set

N^c = {x_i n ∈ M | 1 ≤ i ≤ n, n ∈ N ∪ {1}, x_i n ∉ N}

is the smallest subset N′ ⊆ M \ N such that, if m ∈ M \ N, then there exist m′ ∈ M and n′ ∈ N′ with m = m′n′. Furthermore, there exist F_n^{n′} ∈ F, for n ∈ N and n′ ∈ N^c, such that

I(C(a)) = ( { n′ − ∑_{n ∈ N} F_n^{n′} n ∈ F[x; σ, δ] | n′ ∈ N^c } ),    (55)

and it holds that |N^c| ≤ n(|N| + 1), where equality may be attained.

Proof. The minimality of N^c and the upper bound on its size are easy to see. Since D_a^N is a left basis of F[D_a], there exist F_n^m ∈ F such that

D_a^m = ∑_{n ∈ N} F_n^m D_a^n,    (56)

for m ∈ M and n ∈ N, where F_n^m = δ_{m,n} if m ∈ N. Let now I ⊆ F[x; σ, δ] be the left ideal on the right-hand side of (55). Using Theorem 1, it follows from (56) that I ⊆ I(C(a)). Therefore, there exists a canonical surjective left linear map

ρ : F[x; σ, δ]/I −→ F[x; σ, δ]/I(C(a)).

Now, it is easy to see that dim_F^L(F[x; σ, δ]/I) = |N| = |D_a^N|. Since |D_a^N| = dim_F^L(F[D_a]) = dim_F^L(F[x; σ, δ]/I(C(a))) by Corollary 33, we conclude that ρ is a left vector space isomorphism. Hence I = I(C(a)) and we are done.

Therefore we may obtain a more particular version of Theorem 15 as follows, where there is only a finite set of Noether equations.

Theorem 16. Let a ∈ F^n and assume that F is a finite-dimensional right vector space over K_a. Let N, N^c ⊆ M be as in Lemma 75. For b = (b_1, b_2, ..., b_n) ∈ F^n, there exists β ∈ F^∗ such that

b = D_a(β)β^{−1},

if and only if b_1, b_2, ..., b_n ∈ F satisfy the following no more than n(|N| + 1) Noether equations:

N_{n′}(b) = ∑_{n ∈ N} F_n^{n′} N_n(b), for all n′ ∈ N^c.

Hilbert's Theorem 90 was originally due to Kummer [20], and its name comes from the fact that it appears as "Theorem 90" in Hilbert's notes on algebraic number theory [14]. The original theorem (i.e., [14, Theorem 90]), which we revisit below in Corollary 77, is only valid for cyclic Galois field extensions. A version of Hilbert's Theorem 90 for general Galois field extensions is due to Noether [36]. The term "Noether equations" was used by Artin in his lectures on Galois theory [3], and his rewriting of Noether's result [3, Theorem 21] is the closest to our notation. It can be essentially written in terms of skew binomials as follows. Note that, in order to guarantee the finite-dimensionality hypothesis in Theorem 15, we use Corollary 70.

Corollary 76 (Noether's Hilbert 90 [36]). Let K ⊆ F be a finite Galois field extension with Galois group G. Assume that σ_1, σ_2, ..., σ_n are generators of G as a group. For a list b = (b_1, b_2, ..., b_n) ∈ (F^∗)^n, there exists β ∈ F^∗ such that

b_i = σ_i(β)β^{−1}, for all i = 1, 2, ..., n,

if and only if b_1, b_2, ..., b_n ∈ F satisfy the following Noether equations:

N_m(b) = N_n(b), whenever m(σ) = n(σ),

which hold if, and only if, b_1, b_2, ..., b_n ∈ F satisfy the following Noether equations:

N_m(b) = N_n(b), whenever m − n ∈ NR,

where NR ⊆ {m − n | m(σ) = n(σ)} is some finite set of relations of σ_1, σ_2, ..., σ_n defining the group G. Here, for a (free) monomial m ∈ M, m(σ) denotes the symbolic evaluation of m at (σ_1, σ_2, ..., σ_n).

Proof. Let σ = diag(σ_1, σ_2, ..., σ_n) : F −→ F^{n×n} and consider the skew polynomial ring F[x; σ, 0]. It is trivial to see that K = K_1, since K is the subfield of F fixed by the automorphisms in G. Therefore, the result follows immediately from Theorem 16, since NR contains a set N as in Lemma 75, which holds because the relations in NR define the group G.

We conclude with the original Hilbert 90, due to Kummer [20], which is simply the case of cyclic Galois field extensions.

Corollary 77 (Classical Hilbert 90 [14, 20]). Let K ⊆ F be a cyclic finite Galois field extension with Galois group G generated by the automorphism σ : F −→ F. For b ∈ F^∗, there exists β ∈ F^∗ such that b = σ(β)β^{−1}, if and only if b ∈ F is in the kernel of the norm of F over K, that is,

N_{F/K}(b) = σ^{m−1}(b) σ^{m−2}(b) ··· σ(b) b = 1, where m = dim_K(F).

Proof. It follows directly from Corollary 76 by choosing σ_1 = σ as generator of G and NR = {x_1^m − 1} ⊆ F[x_1; σ_1, 0] as the set of relations on σ_1 defining G. Here, we denoted N_{F/K} = N_{x_1^m}.

Remark 78. Setting n = 1 and σ = Id in Theorems 15 and 16, we obtain criteria for two elements a, b ∈ F to be conjugate in the sense that there exists β ∈ F^∗ such that b = a + δ(β)β^{−1}, where δ : F −→ F is a standard derivation. A criterion for conjugacy in this sense was given by Jacobson [16, Theorem 15] when F is a D-field. We leave as an open problem proving Jacobson's criterion from Theorems 15 and 16.
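Corollary 77 can be verified directly in a small example. Below, F = GF(9) = GF(3)[i]/(i^2 + 1) and K = GF(3), with σ the Frobenius β ↦ β^3, so m = 2, N_{F/K}(b) = σ(b)b = b^4, and σ(β)β^{−1} = β^{3−1} = β^2. The sketch checks that the kernel of the norm coincides with the image of β ↦ σ(β)β^{−1}; the representation of GF(9) as pairs and the helper names are our own illustration choices.

```python
# GF(9) = GF(3)[i]/(i^2 + 1); an element a + b*i is the pair (a, b) with a, b in GF(3).
def mul(x, y):
    a, b = x
    c, d = y
    return ((a * c - b * d) % 3, (a * d + b * c) % 3)

def power(x, k):
    r = (1, 0)
    for _ in range(k):
        r = mul(r, x)
    return r

units = [(a, b) for a in range(3) for b in range(3) if (a, b) != (0, 0)]

# Kernel of the norm N(b) = sigma(b) * b = b^3 * b = b^4.
norm_kernel = {b for b in units if power(b, 4) == (1, 0)}

# Image of beta -> sigma(beta) * beta^{-1} = beta^{3 - 1} = beta^2.
image = {power(beta, 2) for beta in units}

assert norm_kernel == image   # Hilbert 90 for GF(9)/GF(3)
assert len(norm_kernel) == 4  # = (9 - 1)/(3 - 1) elements of norm 1
```

More generally, for GF(q^m) over GF(q) the norm kernel has (q^m − 1)/(q − 1) elements, all of the form β^{q−1}, which is exactly what Corollary 77 asserts in the finite-field case.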

Acknowledgement

The author gratefully acknowledges the support from The Independent Research Fund Denmark (Grant No. DFF-7027-00053B).

References

[1] A. S. Amitsur. A generalization of a theorem on linear differential equations. Bulletin of the American Mathematical Society, 54(10):937–941, 1948.

[2] S. A. Amitsur and L. W. Small. Polynomials over division rings. Israel Journal of Mathematics, 31(3):353– 358, Sep 1978.

[3] E. Artin. Galois Theory. Notre Dame Mathematical Lectures, no. 2. University of Notre Dame, Notre Dame, Ind., second edition, 1944.

[4] D. Augot, A. Couvreur, J. Lavauzelle, and A. Neri. Rank-metric codes over arbitrary Galois extensions and rank analogues of Reed-Muller codes. Preprint, arXiv:2006.14489, pages 1–26, 2020.

[5] D. Boucher and F. Ulmer. Linear codes using skew polynomials with automorphisms and derivations. Designs, Codes and Cryptography, 70(3):405–431, 2014.

[6] N. Bourbaki. Algebra II, Chapters 4–7. Berlin: Springer, 1981.

[7] H. Cartan. Les principaux théorèmes de la théorie de Galois pour les corps non nécessairement commutatifs. C. R. Acad. Sci. Paris, 224:249–251, 1947.

[8] P. M. Cohn. Free rings and their relations. London: Academic Press, 1971.

[9] P. Elias. Coding for noisy channels. In Record of the 1955 I.R.E. National Convention, Part 4, pages 37–46. 1955.

[10] E. M. Gabidulin. Theory of codes with maximum rank distance. Problems of Information Transmission, 21, 1985.

[11] E. Galois. Mémoire sur les conditions de résolubilité des équations par radicaux. Collected by M. A. Chevalier and published by J. Liouville in 1846.

[12] W. Geiselmann and F. Ulmer. Skew Reed-Muller codes. In Rings, Modules and Codes, volume 727, pages 107–116. Contemporary Mathematics, 2019.

[13] V. D. Goppa. Codes on algebraic curves. Soviet Math. Dokl., 24(1):170–172, 1981.

[14] D. Hilbert. Die Theorie der algebraischen Zahlkörper, volume 4. Jahresbericht der Deutschen Mathematiker-Vereinigung, 1897.

[15] J. M. Hoene-Wroński. Réfutation de la théorie des fonctions analytiques de Lagrange. 1812. Dédiée à l'Institut Imperial de France.

[16] N. Jacobson. Abstract derivation and Lie algebras. Transactions of the American Mathematical Society, 42(2):206–224, 1937.

[17] N. Jacobson. Galois theory of purely inseparable fields of exponent one. American Journal of Mathematics, 66(4):645–648, 1944.

[18] N. Jacobson. A note on division rings. American Journal of Mathematics, 69(1):27–36, 1947.

[19] N. Jacobson. Structure of rings. Providence: American Mathematical Society, 1956.

[20] E. E. Kummer. Über eine besondere Art, aus complexen Einheiten gebildeter Ausdrücke. Journal für die reine und angewandte Mathematik, 50:212–232, 1855.

[21] J. L. Lagrange. Leçons élémentaires sur les mathématiques données à l'École Normale en 1795, volume 7. In Oeuvres complètes, edited by J.-A. Serret, Gauthier-Villars, Paris, 1867.

[22] T. Y. Lam. A general theory of Vandermonde matrices. Expositiones Mathematicae, 4:193–215, 1986.

[23] T. Y. Lam. A First Course in Noncommutative Rings, volume 131 of Graduate Texts in Mathematics. Springer, New York, NY, 1991.

[24] T. Y. Lam and A. Leroy. Algebraic conjugacy classes and skew polynomial rings. In Perspectives in Ring Theory, pages 153–203. Springer, 1988.

[25] T. Y. Lam and A. Leroy. Vandermonde and Wronskian matrices over division rings. Journal of Algebra, 119(2):308–336, 1988.

[26] T. Y. Lam and A. Leroy. Hilbert 90 theorems over division rings. Transactions of the American Mathematical Society, 345(2):595–622, 1994.

[27] A. Leroy. Pseudolinear transformations and evaluation in Ore extensions. Bulletin of the Belgian Mathematical Society, 2(3):321–347, 1995.

[28] R. Lidl and H. Niederreiter. Finite Fields, volume 20. Encyclopedia of Mathematics and its Applications. Addison-Wesley, Amsterdam, 1983.

[29] U. Martínez-Peñas. Skew and linearized Reed–Solomon codes and maximum sum rank distance codes over any division ring. Journal of Algebra, 504:587–612, 2018.

[30] U. Martínez-Peñas. Classification of multivariate skew polynomial rings over finite fields via affine transformations of variables. Finite Fields and Their Applications, 65:101687, 2020.

[31] U. Martínez-Peñas and F. R. Kschischang. Evaluation and interpolation over multivariate skew polynomial rings. Journal of Algebra, 525:111–139, 2019.

[32] E. Meijering. A chronology of interpolation: from ancient astronomy to modern signal and image processing. Proceedings of the IEEE, 90(3):319–342, March 2002.

[33] E. H. Moore. A two-fold generalization of Fermat’s theorem. Bulletin of the American Mathematical Society, 2(7):189–199, 1896.

[34] Th. Muir. A treatise on the theory of determinants. Macmillan, London, 1882.

[35] D. E. Muller. Application of Boolean algebra to switching circuit design and to error detection. Transactions of the I.R.E. Professional Group on Electronic Computers, EC-3(3):6–12, Sep. 1954.

[36] E. Noether. Der Hauptgeschlechtssatz für relativ-Galoissche Zahlkörper. Mathematische Annalen, 108(1):411–419, Dec 1933.

[37] O. Ore. On a special class of polynomials. Transactions of the American Mathematical Society, 35(3):559–584, 1933.

[38] O. Ore. Theory of non-commutative polynomials. Annals of Mathematics (2), 34(3):480–508, 1933.

[39] I. Reed. A class of multiple-error-correcting codes and the decoding scheme. Transactions of the IRE Professional Group on Information Theory, 4(4):38–49, Sep. 1954.

[40] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300–304, 1960.

[41] K. F. Roth. Rational approximations to algebraic numbers. Mathematika, 2(1):1–20, 1955.

[42] J.-P. Serre. Galois cohomology. Translated from the French by Patrick Ion. Berlin: Springer, 2nd printing, 2002.

[43] A. T. Vandermonde. Mémoire sur la résolution des équations. In Histoire de l'Académie royale des sciences avec les mémoires de mathématiques et de physique pour la même année tirés des registres de cette académie. Année MDCCLXXI, pages 365–413. 1774.

[44] E. Waring. Problems concerning interpolations. Philosophical Transactions of the Royal Society of London, 69:59–67, Jan 1779.
