Why Wirtinger derivatives behave so well

Bart Michels∗

April 14, 2019

Contents

1 Introduction

2 Partial derivatives revisited

3 Complexification
3.1 Complexified differentials: two variables
3.2 Complexified differentials: multiple variables
3.3 Conjugation

4 Chain rule

5 Holomorphic functions

6 Power series

7 Concluding remarks

∗ Université Paris 13, LAGA, CNRS, UMR 7539, F-93430, Villetaneuse, France. Email: [email protected]

1 Introduction

Given a smooth function $f : \mathbb{C} \to \mathbb{C}$, it makes a priori no sense to talk about the Wirtinger derivatives $\partial f/\partial z$ and $\partial f/\partial \bar z$. We could define them by

\[
\begin{aligned}
\frac{\partial f}{\partial z} &= \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\,\frac{\partial f}{\partial y}\right) \\
\frac{\partial f}{\partial \bar z} &= \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\,\frac{\partial f}{\partial y}\right)
\end{aligned}
\tag{1}
\]

but that does not explain why we denote them like partial derivatives, where the confusing sign convention comes from, why they behave like partial derivatives, and why they act on power series in $z$ and $\bar z$ in the way they should.

Here follows a construction of Wirtinger derivatives that avoids the explicit formula (1). The reader may be interested in the definitions in section 3, in the sanity check in Proposition 3.2, the chain rule from Proposition 4.3 and the results from section 6.
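As a concrete reference point for formula (1), the following minimal sketch approximates both Wirtinger derivatives by central finite differences. The helper name `wirtinger`, the step size `h`, the test function $f(z) = z^2 \bar z$ and the sample point `p` are illustrative assumptions and do not come from the text. For this $f$, treating $z$ and $\bar z$ as independent variables suggests $\partial f/\partial z = 2 z \bar z$ and $\partial f/\partial \bar z = z^2$, which the sketch compares against.

```python
def wirtinger(func, p, h=1e-6):
    """Approximate (df/dz, df/dzbar) at p using formula (1).

    The real partial derivatives are replaced by central differences;
    h is an assumed step size, not a tuned value.
    """
    fx = (func(p + h) - func(p - h)) / (2 * h)            # ~ df/dx
    fy = (func(p + 1j * h) - func(p - 1j * h)) / (2 * h)  # ~ df/dy
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


def f(z):
    # Hypothetical test function: f(z) = z^2 * conj(z).
    return z * z * z.conjugate()


p = 1.5 - 0.7j
dz, dzbar = wirtinger(f, p)

# Values suggested by treating z and conj(z) as independent variables.
expected_dz = 2 * p * p.conjugate()
expected_dzbar = p * p

print(abs(dz - expected_dz), abs(dzbar - expected_dzbar))
```

Both printed differences should be small, limited only by the finite-difference error.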

2 Partial derivatives revisited

It is worthwhile to take a second look at what a partial derivative is. Consider finite-dimensional real vector spaces $V$ and $W$.¹ They carry a canonical smooth structure. Consider a differentiable function $f : V \to W$. At every $p \in V$, $f$ has a differential $D_p f : V \to W$, an $\mathbb{R}$-linear map. In order to talk about the partial derivatives of $f$, we need to fix a basis of $V$, say $(e_1, \dots, e_n)$, which has a dual basis $(e_1^*, \dots, e_n^*)$. Here $e_i^*$ is characterized by the property that $e_i^*(e_j) = \delta_{i,j}$. We can then define

\[
\frac{\partial f}{\partial e_i^*}(p) := (D_p f)(e_i) \tag{2}
\]

Usually, we take $V = \mathbb{R}^n$ and let $(e_i)$ be the standard basis, whose dual basis we usually denote by $(x_i)$. The same holds for a holomorphic map between complex vector spaces, where the differential $D_p f$ is then $\mathbb{C}$-linear.

¹ We may and will also take $W$ to be a complex vector space and regard it as a real one, by restriction of scalars.

We see that a partial derivative really is a directional derivative, and we need not fix an entire basis to define the derivative in the direction of $e_i$. However, in some situations, we are given a linear form $e^* \in V^*$. In order to define the derivative w.r.t. $e^*$, we have to extend it to a basis of $V^*$, in order to make sense of the antidual vector $e \in V$. We recall:

Proposition 2.1. Let $k$ be a field and $V$ a finite-dimensional $k$-vector space. Then for every basis $(x_i)$ of $V^*$ there is a unique basis $(e_i)$ of $V$ such that $(x_i) = (e_i^*)$.

We denote the unique antidual basis of $(x_i)$ by $(x_i^\circ)$. When $k \in \{\mathbb{R}, \mathbb{C}\}$ and $(x_i)$ is a basis of $V^*$, we thus have, for a linear map $A : V \to W$,

\[
\frac{\partial A}{\partial x_i} = A(x_i^\circ) \tag{3}
\]

Note that the choice of a basis of $V^*$ is implicit in the LHS.

Proposition 2.2. Let $(x_i)$ be a basis of $V^*$ and $f : V \to W$ an isomorphism. Then

\[
f(x_i^\circ) = (x_i \circ f^{-1})^\circ
\]

Proof. It suffices to remark that

\[
(x_j \circ f^{-1})(f(x_i^\circ)) = \delta_{i,j}
\]

3 Complexification

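Now let $W$ be a complex vector space and consider a differentiable function $f : \mathbb{R}^2 \to W$. Take $p \in \mathbb{R}^2$. Then $f$ has a differential $D_p f : \mathbb{R}^2 \to \mathbb{C}$, which is $\mathbb{R}$-linear. We would like to make sense of $(\partial f/\partial z)(p)$, so we want to view $z$ and $\bar z$ as linear forms on $\mathbb{R}^2$ that form a basis of its dual... The solution is to complexify $\mathbb{R}^2$.

Definition 3.1. Let $V$ be a real vector space. Then there exists a complex vector space $V_{\mathbb{C}}$ and an $\mathbb{R}$-linear map $\iota : V \to V_{\mathbb{C}}$ such that for every $\mathbb{R}$-linear map $f : V \to W$ to a complex vector space, there exists a unique $\mathbb{C}$-linear map $\mathbb{C}f : V_{\mathbb{C}} \to W$ such that $f = (\mathbb{C}f) \circ \iota$:

\[
\begin{array}{ccc}
V & \xrightarrow{\ f\ } & W \\
\downarrow{\scriptstyle \iota} & \nearrow{\scriptstyle \mathbb{C}f} & \\
V_{\mathbb{C}} & &
\end{array}
\]

We call $(V_{\mathbb{C}}, \iota)$ a complexification of $V$.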

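This is a special case of the adjunction between the extension of scalars and restriction of scalars functors. We may indeed take $V_{\mathbb{C}} = V \otimes_{\mathbb{R}} \mathbb{C}$. We call $V_{\mathbb{C}}$ the complexification of $V$. This gives a functor: for real vector spaces $V_1, V_2$ and a linear map $f : V_1 \to V_2$, there is a unique $\mathbb{C}$-linear map $f_{\mathbb{C}} : (V_1)_{\mathbb{C}} \to (V_2)_{\mathbb{C}}$ such that the following diagram commutes:

\[
\begin{array}{ccc}
V_1 & \xrightarrow{\ f\ } & V_2 \\
\downarrow{\scriptstyle \iota_1} & & \downarrow{\scriptstyle \iota_2} \\
(V_1)_{\mathbb{C}} & \xrightarrow{\ f_{\mathbb{C}}\ } & (V_2)_{\mathbb{C}}
\end{array}
\]

In particular, a complexification is unique up to a unique $\mathbb{C}$-isomorphism that is compatible with the maps $\iota$.

If we denote by $\iota_1 : \mathbb{R} \to \mathbb{C}$ the inclusion, then we have $(\mathbb{R}^n)_{\mathbb{C}} = \mathbb{C}^n$, where the canonical map $\iota_n : \mathbb{R}^n \to \mathbb{C}^n$ is $\iota_1 \times \cdots \times \iota_1$.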

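3.1 Complexified differentials: two variables

We have the $\mathbb{R}$-linear map $D_p f : \mathbb{R}^2 \to \mathbb{C}$. Passing to the complexification of $\mathbb{R}^2$, this gives us a canonical $\mathbb{C}$-linear map $\mathbb{C}D_p f : \mathbb{C}^2 \to \mathbb{C}$. We want to apply this map to an appropriate vector, and call the result $(\partial f/\partial z)(p)$. To do that, we introduce coordinates on $\mathbb{C}^2$.

We have $\mathbb{R}$-linear maps $z, \bar z : \mathbb{R}^2 \to \mathbb{C}$. They give canonical $\mathbb{C}$-linear maps $\mathbb{C}z, \mathbb{C}\bar z : \mathbb{C}^2 \to \mathbb{C}$, which are given in coordinates by

\[
(\mathbb{C}z)(z_1, z_2) = z_1 + i z_2, \qquad (\mathbb{C}\bar z)(z_1, z_2) = z_1 - i z_2
\]

They form a basis of the dual $(\mathbb{C}^2)^*$, and we define, in analogy to (3):

\[
\begin{aligned}
\frac{\partial f}{\partial z}(p) &:= (\mathbb{C}D_p f)((\mathbb{C}z)^\circ) \\
\frac{\partial f}{\partial \bar z}(p) &:= (\mathbb{C}D_p f)((\mathbb{C}\bar z)^\circ)
\end{aligned}
\tag{4}
\]

It is a straightforward calculation that these definitions coincide with (1). Explicitly, $(\mathbb{C}z)^\circ = (\tfrac{1}{2}, -\tfrac{i}{2})^t$ and $(\mathbb{C}\bar z)^\circ = (\tfrac{1}{2}, \tfrac{i}{2})^t$. But we don't like to compute, so we will instead derive those formulas from the chain rule from Proposition 4.2.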
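The "straightforward calculation" mentioned here can be illustrated with a short sketch. It checks that the two explicit vectors really are antidual to the coordinates $\mathbb{C}z, \mathbb{C}\bar z$, and that evaluating a complexified differential on them reproduces formula (1). The function names and the sample values of $\partial f/\partial x$ and $\partial f/\partial y$ are hypothetical, chosen only for illustration.

```python
def Cz(w):
    # Complexified coordinate Cz on C^2: (z1, z2) -> z1 + i*z2.
    return w[0] + 1j * w[1]


def Czbar(w):
    # Complexified coordinate Cz-bar on C^2: (z1, z2) -> z1 - i*z2.
    return w[0] - 1j * w[1]


# The explicit antidual vectors quoted in the text.
z_anti = (0.5, -0.5j)
zbar_anti = (0.5, 0.5j)

# Pairing each coordinate with each vector should give the identity matrix.
pairings = [[form(vec) for vec in (z_anti, zbar_anti)] for form in (Cz, Czbar)]


def complexified_differential(fx, fy, w):
    # C-linear extension of the R-linear map (x, y) -> fx*x + fy*y.
    return fx * w[0] + fy * w[1]


# Hypothetical values of df/dx and df/dy at some point p.
fx, fy = 2 + 1j, -3 + 4j

dz = complexified_differential(fx, fy, z_anti)
dzbar = complexified_differential(fx, fy, zbar_anti)

# The same quantities computed from formula (1).
dz_formula = 0.5 * (fx - 1j * fy)
dzbar_formula = 0.5 * (fx + 1j * fy)

print(pairings, dz, dz_formula, dzbar, dzbar_formula)
```

The point of the sketch is only that definition (4) and formula (1) agree once the antidual vectors are known; the text deliberately avoids needing them.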

3.2 Complexified differentials: multiple variables

Take $n > 1$ and let $f : \mathbb{C}^n \to \mathbb{C}$ be differentiable. We denote the coordinates on $\mathbb{C}^n$ by $(z_i)$. We want to understand what $\partial f/\partial z_i$ means.

Take $p \in \mathbb{C}^n$. As in the two-variable case, we have a $\mathbb{C}$-linear map $\mathbb{C}D_p f : \mathbb{C}^{2n} \to \mathbb{C}$, and we have coordinates $\mathbb{C}z_i, \mathbb{C}\bar z_i$ on $\mathbb{C}^{2n}$. We define

\[
\begin{aligned}
\frac{\partial f}{\partial z_i}(p) &:= (\mathbb{C}D_p f)((\mathbb{C}z_i)^\circ) \\
\frac{\partial f}{\partial \bar z_i}(p) &:= (\mathbb{C}D_p f)((\mathbb{C}\bar z_i)^\circ)
\end{aligned}
\tag{5}
\]

Using the universal property of complexification, one verifies that these definitions do not depend on the choice of complexification of $\mathbb{C}^n$.

So far this was just language. What we really want is to answer the questions from the introduction. For now, we remark the following:

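Proposition 3.2. Write the standard complex coordinates on $\mathbb{C}^n$ as $(z_i)$. Then

\[
\begin{aligned}
\frac{\partial z_i}{\partial z_j} &= \frac{\partial \bar z_i}{\partial \bar z_j} = \delta_{i,j} \\
\frac{\partial z_i}{\partial \bar z_j} &= \frac{\partial \bar z_i}{\partial z_j} = 0
\end{aligned}
\]

Proof. The map $z_i : \mathbb{C}^n \to \mathbb{C}$ is $\mathbb{R}$-linear, so its differential equals itself: $D_p(z_i) = z_i$. Thus the extended differential $\mathbb{C}D_p(z_i)$ equals the coordinate $\mathbb{C}z_i$ of $\mathbb{C}^{2n}$. Similarly for $\bar z_i$. The statement now follows from the definition of $\partial/\partial z_i$ and $\partial/\partial \bar z_i$, and the definition of the antidual basis.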

Defining the partial derivatives in terms of the antidual vectors $(\mathbb{C}z_i)^\circ, (\mathbb{C}\bar z_i)^\circ$ has a great advantage. We need not know exactly what those vectors are, yet we know the coordinates of each $w \in \mathbb{C}^{2n}$ in this basis: they are $(\mathbb{C}z_i)(w), (\mathbb{C}\bar z_i)(w)$. This observation will serve us when we prove chain rules, in the next section.

3.3 Conjugation

The end goal of this subsection is Proposition 3.5. The remainder of the text does not rely on it, so it may safely be skipped.

For every $\mathbb{R}$-automorphism $\sigma$ of $\mathbb{C}$, a complexification $V_{\mathbb{C}}$ has a canonical $\mathbb{R}$-automorphism given by $1 \otimes \sigma : V \otimes_{\mathbb{R}} \mathbb{C} \to V \otimes_{\mathbb{R}} \mathbb{C}$, conjugated by the unique $\mathbb{C}$-isomorphism $V_{\mathbb{C}} \cong V \otimes_{\mathbb{R}} \mathbb{C}$ compatible with $\iota$. This allows us to define conjugation on $V_{\mathbb{C}}$.

The conjugation may also be constructed intrinsically as follows. Define another $\mathbb{C}$-linear structure on $V_{\mathbb{C}}$ by precomposing the ring representation $\mathbb{C} \to \operatorname{End}_{\mathbb{Z}}(V_{\mathbb{C}})$ with complex conjugation. Call the resulting $\mathbb{C}$-vector space $V'_{\mathbb{C}}$. We can define conjugation as the unique $\mathbb{C}$-homomorphism $V_{\mathbb{C}} \to V'_{\mathbb{C}}$ compatible with $\iota$, given by the universal property of $V_{\mathbb{C}}$. It is not hard to see that, because conjugation is an automorphism of $\mathbb{C}$, we have that $(V'_{\mathbb{C}}, \iota)$ is still a complexification of $V$, which implies that the conjugation on $V_{\mathbb{C}}$ is an automorphism.

More generally, given commutative rings $A, B$, a ring homomorphism $g : A \to B$, an $A$-endomorphism $\sigma$ of $B$ and an $A$-module $M$, the endomorphism $\sigma$ induces a canonical $A$-linear $M$-endomorphism of the extension of scalars $M_B = g^* M$. It can be constructed in two ways, as before. We denote it by $M \otimes \sigma$, because we may also view it as the image of $\sigma$ under the functor $M \otimes -$. The map $f = M \otimes \sigma$ is $\sigma$-semilinear: it satisfies, by construction, $f(lm) = \sigma(l) f(m)$ for $m \in M_B$, $l \in B$. If $M, N, P$ are $B$-modules, $f : M \to N$ is $\sigma$-semilinear and $g : N \to P$ is $\tau$-semilinear, then $g \circ f$ is $(\tau \circ \sigma)$-semilinear.

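Proposition 3.3. Let $M$ be an $A$-module, $(M_B, \iota)$ its extension of scalars, $N, P$ be $B$-modules, $f : M \to N$ be $A$-linear, $\sigma \in \operatorname{Aut}_A(B)$, $g : N \to P$ be $\sigma$-semilinear. Take $Bf$ and $B(g \circ f)$ as in the universal property of $M_B$:

\[
M \xrightarrow{\ f\ } N \xrightarrow{\ g\ } P, \qquad \iota : M \to M_B, \qquad Bf \circ \iota = f, \qquad B(g \circ f) \circ \iota = g \circ f
\]

Then

\[
B(g \circ f) = g \circ Bf \circ (M \otimes \sigma^{-1})
\]

Proof. The RHS is a $B$-linear map extending $g \circ f$.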

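Proposition 3.4. Let $K$ be a field, $L/K$ a field extension, $\sigma \in \operatorname{Aut}_K(L)$, $(x_i)$ a basis of $V_L^*$ and $f : V \to W$ a $\sigma$-semilinear isomorphism. (That is, a $\sigma$-semilinear bijection.) Then

\[
f(x_i^\circ) = (\sigma \circ x_i \circ f^{-1})^\circ
\]

Proof. Note that $\sigma \circ x_i \circ f^{-1}$ is indeed $L$-linear. It thus suffices to remark that

\[
(\sigma \circ x_j \circ f^{-1})(f(x_i^\circ)) = \sigma(\delta_{i,j}) = \delta_{i,j}
\]

Because $\mathbb{C}^n$ is the complexification of $\mathbb{R}^n$, it has a canonical conjugation map, denoted $\sigma_n$. Explicitly, it is given by $(z_i)^t \mapsto (\bar z_i)^t$.

Proposition 3.5 (Conjugation distributes over fractions). Let $f : \mathbb{C}^n \to \mathbb{C}^m$ be differentiable. Then

\[
\sigma_m \circ \frac{\partial f}{\partial z_i} = \frac{\partial (\sigma_m \circ f)}{\partial (\sigma_1 \circ z_i)}
\]

(Here, $\sigma_1 \circ z_i = \bar z_i$.)

Proof. We have a commutative diagram built from the maps

\[
\mathbb{R}^{2n} \xrightarrow{\ D_p f\ } \mathbb{C}^m \xrightarrow{\ \sigma_m\ } \mathbb{C}^m, \qquad
\mathbb{C}^{2n} \xrightarrow{\ \sigma_{2n}^{-1}\ } \mathbb{C}^{2n} \xrightarrow{\ \mathbb{C}D_p f\ } \mathbb{C}^m \xrightarrow{\ \sigma_m\ } \mathbb{C}^m
\]

in which the top composite is $D_p(\sigma_m \circ f)$, both copies of $\mathbb{C}^{2n}$ receive $\mathbb{R}^{2n}$ via $\iota_{2n}$, and the bottom composite is $\mathbb{C}D_p(\sigma_m \circ f)$. Indeed, $\mathbb{C}D_p(\sigma_m \circ f) = \sigma_m \circ \mathbb{C}D_p f \circ \sigma_{2n}^{-1}$ by Proposition 3.3.² We now look at the image of $(\mathbb{C}\bar z_i)^\circ$ under this map. On the one hand, it equals $\partial(\sigma_m \circ f)/\partial \bar z_i$. On the other hand,

\[
\begin{aligned}
(\sigma_m \circ \mathbb{C}D_p f \circ \sigma_{2n}^{-1})((\mathbb{C}\bar z_i)^\circ)
&= (\sigma_m \circ \mathbb{C}D_p f)((\sigma_1^{-1} \circ \mathbb{C}\bar z_i \circ \sigma_{2n})^\circ) \\
&= (\sigma_m \circ \mathbb{C}D_p f)((\mathbb{C}(\sigma_1^{-1} \circ \bar z_i))^\circ) \\
&= (\sigma_m \circ \mathbb{C}D_p f)((\mathbb{C}z_i)^\circ) \\
&= \sigma_m\left(\frac{\partial f}{\partial z_i}(p)\right)
\end{aligned}
\]

where we used Propositions 3.4 and 3.3.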
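For $n = m = 1$, Proposition 3.5 says that the conjugate of $\partial f/\partial z$ equals $\partial \bar f/\partial \bar z$. The sketch below checks this numerically, using the explicit formula (1) with finite differences rather than the coordinate-free argument of the proof. The helper `wirtinger`, the step size, the test function and the sample point are assumptions made for illustration.

```python
def wirtinger(func, p, h=1e-6):
    # Central-difference approximation of (df/dz, df/dzbar) via formula (1).
    fx = (func(p + h) - func(p - h)) / (2 * h)
    fy = (func(p + 1j * h) - func(p - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


def f(z):
    # Hypothetical non-holomorphic test function.
    return z * z * z.conjugate() + 3j * z.conjugate()


def f_conj(z):
    # sigma_1 composed with f, i.e. the complex conjugate of f.
    return f(z).conjugate()


p = -0.4 + 0.9j

df_dz, _ = wirtinger(f, p)
_, dfconj_dzbar = wirtinger(f_conj, p)

# Proposition 3.5 with n = m = 1: conj(df/dz) = d(conj f)/d(zbar).
lhs = df_dz.conjugate()
rhs = dfconj_dzbar

print(abs(lhs - rhs))
```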

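Remark 3.6. The chain rule (Proposition 4.3) does not allow one to prove Proposition 3.5 directly.

² Of course, we have $\sigma_{2n}^{-1} = \sigma_{2n}$, but we are really using that the semilinearity of $\sigma_m$ cancels with that of $\sigma_{2n}^{-1}$.

4 Chain rule

For differentiable functions $f : \mathbb{R}^a \to \mathbb{R}^b$ and $g : \mathbb{R}^b \to \mathbb{R}^c$, the chain rule

\[
\frac{\partial (g \circ f)}{\partial x_i} = \sum_{j=1}^{b} \left(\frac{\partial g}{\partial x_j} \circ f\right) \cdot \frac{\partial f_j}{\partial x_i}
\]

is the coordinate version of the identity

\[
D_p(g \circ f) = D_{f(p)}(g) \circ D_p(f)
\]

For practical purposes, we want to study chain rules involving the Wirtinger derivatives $\partial/\partial z_i, \partial/\partial \bar z_i$. We place ourselves immediately in the setting of multiple variables. Consider differentiable functions $f : \mathbb{C}^a \to \mathbb{C}^b$ and $g : \mathbb{C}^b \to \mathbb{C}^c$.

Denote the complex coordinates on $\mathbb{C}^n$ by $(z_i)$; the real coordinates by $(x_i, y_i)$, so that $x_i = \Re z_i$ and $y_i = \Im z_i$.

Proposition 4.1. We have

\[
\begin{aligned}
\frac{\partial (g \circ f)}{\partial x_i} &= \sum_{j=1}^{b} \left( \frac{\partial g}{\partial z_j} \circ f \cdot \frac{\partial f_j}{\partial x_i} + \frac{\partial g}{\partial \bar z_j} \circ f \cdot \frac{\partial \bar f_j}{\partial x_i} \right) \\
\frac{\partial (g \circ f)}{\partial y_i} &= \sum_{j=1}^{b} \left( \frac{\partial g}{\partial z_j} \circ f \cdot \frac{\partial f_j}{\partial y_i} + \frac{\partial g}{\partial \bar z_j} \circ f \cdot \frac{\partial \bar f_j}{\partial y_i} \right)
\end{aligned}
\]

Note that $\partial \bar f_j/\partial x_i = \overline{\partial f_j/\partial x_i}$, because (by the chain rule for differentiable functions) $\partial/\partial x_i$ commutes with $\mathbb{R}$-linear maps, in particular, with complex conjugation.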

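Proof. We want a coordinate-free version of the chain rule in the statement. We have a commutative diagram

\[
\mathbb{C}^a \xrightarrow{\ D_p f\ } \mathbb{R}^{2b} \xrightarrow{\ D_{f(p)} g\ } \mathbb{C}^c, \qquad
\mathbb{R}^{2b} \xrightarrow{\ \iota_{2b}\ } \mathbb{C}^{2b} \xrightarrow{\ \mathbb{C}D_{f(p)} g\ } \mathbb{C}^c
\]

The chain rule $D_p(g \circ f) = D_{f(p)} g \circ D_p f$ implies

\[
D_p(g \circ f) = (\mathbb{C}D_{f(p)} g) \circ \iota_{2b} \circ D_p f
\]

which is the statement of the proposition, when we choose the coordinates $(x_i, y_i)$ on $\mathbb{C}^a$ and $(\mathbb{C}z_i, \mathbb{C}\bar z_i)$ on $\mathbb{C}^{2b}$.

Proposition 4.2. We have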

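\[
\begin{aligned}
\frac{\partial (g \circ f)}{\partial z_i} &= \sum_{j=1}^{b} \left( \frac{\partial g}{\partial x_j} \circ f \cdot \frac{\partial \Re f_j}{\partial z_i} + \frac{\partial g}{\partial y_j} \circ f \cdot \frac{\partial \Im f_j}{\partial z_i} \right) \\
\frac{\partial (g \circ f)}{\partial \bar z_i} &= \sum_{j=1}^{b} \left( \frac{\partial g}{\partial x_j} \circ f \cdot \frac{\partial \Re f_j}{\partial \bar z_i} + \frac{\partial g}{\partial y_j} \circ f \cdot \frac{\partial \Im f_j}{\partial \bar z_i} \right)
\end{aligned}
\]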

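Proof. We have a commutative diagram

\[
\mathbb{R}^{2a} \xrightarrow{\ D_p f\ } \mathbb{C}^b \xrightarrow{\ D_{f(p)} g\ } \mathbb{C}^c, \qquad
\mathbb{R}^{2a} \xrightarrow{\ \iota_{2a}\ } \mathbb{C}^{2a} \xrightarrow{\ \mathbb{C}D_p f\ } \mathbb{C}^b
\]

The claim follows from

\[
D_p(g \circ f) = D_{f(p)} g \circ (\mathbb{C}D_p f) \circ \iota_{2a}
\]

when we choose the coordinates $(\mathbb{C}z_i, \mathbb{C}\bar z_i)$ on $\mathbb{C}^{2a}$ and $(x_i, y_i)$ on $\mathbb{C}^b$.

In particular, the second chain rule implies, together with Proposition 3.2, the usual formula (1) for the Wirtinger derivatives.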

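Proposition 4.3. We have

\[
\begin{aligned}
\frac{\partial (g \circ f)}{\partial z_i} &= \sum_{j=1}^{b} \left( \frac{\partial g}{\partial z_j} \circ f \cdot \frac{\partial f_j}{\partial z_i} + \frac{\partial g}{\partial \bar z_j} \circ f \cdot \frac{\partial \bar f_j}{\partial z_i} \right) \\
\frac{\partial (g \circ f)}{\partial \bar z_i} &= \sum_{j=1}^{b} \left( \frac{\partial g}{\partial z_j} \circ f \cdot \frac{\partial f_j}{\partial \bar z_i} + \frac{\partial g}{\partial \bar z_j} \circ f \cdot \frac{\partial \bar f_j}{\partial \bar z_i} \right)
\end{aligned}
\]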

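Proof. If we wanted to use explicit formulas for Wirtinger derivatives, this follows from the first or the second chain rule. But we don't want to do that, so we give a direct proof via a coordinate-free chain rule. We have a commutative diagram

\[
\begin{array}{ccccc}
\mathbb{R}^{2a} & \xrightarrow{\ D_p f\ } & \mathbb{R}^{2b} & \xrightarrow{\ D_{f(p)} g\ } & \mathbb{C}^c \\
\downarrow{\scriptstyle \iota_{2a}} & & \downarrow{\scriptstyle \iota_{2b}} & \nearrow{\scriptstyle \mathbb{C}D_{f(p)} g} & \\
\mathbb{C}^{2a} & \xrightarrow{\ (D_p f)_{\mathbb{C}}\ } & \mathbb{C}^{2b} & &
\end{array}
\]

in which the composite $\mathbb{C}^{2a} \to \mathbb{C}^c$ along the bottom is $\mathbb{C}D_p(g \circ f)$. The fact that

\[
\mathbb{C}D_p(g \circ f) = (\mathbb{C}D_{f(p)} g) \circ (D_p f)_{\mathbb{C}}
\]

follows from the uniqueness in the definition of the complexification of $\mathbb{R}^{2a}$: both sides are extensions of $D_p(g \circ f)$ to $\mathbb{C}^{2a}$. When we choose coordinates $(\mathbb{C}z_i, \mathbb{C}\bar z_i)$ on $\mathbb{C}^{2a}$ and $\mathbb{C}^{2b}$, that equality is what we wanted to prove.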
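The third chain rule (Proposition 4.3) can be sanity-checked numerically in the simplest case $a = b = c = 1$. The sketch below uses formula (1) with finite differences, so it illustrates the statement rather than the coordinate-free proof. The helper `wirtinger`, the step size, the two non-holomorphic test functions and the sample point are hypothetical choices.

```python
def wirtinger(func, p, h=1e-6):
    # Central-difference approximation of (d/dz, d/dzbar) via formula (1).
    fx = (func(p + h) - func(p - h)) / (2 * h)
    fy = (func(p + 1j * h) - func(p - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


def f(z):
    # Hypothetical inner function (not holomorphic).
    return z * z + z.conjugate()


def g(w):
    # Hypothetical outer function (not holomorphic).
    return w * w.conjugate() + 2 * w


def f_bar(z):
    return f(z).conjugate()


def g_after_f(z):
    return g(f(z))


p = 0.3 + 0.4j

g_z, g_zbar = wirtinger(g, f(p))        # derivatives of g at f(p)
f_z, f_zbar = wirtinger(f, p)           # derivatives of f at p
fbar_z, fbar_zbar = wirtinger(f_bar, p)  # derivatives of conj(f) at p

# Left-hand sides: derivatives of the composite.
lhs_z, lhs_zbar = wirtinger(g_after_f, p)

# Right-hand sides of Proposition 4.3 for a single variable.
rhs_z = g_z * f_z + g_zbar * fbar_z
rhs_zbar = g_z * f_zbar + g_zbar * fbar_zbar

print(abs(lhs_z - rhs_z), abs(lhs_zbar - rhs_zbar))
```

Both terms of each sum contribute here because neither function is holomorphic.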

5 Holomorphic functions

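Let $f : \mathbb{C}^n \to \mathbb{C}^m$ be a holomorphic function. It is in particular differentiable, so viewing it as a function $\mathbb{R}^{2n} \to \mathbb{R}^{2m}$ gives us Wirtinger derivatives $\partial f/\partial z_i, \partial f/\partial \bar z_i$. On the other hand, there are the complex partial derivatives defined by

\[
\frac{\partial f}{\partial_{\mathbb{C}} z_i}(p) = (D_p^{\mathbb{C}} f)(z_i^\circ)
\]

Proposition 5.1. For a holomorphic $f : \mathbb{C}^n \to \mathbb{C}^m$, we have

\[
\begin{aligned}
\frac{\partial f}{\partial z_i} &= \frac{\partial f}{\partial_{\mathbb{C}} z_i} \\
\frac{\partial f}{\partial \bar z_i} &= 0
\end{aligned}
\]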

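Proof. Because $f$ is holomorphic, at every point $p$ it has a best $\mathbb{C}$-linear approximation $D_p^{\mathbb{C}} f$. Because the $\mathbb{R}$-isomorphism $\mathbb{R}^{2n} \cong \mathbb{C}^n$ respects the norm, it is in particular a best $\mathbb{R}$-linear approximation. Thus, under the identification $(z_i)_i : \mathbb{R}^{2n} \xrightarrow{\sim} \mathbb{C}^n$, the differentials $D_p^{\mathbb{R}} f$ and $D_p^{\mathbb{C}} f$ coincide. Because $D_p^{\mathbb{C}} f$ is $\mathbb{C}$-linear, the unique map $\mathbb{C}D_p^{\mathbb{R}} f$ given by the universal property of complexification factors through $\mathbb{C}^n$:

\[
\begin{array}{ccccc}
\mathbb{R}^{2n} & \xrightarrow{\ (z_i)_i\ } & \mathbb{C}^n & \xrightarrow{\ D_p^{\mathbb{C}} f\ } & \mathbb{C}^m \\
\downarrow{\scriptstyle \iota_{2n}} & \nearrow{\scriptstyle (\mathbb{C}z_i)_i} & & & \\
\mathbb{C}^{2n} & & & &
\end{array}
\]

Here the top composite is $D_p^{\mathbb{R}} f$ and the composite $\mathbb{C}^{2n} \to \mathbb{C}^m$ is $\mathbb{C}D_p^{\mathbb{R}} f$. We can now compute

\[
\begin{aligned}
(\mathbb{C}D_p^{\mathbb{R}} f)((\mathbb{C}z_i)^\circ) &= (D_p^{\mathbb{C}} f)\left[(\mathbb{C}z_j)_j((\mathbb{C}z_i)^\circ)\right] \\
&= (D_p^{\mathbb{C}} f)(e_i) \\
&= (D_p^{\mathbb{C}} f)(z_i^\circ)
\end{aligned}
\]

and

\[
\begin{aligned}
(\mathbb{C}D_p^{\mathbb{R}} f)((\mathbb{C}\bar z_i)^\circ) &= (D_p^{\mathbb{C}} f)\left[(\mathbb{C}z_j)_j((\mathbb{C}\bar z_i)^\circ)\right] \\
&= (D_p^{\mathbb{C}} f)(0) \\
&= 0
\end{aligned}
\]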
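A quick numerical illustration of Proposition 5.1 in one variable: for a holomorphic function, the Wirtinger derivative $\partial f/\partial z$ computed from formula (1) should match the complex derivative, and $\partial f/\partial \bar z$ should vanish. The sketch below uses the exponential function as an assumed example; the helper `wirtinger`, the step size and the sample point are likewise illustrative.

```python
import cmath


def wirtinger(func, p, h=1e-6):
    # Central-difference approximation of (d/dz, d/dzbar) via formula (1).
    fx = (func(p + h) - func(p - h)) / (2 * h)
    fy = (func(p + 1j * h) - func(p - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


p = 0.2 - 1.1j

# exp is holomorphic and equals its own complex derivative.
dz, dzbar = wirtinger(cmath.exp, p)
complex_derivative = cmath.exp(p)

print(abs(dz - complex_derivative), abs(dzbar))
```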

6 Power series

Define smooth maps $\Delta_n : \mathbb{C}^n \to \mathbb{C}^n \times \mathbb{C}^n$ by

\[
\Delta_n(z_i) := ((z_i), (\bar z_i))
\]

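Proposition 6.1 (Diagonal restriction). Let $f : \mathbb{C}^n \times \mathbb{C}^n \to \mathbb{C}^c$ be holomorphic. Write the standard coordinates on $\mathbb{C}^n \times \mathbb{C}^n$ as $(z_i, w_i)$. Then

\[
\begin{aligned}
\frac{\partial (f \circ \Delta_n)}{\partial z_i} &= \frac{\partial f}{\partial z_i} \circ \Delta_n \\
\frac{\partial (f \circ \Delta_n)}{\partial \bar z_i} &= \frac{\partial f}{\partial w_i} \circ \Delta_n
\end{aligned}
\]

Proof. This follows from the third chain rule. The terms involving $\partial f/\partial \bar z_j$ and $\partial f/\partial \bar w_j$ vanish by Proposition 5.1, and by Proposition 3.2, only one of the remaining terms survives.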

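Proposition 6.2. Let $(a_{k,l})_{k,l \in \mathbb{N}^n}$ be complex numbers and suppose that the series

\[
f(z) = \sum_{k,l \in \mathbb{N}^n} a_{k,l}\, z^k \bar z^l
\]

converges for $z$ in some open polydisk $D \subset \mathbb{C}^n$ centered at 0. Then for $z$ in that polydisk,

\[
\begin{aligned}
\frac{\partial f}{\partial z_i} &= \sum_{k,l \in \mathbb{N}^n} a_{k,l}\, k_i\, z^{k - e_i} \bar z^l \\
\frac{\partial f}{\partial \bar z_i} &= \sum_{k,l \in \mathbb{N}^n} a_{k,l}\, l_i\, z^k \bar z^{l - e_i}
\end{aligned}
\]

Proof. The convergence of the series in the polydisk $D$ implies a bound on the coefficients, which in turn implies that the series

\[
g(z, w) = \sum_{k,l \in \mathbb{N}^n} a_{k,l}\, z^k w^l
\]

converges in the polydisk $D \times D \subset \mathbb{C}^n \times \mathbb{C}^n$. Using the previous proposition, the claim reduces to a standard fact about holomorphic power series.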
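The termwise differentiation rule of Proposition 6.2 can be illustrated on a finite sum in one variable ($n = 1$), where convergence is not an issue. The sketch below compares the termwise formulas with a finite-difference evaluation of formula (1). The coefficient table `coeffs`, the helper names, the step size and the sample point are hypothetical.

```python
def wirtinger(func, p, h=1e-6):
    # Central-difference approximation of (d/dz, d/dzbar) via formula (1).
    fx = (func(p + h) - func(p - h)) / (2 * h)
    fy = (func(p + 1j * h) - func(p - 1j * h)) / (2 * h)
    return 0.5 * (fx - 1j * fy), 0.5 * (fx + 1j * fy)


# Hypothetical coefficients a_{k,l} of a polynomial in z and conj(z).
coeffs = {(2, 1): 1 + 2j, (0, 3): -0.5, (1, 0): 3j}


def f(z):
    # f(z) = sum of a_{k,l} * z^k * conj(z)^l.
    return sum(a * z**k * z.conjugate()**l for (k, l), a in coeffs.items())


def termwise_dz(z):
    # Differentiate each term in z, treating conj(z) as a constant.
    return sum(a * k * z**(k - 1) * z.conjugate()**l
               for (k, l), a in coeffs.items() if k > 0)


def termwise_dzbar(z):
    # Differentiate each term in conj(z), treating z as a constant.
    return sum(a * l * z**k * z.conjugate()**(l - 1)
               for (k, l), a in coeffs.items() if l > 0)


p = 0.6 + 0.3j
num_dz, num_dzbar = wirtinger(f, p)

print(abs(num_dz - termwise_dz(p)), abs(num_dzbar - termwise_dzbar(p)))
```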

7 Concluding remarks

At some point, one has to acknowledge that the Wirtinger derivatives are complex linear combinations of the $\partial/\partial x_i$ and $\partial/\partial y_i$. This is the case, for example, if one wants to show that for $f : \mathbb{C} \to \mathbb{C}$ of class $C^2$, the Wirtinger derivatives commute with each other, or that $(\partial f/\partial z)(p)$ depends only on the values of $f$ in a neighborhood of $p$. But hopefully the reader will be convinced that some of the more algebraic properties of Wirtinger derivatives can be explained conceptually.

We haven't been very careful about the domain of definition of our differentiable functions, assuming most of the time that they are defined on all of $\mathbb{C}^n$. The extension to functions defined on open subsets, as well as to differentiable functions between manifolds with their partial derivatives with respect to a chart, is straightforward: the differential $D_p f$ becomes a linear map between real tangent spaces, which we can complexify. The complexification is less explicit, but we have taken care to state everything as canonically as possible, so that the extra work should be minimal.
