

MINISTRY OF EDUCATION AND TRAINING
QUY NHON UNIVERSITY

VUONG TRUNG DUNG

SOME DISTANCE FUNCTIONS IN QUANTUM INFORMATION THEORY AND RELATED PROBLEMS

DOCTORAL DISSERTATION IN MATHEMATICS

BINH DINH - 2024



Speciality: Mathematical Analysis
Speciality code: 9 46 01 02

Reviewer 1: Prof. Dang Duc Trong
Reviewer 2: Prof. Pham Tien Son

Reviewer 3: Assoc. Prof. Pham Quy Muoi

Supervisors:
1. Assoc. Prof. Dr. Le Cong Trinh
2. Assoc. Prof. Dr. Dinh Trung Hoa

BINH DINH - 2024


This thesis was completed at the Department of Mathematics and Statistics, Quy Nhon University, under the supervision of Assoc. Prof. Dr. Le Cong Trinh and Assoc. Prof. Dr. Dinh Trung Hoa. I hereby declare that the results presented in it are new and original. Most of them have been published in peer-reviewed journals; the others have not been published elsewhere. I have obtained permission from my co-authors to use results from our joint papers.

Binh Dinh, 2024

Vuong Trung Dung


This thesis was undertaken during my years as a PhD student at the Department of Mathematics and Statistics, Quy Nhon University. Upon the completion of this thesis, I am deeply indebted to numerous individuals. On this occasion, I would like to extend my sincere appreciation to all of them.

First and foremost, I would like to express my sincerest gratitude to Assoc. Prof. Dr. Dinh Trung Hoa, who guided me into the realm of matrix analysis and taught me right from the early days. Not only that, but he also devoted a significant amount of valuable time to engage in discussions, and provided problems for me to solve. He motivated me to participate in workshops and establish connections with senior researchers in the field. He guided me to find enjoyment in solving mathematical problems and consistently nurtured my enthusiasm for my work. I can't envision having a more exceptional advisor and mentor than him.

The second person I would like to express my gratitude to is Assoc. Prof. Dr. Le Cong Trinh, who has been teaching me since my undergraduate days and also introduced me to Prof. Hoa. From the early days of sitting in lecture halls at university, Prof. Trinh has been instilling inspiration and a love for mathematics in me. It's fortunate that now I have the opportunity to be mentored by him once again. He has always provided enthusiastic support not only in my work but also in life. Without that dedicated support, it would have been difficult for me to complete this thesis.

I would like to extend a special thank you to the educators at both the Department of


Mathematics and Statistics and the Department of Graduate Training at Quy Nhon University for providing the optimal environment for a postgraduate student who comes from a distant location like myself. Binh Dinh is also my hometown and the place where I have spent all my time from high school to university. The privilege and personal happiness of coming back to Quy Nhon University for advanced studies cannot be overstated.

I am grateful to the Board and colleagues of the VNU-HCM High School for the Gifted for providing me with much support to complete my PhD study. Especially, I would like to extend my heartfelt gratitude to Dr. Nguyen Thanh Hung, who has assisted me in both material and spiritual aspects since the very first days I set foot in Saigon. He is not only a mentor and colleague but also a second father to me, who not only supported me financially and emotionally during challenging times but also constantly encouraged me to pursue a doctoral degree. Without this immense support and encouragement, I wouldn't be where I am today.

I also want to express my gratitude to Su for the wonderful time we've spent together, which has been a driving force for me to complete the PhD program and strive for even greater achievements that I have yet to attain.

Lastly, and most significantly, I would like to express my gratitude to my family. They have always been by my side throughout work, studies, and life. I want to thank my parents for giving birth to me and nurturing me to adulthood. This thesis is a gift I dedicate to them.

Binh Dinh, 2024
Vuong Trung Dung


Contents

1.1 Matrix theory fundamentals
1.2 Matrix function and matrix mean

2 Weighted Hellinger distance
2.1 Weighted Hellinger distance
2.2 In-betweenness property

3 The α-z-Bures Wasserstein divergence
3.1 The α-z-Bures Wasserstein divergence and the least squares problem
3.2 Data processing inequality and in-betweenness property
3.3 Quantum fidelity and its parameterized versions
3.4 The α-z-fidelity between unitary orbits

4 A new weighted spectral geometric mean
4.1 A new weighted spectral geometric mean and its basic properties


4.2 The Lie-Trotter formula and weak log-majorization


Glossary of notation

C^n : The set of all n-tuples of complex numbers
⟨x, y⟩ : The scalar product of vectors x and y
M_n : The set of n × n complex matrices
B(H) : The set of all bounded linear operators acting on a Hilbert space H
H_n : The set of all n × n Hermitian matrices
H_n^+ : The set of all n × n positive semi-definite matrices
P_n : The set of all n × n positive definite matrices
I, O : The identity and zero elements of M_n, respectively
A* : The conjugate transpose (or adjoint) of the matrix A
|A| : The positive semi-definite matrix (A*A)^{1/2}
Tr(A) : The canonical trace of the matrix A
λ(A) : The vector of eigenvalues of the matrix A in decreasing order
s(A) : The vector of singular values of the matrix A in decreasing order
Sp(A) : The spectrum of the matrix A
‖A‖ : The operator norm of the matrix A
|||A||| : A unitarily invariant norm of the matrix A
x ≺ y : x is majorized by y
x ≺_w y : x is weakly majorized by y
A♯B : The geometric mean of two matrices A and B


A♯_t B : The weighted geometric mean of two matrices A and B
A♮B : The spectral geometric mean of two matrices A and B
A♮_t B : The weighted spectral geometric mean of two matrices A and B
F_t(A, B) : The F-mean of two matrices A and B
A∇B : The arithmetic mean of two matrices A and B
A!B : The harmonic mean of two matrices A and B
A : B : The parallel sum of two matrices A and B
μ_p(A, B, t) : The matrix p-power mean of matrices A and B


Quantum information stands at the confluence of quantum mechanics and information theory, wielding the mathematical elegance of both realms to delve into the profound nature of information processing at the quantum level. In classical information theory, bits are the fundamental units representing 0 and 1. Quantum information theory, however, introduces the concept of qubits, the quantum counterparts of classical bits. Unlike classical bits, qubits can exist in a superposition of states, allowing them to be both 0 and 1 simultaneously. This unique property empowers quantum computers to perform certain calculations exponentially faster than classical computers.

Entanglement is a crucial phenomenon in quantum theory where two or more particles become closely connected. When particles are entangled, changing the state of one immediately affects the state of the other, no matter the distance between them. This has important implications for quantum information and computing, offering new possibilities for unique ways of handling information.

Quantum algorithms, such as Shor's algorithm for factoring large numbers and Grover's algorithm for quantum search, exemplify the power of quantum information in tackling complex computational tasks with unparalleled efficiency.

In order to treat information processing in quantum systems, it is necessary to mathematically formulate fundamental concepts such as quantum systems, states, and measurements. Useful tools for research in quantum information are functional analysis and matrix theory. First, we consider the quantum system. It is described by a Hilbert space H, which is called a representation space. This is advantageous because the Hilbert space is not only the underlying basis of quantum mechanics but is also helpful in introducing the special notation used in quantum mechanics. The (pure) physical states of the system correspond to unit vectors of the Hilbert space. This correspondence is not one-to-one: when f_1 and f_2 are unit vectors, the corresponding states are identical if f_1 = z f_2 for a complex number z of modulus 1. Such a z is often called a phase. The pure physical state of the system thus determines a corresponding state vector up to a phase. Traditional quantum mechanics distinguishes between pure states and mixed states. Mixed states are described by density matrices. A density matrix or statistical operator is a positive matrix of trace 1 on the Hilbert space. This means that the space has a basis consisting of eigenvectors of the statistical operator and that the sum of the eigenvalues is 1. In quantum information theory, distance functions are used to measure the distance between two mixed states. Additionally, these distance functions can be employed to characterize the properties of a given quantum state. For instance, they can quantify the quantum entanglement between two parts of a state as the shortest distance between the state and the set of all separable states. These distance functions extend naturally to the set of positive semi-definite matrices, which is the main focus of this thesis.

Nowadays, the significance of matrix theory has been widely recognized across various fields, including engineering, probability and statistics, quantum information, numerical analysis, and the biological and social sciences. In image processing (subdivision schemes), medical imaging (MRI), radar signal processing, statistical biology (DNA/genome), and machine learning, data from numerous experiments are stored as positive definite matrices. To work with each set of data, we need to select a representative element; in other words, we need to compute the average of the corresponding positive definite matrices. Therefore, considering global solutions of least-squares problems for matrices is of paramount importance (see [2, 8, 18, 28, 67, 73] for examples).

Let 0 < a ≤ x ≤ b. Consider the following least squares problem:

d²(x, a) + d²(x, b) → min, x ∈ [a, b],

where d := d_E(x, y) = |y - x| or d := d_R(x, y) := |log(y) - log(x)|.

The arithmetic mean (a + b)/2 and the geometric mean √(ab) are the unique solutions of the above problem with respect to the distances d_E and d_R, respectively. Moreover, based on the AM-GM inequality for two non-negative numbers a and b, we obtain a new distance

d(a, b) = (a + b)/2 - √(ab).

For A, B ∈ P_n, some matrix analogs of these scalar distances are:

• Euclidean distance, induced from the Euclidean/Frobenius inner product ⟨A, B⟩ = Tr(A*B). The associated norm is ‖A‖_F = ⟨A, A⟩^{1/2} = (Tr(A*A))^{1/2}.

• The Riemannian distance [12]: δ_R(A, B) = ‖log(A^{-1}B)‖_2 = (Σ_{i=1}^n log² λ_i(A^{-1}B))^{1/2}.

• The Log-Determinant metric [75], used in machine learning and quantum information:

d_l(A, B) = (log det((A + B)/2) - (1/2) log det(AB))^{1/2}.

Besides metrics, one can also use more general distance-like functions between two data points. Such functions are not necessarily symmetric, and the triangle inequality need not hold. Divergences [11] are such distance-like functions.
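As an illustrative sketch (not part of the thesis), the matrix distances listed above can be computed with numpy. The helper `funh` and the completed formulas follow the displays above; the function and variable names are choices made for this sketch.

```python
import numpy as np

def funh(A, f):
    # Apply a scalar function to a Hermitian matrix via its eigendecomposition.
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.conj().T

def d_frobenius(A, B):
    # Euclidean (Frobenius) distance.
    return np.linalg.norm(A - B, 'fro')

def d_riemann(A, B):
    # delta_R(A, B) = (sum_i log^2 lambda_i(A^{-1} B))^{1/2},
    # computed through the congruent matrix A^{-1/2} B A^{-1/2}.
    Aih = funh(A, lambda w: w ** -0.5)
    return np.linalg.norm(funh(Aih @ B @ Aih, np.log), 'fro')

def d_logdet(A, B):
    # d_l(A, B)^2 = log det((A + B)/2) - (1/2) log det(AB).
    ld = lambda M: np.linalg.slogdet(M)[1]
    return np.sqrt(ld((A + B) / 2) - 0.5 * (ld(A) + ld(B)))

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, 0.2], [0.2, 3.0]])
```

For positive definite A and B, δ_R is symmetric and invariant under congruences X(·)X* with invertible X, which the eigenvalue formula makes transparent.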

Definition. A smooth function Φ : P_n × P_n → R_+ is called a quantum divergence if

(i) Φ(A, B) = 0 if and only if A = B.

(ii) The derivative DΦ with respect to the second variable vanishes on the diagonal, i.e.,

DΦ(A, X)|_{X=A} = 0.

(iii) The second derivative D²Φ is positive on the diagonal, i.e.,

D²Φ(A, X)|_{X=A}(Y, Y) ≥ 0 for all Hermitian matrices Y.

Some divergences that have recently received a lot of attention can be found in [11, 14, 35, 56]. Now let us revisit the theory of scalar means, which serves as a starting point for our next problem.

A two-variable function M(x, y) satisfying condition 6) (positive homogeneity) can be reduced to a one-variable function f(x) := M(1, x). Namely, M(x, y) is recovered from f as M(x, y) = xf(x^{-1}y).


Notice that the function f corresponding to M is monotone increasing on R_+, and this relation forms a one-to-one correspondence between means and monotone increasing functions on R_+.

The following are some desired properties of any object that is called a "mean" M on H_n^+.

(A1) Positivity: A, B ≥ 0 ⟹ M(A, B) ≥ 0.

(A2) Monotonicity: A ≥ A′, B ≥ B′ ⟹ M(A, B) ≥ M(A′, B′).

(A3) Positive homogeneity: M(kA, kB) = kM(A, B) for k ∈ R_+.

(A4) Transformer inequality: X*M(A, B)X ≤ M(X*AX, X*BX) for X ∈ B(H).

(A5) Congruence invariance: X*M(A, B)X = M(X*AX, X*BX) for invertible X ∈ B(H).

(A6) Concavity: M(tA + (1-t)B, tA′ + (1-t)B′) ≥ tM(A, A′) + (1-t)M(B, B′) for t ∈ [0, 1].

(A7) Continuity from above: if A_n ↓ A and B_n ↓ B, then M(A_n, B_n) → M(A, B).

(A8) Betweenness: if A ≤ B, then A ≤ M(A, B) ≤ B.

(A9) Fixed point property: M(A, A) = A.

To study matrix or operator means in general, we must first consider three classical means in mathematics: the arithmetic, geometric, and harmonic means. These means are defined, respectively, in the following manner:

A∇B = (A + B)/2,

A♯B = A^{1/2}(A^{-1/2}BA^{-1/2})^{1/2}A^{1/2}, and

A!B = 2(A^{-1} + B^{-1})^{-1}.
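As a purely numerical illustration of these three means (the helper `powh` and the function names are choices made for this sketch, not thesis notation):

```python
import numpy as np

def powh(A, p):
    # A^p for a Hermitian positive definite matrix A, via eigendecomposition.
    w, U = np.linalg.eigh(A)
    return (U * w ** p) @ U.conj().T

def arith(A, B):
    # Arithmetic mean A 'nabla' B.
    return (A + B) / 2

def geom(A, B):
    # Geometric mean A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}.
    Ah, Aih = powh(A, 0.5), powh(A, -0.5)
    return Ah @ powh(Aih @ B @ Aih, 0.5) @ Ah

def harm(A, B):
    # Harmonic mean A ! B = 2 (A^{-1} + B^{-1})^{-1}.
    return 2 * np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))

A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0, 0.0], [0.0, 5.0]])
```

A quick check confirms the AM-GM-HM ordering A!B ≤ A♯B ≤ A∇B in the Löwner order, as well as the symmetry and fixed-point properties of the geometric mean.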

In the above definitions, if the matrix A is not invertible, we replace A with A_ε = A + εI and then let ε tend to 0 (similarly for the matrix B). It can be seen that the arithmetic, harmonic and geometric means share the properties (A1)-(A9) in common. In 1980, Kubo and Ando [54] developed an axiomatic theory of operator means on H_n^+. At first, they defined a connection of two matrices as follows (the term "connection" comes from the study of electrical network connections).

Definition. A connection on H_n^+ is a binary operation σ on H_n^+ satisfying the following axioms for all A, A′, B, B′, C ∈ H_n^+:

(M1) Monotonicity: A ≤ A′, B ≤ B′ ⟹ AσB ≤ A′σB′.

(M2) Transformer inequality: C(AσB)C ≤ (CAC)σ(CBC).

(M3) Joint continuity from above: if A_n, B_n ∈ B(H)^+ satisfy A_n ↓ A and B_n ↓ B, then A_nσB_n → AσB.

A mean is a connection with the normalization condition

(M4) IσI = I.

To each connection σ corresponds its transpose σ′ defined by Aσ′B = BσA. A connection σ is symmetric by definition if σ = σ′. The adjoint of σ, denoted by σ*, is defined by Aσ*B = (A^{-1}σB^{-1})^{-1} for invertible A, B. When σ is a non-zero connection, its dual, in symbol σ^⊥, is defined by σ^⊥ = (σ′)* = (σ*)′.

However, the Kubo-Ando theory of means still has many limitations. In applied and engineering fields, more classes of means are needed that are not of Kubo-Ando type. For some non-Kubo-Ando means we refer the interested reader to [17, 23, 25, 35, 37].

One of the most famous non-Kubo-Ando means is the spectral geometric mean [37], denoted by A♮B, introduced in 1997 by Fiedler and Pták. It is called the spectral geometric mean because (A♮B)² is similar to AB and the eigenvalues of the spectral mean are the positive square roots of the corresponding eigenvalues of AB. In 2015, Kim and Lee [52] defined the weighted spectral mean

A♮_t B := (A^{-1}♯B)^t A (A^{-1}♯B)^t, t ∈ [0, 1].

In this thesis we focus on two problems:


1. Distance functions generated by operator means. We introduce some new distances on the set of positive definite matrices in relation to operator means, together with their applications. In addition, we also study some geometric properties of means, such as the in-betweenness property and the data processing inequality in quantum information.

2. A new weighted spectral geometric mean. We introduce a new weighted spectral geometric mean, denoted by F_t(A, B), and study basic properties of this quantity. We also establish a weak log-majorization relation involving F_t(A, B) and the Lie-Trotter formula for F_t(A, B).
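As a small numerical illustration of the weighted spectral mean A♮_t B = (A^{-1}♯B)^t A (A^{-1}♯B)^t (a sketch only; the helper names `powh`, `gmean`, `smean` are choices made here, not thesis notation):

```python
import numpy as np

def powh(A, p):
    # A^p for a Hermitian positive definite matrix A.
    w, U = np.linalg.eigh(A)
    return (U * w ** p) @ U.conj().T

def gmean(A, B):
    # Metric geometric mean A # B.
    Ah, Aih = powh(A, 0.5), powh(A, -0.5)
    return Ah @ powh(Aih @ B @ Aih, 0.5) @ Ah

def smean(A, B, t):
    # Weighted spectral mean A natural_t B = (A^{-1} # B)^t A (A^{-1} # B)^t.
    M = powh(gmean(np.linalg.inv(A), B), t)
    return M @ A @ M

A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0, 0.0], [0.0, 5.0]])
```

At t = 1/2 this is the Fiedler-Pták mean, and the eigenvalues of (A♮B)² indeed coincide with those of AB, which can be checked numerically.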

The main tools in our research are the spectral theorem for Hermitian matrices and the theory of Kubo-Ando means. Some fundamental techniques in the theory of operator monotone functions and operator convex functions are also utilized in the dissertation. We also employ basic knowledge of matrix theory involving unitarily invariant norms, the trace, etc.

The main results in this thesis are presented in the following articles:

1. Vuong T.D., Vo B.K. (2020), "An inequality for quantum fidelity", Quy Nhon Univ. J. Sci., 4 (3).

2. Dinh T.H., Le C.T., Vo B.K., Vuong T.D. (2021), "Weighted Hellinger distance and in-betweenness property", Math. Inequal. Appl., 24, 157-165.

3. Dinh T.H., Le C.T., Vo B.K., Vuong T.D. (2021), "The α-z-Bures Wasserstein divergence", Linear Algebra Appl., 624, 267-280.

4. Dinh T.H., Le C.T., Vuong T.D., "α-z-fidelity and α-z-weighted right mean", submitted.

5. Dinh T.H., Tam T.Y., Vuong T.D., "On new weighted spectral geometric mean", submitted.

They were presented at seminars of the Department of Mathematics and Statistics at Quy Nhon University and at the following international workshops and conferences:

1. First SIBAU-NU Workshop on Matrix Analysis and Linear Algebra, 15-17 October, 2021.


2. 20th Workshop on Optimization and Scientific Computing, April 21-23, 2022, Ba Vi, Viet Nam.

5. International Workshop on Matrix Analysis and Its Applications, July 7-8, 2023, Quy Nhon, Viet Nam.

6. 10th Viet Nam Mathematical Congress, August 8-12, 2023, Da Nang, Viet Nam.

This thesis consists of an introduction, three main chapters, a conclusion, directions for further investigation, a list of the author's papers and preprints related to the topics of the thesis, and a list of references.

The introduction provides background on the topics covered in this work and explains why they are meaningful and relevant. It also briefly summarizes the content of the thesis and highlights the main results of the three main chapters.

In the first chapter, the author collects some basic preliminaries which are used in this thesis. In the second chapter, we introduce the weighted Hellinger distance for matrices, which interpolates between the Euclidean distance and the Hellinger distance. In 2019, Minh [43] introduced the Alpha Procrustes distance as follows: for α > 0 and for positive semi-definite matrices A and B, d_{b,α}(A, B) = (1/α) d_b(A^{2α}, B^{2α}), where d_b denotes the Bures-Wasserstein distance.

In this chapter, by employing this approach, we define a new distance called the weighted Hellinger distance as follows:

d_{h,α}(A, B) = (1/α) d_h(A^{2α}, B^{2α}),


and then study its properties. In the first section of this chapter, we show that the weighted Hellinger distance, as α tends to zero, is exactly the Log-Euclidean distance (Proposition 2.1.1); that is, for two positive semi-definite matrices A and B,

lim_{α→0} d²_{h,α}(A, B) = ‖log(A) - log(B)‖²_F.

Afterwards, in Proposition 2.1.2, we demonstrate the equivalence between the weighted Hellinger distance and the Alpha Procrustes distance:

d_{b,α}(A, B) ≤ d_{h,α}(A, B) ≤ √2 d_{b,α}(A, B).
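Both distances and the two displayed relations are easy to spot-check numerically. The following is an illustrative sketch only; the Hellinger and Bures formulas d_h(A, B) = ‖A^{1/2} - B^{1/2}‖_F and d_b(A, B)² = Tr A + Tr B - 2 Tr(A^{1/2}BA^{1/2})^{1/2}, as well as the helper names, are assumptions of this sketch.

```python
import numpy as np

def funh(A, f):
    # Apply a scalar function to a Hermitian PSD matrix via eigendecomposition.
    w, U = np.linalg.eigh(A)
    return (U * f(np.maximum(w, 0.0))) @ U.conj().T

def d_hellinger(A, B):
    # d_h(A, B) = || A^{1/2} - B^{1/2} ||_F
    return np.linalg.norm(funh(A, np.sqrt) - funh(B, np.sqrt), 'fro')

def d_bures(A, B):
    # d_b(A, B)^2 = Tr A + Tr B - 2 Tr (A^{1/2} B A^{1/2})^{1/2}
    Ah = funh(A, np.sqrt)
    fid = np.trace(funh(Ah @ B @ Ah, np.sqrt)).real
    return np.sqrt(max(np.trace(A).real + np.trace(B).real - 2 * fid, 0.0))

def weighted(d, A, B, alpha):
    # d_alpha(A, B) = (1/alpha) d(A^{2 alpha}, B^{2 alpha})
    pw = lambda M: funh(M, lambda w: w ** (2 * alpha))
    return d(pw(A), pw(B)) / alpha

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.5, -0.2], [-0.2, 2.5]])
```

For small α, the weighted Hellinger distance approaches the Log-Euclidean distance, in line with Proposition 2.1.1.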

We say that a matrix mean σ satisfies the in-betweenness property with respect to the metric d if for any pair of positive definite operators A and B,

d(A, AσB) ≤ d(A, B).

In the second section, we prove that the matrix power mean μ_p(t, A, B) = (tA^p + (1-t)B^p)^{1/p} satisfies the in-betweenness property in the weighted Hellinger and Alpha Procrustes distances (Theorems 2.2.1 and 2.2.2). At the end of this chapter, we prove that if σ is a symmetric mean that satisfies the in-betweenness property with respect to the Alpha Procrustes distance or the weighted Hellinger distance, then it can only be the arithmetic mean (Theorem 2.2.3).

In Chapter 3, we study a new quantum divergence, the so-called α-z-Bures Wasserstein divergence. In 2015, Audenaert and Datta [7] introduced the Rényi power mean of matrices via the matrix function P_{α,z}(A, B) = (B^{(1-α)/(2z)} A^{α/z} B^{(1-α)/(2z)})^z. Based on this quantity, in this chapter, the α-z-Bures Wasserstein divergence for positive semi-definite matrices A and B is defined by

Φ(A, B) = Tr((1-α)A + αB) - Tr(Q_{α,z}(A, B)),

where Q_{α,z}(A, B) = (A^{(1-α)/(2z)} B^{α/z} A^{(1-α)/(2z)})^z. Then we prove that this quantity is a quantum divergence (Theorem 3.1.1). We also solve the least squares problem with respect to Φ(A, B) and


show that the solution of this problem is exactly the unique positive definite solution of the matrix equation

Σ_{i=1}^m w_i Q_{α,z}(X, A_i) = X

(Theorem 3.1.2). In [49], M. Jeong and co-authors investigated this solution and denoted it by R_{α,z}(ω, A), called the α-z-weighted right mean. In this thesis, we continue the study of this quantity and obtain some new results. An important result is an inequality for R_{α,z}(ω, A) which can be considered a version of the AM-GM inequality (Theorem 3.1.3). Hwang and Kim [48] proved that for any weighted m-mean G_m between the arithmetic mean and the geometric mean, the function G^ω_m is a multivariate Lie-Trotter mean.

Notice that the α-z-weighted right mean does not satisfy the above condition. However, we do have a similar result for R^ω_{α,z} := R_{α,z}(ω, ·) (Theorem 3.1.4). The well-known Lie-Trotter formula [76] states that for X, Y ∈ M_n,

e^{X+Y} = lim_{m→∞} (e^{X/m} e^{Y/m})^m.

This formula plays an essential role in the development of Lie theory and frequently appears in different research fields [44, 47, 48]. In [48], J. Hwang and S. Kim introduced the multivariate Lie-Trotter mean on the convex cone P_n of positive definite matrices. For a positive probability vector ω = (w_1, ..., w_m) and differentiable curves γ_1, ..., γ_m on P_n with γ_i(0) = I (i = 1, ..., m), a weighted m-mean G_m (for m ≥ 2) is the multivariate Lie-Trotter mean if

lim_{s→0} G_m(ω; γ_1(s), ..., γ_m(s))^{1/s} = exp(w_1 γ_1′(0) + ... + w_m γ_m′(0)).

At the end of this section, we prove that R_{α,z}(ω, A) is a multivariate Lie-Trotter mean (Theorem 3.1.5). In the second section of this chapter, we show that this divergence satisfies the data processing inequality (DPI) in quantum information (Theorem 3.2.1). The data processing inequality is an information-theoretic concept which states that the information content of a signal


cannot be increased via a local physical operation. This can be expressed concisely as "post-processing cannot increase information": for any completely positive trace-preserving map E and any positive semi-definite matrices A and B,

Φ(E(A), E(B)) ≤ Φ(A, B).
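An illustrative numerical sketch of the divergence and of the DPI under a pinching map (the map A ↦ diag(A), which keeps only the diagonal entries, is a simple completely positive trace-preserving map; the helper names and the parameter choice are assumptions of this sketch):

```python
import numpy as np

def powh(A, p):
    # A^p for a Hermitian positive semi-definite matrix A.
    w, U = np.linalg.eigh(A)
    return (U * np.maximum(w, 0.0) ** p) @ U.conj().T

def Q(A, B, a, z):
    # Q_{a,z}(A, B) = (A^{(1-a)/(2z)} B^{a/z} A^{(1-a)/(2z)})^z
    Ae = powh(A, (1 - a) / (2 * z))
    return powh(Ae @ powh(B, a / z) @ Ae, z)

def bw_div(A, B, a, z):
    # Phi(A, B) = Tr((1-a) A + a B) - Tr Q_{a,z}(A, B)
    return np.trace((1 - a) * A + a * B).real - np.trace(Q(A, B, a, z)).real

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, -0.3], [-0.3, 2.0]])
# a = z = 1/2 recovers half the squared Bures-Wasserstein distance.
```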

Furthermore, we show that the matrix power mean μ(t, A, B) = ((1-t)A^p + tB^p)^{1/p} satisfies the in-betweenness property with respect to the α-z-Bures Wasserstein divergence (Theorem 3.2.2). Quantum fidelity is an important quantity in quantum information theory and quantum chaos theory. It is a distance measure between density matrices, which are considered as quantum states. Although it is not a metric, it has many useful properties that can be used to define a metric on the space of density matrices. In the next section, we give some properties of quantum fidelity and its extended version. An important result is that we establish some variational principles for the quantum α-z-fidelity

f_{α,z}(ρ, σ) := Tr(σ^{α/(2z)} ρ^{(1-α)/z} σ^{α/(2z)})^z = Tr(ρ^{(1-α)/(2z)} σ^{α/z} ρ^{(1-α)/(2z)})^z,

where à and à are two postitive deÞnite matrices (Theorem 3.3.4). That is, it is the extremal value of two matrix functions

Let U(H) be the set of n × n unitary matrices, and D_n the set of density matrices. For ρ ∈ D_n, its unitary orbit is defined as

U_ρ = {UρU* : U ∈ U(H)}.


In the last section we obtain the maximum and minimum distances between the orbits of two states ρ and σ in D_n via the quantum α-z-fidelity, and prove that the set of these distances is a closed interval in R_+ (Theorems 3.4.2 and 3.4.3).

In Chapter 4, we introduce a new weighted spectral geometric mean

F_t(A, B) = (A^{-1}♯_t B)^{1/2} A^{2-2t} (A^{-1}♯_t B)^{1/2}, t ∈ [0, 1],

where A and B are positive definite matrices. We study basic properties of and inequalities for F_t(A, B). An important property obtained in this chapter is that F_t(A, B) satisfies the Lie-Trotter formula (Theorem 4.2.1).
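A numerical sketch of F_t (an illustration only; here the inner mean is read as the weighted metric geometric mean ♯_t, an assumption under which F_{1/2} reduces to the Fiedler-Pták spectral geometric mean):

```python
import numpy as np

def powh(A, p):
    # A^p for a Hermitian positive definite matrix A.
    w, U = np.linalg.eigh(A)
    return (U * w ** p) @ U.conj().T

def gmean_t(A, B, t):
    # Weighted metric geometric mean A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}.
    Ah, Aih = powh(A, 0.5), powh(A, -0.5)
    return Ah @ powh(Aih @ B @ Aih, t) @ Ah

def F(A, B, t):
    # F_t(A, B) = (A^{-1} #_t B)^{1/2} A^{2-2t} (A^{-1} #_t B)^{1/2}
    M = powh(gmean_t(np.linalg.inv(A), B, t), 0.5)
    return M @ powh(A, 2 - 2 * t) @ M

A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0, 0.0], [0.0, 5.0]])
```

Endpoint checks give F_0(A, B) = A and F_1(A, B) = B, and at t = 1/2 the eigenvalues of F_{1/2}(A, B)² coincide with those of AB, as expected for a spectral geometric mean.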

At the end of this chapter, we compare, in the sense of weak log-majorization, the F-mean and the Wasserstein mean, which is the solution of the least squares problem with respect to the Bures-Wasserstein distance (Theorem 4.2.3).


Chapter 1

1.1 Matrix theory fundamentals

Let N be the set of all natural numbers. For each n ∈ N, we denote by M_n the set of all n × n complex matrices, H_n the set of all n × n Hermitian matrices, H_n^+ the set of n × n positive semi-definite matrices, P_n the cone of positive definite matrices in M_n, and D_n the set of density matrices, which are the positive definite matrices with trace equal to one. Denote by I and O the identity and zero elements of M_n, respectively. This thesis deals with problems for matrices, which are operators on finite-dimensional Hilbert spaces H. We will indicate when the case is infinite-dimensional.

Recall that for two vectors x = (x_j), y = (y_j) ∈ C^n, the inner product ⟨x, y⟩ of x and y is defined as ⟨x, y⟩ := Σ_j x_j ȳ_j. Now let A be a matrix in M_n; the conjugate transpose or adjoint A* of A is the complex conjugate of the transpose A^T. We have ⟨Ax, y⟩ = ⟨x, A*y⟩.

Definition 1.1.1. A matrix A = (a_{ij})_{i,j=1}^n ∈ M_n is said to be:

(i) diagonal if a_{ij} = 0 when i ≠ j.

(ii) invertible if there exists a matrix B of order n × n such that AB = I_n. In this situation A has a unique inverse matrix A^{-1} ∈ M_n such that A^{-1}A = AA^{-1} = I_n.

(iii) normal if AA* = A*A.

(iv) unitary if AA* = A*A = I_n.

(v) Hermitian if A = A*.

(vi) positive semi-definite if ⟨Ax, x⟩ ≥ 0 for all x ∈ C^n.

(vii) positive definite if ⟨Ax, x⟩ > 0 for all x ∈ C^n \ {0}.

Definition 1.1.2 (Löwner order, [86]). Let A and B be two Hermitian matrices of the same order n. We say that A ≥ B if and only if A - B is a positive semi-definite matrix.

Definition 1.1.3. A complex number λ is said to be an eigenvalue of a matrix A corresponding to a non-zero eigenvector x if

Ax = λx.

The multiset of the eigenvalues of A is denoted by Sp(A) and called the spectrum of A.

There are several conditions that characterize positive matrices. Some of them are listed in the theorem below [10].

Proposition 1.1.1.

(i) A is positive semi-definite if and only if it is Hermitian and all its eigenvalues are nonnegative. Moreover, A is positive definite if and only if it is Hermitian and all its eigenvalues are positive.

(ii) A is positive semi-definite if and only if it is Hermitian and all its principal minors are nonnegative. Moreover, A is positive definite if and only if it is Hermitian and all its principal minors are positive.

(iii) A is positive semi-definite if and only if A = B*B for some matrix B. Moreover, A is positive definite if and only if B is nonsingular.

(iv) A is positive semi-definite if and only if A = T*T for some upper triangular matrix T. Further, T can be chosen to have nonnegative diagonal entries. If A is positive definite, then T is unique. This is called the Cholesky decomposition of A. Moreover, A is positive definite if and only if T is nonsingular.

(v) A is positive semi-definite if and only if A = B² for some positive matrix B. Such a B is unique. We write B = A^{1/2} and call it the (positive) square root of A. Moreover, A is positive definite if and only if B is positive definite.

(vi) A is positive semi-definite if and only if there exist x_1, ..., x_n in H such that

a_ij = ⟨x_i, x_j⟩.

A is positive definite if and only if the vectors x_j, 1 ≤ j ≤ n, are linearly independent.
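These characterizations are straightforward to verify numerically. The sketch below is our own illustration (not part of the thesis) and checks items (i), (iii), (iv) and (v) of Proposition 1.1.1 with numpy; all variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a positive definite matrix A = B*B (item (iii)).
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B.conj().T @ B

# (i) A is Hermitian with positive eigenvalues.
assert np.allclose(A, A.conj().T)
assert np.all(np.linalg.eigvalsh(A) > 0)

# (iv) Cholesky decomposition A = T*T with T upper triangular.
L = np.linalg.cholesky(A)   # numpy returns lower triangular L with A = L L*
T = L.conj().T              # so T = L* is upper triangular and A = T* T
assert np.allclose(T.conj().T @ T, A)

# (v) The positive square root A^{1/2} via the spectral decomposition.
w, U = np.linalg.eigh(A)
sqrtA = U @ np.diag(np.sqrt(w)) @ U.conj().T
assert np.allclose(sqrtA @ sqrtA, A)
```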

Let A ∈ M_n; we denote the eigenvalues of A by λ_j(A), for j = 1, 2, ..., n. For a matrix A ∈ M_n, the notation λ(A) = (λ_1(A), λ_2(A), ..., λ_n(A)) means that λ_1(A) ≥ λ_2(A) ≥ ... ≥ λ_n(A). The absolute value of a matrix A ∈ M_n is the square root of the matrix A*A, denoted by

|A| = (A*A)^{1/2}.

We call the eigenvalues of |A| the singular values of A, denoted s_j(A), for j = 1, 2, ..., n. For a matrix A ∈ M_n, the notation s(A) = (s_1(A), s_2(A), ..., s_n(A)) means that s_1(A) ≥ s_2(A) ≥ ... ≥ s_n(A).
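The relation between |A| and the singular values can be illustrated numerically; the sketch below is our own and computes |A| = (A*A)^{1/2} through the spectral decomposition, then compares its eigenvalues with the singular values returned by an SVD.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# |A| = (A*A)^{1/2}, computed through the spectral decomposition of A*A.
w, U = np.linalg.eigh(A.T @ A)   # A*A is symmetric positive semi-definite
absA = U @ np.diag(np.sqrt(np.clip(w, 0, None))) @ U.T

# The eigenvalues of |A| are the singular values s_1(A) >= ... >= s_n(A).
s_from_absA = np.sort(np.linalg.eigvalsh(absA))[::-1]
s_from_svd = np.linalg.svd(A, compute_uv=False)   # decreasing order
assert np.allclose(s_from_absA, s_from_svd)
```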

The following are some basic properties of the spectrum of a matrix.

Proposition 1.1.2. Let A, B ∈ M_n. Then:

(i) Sp(AB) = Sp(BA).

(ii) If A is a Hermitian matrix, then Sp(A) ⊂ R.

(iii) A is positive semi-definite (respectively, positive definite) if and only if A is a Hermitian matrix and Sp(A) ⊆ R_{≥0} (respectively, Sp(A) ⊆ R_+).

</div><span class="text_page_counter">Trang 25</span><div class="page_container" data-page="25">

(iv) If A, B ≥ 0, then Sp(AB) ⊆ R_+.

The trace of a matrix A = (a_ij) ∈ M_n, denoted by Tr(A), is the sum of all diagonal entries; it also equals the sum of all eigenvalues λ_i(A) of A, i.e.,

Tr(A) = Σ_{i=1}^n a_ii = Σ_{i=1}^n λ_i(A).

The determinant of A is given by det(A) = Σ_{σ∈S_n} sgn(σ) Π_{i=1}^n a_{iσ(i)}, where S_n is the set of all permutations σ of the set S = {1, 2, ..., n}.

Related to the trace of a matrix, we recall the Araki-Lieb-Thirring trace inequality [18], which is used consistently throughout the thesis.

Theorem 1.1.1. Let A and B be two positive semi-definite matrices, and let q > 0. Then, for r ≥ 1,

Tr[(B^{1/2}AB^{1/2})^{rq}] ≤ Tr[(B^{r/2}A^r B^{r/2})^q],

and the inequality is reversed for 0 ≤ r ≤ 1.
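A quick numerical sanity check of the Araki-Lieb-Thirring inequality, with q = 1 and the two regimes r = 2 and r = 1/2; this sketch is our own illustration, and the helpers `psd_power` and `random_psd` are hypothetical names.

```python
import numpy as np

rng = np.random.default_rng(2)

def psd_power(M, p):
    """Matrix power M^p of a symmetric positive semi-definite matrix."""
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.clip(w, 0, None) ** p) @ U.T

def random_psd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T

A, B = random_psd(4), random_psd(4)
Bh = psd_power(B, 0.5)
M = Bh @ A @ Bh   # B^{1/2} A B^{1/2}, again positive semi-definite

# q = 1, r = 2 (r >= 1): Tr[(B^{1/2} A B^{1/2})^2] <= Tr[B A^2 B]
lhs = np.trace(psd_power(M, 2))
rhs = np.trace(B @ psd_power(A, 2) @ B)
assert lhs <= rhs + 1e-9

# q = 1, r = 1/2 (0 <= r <= 1): the inequality reverses,
# Tr[(B^{1/2} A B^{1/2})^{1/2}] >= Tr[A^{1/2} B^{1/2}]
lhs2 = np.trace(psd_power(M, 0.5))
rhs2 = np.trace(psd_power(A, 0.5) @ psd_power(B, 0.5))
assert lhs2 >= rhs2 - 1e-9
```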

Proposition 1.1.3. Let A, B ∈ H_n with λ(A) = (λ_1, λ_2, ..., λ_n) and λ(B) = (μ_1, μ_2, ..., μ_n).

(i) If A > 0 and B > 0, then A ≥ B if and only if B^{−1} ≥ A^{−1}.

(ii) If A ≥ B, then X*AX ≥ X*BX for every X ∈ M_n.

(iii) If A ≥ B, then λ_j ≥ μ_j for each j = 1, 2, ..., n.

(iv) If A ≥ B ≥ 0, then Tr(A) ≥ Tr(B) ≥ 0.

</div><span class="text_page_counter">Trang 26</span><div class="page_container" data-page="26">

(v) If A ≥ B ≥ 0, then det(A) ≥ det(B) ≥ 0.

A function ‖·‖ : M_n → R is said to be a matrix norm if for all A, B ∈ M_n and all α ∈ C we have:

(i) ‖A‖ ≥ 0.

(ii) ‖A‖ = 0 if and only if A = 0.

(iii) ‖αA‖ = |α| · ‖A‖.

(iv) ‖A + B‖ ≤ ‖A‖ + ‖B‖.

In addition, a matrix norm is said to be sub-multiplicative if

‖AB‖ ≤ ‖A‖ · ‖B‖.

A matrix norm is said to be unitarily invariant if ‖UAV‖ = ‖A‖ for every A ∈ M_n and all unitary matrices U, V ∈ U_n. It is denoted by |||·|||.

These are some important norms on M_n. The operator norm of A is defined by

‖A‖ = max{‖Ax‖ : ‖x‖ = 1} = s_1(A),

and, for p ≥ 1, the Schatten p-norm is defined by

‖A‖_p = (Σ_{j=1}^n s_j^p(A))^{1/p}.

When p = 2, we obtain the Frobenius norm, sometimes called the Hilbert-Schmidt norm:

‖A‖_F = (Σ_{i,j=1}^n |a_ij|²)^{1/2} = (Tr(A*A))^{1/2}.
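These norms are easy to compute from the singular values; the following sketch (ours, not from the thesis) checks the identities ‖A‖ = s_1(A) and ‖A‖_F² = Tr(A*A) with numpy.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
s = np.linalg.svd(A, compute_uv=False)   # singular values, decreasing

op_norm = s[0]   # operator norm ||A|| = s_1(A)
assert np.isclose(op_norm, np.linalg.norm(A, 2))

def schatten(A, p):
    s = np.linalg.svd(A, compute_uv=False)
    return np.sum(s ** p) ** (1.0 / p)

# p = 2 gives the Frobenius (Hilbert-Schmidt) norm, and ||A||_F^2 = Tr(A*A).
fro = schatten(A, 2)
assert np.isclose(fro, np.linalg.norm(A, 'fro'))
assert np.isclose(fro ** 2, np.trace(A.conj().T @ A).real)
```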

Let x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n) be in R^n. Let x^↓ = (x_[1], x_[2], ..., x_[n]) denote a rearrangement of the components of x such that x_[1] ≥ x_[2] ≥ ... ≥ x_[n]. We say that x is majorized by y, denoted x ≺ y, if

Σ_{i=1}^k x_[i] ≤ Σ_{i=1}^k y_[i] (k = 1, ..., n−1) and Σ_{i=1}^n x_[i] = Σ_{i=1}^n y_[i];

and, for x and y with positive entries, that x is log-majorized by y, denoted x ≺_log y, if

Π_{i=1}^k x_[i] ≤ Π_{i=1}^k y_[i] (k = 1, ..., n−1) and Π_{i=1}^n x_[i] = Π_{i=1}^n y_[i].

In other words, x ≺_log y if and only if log x ≺ log y.

A matrix P ∈ M_n is called a projection if P² = P. One says that P is a Hermitian projection if it is both Hermitian and a projection; P is an orthogonal projection if the range of P is orthogonal to its null space. The partial ordering is very simple for projections: if P and Q are projections, then the relation P ≤ Q means that the range of P is included in the range of Q. An equivalent algebraic formulation is PQ = P. The largest projection in M_n is the identity I and the smallest one is 0. Therefore 0 ≤ P ≤ I for any projection P ∈ M_n. Assume that P and Q are projections on the same Hilbert space. Among the projections which are smaller than P and Q there is a maximal one, denoted by P ∧ Q, which is the orthogonal projection onto the intersection of the ranges of P and Q.


Theorem 1.1.2 ([45]). Assume that P and Q are orthogonal projections. Then

P ∧ Q = lim_{n→∞} (PQP)^n = lim_{n→∞} (QPQ)^n.

1.2 Matrix function and matrix mean

Now let us recall the spectral theorem, which is one of the most important tools in functional analysis and matrix theory.

Theorem 1.2.1 (Spectral decomposition, [9]). Let λ_1 > λ_2 > ... > λ_k be the distinct eigenvalues of a Hermitian matrix A. Then

A = Σ_{j=1}^k λ_j P_j,

where P_j is the orthogonal projection onto the eigenspace corresponding to λ_j, with P_i P_j = 0 for i ≠ j and Σ_{j=1}^k P_j = I.

For a real-valued function f defined on some interval K ⊆ R, and for a self-adjoint matrix A ∈ M_n with spectrum in K, the matrix f(A) is defined by means of the functional calculus, i.e.,

f(A) = Σ_{j=1}^k f(λ_j) P_j.
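The functional calculus amounts to applying f to the eigenvalues in a spectral decomposition. A small numpy sketch (our own illustration; `fun_calc` is a hypothetical helper name):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((4, 4))
A = X @ X.T + np.eye(4)   # Hermitian positive definite, spectrum in (0, inf)

def fun_calc(A, f):
    """f(A) for Hermitian A via the spectral decomposition A = sum_j lambda_j P_j."""
    w, U = np.linalg.eigh(A)
    return U @ np.diag(f(w)) @ U.conj().T

# exp(log A) recovers A, and A^{1/2} A^{1/2} = A.
logA = fun_calc(A, np.log)
assert np.allclose(fun_calc(logA, np.exp), A)
sqrtA = fun_calc(A, np.sqrt)
assert np.allclose(sqrtA @ sqrtA, A)
```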

We are now at the stage where we discuss matrix/operator functions. Löwner was the first to study operator monotone functions in his seminal paper [63] in 1930. At the same time, Kraus investigated the notion of operator convex functions [55].

Definition 1.2.1 ([63]). A continuous function f defined on an interval K (K ⊆ R) is said to be operator monotone of order n on K if for two Hermitian matrices A and B in M_n with spectra in K, one has

A ≤ B implies f(A) ≤ f(B).

If f is operator monotone of all orders n, then f is called operator monotone.

Theorem 1.2.2 (Löwner-Heinz inequality, [86]). The function f(t) = t^r is operator monotone on [0, ∞) for 0 ≤ r ≤ 1. More specifically, for two positive semi-definite matrices A and B such that A ≤ B,

A^r ≤ B^r, 0 ≤ r ≤ 1.
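The Löwner-Heinz inequality, and its failure for powers r > 1, can be observed numerically; the sketch below is our own illustration, including the classical 2 × 2 counterexample for r = 2.

```python
import numpy as np

rng = np.random.default_rng(5)

def psd_power(M, p):
    # M^p for symmetric positive semi-definite M, via the spectral decomposition
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.clip(w, 0, None) ** p) @ U.T

def random_psd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T

A = random_psd(4)
B = A + random_psd(4)   # A <= B in the Loewner order

for r in (0.25, 0.5, 0.75):
    D = psd_power(B, r) - psd_power(A, r)
    assert np.linalg.eigvalsh(D).min() >= -1e-9   # A^r <= B^r

# For r > 1 operator monotonicity fails: a classical 2x2 counterexample for r = 2.
A2 = np.array([[1.0, 1.0], [1.0, 1.0]])
B2 = np.array([[2.0, 1.0], [1.0, 1.0]])
assert np.linalg.eigvalsh(B2 - A2).min() >= 0           # A2 <= B2 ...
assert np.linalg.eigvalsh(B2 @ B2 - A2 @ A2).min() < 0  # ... but not A2^2 <= B2^2
```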

Definition 1.2.2 ([55]). A continuous function f defined on an interval K (K ⊆ R) is said to be operator convex of order n on K if for any Hermitian matrices A and B in M_n with spectra in K, and for all real numbers 0 ≤ λ ≤ 1,

f(λA + (1 − λ)B) ≤ λ f(A) + (1 − λ) f(B).

If f is operator convex of every order n, then f is called operator convex. If −f is operator convex, then we call f operator concave.

Theorem 1.2.3 ([10]). The function f(t) = t^r on [0, ∞) is operator convex when r ∈ [−1, 0] ∪ [1, 2]. More specifically, for any positive semi-definite matrices A, B and for any λ ∈ [0, 1],

(λA + (1 − λ)B)^r ≤ λA^r + (1 − λ)B^r.

Another important example is the function f(t) = log t, which is operator monotone on (0, ∞), while the function g(t) = t log t is operator convex. The relation between operator monotonicity and operator convexity is given in the theorem below.

Theorem 1.2.4 ([9]). Let f be a (continuous) real function on the interval [0, α). Then the following two conditions are equivalent:

(i) f is operator convex and f(0) ≤ 0.

(ii) The function g(t) = f(t)/t is operator monotone on (0, α).

Definition 1.2.3 ([10]). Let f(A, B) be a real-valued function of two matrix variables. Then f is called jointly concave if, for all 0 ≤ α ≤ 1,

f(αA_1 + (1 − α)A_2, αB_1 + (1 − α)B_2) ≥ α f(A_1, B_1) + (1 − α) f(A_2, B_2)

for all A_1, A_2, B_1, B_2. If −f is jointly concave, we say f is jointly convex.

We will review very quickly some basic concepts of the Fréchet differential calculus, with special emphasis on matrix analysis. Let X, Y be real Banach spaces, and let L(X, Y) be the space of bounded linear operators from X to Y. Let U be an open subset of X. A continuous map f from U to Y is said to be differentiable at a point u of U if there exists T ∈ L(X, Y) such that

lim_{v→0} ‖f(u + v) − f(u) − T(v)‖ / ‖v‖ = 0.

It is clear that if such a T exists, it is unique. If f is differentiable at u, the operator T above is called the derivative of f at u. We will use for it the notation Df(u) or ∂f(u). This is sometimes called the Fréchet derivative. If f is differentiable at every point of U, we say that it is differentiable on U. One can see that, if f is differentiable at u, then for every v ∈ X,

Df(u)(v) = (d/dt)|_{t=0} f(u + tv).

This is also called the directional derivative of f at u in the direction v. If f_1, f_2 are two differentiable maps, then f_1 + f_2 is differentiable and

D(f_1 + f_2)(u) = Df_1(u) + Df_2(u).

The composite of two differentiable maps f and g is differentiable, and we have the chain rule

D(g ∘ f)(u) = Dg(f(u)) · Df(u).


One important rule of differentiation for real functions is the product rule: (fg)′ = f′g + fg′. If f and g are two maps with values in a Banach space, their product is not defined unless the range is an algebra as well. Still, a general product rule can be established. Let f, g be two differentiable maps from X into Y_1, Y_2, respectively. Let B be a continuous bilinear map from Y_1 × Y_2 into Z. Let φ be the map from X to Z defined as φ(x) = B(f(x), g(x)). Then for all u, v in X,

Dφ(u)(v) = B(Df(u)(v), g(u)) + B(f(u), Dg(u)(v)).

This is the product rule for differentiation. A special case of this arises when Y_1 = Y_2 = L(Y), the algebra of bounded operators on a Banach space Y. Now φ(x) = f(x)g(x) is the usual product of two operators. The product rule then reads

Dφ(u)(v) = [Df(u)(v)] · g(u) + f(u) · [Dg(u)(v)].

Higher order Fréchet derivatives can be identified with multilinear maps. Let f be a differentiable map from X to Y. At each point u, the derivative Df(u) is an element of the Banach space L(X, Y). Thus we have a map Df from X into L(X, Y), defined as Df : u ↦ Df(u). If this map is differentiable at a point u, we say that f is twice differentiable at u. The derivative of the map Df at the point u is called the second derivative of f at u. It is denoted by D²f(u). This is an element of the space L(X, L(X, Y)). Let L_2(X, Y) be the space of bounded bilinear maps from X × X into Y. The elements of this space are maps f from X × X into Y that are linear in both variables, and for which there exists a constant c such that

‖f(x_1, x_2)‖ ≤ c ‖x_1‖ ‖x_2‖

for all x_1, x_2 ∈ X. The infimum of all such c is called ‖f‖. This is a norm on the space L_2(X, Y), and the space is a Banach space with this norm. If φ is an element of L(X, L(X, Y)), let

φ̃(x_1, x_2) = [φ(x_1)](x_2) for x_1, x_2 ∈ X.


Then φ̃ ∈ L_2(X, Y). It is easy to see that the map φ ↦ φ̃ is an isometric isomorphism. Thus the second derivative of a twice differentiable map f from X to Y can be thought of as a bilinear map from X × X to Y. It is easy to see that this map is symmetric in the two variables; i.e.,

D²f(u)(v_1, v_2) = D²f(u)(v_2, v_1)

for all u, v_1, v_2. Derivatives of higher order can be defined by repeating the above procedure. The p-th derivative of a map f from X to Y can be identified with a p-linear map from the space X × X × ··· × X (p copies) into Y. A convenient method of calculating the p-th derivative of f is provided by the formula

D^p f(u)(v_1, ..., v_p) = (∂^p/∂t_1 ··· ∂t_p)|_{t_1=···=t_p=0} f(u + t_1 v_1 + ··· + t_p v_p).

For the convenience of the reader, let us provide some examples of derivatives of matrix functions.

Example 1.2.1. In these examples X = Y = L(H).

(i) Let f(A) = A². Then

Df(A)(B) = AB + BA.

(ii) Let f(A) = A^{−1} for each invertible A. Then

Df(A)(B) = −A^{−1}BA^{−1}.

(iii) Let f(A) = A^{−2} for each invertible A. Then

Df(A)(B) = −A^{−1}BA^{−2} − A^{−2}BA^{−1}.
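The derivative formulas for f(A) = A² and f(A) = A^{−1} can be checked against finite differences; the following numpy sketch is our own illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 5 * np.eye(4)   # well-conditioned, invertible
V = rng.standard_normal((4, 4))
t = 1e-6

# f(A) = A^2: Df(A)(V) = AV + VA.
num = ((A + t * V) @ (A + t * V) - A @ A) / t
assert np.allclose(num, A @ V + V @ A, atol=1e-4)

# f(A) = A^{-1}: Df(A)(V) = -A^{-1} V A^{-1}.
inv = np.linalg.inv
num = (inv(A + t * V) - inv(A)) / t
assert np.allclose(num, -inv(A) @ V @ inv(A), atol=1e-4)
```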

In connection with electrical engineering, Anderson and Duffin [3] defined the parallel sum of two positive definite matrices A and B by

A : B = (A^{−1} + B^{−1})^{−1}.


investigated the structure of the Riemannian manifold H_n^+. They showed that the curve

γ(t) = A ♯_t B = A^{1/2}(A^{−1/2}BA^{−1/2})^t A^{1/2}, t ∈ [0, 1],

is the unique geodesic joining A and B, called the t-geometric mean or weighted geometric mean. The weighted harmonic and the weighted arithmetic means are defined by

A !_t B = (tA^{−1} + (1 − t)B^{−1})^{−1},

A ∇_t B = tA + (1 − t)B.

The well-known inequality relating these quantities is the harmonic-geometric-arithmetic mean inequality [47, 60], that is,

A !_t B ≤ A ♯_t B ≤ A ∇_t B.
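For t = 1/2 the three weighted means and the inequality between them can be verified numerically in the Löwner order; the sketch below is our own illustration, with hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(7)

def psd_power(M, p):
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.clip(w, 0, None) ** p) @ U.T

def random_pd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

A, B = random_pd(4), random_pd(4)
t = 0.5
inv = np.linalg.inv

Ah, Aih = psd_power(A, 0.5), psd_power(A, -0.5)
geometric = Ah @ psd_power(Aih @ B @ Aih, t) @ Ah   # A #_t B
harmonic = inv(t * inv(A) + (1 - t) * inv(B))       # A !_t B
arithmetic = t * A + (1 - t) * B                    # A nabla_t B

# harmonic <= geometric <= arithmetic in the Loewner order
assert np.linalg.eigvalsh(geometric - harmonic).min() >= -1e-9
assert np.linalg.eigvalsh(arithmetic - geometric).min() >= -1e-9
```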

These three means are Kubo-Ando means. Let us collect the main content of the Kubo-Ando theory of means in the general case [54]. For x > 0 and t ≥ 0, the function

φ(x, t) = x(1 + t)/(x + t)

is bounded and continuous on the extended half-line [0, ∞]. The Löwner theory ([9, 45]) on operator-monotone functions states that the map m ↦ f, defined by

f(x) = ∫_{[0,∞]} φ(x, t) dm(t) for x > 0,

establishes an affine isomorphism from the class of positive Radon measures on [0, ∞] onto the class of operator-monotone functions. In the representation above, f(0) = inf_{x>0} f(x).


Theorem 1.2.5. For every connection σ there exists a unique operator monotone function f : R_+ → R_+ satisfying

f(x) I_n = I_n σ (x I_n), x > 0.

We call f the representing function of σ.

The next theorem follows from the integral representation of matrix monotone functions and from the previous theorem.

Theorem 1.2.6. The map m ↦ σ, defined by

A σ B = ∫_{[0,∞]} ((1 + t)/t) (tA : B) dm(t),

establishes an affine isomorphism from the class of positive Radon measures on [0, ∞] onto the class of connections.

If P and Q are two projections, then the explicit formula for P σ Q is simpler.

Theorem 1.2.7. If σ is a mean, then for every pair of projections P and Q,

P σ Q = a(P − P ∧ Q) + b(Q − P ∧ Q) + P ∧ Q.


Let f be the representing function of σ. Since xf(x^{−1}) is the representing function of the transpose σ′, σ is symmetric if and only if f(x) = xf(x^{−1}). The next theorem gives the representation for a symmetric connection.

Theorem 1.2.8. There is an affine isomorphism n ↦ σ, with c = n({0}), from the class of positive Radon measures on the unit interval [0, 1] onto the class of symmetric connections.


Chapter 2

Weighted Hellinger distance

In recent years, many researchers have paid attention to different distance functions on the set P_n of positive definite matrices. Along with the traditional Riemannian metric

d_R(A, B) = ‖log(A^{−1/2}BA^{−1/2})‖_F = (Σ_{i=1}^n log² λ_i(A^{−1}B))^{1/2}

(where the λ_i(A^{−1}B) are the eigenvalues of the matrix A^{−1/2}BA^{−1/2}), there are other important distance functions. Two of them are the Bures-Wasserstein distance [13], adapted from the theory of optimal transport:

d_b(A, B) = (Tr(A + B) − 2 Tr((A^{1/2}BA^{1/2})^{1/2}))^{1/2},

and the Hellinger metric, or Bhattacharyya metric [11], in quantum information:

d_h(A, B) = (Tr(A + B) − 2 Tr(A^{1/2}B^{1/2}))^{1/2}.

Notice that the metric d_h coincides with the Euclidean distance between A^{1/2} and B^{1/2}, i.e., d_h(A, B) = ‖A^{1/2} − B^{1/2}‖_F.
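Both distances are simple to implement once a positive square root routine is available; the sketch below (our own illustration, with hypothetical helper names) also confirms that d_h is the Frobenius distance between the square roots and that d_b ≤ d_h, which is the α = 1/2 case of the equivalence proved in Proposition 2.1.2.

```python
import numpy as np

rng = np.random.default_rng(8)

def sqrt_psd(M):
    # (positive) square root of a symmetric positive semi-definite matrix
    w, U = np.linalg.eigh(M)
    return U @ np.diag(np.sqrt(np.clip(w, 0, None))) @ U.T

def random_pd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

def d_bures(A, B):
    Ah = sqrt_psd(A)
    cross = np.trace(sqrt_psd(Ah @ B @ Ah))
    return np.sqrt(max(np.trace(A + B) - 2 * cross, 0.0))

def d_hellinger(A, B):
    cross = np.trace(sqrt_psd(A) @ sqrt_psd(B))
    return np.sqrt(max(np.trace(A + B) - 2 * cross, 0.0))

A, B = random_pd(4), random_pd(4)

# d_h is the Frobenius distance between the square roots
assert np.isclose(d_hellinger(A, B),
                  np.linalg.norm(sqrt_psd(A) - sqrt_psd(B), 'fro'))
# d_b <= d_h, a consequence of the Araki-Lieb-Thirring inequality
assert d_bures(A, B) <= d_hellinger(A, B) + 1e-9
```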

Recently, Minh [43] introduced the Alpha Procrustes distance as follows: for α > 0 and for two positive semi-definite matrices A and B,

d_{b,α}(A, B) = (1/α) d_b(A^{2α}, B^{2α}).


He showed that the Alpha Procrustes distances are the Riemannian distances corresponding to a family of Riemannian metrics on the manifold of positive definite matrices, which encompass both the Log-Euclidean and Wasserstein Riemannian metrics. Since the Alpha Procrustes distances are defined via the Bures-Wasserstein distance, we also call them the weighted Bures-Wasserstein distances. In the same spirit, in this chapter we define the weighted Hellinger metric for two positive semi-definite matrices as

d_{h,α}(A, B) = (1/α) d_h(A^{2α}, B^{2α}),

and then investigate its properties within this framework.

The results of this chapter are taken from [32].

2.1 Weighted Hellinger distance

Definition 2.1.1. For two positive semi-definite matrices A and B and for α > 0, the weighted Hellinger distance between A and B is defined as

d_{h,α}(A, B) = (1/α) d_h(A^{2α}, B^{2α}) = (1/α) (Tr(A^{2α} + B^{2α}) − 2 Tr(A^α B^α))^{1/2}. (2.1.1)

It turns out that d_{h,α}(A, B) is an interpolating metric between the Log-Euclidean and the Hellinger metrics. We start by showing that the limit of the weighted Hellinger distance as α tends to 0 is the Log-Euclidean distance. We also show that the weighted Bures-Wasserstein and weighted Hellinger distances are equivalent (Proposition 2.1.2).
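Definition (2.1.1) is direct to implement, and the convergence to the Log-Euclidean distance stated in Proposition 2.1.1 can be observed numerically; this sketch is our own illustration with hypothetical helper names.

```python
import numpy as np

rng = np.random.default_rng(9)

def fun_calc(A, f):
    # f(A) for a symmetric matrix A via its spectral decomposition
    w, U = np.linalg.eigh(A)
    return U @ np.diag(f(w)) @ U.T

def random_pd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

def d_h_alpha(A, B, alpha):
    # d_{h,a}(A,B) = (1/a) (Tr(A^{2a} + B^{2a}) - 2 Tr(A^a B^a))^{1/2}
    Aa = fun_calc(A, lambda w: w ** alpha)
    Ba = fun_calc(B, lambda w: w ** alpha)
    val = np.trace(Aa @ Aa + Ba @ Ba) - 2 * np.trace(Aa @ Ba)
    return np.sqrt(max(val, 0.0)) / alpha

A, B = random_pd(3), random_pd(3)
log_euclid = np.linalg.norm(fun_calc(A, np.log) - fun_calc(B, np.log), 'fro')

# d_{h,alpha} approaches the Log-Euclidean distance as alpha -> 0
vals = [d_h_alpha(A, B, a) for a in (0.5, 0.05, 0.002)]
errs = [abs(v - log_euclid) for v in vals]
assert errs[2] < errs[0] and errs[2] < 0.05
```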

Proposition 2.1.1. For two positive semi-definite matrices A and B,

lim_{α→0} d²_{h,α}(A, B) = ‖log(A) − log(B)‖²_F.

Proof. We rewrite the expression of d²_{h,α}(A, B) as

d²_{h,α}(A, B) = (1/α²) (Tr(A^{2α}) + Tr(B^{2α}) − 2 Tr(A^α B^α)) = (1/α²) ‖A^α − B^α‖²_F = ‖(A^α − I)/α − (B^α − I)/α‖²_F.

Tending α to zero and using lim_{α→0} (A^α − I)/α = log A, we obtain

lim_{α→0} d²_{h,α}(A, B) = ‖log A‖²_F + ‖log B‖²_F − 2 ⟨log A, log B⟩_F = ‖log A − log B‖²_F.

This completes the proof.

It is interesting to note that the weighted Bures-Wasserstein and weighted Hellinger distances are equivalent.

Proposition 2.1.2. For two positive semi-definite matrices A and B,

d_{b,α}(A, B) ≤ d_{h,α}(A, B) ≤ √2 d_{b,α}(A, B).
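A numerical check of this two-sided bound for several values of α (our own illustration; the helper names are ours):

```python
import numpy as np

rng = np.random.default_rng(10)

def fun_calc(A, f):
    w, U = np.linalg.eigh(A)
    return U @ np.diag(f(w)) @ U.T

def random_pd(n):
    X = rng.standard_normal((n, n))
    return X @ X.T + np.eye(n)

sqrt_psd = lambda M: fun_calc(M, lambda w: np.sqrt(np.clip(w, 0, None)))

def d_b_alpha(A, B, alpha):
    A2a = fun_calc(A, lambda w: w ** (2 * alpha))
    B2a = fun_calc(B, lambda w: w ** (2 * alpha))
    Ah = sqrt_psd(A2a)
    cross = np.trace(sqrt_psd(Ah @ B2a @ Ah))
    return np.sqrt(max(np.trace(A2a + B2a) - 2 * cross, 0.0)) / alpha

def d_h_alpha(A, B, alpha):
    Aa = fun_calc(A, lambda w: w ** alpha)
    Ba = fun_calc(B, lambda w: w ** alpha)
    val = np.trace(Aa @ Aa + Ba @ Ba) - 2 * np.trace(Aa @ Ba)
    return np.sqrt(max(val, 0.0)) / alpha

A, B = random_pd(4), random_pd(4)
for alpha in (0.3, 1.0, 2.0):
    db, dh = d_b_alpha(A, B, alpha), d_h_alpha(A, B, alpha)
    assert db - 1e-6 <= dh <= np.sqrt(2) * db + 1e-6
```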

Proof. According to the Araki-Lieb-Thirring inequality [43], we have

Tr((A^{1/2}BA^{1/2})^r) ≥ Tr(A^r B^r), |r| ≤ 1.

Replacing A with A^{2α}, B with B^{2α}, and r with 1/2, we obtain

Tr((A^α B^{2α} A^α)^{1/2}) ≥ Tr(A^α B^α),

which gives d_{b,α}(A, B) ≤ d_{h,α}(A, B). In the above inequality, replacing A with A^{2α}/Tr(A^{2α}) and B with B^{2α}/Tr(B^{2α}), we have
