From state vectors to mixed quantum states

Usually, quantum states are identified with normalized vectors of a particular Hilbert space \(\mathcal{H}\). Fortunately, in quantum information the full complexity of infinite-dimensional function spaces can typically be reduced to finite-dimensional spaces. Then, the state

\begin{equation}\vert\psi\rangle=\sum_{n=1}^{d} \psi_n \vert{n}\rangle \end{equation}

can be identified with the amplitude vector \((\psi_1, \ldots, \psi_d)\in \mathbb{C}^d\). The physical states are usually taken to be the normalized ones, forming the complex sphere

\begin{equation} S_d = \{ \vert\psi\rangle \in \mathbb{C}^d : {\overbrace{\langle \psi \vert\psi\rangle}^{\color{gray}=\sum_{i=1}^d \psi_i^* \psi_i}}\!\!\! =1 \}. \end{equation}

The states of any (finite-dimensional) system can therefore be understood via the (relatively simple) spherical geometry. This is useful in some problems, but many elementary concepts of quantum information can't be phrased in simple terms in the state-vector description. For instance, the expectation value of an observable \(X\),

\begin{equation} \bra \psi X \ket\psi = \sum_{i,j} \psi_i^* X_{ij} \psi_j, \end{equation}
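As a minimal numerical illustration of this quadratic form (the state and observable below are arbitrary choices, not anything from the text), one can check that the matrix expression and the explicit double sum agree:

```python
import numpy as np

# An illustrative qubit state |psi> = (cos t, sin t) and the Pauli X observable
t = 0.3
psi = np.array([np.cos(t), np.sin(t)], dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)

# <psi|X|psi> as a matrix expression...
expval = np.conj(psi) @ X @ psi

# ...and as the explicit double sum over psi_i^* X_ij psi_j,
# which makes the dependence on both psi and psi^* visible
expval_sum = sum(np.conj(psi[i]) * X[i, j] * psi[j]
                 for i in range(2) for j in range(2))
```

Both expressions produce the same (real) number, but the second form shows why symbolic manipulations must track the variables and their conjugates separately.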

is a quadratic form, and any algebraic manipulation requires careful handling of the fundamental set of variables \((\psi_1,\ldots,\psi_d)\) together with its conjugates \((\psi_1^*, \ldots, \psi_d^*)\).

But there is a tradeoff to be made here. Instead of having a simple set of states and complicated operations, we could split the difficulty more evenly [1]. This is one of the motivating points for the introduction of mixed states. The other is a consistent treatment of a partial loss of information about the observable properties related to a quantum state. If we know that a system is found precisely in the state \(\ket\psi\), all of the expectation values \(\bra \psi X \ket\psi\) can be calculated.

But what if we encounter a machine that produces an ensemble of states, with $\ket{\psi_k}$ being produced with probability $p_k$? This is just classical probability in the sense of 'incomplete knowledge about a real underlying process' – imagine that the states are produced according to an internal deterministic random number generator, and we simply don't know its seed.

Now, consider the real, physical measurement process: we measure the individual outcomes – the eigenvectors $\{\ket{x_n}\}_{n=1}^d$ corresponding to the eigenvalues $x_n$ of $X$ – multiple times, and record the empirical frequencies

\begin{equation} P_n = \frac{N_n}{\sum_m N_m} \end{equation}

which are used to determine the expectation value as $\braket{X}_\psi \approx \sum x_n P_n$, and the more total experimental runs $\sum_m N_m$ we have, the better the approximation. From this operational standpoint, if the state is itself prepared probabilistically, the empirical expectation value will be a weighted sum of the pure expectation values:

\begin{equation} \begin{array}{rl}P_n^{\text{(ens)}}&\approx\sum_k p_k P_n^{\ket{\psi_k}}, \text{and}\\ \braket{X}_{\text{(ens)}} &\approx \sum_n x_n P_n^{\text{(ens)}}=\sum_k p_k\braket{X}_{\ket{\psi_k}}\end{array}. \end{equation}
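This operational picture can be simulated directly. The sketch below (with an arbitrary illustrative ensemble and the Pauli $Z$ as the measured observable) samples a state from the ensemble, applies the Born rule to draw an outcome, and checks that the empirical mean converges to the weighted sum of pure-state expectation values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative ensemble: |0> with probability 0.7, |+> with probability 0.3
p = np.array([0.7, 0.3])
states = [np.array([1, 0], dtype=complex),
          np.array([1, 1], dtype=complex) / np.sqrt(2)]

Z = np.array([[1, 0], [0, -1]], dtype=complex)
x_vals, vecs = np.linalg.eigh(Z)            # eigenvalues x_n, eigenvectors |x_n>

N = 200_000
counts = np.zeros(len(x_vals))
for _ in range(N):
    psi = states[rng.choice(len(p), p=p)]     # the machine emits |psi_k> w.p. p_k
    probs = np.abs(vecs.conj().T @ psi) ** 2  # Born rule: P(x_n) = |<x_n|psi>|^2
    counts[rng.choice(len(x_vals), p=probs)] += 1

P_emp = counts / N                  # empirical frequencies P_n = N_n / sum_m N_m
expval_emp = x_vals @ P_emp         # approximates <Z>_(ens)

# The weighted sum of pure expectation values: sum_k p_k <psi_k|Z|psi_k>
expval_theory = sum(pk * (s.conj() @ Z @ s).real for pk, s in zip(p, states))
```

With these particular states, $\braket{Z}_{\ket0}=1$ and $\braket{Z}_{\ket+}=0$, so the ensemble average tends to $0.7$.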

If an arbitrary observable $X$ can be measured, it can be proven that for a nontrivial ensemble $\{(p_k,\ket{\psi_k})\}$, the resulting expectation values cannot be reproduced by any pure vector $\ket\psi\in\CC^d$ [2]. For instance, if a qubit state $\ket\psi\in\CC^2$ is measured in the observables described by the Pauli matrices,

\begin{equation} X=\begin{pmatrix}0&1\\1&0\end{pmatrix}, \quad Y=\begin{pmatrix}0&-i\\i&0\end{pmatrix}, \quad Z=\begin{pmatrix}1&0\\0&-1\end{pmatrix}, \end{equation}

it can be shown that

\begin{equation} \braket{X}_{\ket\psi}^2 +\braket{Y}_{\ket\psi}^2+\braket{Z}_{\ket\psi}^2=1. \end{equation}

However, an ensemble creating states $\ket0=(1,0)^T$ and $\ket1=(0,1)^T$ with equal probabilities $p_0=p_1=\frac12$ yields each of the expectation values \(\braket{X}_{\text{(ens)}}=\braket{Y}_{\text{(ens)}}=\braket{Z}_{\text{(ens)}}=0\).
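A quick numerical check of both claims – the Bloch-sphere identity for a (randomly drawn) pure state, and the vanishing ensemble expectation values:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expvals(psi):
    """The triple (<X>, <Y>, <Z>) for a normalized state vector psi."""
    return np.array([(psi.conj() @ M @ psi).real for M in (X, Y, Z)])

# Any normalized qubit state satisfies <X>^2 + <Y>^2 + <Z>^2 = 1
rng = np.random.default_rng(1)
v = rng.normal(size=2) + 1j * rng.normal(size=2)
psi = v / np.linalg.norm(v)
norm_sq = np.sum(expvals(psi) ** 2)

# The 50/50 ensemble of |0> and |1>: all three expectation values average to 0,
# so no single pure state can reproduce them
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
ens = 0.5 * expvals(ket0) + 0.5 * expvals(ket1)
```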

There is an elegant solution to this problem: mixed states. Observe that the formula for the ensemble expectation value can be rearranged [3]:

\begin{equation} \braket{X}_{\text{(ens)}} =\sum_k p_k\bra{\psi_k}X\ket{\psi_k}=\Tr \underbrace{\left( \sum_k p_k \ket{\psi_k}\bra{\psi_k}\right)}_{\rho} X. \end{equation}

The operator $\rho$ is a convex combination of projectors onto the ensemble states, and it captures exactly the observable properties of the ensemble – nothing more. This helps, since different ensembles can yield the same empirical probabilities $P_n$ (and hence the same empirical expectation values). See, for instance, that the ensemble \(\left\{\left(\frac12,\ket{\pm}\right)\right\}_\pm\) [4] also yields \(\braket{X}_{\text{(ens)}}=\braket{Y}_{\text{(ens)}}=\braket{Z}_{\text{(ens)}}=0\). And no wonder: this ensemble and the one composed of $\ket0$ and $\ket1$ yield the exact same density operator,

\begin{equation} \frac12 \ket0\bra0 +\frac12 \ket1\bra1 = \begin{pmatrix}\frac12 & 0\\0&\frac12\end{pmatrix}=\frac12 \ket+\bra+ +\frac12 \ket-\bra-. \end{equation}
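This equality of density operators, and the fact that $\Tr \rho X$ reproduces the ensemble average, can be verified in a few lines:

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def proj(psi):
    return np.outer(psi, psi.conj())   # the projector |psi><psi|

# Two different ensembles...
rho_01 = 0.5 * proj(ket0) + 0.5 * proj(ket1)
rho_pm = 0.5 * proj(plus) + 0.5 * proj(minus)
# ...give the same density operator, I/2

# <X>_rho = Tr(rho X) matches the ensemble average sum_k p_k <psi_k|X|psi_k>
X = np.array([[0, 1], [1, 0]], dtype=complex)
expval_rho = np.trace(rho_01 @ X).real
expval_ens = 0.5 * (plus.conj() @ X @ plus).real \
           + 0.5 * (minus.conj() @ X @ minus).real
```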

Note that the introduction of density operators actually simplifies some things: the expectation value becomes a linear map, $\braket{X}_\rho = \Tr \rho X$. But there is a price to pay for this simplicity: the geometry of the set of all possible states $\rho$ becomes less trivial than that of a sphere…

Or, at least, it does for systems more complicated than a single qubit. Since every Hermitian, unit-trace qubit operator can be parameterized as

\begin{equation} \frac12\begin{pmatrix} 1+z & x-iy\\ x+iy&1-z\end{pmatrix}, \label{eq:qubitparam} \end{equation}

it is easy to check that the physical states are exactly those with $x^2+y^2+z^2\le 1$: the eigenvalues of the matrix above are $\frac12(1\pm r)$ with $r=\sqrt{x^2+y^2+z^2}$, so positivity holds precisely for $r\le 1$. This is exactly the Bloch ball – not a complicated geometric object at all.
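A small sketch of this check, writing the Hermitian, unit-trace matrix with diagonal $(1+z, 1-z)/2$ and computing eigenvalues for parameter vectors inside and outside the unit ball (the sample points are arbitrary):

```python
import numpy as np

def rho(x, y, z):
    # Hermitian, unit-trace qubit matrix: diagonal (1+z, 1-z)/2,
    # off-diagonal entries (x -/+ iy)/2
    return 0.5 * np.array([[1 + z, x - 1j * y],
                           [x + 1j * y, 1 - z]])

# Eigenvalues are (1 +/- r)/2 with r = sqrt(x^2 + y^2 + z^2),
# so positive semidefiniteness holds exactly when r <= 1
inside = np.linalg.eigvalsh(rho(0.3, 0.2, 0.1))   # r < 1: both eigenvalues >= 0
outside = np.linalg.eigvalsh(rho(1.0, 1.0, 0.0))  # r > 1: one eigenvalue negative
```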

The situation changes drastically already for the qutrit, where the density operators are Hermitian, positive-semidefinite, unit-trace matrices of dimension $d=3$. A linear parameterization analogous to \eqref{eq:qubitparam} needs 8 real variables $(r_1, \ldots, r_8)\in\RR^8$ – and the set of allowed $\vec r$ is not a ball.
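One way to see that the qutrit state set cannot be a ball: move away from the maximally mixed state $\II/3$ by the same distance in two opposite directions, and observe that one endpoint is a valid state while the other is not (the particular diagonal perturbation below is just an illustrative choice):

```python
import numpy as np

# Two Hermitian, unit-trace qutrit matrices at the SAME distance from the
# maximally mixed state I/3, but in opposite directions:
s = 1 / 3
rho_a = np.diag([1/3 - s, 1/3 - s, 1/3 + 2 * s])  # = diag(0, 0, 1): a pure state
rho_b = np.diag([1/3 + s, 1/3 + s, 1/3 - 2 * s])  # = diag(2/3, 2/3, -1/3): not positive

dist_a = np.linalg.norm(rho_a - np.eye(3) / 3)
dist_b = np.linalg.norm(rho_b - np.eye(3) / 3)
# Equal distances, yet only rho_a is physical: the set of allowed parameter
# vectors in R^8 is not symmetric under reflection, so it cannot be a ball
# centered at I/3.
```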

  1. This is a common occurrence in quantum mechanics; see for instance the pair of Glauber-Sudarshan P and Husimi Q representations: the first is, in general, a highly singular distribution and not a function at all, while the second is a simple coherent-state expectation value. A middle ground is offered by Wigner functions, which are well-defined quasiprobability functions with only moderately complex properties. 

  2. But they can be obtained by a vector $\ket{\psi'}\in\CC^{d\cdot d'}$ and $X'=X\otimes\II_{d'}$, where $\otimes$ is a Kronecker product and $\II_{d'}$ denotes the identity matrix of dimension $d'$! At least, if $d'\ge d$. 

  3. This is a result of the cyclic property of the trace and kets being treated as $d\times 1$ matrices (with bras as $1\times d$ ones). Then, $ \bra\psi X \ket\psi=\Tr \ket\psi\bra\psi X$. 

  4. With $\ket\pm = \frac1{\sqrt2}(\ket0\pm\ket1)$.