6.1.2 Entropy

Prompts

  • What is von Neumann entropy? How does it generalize Shannon entropy to quantum states?

  • Why is entropy zero for pure states but positive for mixed states? What does this tell you about uncertainty?

  • How does the maximum entropy principle connect to the thermal state and the Boltzmann distribution?

  • Why is the partition function \(Z = \operatorname{Tr}(\mathrm{e}^{-\beta \hat{H}})\) the key bridge between quantum mechanics and thermodynamics?

  • Why is the Helmholtz free energy \(F = -k_B T \ln Z\) the natural thermodynamic potential built from the partition function? How does \(F = \langle E \rangle - TS\) balance energy against entropy?

Lecture Notes

Overview

Von Neumann entropy \(S(\hat{\rho}) = -\operatorname{Tr}(\hat{\rho} \ln \hat{\rho})\) quantifies how much we don’t know about a quantum state. It is zero for pure states (perfect knowledge) and positive for mixed states (ignorance). The maximum entropy principle shows that thermal states—those that maximize entropy subject to a constraint on average energy—are precisely the Boltzmann distributions. This simple principle bridges quantum mechanics to thermodynamics without assuming temperature a priori.

Von Neumann Entropy

Von Neumann Entropy

For a density matrix \(\hat{\rho}\) on a \(d\)-dimensional Hilbert space, the von Neumann entropy is:

\[ S(\hat{\rho}) = -\operatorname{Tr}(\hat{\rho} \ln \hat{\rho}) \tag{208} \]

Equivalently, using spectral decomposition \(\hat{\rho} = \sum_i \lambda_i \vert \psi_i\rangle\langle\psi_i\vert\) (eigenvalues \(\lambda_i \geq 0\)):

\[ S(\hat{\rho}) = -\sum_i \lambda_i \ln \lambda_i \tag{209} \]

with the convention \(0 \ln 0 = 0\).

Bounds and limits:

\[ 0 \leq S(\hat{\rho}) \leq \ln d \tag{210} \]

The minimum \(S = 0\) occurs for pure states (a single eigenvalue equals 1). The maximum \(S = \ln d\) occurs for the maximally mixed state \(\hat{\rho} = \hat{I}/d\) (all eigenvalues equal \(1/d\)).
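The definition and its two bounds can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `von_neumann_entropy`, the cutoff `1e-12`, and the choice \(d = 4\) are illustrative assumptions, not part of the notes.

```python
import numpy as np

def von_neumann_entropy(rho):
    """Return S(rho) = -sum_i lambda_i ln(lambda_i) in nats.

    Assumes rho is a Hermitian, unit-trace, positive semidefinite matrix.
    """
    eigenvalues = np.linalg.eigvalsh(rho)
    # Implement the convention 0 ln 0 = 0 by dropping (numerically) zero eigenvalues.
    positive = eigenvalues[eigenvalues > 1e-12]
    return float(-np.sum(positive * np.log(positive)))

d = 4  # illustrative Hilbert-space dimension

# Pure state |0><0|: a single eigenvalue equal to 1.
pure = np.zeros((d, d))
pure[0, 0] = 1.0

# Maximally mixed state I/d: all eigenvalues equal to 1/d.
maximally_mixed = np.eye(d) / d

print("pure state:      S =", von_neumann_entropy(pure))
print("maximally mixed: S =", von_neumann_entropy(maximally_mixed), " ln d =", np.log(d))
```

Working from the eigenvalues avoids computing a matrix logarithm, which is ill-defined when \(\hat{\rho}\) has zero eigenvalues.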

Example: Pure and Mixed States

Pure state: \(\hat{\rho} = \vert \psi\rangle\langle\psi\vert\) has one eigenvalue \(\lambda_1 = 1\). Thus:

\[ S(\hat{\rho}) = -1 \cdot \ln 1 = 0 \]

Maximally mixed qubit: \(\hat{\rho} = \hat{I}/2\) (2-dimensional) has eigenvalues \(\lambda_1 = \lambda_2 = 1/2\). Thus:

\[ S(\hat{\rho}) = -2 \cdot \frac{1}{2} \ln\frac{1}{2} = \ln 2 \]

Thermal qubit at finite temperature: \(\hat{\rho} = \frac{\mathrm{e}^{-\beta E_1} \vert 1\rangle\langle 1\vert + \mathrm{e}^{-\beta E_2} \vert 2\rangle\langle 2\vert}{Z}\) where \(Z = \mathrm{e}^{-\beta E_1} + \mathrm{e}^{-\beta E_2}\). The entropy ranges between 0 (zero temperature, ground state only) and \(\ln 2\) (infinite temperature, maximally mixed).
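The interpolation between the two limits for the thermal qubit can be sketched numerically. The energies \(E_1 = 0\), \(E_2 = 1\) and the sampled values of \(\beta\) below are assumptions chosen for illustration, with \(\beta\) measured in inverse energy units.

```python
import numpy as np

def thermal_qubit_entropy(beta, E1=0.0, E2=1.0):
    """Entropy (in nats) of a two-level thermal state with Boltzmann populations."""
    energies = np.array([E1, E2])
    # Shifting by the lowest energy leaves the populations unchanged
    # and keeps the exponentials from overflowing.
    weights = np.exp(-beta * (energies - energies.min()))
    populations = weights / weights.sum()
    populations = populations[populations > 0]  # convention 0 ln 0 = 0
    return float(-np.sum(populations * np.log(populations)))

for beta in [0.0, 0.5, 1.0, 5.0, 100.0]:
    print(f"beta = {beta:6.1f}   S = {thermal_qubit_entropy(beta):.6f}")

print("ln 2 =", np.log(2))
```

At \(\beta = 0\) the populations are equal and the entropy is \(\ln 2\); for large \(\beta\) nearly all the weight sits on the lower level and the entropy approaches zero.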

Properties of Entropy

Properties of Von Neumann Entropy

  1. Non-negativity: \(S(\hat{\rho}) \geq 0\), with equality iff \(\hat{\rho}\) is a pure state. Zero entropy means perfect knowledge of the state.

  2. Upper bound: \(S(\hat{\rho}) \leq \ln d\) for \(d\)-dimensional systems, with equality iff \(\hat{\rho} = \hat{I}/d\). Complete ignorance corresponds to the maximally mixed state.

  3. Unitary invariance: \(S(\hat{U}\hat{\rho} \hat{U}^\dagger) = S(\hat{\rho})\) for any unitary \(\hat{U}\). Conjugation by a unitary leaves the eigenvalues of \(\hat{\rho}\) unchanged, so entropy is unchanged by reversible quantum operations: unitaries are information-preserving.

  4. Concavity: For \(p \in [0,1]\),

\[ S(p\hat{\rho}_1 + (1-p)\hat{\rho}_2) \geq p S(\hat{\rho}_1) + (1-p) S(\hat{\rho}_2) \tag{211} \]

Mixing states cannot decrease entropy: the mixture is at least as uncertain as the average of its components, because ignorance about which component was prepared adds to the internal entropy of each component.
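Unitary invariance and concavity can be spot-checked on random states. This is a numerical illustration rather than a proof; the helper names, the dimension 3, the seed, and the mixing weight \(p = 0.3\) are all assumptions made for the sketch.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) in nats, computed from the eigenvalues of a Hermitian matrix."""
    eigenvalues = np.linalg.eigvalsh(rho)
    positive = eigenvalues[eigenvalues > 1e-12]
    return float(-np.sum(positive * np.log(positive)))

def random_density_matrix(d, rng):
    """A @ A^dagger is positive semidefinite; dividing by the trace normalizes it."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def random_unitary(d, rng):
    """The Q factor of a QR decomposition of a complex matrix is unitary."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(a)
    return q

rng = np.random.default_rng(0)
rho1 = random_density_matrix(3, rng)
rho2 = random_density_matrix(3, rng)
U = random_unitary(3, rng)
p = 0.3

# Property 3: entropy before and after a unitary rotation.
s_before = von_neumann_entropy(rho1)
s_after = von_neumann_entropy(U @ rho1 @ U.conj().T)

# Property 4: entropy of the mixture versus the average entropy.
s_mixture = von_neumann_entropy(p * rho1 + (1 - p) * rho2)
s_average = p * von_neumann_entropy(rho1) + (1 - p) * von_neumann_entropy(rho2)

print("unitary invariance:", s_before, s_after)
print("concavity:", s_mixture, ">=", s_average)
```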

Connection to Information Theory

Von Neumann entropy is the quantum generalization of Shannon entropy. For a classical probability distribution \(P = \{p_i\}\):

\[ H(P) = -\sum_i p_i \ln p_i \]

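If we measure \(\hat{\rho}\) in an orthonormal basis \(\{\vert e_i\rangle\}\), outcome \(i\) occurs with probability \(p_i = \langle e_i\vert \hat{\rho}\vert e_i\rangle\), and the outcome distribution has Shannon entropy \(H(P)\). This value depends on the basis and satisfies \(H(P) \geq S(\hat{\rho})\), with equality when the measurement basis is an eigenbasis of \(\hat{\rho}\) (then \(p_i = \lambda_i\)). The von Neumann entropy \(S(\hat{\rho})\) is basis-independent: it is the smallest Shannon entropy any such measurement can produce, and it captures the intrinsic uncertainty of the state.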

Interpretation: \(S(\hat{\rho})\) measures how much we don’t know about the state \(\hat{\rho}\).

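  • \(S = 0\): We know the state exactly (pure state); a measurement in a basis containing that state gives a certain outcome.

  • \(S > 0\): We have incomplete information; every projective measurement in an orthonormal basis has an uncertain outcome, since \(H(P) \geq S(\hat{\rho}) > 0\).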
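The difference between the basis-dependent Shannon entropy of measurement outcomes and the basis-independent von Neumann entropy can be seen on a single qubit. The sketch below measures the pure state \(\vert 0\rangle\langle 0\vert\) in two bases; the helper names and the choice of bases are assumptions for illustration.

```python
import numpy as np

def shannon_entropy(probabilities):
    """H(P) = -sum_i p_i ln p_i in nats, with the convention 0 ln 0 = 0."""
    p = np.asarray(probabilities, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log(p)))

def measurement_probabilities(rho, basis):
    """Return p_i = <e_i| rho |e_i>, where the columns of `basis` are the |e_i>."""
    return np.real(np.diag(basis.conj().T @ rho @ basis))

rho = np.array([[1.0, 0.0],
                [0.0, 0.0]])  # pure state |0><0|

computational = np.eye(2)                    # columns |0>, |1>
plus_minus = np.array([[1.0, 1.0],
                       [1.0, -1.0]]) / np.sqrt(2)  # columns |+>, |->

H_computational = shannon_entropy(measurement_probabilities(rho, computational))
H_plus_minus = shannon_entropy(measurement_probabilities(rho, plus_minus))
S = shannon_entropy(np.linalg.eigvalsh(rho))  # von Neumann entropy from eigenvalues

print("H in {|0>, |1>}:", H_computational)
print("H in {|+>, |->}:", H_plus_minus)
print("S(rho):         ", S)
```

The computational basis is an eigenbasis of this state, so its outcome entropy coincides with \(S(\hat{\rho})\); the \(\{\vert +\rangle, \vert -\rangle\}\) basis gives equal probabilities and a larger Shannon entropy.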

Maximum Entropy Principle

Among all density matrices with a fixed average energy \(\langle E \rangle = \operatorname{Tr}(\hat{\rho} \hat{H})\), which one maximizes entropy?

Thermal State and Boltzmann Distribution

The state maximizing entropy at fixed \(\langle E \rangle\) is the thermal state:

\[ \hat{\rho}_{\text{th}} = \frac{\mathrm{e}^{-\beta \hat{H}}}{Z} \tag{212} \]

where \(Z = \operatorname{Tr}(\mathrm{e}^{-\beta \hat{H}})\) is the partition function and \(\beta = 1/(k_B T)\).

In the energy eigenbasis, diagonal elements are Boltzmann weights:

\[ P_n = \frac{\mathrm{e}^{-\beta E_n}}{Z} \tag{213} \]

where \(E_n\) are energy eigenvalues.

Key insight: The Boltzmann distribution is not an assumption—it follows from the principle of maximum entropy. When we know only the average energy, the state of maximum ignorance (maximum entropy) is thermal.
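The maximization can be sketched with Lagrange multipliers. As a simplifying assumption, restrict to states that are diagonal in the energy eigenbasis, with populations \(p_n\); the general case needs an additional argument that is not shown here. The multipliers \(\alpha\) and \(\beta\) are introduced for this sketch. Maximize \(-\sum_n p_n \ln p_n\) subject to normalization and a fixed average energy:

\[ \mathcal{L} = -\sum_n p_n \ln p_n - \alpha\Big(\sum_n p_n - 1\Big) - \beta\Big(\sum_n p_n E_n - \langle E \rangle\Big) \]

Setting the derivative with respect to each \(p_n\) to zero:

\[ \frac{\partial \mathcal{L}}{\partial p_n} = -\ln p_n - 1 - \alpha - \beta E_n = 0 \quad\Longrightarrow\quad p_n = \mathrm{e}^{-1-\alpha}\, \mathrm{e}^{-\beta E_n} \]

Normalization fixes \(\mathrm{e}^{-1-\alpha} = 1/Z\) with \(Z = \sum_n \mathrm{e}^{-\beta E_n}\), which reproduces the Boltzmann weights \(p_n = \mathrm{e}^{-\beta E_n}/Z\). The remaining multiplier \(\beta\) is determined by the energy constraint, and because the entropy is concave this stationary point is a maximum.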

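Partition Function and Thermodynamics

The partition function \(Z(\beta) = \operatorname{Tr}(\mathrm{e}^{-\beta \hat{H}})\) is the bridge between microscopic quantum mechanics and macroscopic thermodynamics. With energies measured from the ground state, \(Z\) roughly counts the thermally accessible energy levels: it tends to the ground-state degeneracy as \(T \to 0\) and to the Hilbert-space dimension \(d\) as \(T \to \infty\).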

Average Energy:

\[ \langle E \rangle = \operatorname{Tr}(\hat{\rho}_{\text{th}} \hat{H}) = -\frac{\mathrm{d} \ln Z}{\mathrm{d} \beta} \tag{214} \]
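The identity between the Boltzmann-weighted average energy and \(-\mathrm{d}\ln Z/\mathrm{d}\beta\) can be verified with a finite difference. The four-level spectrum, the value of \(\beta\), and the step size below are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical spectrum in arbitrary energy units (an assumption for this sketch).
energies = np.array([0.0, 0.5, 1.3, 2.0])

def log_Z(beta):
    """ln Z(beta) = ln sum_n exp(-beta E_n)."""
    return float(np.log(np.sum(np.exp(-beta * energies))))

def average_energy(beta):
    """<E> = sum_n E_n exp(-beta E_n) / Z, the Boltzmann-weighted mean."""
    weights = np.exp(-beta * energies)
    return float(np.sum(energies * weights) / np.sum(weights))

beta = 1.2
step = 1e-5

# Central finite difference for -d(ln Z)/d(beta).
from_derivative = -(log_Z(beta + step) - log_Z(beta - step)) / (2 * step)
direct = average_energy(beta)

print("direct average:   ", direct)
print("from -dlnZ/dbeta: ", from_derivative)
```

The two numbers agree up to the finite-difference error, which is why \(\ln Z\) is often treated as a generating function for thermal averages.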

Free Energy (Helmholtz):

\[ F = -k_B T \ln Z \tag{215} \]

where \(T = 1/(k_B \beta)\).

Entropy in terms of \(Z\):

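From \(\hat{\rho}_{\text{th}} = \mathrm{e}^{-\beta \hat{H}}/Z\) we have \(\ln \hat{\rho}_{\text{th}} = -\beta \hat{H} - \ln Z\), so \(-\operatorname{Tr}(\hat{\rho}_{\text{th}} \ln \hat{\rho}_{\text{th}}) = \beta \langle E \rangle + \ln Z\). The von Neumann entropy is dimensionless (in nats); multiplying by \(k_B\) gives the thermodynamic entropy:

\[ S = k_B \, S(\hat{\rho}_{\text{th}}) = k_B [\beta \langle E \rangle + \ln Z] \tag{216} \]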

Equivalently, \(S = -(\partial F/\partial T)\vert_V\) (standard thermodynamic relation).

Thermodynamic Identities:

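Combining the first law, \(\mathrm{d}\langle E \rangle = T \, \mathrm{d}S - P \, \mathrm{d}V\), with \(F = \langle E \rangle - TS\) gives:

\[ \mathrm{d}F = -S \, \mathrm{d}T - P \, \mathrm{d}V \]

At fixed volume this reduces to \(\mathrm{d}F = -S \, \mathrm{d}T\), which reproduces \(S = -(\partial F/\partial T)\vert_V\). From \(F = \langle E \rangle - TS\):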

\[ S = \frac{\langle E \rangle - F}{T} = \frac{\langle E \rangle + k_B T \ln Z}{T} \]

which is consistent with equation (216).
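These relations can be cross-checked numerically for a small system. The sketch below assumes units with \(k_B = 1\) and a hypothetical four-level spectrum; it compares the entropy computed from the Boltzmann populations with the one obtained from \(-\partial F/\partial T\), and tests \(F = \langle E \rangle - TS\).

```python
import numpy as np

k_B = 1.0  # assumption: units in which Boltzmann's constant is 1
energies = np.array([0.0, 1.0, 1.0, 2.5])  # hypothetical spectrum with one degenerate level

def thermodynamics(T):
    """Return (<E>, S, F, ln Z) for the canonical ensemble at temperature T."""
    beta = 1.0 / (k_B * T)
    weights = np.exp(-beta * energies)
    Z = weights.sum()
    populations = weights / Z
    E = float(np.sum(populations * energies))
    S = float(-k_B * np.sum(populations * np.log(populations)))
    F = float(-k_B * T * np.log(Z))
    return E, S, F, float(np.log(Z))

T = 0.8
E, S, F, lnZ = thermodynamics(T)

# S from the thermodynamic relation S = -(dF/dT), via a central finite difference.
step = 1e-5
S_from_F = -(thermodynamics(T + step)[2] - thermodynamics(T - step)[2]) / (2 * step)

print("F           =", F)
print("<E> - T S   =", E - T * S)
print("S (direct)  =", S)
print("S = -dF/dT  =", S_from_F)
print("k_B(beta<E> + ln Z) =", k_B * (E / (k_B * T) + lnZ))
```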

Summary

  • Von Neumann entropy: \(S(\hat{\rho}) = -\operatorname{Tr}(\hat{\rho} \ln \hat{\rho}) = -\sum_i \lambda_i \ln \lambda_i\) over the eigenvalues \(\lambda_i\) of \(\hat{\rho}\) quantifies how much is unknown about a quantum state.

  • Bounds: \(0 \leq S(\hat{\rho}) \leq \ln d\) — the minimum \(S = 0\) is attained only by pure states, the maximum \(S = \ln d\) only by the maximally mixed state \(\hat{I}/d\).

  • Structural properties: entropy is non-negative, invariant under unitary evolution (reversible dynamics preserve information), and concave — mixing states can only increase entropy.

  • Link to information theory: \(S(\hat{\rho})\) is the quantum generalization of the Shannon entropy \(-\sum_i p_i \ln p_i\). Measuring \(\hat{\rho}\) in a fixed basis gives an outcome distribution whose Shannon entropy depends on the basis and is at least \(S(\hat{\rho})\), with equality in an eigenbasis of \(\hat{\rho}\); \(S(\hat{\rho})\) itself is basis-independent and captures the intrinsic quantum uncertainty.

  • Maximum entropy principle: maximizing \(S\) at fixed average energy \(\langle E \rangle\) singles out the thermal state \(\hat{\rho}_{\text{th}} = \mathrm{e}^{-\beta \hat{H}}/Z\); the Boltzmann weights \(P_n = \mathrm{e}^{-\beta E_n}/Z\) are derived, not assumed, with \(\beta = 1/(k_B T)\) the Lagrange multiplier fixing the energy.

  • Partition function bridges to thermodynamics: \(Z = \operatorname{Tr}(\mathrm{e}^{-\beta \hat{H}})\) generates the macroscopic quantities — average energy \(\langle E \rangle = -\mathrm{d}\ln Z/\mathrm{d}\beta\), free energy \(F = -k_B T \ln Z\), and entropy \(S = k_B[\beta \langle E \rangle + \ln Z]\) — connecting microscopic quantum mechanics to thermodynamics.

See Also

  • 6.1.1 Mixed States: Density matrices, convex combinations, and the state space on which entropy is defined.

  • 6.1.3 Quantum Statistics: Bose-Einstein and Fermi-Dirac occupation numbers, obtained by applying the thermal state and partition function to single bosonic and fermionic modes.

  • 6.2.2 Entanglement Entropy: Entropy of subsystems, a distinct but related use of the partial trace and \(-\operatorname{Tr}(\hat{\rho}\ln\hat{\rho})\).

Homework

1. Entropy and measurement outcomes. A density matrix \(\hat{\rho}\) is measured in an orthonormal basis \(\{\vert e_i\rangle\}\), producing outcome \(i\) with probability \(p_i = \langle e_i\vert \hat{\rho}\vert e_i\rangle\).

(a) Show that \(\{p_i\}\) is a valid probability distribution: \(p_i \geq 0\) and \(\sum_i p_i = 1\).

(b) For the maximally mixed qubit \(\hat{\rho} = \hat{I}/2\), compute the Shannon entropy \(H = -\sum_i p_i \ln p_i\) of the outcome distribution and verify that it equals the von Neumann entropy \(S(\hat{\rho})\) for any choice of measurement basis.

(c) For the pure state \(\hat{\rho} = \vert 0\rangle\langle 0\vert\), compute \(H\) when the measurement uses (i) the basis \(\{\vert 0\rangle, \vert 1\rangle\}\) and (ii) the basis \(\{\vert +\rangle, \vert -\rangle\}\). Explain why \(H\) depends on the basis while the von Neumann entropy \(S(\hat{\rho}) = 0\) does not.

2. Entropy of a diagonal mixture. A qubit is in the state \(\hat{\rho} = p\vert 0\rangle\langle 0\vert + (1-p)\vert 1\rangle\langle 1\vert\). Compute the entropy \(S(\hat{\rho})\) as a function of \(p \in [0,1]\). Show that \(S\) is maximized at \(p = 1/2\). What is the physical interpretation?

3. Entropy concavity. Prove that entropy is concave: for any \(\hat{\rho}_1, \hat{\rho}_2\) and \(p \in [0,1]\),

\[ S(p\hat{\rho}_1 + (1-p)\hat{\rho}_2) \geq pS(\hat{\rho}_1) + (1-p)S(\hat{\rho}_2) \]

4. Unitary invariance. Show that von Neumann entropy is invariant under unitary transformations: \(S(\hat{U}\hat{\rho} \hat{U}^\dagger) = S(\hat{\rho})\) for any unitary \(\hat{U}\). Why does this make physical sense?

5. Thermal state entropy. For a thermal state \(\hat{\rho} = \mathrm{e}^{-\beta \hat{H}}/Z\) with Hamiltonian \(\hat{H}\) having energy levels \(E_n\) with multiplicity \(g_n\) (degeneracy), write the partition function as \(Z = \sum_n g_n \mathrm{e}^{-\beta E_n}\). Express \(\langle E \rangle\) and \(S\) in terms of \(Z(\beta)\).

6. Free energy relations. Starting from \(F = -k_B T \ln Z\), show that \(S = -(\partial F/\partial T)\vert _V\) and \(\langle E \rangle = F + TS\). Verify that these satisfy the first law of thermodynamics.

7. Partition function and entropy. A system has partition function \(Z(\beta) = 1 + 2\mathrm{e}^{-\beta}\). Compute the free energy \(F(\beta)\), average energy \(\langle E \rangle(\beta)\), and entropy \(S(\beta)\). At what temperature do \(\langle E \rangle\) and \(S\) reach half their maximum values?

8. Oscillator partition function. For a harmonic oscillator with Hamiltonian \(\hat{H} = \hbar \omega (\hat{a}^\dagger \hat{a} + 1/2)\) (ground state energy \(\hbar\omega/2\)), compute the partition function \(Z(\beta)\) and show that the average occupation number is \(\langle n \rangle = 1/(\mathrm{e}^{\beta\hbar\omega} - 1)\) (Bose-Einstein distribution).

9. Two-level entropy. A two-state system has \(E_{1} = 0\) and \(E_{2} = \Delta\), both non-degenerate.

(a) Compute the canonical-ensemble entropy \(S(\beta) = -p_{1}\ln p_{1} - p_{2}\ln p_{2}\) as a function of \(\beta\Delta\) using \(p_{1,2} = \mathrm{e}^{-\beta E_{1,2}}/Z\).

(b) Show that \(S(\beta)\) is monotonically decreasing on \(\beta\in[0,\infty)\): at \(\beta = 0\) both populations are equal and \(S = \ln 2\) (maximally mixed), while as \(\beta\to\infty\) the system collapses into the ground state and \(S\to 0\).

(c) The maximum entropy \(\ln 2\) therefore occurs at \(\beta = 0\) (infinite temperature), not at any interior \(\beta > 0\). Explain physically why the only way to make a finite-level thermal system more mixed is to flatten the Boltzmann weights — i.e., raise \(T\).