6.1.2 Entropy#
Prompts
What is von Neumann entropy? How does it generalize Shannon entropy to quantum states?
Why is entropy zero for pure states but positive for mixed states? What does this tell you about uncertainty?
How does the maximum entropy principle connect to the thermal state and the Boltzmann distribution?
Why is the partition function \(Z = \mathrm{Tr}(\mathrm{e}^{-\beta \hat{H}})\) the key bridge between quantum mechanics and thermodynamics?
Can you derive free energy \(F = -k_B T \ln Z\) from the partition function?
Lecture Notes#
Overview#
Von Neumann entropy \(S(\hat{\rho}) = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho})\) quantifies how much we don’t know about a quantum state. It is zero for pure states (perfect knowledge) and positive for mixed states (ignorance). The maximum entropy principle shows that thermal states—those that maximize entropy subject to a constraint on average energy—are precisely the Boltzmann distributions. This simple principle bridges quantum mechanics to thermodynamics without assuming temperature a priori.
Von Neumann Entropy#
Von Neumann Entropy
For a density matrix \(\hat{\rho}\) on a \(d\)-dimensional Hilbert space, the von Neumann entropy is:

\[S(\hat{\rho}) = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho})\]

Equivalently, using spectral decomposition \(\hat{\rho} = \sum_i \lambda_i |\psi_i\rangle\langle\psi_i|\) (eigenvalues \(\lambda_i \geq 0\)):

\[S(\hat{\rho}) = -\sum_i \lambda_i \ln \lambda_i\]

with the convention \(0 \ln 0 = 0\).
Bounds and limits:
The minimum \(S = 0\) occurs for pure states (a single eigenvalue equals 1). The maximum \(S = \ln d\) occurs for the maximally mixed state \(\hat{\rho} = \hat{I}/d\) (all eigenvalues equal \(1/d\)).
Example: Pure and Mixed States
Pure state: \(\hat{\rho} = |\psi\rangle\langle\psi|\) has one eigenvalue \(\lambda_1 = 1\). Thus:

\[S = -1 \ln 1 = 0\]

Maximally mixed qubit: \(\hat{\rho} = \hat{I}/2\) (2-dimensional) has eigenvalues \(\lambda_1 = \lambda_2 = 1/2\). Thus:

\[S = -2 \cdot \tfrac{1}{2} \ln \tfrac{1}{2} = \ln 2\]
Thermal qubit at finite temperature: \(\hat{\rho} = \frac{\mathrm{e}^{-\beta E_1} |1\rangle\langle 1| + \mathrm{e}^{-\beta E_2} |2\rangle\langle 2|}{Z}\) where \(Z = \mathrm{e}^{-\beta E_1} + \mathrm{e}^{-\beta E_2}\). The entropy ranges between 0 (zero temperature, ground state only) and \(\ln 2\) (infinite temperature, maximally mixed).
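These three cases can be checked numerically. A minimal sketch, assuming \(k_B = 1\) and natural logarithms; the helper `vn_entropy` and the energy values are illustrative choices, not from any library:

```python
import numpy as np

def vn_entropy(rho):
    """Von Neumann entropy S = -sum_i lambda_i ln(lambda_i), with k_B = 1."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]          # convention: 0 ln 0 = 0
    return -np.sum(lam * np.log(lam))

# Pure state |0><0|: single eigenvalue 1, so S = 0
pure = np.array([[1.0, 0.0], [0.0, 0.0]])

# Maximally mixed qubit I/2: eigenvalues 1/2, 1/2, so S = ln 2
mixed = np.eye(2) / 2

# Thermal qubit with E_1 = 0, E_2 = 1 at inverse temperature beta = 1
beta = 1.0
E = np.array([0.0, 1.0])
weights = np.exp(-beta * E)
thermal = np.diag(weights / weights.sum())

print(vn_entropy(pure))     # ~ 0
print(vn_entropy(mixed))    # ~ ln 2 ≈ 0.693
print(vn_entropy(thermal))  # strictly between 0 and ln 2
```

Diagonalizing first (`eigvalsh`) avoids computing the matrix logarithm directly; for a density matrix the two routes agree because \(\mathrm{Tr}(\hat{\rho} \ln \hat{\rho})\) depends only on the spectrum.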
Discussion
Entropy can be measured through heat capacity: integrating \(C(T')/T'\) up from absolute zero gives \(S(T)\). At high temperatures, a system has many accessible energy levels and entropy is large. At low temperatures, only the ground state is occupied and entropy vanishes. Why is entropy the right quantity to measure this ignorance?
Properties of Entropy#
Properties of Von Neumann Entropy
Non-negativity: \(S(\hat{\rho}) \geq 0\), with equality iff \(\hat{\rho}\) is a pure state.
Upper bound: \(S(\hat{\rho}) \leq \ln d\) for \(d\)-dimensional systems, with equality iff \(\hat{\rho} = \hat{I}/d\).
Unitary invariance: \(S(U\hat{\rho} U^\dagger) = S(\hat{\rho})\) for any unitary \(U\). Entropy is unchanged by reversible quantum operations.
Concavity: For \(p \in [0,1]\),

\[S(p\hat{\rho}_1 + (1-p)\hat{\rho}_2) \geq pS(\hat{\rho}_1) + (1-p)S(\hat{\rho}_2)\]
Mixing states increases entropy: ignorance about which mixture we’re in adds to internal entropy.
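Both the unitary-invariance and concavity properties can be spot-checked numerically. A sketch with arbitrary illustrative states and a real rotation as the unitary (the helper `vn_entropy` is ours):

```python
import numpy as np

def vn_entropy(rho):
    """Von Neumann entropy from the spectrum of rho (k_B = 1)."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return -np.sum(lam * np.log(lam))

rho1 = np.diag([0.9, 0.1])   # nearly pure mixture
rho2 = np.diag([0.5, 0.5])   # maximally mixed qubit

# Unitary invariance: a rotation changes the eigenbasis, not the spectrum
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
S_rotated = vn_entropy(U @ rho1 @ U.T)
print(S_rotated, vn_entropy(rho1))   # equal

# Concavity: entropy of the mixture exceeds the average of entropies
p = 0.3
S_mix = vn_entropy(p * rho1 + (1 - p) * rho2)
S_avg = p * vn_entropy(rho1) + (1 - p) * vn_entropy(rho2)
print(S_mix >= S_avg)                # True
```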
Connection to Information Theory#
Von Neumann entropy is the quantum generalization of Shannon entropy. For a classical probability distribution \(P = \{p_i\}\):

\[H(P) = -\sum_i p_i \ln p_i\]
If we measure \(\hat{\rho}\) in basis \(\{|\psi_i\rangle\}\) with outcomes \(i\) having probability \(p_i = \langle\psi_i|\hat{\rho}|\psi_i\rangle\), we obtain the Shannon entropy \(H(P)\) of the measurement outcome distribution. But von Neumann entropy \(S(\hat{\rho})\) is basis-independent and captures the intrinsic quantum uncertainty.
Interpretation: \(S(\hat{\rho})\) measures how much we don’t know about the state \(\hat{\rho}\).
\(S = 0\): We know the state exactly (pure state).
\(S > 0\): We have incomplete information; measuring may yield different outcomes.
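The distinction between basis-dependent measurement entropy and intrinsic quantum entropy shows up already for a single qubit. A sketch using the pure state \(|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}\): its von Neumann entropy is zero, yet measuring it in the \(z\)-basis yields 50/50 outcomes with Shannon entropy \(\ln 2\).

```python
import numpy as np

# Pure state |+> = (|0> + |1>)/sqrt(2)
plus = np.array([1.0, 1.0]) / np.sqrt(2)
rho = np.outer(plus, plus)

# Von Neumann entropy from eigenvalues (basis-independent): S = 0
lam = np.linalg.eigvalsh(rho)
lam = lam[lam > 1e-12]
S = -np.sum(lam * np.log(lam))

# Shannon entropy of z-basis measurement outcomes: p_i = <i|rho|i>
p = np.diag(rho)
H = -np.sum(p[p > 0] * np.log(p[p > 0]))

print(S)  # ~ 0: the state is pure
print(H)  # ln 2: z-basis outcomes are 50/50
```

In general \(H(P) \geq S(\hat{\rho})\), with equality when the measurement basis diagonalizes \(\hat{\rho}\).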
Maximum Entropy Principle#
Among all density matrices with a fixed average energy \(\langle E \rangle = \mathrm{Tr}(\hat{\rho} \hat{H})\), which one maximizes entropy?
Derivation: Maximum Entropy and Thermal State
We maximize \(S(\hat{\rho}) = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho})\) subject to:
\(\mathrm{Tr}(\hat{\rho}) = 1\) (normalization)
\(\mathrm{Tr}(\hat{\rho} \hat{H}) = E\) (fixed average energy)
Using Lagrange multipliers \(\alpha\) and \(\beta\) for the two constraints, we extremize:

\[\mathcal{L} = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho}) - \alpha\,[\mathrm{Tr}(\hat{\rho}) - 1] - \beta\,[\mathrm{Tr}(\hat{\rho} \hat{H}) - E]\]

Setting the variation with respect to \(\hat{\rho}\) to zero (and absorbing the constant \(\hat{I}\) term from \(\delta\,\mathrm{Tr}(\hat{\rho}\ln\hat{\rho}) = \mathrm{Tr}[\delta\hat{\rho}\,(\ln\hat{\rho} + \hat{I})]\) into \(\alpha\)) gives \(-\ln \hat{\rho} - \alpha \hat{I} - \beta \hat{H} = 0\), or:

\[\hat{\rho} = \mathrm{e}^{-\alpha}\, \mathrm{e}^{-\beta \hat{H}}\]

Applying normalization: \(\mathrm{Tr}(\hat{\rho}) = 1\) fixes \(\mathrm{e}^{\alpha} = \mathrm{Tr}(\mathrm{e}^{-\beta \hat{H}}) \equiv Z\), the partition function, so:

\[\hat{\rho}_{\text{thermal}} = \frac{\mathrm{e}^{-\beta \hat{H}}}{Z}\]

where \(\beta = 1/(k_B T)\) is the inverse temperature.
The Lagrange multiplier \(\beta\) is not arbitrary: it adjusts so that \(\langle E \rangle = \mathrm{Tr}(\hat{\rho}_{\text{thermal}} \hat{H})\) matches the constraint.
Thermal State and Boltzmann Distribution
The state maximizing entropy at fixed \(\langle E \rangle\) is the thermal state:

\[\hat{\rho}_{\text{thermal}} = \frac{\mathrm{e}^{-\beta \hat{H}}}{Z}\]

where \(Z = \mathrm{Tr}(\mathrm{e}^{-\beta \hat{H}})\) is the partition function and \(\beta = 1/(k_B T)\).

In the energy eigenbasis, diagonal elements are Boltzmann weights:

\[P_n = \frac{\mathrm{e}^{-\beta E_n}}{Z}\]

where \(E_n\) are energy eigenvalues.
Key insight: The Boltzmann distribution is not an assumption—it follows from the principle of maximum entropy. When we know only the average energy, the state of maximum ignorance (maximum entropy) is thermal.
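The maximization claim can be tested directly: take any other diagonal state with the same average energy and its entropy comes out lower. A minimal sketch for a three-level system (energies, \(\beta\), and the perturbation size are illustrative choices; \(k_B = 1\)):

```python
import numpy as np

def entropy(p):
    """Shannon/von Neumann entropy of a diagonal distribution (k_B = 1)."""
    p = p[p > 1e-12]
    return -np.sum(p * np.log(p))

# Three-level system with energies E_n = 0, 1, 2 at beta = 1
E = np.array([0.0, 1.0, 2.0])
beta = 1.0
w = np.exp(-beta * E)
p_thermal = w / w.sum()
E_mean = p_thermal @ E

# Another diagonal state with the SAME average energy:
# perturb p_1, then solve q_0 + q_1 + q_2 = 1 and q_1 + 2 q_2 = <E>
q1 = p_thermal[1] + 0.1
q2 = (E_mean - q1) / 2
q = np.array([1 - q1 - q2, q1, q2])

print(q @ E, E_mean)                   # same average energy
print(entropy(p_thermal), entropy(q))  # thermal entropy is larger
```

Repeating this for any admissible perturbation gives the same ordering: at fixed \(\langle E \rangle\), the Boltzmann weights are the entropy maximizer.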
Discussion
Why does nature prefer maximum entropy states? If a system is isolated and in thermal equilibrium, why should it be the state that maximizes entropy? Is this a fundamental principle or a consequence of statistical mechanics?
Partition Function and Thermodynamics#
The partition function \(Z(\beta) = \mathrm{Tr}(\mathrm{e}^{-\beta \hat{H}})\) encodes all thermodynamic properties of a system at inverse temperature \(\beta\).
Average Energy:

\[\langle E \rangle = -\frac{\partial \ln Z}{\partial \beta}\]

Free Energy (Helmholtz):

\[F = -k_B T \ln Z = -\frac{1}{\beta} \ln Z\]

where \(T = 1/(k_B \beta)\).

Entropy in terms of \(Z\): From \(\hat{\rho}_{\text{thermal}} = \mathrm{e}^{-\beta \hat{H}}/Z\), using \(\ln \hat{\rho}_{\text{thermal}} = -\beta \hat{H} - \ln Z\):

\[S = k_B\left(\beta \langle E \rangle + \ln Z\right)\]

Equivalently, \(S = -(\partial F/\partial T)|_V\) (standard thermodynamic relation).
Thermodynamic Identities:
The first law connects all three:

\[\mathrm{d}\langle E \rangle = T\,\mathrm{d}S\]

(ignoring volume dependence for simplicity). From \(F = \langle E \rangle - TS\):

\[\mathrm{d}F = -S\,\mathrm{d}T\]

which is consistent with the relation \(S = -(\partial F/\partial T)|_V\) above.
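These identities can be verified numerically from \(Z(\beta)\) alone. A sketch for a two-level system (energies and \(\beta\) are illustrative; units with \(k_B = 1\), so \(T = 1/\beta\)), computing \(\langle E \rangle = -\partial \ln Z/\partial \beta\) by finite differences and checking \(F + TS = \langle E \rangle\):

```python
import numpy as np

# Two-level system with energies E = (0, 1); k_B = 1, so T = 1/beta
E = np.array([0.0, 1.0])
beta = 2.0

def lnZ(b):
    """ln Z(beta) = ln Tr exp(-beta H) for a diagonal H."""
    return np.log(np.sum(np.exp(-b * E)))

# <E> = -d(ln Z)/d(beta), via central finite difference
h = 1e-6
E_mean = -(lnZ(beta + h) - lnZ(beta - h)) / (2 * h)

# Exact Boltzmann average for comparison
p = np.exp(-beta * E) / np.sum(np.exp(-beta * E))
E_exact = p @ E

F = -lnZ(beta) / beta            # F = -(1/beta) ln Z
S = beta * E_mean + lnZ(beta)    # S = k_B (beta <E> + ln Z)

print(E_mean, E_exact)           # derivative matches the Boltzmann average
print(F + S / beta, E_exact)     # F + T S = <E>
```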
Summary#
Von Neumann entropy \(S(\hat{\rho}) = -\mathrm{Tr}(\hat{\rho} \ln \hat{\rho}) = -\sum_i \lambda_i \ln \lambda_i\) (where \(\lambda_i\) are eigenvalues).
\(S = 0\) for pure states; \(S = \ln d\) for maximally mixed state in \(d\) dimensions.
Ranges from 0 to \(\ln d\); basis-independent; increased by mixing.
Maximum entropy principle: Among all \(\hat{\rho}\) with fixed \(\langle E \rangle\), the one maximizing \(S\) is the thermal state \(\hat{\rho} = \mathrm{e}^{-\beta \hat{H}}/Z\).
This derives the Boltzmann distribution \(P_n \propto \mathrm{e}^{-\beta E_n}\) from maximum ignorance, not as an assumption.
Partition function \(Z = \mathrm{Tr}(\mathrm{e}^{-\beta \hat{H}})\) bridges quantum mechanics to thermodynamics:
Free energy: \(F = -k_B T \ln Z\)
Average energy: \(\langle E \rangle = -\mathrm{d}(\ln Z)/\mathrm{d}\beta\)
Entropy: \(S = k_B(\beta \langle E \rangle + \ln Z)\)
Homework#
1. For a two-level system (qubit) with Hamiltonian \(\hat{H} = E_0 |0\rangle\langle 0| + E_1 |1\rangle\langle 1|\) where \(E_0 < E_1\), compute the thermal state \(\hat{\rho}_{\text{thermal}} = \mathrm{e}^{-\beta \hat{H}}/Z\) and its entropy \(S(\beta)\). Show that \(S \to 0\) as \(\beta \to \infty\) (zero temperature) and \(S \to \ln 2\) as \(\beta \to 0\) (infinite temperature).
2. A qubit is in state \(\hat{\rho} = p|0\rangle\langle 0| + (1-p)|1\rangle\langle 1|\) (diagonal mixture). Compute entropy \(S(\hat{\rho})\) as a function of \(p \in [0,1]\). Show that \(S\) is maximized when \(p = 1/2\). What is the physical interpretation?
3. Prove that entropy is concave: for any \(\hat{\rho}_1, \hat{\rho}_2\) and \(p \in [0,1]\), \(S(p\hat{\rho}_1 + (1-p)\hat{\rho}_2) \geq pS(\hat{\rho}_1) + (1-p)S(\hat{\rho}_2)\).
4. Show that von Neumann entropy is invariant under unitary transformations: \(S(U\hat{\rho} U^\dagger) = S(\hat{\rho})\) for any unitary \(U\). Why does this make physical sense?
5. For a thermal state \(\hat{\rho} = \mathrm{e}^{-\beta \hat{H}}/Z\) with Hamiltonian \(\hat{H}\) having energy levels \(E_n\) with multiplicity \(g_n\) (degeneracy), write the partition function as \(Z = \sum_n g_n \mathrm{e}^{-\beta E_n}\). Express \(\langle E \rangle\) and \(S\) in terms of \(Z(\beta)\).
6. Starting from \(F = -k_B T \ln Z\), show that \(S = -(\partial F/\partial T)|_V\) and \(\langle E \rangle = F + TS\). Verify that these satisfy the first law of thermodynamics.
7. A system has partition function \(Z(\beta) = 1 + 2\mathrm{e}^{-\beta}\). Compute the free energy \(F(\beta)\), average energy \(\langle E \rangle(\beta)\), and entropy \(S(\beta)\). At what temperature do \(\langle E \rangle\) and \(S\) reach half their maximum values?
8. Explain why a classical system approaching thermal equilibrium with a heat bath increases entropy. How does this connect to the quantum maximum entropy principle?
9. For a harmonic oscillator with Hamiltonian \(\hat{H} = \hbar \omega (\hat{a}^\dagger \hat{a} + 1/2)\) (ground state energy \(\hbar\omega/2\)), compute the partition function \(Z(\beta)\) and show that the average occupation number is \(\langle n \rangle = 1/(\mathrm{e}^{\beta\hbar\omega} - 1)\) (Bose–Einstein distribution).
10. A two-state system has \(E_1 = 0\) (ground state) and \(E_2 = \Delta\) (excited state), both non-degenerate. Show that the entropy has a peak at \(\beta = (1/\Delta) \ln[1 + \sqrt{2}]\). Explain why entropy peaks at intermediate temperature rather than at very high or very low \(\beta\).