5.1.2 Non-Degenerate Perturbation Theory#

Prompts

If you know \(\hat{H}_0\) and matrix elements \(V_{mn}=\langle m\vert\hat V\vert n\rangle\), how can you systematically build first- and second-order energy corrections and first-order state corrections?
Starting from the parameter-dependent eigenvalue equation, why does projection with \(m=n\) isolate energy shifts while \(m\neq n\) gives state-mixing amplitudes?
How do coupling strength and level spacing together control the size of corrections, convergence quality, and hybridization?
What signals that non-degenerate perturbation theory is no longer controlled, and why does that signal point to degenerate perturbation theory?

Lecture Notes#

Overview#

Sec. 5.1.1 used an exactly solvable toy model as a benchmark. This subsection takes the next step: derive the perturbative coefficients directly from unperturbed data, without exact diagonalization.

We focus on the non-degenerate case and build formulas for first- and second-order energy shifts, first-order state mixing, and the validity condition that tells us when this method breaks down and degenerate perturbation theory is needed.

Problem Setup#

Non-Degenerate Perturbation Problem

Consider \(\hat H(\lambda)=\hat H_0+\lambda\hat V\), and work in the eigenbasis of \(\hat H_0\):

\[ \hat H_0=\sum_n \vert n\rangle E_n\langle n\vert, \]

\[ \hat V=\sum_{m,n}\vert m\rangle V_{mn}\langle n\vert. \]

Here \(V_{mn}=\langle m\vert\hat V\vert n\rangle\), and we assume a non-degenerate unperturbed spectrum at \(\lambda=0\) (that is, \(E_n\neq E_m\) for \(n\neq m\)).

The corresponding eigenvalue equation is \(\hat H(\lambda)\vert n(\lambda)\rangle=E_n(\lambda)\vert n(\lambda)\rangle\), and our objective is to construct \(E_n(\lambda)\) and \(\vert n(\lambda)\rangle\) order by order in \(\lambda\).

Notation Clarification

To simplify notation, when \((\lambda)\) is omitted we evaluate at \(\lambda=0\):

\[ \hat H\equiv \hat H(0)=\hat H_0, \]

\[ \vert n\rangle\equiv \vert n(0)\rangle, \]

\[ E_n\equiv E_n(0). \]

Also, \(\vert\partial_\lambda n\rangle\) denotes the derivative of the state vector \(\vert n(\lambda)\rangle\) with respect to \(\lambda\) (not a derivative of the integer label \(n\)):

\[ \vert\partial_\lambda n\rangle\equiv \left.\frac{\partial}{\partial\lambda}\vert n(\lambda)\rangle\right\vert_{\lambda=0}. \]

If \(\vert n(\lambda)\rangle=\sum_k c_{nk}(\lambda)\vert k\rangle\), then \(\vert\partial_\lambda n\rangle=\sum_k\left(\partial_\lambda c_{nk}(\lambda)\right)_{\lambda=0}\vert k\rangle\).

The Rayleigh-Schrödinger notation used in the Summary and Homework writes Taylor coefficients with explicit order superscripts:

\[ \vert n^{(k)}\rangle\equiv \left.\frac{1}{k!}\partial_\lambda^k\vert n(\lambda)\rangle\right\vert_{\lambda=0}, \]

\[ E_n^{(k)}\equiv \left.\frac{1}{k!}\partial_\lambda^k E_n(\lambda)\right\vert_{\lambda=0}. \]

In particular, \(\vert n^{(0)}\rangle=\vert n\rangle\), \(E_n^{(0)}=E_n\), \(\vert n^{(1)}\rangle=\vert\partial_\lambda n\rangle\), \(E_n^{(1)}=\partial_\lambda E_n\), and \(E_n^{(2)}=\frac{1}{2}\partial_\lambda^2 E_n\).

Hellmann-Feynman Identities#

From the Taylor-expansion viewpoint, perturbation theory is a derivative problem: we need derivatives of energies and states with respect to \(\lambda\).

The Hellmann-Feynman identities are exactly the tool we need: they convert those derivatives into matrix elements of \(\hat V\), giving a recursive route to higher-order corrections.

Hellmann-Feynman identities (non-degenerate)

Differentiating the eigenvalue equation with respect to \(\lambda\) gives the two Hellmann-Feynman identities below.

1st Hellmann-Feynman Identity (energy derivative):

\[ \partial_\lambda E_n = V_{nn}. \]

2nd Hellmann-Feynman Identity (state derivative):

\[ \langle m\vert\partial_\lambda n\rangle = \frac{V_{mn}}{E_n-E_m} \text{ for }m\neq n. \]

Derivation: Hellmann-Feynman identities

Start from the eigenvalue equation:

\[ \hat H(\lambda)\vert n(\lambda)\rangle=E_n(\lambda)\vert n(\lambda)\rangle. \]

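Differentiate with respect to \(\lambda\) and evaluate at \(\lambda=0\) (so the \((\lambda)\) arguments are dropped, following the notation above):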

\[ \partial_\lambda\hat H\vert n\rangle + \hat H\vert\partial_\lambda n\rangle = \partial_\lambda E_n\vert n\rangle + E_n\vert\partial_\lambda n\rangle. \]

Left-multiply by \(\langle m\vert\):

\[ \langle m\vert\partial_\lambda\hat H\vert n\rangle + \langle m\vert\hat H\vert\partial_\lambda n\rangle = \partial_\lambda E_n\langle m\vert n\rangle + E_n\langle m\vert\partial_\lambda n\rangle. \]

Use \(\langle m\vert\hat H=E_m\langle m\vert\), orthonormality \(\langle m\vert n\rangle=\delta_{mn}\), and \(\partial_\lambda\hat H=\hat V\):

\[ V_{mn}=\partial_\lambda E_n\,\delta_{mn}+(E_n-E_m)\langle m\vert\partial_\lambda n\rangle. \]

Now separate two cases:

If \(m=n\):

\[ V_{nn}=\partial_\lambda E_n. \]

If \(m\neq n\):

\[ V_{mn}=(E_n-E_m)\langle m\vert\partial_\lambda n\rangle, \]

\[ \langle m\vert\partial_\lambda n\rangle=\frac{V_{mn}}{E_n-E_m}. \]

This is the coupling-over-gap structure that controls state mixing.
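Both identities can be checked numerically by finite differences. The following is a minimal sketch, not part of the original notes: the level energies, the random Hermitian \(\hat V\), the step size `h`, and the helper name `eigensystem` are all illustrative assumptions. The phase of each numerical eigenvector is fixed so that \(\langle n\vert n(\lambda)\rangle\) is real and positive, which does not affect the \(m\neq n\) components at \(\lambda=0\).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example data: four well-separated levels and a random Hermitian V.
E = np.array([0.0, 1.0, 2.5, 4.0])
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
V = (A + A.conj().T) / 2
H0 = np.diag(E).astype(complex)

def eigensystem(lam):
    """Eigenvalues (ascending) and eigenvectors of H0 + lam * V.

    Column n is rescaled by a phase so that <n|n(lam)> is real and positive.
    """
    vals, vecs = np.linalg.eigh(H0 + lam * V)
    diag = np.diag(vecs)
    return vals, vecs / (diag / np.abs(diag))

h = 1e-5  # finite-difference step in lambda (assumed small enough)
vals_p, vecs_p = eigensystem(+h)
vals_m, vecs_m = eigensystem(-h)

# First identity: dE_n/dlambda = V_nn
dE = (vals_p - vals_m) / (2 * h)

# Second identity: <m|d_lambda n> = V_mn / (E_n - E_m) for m != n
dvecs = (vecs_p - vecs_m) / (2 * h)   # dvecs[m, n] approximates <m|d_lambda n>
gaps = E[None, :] - E[:, None]        # gaps[m, n] = E_n - E_m
off = ~np.eye(4, dtype=bool)          # mask selecting m != n
predicted = V[off] / gaps[off]

print("max error, first identity :", np.max(np.abs(dE - V.diagonal().real)))
print("max error, second identity:", np.max(np.abs(dvecs[off] - predicted)))
```

Both printed errors should be limited only by the finite-difference step and floating-point roundoff.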

With these identities in hand, we can now compute energy and state corrections order by order.

Energy Corrections#

Now use the Taylor expansion of the eigenenergy around \(\lambda=0\). Once \(\partial_\lambda E_n\) and \(\partial_\lambda^2E_n\) are known, the perturbative coefficients follow immediately.

Energy Expansion

Using the Hellmann-Feynman identities, the energy expansion through second order is:

\[ \begin{split} E_n(\lambda)&=E_n+(\partial_\lambda E_n)\lambda+\frac{1}{2}(\partial_\lambda^2E_n)\lambda^2+O(\lambda^3)\\ &=E_n+V_{nn}\lambda+\sum_{m\neq n}\frac{\vert V_{mn}\vert^2}{E_n-E_m}\lambda^2+O(\lambda^3). \end{split} \]

Derivation: first-order energy derivative

From the first Hellmann-Feynman identity (the \(m=n\) case):

\[ \partial_\lambda E_n=V_{nn}=\langle n\vert\hat V\vert n\rangle. \]

This is exactly the first derivative needed in the Taylor series.

Derivation: second-order energy derivative

Start from

\[ \partial_\lambda E_n=\langle n\vert\hat V\vert n\rangle. \]

Differentiate once more; since \(\hat V\) has no \(\lambda\) dependence in the linear ansatz \(\hat H(\lambda)=\hat H_0+\lambda\hat V\) (so \(\partial_\lambda\hat V=0\)), only the bra and ket are differentiated:

\[ \partial_\lambda^2E_n =\langle\partial_\lambda n\vert\hat V\vert n\rangle+\langle n\vert\hat V\vert\partial_\lambda n\rangle. \]

Insert the resolution of the identity \(\sum_m\vert m\rangle\langle m\vert=\hat 1\) next to \(\hat V\) in each term:

\[ \partial_\lambda^2E_n =\sum_m\left(\langle\partial_\lambda n\vert m\rangle V_{mn}+V_{nm}\langle m\vert\partial_\lambda n\rangle\right). \]

Now separate the \(m\neq n\) and \(m=n\) contributions:

\[ \begin{split} \partial_\lambda^2E_n &=\sum_{m\neq n}\left(\langle\partial_\lambda n\vert m\rangle V_{mn}+V_{nm}\langle m\vert\partial_\lambda n\rangle\right) +\left(\langle\partial_\lambda n\vert n\rangle V_{nn}+V_{nn}\langle n\vert\partial_\lambda n\rangle\right)\\ &=\sum_{m\neq n}\left(\frac{V_{nm}}{E_n-E_m}V_{mn}+V_{nm}\frac{V_{mn}}{E_n-E_m}\right) +V_{nn}\left(\langle\partial_\lambda n\vert n\rangle+\langle n\vert\partial_\lambda n\rangle\right). \end{split} \]

Here the second Hellmann-Feynman identity is used only for \(m\neq n\). For the \(m=n\) piece,

\[ \langle\partial_\lambda n\vert n\rangle+\langle n\vert\partial_\lambda n\rangle =\partial_\lambda\langle n\vert n\rangle=0, \]

because normalization gives \(\langle n(\lambda)\vert n(\lambda)\rangle=1\) for every \(\lambda\).

Therefore

\[ \partial_\lambda^2E_n=2\sum_{m\neq n}\frac{V_{nm}V_{mn}}{E_n-E_m}. \]
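Since \(\hat V\) is Hermitian, \(V_{nm}V_{mn}=\vert V_{mn}\vert^2\), so \(E_n^{(2)}=\frac{1}{2}\partial_\lambda^2E_n=\sum_{m\neq n}\vert V_{mn}\vert^2/(E_n-E_m)\). A quick way to gain confidence in this coefficient is to compare the truncated series with exact diagonalization and watch how the residual shrinks. The sketch below assumes a hypothetical three-level example (the numbers in `E` and `V` are made up for illustration).

```python
import numpy as np

# Hypothetical three-level example (real symmetric V for simplicity).
E = np.array([0.0, 1.0, 3.0])
V = np.array([[0.2, 0.5, 0.1],
              [0.5, -0.3, 0.4],
              [0.1, 0.4, 0.0]])

n = 0                                   # level to follow
others = np.arange(len(E)) != n         # mask for the sum over m != n
E2 = np.sum(np.abs(V[others, n]) ** 2 / (E[n] - E[others]))

residuals = {}
for lam in (0.1, 0.05, 0.025):
    exact = np.linalg.eigvalsh(np.diag(E) + lam * V)[n]
    approx = E[n] + lam * V[n, n] + lam**2 * E2
    residuals[lam] = exact - approx
    print(f"lambda = {lam:<6} residual = {residuals[lam]: .3e}")
```

If the second-order coefficient is right, the residual is dominated by the \(O(\lambda^3)\) term, so halving \(\lambda\) should reduce it by roughly a factor of eight.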

Poll: sign of ground-state shift

For a non-degenerate ground state \(n=0\), what is the sign of \(E_0^{(2)}=\sum_{m\neq 0}\frac{\vert V_{m0}\vert^2}{E_0-E_m}\)?

(A) always positive

(B) always negative

(C) always zero

State Corrections#

Apply the same Taylor logic to states:

State expansion and first-order correction

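Using the second Hellmann-Feynman identity (together with the phase convention \(\langle n\vert\partial_\lambda n\rangle=0\) justified in the derivation), the state expansion through first order is: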

\[ \begin{split} \vert n(\lambda)\rangle&=\vert n\rangle+\vert\partial_\lambda n\rangle\lambda+O(\lambda^2)\\ &=\vert n\rangle+\sum_{m\neq n}\vert m\rangle\frac{V_{mn}}{E_n-E_m}\lambda+O(\lambda^2). \end{split} \]

Derivation: first-order state derivative

Expand the derivative vector in the unperturbed basis and separate diagonal/off-diagonal parts:

\[ \vert\partial_\lambda n\rangle =\vert n\rangle\langle n\vert\partial_\lambda n\rangle +\sum_{m\neq n}\vert m\rangle\langle m\vert\partial_\lambda n\rangle. \]

For \(m\neq n\), apply the second Hellmann-Feynman identity:

\[ \langle m\vert\partial_\lambda n\rangle=\frac{V_{mn}}{E_n-E_m}. \]

For \(m=n\), normalization gives

\[ \partial_\lambda\langle n\vert n\rangle=0 \quad\Rightarrow\quad \langle\partial_\lambda n\vert n\rangle+\langle n\vert\partial_\lambda n\rangle=0, \]

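so \(\langle n\vert\partial_\lambda n\rangle\) is purely imaginary (the two terms are complex conjugates of each other). Because the overall phase of \(\vert n(\lambda)\rangle\) is arbitrary, one may choose the phase convention (parallel-transport gauge) so that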

\[ \langle n\vert\partial_\lambda n\rangle=0. \]

Under this gauge choice,

\[ \vert\partial_\lambda n\rangle=\sum_{m\neq n}\vert m\rangle\frac{V_{mn}}{E_n-E_m}. \]

This makes the physical meaning transparent: mixing is stronger for larger coupling and smaller energy gap.
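As a short worked consequence (a sketch that follows directly from the formula above and orthonormality of the unperturbed basis), the norm of the state truncated at first order is

\[ \Big\Vert\,\vert n\rangle+\lambda\vert\partial_\lambda n\rangle\Big\Vert^2 =1+\lambda^2\sum_{m\neq n}\frac{\vert V_{mn}\vert^2}{(E_n-E_m)^2}. \]

The truncated state is therefore normalized up to \(O(\lambda^2)\), and each term \(\lambda^2\vert V_{mn}\vert^2/(E_n-E_m)^2\) can be read as the approximate weight of \(\vert m\rangle\) admixed into \(\vert n(\lambda)\rangle\). When any of these weights stops being small, the first-order state is no longer a reliable description.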

Physical Intuition and Validity#

Diagonal matrix elements \(V_{nn}\) shift energies at first order.
Off-diagonal matrix elements \(V_{mn}\) mix states and generate second-order shifts.
Level repulsion: virtual transitions push levels apart.
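Breakdown criterion: if \(\vert E_n-E_m\vert\) becomes comparable to (or smaller than) \(\vert\lambda V_{mn}\vert\), the energy denominators become small, the ratio \(\vert\lambda V_{mn}\vert/\vert E_n-E_m\vert\) is no longer a small parameter, and non-degenerate perturbation theory loses validity.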

Then we must switch to degenerate perturbation theory (Sec. 5.1.3).
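The breakdown criterion can be made concrete with a two-level sweep. The sketch below assumes a hypothetical system \(\hat H=\mathrm{diag}(0,\Delta)+\lambda v\hat X\) (the names `gap`, `v`, and `compare` are illustrative, not from the notes) and compares the exact lower eigenvalue with the second-order result as the gap shrinks relative to the coupling.

```python
import numpy as np

def compare(gap, v, lam=1.0):
    """Return (exact, second_order) lower energy for H = diag(0, gap) + lam*v*X."""
    # Exact lower eigenvalue of the 2x2 matrix [[0, lam*v], [lam*v, gap]].
    exact = gap / 2 - np.sqrt(gap**2 / 4 + (lam * v) ** 2)
    # Perturbative value: E_1 = 0, V_11 = 0, E_1^(2) = v^2 / (0 - gap).
    second_order = -(lam * v) ** 2 / gap
    return exact, second_order

v = 1.0  # hypothetical coupling strength; sets the energy unit
for gap in (10.0, 2.0, 0.5, 0.1):
    exact, pt2 = compare(gap, v)
    print(f"gap/v = {gap:5.1f}   exact = {exact:9.4f}   2nd order = {pt2:9.4f}")
```

For `gap/v = 10` the two values agree to about one percent. For `gap/v = 0.1` the second-order estimate is \(-10\) while the exact value stays near \(-0.95\): the exact energy remains finite, and only the expansion fails.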

Example: check against the qubit toy model

Take exactly the toy model from Sec. 5.1.1:

\[ \hat H(\lambda)=\hat H_0+\lambda\hat V. \]

\[ \hat H_0=\hat Z. \]

\[ \hat V=\hat X. \]

Use the \(\{\vert 0\rangle,\vert 1\rangle\}\) basis of \(\hat Z\):

\[ \hat H_0\vert 0\rangle=+1\vert 0\rangle. \]

\[ \hat H_0\vert 1\rangle=-1\vert 1\rangle. \]

The matrix elements of \(\hat V\) in this basis are

\[ V_{00}=V_{11}=0. \]

\[ V_{10}=V_{01}=1. \]

For the branch connected to \(\vert 0\rangle\) at \(\lambda=0\) (here the upper level, \(E_0=+1\), so the label \(0\) does not denote the ground state):

\[ E_0^{(1)}=V_{00}=0, \]

\[ E_0^{(2)}=\frac{\vert V_{10}\vert^2}{E_0-E_1}=\frac{1}{1-(-1)}=\frac{1}{2}, \]

\[ E_0(\lambda)=1+\frac{\lambda^2}{2}+O(\lambda^3). \]

For the branch connected to \(\vert 1\rangle\):

\[ E_1^{(1)}=0. \]

\[ E_1^{(2)}=\frac{\vert V_{01}\vert^2}{E_1-E_0}=-\frac{1}{2}. \]

\[ E_1(\lambda)=-1-\frac{\lambda^2}{2}+O(\lambda^3). \]

These match the exact energies

\[ E_\pm(\lambda)=\pm\sqrt{1+\lambda^2} \]

expanded at small \(\lambda\).

For state derivatives:

\[ \vert\partial_\lambda 0\rangle=\frac{V_{10}}{E_0-E_1}\vert 1\rangle=\frac{1}{2}\vert 1\rangle, \]

\[ \vert\partial_\lambda 1\rangle=\frac{V_{01}}{E_1-E_0}\vert 0\rangle=-\frac{1}{2}\vert 0\rangle, \]

which is also consistent with the exact eigenvector expansion.
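The comparison can be written out explicitly. As a sketch, introduce a mixing angle \(\theta\) with \(\tan\theta=\lambda\) (this parametrization is an assumption made here for convenience and may differ from the notation of Sec. 5.1.1). Then

\[ \hat Z+\lambda\hat X=\sqrt{1+\lambda^2}\left(\cos\theta\,\hat Z+\sin\theta\,\hat X\right), \]

\[ \vert 0(\lambda)\rangle=\cos\frac{\theta}{2}\vert 0\rangle+\sin\frac{\theta}{2}\vert 1\rangle, \qquad \vert 1(\lambda)\rangle=-\sin\frac{\theta}{2}\vert 0\rangle+\cos\frac{\theta}{2}\vert 1\rangle, \]

with eigenvalues \(+\sqrt{1+\lambda^2}\) and \(-\sqrt{1+\lambda^2}\) respectively. Using \(\theta=\lambda+O(\lambda^3)\),

\[ \vert 0(\lambda)\rangle=\vert 0\rangle+\frac{\lambda}{2}\vert 1\rangle+O(\lambda^2), \qquad \vert 1(\lambda)\rangle=\vert 1\rangle-\frac{\lambda}{2}\vert 0\rangle+O(\lambda^2), \]

\[ \pm\sqrt{1+\lambda^2}=\pm\left(1+\frac{\lambda^2}{2}-\frac{\lambda^4}{8}+O(\lambda^6)\right). \]

The \(O(\lambda)\) state coefficients and the \(O(\lambda^2)\) energy coefficients reproduce the perturbative results above. These exact states are real with \(\langle n\vert n(\lambda)\rangle>0\), so they already satisfy the gauge condition \(\langle n\vert\partial_\lambda n\rangle=0\).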

Discussion: what does divergence mean?

When \(E_n-E_m\to 0\), formulas like \(V_{mn}/(E_n-E_m)\) diverge. Is this a physical singularity, or a sign that our basis choice is no longer appropriate? Explain how block-diagonalization inside the nearly degenerate subspace resolves this issue.

Summary#

Non-degenerate perturbation theory is an iterative derivative algorithm built from Hellmann-Feynman identities.
First-order energy comes from diagonal matrix elements: \(E_n^{(1)}=V_{nn}\).
First-order state mixing is coupling over gap: \(\vert n^{(1)}\rangle=\sum_{m\neq n}\vert m\rangle V_{mn}/(E_n-E_m)\).
Second-order energy is a sum over virtual processes: \(E_n^{(2)}=\sum_{m\neq n}\vert V_{mn}\vert^2/(E_n-E_m)\).
Near degeneracy is not a failure of QM; it signals a change of method (degenerate perturbation theory).
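The three formulas in this summary can be collected into one small routine. The following is a minimal sketch under the assumptions of this section (non-degenerate spectrum, \(\hat V\) given in the eigenbasis of \(\hat H_0\), gauge \(\langle n\vert n^{(1)}\rangle=0\)); the function name `rs_coefficients` is hypothetical.

```python
import numpy as np

def rs_coefficients(E, V):
    """Return (E1, E2, C1) for unperturbed energies E and Hermitian V.

    V must be expressed in the eigenbasis of H0 and E must be non-degenerate.
    C1[m, n] is the component <m|n^(1)>, set to zero on the diagonal (gauge choice).
    """
    E = np.asarray(E, dtype=float)
    V = np.asarray(V, dtype=complex)
    gaps = E[None, :] - E[:, None]            # gaps[m, n] = E_n - E_m
    off = ~np.eye(len(E), dtype=bool)         # mask selecting m != n
    inv_gap = np.zeros_like(gaps)
    inv_gap[off] = 1.0 / gaps[off]            # m = n terms stay zero
    E1 = V.diagonal().real                    # E_n^(1) = V_nn
    C1 = V * inv_gap                          # V_mn / (E_n - E_m)
    E2 = np.sum(np.abs(V) ** 2 * inv_gap, axis=0)  # sum over m != n
    return E1, E2, C1

# Qubit toy model from the example: H0 = Z (levels +1, -1), V = X.
E1, E2, C1 = rs_coefficients([1.0, -1.0], [[0, 1], [1, 0]])
print("E1 =", E1)
print("E2 =", E2)
print("C1 =\n", C1.real)
```

For the toy model this returns the values derived in the example: vanishing first-order energies, second-order coefficients \(\pm\frac{1}{2}\), and first-order mixing amplitudes \(\pm\frac{1}{2}\).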

Homework#

The problems are ordered to follow the lecture algorithm: setup and identities \(\to\) corrections \(\to\) physical interpretation and breakdown.

1. Gauge choice and normalization convention. Starting from \(\langle n(\lambda)\vert n(\lambda)\rangle=1\), show that one can choose the phase convention so that

\[ \langle n^{(0)}\vert n^{(1)}\rangle=0. \]

Explain why this choice simplifies perturbative state corrections.

2. Hellmann-Feynman in practice. Consider

\[ \hat H_0=\begin{pmatrix}1&0&0\\0&2&0\\0&0&4\end{pmatrix}, \]

\[ \hat V=\begin{pmatrix}1&1&1\\1&0&1\\1&1&-1\end{pmatrix}, \]

so \(\hat H(\lambda)=\hat H_0+\lambda\hat V\).

(a) Use the first Hellmann-Feynman identity \(E_n^{(1)}=V_{nn}\) to read off the three first-order energies.

(b) Use the second Hellmann-Feynman identity to compute \(\langle 2\vert\partial_\lambda 1\rangle\) and \(\langle 3\vert\partial_\lambda 1\rangle\) at \(\lambda=0\). Assemble \(\vert 1^{(1)}\rangle\).

(c) Verify your result for \(E_1^{(1)}\) by diagonalizing \(\hat H(\lambda)\) at \(\lambda=0.01\) and checking \([E_1(0.01)-E_1(0)]/0.01\approx E_1^{(1)}\).

3. Coupling over gap. Two two-level systems share the same coupling magnitude \(\vert V_{12}\vert=\hbar\omega_c\) but different gaps. System A has \(E_2-E_1=10\hbar\omega_c\); system B has \(E_2-E_1=2\hbar\omega_c\). Set \(\lambda=1\) in both.

(a) Compute the first-order state correction \(\vert 1^{(1)}\rangle=\sum_{m\neq 1}\vert m\rangle V_{m1}/(E_1-E_m)\) for each system.

(b) Compute the second-order energy correction \(E_1^{(2)}\) for each.

(c) Estimate the largest coupling magnitude \(\vert V_{12}\vert\) for which the perturbative expansion is well-behaved (next correction small compared to the gap). Comment on which system is “more perturbative” and why.

4. Second-order energy correction and sign. Starting from

\[ E_n^{(2)}=\sum_{m\neq n}\frac{\vert V_{mn}\vert^2}{E_n-E_m}, \]

(a) show \(E_0^{(2)}\le 0\) for the non-degenerate ground state,

(b) interpret the result as level repulsion.

5. Toy-model consistency check. For \(\hat H=\hat Z+\lambda \hat X\):

(a) compute \(E_\pm^{(1)}\) and \(E_\pm^{(2)}\) by perturbation theory,

(b) compute \(\vert\pm^{(1)}\rangle\),

(c) compare with the exact expansion of \(E_\pm(\lambda)=\pm\sqrt{1+\lambda^2}\) and comment on agreement order-by-order.

6. Harmonic oscillator with linear perturbation. Let

\[ \hat H_0=\hbar\omega\left(\hat a^\dagger \hat a+\frac12\right), \]

\[ \hat V=\lambda \hat x, \]

\[ \hat x=\sqrt{\frac{\hbar}{2m\omega}}(\hat a+\hat a^\dagger). \]

(a) Compute \(V_{nn}\), \(V_{n+1,n}\), \(V_{n-1,n}\).

(b) Use non-degenerate perturbation theory to obtain \(E_n\) up to second order.

7. Selection rules and parity. For a 1D parity-symmetric potential with odd perturbation \(\hat V=\lambda \hat x\):

(a) show \(E_n^{(1)}=0\) for all \(n\),

(b) identify which matrix elements contribute to \(E_n^{(2)}\).

8. Near-degeneracy and breakdown. A 3-level system has unperturbed energies \(E_1=0\), \(E_2=\Delta\), \(E_3=10\Delta\), with nonzero couplings \(V_{12}\) and \(V_{23}\).

(a) Write \(E_1^{(2)}\) explicitly.

(b) Analyze \(\Delta\to 0\) and identify which term causes the breakdown.

(c) Give the correct next-step method (basis choice and effective subspace treatment) instead of applying non-degenerate formulas blindly.