Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section you should be able to:

Tell the story of where SOC comes from in two sentences — a moving electron sees the nuclear electric field as a magnetic field, and that field couples to the electron's own magnetic moment.
Write down the SOC Hamiltonian $\hat{H}_{\text{SOC}} = \xi(r)\,\mathbf{L}\cdot\mathbf{S}$ and use the identity $2\,\mathbf{L}\cdot\mathbf{S} = \mathbf{J}^2 - \mathbf{L}^2 - \mathbf{S}^2$ to predict the splittings of any LS-coupled atomic shell.
Read off the canonical p-shell partition: a 6-fold degenerate $p$ level breaks into a 4-fold $p_{3/2}$ at $+\xi/2$ and a 2-fold $p_{1/2}$ at $-\xi$ , with fine-structure gap $\tfrac{3}{2}\xi$ .
Trace the same partition into a band picture: the 6-fold p-derived valence top of a zincblende crystal becomes a 4-fold $\Gamma_8$ (HH + LH) and a 2-fold $\Gamma_7$ (split-off) separated by $\Delta_{\text{SO}}$ .
Predict the order of magnitude of $\Delta_{\text{SO}}$ for a new material from a quick $Z^4$ argument and the chemistry of its anion.
Switch on SOC in VASP with the right INCAR tags and read $\Delta_{\text{SO}}$ straight out of vasprun.xml with a few lines of pymatgen.

One-line preview: SOC is the relativistic correction that turns spin from a passive label into an actor in the band Hamiltonian. Without it, every band has a spurious 2-fold spin degeneracy; with it, valence-band tops split, conduction bands twist, and quantum dots like Mn:CdSe acquire the precise emission energies we want them to have.

The Story — Why Should an Electron Care About Its Own Spin?

In the non-relativistic Schrödinger equation, spin is a spectator. You attach a 2-component spinor to every wavefunction, sum probabilities at the end, and never let $\hat{S}$ appear in the Hamiltonian. Every eigenstate is automatically Kramers-degenerate: spin-up and spin-down of the same orbital share an energy.

The Dirac equation tells a different story. When you reduce it to its non-relativistic limit (the Pauli equation) and keep terms of order $(v/c)^2$ , three corrections fall out: the kinetic-energy correction, the Darwin term, and spin–orbit coupling. The first two shift levels rigidly. SOC is the only one that splits degeneracies.

Heuristic — the moving electron in the nucleus's field

Sit in the rest frame of the orbiting electron. From its point of view, the nucleus is a positive charge sweeping past at velocity $-\mathbf{v}$ . Special relativity says a moving charge produces a magnetic field

$\mathbf{B}_{\text{eff}} \;=\; -\frac{1}{c^2}\,\mathbf{v}\times \mathbf{E}_{\text{nucleus}}.$

The electron carries a magnetic moment $\boldsymbol{\mu}_S = -g_s \mu_B \mathbf{S}/\hbar$ . The Zeeman energy of this moment in the effective field — corrected by a factor 1/2 from Thomas precession — is exactly $\xi(r)\,\mathbf{L}\cdot\mathbf{S}$ . That is the entire physical content of SOC in one paragraph.

The size of the effect is set by $(v/c)^2$ . Inner-shell electrons of a heavy atom orbit at a respectable fraction of the speed of light — for an innermost $1s$ in lead, $v/c \sim Z\alpha \approx 0.6$ , where $\alpha \approx 1/137$ is the fine-structure constant. SOC is therefore a property of the chemistry, and it grows spectacularly fast with atomic number.

Why this section appears in a crystallography book

For light elements (C, B, N) you can ignore SOC and lose nothing. For mid-row anions (Si, Se, As) you must include it to get the right order of valence-band states. For heavy elements (Te, Pb, Bi) SOC is so large that it inverts the band order — giving rise to topological insulators (HgTe, Bi₂Se₃). Mn-doped CdSe quantum dots, the Chapter 7 case study, sit in the "must include" regime: ignoring SOC misplaces the emission energy by ~0.1 eV.

The L·S Hamiltonian

From the Pauli reduction of Dirac, the SOC operator inside an atom is

$\hat{H}_{\text{SOC}} \;=\; \frac{1}{2 m_e^2 c^2}\,\frac{1}{r}\,\frac{dV(r)}{dr}\;\mathbf{L}\cdot \mathbf{S} \;\equiv\; \xi(r)\,\mathbf{L}\cdot\mathbf{S}.$

The function $\xi(r)$ is sharply peaked near the nucleus where $dV/dr$ is largest — SOC is overwhelmingly an atomic, near-nucleus phenomenon. Inside a crystal, the lattice is just a perturbation on top: the SOC matrix elements are dominated by single-atom integrals, and to a very good approximation

$\langle n l j \,|\, \xi(r) \,|\, n l j\rangle \;\equiv\; \xi_{nl}\quad \text{(an atomic constant)}.$

For a single open shell with orbital angular momentum $L$ and spin $S$ , the operator $\mathbf{L}\cdot \mathbf{S}$ is diagonal in the coupled basis $|J, m_J\rangle$ via the simple algebraic identity

$2\,\mathbf{L}\cdot \mathbf{S} \;=\; \mathbf{J}^{2} - \mathbf{L}^{2} - \mathbf{S}^{2},$

with eigenvalues $J(J{+}1) - L(L{+}1) - S(S{+}1)$ . The same one line generates all the textbook fine-structure formulas of atomic physics. For a p-electron ( $L=1, S=1/2$ ):

State	J	Degeneracy 2J+1	2 L·S	Energy	m_J
p₃ⳆⳆ₂	3/2	4	+1	+ξ/2	−3/2, −1/2, +1/2, +3/2
p₁ⳆⳆ₂	1/2	2	−2	−ξ	−1/2, +1/2

Notice the bookkeeping: 4 + 2 = 6 (the original p-shell degeneracy survives) and the weighted sum $4(\xi/2) + 2(-\xi) = 0$ — SOC redistributes the levels but does not move their centre of mass. The fine-structure gap is

$\Delta_{\text{fs}} \;=\; E(p_{3/2}) - E(p_{1/2}) \;=\; \tfrac{3}{2}\,\xi.$

One identity, every shell

For a d-shell ( $L=2, S=1/2$ ): J = 5/2 (6-fold, E = +ξ), J = 3/2 (4-fold, E = −3ξ/2), gap = 5ξ/2. For an f-shell: J = 7/2 (E = +3ξ/2), J = 5/2 (E = −2ξ), gap = 7ξ/2. The same algebraic identity supplies every result — no new physics, just $J(J+1)$ .

Interactive — Splitting a p-Shell

Drag the slider through ξ. Six degenerate ticks at zero break into the $p_{3/2}$ quadruplet at $+\xi/2$ (amber) and the $p_{1/2}$ doublet at $-\xi$ (cyan). The dashed grey line marks the unsplit centre of mass.

Loading atomic splitter…

Two checks worth doing in your head as you drag:

Centre-of-mass conservation. The amber states sit at $+\xi/2$ , the cyan ones at $-\xi$ . Verify that $4(\xi/2) + 2(-\xi) = 0$ for any ξ. SOC never moves the average energy of a closed shell — it just spreads it.
Linear scaling. Both levels move linearly with ξ, and the gap $\tfrac{3}{2}\xi$ grows linearly too. SOC is a first-order effect on the atomic energies; everything that looks non-linear in a real material (e.g. how the conduction-band Bychkov–Rashba splitting evolves with strain) comes from how ξ couples to the surrounding orbital structure, not from the L·S operator itself.

The Z⁴ Scaling — Why Heavy Atoms Matter

The atomic SOC parameter $\xi_{nl}$ is the expectation value of $\xi(r)$ over a hydrogenic-like radial wavefunction. Plugging in the hydrogenic forms gives the textbook estimate

$\xi_{nl} \;\sim\; \frac{Z^4 \alpha^2}{n^3\,l(l+1/2)(l+1)}\,\text{Ry}.$

The headline message is the $Z^4$ : doubling the nuclear charge multiplies SOC by sixteen. Real screening flattens the scaling somewhat — outer-shell electrons see an effective charge well below $Z$ — but the qualitative trend survives, and it is the only rule of thumb you need when judging whether SOC matters for a new material.

Loading Z⁴ scaling chart…

Two observations worth internalising:

The anion dominates. For a II-VI semiconductor like ZnSe, CdSe, or CdTe, the valence-band top is built from anion-p orbitals. The SOC at the VBM scales with $\xi_p$ of the anion, not the cation. Replace Se by Te (Z = 34 → 52) and $\Delta_{\text{SO}}$ jumps from ≈ 0.42 eV to ≈ 0.94 eV.
Z⁴ overshoots. The dashed reference curve, pinned at Si, predicts much larger splittings for heavy elements than what real materials show. Screening by valence electrons reduces the effective nuclear charge that the relevant p-orbital sees; self-consistent DFT-PAW calculations capture this naturally and land much closer to experiment.

From Atomic Splittings to Band Splittings

The same algebra carries straight into the solid. Consider the top of the valence band of a zincblende II-VI or III-V crystal (CdSe, GaAs, InP, ZnSe). At the Γ-point — the centre of the Brillouin zone — the crystal's point group is $T_d$ . Without SOC, the valence-band top is derived from anion-p orbitals and transforms as the 3-dimensional $\Gamma_{15}$ representation of $T_d$ . Including spin (which we have ignored until now) doubles every state, so the manifold is 6-fold degenerate.

Switching on $\hat{H}_{\text{SOC}}$ reduces the effective symmetry to the double group $T_d^*$ , whose representations are labelled by half-integer J. The 6-D representation reduces as

$\Gamma_{15} \otimes \Gamma_{1/2} \;=\; \Gamma_{8}\;\oplus\;\Gamma_{7}.$

Translation: the 6-fold splits into a 4-fold $\Gamma_{8}$ (J = 3/2, the HH and LH bands) and a 2-fold $\Gamma_{7}$ (J = 1/2, the split-off band). It is the same $6 = 4 + 2$ partition you saw in the atomic splitter; only the names of the labels have changed.

Band	Symmetry at Γ	J	Degeneracy	k away from Γ
Heavy hole (HH)	Γ_8	3/2	2 (m_J = ±3/2)	Flat parabola — large effective mass
Light hole (LH)	Γ_8	3/2	2 (m_J = ±1/2)	Sharp parabola — small effective mass
Split-off (SO)	Γ_7	1/2	2 (m_J = ±1/2)	Parabolic, rigidly shifted by Δ_SO

Why HH and LH split as you leave Γ

At $\mathbf{k} = 0$ , all four $\Gamma_8$ states are exactly degenerate. For finite $\mathbf{k}$ , the $\mathbf{k}\cdot\mathbf{p}$ Hamiltonian mixes them differently depending on $m_J$ . The Luttinger parameters $\gamma_1, \gamma_2, \gamma_3$ encode the mixing. For a cubic crystal along $[100]$ , the heavy-hole curvature is $\gamma_1 - 2\gamma_2$ and the light-hole curvature is $\gamma_1 + 2\gamma_2$ . The SO band is rigidly displaced; its curvature is roughly $\gamma_1$ .

Interactive — HH, LH, and the Split-Off Band

Drag $\Delta_{\text{SO}}$ from 0 (light-element limit) up to past 1 eV (heavy elements like CdTe, PbTe). At Γ, the amber circle is the $\Gamma_8$ manifold and the cyan circle is the $\Gamma_7$ split-off band. Away from Γ you can already see the curvature difference between the heavy and light holes — the heavy hole is the flatter band.

Loading band splitter…

Things to verify visually:

At $\Delta_{\text{SO}} = 0$ the SO band (cyan) collapses onto the LH band (orange) — they share the same curvature, after all.
At Γ, HH and LH are exactly degenerate. The classic ring-shaped isoenergy surface near the VBM is born here: two bands of the same energy but different curvatures.
For CdSe ( $\Delta_{\text{SO}} \approx 0.42$ eV), the split-off band sits well below the optical gap — visible photons couple only to HH/LH states. For HgTe, the SO splitting is large enough to invert the band order: the s-like Γ_6 actually falls below Γ_8, producing a topological insulator.

Where the SOC Lives in CdSe

SOC is an atomic effect glued onto the band structure by the chemistry. For CdSe in the zincblende phase below, the cation Cd (Z = 48) and the anion Se (Z = 34) both contribute, but the valence-band top is almost entirely Se 4p. To the bands at the gap, only the anion's ξ_p matters.

Loading CdSe 3D viewer…

A predictive heuristic

Want to design a II-VI quantum dot with a specific Δ_SO at the valence-band edge? Pick the anion. Want a large Δ_SO for spin textures and topological physics? Use Te or Pb. Want a small one to keep simple effective-mass models valid? Use S or even O. The cation tunes the gap and lattice constant; the anion sets Δ_SO.

Bonus — Inversion Asymmetry, Rashba and Dresselhaus

Everything above lived at $\mathbf{k} = 0$ . Away from Γ in a crystal with broken inversion symmetry, SOC produces a more subtle effect: a k-linear spin splitting of bands that would otherwise be Kramers-degenerate. Two famous flavours:

Effect	Symmetry broken	Hamiltonian (small k)	Where you see it
Rashba	Structural inversion (interface, surface)	α_R (k × ẑ) · σ	InGaAs/GaAs 2DEGs, topological-insulator surfaces
Dresselhaus	Bulk inversion (zincblende lacks inversion)	β (k_x σ_x − k_y σ_y) (linear-Dresselhaus)	GaAs, CdSe, ZnSe quantum dots

Both terms split each band into two spin-textured copies of energy $E_\pm(\mathbf{k}) = E_0(k) \pm |\boldsymbol{\Omega}(\mathbf{k})|$ , where $\boldsymbol{\Omega}$ is an effective magnetic field that depends on $\mathbf{k}$ . In Mn-doped CdSe, the Dresselhaus term is what eventually limits electron spin coherence — Section 5.8 will use this fact when we compute spin lifetimes for Mn impurity states.

Sanity check

A crystal that has inversion symmetry (Si, Ge, NaCl) cannot show either effect in the bulk. Every band is exactly spin-degenerate at every $\mathbf{k}$ by the combination of time-reversal and inversion. Break inversion (build a heterostructure, take a surface, dope asymmetrically) and the spin texture switches on.

VASP — Switching On Spin–Orbit Coupling

VASP turns SOC on as a single, inexpensive switch. The cost is that the calculation becomes non-collinear — every band is now a 2-component spinor, the spinor density requires complex arithmetic, and the basis size doubles. Plan for ~3–4× longer SCF cycles compared to a collinear spin-polarised run.

The three INCAR tags you actually need

Tag	Meaning	Default	What we use
LSORBIT	Master switch — turns on the L·S operator inside the PAW augmentation spheres	.FALSE.	.TRUE.
MAGMOM	Initial spinor magnetic moment, three components (m_x, m_y, m_z) per atom	auto	Three numbers per atom
SAXIS	Spin quantisation axis (the z direction of the spinor frame)	(0 0 1)	Match your easy axis or pick (1 0 0)
ISYM	Symmetry handling — must be 0 or -1 once SAXIS breaks crystal symmetry	1	0 (recommended for SOC + magnetism)
GGA_COMPAT	Compatibility flag for non-collinear GGA	.TRUE.	.FALSE. (cleaner forces)

The MAGMOM gotcha

In a collinear run, MAGMOM is one number per atom (the projection of spin on +z). In a non-collinear/SOC run, MAGMOM is three numbers per atom in the order m_x m_y m_z. For a 16-atom CdSe supercell with all-zero starting moments and one Mn impurity carrying a 5 µ_B moment along z:

📝text

1MAGMOM = 15*0 0 0   0 0 5     # 15 (Cd,Se) atoms ×  zero, then Mn = (0,0,5)

A complete CdSe SOC INCAR

📝text

1# INCAR — CdSe non-collinear band structure (step 2 of the recipe)
2SYSTEM   = CdSe zincblende, SOC, line-mode bands
3
4# Plane-wave + accuracy
5PREC     = Accurate
6ENCUT    = 400         # eV — set by ENCUT convergence test
7EDIFF    = 1E-6        # SCF tolerance
8
9# Smearing — semiconductor
10ISMEAR   = 0
11SIGMA    = 0.05
12
13# Spin–orbit coupling
14LSORBIT  = .TRUE.      # MASTER SWITCH
15SAXIS    = 0 0 1       # quantisation axis along z (cubic, take any)
16GGA_COMPAT = .FALSE.
17
18# Initial moments — three per atom; CdSe is non-magnetic
19MAGMOM   = 24*0  0 0   # 8 atoms (zincblende cell) × (0 0 0)
20
21# Read pre-converged charge density, do not update it
22ICHARG   = 11
23
24# Symmetry off (SOC + non-trivial SAXIS can clash with crystal symmetry)
25ISYM     = 0
26
27# Output the projections so we can colour the bands
28LORBIT   = 11
29
30NBANDS   = 64          # spinor count = 2 × old NBANDS — at least double it

Two subtle traps

Double the bands. A non-collinear run treats every spinor band as a single index, but you still need at least 2× the bands you used in the collinear case, otherwise you will lose high-energy conduction states.
Symmetry off. SOC + arbitrary SAXIS breaks the point-group symmetries that VASP would normally use to fold equivalent k-points. Forgetting $\texttt{ISYM}=0$ can produce a confusing mix of spurious degeneracies.

VASP Walkthrough — Reading Δ_SO From the Output

After the non-collinear band run finishes, the eigenvalues at every k-point of the line-mode path live in vasprun.xml. The script below extracts the spin–orbit splitting at Γ and compares it to the experimental CdSe value of 0.42 eV. Every line is annotated — click any line of code on the right to see what is happening to the variables at that step.

read_soc_bands.py — extract Δ_SO at Γ

🐍read_soc_bands.py

Explanation(22)

Code(40)

1Module docstring

Triple-quoted block that documents the purpose of the script. PyMatGen, NumPy, and a non-collinear VASP run (LSORBIT = .TRUE.) are the moving parts.

EXECUTION STATE

Δ_SO = The spin–orbit splitting at Γ — the energy gap between the J = 3/2 (HH+LH) manifold and the J = 1/2 split-off band. For CdSe, ≈ 0.42 eV.

9import numpy as np

NumPy provides ndarray, sorting, and slicing. We use it to sort eigenvalues, take row averages, and pull contiguous slices of the band stack.

EXECUTION STATE

numpy = Numerical computing library — fast C-backed arrays, slicing, mean(), sort(), argsort(). Aliased to np by convention.

10from pymatgen.io.vasp import BSVasprun

BSVasprun is a lighter-weight cousin of Vasprun, optimised for parsing the eigenvalues and projected eigenvalues from a band-structure run. It still reads the full vasprun.xml but skips DOS-related sections.

EXECUTION STATE

📚 BSVasprun(filename, parse_projected_eigen) = PyMatGen class that wraps vasprun.xml. Lazy-loads eigenvalues, k-points, and Fermi level. Use it instead of Vasprun whenever you only need band-structure data.

11from pymatgen.electronic_structure.core import Spin

Spin is a small enum with two members, Spin.up (=1) and Spin.down (=-1). PyMatGen indexes spin-resolved arrays by these symbols. In a non-collinear (SOC) run, only Spin.up is populated — every band is a 2-component spinor and is doubly Kramers-degenerate.

EXECUTION STATE

Spin.up = Enum value 1. Used as a key into bs.bands[...] dictionaries.

Spin.down = Enum value -1. Empty / not-populated in a non-collinear run because spin is no longer a good quantum number.

14Comment — step 1

Marks the first concrete action: parse the XML output. We will get back a BandStructureSymmLine object with .bands, .kpoints, .efermi, and projection data.

15vr = BSVasprun("vasprun.xml", parse_projected_eigen=True)

Construct the parser. parse_projected_eigen=True asks PyMatGen to also read the per-atom, per-orbital projection coefficients (PROCAR-equivalent). We need them later if we want to colour the bands by Cd vs Se character.

EXECUTION STATE

📚 BSVasprun(...) = Constructor reads the XML once and caches eigenvalues. Subsequent calls to .get_band_structure() are cheap.

⬇ arg 1: "vasprun.xml" = Path to the XML produced by the band-structure run. Contains every eigenvalue at every k-point of the line-mode path.

⬇ arg 2: parse_projected_eigen=True = Tells PyMatGen to parse the per-atom, per-orbital orbital character (the same data PROCAR holds). Costs more memory; required for orbital-resolved analysis.

⬆ result: vr = BSVasprun instance. Exposes .get_band_structure(), .efermi, .actual_kpoints, .projected_eigenvalues, etc.

16bs = vr.get_band_structure(line_mode=True)

Build a BandStructureSymmLine object. line_mode=True signals that the KPOINTS file used a high-symmetry path (not a uniform mesh), so PyMatGen knows to associate Γ, X, L, K labels with the right indices.

EXECUTION STATE

⬇ arg: line_mode=True = Set to True for high-symmetry path k-points (KPOINTS Line mode). False for uniform Monkhorst–Pack grids. Affects how labels are assigned and which methods are available.

⬆ bs = BandStructureSymmLine. Key attributes: .bands (Spin → ndarray of shape n_bands × n_kpts), .kpoints (list of Kpoint objects), .efermi (float).

→ why this object? = It abstracts the difference between collinear (Spin.up + Spin.down) and non-collinear (Spin.up only, doubly degenerate) runs, so the same downstream code works for both.

19Comment — step 2: locate Γ on the path

We need the column index of the Γ k-point in bs.bands[Spin.up]. Path order varies: an FCC path L → Γ → X → W → K → Γ has Γ at two different indices.

20labels = bs.kpoints

List of pymatgen Kpoint objects, one per k-point along the entire line-mode path. Each carries a fractional-coord array and a .label string (set when KPOINTS provides one).

EXECUTION STATE

labels = List of Kpoint objects, length n_kpts ≈ 200 for a typical six-segment FCC path with 40 points/segment.

→ Kpoint.label = Either a string like "\\Gamma", "X", "L", "W", "K" (LaTeX-friendly escape), or None for intermediate points.

21gamma_idx = next(... for i, k in enumerate(labels) if ...)

Generator expression with next() — finds the first i for which the k-point is labelled as Γ. Equivalent to a tiny for-loop with early break.

EXECUTION STATE

📚 next(iterator) = Built-in. Pulls the first item from a generator. Raises StopIteration if empty — here that would mean Γ never appeared on the path (would be a configuration bug).

📚 enumerate(seq) = Built-in. Pairs each element with its index: enumerate(['a','b']) → [(0,'a'), (1,'b')]. Standard idiom for index + value loops.

k.label in ("\\Gamma", "GAMMA", "G") = Tuple membership test — different VASP wrappers tag Γ slightly differently. We accept all three to be portable across pymatgen versions and seekpath.

⬆ gamma_idx = Integer column index, e.g. 40 if the path is L → Γ → X → … with 40 points per segment.

25Comment — step 3

Reminder that non-collinear VASP packs everything under Spin.up. Skipping this trips up newcomers used to spin-polarised collinear runs.

27eigs = bs.bands[Spin.up][:, gamma_idx]

Slice the eigenvalues at the Γ column. bs.bands[Spin.up] has shape (n_bands, n_kpts); we take all bands at one k-point.

EXECUTION STATE

bs.bands = Dict[Spin, np.ndarray]. For non-collinear: only key is Spin.up; the array shape is (n_bands, n_kpts) where n_bands counts spinor states (each Kramers pair is a single index).

[:, gamma_idx] = NumPy fancy indexing. : keeps every row (every band); gamma_idx selects exactly one column (the Γ k-point). Result shape: (n_bands,).

⬆ eigs = 1-D array of all band energies at Γ, in eV, in band order (smallest first). Example for CdSe (n_bands=32): [-12.4, -12.4, …, -1.21, -0.79, -0.79, -0.79, -0.79, +1.74, +1.74, …]

→ reading the example = Bottom: deep Cd 4d / Se 4s core-like states. Around -0.79: the 4-fold Γ_8. Around -1.21: the 2-fold Γ_7. The -0.79 → -1.21 step is Δ_SO ≈ 0.42 eV.

30vbm_band = bs.get_vbm()["band_index"][Spin.up][0]

Ask the band structure for the topmost occupied band index at the VBM. Returns a dict with band_index (per spin), kpoint_index, energy, etc.

EXECUTION STATE

📚 bs.get_vbm() = PyMatGen helper. Scans every k-point, every band; returns the highest-energy occupied state.

⬆ result["band_index"][Spin.up] = List of band indices that touch the VBM. We take [0] — the lowest such index, i.e. the topmost continuously-occupied band.

→ example: vbm_band = 17 = For a 32-band CdSe run with 18 valence electrons in the spinor count, the VBM lives at band index 17 (0-based).

31valence_six = np.sort(eigs[vbm_band - 5 : vbm_band + 1])[::-1]

Pull the six bands ending at the VBM (the p-like manifold), sort ascending, then reverse with [::-1] so index 0 is the highest in energy.

EXECUTION STATE

eigs[vbm_band - 5 : vbm_band + 1] = Slice of length 6. For vbm_band=17 this is eigs[12:18] — the six p-like valence states at Γ. Example values: [-1.21, -1.21, -0.79, -0.79, -0.79, -0.79]

📚 np.sort(arr) = Returns a sorted copy of the array (ascending). Original is unchanged. Use np.argsort() instead if you need the indices.

[::-1] = Python slice trick — full slice, step −1, i.e. reverse. After this, valence_six[0] is the highest-energy state (top of Γ_8) and valence_six[-1] is the bottom of Γ_7.

⬆ valence_six = [-0.79, -0.79, -0.79, -0.79, -1.21, -1.21]

33Comment — step 5: identify Γ_8 and Γ_7

Group theory tells us the six p-like states at Γ split into a 4-fold Γ_8 (J = 3/2, HH + LH) plus a 2-fold Γ_7 (J = 1/2, split-off). Sorting by energy gives us this partition for free.

34Comment — picking the partition

Pure book-keeping: top 4 of valence_six = Γ_8, bottom 2 = Γ_7. Works as long as the SOC has actually opened a gap between them — converge ENCUT and k-mesh first.

35gamma8 = valence_six[:4].mean()

Average energy of the four Γ_8 states. They should be exactly degenerate at Γ; averaging silences any sub-meV numerical wobble.

EXECUTION STATE

valence_six[:4] = [-0.79, -0.79, -0.79, -0.79] — the heavy + light hole states

📚 .mean() = NumPy method — sum / count. For a length-4 array of equal numbers, just returns the number.

⬆ gamma8 = −0.7900 eV

36gamma7 = valence_six[4:].mean()

Same idea for the two-fold split-off band. Slice [4:] picks elements 4 and 5 — the bottom two of the sorted-and-reversed array.

EXECUTION STATE

valence_six[4:] = [-1.21, -1.21] — the split-off doublet

⬆ gamma7 = −1.2100 eV

37delta_so = gamma8 - gamma7

The spin–orbit splitting at Γ — the headline number every II-VI / III-V paper quotes. Sign convention: Γ_8 is the higher level, so Δ_SO > 0.

EXECUTION STATE

gamma8 - gamma7 = −0.79 − (−1.21) = +0.42 eV

⬆ delta_so = 0.4200 eV — matches the experimental CdSe value to within DFT uncertainty. Hybrid functionals (HSE06) agree even better.

→ sanity check = From our atomic-splitter widget: Δ_SO = (3/2) ξ_p(Se). With ξ_p(Se) ≈ 0.28 eV the predicted bulk Δ_SO is ≈ 0.42 eV — almost exactly the DFT result.

39print Γ_8 average

Formatted printout to four decimal places. f-strings (PEP 498) are the cleanest way to mix variables with text.

EXECUTION STATE

f"Gamma_8 (HH + LH) = {gamma8:.4f} eV" = f-string. {gamma8:.4f} formats the float with 4 decimal places. Output: Gamma_8 (HH + LH) = -0.7900 eV

40print Γ_7 average

Same formatting for the split-off doublet. Output: Gamma_7 (SO) = -1.2100 eV.

41print Δ_SO with experimental reference

Print Δ_SO with a literal '(CdSe expt: 0.42 eV)' tag so the operator can eyeball whether the calculation has reproduced the literature. Always sanity-check against experiment when SOC matters.

EXECUTION STATE

Output line = Delta_SO = 0.4200 eV (CdSe expt: 0.42 eV)

18 lines without explanation

1"""
2read_soc_bands.py
3=================
4Extract the spin-orbit splitting Delta_SO at Gamma from a VASP
5non-collinear band-structure calculation (LSORBIT = .TRUE.).
6
7Run AFTER the band-structure step (ICHARG = 11, line-mode KPOINTS).
8"""
9
10import numpy as np
11from pymatgen.io.vasp import BSVasprun
12from pymatgen.electronic_structure.core import Spin
13
14# 1.  Parse vasprun.xml (line-mode bands, with projections)
15vr = BSVasprun("vasprun.xml", parse_projected_eigen=True)
16bs = vr.get_band_structure(line_mode=True)
17
18# 2.  Locate the index of the Gamma k-point on the path
19labels = bs.kpoints
20gamma_idx = next(
21    i for i, k in enumerate(labels) if k.label in ("\\Gamma", "GAMMA", "G")
22)
23
24# 3.  In a non-collinear run, eigenvalues live under Spin.up only
25#     (each band is doubly Kramers-degenerate, so n_bands counts spinors)
26eigs = bs.bands[Spin.up][:, gamma_idx]   # shape: (n_bands,)
27
28# 4.  Find the valence-band maximum and the six p-like states beneath it
29vbm_band = bs.get_vbm()["band_index"][Spin.up][0]
30valence_six = np.sort(eigs[vbm_band - 5 : vbm_band + 1])[::-1]   # top 6
31
32# 5.  Top-4 = Gamma_8 (HH + LH degenerate at Gamma)
33#     Bottom-2 of these six = Gamma_7 (split-off)
34gamma8 = valence_six[:4].mean()
35gamma7 = valence_six[4:].mean()
36delta_so = gamma8 - gamma7
37
38print(f"Gamma_8 (HH + LH) = {gamma8:.4f} eV")
39print(f"Gamma_7 (SO)      = {gamma7:.4f} eV")
40print(f"Delta_SO          = {delta_so:.4f} eV   (CdSe expt: 0.42 eV)")

Sanity check from the atomic side

The DFT result Δ_SO ≈ 0.42 eV agrees with the atomic estimate $\tfrac{3}{2}\xi_p(\text{Se}) \approx 1.5 \times 0.28\,\text{eV} \approx 0.42\,\text{eV}$ . Two pictures, one number. When the atomic and band estimates of Δ_SO disagree by more than ~20 %, suspect either an unconverged calculation or strong covalent mixing of cation states into the VBM — both worth chasing down.

Summary

Origin. SOC is the relativistic correction in the Pauli reduction of the Dirac equation. Physically, an electron orbiting a nucleus sees a magnetic field; its own magnetic moment couples to that field. The result is $\hat{H}_{\text{SOC}} = \xi(r)\,\mathbf{L}\cdot \mathbf{S}$ .
Master identity. $2\,\mathbf{L}\cdot\mathbf{S} = \mathbf{J}^2 - \mathbf{L}^2 - \mathbf{S}^2$ . Every fine-structure splitting in atomic and condensed-matter physics is a one-line application of this identity.
p-shell partition. 6 → 4 + 2: a degenerate p-shell breaks into a J = 3/2 quadruplet at +ξ/2 and a J = 1/2 doublet at −ξ. Gap = (3/2) ξ. Centre of mass preserved.
In a crystal. The band-edge p-manifold of a zincblende semiconductor splits as Γ_15 ⊗ Γ_½ = Γ_8 ⊕ Γ_7. The Γ_8 quadruplet contains HH (m_J = ±3/2) and LH (m_J = ±1/2) bands, degenerate at Γ; Γ_7 is the split-off band, lower by Δ_SO.
Z⁴ scaling. Atomic ξ_p grows as Z⁴ (screening softens this somewhat). For II-VI semiconductors, the anion sets Δ_SO at the VBM.
Inversion-asymmetric crystals. Bulk inversion asymmetry (Dresselhaus) and structural inversion asymmetry (Rashba) lift the spin degeneracy of bands at finite $\mathbf{k}$ . Necessary for spin textures, topological surface states, and finite spin-relaxation times.
VASP. LSORBIT = .TRUE., set MAGMOM with three components per atom, choose SAXIS, turn ISYM off. Cost ~3–4× collinear; bands need to be doubled.
Reading the output. In a non-collinear run all eigenvalues live under Spin.up. The six topmost valence states at Γ split into a 4-fold Γ_8 and a 2-fold Γ_7. Their average difference is Δ_SO.

Section 5.7 Core Insight

"Spin–orbit coupling is one identity — 2 L·S = J² − L² − S² — applied at every level of the hierarchy: atomic fine structure, band edges of a crystal, and Rashba/Dresselhaus textures. The same 6 = 4 + 2 partition shows up in the p-shell of a free Se atom and in the Γ_8 + Γ_7 split of CdSe's valence-band top."

Coming next: Section 5.8 — Defects and Doping — where we put a Mn atom on a Cd site, watch the d-shell of Mn split into $t_2$ and $e$ mid-gap states under the tetrahedral ligand field, and use SOC + crystal field together to predict the emission energy of a Mn:CdSe quantum dot.