Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

By the end of this section you should be able to:

Explain in plain words why an electron inside a crystal behaves dynamically as if it had a different mass than the bare electron in vacuum — and recognise the effective mass $m^{*}$ as the single number that captures this dressing of the electron by the periodic potential.
Derive the master formula $(m^{*})^{-1} = \hbar^{-2}\,\partial^{2}E/\partial k^{2}$ from the semiclassical equations of motion, and read it the other way: a band's curvature is a mass, expressed in inverse units.
Generalise to anisotropic minima — the inverse mass $(m^{*})^{-1}_{ij}$ is a tensor — and tell apart the three masses you will meet in practice: the band-curvature mass, the density-of-states mass $m_{\text{DOS}}^{*}$ , and the conductivity mass $m_{\sigma}^{*}$ .
Explain holes — why a missing electron at the top of a valence band carries positive charge and a positive effective mass, and where the sign flips happen in the bookkeeping.
Connect $m^{*}$ to transport via the Drude formulae $\mu = e\tau / m^{*}$ and $\sigma = n e^{2}\tau / m^{*}$ , and predict which materials make fast transistors and which make sluggish insulators just from a band structure plot.
Compute an effective mass from VASP output: locate the band edge in vasprun.xml, sample E(k) along a dense path, fit a parabola, and convert the leading coefficient to $m^{*}/m_{0}$ .

One-line preview: the curvature of a band edge is a mass, the inverse of that mass is an acceleration-per-force, and that single number, together with a scattering time τ, predicts every bulk transport coefficient your device-physics professor will ever ask you to compute.

The Puzzle — A Real Mass That Lies

Take an electron in vacuum. Push it with a force $F$ . Newton, undisturbed since 1687, says it accelerates at $a = F/m_{0}$ with $m_{0} = 9.109\times 10^{-31}\,\text{kg}$ . Now drop the same electron into a silicon crystal at the bottom of the conduction band, apply the same force, and measure the acceleration. You will find it accelerates as if it had a mass of $0.92\,m_{0}$ along the longitudinal axis of an ellipsoidal valley and $0.19\,m_{0}$ in the perpendicular directions. Push an electron at the bottom of the GaAs conduction band and it responds as if it weighed only $0.067\,m_{0}$ , barely more than a feather's worth of the bare value. Push it into the heavy-hole band of GaAs and the same particle suddenly weighs $0.51\,m_{0}$ .

The electron itself never changed. What changed is its environment. Inside the crystal, the periodic potential of the ions coherently scatters the electron at every lattice site, the Bloch waves are what survives that interference, and the net effect is that the response to an external force looks exactly like Newton's second law for a particle with a different mass. We call that number the effective mass $m^{*}$ . It is not a fudge factor; it is a genuine, calculable property of the band structure, and it determines almost every electronic property the engineer cares about.

A useful analogy: marbles in molasses, beads on a wire

Imagine a marble rolling on a flat table — that is the vacuum electron. Now drop it into a viscous fluid: pushing it requires more force per unit acceleration, so it acts heavier. That is the heavy-electron limit. Conversely, a bead threaded on a frictionless straight wire, pushed along the wire, requires less force than a free marble — most of the bead's "mass" is constrained by the wire and only the parallel component matters. That is the light-electron limit. The crystal lattice is doing something analogous: it dresses the electron with a band-dependent inertia.

Newton's Law for a Bloch Electron

Section 5.1 already introduced the two semiclassical equations of motion that govern a Bloch electron in a smoothly varying external force $\mathbf{F}$ . We restate them here because they are the only physics input we need:

\hbar\,\dot{\mathbf{k}} \;=\; \mathbf{F}, \qquad \mathbf{v} \;=\; \frac{1}{\hbar}\,\nabla_{\mathbf{k}} E_n(\mathbf{k})

The first equation says the wavevector evolves under the external force, exactly as the momentum does in classical mechanics — the crystal momentum $\hbar\mathbf{k}$ plays the role of $m\mathbf{v}$ . The second says the velocity of the wavepacket is the gradient of the band in k-space: a wavepacket centred at $\mathbf{k}$ moves with the group velocity $\mathbf{v}_n(\mathbf{k})$ . Put these two together and you have a complete dynamical theory: an external force slides the wavepacket along the band, and the band itself prescribes the velocity at every point.

Why this is "semiclassical"

The Bloch wavefunction is fully quantum, but the centre of a wavepacket built from nearby k-states obeys these classical-looking equations to an excellent approximation, provided the external force varies slowly on the scale of the lattice constant and is too weak to drive transitions between bands. The magic is hidden in $E_n(\mathbf{k})$ : every quantum-mechanical subtlety of the periodic potential is already baked into the shape of the band. Solve the band structure once, then run classical equations of motion. This is why a single-particle band calculation is so powerful in device physics.
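To make the velocity equation concrete, here is a minimal sketch that evaluates $\mathbf{v} = \hbar^{-1}\nabla_{\mathbf{k}}E$ numerically on a toy 1D band. The cosine band, the hopping `t` and the lattice constant `a` are illustrative assumptions, not values from this section; they are chosen only because the exact derivative is known and can be compared against.

```python
import numpy as np

HBAR_EV_S = 6.582119569e-16  # hbar in eV*s

def group_velocity(k, energy):
    """v = (1/hbar) dE/dk, with k in 1/Angstrom and energy in eV; returns m/s."""
    dE_dk = np.gradient(energy, k)        # eV * Angstrom
    return dE_dk / HBAR_EV_S * 1e-10      # Angstrom/s converted to m/s

# Hypothetical toy band E(k) = -2 t cos(k a), with t = 1 eV and a = 5 Angstrom
t, a = 1.0, 5.0
k = np.linspace(-np.pi / a, np.pi / a, 2001)
E = -2.0 * t * np.cos(k * a)

v_numeric = group_velocity(k, E)
v_exact = 2.0 * t * a * np.sin(k * a) / HBAR_EV_S * 1e-10

print("max |v| on this band (m/s):", np.abs(v_exact).max())
print("v at the band bottom (m/s):", v_numeric[len(k) // 2])
```

The numerical and analytic velocities agree away from the ends of the array, and the velocity vanishes at the band bottom, where the slope of the band is zero.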

Curvature → Mass: The Derivation

Differentiate the velocity equation with respect to time and use the chain rule:

\dot{v}_i \;=\; \frac{d}{dt}\left[\frac{1}{\hbar}\,\partial_{k_i} E_n\right] \;=\; \frac{1}{\hbar}\,\partial_{k_i}\partial_{k_j} E_n \cdot \dot{k}_j \;=\; \frac{1}{\hbar^{2}}\,\partial_{k_i}\partial_{k_j} E_n \cdot F_j

Compare to Newton's second law $\dot{v}_i = (m^{*-1})_{ij}\,F_j$ . We are forced to identify the inverse effective-mass tensor as the Hessian of the band:

\boxed{\quad \bigl(m^{*}\bigr)^{-1}_{ij} \;=\; \frac{1}{\hbar^{2}}\,\frac{\partial^{2} E_n(\mathbf{k})}{\partial k_i\,\partial k_j} \quad}

This is the master formula. Read it slowly. The left-hand side is a familiar mechanical object — a 3×3 inverse mass tensor with units of $\text{kg}^{-1}$ . The right-hand side is a purely geometric property of the band structure — the curvature of $E_n(\mathbf{k})$ at a chosen k-point. These are the same thing. Curvature in k-space is mass.
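As a rough numerical counterpart to the master formula, the sketch below builds the Hessian of a band by central finite differences and divides by $\hbar^{2}/m_{0}$ so the result is the dimensionless tensor $m_{0}\,(m^{*})^{-1}_{ij}$. The function name `inverse_mass_tensor`, the step `h`, and the toy ellipsoidal band are assumptions made for illustration; in practice `band` would be an interpolation of computed eigenvalues.

```python
import numpy as np

HBAR2_OVER_M0 = 7.6200  # hbar^2 / m0 in eV * Angstrom^2

def inverse_mass_tensor(band, k0, h=1e-3):
    """Central-difference Hessian of band(k) at k0, returned as m0 * (m*)^-1_ij.

    band: callable taking a length-3 array (1/Angstrom) and returning eV.
    """
    k0 = np.asarray(k0, dtype=float)
    hess = np.zeros((3, 3))
    for i in range(3):
        for j in range(3):
            ei = np.eye(3)[i] * h
            ej = np.eye(3)[j] * h
            hess[i, j] = (band(k0 + ei + ej) - band(k0 + ei - ej)
                          - band(k0 - ei + ej) + band(k0 - ei - ej)) / (4.0 * h * h)
    return hess / HBAR2_OVER_M0

# Hypothetical ellipsoidal valley with principal masses 0.92, 0.19, 0.19 (units of m0)
masses = np.array([0.92, 0.19, 0.19])

def toy_band(k):
    return 0.5 * HBAR2_OVER_M0 * np.sum(np.asarray(k) ** 2 / masses)

inv_m = inverse_mass_tensor(toy_band, [0.0, 0.0, 0.0])
print(np.round(1.0 / np.diag(inv_m), 3))   # recovered principal masses in m0
```

For a purely quadratic band the finite-difference Hessian is exact up to rounding, so the diagonal returns the input masses and the off-diagonal elements vanish.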

The physical meaning of the formula

Sharp parabola at a band minimum (large curvature $\partial^{2}E/\partial k^{2}$ ) → small $m^{*}$ → light, mobile carrier. GaAs electron with $m^{*}_e \approx 0.067\,m_{0}$ .
Flat saucer at a band minimum (small curvature) → large $m^{*}$ → heavy, sluggish carrier. Heavy-fermion compounds reach $m^{*} > 100\,m_{0}$ .
Negative curvature (a band maximum, like the valence band top) → negative effective mass for the electron description. We will see in §Holes how this sign flip is absorbed by introducing the hole.

The 1D scalar form

For a one-dimensional band, or for an isotropic minimum where the Hessian is a multiple of the identity, the tensor collapses to a single number:

m^{*} \;=\; \hbar^{2}\bigg/\frac{\partial^{2}E}{\partial k^{2}}

Around a band minimum, expand the energy in a Taylor series. The zero-th order is the band-edge energy $E_{0}$ . The first derivative vanishes (it's an extremum). What remains is the parabolic approximation:

E(k) \;\approx\; E_{0} \;+\; \frac{\hbar^{2}\,(k-k_{0})^{2}}{2 m^{*}}

— exactly the kinetic energy of a free particle of mass $m^{*}$ . Near a band edge, every semiconductor pretends to be a free-electron gas with a renormalised mass. This is the single most useful approximation in solid-state physics. Every textbook formula for carrier statistics, mobility, optical absorption near the gap, exciton binding energy, and donor level depth is built on it.

The k·p method — where the formula really comes from

Strictly speaking, the parabolic expansion is the leading order of k·p perturbation theory: write $H = H_{0} + (\hbar/m_{0})\,\mathbf{k}\cdot\mathbf{p} + \hbar^{2}k^{2}/2m_{0}$ and treat the second term as a perturbation around $\mathbf{k}_{0}$ . The second-order correction couples the band of interest to all other bands through momentum matrix elements $\langle n|\hat{p}|m\rangle$ , and yields exactly the inverse-mass tensor formula above. The lighter the band edge, the stronger its coupling to a remote band of opposite parity. In III-V semiconductors most of the conduction electron's lightness comes from coupling to the valence band ~1 eV away — the smaller the gap, the lighter the electron. That is why InSb (gap 0.17 eV) has $m^{*}_e \approx 0.014\,m_{0}$ while diamond (gap 5.5 eV) has $m^{*}_e \approx 0.4\,m_{0}$ .
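The gap dependence can be illustrated with the simplest two-band version of k·p, in which the conduction-band mass satisfies $m_{0}/m^{*} \approx 1 + E_{P}/E_{g}$ with $E_{P} = 2|p_{cv}|^{2}/m_{0}$. The sketch below is only an order-of-magnitude estimate: the value $E_{P} = 20$ eV is an assumed, typical III-V number rather than a value from this section, and the model ignores spin-orbit splitting and remote bands, so it should not be expected to reproduce the quoted masses in detail (and it is a poor model for diamond, whose conduction minimum is not at Γ).

```python
def two_band_mass(gap_ev, kane_energy_ev=20.0):
    """m*/m0 from the two-band k.p estimate m0/m* = 1 + E_P / E_g."""
    return 1.0 / (1.0 + kane_energy_ev / gap_ev)

# Gaps as quoted in the text; E_P = 20 eV is an assumed typical value
for name, gap in [("InSb", 0.17), ("GaAs", 1.42), ("diamond", 5.5)]:
    print(f"{name:8s} gap = {gap:4.2f} eV  ->  m*/m0 ~ {two_band_mass(gap):.3f}")
```

The trend is the point: holding the momentum matrix element fixed, shrinking the gap makes the electron lighter.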

Interactive — Bend the Band, Read the Mass

Below is a one-band sandbox. Slide the effective-mass control to make the parabola narrow and steep (light mass) or wide and flat (heavy mass). Drag the pink dot along the band: the dashed tangent gives you the slope at that point, which is exactly $\hbar$ times the group velocity $v_g = \hbar^{-1}\,\partial E/\partial k$ . Toggle between the electron view (parabola opens upward at the conduction band minimum) and the hole view (parabola opens downward at the valence band maximum) to see the sign flip we will discuss in §Holes below. The faint dashed parabola is the free-electron reference $m^{*} = m_{0}$ : as you change the slider, watch how dramatically the band departs from it.

[Interactive widget: carrier toggle (electron or hole) and an m*/m₀ slider. Sample readout at m*/m₀ = 1.000: curvature ∂²E/∂k² = 7.620 eV·Å², group velocity v_g = 4.63 × 10⁵ m/s, mobility μ = eτ/m* = 176 cm²/V·s for τ = 0.1 ps.]

Drag the pink dot along the band to see how slope (group velocity) and height (kinetic energy) change. Slide m*/m₀: a heavier mass flattens the parabola, so the same Δk costs less energy, and the carrier moves slower at a given k and is harder to push. Switch to hole and the parabola flips: the maximum is the band edge, and curvature is negative even though the hole's effective mass is positive.

Three things to try

Set $m^{*}/m_{0} = 0.1$ and read the mobility. Now set it to $m^{*}/m_{0} = 1$ . The mobility drops by exactly a factor of 10 — that is the Drude scaling $\mu = e\tau/m^{*}$ staring you in the face.
Drag the dot away from $k = 0$ . The curvature is the same everywhere on a parabola, but the slope (and therefore the group velocity) grows linearly with $k$ . This is the textbook fact that an electron at the bottom of a band has zero group velocity, even though it has finite kinetic energy.
Switch to the hole view. The parabola flips upside down — and yet the curvature has the same magnitude. The negative sign of the second derivative is what makes us reinvent the hole.
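The readouts quoted for the sandbox can be checked by hand. The sketch below assumes the parabolic band $E = \hbar^{2}k^{2}/2m^{*}$ and the τ = 0.1 ps shown in the mobility label; the dot position k = 0.4 Å⁻¹ is an assumption inferred from the quoted group velocity, not something the page states.

```python
HBAR = 1.054571817e-34       # J*s
M0 = 9.1093837015e-31        # kg
E_CHARGE = 1.602176634e-19   # C

def sandbox_readouts(m_ratio, k_inv_angstrom, tau_s=1e-13):
    """Curvature (eV*Angstrom^2), group velocity (m/s), mobility (cm^2/V/s)."""
    m_eff = m_ratio * M0
    curvature = HBAR**2 / m_eff / E_CHARGE * 1e20     # J*m^2 -> eV*Angstrom^2
    v_group = HBAR * (k_inv_angstrom * 1e10) / m_eff   # hbar k / m*
    mobility = E_CHARGE * tau_s / m_eff * 1e4          # m^2/V/s -> cm^2/V/s
    return curvature, v_group, mobility

curv, v_g, mu = sandbox_readouts(m_ratio=1.0, k_inv_angstrom=0.4)
print(f"curvature = {curv:.3f} eV*A^2, v_g = {v_g:.3g} m/s, mu = {mu:.0f} cm^2/V/s")

# First experiment from the list: m*/m0 = 0.1 versus 1.0
mu_light = sandbox_readouts(0.1, 0.4)[2]
print("mobility ratio (light / free):", mu_light / mu)
```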

Anisotropy — The Effective Mass Tensor

Real bands are almost never spherical. The conduction band of silicon has six equivalent ellipsoidal valleys near (but not at) the X points; each valley is elongated along the $\langle 100\rangle$ direction. The valence band of GaAs has a heavy-hole and a light-hole band that are degenerate at Γ and warped: the equal-energy surfaces are not even ellipsoids but rumpled shapes resembling potatoes. To handle these, we have to keep the full Hessian.

At a band edge in three dimensions, the parabolic approximation is

E(\mathbf{k}) \;=\; E_{0} \;+\; \tfrac{1}{2}\,\hbar^{2}\,(k - k_{0})_i\,(m^{*-1})_{ij}\,(k - k_{0})_j

The inverse mass tensor is symmetric, so it can be diagonalised: rotate coordinates to align with its principal axes and the energy surface becomes

E(\mathbf{k}) \;=\; E_{0} \;+\; \frac{\hbar^{2}}{2}\left[\frac{(\Delta k_{1})^{2}}{m_{1}^{*}} + \frac{(\Delta k_{2})^{2}}{m_{2}^{*}} + \frac{(\Delta k_{3})^{2}}{m_{3}^{*}}\right]

— three independent effective masses along the three principal directions. The equal-energy surface is now an ellipsoid:

\frac{(\Delta k_{1})^{2}}{m_{1}^{*}} + \frac{(\Delta k_{2})^{2}}{m_{2}^{*}} + \frac{(\Delta k_{3})^{2}}{m_{3}^{*}} \;=\; \text{const.}

For an axially symmetric valley like silicon, two of the masses are equal — call them the transverse mass $m_{t}^{*}$ — and the third is the longitudinal mass $m_{\ell}^{*}$ . The whole ellipsoid is then specified by two numbers. For Si: $m_{\ell}^{*}/m_{0} \approx 0.92$ , $m_{t}^{*}/m_{0} \approx 0.19$ — the valley is a cigar pointing along ⟨100⟩, much heavier along its long axis than across.
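Diagonalising the inverse mass tensor is a one-line eigenvalue problem. The sketch below is a minimal illustration using `numpy.linalg.eigh`; the helper name `principal_masses` and the 30° rotation of the valley are hypothetical, chosen only so the tensor is not already diagonal.

```python
import numpy as np

def principal_masses(inv_mass):
    """Eigen-decompose a symmetric inverse-mass tensor given in units of 1/m0.

    Returns the principal masses (m0) and a matrix whose columns are the principal axes.
    """
    eigvals, axes = np.linalg.eigh(inv_mass)   # eigenvalues in ascending order
    return 1.0 / eigvals, axes

# Hypothetical Si-like valley whose long axis is rotated 30 degrees from x in the xy-plane
theta = np.radians(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                [np.sin(theta),  np.cos(theta), 0.0],
                [0.0,            0.0,           1.0]])
inv_mass = rot @ np.diag([1 / 0.92, 1 / 0.19, 1 / 0.19]) @ rot.T

masses, axes = principal_masses(inv_mass)
print("principal masses (m0):", np.round(masses, 3))
print("heavy axis:", np.round(axes[:, 0], 3))
```

Because `eigh` sorts eigenvalues of the inverse mass in ascending order, the heaviest mass comes first and its eigenvector is the long axis of the ellipsoid (up to an overall sign).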

Why anisotropy matters for devices

When you grow a MOSFET on a (100) silicon wafer, the channel current flows in the wafer plane. Two of the six Si valleys point out of the plane (their long, heavy axis is normal to the channel) while four point in-plane. Surface confinement raises the energy of the four in-plane valleys, leaving the carriers in the two out-of-plane "heavy-up, light-along" valleys, and these give the channel its low transport mass $m_{t}^{*}$ . This is one of the rare cases where you actually have to care which face of silicon you cut, and it is band-curvature physics all the way down.

Three Flavours of Effective Mass

Once the valley is anisotropic, "the" effective mass ceases to be a single number. Different physical observables average the principal masses in different ways. You will meet three:

1. The band-curvature mass

Direct readout of $(m^{*-1})_{ij}$ from the Hessian. This is the mass tensor itself, with three components for an ellipsoidal valley. It is what you get out of a parabolic fit to a VASP band along a chosen direction. Use it for k·p models and for direction-resolved transport.

2. The density-of-states effective mass

The number of carrier states up to energy $E$ in a single ellipsoidal valley equals the same quantity for a sphere with mass $m_{\text{DOS}}^{*}$ and the same volume. That gives the geometric mean of the principal masses, multiplied by the number of equivalent valleys $g_{v}$ raised to a 2/3 power:

m_{\text{DOS}}^{*} \;=\; g_{v}^{2/3}\,(m_{1}^{*}\,m_{2}^{*}\,m_{3}^{*})^{1/3}

Use this when computing the carrier concentration in a doped semiconductor, the position of the Fermi level versus temperature, or the effective DOS prefactors $N_c, N_v$ . For Si: $m_{\text{DOS},e}^{*}/m_{0} = 6^{2/3}(0.92\cdot 0.19^{2})^{1/3} \approx 1.06$ , substantially larger than any single principal mass, because the six valleys all contribute states at the same energy.
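A small sketch of the DOS-mass arithmetic, using the silicon masses quoted above. The second function uses the standard textbook expression $N_c = 2\,(2\pi m_{\text{DOS}}^{*} k_B T/h^{2})^{3/2}$, which is not derived in this section and is included here only as an assumed illustration of where the DOS mass is used; the function names are hypothetical.

```python
import math

K_B = 1.380649e-23          # J/K
H_PLANCK = 6.62607015e-34   # J*s
M0 = 9.1093837015e-31       # kg

def dos_mass(m1, m2, m3, n_valleys=1):
    """Density-of-states mass (m0) from three principal masses (m0)."""
    return n_valleys ** (2.0 / 3.0) * (m1 * m2 * m3) ** (1.0 / 3.0)

def effective_dos_cm3(m_dos_ratio, temperature_k=300.0):
    """N_c = 2 (2 pi m kT / h^2)^(3/2), converted from m^-3 to cm^-3."""
    n_m3 = 2.0 * (2.0 * math.pi * m_dos_ratio * M0 * K_B * temperature_k
                  / H_PLANCK**2) ** 1.5
    return n_m3 * 1e-6

m_dos_si = dos_mass(0.92, 0.19, 0.19, n_valleys=6)
print(f"Si m_DOS/m0 = {m_dos_si:.3f}")
print(f"N_c(300 K)  = {effective_dos_cm3(m_dos_si):.2e} cm^-3")
```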

3. The conductivity effective mass

For transport, the relevant average is the harmonic mean of the principal masses (the inverse mass tensor adds when you sum over equivalent valleys):

\frac{1}{m_{\sigma}^{*}} \;=\; \frac{1}{3}\!\left(\frac{1}{m_{1}^{*}} + \frac{1}{m_{2}^{*}} + \frac{1}{m_{3}^{*}}\right)

Use this in the Drude formula $\sigma = ne^{2}\tau/m_{\sigma}^{*}$ . For Si: $m_{\sigma}^{*}/m_{0} \approx 0.26$ — much lighter than the DOS mass because conductivity is dominated by the easy directions, while DOS counts all directions equally.
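The harmonic mean is equally quick to evaluate. This is a minimal sketch with a hypothetical helper name, using the silicon principal masses quoted earlier in the section.

```python
def conductivity_mass(m1, m2, m3):
    """Harmonic mean of three principal masses (all in units of m0)."""
    return 3.0 / (1.0 / m1 + 1.0 / m2 + 1.0 / m3)

m_sigma_si = conductivity_mass(0.92, 0.19, 0.19)
print(f"Si conductivity mass / m0 = {m_sigma_si:.3f}")

# Sanity check: an isotropic valley returns its own mass
print(conductivity_mass(0.067, 0.067, 0.067))
```

The two light transverse directions dominate the sum of inverse masses, which is why the result sits close to $m_{t}^{*}$ rather than $m_{\ell}^{*}$.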

Cheat sheet: which mass goes where

Quantity	Mass to use	Why
Carrier concentration n(T), Fermi level	DOS effective mass	You are counting states up to an energy — geometric mean × valley count
Mobility, conductivity, drift velocity	Conductivity effective mass	Inverse masses average — the easy axis carries the current
Cyclotron resonance, k·p Hamiltonian	Principal-axis masses (the tensor)	Direction-resolved physics — keep all three
Optical absorption near gap, exciton binding	Reduced mass μ from m_e* and m_h*	Two-particle problem — both band-curvature masses combine

Holes — Curvature Turned Upside Down

At a band maximum like the valence-band top, the curvature is negative: $\partial^{2}E/\partial k^{2} < 0$ . Plug that into our master formula and you get a negative effective mass for the electron, which then accelerates opposite to the applied force: apply an electric field pointing right, so that the force $-e\mathbf{E}$ on the electron points left, and the electron accelerates to the right. That sounds bizarre until you realise the valence band is almost completely full, and what is moving is the absence of an electron.

Define the hole as the missing electron. Its charge is $+e$ (because removing a negative contributor leaves a net positive); its momentum is $-\hbar\mathbf{k}$ (the missing electron had $+\hbar\mathbf{k}$ ); its energy is $-E_{n}(\mathbf{k})$ measured from the band top. With these substitutions the hole effective mass

m_{h}^{*} \;=\; -\,\hbar^{2}\bigg/\frac{\partial^{2}E_{v}(\mathbf{k})}{\partial k^{2}} \;>\; 0

is positive. The hole has positive charge, positive mass, and obeys Newton's second law in the ordinary sense: an electric field pushes it in the field direction. On a conventional band diagram a hole that gains energy moves down, away from the valence-band top, because the hole's energy increases as the missing electron state moves to lower $E$ .
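The sign bookkeeping is easy to see on a synthetic valence band. The sketch below fits a downward-opening parabola and reads off both the (negative) electron-picture mass and the (positive) hole mass. The band is generated from an assumed hole mass of 0.51 m₀, and k is taken in Å⁻¹; nothing here comes from a real calculation.

```python
import numpy as np

HBAR2_OVER_M0 = 7.6200   # hbar^2 / m0 in eV * Angstrom^2

# Synthetic valence band: maximum at E = 0, opening downward
m_h_true = 0.51
k = np.linspace(-0.03, 0.03, 9)                    # 1/Angstrom
E_v = -HBAR2_OVER_M0 * k**2 / (2.0 * m_h_true)     # eV

a, b, c = np.polyfit(k, E_v, 2)        # E ~ a k^2 + b k + c, with a < 0 here
m_electron = HBAR2_OVER_M0 / (2.0 * a)  # electron picture: negative mass
m_hole = -m_electron                    # hole picture: flip the sign once

print(f"curvature coefficient a = {a:.2f} eV*A^2")
print(f"electron-picture mass = {m_electron:.3f} m0, hole mass = {m_hole:.3f} m0")
```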

Two simultaneous bookkeeping conventions

You will see both conventions in the literature, and switching between them is a common source of sign errors. (i) The electron picture: keep the electron charge $-e$ and the negative curvature, and let the dynamics handle themselves. (ii) The hole picture: flip the sign of charge, momentum, and curvature simultaneously, and pretend the carrier is a positively charged particle with positive mass. Both give identical predictions for any observable. VASP and most band-structure codes report eigenvalues in the electron picture; mobility models and device simulators use the hole picture. When reading a number, ask: was its sign already flipped?

Drude Transport: Mobility and Conductivity

We now have everything we need to derive the central transport formula of solid-state physics. Apply a uniform electric field $\mathbf{E}$ . The semiclassical equation gives $\hbar\dot{\mathbf{k}} = -e\mathbf{E}$ ; the wavevector slides at constant rate. If nothing else happened, the electron would just keep accelerating: it would slide up the band, reach the zone boundary, Bragg-reflect, slide back, and oscillate (these are Bloch oscillations, which have actually been observed in cold-atom systems and superlattices). In a normal crystal, however, scattering intervenes long before the electron reaches the zone boundary.

Model scattering as a Poisson process with mean free time $\tau$ : in a small interval $dt$ the electron is randomly scattered with probability $dt/\tau$ , after which its crystal momentum is reset to a random value drawn from the equilibrium distribution. Between scatterings it accelerates as a Bloch electron of effective mass $m^{*}$ . The steady-state ensemble-averaged velocity along the field — the drift velocity — is

\mathbf{v}_{d} \;=\; \frac{-e\,\tau}{m^{*}}\,\mathbf{E} \;\equiv\; -\,\mu\,\mathbf{E}

defining the mobility $\mu \equiv e\tau/m^{*}$ , with units of $\text{cm}^{2}\,\text{V}^{-1}\,\text{s}^{-1}$ . Multiply both sides by the carrier density $n$ and the charge $-e$ to get the current density and read off the Drude conductivity:

\sigma \;=\; \frac{n\,e^{2}\,\tau}{m^{*}} \;=\; n\,e\,\mu

Three numbers determine every bulk transport coefficient: how many carriers there are ( $n$ ), how heavy each one is ( $m^{*}$ ), and how often they scatter ( $1/\tau$ ). DFT gives us $m^{*}$ and (with statistics) the carrier density; $\tau$ is harder: it comes from electron-phonon coupling, electron-impurity scattering, and electron-electron scattering, each of which is its own chapter. For our purposes it is enough to remember: a band edge with strong curvature gives a small mass, which gives high mobility, which gives a fast device.
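The Poisson scattering model described above can be simulated directly. The following is a minimal 1D sketch in dimensionless units (force, mass and τ all of order one); the function name, the Gaussian post-collision velocity distribution and the `v_thermal` parameter are assumptions made for illustration, and the sign convention simply takes the force along +x.

```python
import numpy as np

def simulate_drift(force=1.0, mass=1.0, tau=1.0, n_carriers=10000,
                   dt=0.01, n_steps=1000, v_thermal=1.0, seed=0):
    """Ensemble-averaged velocity after n_steps of accelerate-then-scatter."""
    rng = np.random.default_rng(seed)
    v = rng.normal(0.0, v_thermal, n_carriers)
    for _ in range(n_steps):
        v += force / mass * dt                          # free acceleration
        scattered = rng.random(n_carriers) < dt / tau   # Poisson scattering
        # Scattered carriers forget their velocity: redraw from equilibrium
        v[scattered] = rng.normal(0.0, v_thermal, scattered.sum())
    return v.mean()

v_d = simulate_drift()
print(f"measured drift = {v_d:.3f}, Drude prediction F*tau/m = {1.0:.3f}")
print(f"doubling the mass: {simulate_drift(mass=2.0):.3f} (prediction 0.5)")
```

The measured drift settles close to $F\tau/m^{*}$ up to statistical noise and a small bias of order `dt`, so the time step should be kept well below τ.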

Numerical sanity check

For GaAs at 300 K: $m^{*}_e \approx 0.067\,m_{0}$ , $\tau \approx 0.5\,\text{ps}$ . Plugging in: $\mu = e\tau/m^{*} = (1.602\times 10^{-19})(5\times 10^{-13})/(0.067 \cdot 9.109\times 10^{-31}) \approx 1.31\,\text{m}^{2}/\text{V}\cdot\text{s} = 1.31\times 10^{4}\,\text{cm}^{2}/\text{V}\cdot\text{s}$ . That is within a factor of about 1.5 of the experimental value of $\mu_e \approx 8000\text{–}9000\,\text{cm}^{2}/\text{V}\cdot\text{s}$ ; the remaining gap closes once you account for finite-temperature phonon scattering, which shortens $\tau$ to roughly $0.3\,\text{ps}$ . Given a sensible scattering time, the Drude formula is a good first estimate.
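The arithmetic of this check is short enough to script. The sketch below reproduces the mobility from the quoted τ and mass, then inverts the formula to see what scattering time a mobility of 8500 cm²/V·s would imply. The helper names and the carrier density of 10¹⁷ cm⁻³ in the last line are hypothetical.

```python
E_CHARGE = 1.602176634e-19   # C
M0 = 9.1093837015e-31        # kg

def drude_mobility_cm2(tau_s, m_ratio):
    """mu = e tau / m*, returned in cm^2/(V s)."""
    return E_CHARGE * tau_s / (m_ratio * M0) * 1e4

def tau_from_mobility(mu_cm2, m_ratio):
    """Invert mu = e tau / m* for the scattering time in seconds."""
    return mu_cm2 * 1e-4 * m_ratio * M0 / E_CHARGE

def drude_conductivity(n_cm3, tau_s, m_ratio):
    """sigma = n e mu, returned in S/cm for n in cm^-3."""
    return n_cm3 * E_CHARGE * drude_mobility_cm2(tau_s, m_ratio)

mu = drude_mobility_cm2(0.5e-12, 0.067)
tau_needed = tau_from_mobility(8500.0, 0.067)
print(f"mu(tau = 0.5 ps)       = {mu:.3g} cm^2/V/s")
print(f"tau for 8500 cm^2/V/s  = {tau_needed * 1e12:.2f} ps")
print(f"sigma at n = 1e17 cm^-3 = {drude_conductivity(1e17, 0.5e-12, 0.067):.0f} S/cm")
```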

Interactive — Drift in an Electric Field

Below is a 2D Drude sandbox. Each cyan dot is a carrier with effective mass $m^{*}$ (slider). The yellow arrows on the right show an applied electric field; the carriers feel a force in the +x direction. With scattering off, every carrier accelerates indefinitely: there is no steady state, the drift velocity climbs linearly, and a perfect crystal would have infinite conductivity. Toggle scattering on and watch the drift velocity saturate near the theoretical value $v_d = eE\tau/m^{*}$ : each carrier accelerates between scattering events, then has its momentum randomised, and the ensemble settles into a steady drift. This is why even the cleanest real metal has a finite conductivity, and why cooling a metal increases its conductivity (longer $\tau$ ).

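[Interactive widget: sliders for E-field (1.00), m*/m₀ (1.00) and τ (40 ms), a scattering on/off toggle, a scattering-event counter, and readouts of the measured drift velocity against the theoretical v_d = eEτ/m* (0.040 in the sandbox's units at these settings).]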

Turn scattering off and watch the carriers accelerate forever — a perfect crystal would be a perfect conductor. Turn it on and the drift velocity saturates near eEτ/m*: a balance between the field accelerating the carrier and scattering randomising its momentum. Increase m* and the same field produces a smaller drift — heavier carriers are sluggish, exactly the prediction of the Drude formula μ = eτ/m*.

Three things to try

With scattering on, double the field. The drift velocity doubles. With scattering off, doubling the field doubles the rate of acceleration, not the velocity — Ohm's law fails completely without scattering.
With scattering on, halve $m^{*}$ . The drift velocity doubles. Same field, half the inertia, twice the response — exactly $\mu \propto 1/m^{*}$ .
With scattering on, halve $\tau$ . The drift velocity halves. Mobility scales linearly with mean free time — the cleaner your sample (longer $\tau$ ), the faster your device.

Real Materials — A Reference Table

Here is a sampling of room-temperature effective masses for the semiconductors and metals you will most often meet. Numbers are cited in units of $m_{0}$ ; for ellipsoidal valleys we list both principal masses; for warped valence bands we list the heavy and light hole branches separately.

Material	m_e* / m₀	m_hh* / m₀	m_lh* / m₀	Gap (eV)	Notes
Si (indirect)	0.92 (∥), 0.19 (⊥)	0.49	0.16	1.12	6 ellipsoidal valleys near X
Ge (indirect)	1.59 (∥), 0.082 (⊥)	0.33	0.043	0.66	4 ellipsoidal valleys at L
GaAs	0.067	0.51	0.082	1.42	Direct gap at Γ — workhorse for HEMTs
InAs	0.023	0.41	0.026	0.36	Very light electrons — high-frequency electronics
InSb	0.014	0.43	0.015	0.17	Lightest electrons of any common semiconductor
CdSe (zincblende)	0.13	0.45	0.27	1.74	Our running example — see §VASP below
GaN (wurtzite)	0.20	1.4 (A)	1.1 (B)	3.4	Power electronics, heavy holes
Diamond	0.36	1.08	0.36	5.5	Ultra-wide gap — heavy carriers
Cu (metal)	1.0–1.4 (anisotropic)	—	—	—	Metals don't have a 'mass' in the same sense

Where these numbers come from

Most of the values in the table are not raw DFT predictions — plain PBE notoriously underestimates band gaps and correspondingly overestimates curvature, leaving electron masses ~30% too light in many III-V materials. The numbers above are experimental, typically extracted from cyclotron resonance, Shubnikov-de Haas oscillations, or magneto-optical absorption. When you compute an effective mass with VASP for the first time, expect to be within a factor of 2 of the experimental number using PBE, and within ~20% using HSE06. Spin-orbit coupling matters for the valence bands of any III-V or II-VI compound, and for the conduction bands of heavy elements like Pb and Bi — turn it on with LSORBIT = .TRUE. in the INCAR (more in §6.3 and §5.7).

Computing Effective Mass in VASP

With the theory in place, the actual VASP computation is a short recipe. Run a band structure (recipe from §5.1), find the band edge in the eigenvalue file, sample the band along a chosen direction densely enough that a parabola is a good fit (≥ 7 points within ±0.05 of the BZ in each direction), and least-squares fit a polynomial. The leading coefficient is $\hbar^{2}/2m^{*}$ ; divide and you have the mass.
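Before touching any VASP output, the fit-and-convert step can be rehearsed on synthetic data. The sketch below assumes k is already in Å⁻¹ and E in eV, so that $\hbar^{2}/m_{0} = 7.62$ eV·Å² and $m^{*}/m_{0} = 7.62/(2a)$ for a fitted leading coefficient $a$. The band is generated from an assumed mass of 0.067 with a little artificial noise; the helper name `fit_effective_mass` is hypothetical.

```python
import numpy as np

HBAR2_OVER_M0 = 7.6200   # hbar^2 / m0 in eV * Angstrom^2

def fit_effective_mass(k, energy):
    """Least-squares fit E = a k^2 + b k + c; returns (m*/m0, a, b, c)."""
    a, b, c = np.polyfit(k, energy, 2)
    return HBAR2_OVER_M0 / (2.0 * a), a, b, c

# Synthetic band edge: 9 points, assumed m* = 0.067 m0, noise of 0.1 meV
rng = np.random.default_rng(0)
k = np.linspace(-0.05, 0.05, 9)                       # 1/Angstrom
m_true = 0.067
energy = 1.74 + HBAR2_OVER_M0 * k**2 / (2.0 * m_true) + rng.normal(0.0, 1e-4, k.size)

m_fit, a, b, c = fit_effective_mass(k, energy)
print(f"fitted m*/m0 = {m_fit:.4f} (input {m_true}), band edge c = {c:.4f} eV")
```

A useful habit is to check that the fitted linear coefficient `b` is small compared with `a` times the window width; a large `b` usually means the window is not centred on the band edge.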

The INCAR — what to set, and why

The two-step SCF + non-SCF recipe from §5.1 carries over almost unchanged. The two extra knobs that matter for an accurate effective mass are k-point density and convergence of the eigenvalues at the band edge.

📝text

# INCAR (step 2: non-SCF along high-symmetry path, dense sampling near edge)
SYSTEM = CdSe zincblende, effective-mass extraction
PREC   = Accurate
ENCUT  = 400          # converged from §6.3 test
ICHARG = 11           # read CHGCAR, do not update
LORBIT = 11
ISMEAR = 0            # Gaussian smearing, for semiconductors
SIGMA  = 0.02         # tight smearing: we fit eigenvalues, not occupations
EDIFF  = 1E-7         # tighter than usual SCF: small Δk × small ΔE
NBANDS = 32
LSORBIT = .TRUE.      # spin-orbit on, matters for VBM in CdSe

📝text

Bands for effective-mass fit (step 2: line mode, 60 points/segment near Γ)
60               ! 60 points per segment for fine Δk
Line
Reciprocal

0.00 0.00 0.00  Gamma
0.05 0.00 0.00  near-Gamma-X    ! short segment, dense, for m_e* along ⟨100⟩

0.00 0.00 0.00  Gamma
0.05 0.05 0.00  near-Gamma-K    ! along ⟨110⟩, second principal direction

0.00 0.00 0.00  Gamma
0.05 0.05 0.05  near-Gamma-L    ! along ⟨111⟩, third

Why short segments instead of the full Γ→X path

The parabolic approximation only holds within ~5% of the BZ around the edge. If you fit a parabola to the entire Γ→X line (which goes all the way to the zone boundary) you will get the curvature averaged over a region where the band has already turned non-parabolic — often by a factor of 2 wrong. Always run a separate KPOINTS file with short, dense segments around the band edge for effective-mass extraction. For an anisotropic valley you need at least three independent directions to reconstruct the full tensor.
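The size of the error from fitting too wide a window can be estimated on a toy band. The sketch below uses a 1D cosine band, whose band-edge mass is known analytically, and compares a fit over ±5% of the zone with a fit over the whole zone. The band shape and the parameters `t` and `a_lat` are assumptions for illustration; real bands deviate from parabolic in their own way, so only the qualitative lesson carries over.

```python
import numpy as np

HBAR2_OVER_M0 = 7.6200   # hbar^2 / m0 in eV * Angstrom^2

def fitted_mass(k, energy):
    """m*/m0 from the leading coefficient of a quadratic least-squares fit."""
    a = np.polyfit(k, energy, 2)[0]
    return HBAR2_OVER_M0 / (2.0 * a)

# Hypothetical cosine band with minimum at k = 0
t, a_lat = 1.0, 5.0                      # eV, Angstrom

def band(k):
    return 2.0 * t * (1.0 - np.cos(k * a_lat))

# Exact band-edge mass, since E''(0) = 2 t a^2
m_exact = HBAR2_OVER_M0 / (2.0 * t * a_lat**2)

k_edge = np.pi / a_lat                   # zone boundary
k_local = np.linspace(-0.05, 0.05, 9) * k_edge
k_full = np.linspace(-1.0, 1.0, 121) * k_edge

m_local = fitted_mass(k_local, band(k_local))
m_full = fitted_mass(k_full, band(k_full))
print(f"exact {m_exact:.4f}, local fit {m_local:.4f}, full-zone fit {m_full:.4f}")
```

On this toy band the local fit lands within a fraction of a percent of the exact mass, while the full-zone fit overestimates it by roughly a factor of two.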

Parsing vasprun.xml — the parabolic fit, line by line

Many VASP post-processing pipelines start with pymatgen's Vasprun parser. The snippet below loads the band structure, locates the conduction band minimum, slices a 9-point window around it, fits a parabola, and prints the resulting effective mass. Each line of code is followed by an explanation of what it does and of the state it produces.

Effective-mass extraction — interactive trace

🐍effmass.py


import numpy as np

NumPy is a numerical computing library that provides the ndarray type plus fast linear-algebra routines written in C. We need it for vector norms (np.linalg.norm), the polynomial fit (np.polyfit), array indexing, and the argmin used to locate the closest k-point to the band edge.

EXECUTION STATE

numpy = Python library for fast array math. Used here for: norms, polyfit, argmin, slicing.

as np = Standard alias so we can write np.polyfit(...) instead of numpy.polyfit(...).

from pymatgen.io.vasp import Vasprun

pymatgen is the Materials Project's Python toolkit for solid-state calculations. Vasprun is the parser for vasprun.xml — VASP's master output file containing eigenvalues, k-points, lattice, INCAR, ionic positions, and the Fermi level. We import it so we can pull the band structure as Python objects instead of regex-parsing text.

EXECUTION STATE

📚 Vasprun = Class that reads vasprun.xml. Construct it once, then call methods like get_band_structure(), get_dos(), efermi.

# Step 1: load the band-structure run from VASP

Comment marking the loading stage. The code below assumes vasprun.xml comes from a non-self-consistent (ICHARG=11) run along a high-symmetry path — the standard recipe from §5.1.

vr = Vasprun("vasprun.xml")

Construct a Vasprun object by parsing the XML file in the current directory. The parser walks the entire DOM and stores eigenvalues, the lattice, the Fermi level, and the INCAR settings on the object. This is a one-shot read; the object is now self-contained.

EXECUTION STATE

📚 Vasprun(filename) = Constructor — parses vasprun.xml. Optional arguments include parse_dos=True, parse_eigen=True (default True), parse_projected_eigen=False. We accept the defaults: eigenvalues yes, projected eigenvalues no.

⬇ arg: "vasprun.xml" = Path to the file VASP wrote at the end of the run. By convention it sits in the run directory next to OUTCAR, EIGENVAL, and CHGCAR.

⬆ vr = A Vasprun object exposing: vr.efermi (float, eV), vr.final_structure (pymatgen Structure), vr.eigenvalues (dict spin → ndarray), vr.actual_kpoints (list of fractional coords).

bs = vr.get_band_structure(line_mode=True)

Convert the raw eigenvalue arrays into a high-level BandStructureSymmLine object: it knows the high-symmetry labels (Γ, X, L, …), the segmentation of the k-path, and how to ask for the band gap, the CBM, the VBM. line_mode=True tells pymatgen to interpret the k-points as a connected path, not a uniform mesh.

EXECUTION STATE

📚 vr.get_band_structure() = Returns a BandStructure (or BandStructureSymmLine if line_mode=True). The latter has helpers like get_cbm(), get_vbm(), get_band_gap(), branches[].

⬇ arg: line_mode=True = Tells pymatgen the k-points are a high-symmetry line path (KPOINTS file with 'Line' mode). With line_mode=False and no line-mode KPOINTS file alongside, you get back a plain BandStructure, which lacks the path-aware helpers such as branches.

⬆ bs = BandStructureSymmLine with: bs.bands (dict Spin → array of shape (n_bands, n_kpoints)), bs.kpoints (list of Kpoint objects), bs.efermi (Fermi level), bs.is_metal(), bs.get_cbm(), bs.get_vbm().

# Step 2: locate the conduction band minimum (CBM)

Comment marking the CBM-finding stage. The CBM is the lowest unoccupied energy in the band structure; for a parabolic fit, this is the centre of our window.

cbm = bs.get_cbm()

Returns a dictionary describing the conduction band minimum: its energy, the k-point at which it occurs, and which band index(es) reach it. For a direct-gap semiconductor like CdSe the CBM is at Γ; for Si it is between Γ and X.

EXECUTION STATE

📚 bs.get_cbm() = Pymatgen helper: scans bands above the Fermi level, finds the minimum, returns a dict with keys 'energy', 'kpoint', 'kpoint_index', 'band_index'.

⬆ cbm = {'energy': 1.74, 'kpoint': <Kpoint Γ frac=[0,0,0]>, 'band_index': {Spin.up: [16]}, 'kpoint_index': [120]} — example values for CdSe.

E0 = cbm["energy"]

Pull out the CBM energy in eV. We will subtract it from every E(k) to give us the offset ΔE used in the parabolic fit. Setting E0 as the origin is critical — np.polyfit will fit E(k) ≈ a·k² + b·k + c, and we want b ≈ 0 and c ≈ 0 so the leading coefficient a is unambiguously ℏ²/2m*.

EXECUTION STATE

E0 = 1.74 eV — the conduction band minimum, with the Fermi level set to zero. CdSe's experimental gap is ~1.74 eV at room temperature; PBE+U usually gives ~0.7 eV, HSE06 gives ~1.7 eV.

k0 = cbm["kpoint"].frac_coords

Pull out the location of the CBM in reciprocal space, expressed as fractional coordinates of the reciprocal lattice of the cell used in the calculation (so the numbers depend on whether that cell is primitive or conventional). For a Γ-point CBM you get k0 = [0, 0, 0].

EXECUTION STATE

📚 .frac_coords = Attribute of a pymatgen Kpoint: returns the (k_x, k_y, k_z) triplet in fractional units of the reciprocal lattice. Multiply by 2π/a_i to get Å⁻¹ along each axis (only when the basis is orthogonal — for non-cubic cells use the full reciprocal lattice matrix).

⬆ k0 = array([0.0, 0.0, 0.0]) — CdSe is a direct-gap material at Γ.
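For cells whose reciprocal basis is not orthogonal, the fractional-to-Cartesian conversion needs the full reciprocal lattice matrix. The sketch below does this with plain NumPy rather than pymatgen; the helper name `frac_to_cart`, the fcc primitive cell and the lattice constant of 6.05 Å are assumptions used only to have something to check against.

```python
import numpy as np

def frac_to_cart(k_frac, lattice):
    """Convert fractional k-coordinates to Cartesian 1/Angstrom.

    lattice: 3x3 array whose rows are the real-space vectors a1, a2, a3 (Angstrom).
    """
    recip = 2.0 * np.pi * np.linalg.inv(lattice).T   # rows are b1, b2, b3
    return np.asarray(k_frac, dtype=float) @ recip

# Hypothetical fcc primitive cell, a = 6.05 Angstrom
a = 6.05
fcc = 0.5 * a * np.array([[0.0, 1.0, 1.0],
                          [1.0, 0.0, 1.0],
                          [1.0, 1.0, 0.0]])

k_x_point = frac_to_cart([0.5, 0.0, 0.5], fcc)   # X point in this basis
print("X in Cartesian (1/A):", np.round(k_x_point, 4))
print("|X| =", np.linalg.norm(k_x_point), " 2*pi/a =", 2.0 * np.pi / a)
```

Distances |Δk| for the parabolic fit should be computed from these Cartesian vectors, not from the fractional coordinates.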

# Step 3: extract eigenvalues E(k) for the CBM band

Comment marking the eigenvalue-extraction stage. We need the array of E(k_i) for every k-point on the path, restricted to the single band that touches the CBM.

band_idx = list(cbm["band_index"].values())[0][0]

Dig through the nested structure to get a plain integer band index. cbm['band_index'] is a dict {Spin.up: [16]}; .values() gives [[16]]; list(...)[0] gives [16]; [0] gives 16. We take the first value because for a non-spin-polarised semiconductor only one spin channel matters.

EXECUTION STATE

🧩 unpacking cbm['band_index'] = {Spin.up: [16]} → list(.values()) → [[16]] → [0] → [16] → [0] → 16

⬆ band_idx = 16 — the 17th band (0-indexed) is the lowest conduction band of CdSe in this calculation.

spin = list(bs.bands.keys())[0]

Grab the first spin channel key from bs.bands. For ISPIN=1 (no spin polarisation) there is only Spin.up. For ISPIN=2 there would be both — and you'd loop. Using list(.keys())[0] is a defensive way to get the key without hard-coding pymatgen's Spin enum.

EXECUTION STATE

📚 bs.bands = Dict {Spin.up: ndarray(n_bands, n_kpoints), Spin.down: ndarray(...)} — the eigenvalues. Spin is a pymatgen enum (Spin.up = 1, Spin.down = -1).

⬆ spin = Spin.up — the only key when ISPIN=1.
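For a spin-polarised run the same extraction would be repeated per channel. The sketch below uses a stand-in Spin enum and random arrays in place of pymatgen objects (both are assumptions for illustration only) to show the loop over bs.bands-style data.

```python
from enum import Enum
import numpy as np

class Spin(Enum):
    """Stand-in for pymatgen's Spin enum (same values, illustration only)."""
    up = 1
    down = -1

# Fake eigenvalues: 4 bands on 5 k-points for each spin channel.
rng = np.random.default_rng(0)
bands = {Spin.up: rng.normal(size=(4, 5)), Spin.down: rng.normal(size=(4, 5))}

band_idx = 2  # hypothetical band of interest
# One E(k) curve per spin channel, each of shape (n_kpoints,).
curves = {spin: eigs[band_idx] for spin, eigs in bands.items()}

for spin, curve in curves.items():
    print(spin.name, curve.shape)
```

Each curve can then be windowed and fitted separately, giving one effective mass per spin channel.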

E_band = bs.bands[spin][band_idx]

Two-level indexing into the eigenvalue array. bs.bands[spin] is a 2D array with shape (n_bands, n_kpoints); we then take row band_idx, giving a 1D array E(k_i) for every k-point along the path. This is the curve we will fit.

EXECUTION STATE

⬆ E_band = ndarray of shape (n_kpoints,) — for a typical Γ→X→W→K→Γ→L path with 40 points per segment, that is 200 eigenvalues, one per k-point on the path.

k_frac = np.array([k.frac_coords for k in bs.kpoints])

Build a (n_kpoints, 3) array of k-point fractional coordinates by list-comprehending over bs.kpoints. We need this both to find the CBM index along the path and to compute |Δk| in Step 4.

EXECUTION STATE

🧩 list comprehension = [k.frac_coords for k in bs.kpoints] — produces a Python list of length n_kpoints, each entry a length-3 NumPy array.

📚 np.array(...) = Wrap the Python list in an ndarray for vectorised math (norm, slicing). The list of length-3 arrays is stacked into shape (N, 3).

⬆ k_frac =

ndarray of shape (n_kpoints, 3). Example first three rows: [[0,0,0], [0.025,0,0], [0.05,0,0]] for a Γ→X path.

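# Step 4 — pick a 9-point window around k0 and convert to Å⁻¹

Comment marking the windowing stage. A parabolic fit is only valid in the immediate vicinity of the band edge. With the fractional spacing of 0.025 used in this example, nine points span ±0.1 in fractional coordinates around k0, which is usually close enough to the edge for the parabolic (k·p) approximation; the fit residual tells you whether it actually is.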

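center = int(np.argmin(np.linalg.norm(k_frac - k0, axis=1)))

Find the index of the k-point along the path that is closest to k0 (the CBM). np.linalg.norm with axis=1 computes the Euclidean length of each row; np.argmin returns the row index of the minimum and, when several points tie, the first of them. The int(...) cast turns the numpy.int64 into a plain Python int; slicing accepts either, so the cast is only for tidiness. If k0 sits at the very start of the path (for example a path that begins at Γ), the first match is index 0 and the window center-4:center+5 in the next lines has no room on the left, so make sure the match you use lies in the interior of the path.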

EXECUTION STATE

📚 np.linalg.norm(x, axis=1) = Computes √(sum of squares) along axis 1. For a (N,3) array, returns a length-N vector of distances. axis=1 means 'reduce columns' → one number per row.

🧩 k_frac − k0 = Broadcasting: (n_kpoints, 3) − (3,) = (n_kpoints, 3). Each row holds the displacement from k0.

📚 np.argmin = Returns the index of the smallest element. For a 1D array, returns a single integer. We use it to find the path-index of the CBM.

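⬆ center = 0 — for a path that starts at Γ, such as the Γ→X→W→K→Γ→L example, because np.argmin reports the first match. The interior return to Γ sits near index 160 (four segments of 40 points each); the window values shown below assume center points at an interior occurrence like that one.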
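Because np.argmin returns the first index when several k-points are equally close to k0, a path that visits the band edge more than once needs a little care. The sketch below uses a hypothetical helper, pick_center, and a made-up three-segment path to pick the first occurrence that leaves room for the full window.

```python
import numpy as np

def pick_center(k_frac, k0, half_width=4, tol=1e-8):
    """Return the first path index closest to k0 with half_width points on both sides."""
    dist = np.linalg.norm(k_frac - k0, axis=1)
    hits = np.flatnonzero(dist <= dist.min() + tol)  # all equally close points
    for idx in hits:
        if idx - half_width >= 0 and idx + half_width < len(k_frac):
            return int(idx)
    raise ValueError("no occurrence of k0 leaves room for the window")

# Made-up path: Γ→X, X→Γ, Γ→L with 10 points per segment (segment endpoints repeated).
t = np.linspace(0.0, 0.5, 10)
seg1 = np.outer(t, [1.0, 0.0, 0.0])
seg2 = seg1[::-1]
seg3 = np.outer(t, [1.0, 1.0, 1.0])
k_frac = np.vstack([seg1, seg2, seg3])
k0 = np.zeros(3)

naive = int(np.argmin(np.linalg.norm(k_frac - k0, axis=1)))
center = pick_center(k_frac, k0)
print(naive, center)  # prints: 0 19
```

A window around the chosen index still straddles the join between two segments, where the endpoint is duplicated and the direction changes, so for an anisotropic minimum it is safer to fit each side of the join separately.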

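a_lat = vr.final_structure.lattice.abc[0]

Pull the lattice constant a (in Å) from the relaxed structure. .lattice.abc returns the (a, b, c) triple; for a cubic crystal these are equal, and we take the first one. We need a to convert fractional k-coordinates to absolute Å⁻¹ via Δk_abs = (2π/a) · Δk_frac. This works when the calculation uses the conventional cubic cell; for a primitive fcc cell abc[0] is a/√2, and the full reciprocal lattice matrix should be used instead.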

EXECUTION STATE

📚 .final_structure = The structure at the end of the run (after relaxation, if the run took ionic steps, i.e. NSW > 0). For a static run it equals the input POSCAR structure.

📚 .lattice.abc = Tuple (a, b, c) in Å — the lengths of the three lattice vectors. For CdSe zincblende: a = b = c ≈ 6.05 Å.

⬆ a_lat = 6.05 — Å. The conventional cubic edge of zincblende CdSe.

dk = np.linalg.norm(k_frac[center-4:center+5] - k0, axis=1) * 2*np.pi / a_lat

Compute |Δk| for the 9-point window in Å⁻¹. Slice center-4 to center+5 gives 9 points (Python slice excludes the upper bound). Subtracting k0 gives displacements; np.linalg.norm reduces each to a scalar magnitude; multiplying by 2π/a converts from fractional units to absolute reciprocal-space units. Because the norm discards the sign, points on both sides of k0 are folded onto dk ≥ 0. That is harmless for an isotropic minimum but averages two directions when the window straddles a high-symmetry point. This formula assumes a cubic cell; for non-cubic cells, multiply by the reciprocal lattice matrix instead.

EXECUTION STATE

🧩 slice center-4:center+5 = Selects 9 consecutive k-points centred on center. Python slice convention means [a:b] gives indices a, a+1, …, b−1 (length b−a). So center-4:center+5 has length 9.

🧩 axis=1 = Norm reduces along the second axis (the 3 components of each k-vector), giving one |Δk| per k-point. Output shape: (9,).

🧩 *2π/a_lat = Conversion factor. A fractional displacement of 1 corresponds to one reciprocal-lattice vector of length 2π/a_lat in a cubic cell.

⬆ dk = [0.104, 0.078, 0.052, 0.026, 0.000, 0.026, 0.052, 0.078, 0.104] — Å⁻¹. (The step of 0.026 Å⁻¹ is a fractional spacing of 0.025 times 2π/6.05 Å; dk[4] = 0 is the CBM itself, matching dE[4] ≈ 0.)

dE = E_band[center-4:center+5] - E0

Build the matching ΔE vector by slicing the band along the same 9-point window and subtracting the CBM energy. Now dE[i] is the energy offset above the CBM at displacement dk[i]. By construction dE[4] ≈ 0.

EXECUTION STATE

⬆ dE = [0.025, 0.014, 0.006, 0.002, 0.000, 0.002, 0.006, 0.014, 0.025] — eV. Roughly symmetric around the central minimum: a textbook parabola.

# Step 5 — fit a parabola E(k) = (ℏ²/2m*) k² and read off m*

Comment marking the fit. Around a band extremum the dispersion is parabolic to leading order by Taylor expansion (the linear term vanishes because the first derivative is zero at an extremum). The leading coefficient of the parabola is ℏ²/2m*.

coeff_a = np.polyfit(dk, dE, deg=2)[0]

Fit a degree-2 polynomial dE = a·dk² + b·dk + c by ordinary least squares. np.polyfit returns the coefficients in descending order of power: [a, b, c]. We grab [0] = a, the curvature coefficient. Because we centred dk and dE on the minimum, b should be ≈ 0 and c ≈ 0.

EXECUTION STATE

📚 np.polyfit(x, y, deg) = Least-squares polynomial fit. Returns coefficients [a_n, a_{n-1}, …, a_0] where y = a_n x^n + … + a_0. For deg=2: [a, b, c] with y = a x² + b x + c.

⬇ arg: dk = [0.104, 0.078, …, 0.104] — the x-data, |Δk| in Å⁻¹ (length 9).

⬇ arg: dE = [0.025, 0.014, …, 0.025] — the y-data, E offset in eV (length 9).

⬇ arg: deg=2 = Fit a quadratic. deg=1 would fit a line (and miss the curvature); deg=4 would over-fit nine noisy points.

🧩 [0] = Take only the leading (a) coefficient. The full return is [a, b, c]; [0] indexes a. We discard b and c — by construction they should be tiny.

⬆ coeff_a = 5.45 eV·Å² — the curvature. By the formula a = ℏ²/2m*, a small a means a heavy mass; a large a means a light mass.
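A quick way to build trust in the fit is to run it on a synthetic parabola whose mass is known in advance. The sketch below is illustrative: m_true = 0.13 and the 0.026 Å⁻¹ spacing are assumed values, and it uses signed displacements rather than |Δk|.

```python
import numpy as np

HBAR2_OVER_2M0 = 3.80998212  # eV·Å², the constant used in this section

m_true = 0.13                      # assumed mass in units of m0 (illustrative)
dk = 0.026 * np.arange(-4, 5)      # Å⁻¹, nine signed displacements around the minimum
dE = HBAR2_OVER_2M0 / m_true * dk**2   # eV, exact parabola with no noise

# np.polyfit returns [a, b, c] for dE = a*dk**2 + b*dk + c.
a, b, c = np.polyfit(dk, dE, deg=2)
m_fit = HBAR2_OVER_2M0 / a

print(f"a = {a:.3f} eV·Å², b = {b:.1e}, c = {c:.1e}, m*/m0 = {m_fit:.4f}")
```

On clean data the fit recovers the input mass and returns b and c at the level of rounding error; on real eigenvalues, sizeable b or c is a hint that the window is off-centre or not parabolic.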

m_star = 3.80998212 / coeff_a

Divide the constant ℏ²/2m₀ = 3.80998 eV·Å² by the fitted curvature to get m*/m₀. The constant follows from the CODATA values ℏ = 1.054571817×10⁻³⁴ J·s and m₀ = 9.1093837×10⁻³¹ kg, converted with 1 eV = 1.602176634×10⁻¹⁹ J and 1 Å = 10⁻¹⁰ m.

EXECUTION STATE

🧩 3.80998212 = ℏ²/2m₀ in eV·Å². Memorise this number, because it appears every time you connect a parabolic band to an effective mass. (In atomic units the same quantity is ½ Hartree·Bohr², with 1 Hartree = 27.211 eV and 1 Bohr = 0.529 Å.)

🧩 / coeff_a = Solving E(k) = (ℏ²/2m*) k² for m*: m* = (ℏ²/2) / a. Dividing by m₀ gives the dimensionless m*/m₀.

⬆ m_star = 0.699 — m*/m₀ for CdSe's electron pocket. (Experimental ~0.13 m₀; PBE+SOC gives ~0.10 m₀; this hand-tuned number is illustrative only.)
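The constant and the final division can be reproduced in a few lines. This is a minimal check using the CODATA values quoted in this section and the illustrative coeff_a = 5.45 eV·Å² from the walkthrough.

```python
# Physical constants in SI units (values as quoted in this section).
hbar = 1.054571817e-34      # J·s
m0 = 9.1093837e-31          # kg
eV = 1.602176634e-19        # J per eV
angstrom = 1.0e-10          # m per Å

# ħ²/2m0 in J·m², then converted to eV·Å².
const_eV_A2 = hbar**2 / (2.0 * m0) / eV / angstrom**2

coeff_a = 5.45              # eV·Å², illustrative curvature from the walkthrough
m_star = const_eV_A2 / coeff_a

print(f"hbar^2/2m0 = {const_eV_A2:.5f} eV·Å²")
print(f"m*/m0 = {m_star:.3f}")
```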

print(f"Effective mass m* = {m_star:.3f} m₀")

Print the result. The f-string formatter :.3f renders the float to three decimal places. For real production use you would also report the residual of the fit (np.polyfit can return it via full=True) so you know whether a parabola was actually a good model in this window.

EXECUTION STATE

🧩 f-string = Python literal-template syntax: f"...{var:.3f}..." embeds var formatted to 3 decimals.

⬆ stdout = Effective mass m* = 0.699 m₀
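Reporting the fit residual could look like the sketch below. The band here is synthetic (a parabola plus a made-up quartic correction), so the numbers are illustrative; only the np.polyfit call with full=True mirrors what you would do on real data.

```python
import numpy as np

# Synthetic, illustrative band: parabola plus a quartic correction (made-up coefficients).
dk = 0.026 * np.arange(-4, 5)            # Å⁻¹, signed displacements
dE = 29.3 * dk**2 - 150.0 * dk**4        # eV

# full=True also returns the sum of squared residuals, the rank, and more.
coeffs, residuals, rank, singular_values, rcond = np.polyfit(dk, dE, deg=2, full=True)

rms = float(np.sqrt(residuals[0] / dk.size))   # root-mean-square misfit in eV
print(f"a = {coeffs[0]:.2f} eV·Å², RMS misfit = {1e3 * rms:.2f} meV")
```

An RMS misfit that is not small compared with the energy range of the window suggests shrinking the window or moving to a non-parabolic model.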


import numpy as np
from pymatgen.io.vasp import Vasprun

# Step 1 — load the band-structure run from VASP
vr = Vasprun("vasprun.xml")
bs = vr.get_band_structure(line_mode=True)

# Step 2 — locate the conduction band minimum (CBM)
cbm = bs.get_cbm()
E0  = cbm["energy"]                    # CBM energy (eV)
k0  = cbm["kpoint"].frac_coords        # CBM location (fractional)

# Step 3 — extract eigenvalues E(k) for the CBM band
band_idx = list(cbm["band_index"].values())[0][0]
spin     = list(bs.bands.keys())[0]
E_band   = bs.bands[spin][band_idx]    # all energies along the path
k_frac   = np.array([k.frac_coords for k in bs.kpoints])

# Step 4 — pick a 9-point window around k0 and convert to Å⁻¹
# Assumes center >= 4 (np.argmin returns the first match) and a conventional cubic cell.
center = int(np.argmin(np.linalg.norm(k_frac - k0, axis=1)))
a_lat  = vr.final_structure.lattice.abc[0]
dk     = np.linalg.norm(k_frac[center-4:center+5] - k0, axis=1) * 2*np.pi / a_lat
dE     = E_band[center-4:center+5] - E0

# Step 5 — fit a parabola E(k) = (ℏ²/2m*) k² and read off m*
coeff_a = np.polyfit(dk, dE, deg=2)[0]   # leading coefficient (eV·Å²)
m_star  = 3.80998212 / coeff_a            # ℏ²/2m₀ = 3.80998 eV·Å²

print(f"Effective mass m* = {m_star:.3f} m₀")

Production-grade upgrades

Use a higher-order fit with cross-validated order. np.polyfit(dk, dE, deg=4) gives the k² and k⁴ coefficients, which fix the second and fourth derivatives at the band edge; the latter measures non-parabolicity and is essential for narrow-gap semiconductors like InSb, where the conduction band is famously non-parabolic.
Fit a 3D paraboloid when you sample multiple directions: scipy.optimize.curve_fit with the model $E = E_0 + \sum_i (\hbar^{2}/2 m_i^{*})\,(\Delta k_i)^{2}$ extracts the full mass tensor in one shot.
Use the EffectiveMass class from effmass (a community Python package built specifically for this) — it handles non-parabolicity, finite-temperature occupations, and the Boltzmann transport integrals for you. Cite Whalley, J. Open Source Softw. 3, 797 (2018).
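The paraboloid fit from the list above can be sketched on synthetic data. Two assumptions are worth flagging: the principal axes are taken to coincide with the Cartesian axes, and because the model is linear in 1/m_i the sketch uses np.linalg.lstsq in place of scipy.optimize.curve_fit. The masses and sampling range are made up for illustration.

```python
import numpy as np

HBAR2_OVER_2M0 = 3.80998212                 # eV·Å²

m_true = np.array([0.9, 0.2, 0.2])          # hypothetical principal masses (units of m0)
rng = np.random.default_rng(1)
dk = rng.uniform(-0.05, 0.05, size=(50, 3)) # Å⁻¹, random displacements around k0

# Synthetic energies: dE = sum_i (ħ²/2m_i) * dk_i², with E0 already subtracted.
dE = (HBAR2_OVER_2M0 / m_true * dk**2).sum(axis=1)

# Linear least squares for the three curvatures c_i = ħ²/2m_i.
design = dk**2
curvatures, *_ = np.linalg.lstsq(design, dE, rcond=None)
m_fit = HBAR2_OVER_2M0 / curvatures

print(np.round(m_fit, 4))
```

If the valley axes are rotated relative to the Cartesian frame, the design matrix would also need the cross terms dk_i·dk_j, and the principal masses then come from diagonalising the fitted curvature matrix.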

Summary

We started with a puzzle — that an electron in a crystal responds to force as if its mass had been silently rewritten — and we resolved it with a single, beautiful identity:

(m^{*})^{-1}_{ij} \;=\; \frac{1}{\hbar^{2}}\,\frac{\partial^{2} E_n(\mathbf{k})}{\partial k_i\,\partial k_j}

Curvature in k-space is, literally, an inverse mass. From this one relation, with no further heavy lifting, we deduced:

The parabolic-band approximation $E(k) = E_{0} + \hbar^{2}(k-k_{0})^{2}/2 m^{*}$ — the universal local description of every band edge.
The effective-mass tensor for anisotropic valleys, with three principal masses; and the difference between $m^{*}$ (band curvature), $m_{\text{DOS}}^{*}$ (geometric mean of principal masses, used for state counting), and $m_{\sigma}^{*}$ (harmonic mean, used for conductivity).
The hole concept — a missing electron at a band maximum, with positive charge, positive mass, and ordinary dynamics, accounting for the negative curvature with a sign flip in every equation simultaneously.
The Drude formulae $\mu = e\tau/m^{*}$ and $\sigma = ne^{2}\tau/m^{*}$ , which turn a single number on a band-structure plot into a quantitative prediction of every bulk transport coefficient — and the interactive sandbox showed why scattering is the physics that makes those formulae finite.
A VASP recipe with a parabolic fit to vasprun.xml, fully traceable line by line, that gives $m^{*}/m_{0}$ in under 30 lines of Python.
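The three masses and the Drude mobility summarised above can be tied together numerically. The sketch below assumes a single ellipsoidal valley with hypothetical principal masses (0.9, 0.2, 0.2) m₀ and a hypothetical scattering time of 0.1 ps; valley degeneracy is deliberately left out of the density-of-states mass.

```python
import numpy as np

# Hypothetical ellipsoidal valley: one longitudinal and two transverse masses (units of m0).
m_l, m_t = 0.9, 0.2
inv_mass = np.diag([1.0 / m_l, 1.0 / m_t, 1.0 / m_t])   # inverse-mass tensor

# Principal masses are the reciprocals of the tensor's eigenvalues.
principal = 1.0 / np.linalg.eigvalsh(inv_mass)

m_dos = float(np.prod(principal) ** (1.0 / 3.0))   # geometric mean (single valley)
m_sigma = float(3.0 / np.sum(1.0 / principal))     # harmonic mean

# Drude mobility mu = e*tau/m*, converted from m²/(V·s) to cm²/(V·s).
e = 1.602176634e-19      # C
m0 = 9.1093837e-31       # kg
tau = 1.0e-13            # s, hypothetical scattering time
mu_cm2 = e * tau / (m_sigma * m0) * 1.0e4

print(f"m_DOS = {m_dos:.3f} m0, m_sigma = {m_sigma:.3f} m0, mu = {mu_cm2:.0f} cm²/(V·s)")
```

The harmonic mean is dominated by the light transverse mass, which is why the conductivity mass comes out smaller than the density-of-states mass for an elongated valley.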

In the next section (5.5) we use these masses to compute the optical properties of a semiconductor — the absorption edge, the dielectric tensor, and the joint density of states are all written in the language of effective masses and the parabolic-band approximation we built here.