Boo-AI — Master Artificial Intelligence by Building from Scratch

Learning Objectives

The previous seven sections of this chapter built up reciprocal space — the reciprocal lattice (3.2), the Brillouin zone (3.3), high-symmetry paths (3.4), Bloch's theorem (3.5), the structure factor (3.6), and systematic absences (3.7). Now we put it to work. Every electronic-structure code, VASP included, replaces continuous Brillouin-zone integrals by discrete sums over a finite mesh of k-points. Choosing that mesh is the single most consequential numerical decision in any DFT calculation. By the end of this section you should be able to:

Explain why integration over the BZ is unavoidable, and what quantity is actually being averaged.
Translate the integral $\frac{V}{(2\pi)^3}\int_{\text{BZ}} f(\mathbf{k})\,d\mathbf{k}$ into the discrete sum $\sum_{\mathbf{k}} w_{\mathbf{k}}\,f(\mathbf{k})$ and know what the weights $w_{\mathbf{k}}$ mean.
State the inverse-volume rule: $(\text{real cell side}) \times (\text{k-mesh density}) \;=\; \text{constant}$ . A 2×2×2 supercell needs half the k-mesh of the primitive cell per axis.
Recognise that convergence is not always monotonic — band crossings, Fermi surfaces, and grid-edge effects can make the energy oscillate before it settles.
Read VASP's KPOINTS file and understand what each line of the simplest automatic form actually requests.
Run a clean convergence test in Python or VASP and decide when the answer is "converged enough" for the question you are asking.

One-line preview: a k-point grid is a Riemann-sum discretisation of the Brillouin zone. The denser it is, the closer the sum gets to the true integral — but the right density depends on the smoothness of the integrand and the size of your real-space cell.

Why Do We Sample at All?

Bloch's theorem (Section 3.5) tells us that every single-particle eigenstate of a periodic crystal can be labelled by a wavevector $\mathbf{k}$ living in the first Brillouin zone and a band index $n$ . The total energy of the electrons, the charge density, the density of states, the optical spectrum — every observable in this chapter and the next — is built from a Brillouin-zone average of band quantities. For total energy at zero temperature:

E_{\text{band}} \;=\; \frac{V_{\text{cell}}}{(2\pi)^3} \sum_n \int_{\text{BZ}} f(\varepsilon_{n\mathbf{k}})\,\varepsilon_{n\mathbf{k}}\;d\mathbf{k},

where $f(\varepsilon)$ is the Fermi–Dirac occupation (effectively a step function at $T = 0$ for insulators, smeared by temperature for metals). The factor $V_{\text{cell}}/(2\pi)^3$ converts the integral into a per-cell quantity using the result we proved in Section 3.2 — the reciprocal-cell volume is $(2\pi)^3/V_{\text{cell}}$ .

For an infinite crystal this is a clean continuous integral. For a computer it is something we must approximate, because the integrand $\varepsilon_{n\mathbf{k}}$ is the eigenvalue of the Kohn–Sham Hamiltonian at $\mathbf{k}$ — every evaluation costs an entire diagonalisation. We pick a finite set of $\mathbf{k}$ -points, diagonalise once at each, and add the results with weights.

One band, one matrix, one k-point

Why does VASP need many k-points? Because it solves a separate Kohn–Sham eigenvalue problem at every k. The wavefunctions $\psi_{n\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}}u_{n\mathbf{k}}(\mathbf{r})$ have a different periodic part $u_{n\mathbf{k}}$ at each $\mathbf{k}$ . Doubling your k-mesh roughly doubles the wall-clock cost. That is the budget you are spending when you choose a mesh — make it count.

From an Integral to a Weighted Sum

Replace the integral by a Riemann-style sum:

\frac{V_{\text{cell}}}{(2\pi)^3}\int_{\text{BZ}} f(\mathbf{k})\,d\mathbf{k} \;\approx\; \sum_{\mathbf{k}\in\text{mesh}} w_{\mathbf{k}}\,f(\mathbf{k}),\quad \sum_{\mathbf{k}} w_{\mathbf{k}} = 1.

Three ideas are bundled into that line.

The mesh. A discrete subset of points in the BZ. The most popular choice (Section 3.9) is a regular $N_1\times N_2\times N_3$ grid — Monkhorst–Pack — but other meshes (Γ-only, Chadi–Cohen, hybrid) are common.
The weights. $w_{\mathbf{k}} = (V_{\text{BZ}}/N_{\text{tot}}) / V_{\text{BZ}} = 1/N_{\text{tot}}$ for an unfolded uniform grid. After symmetry reduction (Section 3.4 taught us what symmetries do to k), points related by a symmetry operation are merged into one representative whose weight equals its orbit size. So in a VASP IBZKPT file, weights of 1, 4, 6, 8, … encode "how many full-BZ partners does this irreducible point stand in for?".
The convergence claim. If $f(\mathbf{k})$ is smooth (band-gap material, insulator) the sum approaches the integral with error that falls off rapidly in the mesh density. If $f(\mathbf{k})$ has a kink — a Fermi surface, a Dirac cone — convergence is much slower and requires either denser meshes or special tricks (tetrahedron method, Gaussian smearing).

Read the discrete formula as "average f over the mesh, weighting each point by how much of the BZ it stands for". The mesh is your ruler; the weights are the tick spacing.

The Inverse-Volume Rule

How dense should the mesh be? Section 3.2 already gave the answer in compressed form: $V_{\text{BZ}} = (2\pi)^3 / V_{\text{cell}}$ . Spelt out for a uniform grid:

\Delta k_i \;=\; \frac{|\mathbf{b}_i|}{N_i} \;=\; \frac{2\pi}{a_i\,N_i}\quad\text{(orthorhombic, primitive)}.

The thing that matters for accuracy is $\Delta k_i$ , not $N_i$ . A 2×2×2 supercell has each side $a_i$ twice as long, so its reciprocal vector $|\mathbf{b}_i|$ is twice as short, and a mesh with $N_i = 4$ on the supercell achieves the same $\Delta k_i$ as $N_i = 8$ on the primitive cell. Doubling the cell halves the required N.

The single-most-useful k-point fact

Two cells of the same material at converged k-mesh density should give (essentially) the same energy per atom. If you double the supercell and forget to halve the k-grid, you are wasting CPU. If you keep $N_i$ fixed when you grow the cell, you are also wasting CPU and getting more accuracy than you can use. Always think in terms of mesh density, not mesh size.

VASP exposes this directly: the KSPACING tag in INCAR sets $\Delta k$ in Å⁻¹; VASP then picks the smallest $N_i$ that satisfies it. A typical metallic calculation uses KSPACING = 0.2 (Å⁻¹); a typical insulator uses 0.3–0.4 Å⁻¹. We will see why in a moment.

Real cell	\|b₁\| (Å⁻¹)	ΔK = 0.3 Å⁻¹ → N₁	ΔK = 0.2 Å⁻¹ → N₁
Cu primitive (a = 3.61 Å)	1.74	6	9
Cu 2×2×2 conventional (a = 7.22 Å)	0.87	3	5
GaN wurtzite (a = 3.19 Å)	1.97	7	10
GaN 4×4×2 supercell	0.49 / 0.49 / 1.21	2 / 2 / 5	3 / 3 / 7

Interactive — k-Grid in a 2D Brillouin Zone

Slide the controls below to change the grid density $N$ on a 2-D square Brillouin zone (a tight-binding analogue of a Cu (001) surface, with $\varepsilon(\mathbf{k}) = -2t\,(\cos k_x a + \cos k_y a)$ ). The colour map is the band; cyan dots are filled k-points ( $\varepsilon < 0$ ), pink dots are empty. The right panel reports the current sum estimate, the "exact" value computed on a 256×256 reference grid, and the absolute error. The log–log convergence curve below shows how the error falls as you slide $N$ upward.

First Brillouin Zone — 2D square lattice (a = 4 Å)

filled (ε < 0)empty (ε > 0)

Controls

Grid density N × N4 × 4 = 16 k-pts

Γ-centred shift (½, ½)show ε(k) heatmap

⟨ε⟩ this grid

-2.82843 eV

exact (256×256)

-1.63386 eV

|error|

1.19456 eV

Convergence — error vs N (log–log)

The BZ is a square of side 2π/a. Your N×N grid samples it uniformly. The cyan curve is the exact band-average error vs. N on a log–log axis — for this smooth band the error falls roughly like 1/N², the signature of well-behaved BZ integration. Toggle Γ-centred shift off to see how an unshifted grid that hits the band edges converges more slowly.

Three things to play with:

Slide N up. Watch the error fall. Notice it does not fall monotonically — it overshoots and undershoots. The orange ring on the convergence plot is your current $N$ .
Toggle Γ-centred shift. Without the half-step, the unshifted grid hits the BZ edges (and the Fermi-surface line at $k_x = \pm\pi/a$ ) directly. Edge points are ambiguous: do they belong to this BZ or the next? Convergence slows down. The shift dodges this by sampling only the interior of each mesh cell.
Compare even and odd N. Near band-crossing lines, an even- $N$ grid that straddles the crossing can be lucky or unlucky. The oscillation in the curve is real — it is the metallic Fermi-surface signature of the integrand having a kink at $\varepsilon = 0$ .

What the visualisation is really showing

For a smooth, fully gapped integrand the convergence would be very fast — error falls roughly as $1/N^2$ or better. The wiggle you see for this half-filled metal is the honest answer: metals are harder to converge than insulators, and the harder they are, the denser the mesh you must afford.

How Convergence Actually Behaves

Three regimes show up in practice. Knowing which one you are in is half the battle.

1. Insulators and large-gap semiconductors

Smooth integrand, no occupation discontinuity. Errors fall like a polynomial — for a 3-D Monkhorst–Pack grid, like $1/N^4$ for many ground-state quantities. A coarse 4×4×4 mesh on the primitive Si cell (8 atoms in the conventional cube) is already excellent; 6×6×6 is overkill for most properties.

2. Metals

The integrand has a step at the Fermi surface; the analogue of taking the Riemann sum of a step function. Bare uniform grids converge as slowly as $1/N$ . The cure is smearing — replace the abrupt step $f(\varepsilon)$ by a smoothed version, recover smoothness, and pay a tiny accuracy tax. VASP supports several flavours:

ISMEAR = -5 — tetrahedron with Blöchl correction. Best for static properties of metals; needs a Γ-centred mesh and at least 4 k-points in the irreducible BZ.
ISMEAR = 1, 2 — Methfessel–Paxton of order 1 or 2. Excellent for total energies and forces in metals at modest mesh density.
ISMEAR = 0, SIGMA = 0.05 — Gaussian smearing. Robust default for ionic relaxations of metals.

3. Narrow-gap, doped, or magnetic systems

Treat as metal-like for k-point convergence. Doped semiconductors have a Fermi level inside a band; magnetic systems with spin polarisation have Fermi surfaces in each spin channel. Both want the dense mesh + smearing combo, not the sparse-mesh insulator workflow.

What "converged" means

Pick the property you care about — total energy per atom, lattice constant, magnetic moment, band gap, vacancy formation energy. Run the calculation at meshes $N$ , $N+2$ , $N+4$ . Stop when consecutive answers agree to your physics tolerance: ~1 meV per atom for total energy, ~0.005 Å for lattice constants, ~0.01 $\mu_B$ for moments. The tolerance is set by the question, not by VASP's default printout precision.

Choosing a Smearing — Gaussian, MP, Tetrahedron

We just hand-waved the metal cure as "smearing". That word hides four distinct numerical schemes — VASP's ISMEAR tag — that look superficially similar but differ in which order in σ they cancel. Picking the wrong one is one of the most common mistakes in plane-wave DFT. This block lays the four side by side, derives the why for each, and ends with a one-glance decision rule.

The integral we are approximating

For a metal, the total electron number per cell is

N_e \;=\; \frac{V}{(2\pi)^3}\int_{\text{BZ}} d^3k \sum_n \Theta(\mu - \varepsilon_{n\mathbf{k}}),

with $\Theta$ the step function. The same kind of integral, with extra factors, gives the total energy and the density. The integrand jumps at the Fermi surface, so a finite Riemann sum on a uniform mesh has unavoidable noise unless we replace $\Theta$ by a smooth approximation $f_\sigma(\mu - \varepsilon)$ of width $\sigma$ . The four VASP options are four different choices of $f_\sigma$ — and a fifth, the tetrahedron method, throws the smooth-occupation idea away entirely and integrates a piecewise-linear interpolation of the bands instead.

The four kernels — what each one is

ISMEAR	Kernel f(ε−μ)	Error in σ	Best for	Watch out for
−5 Tetrahedron + Blöchl	(no σ — linear interpolation between vertices, then exact integral)	Order 4 in mesh spacing, σ-free	Static energies, DOS, optical spectra of any system	Needs Γ-centred mesh and ≥ 4 irreducible k-points; forces are not stable.
−1 Fermi–Dirac	1 / (1 + exp((ε−μ)/σ))	Order 2 in σ	Finite-temperature electronic structure, extended Born–Oppenheimer MD	σ here is a real temperature (k_B T); document it.
0 Gaussian	½ erfc((ε−μ)/σ)	Order 2 in σ	Robust default for relaxations, semiconductors, anything where you do not know the gap yet	Slow convergence in σ → 0 — needs σ ≲ 0.05 eV for forces.
1 Methfessel–Paxton (N=1)	Gaussian + 1st Hermite correction	Order 3 in σ	Total energies and forces of metals on dense meshes	Wiggles below μ and above can give negative occupations — fine for energies, ugly for DOS plots.
2 Methfessel–Paxton (N=2)	Gaussian + 1st + 2nd Hermite	Order 4 in σ	Same as N=1 on very dense meshes	Stronger wiggles; rarely worth it over N=1 unless σ is unusually large.

The kernels, plotted side by side

Toggle each scheme on or off and slide $\sigma$ . Three things should be visible at a glance:

All curves cross $f = \tfrac{1}{2}$ at the Fermi level — that is the definition of $\mu$ .
As $\sigma \to 0$ , every kernel converges to the grey step. The rate of convergence is what differs.
Methfessel–Paxton overshoots 1 just below $\mu$ and undershoots 0 just above. The area of those two wiggles is exactly equal — and that cancellation is what kills the order-1 error in $\sigma$ . Gaussian and Fermi–Dirac do not have this cancellation.

Occupation kernels f(ε − μ) for the four ISMEAR options

σ = 0.40 eV

Horizontal axis is energy relative to the Fermi level, in units of σ. Vertical axis is the occupation a state at that energy receives. The true T = 0 occupation is the grey step. Toggle each scheme on or off to compare. Notice how Methfessel–Paxton overshoots 1 just below μ and undershoots 0 just above — those wiggles are the trick that makes its first-order error in σ vanish.

SIGMA0.40 eV

Read this

Gaussian is monotone and easy to converge in σ → 0, but the first-order error in σ is non-zero.
Fermi–Dirac is the physical answer at finite electronic temperature; useful as an electron-temperature, not just a numerical trick.
MP N = 1 wiggles and adds order-2 cancellation — the workhorse for total energies of metals.
MP N = 2 wiggles more and adds order-3 cancellation — only worth it on very dense meshes.

Sanity check

All curves cross f = ½ at exactly ε = μ. That is the definition of the Fermi level — half-filled at the boundary — and every well-behaved smearing function shares it.

Why MP's wiggles cancel — one paragraph

For a smooth band structure, the total energy can be Taylor-expanded around its $\sigma = 0$ value. Methfessel and Paxton (PRB 40, 3616) chose the Hermite-correction coefficients precisely so that the coefficients of $\sigma, \sigma^2, \ldots, \sigma^{2N}$ in that expansion are zero. The first surviving error is $\mathcal{O}(\sigma^{2N+1})$ . So N = 1 means the answer is correct through $\sigma^2$ , and you can run with $\sigma \sim 0.2$ eV without paying for it — provided the mesh is dense enough that the surviving piecewise wobble averages out. That last clause is why MP requires a denser k-mesh than Gaussian: under-sampled, the wiggles do not cancel and you can get a wrong sign on the force.

Why tetrahedron is exact at zero σ

The tetrahedron method (ISMEAR = −5) does not smear at all. It treats the Brillouin zone as a tiling of tiny tetrahedra (8 per Monkhorst–Pack cube) with one band eigenvalue at each vertex, linearly interpolates the bands inside each tetrahedron, and then integrates the interpolation analytically. The Blöchl correction adjusts the integration weights to remove the leading curvature error, lifting accuracy from $\mathcal{O}(\Delta k^2)$ to $\mathcal{O}(\Delta k^4)$ . The result is the most accurate static-energy / static-DOS method on a fixed k-mesh — but it depends on neighbouring k-points sharing a connected tetrahedron, which fails for sparse meshes (hence the "at least 4 irreducible k-points" rule), Γ-only sampling, and non-Γ-centred MP grids.

The decision tree — one glance

Insulator or large-gap semiconductor. Use ISMEAR = 0 with σ = 0.05 eV. Tetrahedron also fine for static properties; not needed.
Metal, want a static total energy or DOS. Use ISMEAR = −5 (tetrahedron + Blöchl). Verify the mesh is Γ-centred and produces ≥ 4 irreducible k-points.
Metal, want forces or a relaxation. Use ISMEAR = 1 with σ = 0.1–0.2 eV. Tetrahedron forces are not implemented in VASP for legacy reasons.
Narrow-gap, doped, or magnetic system you do not yet understand. Use ISMEAR = 0 with σ = 0.05 eV first. Once you know whether the Fermi level lies in a gap or a band, switch to (1) or (3).
Finite electronic temperature on purpose. (Hot-electron transport, MD with non-zero electronic temperature for alkali plasmas.) Use ISMEAR = −1 with σ = k_B T.

The σ → 0 sanity check

Whichever ISMEAR you pick (except −5), report the total energy for at least two values of $\sigma$ and confirm that the answer changes by less than your physics tolerance. For Gaussian (ISMEAR = 0) the energy converges in σ like a polynomial, not exponentially — so "σ small" is not the same as "σ converged". Always check.

Anisotropic Cells Need Anisotropic Grids

The inverse-volume rule applies per axis. A wurtzite GaN cell with $a = 3.19$ Å and $c = 5.19$ Å has $|\mathbf{b}_3|$ shorter than $|\mathbf{b}_1|$ by the ratio $a/c \approx 0.61$ . To match $\Delta k$ on all three axes you want an $N_1 \times N_1 \times N_3$ grid with $N_3 \approx 0.61\,N_1$ . A common pragmatic choice: 8×8×5 instead of 8×8×8.

Slab calculations push this even further. A slab cell with vacuum along $\mathbf{a}_3$ often has $c \sim 25$ Å of vacuum, making $|\mathbf{b}_3|$ tiny. Sampling along that axis is wasteful — most authors set $N_3 = 1$ for slabs (a single Γ-line of k-points) and use the in-plane $N_1 \times N_2$ as for the bulk surface unit cell.

The most expensive mistake in surface DFT

Forgetting that $N_3 = 1$ is the right answer for a slab — and using 8×8×8 because it "looks balanced" — can multiply your runtime by 8× with zero accuracy gain. Always check your KPOINTS file by hand for slabs.

Symmetry — The Irreducible Wedge

Section 3.4 noticed that the cubic BZ has 48 symmetry operations; the hexagonal BZ has 24. Apply these to the full mesh and most points fall on top of each other. VASP automatically reduces the full $N_1 N_2 N_3$ grid to the irreducible Brillouin zone (IBZ): the smallest wedge from which the rest of the BZ can be regenerated by symmetry.

For an FCC primitive cell with an 8×8×8 mesh:

Full BZ: 512 points.
Irreducible BZ for the 48-element $O_h$ symmetry: 60 points (printed at the top of IBZKPT).
Cost saving: ~8.5×. The diagonalisation runs only over the irreducible set; the BZ averages are reconstructed using the orbit-size weights $w_{\mathbf{k}}$ .

Two everyday consequences:

If you break the symmetry — say, by introducing a single Mn dopant in a CdSe supercell — the irreducible set grows. A 4×4×4 mesh that gave you 8 irreducible k-points in pristine CdSe might give 32 in the doped supercell. Plan your CPU budget accordingly.
A Γ-centred grid keeps every point group operation as a strict symmetry of the mesh. A non-Γ-centred (offset) grid can break inversion symmetry and double the irreducible set. For magnetic and spin-orbit calculations, Γ-centred is almost always the right choice.

VASP — The KPOINTS File at a Glance

VASP's simplest k-point input — and the one you will write most often — is the automatic form. Five lines, no surprises:

📝text

1Automatic mesh                         ! line 1: comment, free-text
20                                       ! line 2: 0 → automatic generation
3Gamma                                    ! line 3: Gamma  (Γ-centred) or Monkhorst-Pack
48 8 8                                    ! line 4: subdivisions N1 N2 N3
50 0 0                                    ! line 5: optional shift (s1 s2 s3)

Line 1. A free-form comment. Use it to document the mesh density (e.g. KSPACING ≈ 0.20 Å⁻¹) so your future self can remember what was tested.
Line 2. Number of k-points to read explicitly. 0 means "generate automatically from the next three lines". A positive integer would be followed by a manual list — used for band-structure paths (Section 3.10).
Line 3. Mesh type. Gamma places one point exactly at $\mathbf{k} = 0$ and is the safe default for hexagonal cells, slabs, magnetic systems, and tetrahedron method. Monkhorst-Pack shifts the grid by half a step so no point sits at Γ — better for some metallic cubic systems but breaks inversion-related savings in hexagonal lattices. When in doubt, choose Gamma.
Line 4. Subdivisions $N_1, N_2, N_3$ along $\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3$ . This is what you change in a convergence sweep.
Line 5. Additional shift in fractional reciprocal coordinates. Almost always zero. Used in rare cases where you need to hit a specific high-symmetry point exactly.

KSPACING — even simpler

Set KSPACING = 0.25 in INCAR and delete the KPOINTS file entirely (or use the bare automatic form). VASP will compute the smallest $N_i$ giving $|\mathbf{b}_i|/N_i \le 0.25$ Å⁻¹ for every axis. You write one number; the code handles the inverse-volume arithmetic.

Running a Convergence Test in Python

Before launching a 10-day VASP job on the right mesh, test the same idea in Python with a toy band. The 2-D tight-binding model from the interactive above is perfect: it has a Fermi surface (so it behaves like a metal), a closed-form band, and runs in microseconds. Click any line for the variable values and the why-each-argument story.

Tight-binding k-point convergence sweep

🐍kpoint_convergence.py

Explanation(12)

Code(15)

1import numpy as np

NumPy is the workhorse of numerical Python. Here we need it for three things: (1) np.arange to build the integer index grid, (2) np.meshgrid to turn two 1-D axes into two 2-D coordinate arrays, and (3) np.cos plus boolean array indexing to evaluate the band and pick out the filled states. Every operation runs in optimised C, so a 32×32 grid evaluates in microseconds.

EXECUTION STATE

numpy = Library that provides ndarray (N-dimensional array), broadcasting, and vectorised math. The exact tool for evaluating a band on a k-grid in one line.

as np = Universal alias used in every electronic-structure script — write np.cos(...) instead of numpy.cos(...).

3def band_energy(kx, ky, t=1.0, a=4.0) → ndarray

Tight-binding band of a 2-D square lattice: ε(k) = -2t·(cos(kx·a) + cos(ky·a)). This is the simplest non-trivial band — it has a minimum -4t at Γ, a maximum +4t at the corner (π/a, π/a), and a flat ε = 0 contour that is the Fermi surface at half-filling.

EXECUTION STATE

⬇ input: kx = x-component of the wavevector in Å⁻¹. Can be a scalar OR a NumPy array — the function vectorises automatically. Range of physical interest: [-π/a, π/a] = [-0.785, 0.785] Å⁻¹ for a = 4 Å.

⬇ input: ky = Same as kx but for the y-component.

⬇ input: t = 1.0 = Hopping integral in eV — sets the band width (4t). The default keeps the math human-readable: bandwidth 4 eV.

⬇ input: a = 4.0 = Lattice constant in Å. Sets the BZ half-width π/a = 0.785 Å⁻¹. Realistic for a Cu (3.6 Å) or Al (4.05 Å) lattice.

⬆ returns = ndarray (same shape as kx/ky) — the band energy in eV. For a 4×4 input grid, returns a 4×4 matrix of energies.

4return -2.0 * t * (np.cos(kx * a) + np.cos(ky * a))

Compute the band. The -2t prefactor is the standard tight-binding convention (each site has 2 nearest neighbours along x and 2 along y, giving the factor of 2). cos is element-wise and broadcasts: passing a 4×4 kx and a 4×4 ky returns a 4×4 ε.

EXECUTION STATE

📚 np.cos() = Element-wise cosine. Accepts scalars or arrays. Example: np.cos([0, π/2, π]) → [1, 0, -1].

kx * a = Dimensionless argument of cos. For kx = π/a, kx·a = π so cos(kx·a) = -1, and the band hits its maximum +4t at the BZ corner.

→ Example: kx = ky = 0 (Γ point) = ε = -2·1·(cos(0) + cos(0)) = -4t = -4.0 eV (band minimum)

→ Example: kx = π/a, ky = 0 (X point) = ε = -2·1·(cos(π) + cos(0)) = -2·(-1 + 1) = 0 eV (Fermi level)

⬆ return: ε(k) in eV = Same shape as the input grid; one energy per k-point.

6def average_filled(N, a=4.0) → float

Estimate the average band energy of the occupied states (those with ε < 0) using an N×N k-grid. This is a simplified version of how VASP computes the band-structure contribution to total energy: replace the BZ integral by a uniform-grid sum.

EXECUTION STATE

⬇ input: N = Number of k-points along each axis. Total grid: N×N. Increasing N means more accurate BZ sampling, more cost. We will sweep N to study convergence.

⬇ input: a = 4.0 = Lattice constant — must match the value used when calling band_energy.

⬆ returns = float — the mean band energy over the points where ε < 0, in eV. The true value (256×256 reference) is ≈ -1.6175 eV.

7shifts = (np.arange(N) + 0.5) / N - 0.5

Build N centred shifts in the half-open interval [-0.5, 0.5). Adding 0.5 to np.arange(N) staggers the grid by half a step — this is the Γ-centred Monkhorst-Pack shift. The shift avoids putting k-points on the BZ boundary, which is critical for fast convergence.

EXECUTION STATE

📚 np.arange(N) = Integer range [0, 1, …, N-1] as an ndarray. Example: np.arange(4) → [0, 1, 2, 3].

→ Example: N = 4 = np.arange(4) = [0, 1, 2, 3] + 0.5 → [0.5, 1.5, 2.5, 3.5] / 4 → [0.125, 0.375, 0.625, 0.875] - 0.5 → [-0.375, -0.125, 0.125, 0.375]

→ why the half-step? = Without +0.5 the shifts would include 0 and miss the symmetric point on the other side. The half-step makes the grid Γ-centred: it samples the interior of the BZ symmetrically around k = 0 and never lands on the boundary.

⬆ shifts (N = 4) = [-0.375, -0.125, 0.125, 0.375]

8grid = 2 * np.pi / a * shifts

Convert dimensionless shifts to physical k-values in Å⁻¹. The first BZ for a square lattice spans k ∈ [-π/a, π/a], which has total width 2π/a. Multiplying the shifts in [-0.5, 0.5) by 2π/a maps them into [-π/a, π/a) — exactly the BZ.

EXECUTION STATE

2 * np.pi / a = BZ width along one axis. For a = 4 Å: 2π/4 = 1.5708 Å⁻¹. The reciprocal lattice constant from Section 3.2.

→ BZ half-width π/a = π/4 = 0.7854 Å⁻¹ — the X point along each axis.

⬆ grid (N = 4) = [-0.5890, -0.1963, +0.1963, +0.5890] Å⁻¹

→ sanity check = All four values lie in (-π/a, π/a) = (-0.785, 0.785). ✓ None on the boundary.

9kx, ky = np.meshgrid(grid, grid)

Turn the 1-D axis array into two 2-D coordinate matrices. kx[i,j] holds the x-component of the (i,j) grid point; ky[i,j] holds the y-component. This is the standard idiom for evaluating a 2-D function on a Cartesian grid.

EXECUTION STATE

📚 np.meshgrid(x, y) = Returns two 2-D arrays: the first replicates x along rows, the second replicates y along columns. Both have shape (len(y), len(x)).

→ Example: meshgrid([1,2], [10,20]) = kx = [[1, 2], ky = [[10, 10], [1, 2]] [20, 20]]

⬆ kx (4×4) =

[[-0.589, -0.196, +0.196, +0.589],
 [-0.589, -0.196, +0.196, +0.589],
 [-0.589, -0.196, +0.196, +0.589],
 [-0.589, -0.196, +0.196, +0.589]]

⬆ ky (4×4) =

[[-0.589, -0.589, -0.589, -0.589],
 [-0.196, -0.196, -0.196, -0.196],
 [+0.196, +0.196, +0.196, +0.196],
 [+0.589, +0.589, +0.589, +0.589]]

10eps = band_energy(kx, ky)

Evaluate the band on every grid point in one vectorised call. Because band_energy uses np.cos and arithmetic that broadcast over arrays, this single line replaces a Python double-loop and runs ~100× faster.

EXECUTION STATE

→ broadcasting in action = kx and ky are both (4,4). np.cos(kx*a) is (4,4). The sum is (4,4). The result eps is (4,4) — one ε per k-point.

⬆ eps (4×4) in eV =

[[+2.828,  0.000,  0.000, +2.828],
 [ 0.000, -2.828, -2.828,  0.000],
 [ 0.000, -2.828, -2.828,  0.000],
 [+2.828,  0.000,  0.000, +2.828]]

→ reading the matrix = Corners (large |kx|, |ky|) sit near +4t — the band top. The four central points sit near -4t — the band bottom. The edge points are right on ε = 0, the Fermi surface at half-filling.

11filled = eps < -1e-10

Build the boolean mask of filled states. We compare against -1e-10, not 0, to dodge a subtle floating-point trap: many BZ-boundary points have ε that is mathematically zero but evaluates to ±1e-16 in IEEE-754 doubles. Using strict `< 0` would let half of those drift into the ‘filled’ set and bias the mean. A tiny negative tolerance is best practice for any ‘is this number negative?’ test in numerical work.

EXECUTION STATE

📚 eps < -1e-10 = Element-wise comparison against -1e-10. Returns a boolean array of the same shape; True wherever eps is meaningfully negative (below the noise floor of float arithmetic).

→ why the tolerance? = On a square lattice at half-filling the Fermi surface ε = 0 is a straight line through the BZ. Many sample points sit exactly on it — but Math.cos(π/2) ≠ 0 in IEEE-754, it is 6.12e-17. Without a tolerance, whether a boundary point is ‘filled’ depends on rounding, which makes the sum unstable and platform-dependent.

⬆ filled (4×4) for N=4 =

[[F, F, F, F],
 [F, T, T, F],
 [F, T, T, F],
 [F, F, F, F]]

→ 4 True entries = Only the four inner points (ε = -2.828) pass the filter. The eight edge points (ε ≈ 0 ± 1e-16) and the four corners (ε = +2.828) are excluded. The same physics, robustly counted.

12return eps[filled].mean()

Boolean array indexing: eps[filled] picks out only the True entries as a flat 1-D array. Calling .mean() on that array gives the average band energy of the occupied states — what VASP would call the band-structure energy at half-filling.

EXECUTION STATE

📚 eps[filled] = Fancy indexing with a boolean mask. Returns a 1-D array containing only the elements where mask is True. Length = number of True entries.

→ eps[filled] for N=4 = [-2.828, -2.828, -2.828, -2.828] (4 inner points)

📚 .mean() = Arithmetic mean of the array. Equivalent to .sum() / .size. Returns a scalar float.

⬆ return for N = 4 = (-2.828 × 4) / 4 = -2.8284 eV. Compare to the converged value -1.6339 eV — at N = 4 we have only the band-minimum points sampled, so the average is dragged hard toward the bottom of the band. We are 73% off — the worst point of the sweep.

14for N in [3, 4, 5, 6, 8, 12, 16, 24, 32]:

Sweep the grid density and watch the answer change. We include both odd (3, 5) and even (4, 6, 8…) values because they sample the Fermi surface very differently and produce dramatically different averages — the metallic non-monotonic convergence we want to see.

LOOP TRACE · 10 iterations

N = 3

average_filled(3) = -1.600000 eV (lucky odd-N hit near limit!)

N = 4

average_filled(4) = -2.828427 eV (only inner points sampled)

N = 5

average_filled(5) = -1.611098 eV (odd N again — close to limit)

N = 6

average_filled(6) = -2.309401 eV

N = 8

average_filled(8) = -2.102881 eV

N = 12

average_filled(12) = -1.922605 eV

N = 16

average_filled(16) = -1.840664 eV

N = 24

average_filled(24) = -1.763434 eV

N = 32

average_filled(32) = -1.726428 eV

exact (256×256)

reference = -1.633863 eV

15print(f"{N:3d}x{N:3d} E = {average_filled(N):+.6f} eV")

f-string formatted printout. {N:3d} pads N to 3 digits; {…:+.6f} prints the energy with an explicit sign and 6 decimal places. The fixed-width column makes oscillations visually obvious — that is the point of running this sweep.

EXECUTION STATE

→ why fixed-width? = When the answer wobbles between -1.6 and -2.8 on its way to the limit, aligned columns make the wobble jump out. You will see this same trick in real VASP convergence-test scripts that loop over INCAR parameters.

⬆ printed output = 3x3 E = -1.600000 eV 4x4 E = -2.828427 eV 5x5 E = -1.611098 eV 6x6 E = -2.309401 eV 8x8 E = -2.102881 eV 12x12 E = -1.922605 eV 16x16 E = -1.840664 eV 24x24 E = -1.763434 eV 32x32 E = -1.726428 eV

→ the pattern = Odd N (3, 5) lands close to the limit by accident — those grids happen to oversample the band-edge regions where ε is small. Even N is dominated by the band minimum at small N (-2.83 at N=4) and creeps up toward -1.63 only as the grid becomes dense enough to resolve the Fermi surface. Both sequences converge to the same -1.6339 eV, but neither does so monotonically.

3 lines without explanation

1import numpy as np
2
3def band_energy(kx, ky, t=1.0, a=4.0):
4    return -2.0 * t * (np.cos(kx * a) + np.cos(ky * a))
5
6def average_filled(N, a=4.0):
7    shifts = (np.arange(N) + 0.5) / N - 0.5
8    grid = 2 * np.pi / a * shifts
9    kx, ky = np.meshgrid(grid, grid)
10    eps = band_energy(kx, ky)
11    filled = eps < -1e-10  # tolerance dodges floating-point ε≈0 on the BZ edge
12    return eps[filled].mean()
13
14for N in [3, 4, 5, 6, 8, 12, 16, 24, 32]:
15    print(f"{N:3d}x{N:3d}  E = {average_filled(N):+.6f} eV")

Run it and look at the output. The energy lurches from $-1.60$ at $N=3$ down to $-2.83$ at $N=4$ , back up to $-1.61$ at $N=5$ , then creeps monotonically toward the converged value $-1.6339$ eV from below. That dramatic odd/even swing at low $N$ is the metallic Fermi-surface signature in action — exactly what we predicted from the integrand having a kink at $\varepsilon = 0$ . The fix in a real VASP calculation is ISMEAR = 1, SIGMA = 0.1 — smear the step out and convergence becomes monotonic and fast.

Translating this script to VASP

For a full VASP convergence test, replace average_filled(N) with a function that writes a KPOINTS file with subdivisions $N\times N\times N$ , runs vasp_std, and parses OUTCAR for free energy TOTEN. The convergence plot will look qualitatively the same — a fast-decaying envelope with residual oscillation in metals, smooth and monotonic in insulators — and your stopping criterion (1 meV/atom) carries over verbatim.

Common Pitfalls

Pitfall	Symptom	Fix
Same N for primitive and supercell	8× more k-points than needed in a 2×2×2 supercell — runtime scales linearly with k-points.	Halve N_i per dimension when you double the cell. Or use KSPACING and let VASP do it.
Fixed N forgetting axis ratios	Hexagonal or layered cells with c ≫ a converge slowly along c, or sample the c-axis far past need.	Make N_i ∝ \|b_i\|. For a:c = 1:2, use 2N × 2N × N (or just KSPACING).
Single k-point on a metal	Total energy nonsensical, magnetic moment fluctuates between runs.	Use ≥ 6×6×6 for cubic metals and ISMEAR = 1 with SIGMA = 0.05–0.1 eV.
Mixing Gamma and MP between runs	Tiny energy differences (~1 meV) when you compare cells with different KPOINTS modes.	Stick to one mode for any convergence series. Γ-centred is the safer default.
N₃ > 1 on a slab	Wasted CPU; some authors also see false dispersion along the surface normal.	Always set N₃ = 1 on slabs with vacuum along c.
Stopping the sweep at 1 mesh	‘Converged’ is wishful thinking — you have one data point.	Run at least three meshes; check the trend; converge against your physics tolerance, not VASP defaults.

Summary

Brillouin-zone integrals are unavoidable: every observable in electronic-structure theory is a BZ average of band quantities. We replace the integral by a weighted sum over a finite mesh.
The mesh density that matters is $\Delta k_i = |\mathbf{b}_i|/N_i$ , not $N_i$ . Doubling the real-space cell halves the required N — the inverse-volume rule.
Insulators converge fast (polynomial in $1/N$ ); metals converge slowly because the integrand has a Fermi-surface step. Smearing (ISMEAR) cures the metal case.
The interactive 2-D viz makes the oscillation visible: even a textbook square-lattice Fermi surface produces a non-monotonic convergence curve until enough density is reached.
Anisotropic cells need anisotropic grids. Slabs need $N_3 = 1$ .
Symmetry shrinks the mesh dramatically — VASP's irreducible BZ can be 8× smaller than the full mesh on cubic cells. Breaking symmetry with dopants undoes that saving.
The simplest VASP KPOINTS file is five lines; KSPACING in INCAR is even simpler. Always run a convergence sweep on the property you care about, not just on total energy.
The Python sweep above mirrors the VASP sweep one-to-one. If a toy band converges at $N \sim 30$ , expect the same for a real metal at the same $\Delta k$ .

Section 3.8 Core Insight

"A k-point grid is a Riemann sum over the Brillouin zone. The right density is set by the smoothness of the integrand and the size of the cell — not by what looks balanced."

Coming next: Section 3.9 — Monkhorst–Pack Grids — where we make the "regular grid" idea precise, derive the optimal half-step shift you saw above, and learn why Gamma and Monkhorst-Pack can give different irreducible point counts for the same cell.