Learning Objectives
The previous seven sections of this chapter built up reciprocal space — the reciprocal lattice (3.2), the Brillouin zone (3.3), high-symmetry paths (3.4), Bloch's theorem (3.5), the structure factor (3.6), and systematic absences (3.7). Now we put it to work. Every electronic-structure code, VASP included, replaces continuous Brillouin-zone integrals by discrete sums over a finite mesh of k-points. Choosing that mesh is the single most consequential numerical decision in any DFT calculation. By the end of this section you should be able to:
- Explain why integration over the BZ is unavoidable, and what quantity is actually being averaged.
- Translate the integral into the discrete sum and know what the weights mean.
- State the inverse-volume rule: . A 2×2×2 supercell needs half the k-mesh of the primitive cell per axis.
- Recognise that convergence is not always monotonic — band crossings, Fermi surfaces, and grid-edge effects can make the energy oscillate before it settles.
- Read VASP's KPOINTS file and understand what each line of the simplest automatic form actually requests.
- Run a clean convergence test in Python or VASP and decide when the answer is "converged enough" for the question you are asking.
One-line preview: a k-point grid is a Riemann-sum discretisation of the Brillouin zone. The denser it is, the closer the sum gets to the true integral — but the right density depends on the smoothness of the integrand and the size of your real-space cell.
Why Do We Sample at All?
Bloch's theorem (Section 3.5) tells us that every single-particle eigenstate of a periodic crystal can be labelled by a wavevector living in the first Brillouin zone and a band index . The total energy of the electrons, the charge density, the density of states, the optical spectrum — every observable in this chapter and the next — is built from a Brillouin-zone average of band quantities. For total energy at zero temperature:
where is the Fermi–Dirac occupation (effectively a step function at for insulators, smeared by temperature for metals). The factor converts the integral into a per-cell quantity using the result we proved in Section 3.2 — the reciprocal-cell volume is .
For an infinite crystal this is a clean continuous integral. For a computer it is something we must approximate, because the integrand is the eigenvalue of the Kohn–Sham Hamiltonian at — every evaluation costs an entire diagonalisation. We pick a finite set of -points, diagonalise once at each, and add the results with weights.
One band, one matrix, one k-point
Why does VASP need many k-points? Because it solves a separate Kohn–Sham eigenvalue problem at every k. The wavefunctions have a different periodic part at each . Doubling your k-mesh roughly doubles the wall-clock cost. That is the budget you are spending when you choose a mesh — make it count.
From an Integral to a Weighted Sum
Replace the integral by a Riemann-style sum:
Three ideas are bundled into that line.
- The mesh. A discrete subset of points in the BZ. The most popular choice (Section 3.9) is a regular grid — Monkhorst–Pack — but other meshes (Γ-only, Chadi–Cohen, hybrid) are common.
- The weights. for an unfolded uniform grid. After symmetry reduction (Section 3.4 taught us what symmetries do to k), points related by a symmetry operation are merged into one representative whose weight equals its orbit size. So in a VASP IBZKPT file, weights of 1, 4, 6, 8, … encode "how many full-BZ partners does this irreducible point stand in for?".
- The convergence claim. If is smooth (band-gap material, insulator) the sum approaches the integral with error that falls off rapidly in the mesh density. If has a kink — a Fermi surface, a Dirac cone — convergence is much slower and requires either denser meshes or special tricks (tetrahedron method, Gaussian smearing).
Read the discrete formula as "average f over the mesh, weighting each point by how much of the BZ it stands for". The mesh is your ruler; the weights are the tick spacing.
The Inverse-Volume Rule
How dense should the mesh be? Section 3.2 already gave the answer in compressed form: . Spelt out for a uniform grid:
The thing that matters for accuracy is , not . A 2×2×2 supercell has each side twice as long, so its reciprocal vector is twice as short, and a mesh with on the supercell achieves the same as on the primitive cell. Doubling the cell halves the required N.
The single-most-useful k-point fact
Two cells of the same material at converged k-mesh density should give (essentially) the same energy per atom. If you double the supercell and forget to halve the k-grid, you are wasting CPU. If you keep fixed when you grow the cell, you are also wasting CPU and getting more accuracy than you can use. Always think in terms of mesh density, not mesh size.
VASP exposes this directly: the KSPACING tag in INCAR sets in Å⁻¹; VASP then picks the smallest that satisfies it. A typical metallic calculation uses KSPACING = 0.2 (Å⁻¹); a typical insulator uses 0.3–0.4 Å⁻¹. We will see why in a moment.
| Real cell | |b₁| (Å⁻¹) | ΔK = 0.3 Å⁻¹ → N₁ | ΔK = 0.2 Å⁻¹ → N₁ |
|---|---|---|---|
| Cu primitive (a = 3.61 Å) | 1.74 | 6 | 9 |
| Cu 2×2×2 conventional (a = 7.22 Å) | 0.87 | 3 | 5 |
| GaN wurtzite (a = 3.19 Å) | 1.97 | 7 | 10 |
| GaN 4×4×2 supercell | 0.49 / 0.49 / 1.21 | 2 / 2 / 5 | 3 / 3 / 7 |
Interactive — k-Grid in a 2D Brillouin Zone
Slide the controls below to change the grid density on a 2-D square Brillouin zone (a tight-binding analogue of a Cu (001) surface, with ). The colour map is the band; cyan dots are filled k-points (), pink dots are empty. The right panel reports the current sum estimate, the "exact" value computed on a 256×256 reference grid, and the absolute error. The log–log convergence curve below shows how the error falls as you slide upward.
The BZ is a square of side 2π/a. Your N×N grid samples it uniformly. The cyan curve is the exact band-average error vs. N on a log–log axis — for this smooth band the error falls roughly like 1/N², the signature of well-behaved BZ integration. Toggle Γ-centred shift off to see how an unshifted grid that hits the band edges converges more slowly.
Three things to play with:
- Slide N up. Watch the error fall. Notice it does not fall monotonically — it overshoots and undershoots. The orange ring on the convergence plot is your current .
- Toggle Γ-centred shift. Without the half-step, the unshifted grid hits the BZ edges (and the Fermi-surface line at ) directly. Edge points are ambiguous: do they belong to this BZ or the next? Convergence slows down. The shift dodges this by sampling only the interior of each mesh cell.
- Compare even and odd N. Near band-crossing lines, an even- grid that straddles the crossing can be lucky or unlucky. The oscillation in the curve is real — it is the metallic Fermi-surface signature of the integrand having a kink at .
What the visualisation is really showing
For a smooth, fully gapped integrand the convergence would be very fast — error falls roughly as or better. The wiggle you see for this half-filled metal is the honest answer: metals are harder to converge than insulators, and the harder they are, the denser the mesh you must afford.
How Convergence Actually Behaves
Three regimes show up in practice. Knowing which one you are in is half the battle.
1. Insulators and large-gap semiconductors
Smooth integrand, no occupation discontinuity. Errors fall like a polynomial — for a 3-D Monkhorst–Pack grid, like for many ground-state quantities. A coarse 4×4×4 mesh on the primitive Si cell (8 atoms in the conventional cube) is already excellent; 6×6×6 is overkill for most properties.
2. Metals
The integrand has a step at the Fermi surface; the analogue of taking the Riemann sum of a step function. Bare uniform grids converge as slowly as . The cure is smearing — replace the abrupt step by a smoothed version, recover smoothness, and pay a tiny accuracy tax. VASP supports several flavours:
- ISMEAR = -5 — tetrahedron with Blöchl correction. Best for static properties of metals; needs a Γ-centred mesh and at least 4 k-points in the irreducible BZ.
- ISMEAR = 1, 2 — Methfessel–Paxton of order 1 or 2. Excellent for total energies and forces in metals at modest mesh density.
- ISMEAR = 0, SIGMA = 0.05 — Gaussian smearing. Robust default for ionic relaxations of metals.
3. Narrow-gap, doped, or magnetic systems
Treat as metal-like for k-point convergence. Doped semiconductors have a Fermi level inside a band; magnetic systems with spin polarisation have Fermi surfaces in each spin channel. Both want the dense mesh + smearing combo, not the sparse-mesh insulator workflow.
What "converged" means
Pick the property you care about — total energy per atom, lattice constant, magnetic moment, band gap, vacancy formation energy. Run the calculation at meshes , , . Stop when consecutive answers agree to your physics tolerance: ~1 meV per atom for total energy, ~0.005 Å for lattice constants, ~0.01 for moments. The tolerance is set by the question, not by VASP's default printout precision.
Choosing a Smearing — Gaussian, MP, Tetrahedron
We just hand-waved the metal cure as "smearing". That word hides four distinct numerical schemes — VASP's ISMEAR tag — that look superficially similar but differ in which order in σ they cancel. Picking the wrong one is one of the most common mistakes in plane-wave DFT. This block lays the four side by side, derives the why for each, and ends with a one-glance decision rule.
The integral we are approximating
For a metal, the total electron number per cell is
with the step function. The same kind of integral, with extra factors, gives the total energy and the density. The integrand jumps at the Fermi surface, so a finite Riemann sum on a uniform mesh has unavoidable noise unless we replace by a smooth approximation of width . The four VASP options are four different choices of — and a fifth, the tetrahedron method, throws the smooth-occupation idea away entirely and integrates a piecewise-linear interpolation of the bands instead.
The four kernels — what each one is
| ISMEAR | Kernel f(ε−μ) | Error in σ | Best for | Watch out for |
|---|---|---|---|---|
| −5 Tetrahedron + Blöchl | (no σ — linear interpolation between vertices, then exact integral) | Order 4 in mesh spacing, σ-free | Static energies, DOS, optical spectra of any system | Needs Γ-centred mesh and ≥ 4 irreducible k-points; forces are not stable. |
| −1 Fermi–Dirac | 1 / (1 + exp((ε−μ)/σ)) | Order 2 in σ | Finite-temperature electronic structure, extended Born–Oppenheimer MD | σ here is a real temperature (k_B T); document it. |
| 0 Gaussian | ½ erfc((ε−μ)/σ) | Order 2 in σ | Robust default for relaxations, semiconductors, anything where you do not know the gap yet | Slow convergence in σ → 0 — needs σ ≲ 0.05 eV for forces. |
| 1 Methfessel–Paxton (N=1) | Gaussian + 1st Hermite correction | Order 3 in σ | Total energies and forces of metals on dense meshes | Wiggles below μ and above can give negative occupations — fine for energies, ugly for DOS plots. |
| 2 Methfessel–Paxton (N=2) | Gaussian + 1st + 2nd Hermite | Order 4 in σ | Same as N=1 on very dense meshes | Stronger wiggles; rarely worth it over N=1 unless σ is unusually large. |
The kernels, plotted side by side
Toggle each scheme on or off and slide . Three things should be visible at a glance:
- All curves cross at the Fermi level — that is the definition of .
- As , every kernel converges to the grey step. The rate of convergence is what differs.
- Methfessel–Paxton overshoots 1 just below and undershoots 0 just above. The area of those two wiggles is exactly equal — and that cancellation is what kills the order-1 error in . Gaussian and Fermi–Dirac do not have this cancellation.
Occupation kernels f(ε − μ) for the four ISMEAR options
σ = 0.40 eVHorizontal axis is energy relative to the Fermi level, in units of σ. Vertical axis is the occupation a state at that energy receives. The true T = 0 occupation is the grey step. Toggle each scheme on or off to compare. Notice how Methfessel–Paxton overshoots 1 just below μ and undershoots 0 just above — those wiggles are the trick that makes its first-order error in σ vanish.
- Gaussian is monotone and easy to converge in σ → 0, but the first-order error in σ is non-zero.
- Fermi–Dirac is the physical answer at finite electronic temperature; useful as an electron-temperature, not just a numerical trick.
- MP N = 1 wiggles and adds order-2 cancellation — the workhorse for total energies of metals.
- MP N = 2 wiggles more and adds order-3 cancellation — only worth it on very dense meshes.
Why MP's wiggles cancel — one paragraph
For a smooth band structure, the total energy can be Taylor-expanded around its value. Methfessel and Paxton (PRB 40, 3616) chose the Hermite-correction coefficients precisely so that the coefficients of in that expansion are zero. The first surviving error is . So N = 1 means the answer is correct through , and you can run with eV without paying for it — provided the mesh is dense enough that the surviving piecewise wobble averages out. That last clause is why MP requires a denser k-mesh than Gaussian: under-sampled, the wiggles do not cancel and you can get a wrong sign on the force.
Why tetrahedron is exact at zero σ
The tetrahedron method (ISMEAR = −5) does not smear at all. It treats the Brillouin zone as a tiling of tiny tetrahedra (8 per Monkhorst–Pack cube) with one band eigenvalue at each vertex, linearly interpolates the bands inside each tetrahedron, and then integrates the interpolation analytically. The Blöchl correction adjusts the integration weights to remove the leading curvature error, lifting accuracy from to . The result is the most accurate static-energy / static-DOS method on a fixed k-mesh — but it depends on neighbouring k-points sharing a connected tetrahedron, which fails for sparse meshes (hence the "at least 4 irreducible k-points" rule), Γ-only sampling, and non-Γ-centred MP grids.
The decision tree — one glance
- Insulator or large-gap semiconductor. Use ISMEAR = 0 with σ = 0.05 eV. Tetrahedron also fine for static properties; not needed.
- Metal, want a static total energy or DOS. Use ISMEAR = −5 (tetrahedron + Blöchl). Verify the mesh is Γ-centred and produces ≥ 4 irreducible k-points.
- Metal, want forces or a relaxation. Use ISMEAR = 1 with σ = 0.1–0.2 eV. Tetrahedron forces are not implemented in VASP for legacy reasons.
- Narrow-gap, doped, or magnetic system you do not yet understand. Use ISMEAR = 0 with σ = 0.05 eV first. Once you know whether the Fermi level lies in a gap or a band, switch to (1) or (3).
- Finite electronic temperature on purpose. (Hot-electron transport, MD with non-zero electronic temperature for alkali plasmas.) Use ISMEAR = −1 with σ = k_B T.
The σ → 0 sanity check
Whichever ISMEAR you pick (except −5), report the total energy for at least two values of and confirm that the answer changes by less than your physics tolerance. For Gaussian (ISMEAR = 0) the energy converges in σ like a polynomial, not exponentially — so "σ small" is not the same as "σ converged". Always check.
Anisotropic Cells Need Anisotropic Grids
The inverse-volume rule applies per axis. A wurtzite GaN cell with Å and Å has shorter than by the ratio . To match on all three axes you want an grid with . A common pragmatic choice: 8×8×5 instead of 8×8×8.
Slab calculations push this even further. A slab cell with vacuum along often has Å of vacuum, making tiny. Sampling along that axis is wasteful — most authors set for slabs (a single Γ-line of k-points) and use the in-plane as for the bulk surface unit cell.
The most expensive mistake in surface DFT
Forgetting that is the right answer for a slab — and using 8×8×8 because it "looks balanced" — can multiply your runtime by 8× with zero accuracy gain. Always check your KPOINTS file by hand for slabs.
Symmetry — The Irreducible Wedge
Section 3.4 noticed that the cubic BZ has 48 symmetry operations; the hexagonal BZ has 24. Apply these to the full mesh and most points fall on top of each other. VASP automatically reduces the full grid to the irreducible Brillouin zone (IBZ): the smallest wedge from which the rest of the BZ can be regenerated by symmetry.
For an FCC primitive cell with an 8×8×8 mesh:
- Full BZ: 512 points.
- Irreducible BZ for the 48-element symmetry: 60 points (printed at the top of IBZKPT).
- Cost saving: ~8.5×. The diagonalisation runs only over the irreducible set; the BZ averages are reconstructed using the orbit-size weights .
Two everyday consequences:
- If you break the symmetry — say, by introducing a single Mn dopant in a CdSe supercell — the irreducible set grows. A 4×4×4 mesh that gave you 8 irreducible k-points in pristine CdSe might give 32 in the doped supercell. Plan your CPU budget accordingly.
- A Γ-centred grid keeps every point group operation as a strict symmetry of the mesh. A non-Γ-centred (offset) grid can break inversion symmetry and double the irreducible set. For magnetic and spin-orbit calculations, Γ-centred is almost always the right choice.
VASP — The KPOINTS File at a Glance
VASP's simplest k-point input — and the one you will write most often — is the automatic form. Five lines, no surprises:
1Automatic mesh ! line 1: comment, free-text
20 ! line 2: 0 → automatic generation
3Gamma ! line 3: Gamma (Γ-centred) or Monkhorst-Pack
48 8 8 ! line 4: subdivisions N1 N2 N3
50 0 0 ! line 5: optional shift (s1 s2 s3)- Line 1. A free-form comment. Use it to document the mesh density (e.g. KSPACING ≈ 0.20 Å⁻¹) so your future self can remember what was tested.
- Line 2. Number of k-points to read explicitly. 0 means "generate automatically from the next three lines". A positive integer would be followed by a manual list — used for band-structure paths (Section 3.10).
- Line 3. Mesh type. Gamma places one point exactly at and is the safe default for hexagonal cells, slabs, magnetic systems, and tetrahedron method. Monkhorst-Pack shifts the grid by half a step so no point sits at Γ — better for some metallic cubic systems but breaks inversion-related savings in hexagonal lattices. When in doubt, choose Gamma.
- Line 4. Subdivisions along . This is what you change in a convergence sweep.
- Line 5. Additional shift in fractional reciprocal coordinates. Almost always zero. Used in rare cases where you need to hit a specific high-symmetry point exactly.
KSPACING — even simpler
Set KSPACING = 0.25 in INCAR and delete the KPOINTS file entirely (or use the bare automatic form). VASP will compute the smallest giving Å⁻¹ for every axis. You write one number; the code handles the inverse-volume arithmetic.
Running a Convergence Test in Python
Before launching a 10-day VASP job on the right mesh, test the same idea in Python with a toy band. The 2-D tight-binding model from the interactive above is perfect: it has a Fermi surface (so it behaves like a metal), a closed-form band, and runs in microseconds. Click any line for the variable values and the why-each-argument story.
Run it and look at the output. The energy lurches from at down to at , back up to at , then creeps monotonically toward the converged value eV from below. That dramatic odd/even swing at low is the metallic Fermi-surface signature in action — exactly what we predicted from the integrand having a kink at . The fix in a real VASP calculation is ISMEAR = 1, SIGMA = 0.1 — smear the step out and convergence becomes monotonic and fast.
Translating this script to VASP
For a full VASP convergence test, replace average_filled(N) with a function that writes a KPOINTS file with subdivisions , runs vasp_std, and parses OUTCAR for free energy TOTEN. The convergence plot will look qualitatively the same — a fast-decaying envelope with residual oscillation in metals, smooth and monotonic in insulators — and your stopping criterion (1 meV/atom) carries over verbatim.
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Same N for primitive and supercell | 8× more k-points than needed in a 2×2×2 supercell — runtime scales linearly with k-points. | Halve N_i per dimension when you double the cell. Or use KSPACING and let VASP do it. |
| Fixed N forgetting axis ratios | Hexagonal or layered cells with c ≫ a converge slowly along c, or sample the c-axis far past need. | Make N_i ∝ |b_i|. For a:c = 1:2, use 2N × 2N × N (or just KSPACING). |
| Single k-point on a metal | Total energy nonsensical, magnetic moment fluctuates between runs. | Use ≥ 6×6×6 for cubic metals and ISMEAR = 1 with SIGMA = 0.05–0.1 eV. |
| Mixing Gamma and MP between runs | Tiny energy differences (~1 meV) when you compare cells with different KPOINTS modes. | Stick to one mode for any convergence series. Γ-centred is the safer default. |
| N₃ > 1 on a slab | Wasted CPU; some authors also see false dispersion along the surface normal. | Always set N₃ = 1 on slabs with vacuum along c. |
| Stopping the sweep at 1 mesh | ‘Converged’ is wishful thinking — you have one data point. | Run at least three meshes; check the trend; converge against your physics tolerance, not VASP defaults. |
Summary
- Brillouin-zone integrals are unavoidable: every observable in electronic-structure theory is a BZ average of band quantities. We replace the integral by a weighted sum over a finite mesh.
- The mesh density that matters is , not . Doubling the real-space cell halves the required N — the inverse-volume rule.
- Insulators converge fast (polynomial in ); metals converge slowly because the integrand has a Fermi-surface step. Smearing (ISMEAR) cures the metal case.
- The interactive 2-D viz makes the oscillation visible: even a textbook square-lattice Fermi surface produces a non-monotonic convergence curve until enough density is reached.
- Anisotropic cells need anisotropic grids. Slabs need .
- Symmetry shrinks the mesh dramatically — VASP's irreducible BZ can be 8× smaller than the full mesh on cubic cells. Breaking symmetry with dopants undoes that saving.
- The simplest VASP KPOINTS file is five lines; KSPACING in INCAR is even simpler. Always run a convergence sweep on the property you care about, not just on total energy.
- The Python sweep above mirrors the VASP sweep one-to-one. If a toy band converges at , expect the same for a real metal at the same .
Coming next: Section 3.9 — Monkhorst–Pack Grids — where we make the "regular grid" idea precise, derive the optimal half-step shift you saw above, and learn why Gamma and Monkhorst-Pack can give different irreducible point counts for the same cell.