Chapter 1
Section 1 of 60

What Is a Crystal?

The Architecture of Crystals — Real Space

Learning Objectives

After completing this section you will be able to:

  1. Define what makes a material crystalline and distinguish crystals from amorphous solids using both structural and experimental criteria.
  2. Decompose any crystal into its two building blocks — a lattice and a basis — and explain why this decomposition is unique and powerful.
  3. Write the lattice-vector equation $\mathbf{R} = n_1\mathbf{a}_1 + n_2\mathbf{a}_2 + n_3\mathbf{a}_3$ and explain every symbol's physical meaning.
  4. Identify the experimental signatures of crystallinity in X-ray, electron, and neutron diffraction patterns.
  5. Explain why translational periodicity enables Bloch's theorem and makes plane-wave DFT codes like VASP computationally feasible.
  6. Connect crystallographic descriptions to modern materials informatics: DFT input files, crystal-structure databases, and machine-learning representations.

Where this knowledge is used

Crystal structure description is the entry point for every solid-state calculation, diffraction experiment, phase identification, and materials-informatics workflow. Mastering this vocabulary is the prerequisite for everything that follows in this book.

The Big Picture: From Mineral Cabinets to Quantum Computers

The idea that solids are built from repeating units is surprisingly old. In 1784, the French mineralogist René Just Haüy dropped a calcite crystal, noticed that every fragment broke into perfect rhombohedra, and proposed that all crystals are built by stacking identical "molécules intégrantes" — tiny building blocks repeated in space. He could not see atoms, but his geometric insight was correct.

Over the next century, this idea matured:

  • Auguste Bravais (1850) proved mathematically that only 14 fundamentally different lattice types exist in three dimensions, classifying all possible "repetition rules."
  • Evgraf Fedorov and Arthur Schoenflies (1891) independently enumerated the 230 space groups — the complete catalog of symmetries a periodic 3D pattern can possess.
  • Max von Laue (1912) directed X-rays at a copper sulfate crystal and observed sharp diffraction spots, proving for the first time that crystals really are periodic on the atomic scale and that X-rays are waves.
  • William Henry and William Lawrence Bragg (1913) solved the first crystal structures (NaCl, diamond) from diffraction data, launching modern materials science.
The central idea: A crystal is not just "a pretty rock." It is matter with long-range translational order — a precise mathematical symmetry that determines how it diffracts radiation, conducts electrons, vibrates thermally, responds to stress, and interacts with light. Understanding that order is the key to predicting and engineering material properties from first principles.

What Defines a Crystal?

A crystal is a solid whose atoms (or ions, or molecules) are arranged in a pattern that repeats periodically in three dimensions. This periodicity means that if you translate the entire structure by certain specific vectors, it maps exactly onto itself.

The Three Pillars of Crystallinity

  1. Translation invariance: there exist three non-coplanar vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ such that shifting the structure by any integer combination of these vectors leaves it unchanged.
  2. Complete tiling: the repeating unit (the unit cell) fills all of three-dimensional space without gaps or overlaps, like bricks in a wall.
  3. Long-range order: knowing the arrangement in one unit cell lets you predict the arrangement arbitrarily far away. This is what distinguishes a crystal from a liquid or a glass.

The Formal Statement

A structure possesses translational periodicity if there exists a set of translation vectors $\{\mathbf{R}\}$ such that the atomic density $\rho(\mathbf{r})$ satisfies:

$$\rho(\mathbf{r} + \mathbf{R}) = \rho(\mathbf{r}) \quad \text{for all } \mathbf{r}$$

Here $\rho(\mathbf{r})$ is the time-averaged atomic density at position $\mathbf{r}$; in practice it is usually taken to be the electron density, which is the quantity X-ray diffraction measures. The set of all valid translation vectors $\mathbf{R}$ forms a lattice, which we will define precisely below.
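The periodicity condition can be checked numerically. The sketch below is only an illustration: the model density (a sum of cosines on a simple cubic lattice) and the lattice constant `A = 1` are assumptions chosen for convenience, not data for any real material.

```python
import math

A = 1.0  # assumed cubic lattice constant (arbitrary units), for illustration only

def rho(x, y, z, a=A):
    """Toy periodic density: one cosine per Cartesian direction."""
    k = 2.0 * math.pi / a
    return 3.0 + math.cos(k * x) + math.cos(k * y) + math.cos(k * z)

def shifted_difference(r, n):
    """Return |rho(r + R) - rho(r)| for R = (n1, n2, n3) * A."""
    r_shifted = tuple(ri + ni * A for ri, ni in zip(r, n))
    return abs(rho(*r_shifted) - rho(*r))

point = (0.13, 0.57, 0.91)
print(shifted_difference(point, (2, -1, 5)))   # integer translation: ~0
print(shifted_difference(point, (0.5, 0, 0)))  # half a lattice vector: clearly nonzero
```

Only integer combinations of the lattice vectors leave the density unchanged; a shift by a fraction of a lattice vector does not.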

Real crystals are never perfect

No physical crystal is truly infinite or perfectly periodic. Real crystals have surfaces, defects (vacancies, dislocations, grain boundaries), thermal vibrations, and finite size. The periodic model is an idealization — but an extraordinarily useful one that captures the essential physics of the bulk material.

The Wallpaper Analogy

Before jumping to three dimensions, build your intuition with a two-dimensional analogy: wallpaper. A wallpaper pattern has a small motif — a flower, a geometric shape — repeated at regular intervals across a flat surface. If you know the motif and the two repeat vectors, you can reconstruct the entire wall.

A crystal works the same way, extended into three dimensions. There is a small repeating unit (a group of atoms) that tiles all of 3D space without gaps or overlaps. The repeat rule is the lattice, and the motif is the basis.

| Wallpaper (2D) | Crystal (3D) |
|---|---|
| 2 repeat vectors define the tiling | 3 lattice vectors define the tiling |
| Motif: a printed design element | Basis: a group of atoms |
| 17 possible wallpaper groups | 230 possible space groups |
| Motif + grid = wallpaper | Basis + lattice = crystal |

The 17 wallpaper groups classify every possible 2D periodic pattern. They are the two-dimensional analogues of the 230 space groups that classify all possible 3D crystal symmetries (Section 7 of this chapter). The jump from 2D to 3D adds complexity, but the underlying principle is identical: classify all distinct ways a motif can be repeated with translational symmetry.


Lattice + Basis = Crystal

The single most important concept in this entire textbook is the decomposition:

$$\text{Crystal} = \text{Lattice} \otimes \text{Basis}$$

The symbol $\otimes$ here means "place a copy of the basis at every lattice point." This is a convolution: the crystal is generated by convolving the lattice (an infinite set of mathematical points) with the basis (a finite set of atoms).

| Component | What It Is | Contains Atoms? | Analogy |
|---|---|---|---|
| Lattice | An infinite set of mathematically equivalent points with translational symmetry | No | The grid of hooks on a wall |
| Basis | A group of atoms (species + positions) attached to each lattice point | Yes | The ornament hung on each hook |
| Crystal | Lattice ⊗ Basis | Yes | The decorated wall |

The distinction matters

The lattice is a purely geometric, abstract concept — it contains no atoms. The basis is the physical content. Confusing the two is one of the most common errors in introductory crystallography. When someone says "the FCC lattice of copper," they mean an FCC Bravais lattice with a one-atom basis of Cu at the origin.
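To make "place a copy of the basis at every lattice point" concrete, here is a minimal sketch in plain Python. The function name `build_crystal` is hypothetical, and the FCC primitive vectors used are one standard choice, assumed here for illustration.

```python
from itertools import product

def build_crystal(lattice_vectors, basis, reps):
    """Place a copy of the basis at every lattice point of an N1 x N2 x N3 block.

    lattice_vectors: three Cartesian vectors a1, a2, a3
    basis: list of (species, (x, y, z)) in fractional coordinates
    reps: (N1, N2, N3), the number of cells along each lattice vector
    """
    atoms = []
    for n in product(*(range(count) for count in reps)):
        for species, frac in basis:
            # r = (n1 + x) a1 + (n2 + y) a2 + (n3 + z) a3
            coeffs = [ni + fi for ni, fi in zip(n, frac)]
            pos = tuple(
                sum(c * vec[k] for c, vec in zip(coeffs, lattice_vectors))
                for k in range(3)
            )
            atoms.append((species, pos))
    return atoms

a = 6.052  # zinc blende CdSe lattice constant used in this section (angstrom)
fcc_primitive = [(0, a / 2, a / 2), (a / 2, 0, a / 2), (a / 2, a / 2, 0)]
cdse_basis = [("Cd", (0.0, 0.0, 0.0)), ("Se", (0.25, 0.25, 0.25))]

crystal = build_crystal(fcc_primitive, cdse_basis, reps=(2, 2, 2))
print(len(crystal))  # 8 cells x 2 basis atoms = 16
print(crystal[1])    # the Se atom of the first cell
```

The lattice contributes only the integers `n`; every atom in the output comes from the basis.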

The Lattice

A Bravais lattice is the set of all points that can be reached from any one point by integer translations along three linearly independent vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$. The general lattice point is:

$$\mathbf{R} = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3$$

Symbol-by-Symbol Explanation

| Symbol | Type | Meaning |
|---|---|---|
| $\mathbf{R}$ | 3D vector (Å) | Position of any lattice point in real (direct) space |
| $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ | 3D vectors (Å) | Primitive lattice vectors: the three independent repeat steps. Their choice is not unique. |
| $n_1, n_2, n_3$ | Integers ($\mathbb{Z}$) | Coefficients that select which lattice point. $n_1 = n_2 = n_3 = 0$ gives the origin. |

What does this equation say in plain language? Starting from any lattice point, you can reach every other lattice point by taking $n_1$ steps along $\mathbf{a}_1$, then $n_2$ steps along $\mathbf{a}_2$, then $n_3$ steps along $\mathbf{a}_3$. The lattice is the set of all points you can reach this way.

Geometric picture: Think of the lattice vectors as the edges of a parallelepiped (a "squished box"). Stacking copies of this box in all three directions fills all of space; the corners of the boxes are the lattice points.

Key Properties

  • Discrete: the points are isolated, not continuous. There is a minimum distance between neighbors.
  • Infinite: the lattice extends to infinity in all directions (in VASP, periodic boundary conditions simulate this).
  • Identical environment: every lattice point has exactly the same surroundings. This is the defining property of a Bravais lattice.
  • Non-unique choice: the lattice vectors $\mathbf{a}_1, \mathbf{a}_2, \mathbf{a}_3$ are not unique. There are infinitely many valid choices that generate the same lattice. We typically choose the primitive vectors (smallest cell) or the conventional vectors (highest symmetry cell).
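The non-uniqueness of the lattice vectors can be illustrated with a short sketch. The criterion used here is a standard result stated as an assumption, since it is not derived in this section: a new set of vectors built from integer combinations of the old ones generates the same lattice exactly when the integer matrix has determinant ±1, which also leaves the cell volume unchanged. The helper names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def transform(int_matrix, vectors):
    """New lattice vectors a_i' = sum_j M[i][j] * a_j."""
    return [
        tuple(sum(int_matrix[i][j] * vectors[j][k] for j in range(3)) for k in range(3))
        for i in range(3)
    ]

cubic = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
M_same = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]  # det = 1: generates the same lattice
M_sub = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]   # det = 2: reaches only every second point

for M in (M_same, M_sub):
    new_vectors = transform(M, cubic)
    print(det3(M), abs(det3(new_vectors)))  # matrix determinant, new cell volume
```

The sheared cell from `M_same` has the same volume as the cube and reaches every lattice point; the doubled cell from `M_sub` has twice the volume and misses half of them.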

The honeycomb is not a Bravais lattice

Graphene's honeycomb atomic arrangement is not a Bravais lattice because two neighboring sites (A and B sublattices) have different local environments. Instead, graphene is described as a hexagonal Bravais lattice with a two-atom basis. This is exactly the lattice + basis decomposition at work.
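The "identical environment" test can be run directly on the honeycomb. The sketch below assumes a hexagonal lattice with unit lattice constant and the common choice of basis positions A at (0, 0) and B at (1/3, 1/3) in fractional coordinates; it lists the vectors from each site to its nearest neighbors.

```python
import math
from itertools import product

a1 = (1.0, 0.0)
a2 = (0.5, math.sqrt(3) / 2)
basis = {"A": (0.0, 0.0), "B": (1 / 3, 1 / 3)}  # fractional coordinates (assumed)

def cart(frac):
    """Convert 2D fractional coordinates to Cartesian."""
    return (frac[0] * a1[0] + frac[1] * a2[0], frac[0] * a1[1] + frac[1] * a2[1])

def neighbor_vectors(site, cutoff=0.6):
    """Rounded vectors from `site` to all atoms closer than `cutoff`."""
    x0, y0 = cart(basis[site])
    found = []
    for n1, n2 in product(range(-2, 3), repeat=2):
        for frac in basis.values():
            x, y = cart((frac[0] + n1, frac[1] + n2))
            dx, dy = x - x0, y - y0
            if 1e-9 < math.hypot(dx, dy) < cutoff:
                found.append((round(dx, 3), round(dy, 3)))
    return sorted(found)

print(neighbor_vectors("A"))
print(neighbor_vectors("B"))
print(neighbor_vectors("A") == neighbor_vectors("B"))  # False: environments differ
```

Both sites have three neighbors at the same distance, but the bond directions around B are inverted relative to those around A, so the two sites are not related by a pure translation.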

The Unit Cell

A unit cell is a region of space that fills all of space, without gaps or overlaps, when translated by a suitable set of lattice vectors. The primitive cell is the smallest such region: it contains exactly one lattice point and tiles space exactly once under the full set of lattice translations. The volume of the primitive cell is:

$$V_{\text{cell}} = |\mathbf{a}_1 \cdot (\mathbf{a}_2 \times \mathbf{a}_3)|$$

This is the absolute value of the scalar triple product of the three primitive lattice vectors, which is geometrically the volume of the parallelepiped they define.
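As a worked check of the volume formula, the sketch below evaluates the triple product for a cubic cell and for one standard choice of FCC primitive vectors (an assumption; other primitive choices give the same volume). The lattice constant is the CdSe value used later in this section.

```python
def cross(u, v):
    """Cross product of two 3D vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def dot(u, v):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(u, v))

def cell_volume(a1, a2, a3):
    """|a1 . (a2 x a3)|, the volume of the parallelepiped."""
    return abs(dot(a1, cross(a2, a3)))

a = 6.052  # angstrom
conventional = [(a, 0, 0), (0, a, 0), (0, 0, a)]
primitive = [(0, a / 2, a / 2), (a / 2, 0, a / 2), (a / 2, a / 2, 0)]

v_conv = cell_volume(*conventional)
v_prim = cell_volume(*primitive)
print(v_conv, v_prim, v_conv / v_prim)  # the ratio is 4: four lattice points per cube
```

The ratio of 4 is the number of lattice points in the conventional FCC cell.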


The Basis

The basis specifies the physical content of each unit cell: which atoms are present and where they sit. It is a list of $N$ atoms, each described by:

  1. Species: the chemical element (e.g., Cd, Se, Mn, O).
  2. Position: expressed in fractional coordinates relative to the lattice vectors.

If the $j$-th basis atom has fractional coordinates $(x_j, y_j, z_j)$, its Cartesian position within the unit cell at lattice site $\mathbf{R}$ is:

$$\mathbf{r}_j = \mathbf{R} + x_j \mathbf{a}_1 + y_j \mathbf{a}_2 + z_j \mathbf{a}_3$$

| Symbol | Type | Meaning |
|---|---|---|
| $\mathbf{r}_j$ | 3D vector (Å) | Absolute position of the $j$-th atom in the crystal |
| $\mathbf{R}$ | 3D vector (Å) | Which unit cell this atom is in (lattice translation) |
| $x_j, y_j, z_j$ | Real numbers in $[0, 1)$ | Fractional coordinates: how far along each lattice vector the atom sits, measured as a fraction of that vector's length |

Why fractional coordinates? They are dimensionless and lattice-independent. If you change the lattice constant (e.g., under pressure), the fractional coordinates stay the same while the Cartesian positions scale accordingly. This makes them the natural choice for crystal structure databases and VASP input files.
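A small sketch of this scaling behavior: the same fractional coordinates are converted to Cartesian positions for two lattice constants. The second value (5.900 Å) is a hypothetical compressed lattice constant chosen only for illustration, and the helper names are not from any library.

```python
def frac_to_cart(frac, lattice_vectors):
    """r = x a1 + y a2 + z a3 for fractional coordinates (x, y, z)."""
    return tuple(
        sum(f * vec[k] for f, vec in zip(frac, lattice_vectors)) for k in range(3)
    )

def cubic_cell(a):
    """Lattice vectors of a cubic cell with edge length a."""
    return [(a, 0.0, 0.0), (0.0, a, 0.0), (0.0, 0.0, a)]

se_frac = (0.25, 0.25, 0.25)  # unchanged under uniform compression

for a in (6.052, 5.900):  # second value: hypothetical compressed lattice constant
    print(a, frac_to_cart(se_frac, cubic_cell(a)))
```

The fractional triple stays fixed while the Cartesian position shrinks in proportion to the lattice constant.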

Simple vs Complex Bases

| Material | Lattice | Basis | Atoms per primitive cell |
|---|---|---|---|
| Copper (Cu) | FCC | 1 Cu at (0, 0, 0) | 1 |
| NaCl | FCC | Na at (0, 0, 0) + Cl at (½, ½, ½) | 2 |
| Diamond (C) | FCC | C at (0, 0, 0) + C at (¼, ¼, ¼) | 2 |
| CdSe zinc blende | FCC | Cd at (0, 0, 0) + Se at (¼, ¼, ¼) | 2 |
| Perovskite (BaTiO₃) | Simple cubic | Ba + Ti + 3 O | 5 |
| Protein crystals | Various | Thousands of atoms | 10⁴–10⁶ |

The basis can range from a single atom (elemental metals) to hundreds of thousands of atoms (protein crystals). The lattice + basis decomposition works for all of them.


Interactive: Build a Crystal

Use the interactive visualization below to explore how the lattice + basis decomposition works. Select a crystal type, then step through the three stages: first the abstract lattice, then the basis atoms in one unit cell, and finally the full crystal.

[Interactive figure: Crystal = Lattice + Basis. Step through the decomposition of a crystal into its two building blocks. Step 1, the lattice: an infinite set of mathematically equivalent points, with no atoms, only abstract positions with perfect translational symmetry. Step 2, the basis atoms in one unit cell. Step 3, the full crystal. Legend: a₁, a₂, Cd, Se, unit cell, origin.]

Things to try

  1. Select "Zinc Blende (CdSe)" and step from 1 to 3 to see how two atoms per cell create an entirely different pattern than one.
  2. Switch to "Hexagonal (Graphene)" and notice how the 120° lattice angle creates a triangular tiling.
  3. Compare the "Simple (Cu)" and "NaCl-type" presets: same lattice, different basis, completely different crystal.

Examples Across Materials and Experiments

The lattice + basis framework is not just a textbook abstraction. It is the language used every day by experimentalists and computational scientists across physics, chemistry, and engineering.

From Simple Metals to Complex Oxides

| Material | Application | Structure (lattice + basis) | Atoms/Cell | Why It Matters |
|---|---|---|---|---|
| Cu (copper) | Wiring, electronics | FCC | 1 | Simplest metallic crystal; benchmark for DFT |
| Si (silicon) | Semiconductors | Diamond cubic (FCC + 2-atom basis) | 2 | Foundation of the electronics industry |
| NaCl (rock salt) | Table salt, IR optics | FCC + 2-atom basis | 2 | Prototype ionic crystal |
| GaAs | LEDs, solar cells | Zinc blende (FCC + 2-atom basis) | 2 | III-V semiconductor |
| CdSe | Quantum dots | Zinc blende or wurtzite | 2–4 | Our target material for VASP |
| BaTiO₃ (perovskite) | Capacitors, ferroelectrics | Simple cubic + 5-atom basis | 5 | Prototype ferroelectric; piezoelectric sensors |
| YBa₂Cu₃O₇ | High-$T_c$ superconductors | Orthorhombic | 13 | Complex oxide with layered structure |

The Same Framework Everywhere

Whether you are indexing powder diffraction peaks in a geology lab, designing a new solar cell absorber, or running a VASP calculation on a supercomputer, you always start the same way: specify the lattice vectors and the basis. The framework is universal.


Example: Zinc Blende CdSe

Cadmium selenide in the zinc blende structure is the prototype material for this textbook. Let us dissect it completely using the lattice + basis framework.

Structure Description

  • Lattice: Face-centered cubic (FCC) with lattice constant $a = 6.052$ Å.
  • Basis: Two atoms, Cd at $(0, 0, 0)$ and Se at $(\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4})$ in fractional coordinates.
  • Coordination: Each Cd is tetrahedrally surrounded by 4 Se atoms, and each Se by 4 Cd. The nearest-neighbor Cd–Se distance is $d = \frac{a\sqrt{3}}{4} \approx 2.62$ Å.
  • Space group: $F\bar{4}3m$ (number 216).
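The coordination number, bond length, and tetrahedral bond angle quoted above can be verified from the atomic positions alone. This is a minimal sketch that uses the four Se positions of the conventional cubic cell and their periodic images; the variable names are illustrative.

```python
import math
from itertools import product

a = 6.052  # angstrom
se_sites = [
    (0.25, 0.25, 0.25), (0.25, 0.75, 0.75),
    (0.75, 0.25, 0.75), (0.75, 0.75, 0.25),
]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in v))

# Vectors from the Cd at the origin to Se atoms in this cell and in the
# neighboring cells at negative offsets (periodic images).
candidates = [
    tuple(a * (s + n) for s, n in zip(site, shift))
    for shift in product((-1, 0), repeat=3)
    for site in se_sites
]
d_min = min(norm(v) for v in candidates)
nearest = [v for v in candidates if norm(v) < d_min + 1e-6]

# Angle between two of the Cd-Se bonds
cos_angle = sum(x * y for x, y in zip(nearest[0], nearest[1])) / d_min ** 2
bond_angle = math.degrees(math.acos(cos_angle))

print(len(nearest), round(d_min, 3), round(bond_angle, 2))  # 4 2.621 109.47
```

Four neighbors at $a\sqrt{3}/4$ with a bond angle of about 109.47° is the signature of tetrahedral coordination.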

VASP POSCAR File

The POSCAR file is the structural input file for VASP. It encodes the lattice vectors and atomic positions, which is exactly our lattice + basis decomposition. Each line of the file is explained in the accompanying line-by-line notes.

VASP POSCAR for Zinc Blende CdSe (file: POSCAR)

    CdSe zinc blende
    6.052
      1.000000  0.000000  0.000000
      0.000000  1.000000  0.000000
      0.000000  0.000000  1.000000
    Cd Se
    4  4
    Direct
      0.000  0.000  0.000   ! Cd
      0.000  0.500  0.500   ! Cd
      0.500  0.000  0.500   ! Cd
      0.500  0.500  0.000   ! Cd
      0.250  0.250  0.250   ! Se
      0.250  0.750  0.750   ! Se
      0.750  0.250  0.750   ! Se
      0.750  0.750  0.250   ! Se

Line-by-line notes:

  • Line 1, comment line: An arbitrary label describing the system. VASP ignores this, but it is essential for your own documentation.
  • Line 2, scaling factor: A universal multiplier applied to all lattice vectors. Here 6.052 means the lattice constant is 6.052 Å (angstroms). VASP multiplies every vector component by this number.
  • Line 3, lattice vector a₁: The first lattice vector in Cartesian coordinates: a₁ = 6.052 × (1, 0, 0) = (6.052, 0, 0) Å. Points along the x-axis.
  • Line 4, lattice vector a₂: The second lattice vector: a₂ = 6.052 × (0, 1, 0) = (0, 6.052, 0) Å. Points along the y-axis.
  • Line 5, lattice vector a₃: The third lattice vector: a₃ = 6.052 × (0, 0, 1) = (0, 0, 6.052) Å. Points along the z-axis. Together, a₁, a₂, a₃ form a cubic cell.
  • Line 6, species names: The chemical symbols of each element in the cell. Order matters: the first set of coordinates will be Cd, the second set will be Se.
  • Line 7, species counts: 4 Cd atoms and 4 Se atoms. A conventional FCC cell has 4 lattice points, so we place 4 of each species.
  • Line 8, coordinate type: 'Direct' means fractional coordinates (values between 0 and 1). The alternative is 'Cartesian' for absolute positions in Å.
  • Line 9, Cd at origin: First Cd at fractional position (0, 0, 0). This is the corner of the unit cell.
  • Line 10, Cd at face center: Second Cd at (0, 0.5, 0.5), which is the center of the y-z face. This is an FCC lattice point.
  • Line 11, Cd at face center: Third Cd at (0.5, 0, 0.5), center of the x-z face.
  • Line 12, Cd at face center: Fourth Cd at (0.5, 0.5, 0), center of the x-y face. These four Cd positions define the FCC lattice.
  • Line 13, Se at tetrahedral site: Se at (¼, ¼, ¼). This is the tetrahedral interstitial site displaced from the first Cd. The Se atoms form the second sublattice.
  • Line 14, Se at tetrahedral site: Se at (¼, ¾, ¾), displaced from the second Cd at (0, ½, ½).
  • Line 15, Se at tetrahedral site: Se at (¾, ¼, ¾), displaced from the third Cd at (½, 0, ½).
  • Line 16, Se at tetrahedral site: Se at (¾, ¾, ¼), displaced from the fourth Cd at (½, ½, 0). Every Se is tetrahedrally coordinated by 4 Cd atoms, and vice versa.

Conventional vs primitive cell

The POSCAR above uses the conventional FCC cell with 8 atoms (4 Cd + 4 Se). The primitive cell would contain only 2 atoms but with non-orthogonal lattice vectors. Both describe the same crystal — the conventional cell is easier to visualize; the primitive cell is more efficient for computation.
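The relation between the two cells can be made explicit: removing the four FCC centering translations from the 8-atom conventional cell leaves the 2-atom basis. The sketch below uses exact fractions to avoid rounding issues; the function name `canonical` is hypothetical.

```python
from fractions import Fraction as F

zero, quarter, half, three_q = F(0), F(1, 4), F(1, 2), F(3, 4)

# The eight atoms of the conventional cell, as listed in the POSCAR
conventional = [
    ("Cd", (zero, zero, zero)), ("Cd", (zero, half, half)),
    ("Cd", (half, zero, half)), ("Cd", (half, half, zero)),
    ("Se", (quarter, quarter, quarter)), ("Se", (quarter, three_q, three_q)),
    ("Se", (three_q, quarter, three_q)), ("Se", (three_q, three_q, quarter)),
]

# FCC centering translations: the four lattice points inside the cubic cell
centering = [
    (zero, zero, zero), (zero, half, half),
    (half, zero, half), (half, half, zero),
]

def canonical(frac):
    """Smallest image of a position under the four centering translations."""
    images = [tuple((x - t) % 1 for x, t in zip(frac, shift)) for shift in centering]
    return min(images)

primitive_basis = sorted({(species, canonical(frac)) for species, frac in conventional})
for species, frac in primitive_basis:
    print(species, [str(x) for x in frac])
```

Eight atoms collapse to two inequivalent ones, Cd at (0, 0, 0) and Se at (¼, ¼, ¼), which is the basis stated in the structure description.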

Experimental Evidence: How Do We Know?

How do we know that crystals are really periodic? The answer is diffraction: when waves (X-rays, electrons, or neutrons) scatter from a periodic structure, they produce sharp, discrete peaks at specific angles. This is the experimental fingerprint of crystallinity.

X-Ray Diffraction (XRD)

The workhorse of crystallography. X-ray wavelengths (~1 Å) are comparable to interatomic distances, so crystals act as natural diffraction gratings. The positions of the diffraction peaks give the lattice parameters; the intensities give information about the basis (which atoms are where). For CdSe zinc blende, the strongest peaks appear at $2\theta$ angles corresponding to the (111), (220), and (311) Miller planes.
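Peak positions follow from Bragg's law, $\lambda = 2 d \sin\theta$, with $d_{hkl} = a / \sqrt{h^2 + k^2 + l^2}$ for a cubic crystal. The sketch below assumes a Cu Kα1 source (λ = 1.5406 Å), which is not specified in the text, and applies the FCC centering rule that h, k, l must be all even or all odd.

```python
import math

def fcc_allowed(h, k, l):
    """FCC centering: a reflection survives only if h, k, l have the same parity."""
    return len({h % 2, k % 2, l % 2}) == 1

def two_theta(hkl, a, wavelength):
    """Bragg angle 2*theta in degrees for a cubic crystal, or None if unreachable."""
    h, k, l = hkl
    d = a / math.sqrt(h * h + k * k + l * l)
    s = wavelength / (2.0 * d)
    return None if s > 1.0 else 2.0 * math.degrees(math.asin(s))

A_CDSE = 6.052       # angstrom, lattice constant used in this section
CU_K_ALPHA = 1.5406  # angstrom; assumed Cu K-alpha1 wavelength

for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0), (2, 2, 0), (3, 1, 1)]:
    if fcc_allowed(*hkl):
        print(hkl, round(two_theta(hkl, A_CDSE, CU_K_ALPHA), 2))
```

Under these assumptions the (111), (220), and (311) reflections fall near 25°, 42°, and 50° in $2\theta$. The (200) reflection is allowed by the centering rule but is weak in zinc blende, because the Cd and Se contributions partly cancel.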

Electron Diffraction (TEM/SAED)

In a transmission electron microscope, a nearly parallel electron beam passing through a selected region of the sample produces a selected-area electron diffraction (SAED) pattern: a 2D array of spots that is, to a good approximation, a planar section of the reciprocal lattice (Chapter 3). Each spot corresponds to a set of crystal planes satisfying the Bragg condition. Single-crystal SAED patterns are used to determine crystal orientation and identify unknown phases.

Neutron Diffraction

Neutrons scatter from atomic nuclei (not electrons), making them sensitive to light atoms (hydrogen, lithium) and to magnetic order. Neutron diffraction is essential for determining the positions of light atoms in metal hydrides and the magnetic structure of materials like MnO.

| Technique | Probe | Scatters From | Best For |
|---|---|---|---|
| XRD | X-rays (λ ~ 1 Å) | Electron density | Lattice parameters, phase identification, crystallite size |
| Electron diffraction | Electrons (λ ~ 0.02 Å) | Electrostatic potential | Thin films, nanoparticles, local structure |
| Neutron diffraction | Neutrons (λ ~ 1 Å) | Nuclear scattering lengths | Light atoms (H, Li), magnetic structures |

The key signature

Sharp peaks = crystal. Broad halos = amorphous. This is the most fundamental experimental distinction in materials science. All three diffraction methods show this behavior because the physics is the same: periodicity causes constructive interference at discrete angles.

Crystalline vs Amorphous

Not all solids are crystals. Amorphous materials (glasses, many polymers, amorphous silicon) lack long-range translational order. They may have short-range order — nearest-neighbor bond lengths and angles are well-defined — but no periodicity beyond a few atomic distances.

| Property | Crystalline | Amorphous |
|---|---|---|
| Long-range order | Yes: extends to macroscopic scales | No: only short-range (1–3 coordination shells) |
| Diffraction pattern | Sharp Bragg peaks at discrete angles | Broad halos (diffuse scattering) |
| Melting behavior | Sharp melting point | Gradual glass transition ($T_g$) |
| Described by lattice + basis | Yes | No |
| VASP-friendly (periodic) | Yes: small unit cell suffices | Requires large supercells (100+ atoms) |
| Example | Quartz (crystalline SiO₂) | Window glass (amorphous SiO₂) |
| Pair distribution function | Sharp peaks at well-defined distances | Peaks broaden and merge with distance |

Use the interactive below to see how introducing atomic disorder transforms a perfect crystal into an amorphous solid. Pay attention to how the simulated diffraction pattern changes.

[Interactive figure: Crystalline vs Amorphous. A slider adds disorder, from 0% (perfect crystal) to fully amorphous. The simulated diffraction pattern, plotted against the scattering vector q, changes from sharp Bragg peaks to progressively broadened features.]
Key insight: Crystalline order produces sharp Bragg peaks in diffraction because periodicity leads to constructive interference at specific angles. Amorphous disorder smears these into broad halos because the regular spacing information is lost.
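The loss of sharp peaks can be reproduced with a one-dimensional toy model. This is a sketch under stated assumptions, not a simulation of a real glass: atoms sit on a chain whose bond lengths fluctuate randomly, so positional errors accumulate and long-range order is lost. The scattered intensity is evaluated at the first Bragg position of the perfect chain.

```python
import cmath
import math
import random

def chain(n_atoms, spacing=1.0, sigma=0.0, seed=0):
    """1D chain whose bond lengths fluctuate by `sigma` (errors accumulate)."""
    rng = random.Random(seed)
    x, positions = 0.0, []
    for _ in range(n_atoms):
        positions.append(x)
        x += spacing + rng.gauss(0.0, sigma)
    return positions

def intensity(positions, q):
    """Scattered intensity per atom, |sum_j exp(i q x_j)|^2 / N."""
    amplitude = sum(cmath.exp(1j * q * x) for x in positions)
    return abs(amplitude) ** 2 / len(positions)

q_bragg = 2.0 * math.pi  # first Bragg peak for spacing = 1

for sigma in (0.0, 0.05, 0.2):
    print(sigma, intensity(chain(400, sigma=sigma), q_bragg))
```

For the perfect chain every atom scatters in phase and the intensity per atom equals the number of atoms. With bond-length disorder the phases drift apart along the chain and the peak collapses toward the diffuse background.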

Amorphous materials in VASP

You can simulate amorphous materials in VASP, but you must use a large supercell (often 100–300 atoms) generated by a melt-quench molecular dynamics protocol. This is far more expensive than a crystalline calculation with 2–10 atoms per cell. The entire efficiency advantage of periodicity is lost, which is why crystallography matters.

Why Periodicity Enables Computation

The central promise of crystallography for computational physics is this: because a crystal is periodic, we only need to solve the quantum-mechanical equations for one unit cell, not for the $\sim 10^{23}$ atoms in a macroscopic sample. This is not just a convenience; it is the difference between possible and impossible.

Bloch's Theorem (Preview)

In a periodic potential, the solutions to the Schrödinger equation take a special form known as Bloch states:

$$\psi_{n\mathbf{k}}(\mathbf{r}) = e^{i\mathbf{k} \cdot \mathbf{r}} \, u_{n\mathbf{k}}(\mathbf{r})$$

| Symbol | Type | Meaning |
|---|---|---|
| $\psi_{n\mathbf{k}}(\mathbf{r})$ | Complex function | The electronic wave function for band $n$ at wave vector $\mathbf{k}$ |
| $e^{i\mathbf{k} \cdot \mathbf{r}}$ | Complex phase | A plane wave that carries the crystal momentum $\mathbf{k}$ |
| $u_{n\mathbf{k}}(\mathbf{r})$ | Periodic function | Has the same periodicity as the lattice: $u(\mathbf{r} + \mathbf{R}) = u(\mathbf{r})$ |
| $\mathbf{k}$ | 3D vector (Å⁻¹) | Crystal momentum, lives in reciprocal space (Chapter 3) |
| $n$ | Integer | Band index, labels different energy levels at the same $\mathbf{k}$ |

What does Bloch's theorem say in plain language? Because the crystal potential is periodic, the wave function at any point in the crystal can be reconstructed from the wave function in just one unit cell, multiplied by a plane-wave phase $e^{i\mathbf{k} \cdot \mathbf{r}}$. Different choices of $\mathbf{k}$ probe different wavelengths of electronic oscillation through the crystal.
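The Bloch form implies $\psi(\mathbf{r} + \mathbf{R}) = e^{i\mathbf{k} \cdot \mathbf{R}} \psi(\mathbf{r})$, which is easy to check numerically in one dimension. In the sketch below the periodic function `u` is an arbitrary toy choice, not the solution of any particular Schrödinger equation.

```python
import cmath
import math

A = 1.0  # assumed lattice spacing of a 1D model crystal (arbitrary units)

def u(x):
    """Toy cell-periodic function: u(x + A) = u(x)."""
    return 1.0 + 0.3 * math.cos(2.0 * math.pi * x / A)

def psi(x, k):
    """Bloch state: plane-wave phase times the periodic part."""
    return cmath.exp(1j * k * x) * u(x)

k = 0.7 * math.pi / A  # an arbitrary wave vector inside the first Brillouin zone
x = 0.37               # an arbitrary position in the first cell
n = 3                  # translate by R = n * A

lhs = psi(x + n * A, k)
rhs = cmath.exp(1j * k * n * A) * psi(x, k)
print(abs(lhs - rhs))                       # ~0: psi(x + R) = exp(ikR) psi(x)
print(abs(lhs) ** 2 - abs(psi(x, k)) ** 2)  # ~0: the density is lattice-periodic
```

The wave function itself is not periodic, but it changes only by a phase from cell to cell, so the electron density has the full periodicity of the lattice.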

The computational miracle

Bloch's theorem converts an infinite-dimensional problem (all electrons in an infinite crystal) into a finite-dimensional problem: solve for the periodic part $u_{n\mathbf{k}}$ at a discrete set of $\mathbf{k}$-points. This is why VASP, like all plane-wave DFT codes, works. Without periodicity, the entire framework collapses.

From One Cell to Infinity: The VASP Workflow

  1. Define the unit cell in the POSCAR file (lattice vectors + basis).
  2. VASP imposes periodic boundary conditions: the cell is replicated infinitely in all three directions.
  3. Expand wave functions in a plane-wave basis set up to an energy cutoff (set in INCAR via ENCUT).
  4. Solve the Kohn–Sham equations at each $\mathbf{k}$-point on a discrete grid (the KPOINTS file).
  5. Integrate over $\mathbf{k}$-space to obtain total energy, forces, band structure, density of states, and other properties.
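To get a feel for the size of the plane-wave basis in step 3, the following rough estimate counts reciprocal-lattice vectors inside the cutoff sphere. It is a free-electron counting argument, not how VASP itself reports basis sizes, and the cutoff energies are hypothetical values chosen for illustration.

```python
import math

HBAR2_OVER_2M = 3.80998  # hbar^2 / (2 m_e) in eV * angstrom^2

def estimate_plane_waves(cell_volume, encut):
    """Rough number of plane waves with kinetic energy below `encut` (eV).

    Divides the volume of the cutoff sphere in reciprocal space by the
    reciprocal-space volume per G-vector, (2 pi)^3 / cell_volume.
    """
    g_cut = math.sqrt(encut / HBAR2_OVER_2M)  # cutoff radius in 1/angstrom
    sphere = 4.0 / 3.0 * math.pi * g_cut ** 3
    return cell_volume * sphere / (2.0 * math.pi) ** 3

a = 6.052                    # angstrom
v_primitive = a ** 3 / 4.0   # volume of the 2-atom primitive cell

for encut in (300.0, 400.0, 500.0):  # hypothetical cutoffs in eV
    print(encut, round(estimate_plane_waves(v_primitive, encut)))
```

The estimate scales linearly with cell volume and as the cutoff energy to the power 3/2, which is why small periodic cells are so much cheaper than large supercells.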

Modern Computing & Materials Informatics

Crystallography is not a relic of the 19th century. It is the natural language of periodic matter, and as such it appears at the foundation of every modern computational and data-driven approach to materials science.

DFT and Periodic Boundary Conditions

All major DFT codes (VASP, Quantum ESPRESSO, ABINIT, CASTEP) assume periodicity. The lattice vectors define the simulation box; the basis defines the atomic content. When you run a DFT calculation, you are exploiting the crystal structure at every step:

  • $\mathbf{k}$-space sampling: the Brillouin zone (a primitive cell of the reciprocal lattice) is sampled with a discrete grid. More symmetry in the crystal means fewer independent $\mathbf{k}$-points are needed, so symmetry reduces computational cost.
  • Plane-wave basis: the periodic part of Bloch states is expanded in plane waves, which are natural eigenfunctions of the translation operator.
  • Symmetry reduction: space-group operations map the full Brillouin zone onto its irreducible wedge, cutting the number of independent $\mathbf{k}$-points by up to 48× for cubic crystals.
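The effect of symmetry on the k-point count can be sketched for the simplest case. The example assumes a simple cubic reciprocal lattice with the full cubic point group (48 operations, represented as signed permutations of the axes) and a Γ-centered grid; real codes handle general lattices and space groups, so this is an illustration only.

```python
from itertools import permutations, product

def cubic_point_group():
    """The 48 operations of the full cubic point group as (permutation, signs)."""
    return [(p, s) for p in permutations(range(3)) for s in product((1, -1), repeat=3)]

def irreducible_kpoints(q):
    """Group a Gamma-centered q x q x q grid (integer indices mod q) into orbits.

    Returns a dict mapping one representative per orbit to its weight
    (the number of grid points in that orbit).
    """
    ops = cubic_point_group()
    seen, representatives = set(), {}
    for k in product(range(q), repeat=3):
        if k in seen:
            continue
        orbit = {tuple((s[i] * k[p[i]]) % q for i in range(3)) for p, s in ops}
        seen |= orbit
        representatives[k] = len(orbit)
    return representatives

irr = irreducible_kpoints(4)
print(len(cubic_point_group()), 4 ** 3, len(irr))  # 48 64 10
```

On this small grid the 64 points reduce to 10 inequivalent ones. The full factor of 48 is only approached on dense grids, where most points lie away from symmetry planes and axes.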

Crystal Structure Databases

The crystallographic description (lattice + basis) is how all structure databases store materials:

| Database | Content | Approx. entries |
|---|---|---|
| ICSD | Inorganic crystal structures (experimental) | ~280,000 |
| Materials Project | DFT-computed properties for known and predicted crystals | ~150,000 |
| AFLOW | Automated DFT workflows and structure prototypes | ~3,600,000 |
| OQMD | DFT formation energies | ~1,000,000 |
| COD | Organic and inorganic structures (open access) | ~530,000 |

Machine Learning for Materials

Modern ML approaches to materials discovery rely heavily on crystallographic representations:

  • Graph neural networks (GNNs) such as CGCNN, MEGNet, and M3GNet represent crystals as graphs where nodes are atoms and edges encode distances and lattice periodicity. The lattice vectors are essential for correctly handling periodic boundary conditions.
  • Symmetry-aware descriptors such as SOAP (Smooth Overlap of Atomic Positions) encode local atomic environments in a way that is invariant under translations and rotations of the crystal.
  • Crystal structure prediction algorithms (USPEX, CALYPSO, AIRSS) explore the space of possible lattice + basis combinations to find new stable phases.
  • Generative models (DiffCSP, CDVAE) learn to generate new crystal structures by operating directly on lattice parameters and fractional coordinates — exactly the quantities defined in this section.

Why crystallography is the language of materials ML

Every crystal can be described compactly by a small set of numbers: 6 lattice parameters $(a, b, c, \alpha, \beta, \gamma)$ plus $3N$ fractional coordinates and $N$ species labels for $N$ atoms. This compact, physically meaningful representation is what makes crystals amenable to machine learning, far more so than amorphous or disordered materials.
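The six lattice parameters follow directly from the three lattice vectors. The sketch below uses the usual crystallographic convention (α between the second and third vectors, β between the first and third, γ between the first and second) and applies it to one standard choice of FCC primitive vectors; the function name is hypothetical.

```python
import math

def lattice_parameters(a1, a2, a3):
    """Return (a, b, c, alpha, beta, gamma) with lengths in input units, angles in degrees."""
    def norm(v):
        return math.sqrt(sum(c * c for c in v))

    def angle(u, v):
        cos_t = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

    return (norm(a1), norm(a2), norm(a3), angle(a2, a3), angle(a1, a3), angle(a1, a2))

a0 = 6.052  # cubic lattice constant of zinc blende CdSe (angstrom)
primitive = [(0, a0 / 2, a0 / 2), (a0 / 2, 0, a0 / 2), (a0 / 2, a0 / 2, 0)]

a, b, c, alpha, beta, gamma = lattice_parameters(*primitive)
n_atoms = 2
print(round(a, 4), round(alpha, 2))  # edge length a0 / sqrt(2), angle 60 degrees
print(6 + 3 * n_atoms)               # numbers needed besides the species labels
```

For the two-atom primitive cell of CdSe, the whole structure is fixed by 12 numbers and two element symbols.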

Computational Exploration: Building a Crystal in Python

Let us put theory into practice. The Python code below uses the Atomic Simulation Environment (ASE) to build the zinc blende CdSe crystal, inspect its structure, create a supercell, and export it in formats ready for VASP and crystallographic databases. Notes on the key steps accompany the script.

Building Zinc Blende CdSe with ASE (file: build_cdse.py)

    from ase.build import bulk
    from ase.io import write

    # Build zinc blende CdSe by structure name. bulk() returns the 2-atom
    # primitive cell by default; pass cubic=True for the 8-atom conventional cell.
    cdse = bulk('CdSe', crystalstructure='zincblende', a=6.052)

    # Inspect the unit cell: each row of the 3x3 array is one lattice vector
    print("Lattice vectors (angstrom):")
    print(cdse.cell[:])

    # Fractional (direct) coordinates of every atom
    print("\nAtom positions (fractional):")
    for atom, frac in zip(cdse, cdse.get_scaled_positions()):
        print(f"  {atom.symbol}: ({frac[0]:.3f}, {frac[1]:.3f}, {frac[2]:.3f})")

    print(f"\nNumber of atoms: {len(cdse)}")
    print("Space group: F-43m (216)")  # stated for reference, not computed here

    # Create a 2x2x2 supercell (8 copies of the 2-atom cell = 16 atoms)
    supercell = cdse.repeat((2, 2, 2))
    print(f"Supercell atoms: {len(supercell)}")

    # Export to VASP POSCAR format with fractional ("Direct") coordinates
    write('POSCAR', cdse, format='vasp', direct=True)

    # Export to CIF for exchange with crystallographic databases
    write('CdSe_zincblende.cif', cdse)

Notes on the script:

  • Import ASE modules: The Atomic Simulation Environment (ASE) is a Python library for atomistic simulations. bulk() builds common crystal structures from a structure name and lattice parameters.
  • Build CdSe crystal: bulk() creates a CdSe unit cell in the zinc blende structure with lattice constant a = 6.052 Å, placing Cd at (0, 0, 0) and Se at (¼, ¼, ¼). By default it returns the two-atom primitive cell. The resulting structure has space group F-43m (#216).
  • Print lattice vectors: cdse.cell[:] returns a 3×3 array where each row is a lattice vector in angstroms. For the primitive cell these are the FCC primitive vectors, with components 0 and a/2 = 3.026 Å. With cubic=True the array is diagonal with 6.052 on the diagonal.
  • Print fractional coordinates: Fractional (or 'direct') coordinates express each atom position as a linear combination of lattice vectors, with coefficients between 0 and 1. These are the numbers that go into a POSCAR file.
  • Space group: The Hermann-Mauguin symbol F-43m (number 216) encodes all symmetry operations of the zinc blende structure: face-centering (F), 4-fold rotoinversion (-4), 3-fold rotation (3), and mirror planes (m). The script prints it as a fixed label and does not compute it.
  • Build supercell: repeat((2, 2, 2)) tiles the unit cell 2×2×2 = 8 times, producing a supercell with 8 × 2 = 16 atoms. Supercells are needed for defect calculations (e.g., Mn doping) and molecular dynamics.
  • Export POSCAR: Writes the structure in VASP's native format. The resulting file describes the same crystal as the POSCAR shown above, but in the two-atom primitive cell rather than the eight-atom conventional cell.
  • Export CIF: Crystallographic Information File (CIF) is the standard exchange format for crystal structures, used by databases like the ICSD, Materials Project, and the Cambridge Structural Database.

Try it yourself

Install ASE with pip install ase and run this script. Then open the generated POSCAR file and compare it with the one shown above: bulk() returns the two-atom primitive cell by default, and passing cubic=True yields the eight-atom conventional cell. Try building other crystal structures: bulk('Si', 'diamond', a=5.43), bulk('NaCl', 'rocksalt', a=5.64), bulk('GaAs', 'zincblende', a=5.65).

Summary

In this section we established the foundational vocabulary of crystallography:

  • A crystal is a solid with long-range translational order. Its atomic density satisfies $\rho(\mathbf{r} + \mathbf{R}) = \rho(\mathbf{r})$ for all lattice translations $\mathbf{R}$.
  • Every crystal decomposes uniquely into a lattice (abstract grid) and a basis (atoms per grid point): $\text{Crystal} = \text{Lattice} \otimes \text{Basis}$.
  • The lattice is generated by three primitive vectors: $\mathbf{R} = n_1 \mathbf{a}_1 + n_2 \mathbf{a}_2 + n_3 \mathbf{a}_3$, where $n_1, n_2, n_3 \in \mathbb{Z}$.
  • The basis specifies atom types and fractional coordinates within the unit cell.
  • Zinc blende CdSe is an FCC lattice ($a = 6.052$ Å) with a two-atom basis: Cd at $(0,0,0)$ and Se at $(\tfrac{1}{4}, \tfrac{1}{4}, \tfrac{1}{4})$.
  • Experimentally, crystallinity is detected by sharp diffraction peaks (XRD, electron, neutron). Amorphous materials show broad halos.
  • Periodicity enables Bloch's theorem, which reduces the infinite crystal to a finite computation — the foundation of all VASP and plane-wave DFT calculations.
  • Modern materials informatics (databases, ML, crystal structure prediction) all operate on the lattice + basis representation defined here.

In the next section, we will make the lattice concrete by defining lattice vectors, unit cells, and the six parameters $(a, b, c, \alpha, \beta, \gamma)$ that fully characterize the geometry of any lattice.