RDKit’s Python wheels live on PyPI under a maintainer who is technically not part of the core RDKit team – yet the official docs now link to his repo as the recommended pip path. That bit of trivia matters because half the tutorials on Google still tell you pip install rdkit doesn’t work. It does. Has for years.
This guide covers the latest RDKit (2026.03.2, released May 16, 2026) on Linux, macOS, and Windows. RDKit is the de facto open-source cheminformatics library – it parses SMILES, computes fingerprints, generates 3D conformers, and powers most ML pipelines for small molecules. We’ll cover the fast path, the source build, and the one error that ruins more installs than anything else: Boost version mismatch.
System requirements
RDKit is a C++ library with Python bindings, so the requirements are about your compiler toolchain more than your CPU.
| Component | Minimum | Recommended |
|---|---|---|
| OS | Linux (glibc 2.28+), macOS 11+, Windows 10 | Ubuntu 22.04 / macOS 13+ / Windows 11 |
| Python | 3.9 | 3.11 or 3.12 |
| RAM | 2 GB (pip install) | 8 GB+ (source build) |
| Disk | ~500 MB (wheel) | 5 GB+ (full source build) |
| CMake (source only) | 3.18 | 3.27+ |
| Boost (source only) | 1.70 | conda-forge boost |
The CMake number is a real trap. RDKit’s build system requires CMake 3.18 or newer, and Ubuntu 20.04 LTS ships CMake 3.16 in its default repos. The configure step fails immediately rather than at compile time, which is good, because compile time is two hours later.
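If you'd rather catch the CMake problem before starting a source build, a small preflight check is enough. The sketch below is not part of RDKit; the function names are made up for this guide, and it assumes `cmake --version` prints a line of the form `cmake version X.Y.Z`. The 3.18 floor is taken from the table above.

```python
import re
import shutil
import subprocess

MIN_CMAKE = (3, 18)  # minimum from the requirements table in this guide


def parse_cmake_version(output):
    """Extract (major, minor, patch) from `cmake --version` output, or None."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        return None
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def cmake_is_new_enough(output, minimum=MIN_CMAKE):
    """True if the reported CMake version meets the minimum (major, minor)."""
    version = parse_cmake_version(output)
    return version is not None and version[:2] >= minimum


def check_local_cmake():
    """Run the local cmake, if there is one, and describe what was found."""
    exe = shutil.which("cmake")
    if exe is None:
        return "cmake not found on PATH"
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    version = parse_cmake_version(result.stdout)
    if version is None:
        return "could not parse cmake version"
    verdict = "OK" if version[:2] >= MIN_CMAKE else "too old for RDKit"
    return "cmake {}.{}.{}: {}".format(*version, verdict)


if __name__ == "__main__":
    print(check_local_cmake())
```

On a stock Ubuntu 20.04 box this should report the 3.16 series as too old, which is your cue to install a newer CMake from conda-forge or pip before configuring.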
Method 1: pip (the one most guides skip)
If you just want from rdkit import Chem to work, this is it. The official installation page explicitly endorses pip wheels for Linux, Windows, and macOS across all major Python versions.
python -m venv rdkit-env
source rdkit-env/bin/activate # Windows: rdkit-env\Scripts\activate
pip install rdkit==2026.3.2
That’s the whole install. No Boost. No conda. No compilers. The wheel bundles everything as .so / .dylib / .dll files.
The wheel ships with most of RDKit but not every optional component – the PostgreSQL cartridge and some niche descriptor modules need a full source build. For 95% of use cases – parsing molecules, fingerprints, descriptors, drawing – it’s complete.
Method 2: conda (still the docs’ first recommendation)
If you’re in a scientific Python stack with NumPy, SciPy, and Jupyter already running on conda, stay there.
conda create -c conda-forge -n my-rdkit-env rdkit
conda activate my-rdkit-env
This pulls the conda-forge build, which gets the most maintenance attention from core contributors. The reason the official docs still lead with conda: RDKit’s C++ extensions have a hard dependency on Boost::Python, which has to be dynamically linked. Conda handles that cleanly. Pip wheels handle it by bundling.
Method 3: build from source (when you actually need it)
Skip this unless you need a non-default build flag (PostgreSQL cartridge, custom fingerprint code, debug symbols). The official source path:
conda create -n rdkit_build -c conda-forge gxx_linux-64 cmake \
    cairo pillow eigen pkg-config boost-cpp boost \
    numpy matplotlib pandas pytest
conda activate rdkit_build
git clone https://github.com/rdkit/rdkit.git
cd rdkit
mkdir build && cd build
cmake -DPy_ENABLE_SHARED=1 \
    -DRDK_INSTALL_INTREE=ON \
    -DRDK_INSTALL_STATIC_LIBS=OFF \
    -DRDK_BUILD_CPP_TESTS=ON \
    -DBOOST_ROOT="$CONDA_PREFIX" \
    ..
make -j4
make install
The RDK_INSTALL_INTREE=ON flag installs into the source tree itself – useful for development, annoying for production. Flip it to OFF and add -DCMAKE_INSTALL_PREFIX="$CONDA_PREFIX" if you want it installed into the active conda env.
Pro tip: Use make -j$(nproc) on Linux only if you have at least 8 GB of free RAM. The link step on Boost-heavy modules can swallow 2 GB per parallel job. If you’re on a laptop with 16 GB, -j4 is the safe ceiling.
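The arithmetic behind that tip is simple enough to write down. This is a rough sketch, not a measured model: the 2 GB per job figure is the estimate quoted above, and `safe_make_jobs` is a hypothetical helper name. You supply the free RAM yourself (for example from `free -g`).

```python
import os


def safe_make_jobs(free_ram_gb, cpu_count=None, gb_per_job=2.0):
    """Largest -j value that fits in free RAM, capped at the CPU count.

    Always returns at least 1 so the build can still proceed serially.
    """
    if cpu_count is None:
        cpu_count = os.cpu_count() or 1
    by_memory = int(free_ram_gb // gb_per_job)
    return max(1, min(cpu_count, by_memory))


if __name__ == "__main__":
    # 8 GB free on a 16-core machine: memory, not cores, is the limit.
    print("make -j{}".format(safe_make_jobs(8, cpu_count=16)))  # make -j4
```

With 8 GB free the memory limit gives 4 jobs regardless of how many cores you have, which lines up with the -j4 ceiling suggested for a 16 GB laptop.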
First-run verification
Don’t just check the version string. A broken Boost link will let import rdkit succeed and then crash on the first real call. Run this instead:
python -c "
from rdkit import Chem, __version__
# Importing Draw checks that the drawing module loads, even though it is not called.
from rdkit.Chem import AllChem, Draw, rdFingerprintGenerator
print('RDKit version:', __version__)
mol = Chem.MolFromSmiles('Cn1cnc2c1c(=O)n(C)c(=O)n2C') # caffeine
assert mol is not None, 'SMILES parser broken'
AllChem.EmbedMolecule(mol) # uses ETKDG by default
assert mol.GetNumConformers() == 1, '3D embedding broken'
# Generator API; the older GetMorganFingerprintAsBitVect is deprecated.
gen = rdFingerprintGenerator.GetMorganGenerator(radius=2, fpSize=2048)
fp = gen.GetFingerprint(mol)
assert fp.GetNumOnBits() > 0, 'Fingerprint broken'
print('All subsystems OK')
"
If all three asserts pass, the install is solid. The EmbedMolecule call exercises ETKDG conformer generation, which replaced standard distance geometry as the default. If that step segfaults, your Boost link is wrong.
The Boost ImportError, explained properly
The most reported install failure:
ImportError: libboost_python3.so.1.65.1: cannot open shared object file: No such file or directory
Most tutorials tell you to reinstall and shrug. The real cause: Boost embeds its version number into the .so filename, so an RDKit binary built against Boost 1.65 cannot dynamically load Boost 1.67 – even though they’re ABI-similar. Conda sometimes upgrades Boost independently of RDKit when you install another package later, and the link breaks.
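The fix isn’t pip install --force-reinstall. You need to pin both packages to compatible versions. The filename in the error tells you which Boost version RDKit was built against; to see which Boost version is actually installed in the environment, run: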
conda list | grep boost
Then pin both together – for example, if your environment shows libboost-python 1.86:
conda install -c conda-forge "rdkit=2025.09" "libboost-python=1.86"
Replace 1.86 with whatever version conda list shows in your environment. Or just use the pip wheel, which bundles its own Boost and sidesteps the problem entirely.
Here’s what makes this frustrating: the error looks like a missing file, not a version conflict. You might have Boost 1.74 installed but the loader is hunting for the 1.65 filename specifically – it won’t find it even if Boost is right there. That’s worth knowing before you spend an hour reinstalling things that aren’t actually broken.
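To make the "missing file versus version conflict" distinction concrete, here is a small sketch that pulls the library name and version out of the error text and compares it with the version you have installed. It only handles Linux-style `.so` names like the one in the error above, the helper names are invented for this guide, and you have to pass in the installed version yourself (for example from conda list).

```python
import re

# Matches names such as libboost_python3.so.1.65.1 (Linux naming only).
_BOOST_SONAME = re.compile(r"lib(boost_[A-Za-z0-9_]+)\.so\.(\d+(?:\.\d+)*)")


def wanted_boost(error_message):
    """Return (library, version) the loader asked for, or None if no match."""
    match = _BOOST_SONAME.search(error_message)
    if match is None:
        return None
    return match.group(1), match.group(2)


def diagnose(error_message, installed_version):
    """Say whether the error looks like a version mismatch or a path problem."""
    wanted = wanted_boost(error_message)
    if wanted is None:
        return "no Boost library name found in the message"
    library, version = wanted
    # Compare major.minor only; that is enough to show the builds differ.
    if version.split(".")[:2] != installed_version.split(".")[:2]:
        return "{} wanted at {}, environment has {}: version mismatch".format(
            library, version, installed_version
        )
    return "{} {} matches the installed version: check the loader path".format(
        library, version
    )


if __name__ == "__main__":
    message = (
        "ImportError: libboost_python3.so.1.65.1: cannot open shared "
        "object file: No such file or directory"
    )
    print(diagnose(message, "1.74.0"))
```

If the versions differ, pinning is the fix. If they match, the file really is missing from the loader's search path and the environment itself needs a look.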
Upgrading and uninstalling
Upgrade in place:
# pip
pip install --upgrade rdkit
# conda
conda update -c conda-forge rdkit
Check the backwards-incompatible changes page before jumping major versions. The 2025.09 cycle switched JSON handling from RapidJSON to Boost, which broke a handful of downstream tools that linked against the old symbol. Dry reading, but 30 seconds there beats two hours debugging later.
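One way to decide when that page is worth opening: RDKit versions follow a year.month.patch pattern, so a change in the first two fields means you are crossing a release cycle. A minimal sketch, assuming version strings shaped like 2025.09.1 (the helper names are hypothetical, not an RDKit API):

```python
def release_cycle(version):
    """Return (year, month) from a version string such as '2025.09.1'."""
    year, month = version.split(".")[:2]
    return int(year), int(month)


def crosses_release_cycle(installed, target):
    """True if upgrading from installed to target changes the release cycle."""
    return release_cycle(installed) != release_cycle(target)


if __name__ == "__main__":
    if crosses_release_cycle("2025.09.1", "2026.03.2"):
        print("New release cycle: read the backwards-incompatible changes first")
```

Patch-level bumps within the same cycle return False. Parsing to integers also means the pip-style 2026.3.2 and the zero-padded 2026.03.2 compare as the same cycle.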
Uninstall is boring, which is how it should be:
pip uninstall rdkit
# or
conda remove -n my-rdkit-env --all # nukes the whole env
For source builds, delete the build/ directory and remove any RDBASE, PYTHONPATH, and LD_LIBRARY_PATH exports you added to your shell rc file – they tend to linger.
A note on the JavaScript build
If you’re building a web app and don’t need full RDKit, there’s @rdkit/rdkit on npm – a WASM-compiled subset called MinimalLib. The latest version as of late 2025 is 2025.3.4-1.0.0, lagging the Python release by roughly two cycles. Not every Python function exists in MinimalLib – the core team chose a subset appropriate for a JS context. Good enough for rendering and basic substructure search, not for fingerprints combined with ML pipelines.
FAQ
Should I use pip or conda for RDKit in 2026?
Not already on conda? Use pip. Done.
Why does my Jupyter kernel not see RDKit after installing it?
This bites people running JupyterLab from a base conda env while installing RDKit into a child env. Jupyter is using the base interpreter, not the env’s. Fix it by installing ipykernel into your RDKit env and registering it: python -m ipykernel install --user --name my-rdkit-env. Then switch kernels in the notebook UI.
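To confirm that this is what's happening, run something like the following in a notebook cell. It uses only the standard library; `kernel_report` is a name made up for this guide. If the interpreter path points at the base env rather than my-rdkit-env, the kernel is the problem, not the install.

```python
import importlib.util
import sys


def kernel_report(package="rdkit"):
    """Describe the running interpreter and whether it can find the package."""
    spec = importlib.util.find_spec(package)
    return {
        "interpreter": sys.executable,
        "prefix": sys.prefix,
        "package_found": spec is not None,
        "package_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in kernel_report().items():
        print("{}: {}".format(key, value))
```

The check uses find_spec rather than a plain import, so it reports a missing package without raising.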
Is RDKit really free for commercial use?
Yes – and this trips people up because some cheminformatics tools use GPL or restrictive academic licenses. RDKit is BSD-3-Clause: commercial use, modification, and redistribution are all permitted. The only requirement is keeping the copyright notice in your distribution. No copyleft. No “contact us for a commercial license” catch.
Next step: open a Python shell, paste the verification script above, and confirm all three asserts pass before you write a single line of real code. A broken install will waste hours later – five seconds now saves them.