Export anaconda environment with conda export
conda env export not working as expected? Jump to the solution!
Managing projects with virtual environments is one of the core practices of modern Python development. There are several tools for managing virtual environments, such as venv, pipenv, poetry and conda. Some of these tools also provide dependency and project management capabilities, and there is no universally acknowledged favorite. The exact tool you use matters much less than a clear and consistent development process that you and your team follow.
In this blog post we will discuss one specific problem with managing conda environment specifications. I would like to keep this post focused, so we will not dive into the reasons for choosing conda over other tools, nor will we discuss the full Python development pipeline.
Why version environment specifications
Saving and versioning your environment spec is always a good idea, even when you work alone on a pet project. A specification stored along with your code will help you recreate your environment on a different machine, or when you take a break from your project and get back to it half a year later.
Having an environment spec is even more important when you work with a team: all of your teammates should be able to spin up a new environment quickly and reliably.
As mentioned in this great blog post, we need to accomplish two distinct and sometimes conflicting objectives:
- Reproducibility. You should be able to reproduce your environment with as much precision as possible. It's really important that you deploy your code to the same environment that you develop and test it in. You can achieve great reproducibility with the conda-lock tool, but such environment specs are really hard to read, reason about and maintain.
- Upgradability. You should be able to manage your environment specification by hand in a text editor. Ideally, you would start your project with an empty environment and gradually install packages with conda install or even pip install. After some initial fiddling you settle on an environment that works, and you would like to save its configuration for your team to use. You would like everyone to have the exact same environment, so you use conda-lock to generate locked specifications for each platform. But to generate a lockfile, you first need a less strict version of the specification: one that contains only the packages and versions that you really care about, and not all the transitive dependencies. You can edit this file by hand and be sure that the packaging system will do its best to resolve all the dependencies. Right now you would either author this file completely by hand, or use conda env export and then edit the output by hand, since the resulting specifications are unfortunately not portable.
Automating environment specifications with conda export
Since it's quite burdensome to manually track all the packages you install, I decided to find a way to do this automatically. I quickly found that conda env export doesn't help much, since:
- It exports all the transitive dependencies too. This is problematic, since transitive dependencies might differ between platforms.
- It pins every package to an exact version and build, which prevents cross-platform portability and generally defeats the idea of upgradability that we discussed previously.
- It includes the environment prefix, which must also be removed for the specification to be portable.
As you can see, conda env export doesn't help much with our task, and one might argue that it's more effective to write the specification by hand.
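To make the manual cleanup concrete, here is a minimal Python sketch that post-processes the text of a conda env export file. The function name strip_export and the sample content (including the build strings and the prefix path) are made up for illustration. The sketch only removes the prefix line and the build strings; it cannot tell which packages are transitive dependencies and it leaves the exact versions in place, which is why text processing alone does not solve the problem.

```python
# Illustrative sample only: the build strings and prefix path are invented.
SAMPLE = """\
name: demo
channels:
  - conda-forge
dependencies:
  - numpy=1.26.4=py312heda63a1_0
  - python=3.12.3=hab00c5b_0_cpython
  - pip:
    - requests==2.31.0
prefix: /home/user/miniconda3/envs/demo
"""


def strip_export(exported: str) -> str:
    """Drop the prefix line and conda build strings from exported YAML text."""
    kept = []
    for line in exported.splitlines():
        item = line.strip()
        if item.startswith("prefix:"):
            continue  # machine-specific path, useless on other machines
        if item.startswith("- ") and "==" not in item and item.count("=") == 2:
            # conda entry of the form name=version=build: drop the build part
            line = line.rsplit("=", 1)[0]
        kept.append(line)
    return "\n".join(kept) + "\n"


print(strip_export(SAMPLE))
```

Even after this cleanup, every transitive dependency is still listed with an exact version, so the result remains hard to maintain by hand.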
To tackle this problem I've created conda-export, a tool that generates portable environment specifications that contain only the top-level packages you installed over the lifetime of your environment.
The solution
You can install conda-export into your base (root) environment, so that your other environments are not cluttered by tools that are not relevant to the projects:
conda install conda-export -n base
You can then export your portable environment specification with a command like this:
conda export -n [environment name] -f environment.yml
And that's it! You will find that conda export tries to minimize the total number of packages in a specification by taking into account only those packages that you explicitly installed. It includes specific versions only if you specified them, and it also handles pip packages separately and correctly.
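To illustrate the shape of such a specification, here is a small Python sketch that renders an environment.yml from a list of explicitly requested conda and pip packages. This is not how conda-export is implemented; render_environment, the package lists and the conda-forge channel are all assumptions chosen for the example.

```python
def render_environment(name, conda_specs, pip_specs=(), channels=("conda-forge",)):
    """Render a minimal environment.yml from explicitly requested packages.

    Hypothetical helper for illustration; versions appear only where the
    caller supplied them, and pip packages go into their own sub-list.
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {spec}" for spec in sorted(conda_specs)]
    if pip_specs:
        lines.append("  - pip:")
        lines += [f"    - {spec}" for spec in sorted(pip_specs)]
    return "\n".join(lines) + "\n"


# Only numpy carries a version, because only numpy was installed with one.
print(render_environment("demo", ["python", "numpy=1.26", "pip"], ["requests"]))
```

A file like this stays short enough to read and edit by hand, and it leaves the solver free to pick compatible transitive dependencies on each platform.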
Suggested workflow
So what would be the complete workflow for having both reproducible and upgradable environment specifications? I would suggest the following:
- Start with an empty environment by using conda create.
- Install packages as you normally would with conda install or pip install. You can even use specific versions, like conda install numpy=1.26.
- When you are satisfied with your environment, use conda export to create a portable and upgradable environment.yml file.
- Use conda lock to generate conda-lock.yml with all dependencies fully locked.
- Use conda-lock.yml to deploy your code to production, build containers and spin up new environments on development machines.
- When it's time to upgrade packages, just regenerate conda-lock.yml from environment.yml, updating versions only for those packages that need fixed versions.
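The steps above can be sketched as a sequence of commands. The Python snippet below only builds and prints the command lines; it does not run them. The environment name, package list and target platforms are assumptions for the example, and the locking steps are written with the standalone conda-lock executable (its lock and install subcommands), which you may prefer to spell differently in your setup.

```python
import shlex

SPEC = "environment.yml"
LOCKFILE = "conda-lock.yml"  # the file name conda-lock writes by default


def workflow_commands(env_name, packages, platforms):
    """Return the workflow as a list of argv lists (dry run, nothing is executed)."""
    lock_cmd = ["conda-lock", "lock", "-f", SPEC]
    for platform in platforms:
        lock_cmd += ["-p", platform]
    return [
        ["conda", "create", "-n", env_name],              # empty environment
        ["conda", "install", "-n", env_name, *packages],  # add what you need
        ["conda", "export", "-n", env_name, "-f", SPEC],  # portable spec
        lock_cmd,                                         # fully locked spec
        # On a teammate's machine or in a container build:
        ["conda-lock", "install", "-n", env_name, LOCKFILE],
    ]


for cmd in workflow_commands("demo", ["python", "numpy=1.26"], ["linux-64", "osx-arm64"]):
    print(shlex.join(cmd))
```

To upgrade later, you would edit environment.yml if needed and rerun only the locking command to regenerate conda-lock.yml.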