An Improved Python Pip Workflow

October 10, 2018 5 minutes

python

Pip is quite a useful tool. Bundled with Python since 3.4, it’s the default way of adding and removing third-party dependencies in projects. Simply activate your virtualenv and run pip install. Some time later, run pip freeze > requirements.txt to create a requirements file that pins the versions of the packages you installed, so that when somebody else installs the dependencies, they get what you used and not some potentially incompatible version. They can do so by running pip install -r requirements.txt.

The problem

The process previously described, while simple, assumes that we have a single set of requirements. This may be acceptable if you don’t have a problem with installing your testing dependencies in your production environment, if you don’t have dependencies specific to an environment, or if you’re just writing a simple program or script.

If you DO care, however, then this becomes a real PITA really fast.

The solution: pip mastery

Pip is an easy-to-use tool. So easy that you don’t even need to read the manual.

In the pip install documentation page, in the Requirements File Format section, there’s information about two little things we can add to our requirements.txt files:

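-r: to import all the requirements from a different requirements file, similar to inheritance.
-c: to refer to another file that contains constraints, such as package version constraints. A constraints file only controls which version of a package gets installed; it doesn’t cause the package to be installed by itself.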
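To make the "inheritance" idea concrete, here is a minimal sketch of how -r and -c lines could be resolved. This is not pip's actual parser: the resolve function is a hypothetical helper that only understands -r FILE, -c FILE, blank lines and comment lines, and it assumes nested paths are relative to the file that includes them.

```python
from pathlib import Path


def resolve(path, seen=None):
    """Return (requirements, constraint_files) for a requirements file.

    Hypothetical, simplified helper: only '-r FILE', '-c FILE', blank
    lines and '#' comment lines are understood.
    """
    path = Path(path).resolve()
    seen = set() if seen is None else seen
    if path in seen:  # guard against circular -r includes
        return [], []
    seen.add(path)

    requirements, constraints = [], []
    for raw_line in path.read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("-r "):
            # "Inherit" everything the referenced file requires.
            nested_reqs, nested_cons = resolve(path.parent / line[3:].strip(), seen)
            requirements.extend(nested_reqs)
            for name in nested_cons:
                if name not in constraints:
                    constraints.append(name)
        elif line.startswith("-c "):
            # Constraints are only recorded; they never add a package.
            name = line[3:].strip()
            if name not in constraints:
                constraints.append(name)
        else:
            requirements.append(line)
    return requirements, constraints
```

Calling resolve on a file that pulls in another one with -r returns the packages of both files, while any file named with -c only shows up in the second list.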

Integrating these two things into our workflow, it’ll look like this:

Installing production dependencies

Create a lock.txt file if it doesn’t exist already.

Create a file production.txt and list the packages that you want to install, one package per line.

Refer to the lock.txt file as the constraint file.

-c lock.txt

Flask
gunicorn
SQLAlchemy

Run pip install -r production.txt. Wait until packages are installed.

Run pip freeze > lock.txt.

That’s it. If you need to install further production dependencies, then simply add them to the production.txt file and repeat the process.
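If you get tired of typing the steps by hand, they can be scripted. The following is only a sketch: install_commands and install_and_lock are made-up names, and it assumes the script is run from the project directory, with the virtualenv's Python, so that lock.txt sits next to the requirements file.

```python
import subprocess
import sys
from pathlib import Path


def install_commands(requirements_file, python=sys.executable):
    """Return the argv lists for the install step and the freeze step."""
    pip = [python, "-m", "pip"]
    install = pip + ["install", "-r", str(requirements_file)]
    freeze = pip + ["freeze"]
    return install, freeze


def install_and_lock(requirements_file, lock_file="lock.txt"):
    """Install from a requirements file, then rewrite the lock file."""
    lock = Path(lock_file)
    lock.touch(exist_ok=True)  # create lock.txt if it doesn't exist already
    install, freeze = install_commands(requirements_file)
    subprocess.run(install, check=True)  # raises if the install fails
    frozen = subprocess.run(freeze, check=True, capture_output=True, text=True)
    lock.write_text(frozen.stdout)  # same effect as: pip freeze > lock.txt
```

Calling install_and_lock("production.txt") covers the whole process. Because check=True aborts on a failed install, lock.txt is only rewritten after the packages were actually installed.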

Installing development dependencies

Create the lock.txt file if for some reason it doesn’t exist already.

Create a file development.txt and list the development packages you want to install, one package per line.

Refer to the production.txt file as a requirements file, and to lock.txt as the constraints file.

-c lock.txt
-r production.txt

flake8
pytest

Run pip install -r development.txt. Wait until packages are installed.

Run pip freeze > lock.txt.

As you can see, the process is almost identical to installing production dependencies. To install further development dependencies, add them to development.txt and repeat the process.

Uninstalling a package

Well, ‘ere comes the ugly!

Remove the packages from production.txt, development.txt, or whatever requirements file you’re using.
Run pip freeze | xargs pip uninstall -y to uninstall all installed packages.
Run pip install -r development.txt (or production.txt).
Run pip freeze > lock.txt.

By doing things this way, you ensure that no dangling dependencies are left installed, and that further clean installs will work correctly after the package removal.
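To check that the dangling dependencies are really gone, you can compare the lock file from before the removal with the new one. A rough sketch, assuming both files contain plain name==version lines as written by pip freeze; pinned and removed_packages are hypothetical helpers, and any other kind of line (editable installs, URLs) is simply ignored.

```python
import re


def pinned(lock_text):
    """Map normalized package name -> version for 'name==version' lines."""
    pins = {}
    for line in lock_text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep and not name.startswith(("#", "-")):
            # Normalize names so that e.g. Flask_Login and flask-login match.
            pins[re.sub(r"[-_.]+", "-", name).lower()] = version
    return pins


def removed_packages(old_lock_text, new_lock_text):
    """Names pinned in the old lock file but missing from the new one."""
    return sorted(set(pinned(old_lock_text)) - set(pinned(new_lock_text)))
```

The result should list the package you removed plus any dependency that was only there because of it.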

The other solution: Pipenv

There’s another solution that has been gaining mindshare for quite a while, and is even featured by the PyPA itself. That is, of course, Pipenv.

So why don’t I recommend using Pipenv?

Pipenv is VERY slow: I don’t know why this is the case, but every time you install a new Python package, the default behavior is to update the Pipfile.lock file. And this sometimes takes tens of seconds. Unacceptable.
Pipenv updates already installed packages when installing a new package: Maybe there’s yet another command or flag to prevent this, but the fact is that I see no reason why that should be the default behavior.
Pipenv requires a separate package installation: This is more of a nitpick. But pip comes preinstalled, so it works on any computer with Python, and you don’t have to add an additional pip install pipenv layer in Dockerfiles.
Pipenv install without arguments does something unexpected: When you run npm install, it installs from the lockfile if available. If you run yarn install it uses the lockfile. If you run pipenv install it ignores the lockfile and installs from Pipfile silently. You apparently have to run pipenv sync to install from the lockfile… but honestly, I don’t know. It’s confusing.
Pipenv assumes that you always want a virtualenv: Which is not what you really want inside a Docker container. You have to add the --system flag for this, which is mentioned only briefly in the advanced documentation.
Pipenv assumes that you want your virtualenv in the home directory: If you want it inside the project directory, then you have to set some obscure environment variable that’s even more hidden than the --system flag. Earlier I made a blog post and made sure to include that flag just so I could check it later if I forgot about it.

In summary, Pipenv brings a ton of opinions into a project just to solve two issues: creating virtualenvs, and installing Python packages while also updating a lockfile. The former is solved by running python -m venv .venv, and the latter by following the instructions in this post.

Now, to be fair, it also does hash integrity checks, which our proposed alternative doesn’t do. Pip has supported hash checking since version 8.0, so maybe we’re missing just a few more tweaks for feature parity…
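As a hint of what those tweaks could look like: pip's hash-checking mode expects requirement lines of the form name==version --hash=sha256:<digest>. Here is a minimal sketch that builds such a line for an already downloaded archive. The hash_pinned_line function is a hypothetical helper, and it assumes you fetched the archive yourself, for example with pip download.

```python
import hashlib


def hash_pinned_line(name, version, archive_path):
    """Build a 'name==version --hash=sha256:...' requirement line."""
    digest = hashlib.sha256()
    with open(archive_path, "rb") as archive:
        # Read in chunks so large wheels don't need to fit in memory.
        for chunk in iter(lambda: archive.read(65536), b""):
            digest.update(chunk)
    return f"{name}=={version} --hash=sha256:{digest.hexdigest()}"
```

Lines like these would go into a separate requirements file installed with pip install --require-hashes -r. Keep in mind that hash-checking mode is all or nothing: every requirement, including transitive dependencies, needs a pinned version and a hash. This is not something the workflow in this post sets up for you.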

Closing notes

This literal paragraph and section are here so as not to finish the post with a rant. Please ignore, but not completely.

blog.sinenie.cl/