How to Code with Me - Flake8 Hell
As scientists, we place huge importance on the communication of our results. We spend lots of time on editing, revising, and formatting so people can understand what we did. We also write a lot of code, so why aren’t we investing the same amount of love? Enter, flake8.
It’s incredibly important that we write following community standards so when other people read our work, they don’t have to think about how it’s organized. For scientific prose, this usually means the IMRD (introduction-methods-results-discussion) format. In Python, my current favorite programming language for science, this means using a standardized number of spaces for indents (4), using triple-double quotes for docstrings at the beginning of each module, class, and function, and lots more.
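As a small illustration of those two conventions, here is a made-up module with four-space indents and triple-double-quoted docstrings on the module, the class, and the function. The names Sample and mean_concentration are hypothetical and only exist for this example.

```python
"""Utilities for summarizing measurements (a made-up example module)."""


class Sample:
    """A single measured sample with a label and a concentration."""

    def __init__(self, label, concentration):
        """Store the label and the measured concentration."""
        self.label = label
        self.concentration = concentration


def mean_concentration(samples):
    """Return the mean concentration over a non-empty list of samples."""
    total = sum(sample.concentration for sample in samples)
    return total / len(samples)
```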
It’s pretty intimidating to figure out style. For English prose, Strunk and White wrote The Elements of Style. For Python, Guido van Rossum and his co-authors wrote PEP-8 and Raymond Hettinger presented Beyond PEP-8. Even with these resources, it’s still hard to learn which are rules and which are more like guidelines.
This post is a short explanation of how I use flake8 to keep a consistent
style in the code in my Python projects. There’s a similar command line tool for
fixing the style in R projects that’s already built into most operating
systems - rm -rf * - but I won’t get more into that here.
It’s pretty easy to get up and running with flake8 - just run
pip install flake8, then use it from the shell on a Python file like
flake8 my_file.py or on a directory like flake8 my_directory/. It then outputs
a list of problems that need to be fixed on a line-by-line basis in your code.
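To give a feel for that output, here is a minimal sketch that parses report lines in flake8's default path:row:col: code text layout. The sample report is illustrative text written for this example, not captured from a real run, and parse_report is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches flake8's default "path:row:col: CODE text" layout
LINE_RE = re.compile(
    r"^(?P<path>.+?):(?P<row>\d+):(?P<col>\d+): (?P<code>\S+) (?P<text>.*)$"
)


def parse_report(report):
    """Return one dict per problem line found in a flake8 report."""
    problems = []
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # Skip anything that is not a problem line
            continue
        problem = match.groupdict()
        problem["row"] = int(problem["row"])
        problem["col"] = int(problem["col"])
        problems.append(problem)
    return problems


# Illustrative sample text (an assumption, not real output from my projects)
SAMPLE_REPORT = """\
my_file.py:1:1: F401 'os' imported but unused
my_file.py:3:1: E302 expected 2 blank lines, found 1
my_directory/other.py:10:80: E501 line too long (85 > 79 characters)
"""

problems = parse_report(SAMPLE_REPORT)
problems_per_file = Counter(problem["path"] for problem in problems)
```

Each entry carries the file, the line and column, the error code, and a short message, which is everything you need to jump to the offending line.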
You can also install plugins with pip that extend the kinds of things it
checks. A few that I install are:
- flake8-builtins - make sure you don’t accidentally name a variable the same thing as a builtin. This happens a lot with id.
- flake8-bugbear - “find likely bugs and design problems in your program”, like when you have an unused variable in a loop
- flake8-colors - add color to the flake8 output (an explanation of how to set it up is below)
- flake8-commas - add trailing commas where appropriate
- flake8-comprehensions - reminders to use list comprehensions where appropriate
- flake8-docstrings - make sure your docstrings are present and written in the right format
- flake8-import-order - make sure your imports are organized properly
- flake8-print - make sure you never ever ever use print(). The literal only exception is when using print to get text into a file with print(..., file=...)
- flake8-use-fstring - make sure you’re using f-strings instead of % or .format() formatting. The exception is logging.
- pep8-naming - make sure names of variables, classes, and modules look right.
- pydocstyle - docstring style checker
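Here is a rough sketch of code written with a few of those plugins in mind. The function names are hypothetical. Note that the print(..., file=...) exception is a policy choice; the plugin itself may well still flag the call, in which case a targeted noqa comment would be needed.

```python
import io
import logging

logger = logging.getLogger(__name__)


def describe(identifier, name):
    """Return a label for a record.

    The first argument is called ``identifier`` rather than ``id`` so the
    builtin ``id()`` is not shadowed (the flake8-builtins concern).
    """
    # An f-string instead of "%s (%s)" % ... or "{} ({})".format(...)
    return f"{name} ({identifier})"


def write_report(labels, file):
    """Write one label per line to an open file-like object."""
    for label in labels:
        # print() with file=... is the one use of print allowed above
        print(label, file=file)
    # Logging keeps its lazy %-style arguments, the f-string exception
    logger.info("wrote %d labels", len(labels))


buffer = io.StringIO()
write_report([describe(1, "alpha"), describe(2, "beta")], file=buffer)
```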
In each of my repositories, I put all of the information on how to install
flake8 and its plugins, and then how to run them, in a tox configuration under
the [testenv:flake8] header so they can be run easily and reproducibly with
tox -e flake8. An example of part of one of my tox.ini files (which always
lives in the root of the repository) is below:
[testenv:flake8]
skip_install = true
deps =
    flake8
    flake8-bandit
    flake8-builtins
    flake8-bugbear
    flake8-colors
    flake8-commas
    flake8-comprehensions
    flake8-docstrings
    flake8-import-order
    flake8-print
    flake8-use-fstring
    pep8-naming
    pydocstyle
commands =
    flake8 src/pybel/ tests/ setup.py
description = Run the flake8 tool with several plugins (bandit, docstrings, import order, pep8 naming).
Another configuration file you can set up in the root of the repository is
.flake8. Unfortunately, the Python configuration file reader doesn’t allow
some of the crazy characters that I want to use for the colors, so this can’t be
included in setup.cfg or tox.ini like most of your other configuration.
[flake8]
ignore =
    # line break before binary operator
    W503
exclude =
    .tox,
    .git,
    __pycache__,
    docs/source/conf.py,
    build,
    dist,
    tests/fixtures/*,
    *.pyc,
    *.egg-info,
    .cache,
    .eggs
max-line-length = 120
import-order-style = pycharm
application-import-names =
    pybel
    bel_resources
    tests
format = ${cyan}%(path)s${reset}:${yellow_bold}%(row)d${reset}:${green_bold}%(col)d${reset}: ${red_bold}%(code)s${reset} %(text)s
The first thing you’ll notice is the ignore list. This isn’t here to turn flake8
off because you’re feeling lazy. If somebody includes a change to this list in
their PR, you have to explain to them that compliance is not optional, then help
them work through the problem that they obviously gave up on solving. It’s
actually there for you, as the project maintainer, to enumerate the flake8
rules that you don’t agree with. For example, I totally disagree with W503, the
line break before binary operator rule. I want to write long conditionals with
each and at the start of its own line, like this:
if (
    condition_1
    and condition_2
    and condition_3
):
    print('all true')
One of the benefits of this style is that you can add more conditions with only
single-line diffs. The other is that the reader always sees the operation that
goes with each line. The same could be done with arithmetic that mixes not
only + but also -.
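A short sketch of the arithmetic case, with made-up variable names and numbers. Each operator leads its own line, so it is obvious whether a term is added or subtracted.

```python
# Hypothetical values, only here to make the example runnable
gross_wages = 50_000
taxable_interest = 1_200
ira_deduction = 3_000
student_loan_interest = 800

# Adding or removing a term touches exactly one line of the diff
income = (
    gross_wages
    + taxable_interest
    - ira_deduction
    - student_loan_interest
)
```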
Next is the exclude block. Just copy/paste this each time, since it has lots
of garbage you don’t want flake8 to bother with. One of the checkers in
flake8 is for function “cyclomatic” complexity. You can make the maximum
number higher with max-complexity. Normally, you want this to be enforced, but
sometimes there’s no way around a complex function. For this, you can add a code
comment noqa followed by the error code, like # noqa:W123. Again, adding tags
to ignore bad style just to pass flake8 defeats the point.
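As a sketch of what that looks like in practice: the complexity checker reports code C901, so the comment goes on the line with the def. The function below is hypothetical and kept short for readability; a real candidate for this tag would have far more branches.

```python
def normalize_dose(value, unit):  # noqa: C901
    """Convert a dose to milligrams, branching on the unit given."""
    if unit == "mg":
        return value
    if unit == "g":
        return value * 1000
    if unit in {"ug", "mcg"}:
        return value / 1000
    if unit == "kg":
        return value * 1_000_000
    raise ValueError(f"unknown unit: {unit}")
```

Naming the specific code keeps every other check active on that line, which is much safer than a bare noqa.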
The max-line-length is a very contentious setting. I think 120 is fine. Some
people think 78, 79, or 80 is best because of the standard sizes of old computer
screens or punch cards… When I get older and I can’t read my computer screen,
I’ll probably make the text bigger and change my mind about this. If you find
yourself breaking up lines in a totally nonsensical, unstyled way, then you’re
conforming too tightly to the rules. Sorry about the mixed messages!
import-order-style = pycharm
application-import-names =
    pybel
    bel_resources
    tests
I copied this again because this part is really important. You have to tell
flake8 what rules you use for import order. I use the pycharm rules, which
group Python standard library packages, then 3rd party packages, then my
packages. The application-import-names setting is where you list your own
packages.
Last is the format entry, which gives the nice colorful output. Copy/paste
this! I borrowed mine from Scott Colby.
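One plausible illustration of why a format string like this is awkward in other configuration files, assuming the trouble comes from %-style interpolation (an assumption, not something confirmed here): Python's standard configparser, in its default mode, treats %(name)s as a reference to another option, while a raw parser hands the text back untouched.

```python
import configparser

# A trimmed version of the format entry from the .flake8 file
CONFIG_TEXT = """\
[flake8]
format = ${cyan}%(path)s${reset}: ${red_bold}%(code)s${reset} %(text)s
"""

# A raw parser returns the value exactly as written
raw_parser = configparser.RawConfigParser()
raw_parser.read_string(CONFIG_TEXT)
raw_value = raw_parser.get("flake8", "format")

# The default parser tries to resolve %(path)s as another option and fails
interpolating_parser = configparser.ConfigParser()
interpolating_parser.read_string(CONFIG_TEXT)
try:
    interpolating_parser.get("flake8", "format")
    interpolation_failed = False
except configparser.InterpolationError:
    interpolation_failed = True
```

Keeping the colors in a dedicated .flake8 file sidesteps the question of how each tool reads a shared file.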
After all of that, I set up Travis CI to run tox every time code is pushed to
the repository. If you’re working in a team, you probably do something like the
fork/pull request or branch/pull request workflow on GitHub to support doing
code review before merging new code. The best part is that there’s a big box on
each pull request that shows whether flake8 passed (among other tests). A pass
means that no errors were detected.
I encourage my teammates to make pull requests as soon as they start working on
code. GitHub even has a “draft pull request” mode now. However, before asking
anyone to review your code, it has to pass flake8. And obviously, no code that
isn’t passing flake8 can be merged.
This is a very painful process to get people used to. I’ve done it with many
groups of people and always got pushback. However, everyone who has gone through
the process with me has come out the other side happy that they did it. It’s
important that when you start enforcing coding rules on other people, you
are a resource for them - when somebody is frustrated by a flake8 error code
they have never seen, they will likely forget how to use Google. They will
probably ask you for help. You have to resist the urge to send
lmgtfy links to them and be patient. Because eventually,
they will do it on their own, and spread the gospel of flake8.
While a good arsenal of flake8 plugins provides a solid foundation, it’s not
all that needs to be done to make your code readable and look good. Just like
with reading and speaking, the best way to develop a sense of style is by
reading lots of code (with the caveat that reading poorly written code
probably won’t teach you much). Within the rules imposed by flake8, there is
lots of space for style. If you watch lectures from David Beazley, you’ll notice
a very different style from Raymond Hettinger, and also from me.
Now that you’ve made it to the end of this short guide, I wish you the best of luck on developing your own style!
Are you working with people who are particularly unresponsive to Travis CI emails or to the big red box on pull requests? You could try getting them set up with pre-commit hooks, which run the style checks locally any time they try to commit (even if it’s to a branch) and show them the message in the console.
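A minimal sketch of the idea, written as a hand-rolled Git hook rather than with any particular hook framework. It assumes the file is saved as .git/hooks/pre-commit, made executable, and given a Python shebang line; the function names are hypothetical. Git aborts the commit when the hook exits with a non-zero code.

```python
import subprocess


def staged_python_files(names):
    """Keep only the staged paths that look like Python source files."""
    return [name for name in names if name.endswith(".py")]


def main():
    """Run flake8 on the staged Python files and return its exit code."""
    # List files that were added, copied, or modified in the staging area
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True,
        text=True,
        check=True,
    )
    files = staged_python_files(result.stdout.splitlines())
    if not files:
        return 0
    # flake8 exits non-zero when it finds problems, which blocks the commit
    return subprocess.run(["flake8", *files]).returncode


# In the hook file itself, finish with: raise SystemExit(main())
```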
Is style not your thing at all, or are you not ready to let go of your identity as a Java/Perl developer? Maybe consider Black, which actually re-writes your code in a deterministic style. I don’t live by it, but it’s a great tool to run on a code base that’s never been loved, before going back and stylizing it by hand.