Useful command-line options
Now we will take a look at command-line options that will make you more productive in your daily work. As stated at the beginning of the chapter, this is not a complete list of all of the command-line features; just the ones that you will use (and love) the most.
Keyword expressions: -k
Often, you don't exactly remember the full path or name of a test that you want to execute. At other times, many tests in your suite follow a similar pattern and you want to execute all of them because you just refactored a sensitive area of the code.
By using the -k <EXPRESSION> flag (from keyword expression), you can run tests whose item ID loosely matches the given expression:
λ pytest -k "test_parse"
This will execute all tests that contain the string test_parse in their item IDs. You can also write simple Python expressions using Boolean operators:
λ pytest -k "parse and not num"
This will execute all tests that contain parse but not num in their item IDs.
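To make the matching concrete, here is a hypothetical test module (the names are invented for illustration) together with the selection each expression above would produce:

# test_selection_demo.py -- hypothetical module to illustrate -k matching

def test_parse_expr():
    # selected by both "test_parse" and "parse and not num"
    assert True

def test_parse_num():
    # selected by "test_parse", but deselected by "parse and not num"
    assert True

def test_sum_num():
    # deselected by both expressions: its ID does not contain "parse"
    assert True

Running pytest -k "parse and not num" against this module would collect only test_parse_expr.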
Stop soon: -x, --maxfail
When doing large-scale refactorings, you might not know beforehand how or which tests are going to be affected. In those situations, you might try to guess which modules will be affected and start running tests for those. But, often, you end up breaking more tests than you initially estimated and quickly try to stop the test session by hitting CTRL+C when everything starts to fail unexpectedly.
In those situations, you might try using the --maxfail=N command-line flag, which stops the test session automatically after N failures or errors, or the shortcut -x, which equals --maxfail=1.
λ pytest tests/core -x
This allows you to quickly see the first failing test and deal with the failure. After fixing the reason for the failure, you can continue running with -x to deal with the next problem.
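As a quick illustration, consider a hypothetical module with two broken tests; running it with -x reports only the first failure and stops the session there, while --maxfail=2 would report both before stopping:

# test_maxfail_demo.py -- hypothetical module with two broken tests

def test_first_bug():
    assert 1 + 1 == 3  # fails; with -x the session stops right here

def test_second_bug():
    assert [] == [None]  # never reached with -x, reported with --maxfail=2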
If you find this brilliant, you don't want to skip the next section!
Last failed, failed first: --lf, --ff
Pytest always remembers tests that failed in previous sessions, and can reuse that information to skip right to the tests that have failed previously. This is excellent news if you are incrementally fixing a test suite after a large refactoring, as mentioned in the previous section.
You can run the tests that failed before by passing the --lf flag (meaning last failed):
λ pytest --lf tests/core
...
collected 6 items / 4 deselected
run-last-failure: rerun previous 2 failures
When used together with -x (--maxfail=1), these two flags are refactoring heaven:
λ pytest -x --lf
This lets you start executing the full suite and then pytest stops at the first test that fails. You fix the code and execute the same command line again. Pytest starts right at the failed test and goes on if it passes (or stops again if you haven't managed to fix the code yet). It will then stop at the next failure. Rinse and repeat until all tests pass again.
Keep in mind that it doesn't matter if you execute another subset of tests in the middle of your refactoring; pytest always remembers which tests failed, regardless of the command line executed.
If you have ever done a large refactoring and had to keep track of which tests were failing so that you didn't waste your time running the test suite over and over again, you will definitely appreciate this boost in your productivity.
Finally, the --ff flag is similar to --lf, but it will reorder your tests so the previous failures are run first, followed by the tests that passed or that were not run yet:
λ pytest -x --ff
======================== test session starts ========================
...
collected 6 items
run-last-failure: rerun previous 2 failures first
Output capturing: -s and --capture
Sometimes, developers leave print statements lying around by mistake, or even on purpose, to be used later for debugging. Some applications may also write to stdout or stderr as part of their normal operation or logging.
All that output would make the test suite's display much harder to understand. For this reason, by default, pytest automatically captures all output written to stdout and stderr.
Consider this function, which computes a hash of the text given to it and has some debugging code left in it:
import hashlib

def commit_hash(contents):
    size = len(contents)
    print('content size', size)
    hash_contents = str(size) + '\0' + contents
    result = hashlib.sha1(hash_contents.encode('UTF-8')).hexdigest()
    print(result)
    return result[:8]
We have a very simple test for it:
def test_commit_hash():
    contents = 'some text contents for commit'
    assert commit_hash(contents) == '0cf85793'
When executing this test, by default, you won't see the output of the print calls:
λ pytest tests\test_digest.py
======================== test session starts ========================
...
tests\test_digest.py .                                        [100%]
===================== 1 passed in 0.03 seconds ======================
That's nice and clean.
But those print statements are there to help you understand and debug the code, which is why pytest will show the captured output if the test fails.
Let's change the text being hashed, but not the expected hash, so the test fails.
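The modified test might look like this (a sketch consistent with the failure output below):

def test_commit_hash():
    contents = 'a new text emerges!'
    # the expected hash was not updated for the new contents
    assert commit_hash(contents) == '0cf85793'

With the test failing, pytest shows the captured output in a separate section after the error traceback: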
λ pytest tests\test_digest.py
======================== test session starts ========================
...
tests\test_digest.py F                                        [100%]
============================= FAILURES ==============================
_________________________ test_commit_hash __________________________

    def test_commit_hash():
        contents = 'a new text emerges!'
>       assert commit_hash(contents) == '0cf85793'
E       AssertionError: assert '383aa486' == '0cf85793'
E         - 383aa486
E         + 0cf85793

tests\test_digest.py:15: AssertionError
----------------------- Captured stdout call ------------------------
content size 19
383aa48666ab84296a573d1f798fff3b0b176ae8
===================== 1 failed in 0.05 seconds ======================
Showing the captured output on failing tests is very handy when running tests locally, and even more so when running tests on CI.
Disabling capturing with -s
While running your tests locally, you might want to disable output capturing to see what messages are being printed in real time, or to check whether the capturing is interfering with any capturing your code itself might be doing.
In those cases, just pass -s to pytest to completely disable capturing:
λ pytest tests\test_digest.py -s
======================== test session starts ========================
...
tests\test_digest.py content size 29
0cf857938e0b4a1b3fdd41d424ae97d0caeab166
.
===================== 1 passed in 0.02 seconds ======================
Capture methods with --capture
Pytest has two methods to capture output. Which method is used can be chosen with the --capture command-line flag:
--capture=fd: captures output at the file-descriptor level, which means that all output written to file descriptors 1 (stdout) and 2 (stderr) is captured. This will capture output even from C extensions and is the default.
--capture=sys: captures output written directly to sys.stdout and sys.stderr at the Python level, without trying to capture system-level file descriptors.
Usually, you don't need to change this, but in a few corner cases, depending on what your code is doing, changing the capture method might be useful.
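As a contrived example of such a corner case, the test below (hypothetical, not part of this chapter's code) writes one line through Python's sys.stdout and one line straight to file descriptor 1, the way a C extension might. The default --capture=fd captures both lines, while --capture=sys captures only the first:

import os
import sys

def test_capture_methods_demo():
    # Goes through Python's sys.stdout object: captured by both
    # --capture=sys and --capture=fd.
    sys.stdout.write('via sys.stdout\n')
    # Bypasses Python's sys.stdout and writes to file descriptor 1
    # directly: captured only by --capture=fd (the default).
    os.write(1, b'via file descriptor 1\n')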
For completeness, there's also --capture=no, which is the same as -s.
Traceback modes and locals: --tb, --showlocals
Pytest will show a complete traceback of a failing test, as expected from a testing framework. However, by default, it doesn't show the standard traceback that most Python programmers are used to; it shows a different traceback:
============================= FAILURES ==============================
_______________________ test_read_properties ________________________

    def test_read_properties():
        lines = DATA.strip().splitlines()
>       grids = list(iter_grids_from_csv(lines))

tests\test_read_properties.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests\test_read_properties.py:27: in iter_grids_from_csv
    yield parse_grid_data(fields)
tests\test_read_properties.py:21: in parse_grid_data
    active_cells=convert_size(fields[2]),
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

s = 'NULL'

    def convert_size(s):
>       return int(s)
E       ValueError: invalid literal for int() with base 10: 'NULL'

tests\test_read_properties.py:14: ValueError
===================== 1 failed in 0.05 seconds ======================
This traceback shows only a single line of code and file location for all frames in the traceback stack, except for the first and last one, where a portion of the code is shown as well (in bold).
While some might find it strange at first, once you get used to it you realize that it makes spotting the cause of the error much simpler. By looking at the surrounding code of the start and end of the traceback, you can usually understand the error better. I suggest that you try to get used to the default traceback provided by pytest for a few weeks; I'm sure you will love it and never look back.
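For reference, all the tracebacks in this section come from code along the following lines. This is a reconstruction based only on the names visible in the output; in particular, GridData is not defined anywhere in the chapter, so a namedtuple is used as a stand-in:

import csv
from collections import namedtuple

# Stand-in for the real GridData type, which is not shown in this chapter.
GridData = namedtuple('GridData', 'name total_cells active_cells')

DATA = """
Main Grid,48,44
2nd Grid,24,21
3rd Grid,24,null
"""

def convert_size(s):
    return int(s)

def parse_grid_data(fields):
    return GridData(
        name=str(fields[0]),
        total_cells=convert_size(fields[1]),
        active_cells=convert_size(fields[2]),
    )

def iter_grids_from_csv(lines):
    for fields in csv.reader(lines):
        yield parse_grid_data(fields)

def test_read_properties():
    lines = DATA.strip().splitlines()
    # Fails while consuming the generator: 'null' cannot be converted to int.
    grids = list(iter_grids_from_csv(lines))
    assert len(grids) == 3  # hypothetical assertion; never reached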
If you don't like pytest's default traceback, however, there are other traceback modes, which are controlled by the --tb flag. The default is --tb=auto and was shown previously. Let's have a look at an overview of the other modes in the next sections.
--tb=long
This mode will show a portion of the code for all frames of failure tracebacks, making it quite verbose:
============================= FAILURES ==============================
_______________________ test_read_properties ________________________

    def test_read_properties():
        lines = DATA.strip().splitlines()
>       grids = list(iter_grids_from_csv(lines))

tests\test_read_properties.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

lines = ['Main Grid,48,44', '2nd Grid,24,21', '3rd Grid,24,null']

    def iter_grids_from_csv(lines):
        for fields in csv.reader(lines):
>           yield parse_grid_data(fields)

tests\test_read_properties.py:27:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

fields = ['3rd Grid', '24', 'null']

    def parse_grid_data(fields):
        return GridData(
            name=str(fields[0]),
            total_cells=convert_size(fields[1]),
>           active_cells=convert_size(fields[2]),
        )

tests\test_read_properties.py:21:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

s = 'null'

    def convert_size(s):
>       return int(s)
E       ValueError: invalid literal for int() with base 10: 'null'

tests\test_read_properties.py:14: ValueError
===================== 1 failed in 0.05 seconds ======================
--tb=short
This mode will show a single line of code from all the frames of the failure traceback, providing short and concise output:
============================= FAILURES ==============================
_______________________ test_read_properties ________________________
tests\test_read_properties.py:32: in test_read_properties
    grids = list(iter_grids_from_csv(lines))
tests\test_read_properties.py:27: in iter_grids_from_csv
    yield parse_grid_data(fields)
tests\test_read_properties.py:21: in parse_grid_data
    active_cells=convert_size(fields[2]),
tests\test_read_properties.py:14: in convert_size
    return int(s)
E   ValueError: invalid literal for int() with base 10: 'null'
===================== 1 failed in 0.04 seconds ======================
--tb=native
This mode will output the exact same traceback normally used by Python to report exceptions and is loved by purists:
_______________________ test_read_properties ________________________
Traceback (most recent call last):
  File "X:\CH2\tests\test_read_properties.py", line 32, in test_read_properties
    grids = list(iter_grids_from_csv(lines))
  File "X:\CH2\tests\test_read_properties.py", line 27, in iter_grids_from_csv
    yield parse_grid_data(fields)
  File "X:\CH2\tests\test_read_properties.py", line 21, in parse_grid_data
    active_cells=convert_size(fields[2]),
  File "X:\CH2\tests\test_read_properties.py", line 14, in convert_size
    return int(s)
ValueError: invalid literal for int() with base 10: 'null'
===================== 1 failed in 0.03 seconds ======================
--tb=line
This mode will output a single line per failing test, showing only the exception message and the file location of the error:
============================= FAILURES ==============================
X:\CH2\tests\test_read_properties.py:14: ValueError: invalid literal for int() with base 10: 'null'
This mode might be useful if you are doing a massive refactoring and expect a ton of failures anyway, planning to enter refactoring-heaven mode with the --lf -x flags afterwards.
--tb=no
This does not show any traceback or failure message at all, which also makes it useful for running the suite first to get a glimpse of how many failures there are, so that you can then use the --lf -x flags to fix tests step by step:
tests\test_read_properties.py F                               [100%]
===================== 1 failed in 0.04 seconds ======================
--showlocals (-l)
Finally, while this is not a traceback mode flag specifically, --showlocals (or -l for short) augments the traceback modes by showing a list of the local variables and their values when using the --tb=auto, --tb=long, and --tb=short modes.
For example, here's the output of --tb=auto and --showlocals:
_______________________ test_read_properties ________________________

    def test_read_properties():
        lines = DATA.strip().splitlines()
>       grids = list(iter_grids_from_csv(lines))

lines = ['Main Grid,48,44', '2nd Grid,24,21', '3rd Grid,24,null']

tests\test_read_properties.py:32:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests\test_read_properties.py:27: in iter_grids_from_csv
    yield parse_grid_data(fields)
tests\test_read_properties.py:21: in parse_grid_data
    active_cells=convert_size(fields[2]),
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

s = 'null'

    def convert_size(s):
>       return int(s)
E       ValueError: invalid literal for int() with base 10: 'null'

s = 'null'

tests\test_read_properties.py:14: ValueError
===================== 1 failed in 0.05 seconds ======================
Notice how this makes it much easier to see where the bad data is coming from: the '3rd Grid,24,null' string that is being read from a file at the start of the test.
--showlocals is extremely useful both when running your tests locally and in CI, being a firm favorite. Be careful, though, as this might be a security risk: local variables might expose passwords and other sensitive information, so make sure to transfer tracebacks over secure connections and be careful not to make them public.
Slow tests with --durations
At the start of a project, your test suite is usually blazingly fast, running in a few seconds, and life is good. But as projects grow in size, so do their test suites, both in the number of tests and the time it takes for them to run.
Having a slow test suite affects productivity, especially if you follow TDD and run tests all the time. For this reason, it is healthy to periodically take a look at your longest running tests and perhaps analyze whether they can be made faster: perhaps you are using a large dataset in a place where a much smaller (and faster) dataset would do, or you might be executing redundant steps that are not important for the actual test being done.
When that happens, you will love the --durations=N flag. It provides a summary of the N longest-running tests, or, if you pass zero, a summary of the durations of all tests:
λ pytest --durations=5
...
===================== slowest 5 test durations ======================
3.40s call CH2/tests/test_slow.py::test_corner_case
2.00s call CH2/tests/test_slow.py::test_parse_large_file
0.00s call CH2/tests/core/test_core.py::test_type_checking
0.00s teardown CH2/tests/core/test_parser.py::test_parse_expr
0.00s call CH2/tests/test_digest.py::test_commit_hash
================ 3 failed, 7 passed in 5.51 seconds =================
This output provides invaluable information when you start hunting for tests to speed up.
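If you want to experiment with the flag, a couple of artificially slow tests are enough to reproduce a report like the one above; the sleep calls below are just stand-ins for whatever real work makes your tests slow (the names mirror the example output, but the module itself is hypothetical):

import time

def test_parse_large_file():
    time.sleep(2.0)  # stand-in for an expensive parsing step

def test_corner_case():
    time.sleep(3.4)  # stand-in for an even slower scenario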
Although this flag is not something that you will use daily, it is worth mentioning because it seems that many people don't know about it.
Extra test summary: -ra
Pytest shows rich traceback information on failing tests. The extra information is great, but the footer is not very helpful for identifying which tests have actually failed:
...
________________________ test_type_checking _________________________

    def test_type_checking():
>       assert 0
E       assert 0

tests\core\test_core.py:12: AssertionError
=============== 14 failed, 17 passed in 5.68 seconds ================
The -ra flag can be passed to produce a nice summary with the full name of all failing tests at the end of the session:
...
________________________ test_type_checking _________________________

    def test_type_checking():
>       assert 0
E       assert 0

tests\core\test_core.py:12: AssertionError
====================== short test summary info ======================
FAIL tests\test_assert_demo.py::test_approx_simple_fail
FAIL tests\test_assert_demo.py::test_approx_list_fail
FAIL tests\test_assert_demo.py::test_default_health
FAIL tests\test_assert_demo.py::test_default_player_class
FAIL tests\test_assert_demo.py::test_warrior_short_description
FAIL tests\test_assert_demo.py::test_warrior_long_description
FAIL tests\test_assert_demo.py::test_get_starting_equiment
FAIL tests\test_assert_demo.py::test_long_list
FAIL tests\test_assert_demo.py::test_starting_health
FAIL tests\test_assert_demo.py::test_player_classes
FAIL tests\test_checks.py::test_invalid_class_name
FAIL tests\test_read_properties.py::test_read_properties
FAIL tests\core\test_core.py::test_check_options
FAIL tests\core\test_core.py::test_type_checking
=============== 14 failed, 17 passed in 5.68 seconds ================
This flag is particularly useful when running the suite from the command line directly, because scrolling the terminal to find out which tests failed can be annoying.
The flag is actually -r, which accepts a number of single-character arguments:
f (failed): assert failed
e (error): raised an unexpected exception
s (skipped): skipped (we will get to this in the next chapter)
x (xfailed): expected to fail, did fail (we will get to this in the next chapter)
X (xpassed): expected to fail, but passed (!) (we will get to this in the next chapter)
p (passed): test passed
P (passed with output): displays captured output even for passing tests (careful – this usually produces a lot of output)
a: shows all of the above, except for P; this is the default and is usually the most useful
The flag can receive any combination of the above. So, for example, if you are interested in failures and errors only, you can pass -rfe to pytest.
In general, I recommend sticking with -ra, without thinking too much about it, and you will obtain the most benefits.