Writing and running tests
Using pytest, all you need to do to start writing tests is to create a new file named test_*.py and write test functions that start with test:
# contents of test_player_mechanics.py
def test_player_hit():
    player = create_player()
    assert player.health == 100
    undead = create_undead()
    undead.hit(player)
    assert player.health == 80
To execute this test, simply execute pytest, passing the name of the file:
λ pytest test_player_mechanics.py
If you don't pass anything, pytest will look for all of the test files from the current directory recursively and execute them automatically.
Note
You might encounter examples on the internet that use py.test in the command line instead of pytest. The reason for that is historical: pytest used to be part of the py package, which provided several general-purpose utilities, including tools that followed the convention of starting with py.<TAB> for tab completion, but it has since been moved into its own project. The old py.test command is still available as an alias to pytest, but the latter is the recommended modern usage.
Note that there's no need to create classes; just simple functions and plain assert statements are enough, but if you want to use classes to group tests you can do so:
class TestMechanics:

    def test_player_hit(self):
        ...

    def test_player_health_flask(self):
        ...
Grouping tests can be useful when you want to put a number of tests under the same scope: you can execute tests based on the class they are in, apply markers to all of the tests in a class (Chapter 3, Markers and Parametrization), and create fixtures bound to a class (Chapter 4, Fixtures).
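For example, a marker applied at the class level is inherited by every test method inside it, so you could decorate the class above once instead of marking each test. A minimal sketch (the slow marker name is only an illustration; markers are covered in Chapter 3, Markers and Parametrization):
import pytest


@pytest.mark.slow  # every test in this class is now marked as slow
class TestMechanics:
    ...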
Pytest can run your tests in a number of ways. Let's quickly get into the basics now and, later on in the chapter, we will move on to more advanced options.
You can start by simply executing the pytest command:
λ pytest
This will find all of the test_*.py and *_test.py modules in the current directory and below recursively, and will run all of the tests found in those files:
- You can reduce the search to specific directories:
λ pytest tests/core tests/contrib
- You can also mix any number of files and directories:
λ pytest tests/core tests/contrib/test_text_plugin.py
- You can execute specific tests by using the syntax <test-file>::<test-function-name>:
λ pytest tests/core/test_core.py::test_regex_matching
- You can execute all of the test methods of a test class:
λ pytest tests/contrib/test_text_plugin.py::TestPluginHooks
- You can execute a specific test method of a test class using the syntax <test-file>::<test-class>::<test-method-name>:
λ pytest tests/contrib/test_text_plugin.py::TestPluginHooks::test_registration
The syntax used above is created internally by pytest, is unique to each collected test, and is called a node id or item id. It basically consists of the filename of the testing module, the class, and the function, joined together by the :: characters.
Pytest will show a more verbose output, which includes node IDs, with the -v flag:
λ pytest tests/core -v
======================== test session starts ========================
...
collected 6 items
tests\core\test_core.py::test_regex_matching PASSED [ 16%]
tests\core\test_core.py::test_check_options FAILED [ 33%]
tests\core\test_core.py::test_type_checking FAILED [ 50%]
tests\core\test_parser.py::test_parse_expr PASSED [ 66%]
tests\core\test_parser.py::test_parse_num PASSED [ 83%]
tests\core\test_parser.py::test_parse_add PASSED [100%]
To see which tests there are without running them, use the --collect-only flag:
λ pytest tests/core --collect-only
======================== test session starts ========================
...
collected 6 items
<Module 'tests/core/test_core.py'>
<Function 'test_regex_matching'>
<Function 'test_check_options'>
<Function 'test_type_checking'>
<Module 'tests/core/test_parser.py'>
<Function 'test_parse_expr'>
<Function 'test_parse_num'>
<Function 'test_parse_add'>
=================== no tests ran in 0.01 seconds ====================
--collect-only is especially useful if you want to execute a specific test but can't remember its exact name.
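A handy variation (assuming a reasonably recent pytest version) is to combine --collect-only with -q, which prints just the node IDs, one per line, ready to be copied back into the command line:
λ pytest tests/core --collect-only -q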
As you've probably already noticed, pytest makes use of the built-in assert statement to check assumptions during testing. Contrary to other frameworks, you don't need to remember various self.assert* or self.expect* functions. While this may not seem like a big deal at first, after spending some time using plain asserts, you will realize how much more enjoyable and natural it makes writing tests.
Again, here's an example of a failure:
________________________ test_default_health ________________________
def test_default_health():
health = get_default_health('warrior')
> assert health == 95
E assert 80 == 95
tests\test_assert_demo.py:25: AssertionError
Pytest shows the line of the failure, as well as the variables and expressions involved in the failure. By itself, this would be pretty cool already, but pytest goes a step further and provides specialized explanations of failures involving other data types.
When showing the explanation for short strings, pytest uses a simple difference method:
_____________________ test_default_player_class _____________________
def test_default_player_class():
x = get_default_player_class()
> assert x == 'sorcerer'
E AssertionError: assert 'warrior' == 'sorcerer'
E - warrior
E + sorcerer
Longer strings show a smarter delta, using difflib.ndiff to quickly spot the differences:
__________________ test_warrior_short_description ___________________
def test_warrior_short_description():
desc = get_short_class_description('warrior')
> assert desc == 'A battle-hardened veteran, can equip heavy armor and weapons.'
E AssertionError: assert 'A battle-har... and weapons.' == 'A battle-hard... and weapons.'
E - A battle-hardened veteran, favors heavy armor and weapons.
E ? ^ ^^^^
E + A battle-hardened veteran, can equip heavy armor and weapons.
E ? ^ ^^^^^^^
Multiline strings are also treated specially:
def test_warrior_long_description():
desc = get_long_class_description('warrior')
> assert desc == textwrap.dedent('''\
A seasoned veteran of many battles. Strength and Dexterity
allow to yield heavy armor and weapons, as well as carry
more equipment. Weak in magic.
''')
E AssertionError: assert 'A seasoned v... \n' == 'A seasoned ve... \n'
E - A seasoned veteran of many battles. High Strength and Dexterity
E ? -----
E + A seasoned veteran of many battles. Strength and Dexterity
E allow to yield heavy armor and weapons, as well as carry
E - more equipment while keeping a light roll. Weak in magic.
E ? ---------------------------
E + more equipment. Weak in magic.
Assertion failures for lists also show only differing items by default:
____________________ test_get_starting_equiment _____________________
def test_get_starting_equiment():
expected = ['long sword', 'shield']
> assert get_starting_equipment('warrior') == expected
E AssertionError: assert ['long sword'...et', 'shield'] == ['long sword', 'shield']
E At index 1 diff: 'warrior set' != 'shield'
E Left contains more items, first extra item: 'shield'
E Use -v to get the full diff
tests\test_assert_demo.py:71: AssertionError
Note that pytest shows which index differs, and also that the -v flag can be used to show the complete difference between the lists:
____________________ test_get_starting_equiment _____________________
def test_get_starting_equiment():
expected = ['long sword', 'shield']
> assert get_starting_equipment('warrior') == expected
E AssertionError: assert ['long sword'...et', 'shield'] == ['long sword', 'shield']
E At index 1 diff: 'warrior set' != 'shield'
E Left contains more items, first extra item: 'shield'
E Full diff:
E - ['long sword', 'warrior set', 'shield']
E ? ---------------
E + ['long sword', 'shield']
tests\test_assert_demo.py:71: AssertionError
If the difference is too big, pytest is smart enough to show only a portion to avoid showing too much output, displaying a message like the following:
E ...Full output truncated (100 lines hidden), use '-vv' to show
Dictionaries are probably one of the most used data structures in Python, so, unsurprisingly, pytest has specialized representation for them:
_______________________ test_starting_health ________________________
def test_starting_health():
expected = {'warrior': 85, 'sorcerer': 50}
> assert get_classes_starting_health() == expected
E AssertionError: assert {'knight': 95...'warrior': 85} == {'sorcerer': 50, 'warrior': 85}
E Omitting 1 identical items, use -vv to show
E Differing items:
E {'sorcerer': 55} != {'sorcerer': 50}
E Left contains more items:
E {'knight': 95}
E Use -v to get the full diff
Sets also have similar output:
________________________ test_player_classes ________________________
def test_player_classes():
> assert get_player_classes() == {'warrior', 'sorcerer'}
E AssertionError: assert {'knight', 's...r', 'warrior'} == {'sorcerer', 'warrior'}
E Extra items in the left set:
E 'knight'
E Use -v to get the full diff
As with lists, there are also -v and -vv options for displaying more detailed output.
By default, Python's assert statement does not provide any details when it fails, but as we just saw, pytest shows a lot of information about the variables and expressions involved in a failed assertion. So how does pytest do it?
Pytest is able to provide these rich failure messages because it implements a mechanism called assertion rewriting.
Assertion rewriting works by installing a custom import hook that intercepts the standard Python import mechanism. When pytest detects that a test file (or plugin) is about to be imported, instead of loading the module directly, it first compiles the source code into an abstract syntax tree (AST) using the built-in ast module. Then, it searches for any assert statements and rewrites them so that the variables used in the expression are kept and can be used to show more helpful messages if the assertion fails. Finally, it saves the rewritten .pyc file to disk for caching.
This all might seem very magical, but the process is actually simple, deterministic, and, best of all, completely transparent.
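To get a feel for the first step of that process, here is a small illustrative sketch (not pytest's actual implementation) that uses the built-in ast module to parse an assert statement into the tree that a rewriter could then transform:
import ast

source = "assert player.health == 80"
tree = ast.parse(source)

# the Assert node exposes the comparison's sub-expressions; a rewriter can
# store their values in temporary variables before the check actually runs
assert_node = tree.body[0]
print(ast.dump(assert_node))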
Checking exceptions: pytest.raises
Good API documentation will clearly explain the purpose of each function, its parameters, and its return value. Great API documentation also clearly explains which exceptions are raised and when.
For that reason, testing that exceptions are raised in the appropriate circumstances is just as important as testing the main functionality of APIs. It is also important to make sure that exceptions contain an appropriate and clear message to help users understand the issue.
Suppose we are writing an API for a game. This API allows programmers to write mods, a kind of plugin that can change several aspects of the game, from new textures to completely new story lines and types of characters.
This API has a function that allows mod writers to create a new character, and it can raise exceptions in some situations:
def create_character(name: str, class_name: str) -> Character:
    """
    Creates a new character and inserts it into the database.

    :raise InvalidCharacterNameError:
        if the character name is empty.

    :raise InvalidClassNameError:
        if the class name is invalid.

    :return: the newly created Character.
    """
    ...
Pytest makes it easy to check that your code is raising the proper exceptions with the raises function:
def test_empty_name():
    with pytest.raises(InvalidCharacterNameError):
        create_character(name='', class_name='warrior')


def test_invalid_class_name():
    with pytest.raises(InvalidCharacterNameError):
        create_character(name='Solaire', class_name='mage')
pytest.raises is used as a context manager (a with statement; see https://docs.python.org/3/reference/compound_stmts.html#the-with-statement for more details) and checks that the exception class passed to it is raised inside its block. Let's see how create_character implements those checks:
def create_character(name: str, class_name: str) -> Character:
    """
    Creates a new character and inserts it into the database.
    ...
    """
    if not name:
        raise InvalidCharacterNameError('character name empty')

    if class_name not in VALID_CLASSES:
        msg = f'invalid class name: "{class_name}"'
        raise InvalidClassNameError(msg)
    ...
If you are paying close attention, you probably noticed a copy-paste error in the tests shown earlier: test_invalid_class_name should actually expect an InvalidClassNameError for the class name check.
Executing this file:
======================== test session starts ========================
...
collected 2 items
tests\test_checks.py .F [100%]
============================= FAILURES ==============================
______________________ test_invalid_class_name ______________________
def test_invalid_class_name():
with pytest.raises(InvalidCharacterNameError):
> create_character(name='Solaire', class_name='mage')
tests\test_checks.py:51:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
name = 'Solaire', class_name = 'mage'
def create_character(name: str, class_name: str) -> Character:
"""
Creates a new character and inserts it into the database.
:param name: the character name.
:param class_name: the character class name.
:raise InvalidCharacterNameError:
if the character name is empty.
:raise InvalidClassNameError:
if the class name is invalid.
:return: the newly created Character.
"""
if not name:
raise InvalidCharacterNameError('character name empty')
if class_name not in VALID_CLASSES:
msg = f'invalid class name: "{class_name}"'
> raise InvalidClassNameError(msg)
E test_checks.InvalidClassNameError: invalid class name: "mage"
tests\test_checks.py:40: InvalidClassNameError
================ 1 failed, 1 passed in 0.05 seconds =================
test_empty_name passed as expected. test_invalid_class_name raised InvalidClassNameError, so the exception was not captured by pytest.raises, which failed the test (as any other exception would).
Checking exception messages
As stated at the start of this section, APIs should provide clear messages in the exceptions they raise. In the previous examples, we only verified that the code was raising the appropriate exception type, but not the actual message.
pytest.raises can receive an optional match argument, which is a regular expression string that will be matched against the exception message, in addition to checking the exception type (for more details on regular expressions, see https://docs.python.org/3/howto/regex.html). We can use this to improve our tests even further:
def test_empty_name():
    with pytest.raises(InvalidCharacterNameError,
                       match='character name empty'):
        create_character(name='', class_name='warrior')


def test_invalid_class_name():
    with pytest.raises(InvalidClassNameError,
                       match='invalid class name: "mage"'):
        create_character(name='Solaire', class_name='mage')
Simple!
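If you also want to inspect the raised exception object directly, pytest.raises can be used with an as clause; the resulting ExceptionInfo object exposes the exception instance through its value attribute. A minimal sketch, reusing the functions from earlier:
def test_empty_name_message():
    with pytest.raises(InvalidCharacterNameError) as excinfo:
        create_character(name='', class_name='warrior')
    # excinfo.value is the actual exception instance that was raised
    assert 'character name empty' in str(excinfo.value)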
Checking warnings: pytest.warns
APIs also evolve. New and better alternatives to old functions are provided, arguments are removed, old ways of using a certain functionality evolve into better ways, and so on.
API writers have to strike a balance between keeping old code working to avoid breaking clients and providing better ways of doing things, all the while keeping their own API code maintainable. For this reason, a solution often adopted is to start issuing warnings when API clients use the old behavior, in the hope that they update their code to the new constructs. Warning messages are shown in situations where the current usage is not wrong enough to warrant an exception; it just happens that there are new and better ways of doing it. Often, warning messages are shown during a grace period for this update to take place, and afterward the old way is no longer supported.
Python provides the standard warnings module exactly for this purpose, making it easy to warn developers about forthcoming changes in APIs (for more details, see https://docs.python.org/3/library/warnings.html). It lets you choose from a number of warning classes, for example:
- UserWarning: user warnings (user here means developers, not software users)
- DeprecationWarning: features that will be removed in the future
- ResourceWarning: related to resource usage
This list is not exhaustive; consult the warnings documentation for the full listing.
Warning classes help users control which warnings should be shown and which ones should be suppressed.
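For instance, the filter functions in the standard warnings module let you turn a specific category into an error or silence it entirely; a minimal sketch using only the standard library:
import warnings

# turn DeprecationWarning into an error so old usages fail loudly
warnings.filterwarnings('error', category=DeprecationWarning)

# silence ResourceWarning entirely
warnings.simplefilter('ignore', ResourceWarning)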
For example, suppose an API for a computer game provides this handy function to obtain the starting hit points of player characters given their class name:
def get_initial_hit_points(player_class: str) -> int:
    ...
Time moves forward and the developers decide to use an enum (https://docs.python.org/3/library/enum.html) instead of class names in the next release, as it is better suited to representing a limited set of values:
from enum import Enum


class PlayerClass(Enum):
    WARRIOR = 1
    KNIGHT = 2
    SORCERER = 3
    CLERIC = 4
But changing this suddenly would break all clients, so they wisely decide to support both forms for the next release: str and the PlayerClass enum. They don't want to keep supporting this forever, so they start showing a warning whenever a class is passed as a str:
def get_initial_hit_points(player_class: Union[PlayerClass, str]) -> int:
    if isinstance(player_class, str):
        msg = 'Using player_class as str has been deprecated ' \
              'and will be removed in the future'
        warnings.warn(DeprecationWarning(msg))
        player_class = get_player_enum_from_string(player_class)
    ...
In the same vein as pytest.raises from the previous section, the pytest.warns function lets you test whether your API code is producing the warnings you expect:
def test_get_initial_hit_points_warning():
    with pytest.warns(DeprecationWarning):
        get_initial_hit_points('warrior')
As with pytest.raises, pytest.warns can receive an optional match argument, a regular expression string (see https://docs.python.org/3/howto/regex.html for more details) that will be matched against the warning message:
def test_get_initial_hit_points_warning():
    with pytest.warns(DeprecationWarning,
                      match='.*str has been deprecated.*'):
        get_initial_hit_points('warrior')
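pytest.warns can also record the warnings it captures when used with an as clause, which lets you make further assertions about them. A minimal sketch, reusing get_initial_hit_points from above:
def test_get_initial_hit_points_recorded():
    with pytest.warns(DeprecationWarning) as record:
        get_initial_hit_points('warrior')
    # each recorded entry wraps the original warning instance
    assert 'has been deprecated' in str(record[0].message)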
Comparing floating point numbers: pytest.approx
Comparing floating point numbers can be tricky. For more details, go to: https://docs.python.org/3/tutorial/floatingpoint.html. Numbers that we consider equal in the real world are not so when represented by computer hardware:
>>> 0.1 + 0.2 == 0.3
False
When writing tests, it is very common to compare the results produced by our code against what we expect as floating point values. As shown above, a simple == comparison often won't be sufficient. A common approach is to use a known tolerance instead and use abs to correctly deal with negative numbers:
def test_simple_math():
    assert abs(0.1 + 0.2 - 0.3) < 0.0001
But besides being ugly and hard to understand, it is sometimes difficult to come up with a tolerance that works in most situations. The chosen tolerance of 0.0001 might work for the numbers above, but not for very large numbers or very small ones. Depending on the computation performed, you would need to find a suitable tolerance for every set of input numbers, which is tedious and error-prone.
pytest.approx solves this problem by automatically choosing a tolerance appropriate for the values involved in the expression, providing a very nice syntax to boot:
def test_approx_simple():
    assert 0.1 + 0.2 == approx(0.3)
You can read the above as: assert that 0.1 + 0.2 is approximately equal to 0.3.
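If the automatically chosen tolerance does not suit your data, approx also accepts explicit rel and abs arguments; a minimal sketch (the numbers are only illustrative):
from pytest import approx


def test_approx_custom_tolerance():
    # a relative tolerance of 0.1% instead of the default
    assert 0.30001 == approx(0.3, rel=1e-3)
    # an absolute tolerance is needed when comparing against zero
    assert 1e-12 == approx(0.0, abs=1e-10)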
But the approx function does not stop there; it can be used to compare:
- Sequences of numbers:
def test_approx_list():
    assert [0.1 + 1.2, 0.2 + 0.8] == approx([1.3, 1.0])
- Dictionary values (not keys):
def test_approx_dict():
    values = {'v1': 0.1 + 1.2, 'v2': 0.2 + 0.8}
    assert values == approx(dict(v1=1.3, v2=1.0))
- numpy arrays:
def test_approx_numpy():
    import numpy as np
    values = np.array([0.1, 0.2]) + np.array([1.2, 0.8])
    assert values == approx(np.array([1.3, 1.0]))
When a test fails, approx provides a nice error message displaying the values that failed and the tolerance used:
def test_approx_simple_fail():
> assert 0.1 + 0.2 == approx(0.35)
E assert (0.1 + 0.2) == 0.35 ± 3.5e-07
E + where 0.35 ± 3.5e-07 = approx(0.35)