Most Django projects have tests. Fewer have tests that reliably catch bugs. There's a meaningful difference between a test suite that exists and a test suite that works, and closing that gap doesn't mean writing more tests; it means writing better ones.
This is the setup we use on every project we build and maintain. It's not academic — it's evolved from the specific failures we've seen in real production systems.
Our testing stack
The core packages:
- pytest-django — replaces Django's test runner with pytest. Better output, fixtures as function arguments, parametrize, and access to the full pytest plugin ecosystem.
- Factory Boy — replaces fixtures for creating model instances. Factories know how to create valid instances with sensible defaults.
- model-bakery — as a fallback when we need a quick valid instance and don't want to write a factory. Less expressive than Factory Boy but useful for one-off tests.
- pytest-cov — coverage reporting integrated into the test run.
- freezegun — for tests that depend on the current date or time.
Conspicuously absent: unittest.mock for mocking the database. We'll come back to that.
Factory Boy over fixtures
Django fixtures (JSON or YAML dumps of database state) have three problems: they go stale as the schema evolves, they create implicit dependencies between tests, and they make it hard to understand what state a test actually needs.
Factory Boy solves all three. A factory is Python code — it stays in sync with the model because the tests will break if it doesn't, it creates only the data each test needs, and the factory definition documents the shape of valid data:
    import factory
    from factory.django import DjangoModelFactory
    from faker import Faker

    fake = Faker("en-GB")


    class UserFactory(DjangoModelFactory):
        class Meta:
            model = "auth.User"

        username = factory.LazyAttribute(lambda _: fake.user_name())
        email = factory.LazyAttribute(lambda o: f"{o.username}@example.com")
        first_name = factory.Faker("first_name")
        last_name = factory.Faker("last_name")
        is_active = True


    class ArticleFactory(DjangoModelFactory):
        class Meta:
            model = "articles.Article"

        title = factory.Faker("sentence")
        author = factory.SubFactory(UserFactory)
        published_at = factory.Faker("past_datetime", tzinfo=None)
        is_published = True


    # In tests:
    def test_published_articles_appear_in_feed(db):
        # Three published articles plus one draft that must not appear.
        ArticleFactory.create_batch(3, is_published=True)
        ArticleFactory.create(is_published=False)
        feed = get_article_feed()  # the project function under test
        assert len(feed) == 3
The db fixture (from pytest-django) grants database access. pytest-django blocks database access by default, so a test that doesn't request db can't touch the database, which is both a safeguard and a documentation signal.
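To make the "factory documents the shape of valid data" point concrete, here is a minimal sketch of the idea in plain Python, with no Django or Factory Boy involved. The Article dataclass, make_article, and published are hypothetical names invented for illustration; Factory Boy's real API is the one shown above.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Article:
    title: str
    author: str
    is_published: bool
    published_at: datetime


# A simple counter so every generated article gets a distinct title and author.
_sequence = itertools.count(1)


def make_article(**overrides) -> Article:
    """Build a valid Article; callers override only the fields the test cares about."""
    n = next(_sequence)
    values = {
        "title": f"Article {n}",
        "author": f"user{n}",
        "is_published": True,
        "published_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
    }
    values.update(overrides)
    return Article(**values)


def published(articles: list[Article]) -> list[Article]:
    """Hypothetical stand-in for the feed query: keep published articles only."""
    return [article for article in articles if article.is_published]


# The test states only what matters to it: three published, one draft.
articles = [make_article() for _ in range(3)] + [make_article(is_published=False)]
visible = published(articles)
```

If Article gained a new required field, make_article would fail immediately with a TypeError. That is the same pressure that keeps a real factory in sync with its model.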
What pytest-django actually changes
Beyond nicer output, the biggest practical difference is fixtures as function arguments rather than setUp methods. Each test gets exactly what it declares, with no inherited state from a class hierarchy:
    import pytest


    @pytest.fixture
    def published_article(db):
        return ArticleFactory(is_published=True)


    @pytest.fixture
    def editor_client(client, db):
        user = UserFactory(is_staff=True)
        client.force_login(user)
        return client


    def test_editor_can_unpublish(editor_client, published_article):
        response = editor_client.post(f"/articles/{published_article.pk}/unpublish/")
        assert response.status_code == 302
        published_article.refresh_from_db()
        assert published_article.is_published is False
This is legible. The test name tells you what it tests, the fixtures tell you what state it needs, and the assertions tell you what it expects. There's no inheritance chain to trace.
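For contrast, here is a rough sketch of the setUp-and-inheritance style that fixtures replace, written with the standard library's unittest so it runs without Django. The unpublish function and the dictionaries standing in for models are hypothetical, made up purely to show where the state comes from.

```python
import io
import unittest


def unpublish(article: dict, user: dict) -> None:
    """Hypothetical stand-in for the view logic: only staff may unpublish."""
    if user["is_staff"]:
        article["is_published"] = False


class BaseArticleCase(unittest.TestCase):
    def setUp(self):
        # Built for every test in every subclass, whether the test uses it or not.
        self.staff_user = {"username": "editor", "is_staff": True}
        self.reader = {"username": "reader", "is_staff": False}
        self.article = {"title": "Hello", "is_published": True}


class UnpublishCase(BaseArticleCase):
    def test_editor_can_unpublish(self):
        # Nothing here says where staff_user and article come from;
        # you have to read the base class to find out.
        unpublish(self.article, self.staff_user)
        self.assertFalse(self.article["is_published"])


def run_case(case: type) -> unittest.TestResult:
    """Run one TestCase class quietly and return its result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


result = run_case(UnpublishCase)
```

The test passes, but its dependencies are invisible at the call site, and the unused reader object is still built every time. With fixtures, both problems go away because the function signature is the dependency list.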
Transaction isolation: when it matters
By default, Django's TestCase wraps each test in a transaction that rolls back when the test ends, which is fast. pytest-django's db fixture does the same thing.
But some tests need TransactionTestCase semantics — specifically, tests that verify behaviour depending on database signals, on_commit() hooks, or anything that crosses transaction boundaries:
    def test_order_email_sent_after_commit(transactional_db, mailoutbox):
        # on_commit() only fires when the transaction commits.
        # With the plain db fixture it never fires, so no email would be sent
        # and this test would fail even though the code is correct.
        order = OrderFactory()
        place_order(order)
        assert len(mailoutbox) == 1
        assert mailoutbox[0].subject == "Your order has been placed"
Use transactional_db (pytest-django's fixture for TransactionTestCase) only when you genuinely need it — it's significantly slower because the database is truncated between tests rather than rolled back.
We've seen this failure in production: on_commit() hooks (sending emails, triggering webhooks, updating search indexes) silently never ran because the calling code was already inside a transaction that rolled back on error. Tests using the plain db fixture can't catch this, because under db the hooks never run in the first place. Tests using transactional_db can.
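Why a rolled-back transaction hides these hooks is easier to see with a toy model. The sketch below is not Django's implementation. ToyTransaction, place_order, and the outbox list are hypothetical stand-ins that only mimic the behaviour described above: callbacks are queued, then run on commit and dropped on rollback.

```python
class ToyTransaction:
    """Toy stand-in for a database transaction (not Django's implementation)."""

    def __init__(self):
        self._callbacks = []

    def on_commit(self, callback):
        # Registering a hook only queues it; nothing runs yet.
        self._callbacks.append(callback)

    def commit(self):
        # On commit, run the queued hooks in registration order.
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback()

    def rollback(self):
        # On rollback, the queued hooks are discarded without running.
        self._callbacks.clear()


def place_order(tx: ToyTransaction, order_id: int, outbox: list) -> None:
    """Hypothetical business logic: send the confirmation only after commit."""
    tx.on_commit(lambda: outbox.append(f"Order {order_id} placed"))


def emails_sent(commit: bool) -> int:
    """Place one order, then end the transaction by commit or rollback."""
    outbox = []
    tx = ToyTransaction()
    place_order(tx, 1, outbox)
    if commit:
        tx.commit()
    else:
        tx.rollback()
    return len(outbox)


rolled_back = emails_sent(commit=False)  # what a rollback-wrapped test observes
committed = emails_sent(commit=True)     # what a real commit produces
```

A test wrapped in a transaction that always rolls back takes the first path, so it can never observe the email. That is why the slower fixture is needed for this class of test.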
Testing at the right level
The biggest efficiency gain in Django testing comes from testing at the right level of abstraction. We think about three levels:
Unit tests for pure logic: functions that take inputs and return outputs with no side effects. No database, no network, no Django setup. These run near-instantly, so you can run hundreds of them per second.
Integration tests for model behaviour: methods on models, manager methods, signal handlers. Uses the database, Factory Boy for setup, no HTTP.
View tests for HTTP-level contract: status codes, response content, redirects, authentication enforcement. Uses Django's test client (or pytest-django's client fixture). We don't test implementation details here; we test the contract.
    class TestArticleDetailView:
        def test_published_article_is_visible(self, client, db):
            article = ArticleFactory(is_published=True)
            response = client.get(article.get_absolute_url())
            assert response.status_code == 200

        def test_draft_article_returns_404(self, client, db):
            article = ArticleFactory(is_published=False)
            response = client.get(article.get_absolute_url())
            assert response.status_code == 404

        def test_draft_visible_to_staff(self, editor_client, db):
            article = ArticleFactory(is_published=False)
            response = editor_client.get(article.get_absolute_url())
            assert response.status_code == 200
Notice these tests don't inspect HTML content — they test status codes and the permission contract. HTML assertions are fragile, couple tests to template details, and rarely catch real bugs.
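The view tests above cover the third level. For the first level, unit tests for pure logic, a sketch might look like the following. reading_time_minutes is a hypothetical helper invented for this example; the point is that testing it needs no database, no fixtures, and no Django setup.

```python
import math


def reading_time_minutes(word_count: int, words_per_minute: int = 200) -> int:
    """Estimate reading time, rounding up to whole minutes."""
    if word_count < 0:
        raise ValueError("word_count must not be negative")
    return math.ceil(word_count / words_per_minute)


# Table-driven cases: (word_count, expected_minutes).
CASES = [
    (0, 0),
    (1, 1),
    (200, 1),
    (201, 2),
    (1000, 5),
]


def failing_cases() -> list:
    """Return every case whose actual result differs from the expected one."""
    return [
        (words, expected)
        for words, expected in CASES
        if reading_time_minutes(words) != expected
    ]
```

Under pytest, a table like CASES maps directly onto pytest.mark.parametrize, so each row is reported as its own test.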
Mocking external services
We mock external HTTP calls — Stripe, SendGrid, AWS, third-party APIs — but we don't mock the database. The database is the system we're testing. Mocking it means testing code paths that don't exist in production.
For HTTP mocking, we use responses (for requests-based code) or respx (for httpx-based code):
    import pytest
    import responses as resp


    @resp.activate
    def test_payment_gateway_error_handled(db):
        # Simulate Stripe declining the card; no real HTTP request is made.
        resp.add(
            resp.POST,
            "https://api.stripe.com/v1/charges",
            json={"error": {"message": "Your card was declined."}},
            status=402,
        )
        order = OrderFactory()
        with pytest.raises(PaymentDeclinedError):
            charge_order(order)
This tests the error handling code path that would otherwise require a real declined card — without making any actual Stripe API calls.
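The same pattern can be tried without any third-party mocking library. As a rough stdlib-only sketch, which swaps responses for unittest.mock and requests for urllib, the example below patches urllib.request.urlopen so that the gateway appears to return a 402. The charge function, the payments.example.com URL, and this PaymentDeclinedError class are assumptions made up for illustration. They are not Stripe's API or our production code.

```python
import io
import json
import urllib.error
import urllib.request
from unittest import mock


class PaymentDeclinedError(Exception):
    """Hypothetical domain error raised when the gateway declines a payment."""


def charge(amount_pence: int, url: str) -> dict:
    """Hypothetical gateway call: POST the amount and map HTTP 402 to a domain error."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"amount": amount_pence}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode())
    except urllib.error.HTTPError as exc:
        if exc.code == 402:
            payload = json.loads(exc.read().decode())
            raise PaymentDeclinedError(payload["error"]["message"]) from exc
        raise


def fake_declined_error(url: str) -> urllib.error.HTTPError:
    """Build the error urlopen would raise for a 402 response with a JSON body."""
    body = json.dumps({"error": {"message": "Your card was declined."}}).encode()
    return urllib.error.HTTPError(url, 402, "Payment Required", None, io.BytesIO(body))


def run_declined_card_check() -> str:
    """Exercise the error path with urlopen patched, so no network call is made."""
    url = "https://payments.example.com/charges"
    with mock.patch("urllib.request.urlopen", side_effect=fake_declined_error(url)):
        try:
            charge(1500, url)
        except PaymentDeclinedError as exc:
            return str(exc)
    return "no error raised"
```

Only the network boundary is patched here. Everything on our side of it, including the mapping from status code to exception, runs for real.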
Coverage is a floor, not a ceiling
We run with --cov and we set a minimum threshold (usually 80–85% for application code, lower for migration files and configuration). Failing below the threshold blocks CI.
But we don't chase 100%. Coverage tells you what code was executed during tests — it says nothing about whether the tests actually verified the right behaviour. A test that calls a function and makes no assertions gives you coverage with zero value.
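A small sketch shows how a suite can hit full coverage and still miss a bug. The apply_discount function and both checks are hypothetical, and the bug is deliberate.

```python
def apply_discount(price_pence: int, percent: int) -> int:
    """Meant to reduce the price by a percentage. Deliberate bug: it adds instead."""
    return price_pence + price_pence * percent // 100


def weak_check() -> None:
    # Executes every line of apply_discount, so coverage reports 100%,
    # but nothing is asserted and the bug goes unnoticed.
    apply_discount(1000, 10)


def strong_check() -> None:
    # Same coverage, but this one states the expected behaviour.
    assert apply_discount(1000, 10) == 900


def passes(check) -> bool:
    """Run a check and report whether it completed without an AssertionError."""
    try:
        check()
    except AssertionError:
        return False
    return True
```

Both checks produce identical coverage numbers. Only the second one fails, which is the outcome you want while the bug exists.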
The metric we care about more is confidence: can the team merge changes without manually verifying every feature? A test suite you trust is worth more than a 100% coverage badge on a suite nobody reads.
If you're inheriting a codebase with low test coverage, write tests for the things that break. Start at the integration and view level — those catch the most bugs per hour of effort — and add unit tests as you extract logic into pure functions during refactors.