Django runs an enormous slice of the web — from internal admin tools to high-traffic APIs powered by Django REST Framework. But a Django deployment is more than a single process answering HTTP. It is gunicorn or uWSGI workers, a database, a cache, static files served from somewhere, and usually a fleet of Celery workers chewing through background jobs. Any one of those can fail while the others keep humming, which means "is the homepage up?" is nowhere near enough to know whether your application actually works.

This guide walks through monitoring a Django app the way it actually breaks. We will build a real health endpoint, enumerate the Django-specific failure modes worth alerting on, cover background workers, and finish with a concrete CronAlert setup you can copy. CronAlert is agentless and runs on Cloudflare's edge, so everything here is just HTTP checks, content assertions, and heartbeats — no agent to install inside your app server.

Start with a real health endpoint

The single highest-leverage thing you can do is expose a dedicated health-check endpoint that returns 200 only when the application and its critical dependencies are actually healthy. A view bound to a simple path like /healthz gives your monitor one URL whose status code is a faithful signal of whether the app can serve requests. For deeper background on the pattern, see our guide to HTTP health check endpoints.

You have two reasonable options in Django. The fastest is the django-health-check package, which ships pluggable backends for the database, cache, storage, disk usage, and Celery. You install it, register the backends you care about, and wire its URLs into your URLconf — you get a single endpoint that returns a non-200 if any backend reports a problem. It is the path of least resistance and a great default.
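As a rough sketch of that wiring, the fragment below follows the layout documented for the 3.x releases of django-health-check. Newer releases may configure checks differently, so treat the app labels and the `healthz/` route as assumptions to verify against the README of the version you install.

```python
# settings.py (fragment): register only the backends you care about.
INSTALLED_APPS = [
    # ... your existing apps ...
    "health_check",                 # core app, required
    "health_check.db",              # database backend
    "health_check.cache",           # cache backend
    "health_check.storage",         # default file storage backend
    "health_check.contrib.celery",  # Celery backend; needs a running worker
]

# urls.py (fragment): expose the combined endpoint.
from django.urls import include, path

urlpatterns = [
    # ... your existing routes ...
    path("healthz/", include("health_check.urls")),
]
```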

The alternative is a custom view, which gives you precise control over what counts as healthy. A good custom check does three things and nothing more: it runs a trivial query through Django's database connection (a bare SELECT 1) to confirm the database is reachable, it does a quick set-and-get against the cache backend, and it touches any service the app cannot function without. If all of those succeed, return HttpResponse with status 200 and a tiny body; if any fail, return a 503. Wrap each check in a try/except so a failing dependency produces a clean non-200 rather than an unhandled 500. Our deep-dive on building a database health endpoint covers the query-level details and the traps to avoid.
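One way such a view could look is sketched below. The function names and the `healthz:probe` cache key are illustrative, not part of Django, and the Django imports sit inside the functions only so the aggregation logic can be exercised without a configured project.

```python
import logging

logger = logging.getLogger(__name__)


def check_database():
    """Run a trivial query through Django's default connection."""
    from django.db import connection  # lazy import: needs a configured project

    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        cursor.fetchone()


def check_cache():
    """Write a short-lived key and read it back.

    Fails on the dummy cache backend, which stores nothing.
    """
    from django.core.cache import cache

    cache.set("healthz:probe", "ok", timeout=5)
    if cache.get("healthz:probe") != "ok":
        raise RuntimeError("cache round-trip returned an unexpected value")


def run_checks(checks):
    """Run each named check; return (status_code, per-check results)."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception:
            # Keep the detail in server logs; the response only says "fail".
            logger.exception("health check %r failed", name)
            results[name] = "fail"
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results


def healthz(request):
    """Django view: 200 when every check passes, otherwise 503."""
    from django.http import HttpResponse

    status, _ = run_checks({"database": check_database, "cache": check_cache})
    body = "ok" if status == 200 else "unavailable"
    return HttpResponse(body, status=status, content_type="text/plain")
```

Routing it with something like `path("healthz", healthz)` in the root URLconf gives the monitor a single URL to poll. Any other dependency the app cannot run without can be added as one more entry in the dictionary passed to `run_checks`.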

Three rules keep the endpoint useful. Keep it fast: no expensive aggregations, and no external calls that can themselves time out, or the health check becomes the outage. Keep the response body small and free of sensitive data. And decide on authentication deliberately: an endpoint that returns only a status code is safe to leave public, while a richer diagnostic endpoint should sit behind a static token so your monitor can still reach it. Prefer passing the token in a header, because query strings tend to end up in access logs.
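A minimal sketch of the token-protected variant follows. The `X-Health-Token` header and the `HEALTH_TOKEN` environment variable are hypothetical names chosen for illustration, and the returned detail is a placeholder.

```python
import hmac
import os


def token_is_valid(supplied, expected):
    """Compare tokens in constant time; an empty value on either side fails."""
    if not supplied or not expected:
        return False
    return hmac.compare_digest(supplied.encode("utf-8"), expected.encode("utf-8"))


def healthz_details(request):
    """Django view: diagnostic detail only for callers that present the token."""
    from django.http import JsonResponse  # lazy import keeps the helper standalone

    expected = os.environ.get("HEALTH_TOKEN", "")  # hypothetical variable name
    supplied = request.headers.get("X-Health-Token", "")  # hypothetical header
    if not token_is_valid(supplied, expected):
        return JsonResponse({"detail": "forbidden"}, status=403)
    # Placeholder payload: substitute the per-dependency detail your checks produce.
    return JsonResponse({"database": "ok", "cache": "ok"})
```

Rejecting requests when no token is configured means a missing environment variable closes the endpoint rather than opening it.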

What to monitor in a Django app

One health endpoint is the foundation, not the whole story. A complete picture spreads a handful of checks across the surfaces your users and integrators actually touch:

  • The public site and homepage. The page real visitors land on, checked for a 200 and for real rendered content — not just that something responded.
  • The health endpoint. Your deep check, giving you a dependency-aware signal that is independent of any single page.
  • Key API endpoints. If you run Django REST Framework, your most important read endpoints deserve their own checks; an API can break while the HTML site is fine. See monitoring API endpoints for assertions on JSON responses.
  • The admin login page. The Django admin at /admin/login/ is your operational lifeline; if it stops rendering, you have lost your emergency control panel.
  • Static and media availability. A direct request to a known CSS, JS, or media URL confirms collectstatic ran and your storage or CDN is serving assets.
  • HTTPS and the SSL certificate. An expired certificate takes the whole site down for browsers regardless of how healthy the app is.

You can cover all of this comfortably inside CronAlert's free plan, which includes 25 monitors at a 3-minute interval with SSL and content checks. Upgrading to Pro tightens the interval to 1 minute and unlocks keyword monitoring across every monitor.

Django-specific failure modes

Generic uptime advice misses the things that actually take Django apps down. These are the failures worth designing your checks around:

  • Database connection exhaustion. Under load, or with a connection leak, Django runs out of database connections and every request starts throwing OperationalError. A health check that runs a real query catches this; a process-only check does not.
  • Migrations not applied. A deploy that ships new models but skips migrate produces ProgrammingError for any view that touches the changed tables — often a 500 on a subset of pages while the homepage looks fine.
  • DEBUG = True in production. A misconfigured deploy that leaves DEBUG = True exposes stack traces and settings to the world. It is a security incident that changes no status code on healthy pages, so only a check that inspects response content, such as the body of an error page, will flag it.
  • ALLOWED_HOSTS misconfiguration. If the host you request is not in ALLOWED_HOSTS, Django returns a 400 even though the app is healthy — a classic post-deploy surprise.
  • Static files not collected. Forgetting collectstatic means CSS and JS 404, so the page returns 200 but renders unstyled and broken.
  • Dead Celery workers. When the worker pool dies, async tasks pile up in the broker and silently stop completing while HTTP traffic carries on as normal.
  • gunicorn / uWSGI worker timeouts. Slow views that exceed the worker timeout get killed mid-request, surfacing as intermittent 502s or 504s and climbing response times.

Notice how many of these never change the homepage status code. That is the central lesson of Django monitoring: status codes alone lie, and you need content assertions and dependency-aware checks to see the truth.
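The unapplied-migrations case from the list above can be folded into a deep health check. The sketch below uses Django's `MigrationExecutor` to ask whether any migrations are still pending; the helper names are illustrative, and the lookup function is injectable only so the failure path can be tested without a database.

```python
def pending_migrations(using="default"):
    """List unapplied migrations as 'app_label.name' strings."""
    from django.db import connections
    from django.db.migrations.executor import MigrationExecutor

    executor = MigrationExecutor(connections[using])
    targets = executor.loader.graph.leaf_nodes()
    plan = executor.migration_plan(targets)  # list of (migration, backwards) pairs
    return [f"{migration.app_label}.{migration.name}" for migration, _ in plan]


def check_migrations(get_pending=pending_migrations):
    """Raise if the schema is behind the code, so a health view can return 503."""
    pending = get_pending()
    if pending:
        raise RuntimeError(
            f"{len(pending)} unapplied migration(s), first: {pending[0]}"
        )
```

Building the migration graph reads every migration module, so this check is probably better suited to a deeper diagnostic endpoint or a post-deploy smoke test than to a view polled every minute.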

Monitoring Celery and background jobs

Background processing is where Django monitoring most often falls short. Your site can be flawlessly up while Celery beat has stopped firing scheduled tasks or every worker has crashed — and no HTTP check will ever notice, because nobody is making a request that fails. Emails stop sending, reports stop generating, cleanup jobs stop running, and you find out from a customer days later.

The fix is heartbeat monitoring, sometimes called dead-man's-switch monitoring. Instead of CronAlert reaching into your worker, your job reaches out: every time a Celery beat task runs successfully, it makes a quick HTTP request to a unique CronAlert heartbeat URL. You tell CronAlert how often to expect that ping plus a grace period, and if a ping fails to arrive on schedule, CronAlert alerts you that the job went silent. The contract is inverted from an uptime check: silence is the alarm.

A practical setup is a dedicated, lightweight beat task, scheduled on the same cadence as your most critical periodic work, whose only job is to ping the heartbeat after confirming that the worker can reach the broker and database. If that task stops landing its ping, you know the whole Celery pipeline is in trouble. Our guides to cron job heartbeat monitoring and background worker monitoring walk through the patterns in depth, and for one-shot scheduled jobs see batch job monitoring. Heartbeat checks are available on every CronAlert plan, including the free tier.
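A sketch of such a task is below, under a few assumptions: the heartbeat URL is a placeholder, the endpoint is assumed to accept a plain GET, and the task and schedule names are made up for illustration. Registration goes through a function that takes your Celery app, using the `app.task` decorator and the `beat_schedule` setting.

```python
import urllib.request

# Hypothetical placeholder; use the URL shown for your heartbeat monitor.
HEARTBEAT_URL = "https://example.invalid/heartbeat/your-token"


def ping_heartbeat(url, timeout=10):
    """Send the heartbeat with a GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def confirm_database():
    """Raise if the worker process cannot query the database."""
    from django.db import connection  # lazy import: needs a configured project

    with connection.cursor() as cursor:
        cursor.execute("SELECT 1")
        cursor.fetchone()


def register_heartbeat(app, url=HEARTBEAT_URL, every_seconds=300.0,
                       precheck=confirm_database, ping=ping_heartbeat):
    """Define the heartbeat task on a Celery app and add it to the beat schedule."""

    @app.task(name="monitoring.heartbeat")
    def heartbeat():
        # The task only runs if beat queued it and a worker pulled it from the
        # broker, so reaching this line already exercises the broker path.
        precheck()  # raises on failure, so no ping is sent
        return ping(url)

    app.conf.beat_schedule = {
        **(app.conf.beat_schedule or {}),
        "monitoring-heartbeat": {
            "task": "monitoring.heartbeat",
            "schedule": every_seconds,
        },
    }
    return heartbeat
```

Calling `register_heartbeat(app)` once where the Celery app is defined should be enough. Set the expected interval on the monitor to match `every_seconds`, and choose a grace period that tolerates an occasional slow run.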

Catching "up but wrong" with content checks

A Django response can be a perfect HTTP 200 and still be completely broken. A deploy that shipped an empty template, a page rendering a styled "something went wrong" error, or a homepage missing its content because static assets 404ed will all pass a status-code check. To catch these you have to assert on what the page actually contains.

The simplest tool is a keyword assertion: pick a string that appears only on a correctly rendered page — a navigation label, a product name, your footer copyright line — and require the check to find it. If a broken deploy strips that content or replaces the page with an error screen, the keyword disappears and the check fails even though the status code is 200. You can also assert that an error-page string is absent, so a leaked Django debug traceback (which often contains telltale phrases) trips the alarm. Our keyword monitoring guide covers building robust assertions, including regex matching on Pro.
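To make the logic concrete, here is a small local sketch of a required-and-forbidden keyword check. It illustrates the idea rather than how CronAlert implements it, and the marker strings are assumptions based on the wording of Django's debug pages, so confirm them against your own pages and Django version.

```python
def evaluate_page(body, required=(), forbidden=()):
    """Return a list of problems; an empty list means the page passed."""
    problems = [
        f"missing required text: {text!r}" for text in required if text not in body
    ]
    problems += [
        f"found forbidden text: {text!r}" for text in forbidden if text in body
    ]
    return problems


# Assumed markers of a leaked debug page; adjust for your Django version.
FORBIDDEN = ("DEBUG = True", "Traceback (most recent call last)")
```

A call such as `evaluate_page(html, required=("Pricing",), forbidden=FORBIDDEN)` would fail a page that lost its navigation label or leaked a debug screen, even when the status code is 200.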

For pages that are supposed to be byte-for-byte stable — a marketing page, a terms-of-service document, a generated status snapshot — a SHA-256 content hash is even stronger. CronAlert hashes the rendered output and alerts you the instant it changes, which catches both an unexpected drift and a defacement. See content monitoring for how content-hash checks work in practice.
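The underlying comparison is simple enough to sketch locally. The helpers below are illustrative names, not CronAlert's implementation: they hash the response bytes with SHA-256 and compare the digest against a stored baseline.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of the raw response body."""
    return hashlib.sha256(body).hexdigest()


def has_drifted(body: bytes, baseline: str) -> bool:
    """True when the body no longer matches the recorded baseline digest."""
    return content_hash(body) != baseline
```

Because a single changed byte produces a different digest, this only suits pages with no per-request content; a page that embeds a CSRF token or a timestamp would report drift on every check.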

SSL and response-time thresholds

Two more checks round out a Django setup. First, certificate monitoring: an expired TLS certificate breaks the site for every browser no matter how healthy the app is, and certificate renewals fail more often than people expect. Point an SSL check at your domain and get warned days before expiry — see SSL certificate monitoring.

Second, response-time thresholds. Django views that creep slower — an N+1 query, an unindexed lookup, a degraded cache — are an early warning of trouble well before they turn into worker timeouts and 502s. Configure a response-time threshold on your most important monitors so a view that crosses, say, two seconds alerts you while it is still merely slow rather than down. Our guide to timeout thresholds explains how to pick sensible numbers without generating noise.
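To see which views are drifting toward that threshold from inside the app, one option is a small timing middleware. The sketch below uses Django's function-based middleware style; the logger name and the two-second constant are assumptions that mirror the example above.

```python
import logging
import time

logger = logging.getLogger("slow_requests")

SLOW_REQUEST_SECONDS = 2.0  # mirrors the two-second example; tune per app


def slow_request_middleware(get_response):
    """Django middleware factory that logs requests slower than the threshold."""

    def middleware(request):
        started = time.monotonic()
        response = get_response(request)
        elapsed = time.monotonic() - started
        if elapsed >= SLOW_REQUEST_SECONDS:
            logger.warning(
                "slow request: %s %s took %.2fs",
                request.method, request.path, elapsed,
            )
        return response

    return middleware
```

Adding the function's dotted path to `MIDDLEWARE` enables it. The resulting log lines give you real numbers to base the monitor threshold on, rather than a guess.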

A concrete CronAlert setup

Here is an end-to-end setup you can replicate in a few minutes after you create a free CronAlert account:

  • Build the health view. Add a /healthz view that runs SELECT 1, pings the cache, and returns 200 or 503. Use django-health-check if you prefer batteries included.
  • Create an HTTP monitor for the health endpoint. Point a CronAlert HTTP monitor at https://yourapp.com/healthz, expect a 200, and set the interval (3 minutes free, 1 minute on Pro).
  • Add a keyword assertion to the homepage. Create a second monitor on your homepage and require a string only present when the page renders correctly, so a broken deploy fails the check.
  • Add API and admin-login monitors. Check a key DRF endpoint and the /admin/login/ page so your control plane and integrators are covered.
  • Add a heartbeat for Celery beat. Create a heartbeat monitor, set the expected interval and grace period, and have a beat task ping its URL on every successful run.
  • Turn on SSL and response-time checks. Enable certificate monitoring on your domain and set a response-time threshold on your critical monitors.
  • Wire up alert channels. Connect email, Slack, Discord, Teams, Telegram, PagerDuty, Opsgenie, Splunk On-Call, a webhook, or PWA push — so the right people hear about it the right way.

Every plan includes the full REST API, and CronAlert ships an MCP server so you can manage monitors directly from Claude Code, Cursor, or Windsurf. Teams that need multi-region coverage can move to the Team plan, which checks from five edge regions with quorum so a single regional blip never pages you.

Frequently asked questions

Where should I put a health check endpoint in a Django app?

Add a lightweight view wired to a simple URL like /healthz in your root URLconf, ahead of any catch-all patterns so nothing shadows it. The view should verify the things that matter for serving traffic (a database query, a cache round-trip, and any critical external service), then return HTTP 200 with a tiny body, or 503 if any of them fails. Keep it fast so the endpoint never becomes a bottleneck, and either leave it unauthenticated or protect it with a static token.

Should the Django health endpoint check the database?

Yes. A deep health check should run a trivial query such as SELECT 1 through Django's connection so a database that is unreachable or out of connections surfaces as a failure. Many of Django's worst outages are database-related — connection exhaustion, a paused instance, or unapplied migrations — and a process-only check will happily return 200 while every real request 500s. The django-health-check package provides ready-made backends for the database, cache, storage, and Celery.

How do I monitor Celery workers and Celery beat?

Use heartbeat monitoring. Have a periodic Celery beat task make an HTTP request to a CronAlert heartbeat URL every time it runs successfully, and configure the expected interval plus a grace period. If beat stops scheduling or the worker pool dies, the ping stops arriving and CronAlert alerts you. This catches the failure mode HTTP checks miss entirely: the website is up, but background processing has quietly stopped.

Why does my Django site return 400 Bad Request to a monitor?

A 400 from Django is almost always an ALLOWED_HOSTS misconfiguration. Django rejects any request whose Host header is not listed by raising DisallowedHost, a SuspiciousOperation subclass that renders as HTTP 400. When you point a monitor at a hostname the app does not recognize (a bare IP, a new domain, or a staging URL), you get a 400 even though the process is healthy. Add every hostname you monitor to ALLOWED_HOSTS.
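One common way to keep that list in step with each environment is to read it from an environment variable in settings. In the sketch below, `DJANGO_ALLOWED_HOSTS` is a hypothetical variable name and the parsing helper is illustrative.

```python
import os


def parse_allowed_hosts(raw):
    """Split a comma-separated host list, dropping blanks and stray whitespace."""
    return [host.strip() for host in raw.split(",") if host.strip()]


# settings.py: e.g. DJANGO_ALLOWED_HOSTS="yourapp.com,www.yourapp.com"
ALLOWED_HOSTS = parse_allowed_hosts(os.environ.get("DJANGO_ALLOWED_HOSTS", ""))
```

With this in place, adding a staging hostname for a new monitor becomes a configuration change rather than a code change.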

Can uptime monitoring detect a broken Django deploy that still returns 200?

Yes, with keyword and content checks. A deploy that forgot collectstatic, shipped an empty template, or rendered a styled error page can still return 200, so a status-code-only check passes while users see a broken page. Configure a keyword assertion requiring a string only present on a correctly rendered page, so the check fails when real content is missing. For pages that should never change, a SHA-256 content hash alerts you the moment the rendered output drifts.

Start monitoring your Django app

Django apps fail in ways a single homepage ping will never catch: exhausted database connections, unapplied migrations, dead Celery workers, and broken deploys that still answer 200. A health endpoint, content assertions, heartbeats for background jobs, and SSL plus response-time checks together give you a signal you can trust. CronAlert covers all of it agentlessly from the edge, and the free plan is enough to wire up your first complete setup today. Create a free CronAlert account and point your first check at /healthz in the next five minutes.

Related reading: Building a database health endpoint, Background worker monitoring, API endpoint monitoring, and Postgres replication lag monitoring.