Your homepage loads fine. Your login page returns 200. Your marketing site is fast. But your users cannot save data, load their dashboards, or complete purchases -- because the database is down and your application is serving cached pages, stale data, or generic error screens with perfectly healthy HTTP status codes.
This is the blind spot in most uptime monitoring setups. Monitoring your homepage tells you that your web server is running. It does not tell you that your application can actually do its job. For any app backed by a database -- which is nearly all of them -- you need a dedicated health endpoint that verifies database connectivity. Then you need to monitor that endpoint externally.
Why monitoring your homepage is not enough
Consider what happens when your database goes down but your web server stays up:
- Static pages and CDN-cached assets continue to serve normally. Your homepage returns 200.
- Read-heavy pages might serve stale data from application-level caches for minutes or hours before anyone notices.
- Write operations fail silently or show user-facing errors that your status code monitoring never sees (because the server still returns 200 with an error message in the body).
- Background jobs, webhooks, and scheduled tasks pile up in queues with no visibility into the underlying cause.
A standard HTTP status code check on your homepage will report "all clear" throughout this entire scenario. The outage is real, your users are affected, and your monitoring is blind to it. You need an endpoint that goes deeper than "can the server respond?" and answers the actual question: "can the application do useful work?"
What a database health endpoint should check
A good health endpoint verifies the minimum set of conditions required for your application to function. For a database-backed app, that means:
Basic connectivity
The most fundamental check: can the application open a connection to the database and execute a trivial query? The classic approach is SELECT 1 -- it verifies that the connection is alive, the database is accepting queries, and authentication is working. If this fails, nothing else matters.
Query latency
Measure how long the connectivity check takes. A database that responds to SELECT 1 in 500ms instead of the usual 2ms is not "healthy" -- it is under severe load or experiencing network issues. Include the response time in your health endpoint output so you can set keyword monitoring or timeout thresholds to catch degraded performance before it becomes an outage.
Connection pool stats
If your application uses a connection pool, report the number of active, idle, and waiting connections. A pool with zero available connections and a growing wait queue is about to fail even though individual queries still succeed. This is the kind of leading indicator that a health endpoint can expose before users see errors.
Migration version
Optionally, verify that the database schema is at the expected migration version. This catches a surprisingly common failure mode: a deploy rolls out new application code that expects columns or tables created by a migration that has not run yet. The health endpoint can compare the current migration version against the expected version and report a mismatch.
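The version comparison can be sketched in a few lines. This is a hedged example, not a prescribed implementation: it assumes a Rails/Alembic-style schema_migrations table and an EXPECTED_VERSION string baked into the build at deploy time; adjust both to match your migration tool.

```python
import sqlite3

# Hypothetical: the migration version this build of the app expects,
# stamped in at build time (e.g. from the latest migration filename).
EXPECTED_VERSION = "20240115093000"

def check_migration_version(conn):
    """Return (ok, current_version) from the newest applied migration.

    Assumes a Rails/Alembic-style schema_migrations table; change the
    table and column names to match your migration tooling.
    """
    row = conn.execute("SELECT MAX(version) FROM schema_migrations").fetchone()
    current = row[0]
    return current == EXPECTED_VERSION, current

# Demo against an in-memory SQLite database standing in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schema_migrations (version TEXT)")
conn.execute("INSERT INTO schema_migrations VALUES ('20240115093000')")

ok, current = check_migration_version(conn)
print(ok, current)  # True 20240115093000
```

If `ok` is false, the health endpoint can include a `migration_mismatch` field in its JSON so the deploy pipeline (or an external monitor) sees exactly why the instance is unhealthy.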
Implementation examples
Here are practical implementations in three common stacks. Each one exposes a /healthz endpoint that checks database connectivity and returns a structured JSON response.
Node.js with Express and PostgreSQL
```javascript
const express = require('express');
const { Pool } = require('pg');

const app = express();
const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
});

app.get('/healthz', async (req, res) => {
  const start = Date.now();
  try {
    await pool.query('SELECT 1');
    const latency = Date.now() - start;
    const { totalCount, idleCount, waitingCount } = pool;
    res.json({
      status: 'healthy',
      db: {
        connected: true,
        latency_ms: latency,
        pool: {
          total: totalCount,
          idle: idleCount,
          waiting: waitingCount,
        },
      },
    });
  } catch (err) {
    res.status(503).json({
      status: 'unhealthy',
      db: { connected: false, error: 'connection failed' },
    });
  }
});

app.listen(process.env.PORT || 3000);
```

Python with Flask and SQLAlchemy
```python
import os
import time

from flask import Flask, jsonify
from sqlalchemy import create_engine, text

app = Flask(__name__)
engine = create_engine(os.environ["DATABASE_URL"], pool_size=20)

@app.route("/healthz")
def healthz():
    start = time.monotonic()
    try:
        with engine.connect() as conn:
            conn.execute(text("SELECT 1"))
        latency = round((time.monotonic() - start) * 1000, 1)
        pool_status = engine.pool.status()
        return jsonify({
            "status": "healthy",
            "db": {
                "connected": True,
                "latency_ms": latency,
                "pool": pool_status,
            },
        })
    except Exception:
        return jsonify({
            "status": "unhealthy",
            "db": {"connected": False, "error": "connection failed"},
        }), 503
```

Go with net/http and database/sql
```go
package main

import (
	"database/sql"
	"encoding/json"
	"log"
	"net/http"
	"os"
	"time"

	_ "github.com/lib/pq"
)

var db *sql.DB

func healthzHandler(w http.ResponseWriter, r *http.Request) {
	start := time.Now()
	err := db.Ping()
	latency := time.Since(start).Milliseconds()
	stats := db.Stats()
	w.Header().Set("Content-Type", "application/json")
	if err != nil {
		w.WriteHeader(http.StatusServiceUnavailable)
		json.NewEncoder(w).Encode(map[string]any{
			"status": "unhealthy",
			"db":     map[string]any{"connected": false, "error": "connection failed"},
		})
		return
	}
	json.NewEncoder(w).Encode(map[string]any{
		"status": "healthy",
		"db": map[string]any{
			"connected":  true,
			"latency_ms": latency,
			"pool": map[string]any{
				"open":   stats.OpenConnections,
				"in_use": stats.InUse,
				"idle":   stats.Idle,
				// WaitCount is cumulative since startup, not current queue depth.
				"waiting": stats.WaitCount,
			},
		},
	})
}

func main() {
	var err error
	db, err = sql.Open("postgres", os.Getenv("DATABASE_URL"))
	if err != nil {
		log.Fatal(err)
	}
	http.HandleFunc("/healthz", healthzHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```
All three follow the same pattern: time a trivial query, return connection pool stats, and respond with a clear status field that external monitoring can key on. The "status": "healthy" string in the response body is what you will configure your monitor to look for.
What NOT to put in a health check
A health endpoint runs on every check interval -- potentially once per minute across multiple regions. Everything in it should be fast, safe, and minimal. Here is what to avoid:
- Expensive queries. Do not run SELECT COUNT(*) FROM orders or any query that scans large tables. SELECT 1 is sufficient for connectivity. If you want to check that specific tables exist, use SELECT 1 FROM table_name LIMIT 1 -- not a full table scan.
- Sensitive information. Do not expose connection strings, hostnames, passwords, or detailed error messages. Your health endpoint might be publicly accessible (it needs to be reachable by external monitoring tools). Return "error": "connection failed", not the full stack trace or database credentials.
- Write operations. Never insert, update, or delete data from a health check. The check runs constantly and should have zero side effects. If you want to verify write capability, do it in a separate, less frequent diagnostic job -- not in the endpoint that gets hit every minute.
- Long dependency chains. If your health endpoint checks the database, Redis, three external APIs, and an S3 bucket, any one of those failing makes the whole check fail. A single slow API response can make your health endpoint time out, triggering a false positive for a database problem that does not exist. Keep the primary health check focused on your core dependency (the database) and use separate endpoints for other dependencies.
Target response time: under 200ms. Your health endpoint should respond faster than your application's normal endpoints. If the health check itself is slow, it becomes hard to distinguish a slow check from a genuinely degraded database. Keep it trivial: connect, run SELECT 1, return the result.
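One way to hold that line is to enforce the budget in code and treat a slow check as a failure outright. A minimal sketch, assuming the real check is a callable that runs SELECT 1; the 200ms budget and the helper name are illustrative, and Postgres users could instead set statement_timeout on the health check's connection:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

HEALTH_BUDGET_SECONDS = 0.2  # the 200ms target from above

def run_with_budget(check, budget=HEALTH_BUDGET_SECONDS):
    """Run a health check callable; report a slow check as unhealthy.

    A check that exceeds the budget is failed immediately rather than
    left to block the endpoint while the database limps along.
    """
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(check)
        try:
            return {"status": "healthy", "result": future.result(timeout=budget)}
        except FutureTimeout:
            return {"status": "unhealthy", "error": "health check exceeded budget"}
        except Exception:
            return {"status": "unhealthy", "error": "connection failed"}

# Demo with stand-in callables instead of a real database query.
fast = run_with_budget(lambda: "ok")          # completes well under budget
slow = run_with_budget(lambda: time.sleep(1)) # blows the 200ms budget
print(fast["status"], slow["status"])  # healthy unhealthy
```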
How to monitor it externally with CronAlert
Once your health endpoint is deployed, you need an external service to check it on a schedule and alert you when it fails. Internal health checks (like Kubernetes probes) are valuable for routing traffic, but they cannot detect infrastructure-level failures that affect your entire cluster. External monitoring verifies that the endpoint is reachable from the outside world -- the same path your users take.
Here is how to set it up in CronAlert:
- Create a new monitor pointing at your health endpoint URL, e.g., https://api.yourapp.com/healthz.
- Set the expected status code to 200. If your endpoint returns 503 when the database is down, the status code check alone will catch complete outages.
- Add keyword monitoring (Pro plan) to check that the response body contains "status": "healthy". Match on the full key-value pair rather than the bare word "healthy", which also appears inside "unhealthy". This catches the scenario where the server returns 200 but the response body indicates a degraded state -- for example, if a middleware catches the 503 and rewrites it to 200 with an error body.
- Set the timeout to a reasonable value. If your health endpoint normally responds in 50ms, a 5-second timeout catches genuine slowdowns without raising false positives for minor latency spikes.
- Configure alert channels. Route alerts to Slack for team awareness and PagerDuty for on-call response. Database outages are almost always urgent -- this is not the place for email-only notifications.
The combination of status code checking and keyword monitoring gives you two layers of detection. The status code catches hard failures (503, connection refused, timeout). The keyword check catches soft failures (200 response with an unhealthy status in the body). Together they cover the full range of database outage scenarios.
For microservices architectures, set up a separate health endpoint monitor for each service that has its own database connection. A shared database going down affects every service that depends on it, but you want per-service visibility so you can see exactly which services are impacted and which are still operational.
Advanced patterns
Liveness vs readiness probes
If you are running in Kubernetes or a similar orchestration platform, you should expose two separate endpoints:
- /livez (liveness) -- answers "is the process running and not deadlocked?" This should be a trivial check that does not touch external dependencies. Return 200 if the HTTP server can respond at all. If this fails, the container should be restarted.
- /readyz (readiness) -- answers "is the application ready to serve traffic?" This is where the database check goes. If the database is unreachable, return 503 so the load balancer stops routing traffic to this instance. The process stays running and will return to service once the database recovers.
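The split can be sketched framework-agnostically as two small functions returning an HTTP status and body; the names and return shape are illustrative -- wire them to /livez and /readyz routes in whatever framework you use:

```python
def livez():
    # Liveness: if this code runs at all, the process is alive.
    # Deliberately touches no external dependencies.
    return 200, {"status": "alive"}

def readyz(check_db):
    # Readiness: healthy only if the database answers. `check_db` is a
    # callable that raises on failure (e.g. one that runs SELECT 1).
    try:
        check_db()
        return 200, {"status": "healthy"}
    except Exception:
        return 503, {"status": "unhealthy"}

def broken_db():
    # Stand-in for a database that is down.
    raise ConnectionError("db unreachable")

print(livez())                   # (200, {'status': 'alive'})
print(readyz(lambda: None))      # (200, {'status': 'healthy'})
print(readyz(broken_db))         # (503, {'status': 'unhealthy'})
```

The key property: a database outage flips readiness to 503 while liveness keeps returning 200, so the orchestrator drains traffic instead of restart-looping a perfectly healthy process.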
For external monitoring with CronAlert, monitor the readiness endpoint. That is the one that tells you whether users can actually use your application. The liveness endpoint is useful for Kubernetes internally but does not provide much value for external monitoring -- a process that is alive but cannot reach its database is not serving your users.
Dependency health mapping
For applications with multiple dependencies beyond the database -- Redis, message queues, external APIs, object storage -- consider a structured response that reports each dependency individually:
```json
{
  "status": "degraded",
  "dependencies": {
    "postgres": { "status": "healthy", "latency_ms": 3 },
    "redis": { "status": "healthy", "latency_ms": 1 },
    "s3": { "status": "unhealthy", "error": "timeout" },
    "stripe_api": { "status": "healthy", "latency_ms": 120 }
  }
}
```

The top-level status reflects the worst dependency state. If any critical dependency is unhealthy, the overall status is unhealthy. If a non-critical dependency is down (like an analytics service), the status might be "degraded" instead -- the application works, but some features are impaired.
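That aggregation rule fits in a small function. The CRITICAL set below is an assumption for illustration -- classify dependencies according to your own architecture:

```python
# Dependencies whose failure makes the whole app unusable (assumed set).
CRITICAL = {"postgres", "redis"}

def overall_status(dependencies):
    """Collapse per-dependency statuses into one top-level status."""
    statuses = {name: dep["status"] for name, dep in dependencies.items()}
    if any(s != "healthy" for name, s in statuses.items() if name in CRITICAL):
        return "unhealthy"   # a critical dependency is down
    if any(s != "healthy" for s in statuses.values()):
        return "degraded"    # only non-critical dependencies are impaired
    return "healthy"

deps = {
    "postgres": {"status": "healthy", "latency_ms": 3},
    "redis": {"status": "healthy", "latency_ms": 1},
    "s3": {"status": "unhealthy", "error": "timeout"},
}
print(overall_status(deps))  # degraded
```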
With CronAlert's keyword monitoring, you can monitor this endpoint for the string "status": "healthy" at the top level (match the full key-value pair, since the bare word "healthy" also appears inside "unhealthy"). When any dependency fails, the top-level status changes to "unhealthy" or "degraded", the keyword match fails, and you get an alert. This turns your health endpoint into a comprehensive dependency monitoring system that is checked externally every minute.
Integrating health checks into your CI/CD pipeline
Health endpoints are not just for monitoring -- they are a deployment safety net. After each deploy, your CI/CD pipeline should hit the health endpoint and verify that the new version is healthy before routing traffic to it. If the health check fails after deploy, roll back automatically. This catches broken database migrations, misconfigured connection strings, and other deploy-time failures before they affect users.
A typical post-deploy health check in CI looks like this:
```shell
# Wait for the new deployment to be ready, then verify health
for i in $(seq 1 30); do
  response=$(curl -s -o /dev/null -w "%{http_code}" https://api.yourapp.com/healthz)
  if [ "$response" = "200" ]; then
    echo "Health check passed"
    exit 0
  fi
  sleep 2
done
echo "Health check failed after 60 seconds"
exit 1
```

This complements external monitoring. Your CI pipeline catches deploy-time failures immediately. CronAlert catches failures that develop after the deploy succeeds -- database connection pool exhaustion, slow query degradation, or upstream dependency outages that happen hours later.
Frequently asked questions
How often should I check my database health endpoint?
For production databases, check every 1 to 3 minutes. A 1-minute interval (available on CronAlert paid plans) catches outages quickly without generating excessive load. A 3-minute interval (CronAlert free plan) is sufficient for most applications where a few minutes of detection delay is acceptable. Avoid checking more frequently than once per minute -- the health endpoint itself adds load to your database, and sub-minute checks rarely provide meaningful additional coverage.
Should my health endpoint require authentication?
Generally no. Health endpoints should be lightweight and accessible without authentication so that external monitoring tools, load balancers, and orchestrators can reach them. If you are concerned about information disclosure, return minimal data -- just a status field -- rather than adding authentication. If your security policy requires it, use a static API key in a request header rather than session-based auth, and configure your monitoring tool to send that header with each check. CronAlert supports custom request headers on all plans for exactly this use case.
What is the difference between a liveness probe and a readiness probe?
A liveness probe answers whether the process is running and not deadlocked. If it fails, the process should be restarted. A readiness probe answers whether the application is ready to serve traffic. If it fails, the instance should be removed from the load balancer but not restarted. A database health check is typically a readiness probe -- if the database is unreachable, the app cannot serve requests, but the process itself is fine and will recover once the database comes back.
Can I use the same health endpoint for both Kubernetes probes and external monitoring?
Yes, and you should. A single /healthz endpoint that checks database connectivity works for both Kubernetes readiness probes and external monitoring tools like CronAlert. Kubernetes checks it from inside the cluster to manage pod routing, while CronAlert checks it from outside to detect infrastructure-level outages that affect the entire cluster. The two complement each other -- Kubernetes handles internal routing, external monitoring catches problems that Kubernetes cannot see.
Start monitoring your database health
A health endpoint takes 15 minutes to implement and closes the biggest gap in most monitoring setups. Without one, you are trusting that a 200 from your homepage means everything is fine. With one, you have direct visibility into whether your application can actually talk to its database -- the single dependency that matters most.
Build the endpoint, deploy it, then create a free CronAlert account and point a monitor at it. The free plan gives you 25 monitors with 3-minute checks and API endpoint monitoring. Upgrade to Pro ($5/month) for 1-minute checks and keyword monitoring that verifies the response body contains your expected health status. See the pricing page for full plan details.