Kubernetes gives you liveness probes, readiness probes, and startup probes. The kubelet restarts unhealthy pods automatically. Your cluster hums along, every probe returning 200, every pod in a Running state. And yet your users are staring at a 502 because someone deleted the ingress annotation that maps your domain to the backend service. Inside the cluster, everything is fine. Outside, nothing works.

Internal health checks answer one question: "Is this pod alive?" External uptime monitoring answers a different one: "Can a real user reach this service right now?" You need both. This guide covers how to set up external monitoring for Kubernetes-hosted services using CronAlert, and why it catches an entire class of failures that probes never will.

Why internal probes are not enough

Kubernetes probes operate inside the cluster network. They hit pod IPs directly, bypassing every layer between your users and your application. That means probes are blind to:

  • Ingress misconfigurations. A bad annotation, a missing path rule, or a typo in a host field will route traffic to the wrong backend or return a default 404 -- while the pod itself is perfectly healthy.
  • cert-manager failures. Your TLS certificate expires or fails to renew. The pod is fine. The ingress controller keeps serving the expired certificate, and every visitor sees a browser security warning instead of your app.
  • DNS issues. Your external DNS record points to a stale IP after a cluster migration or load balancer recreation. Probes still hit the pod by IP; users get DNS resolution failures.
  • Load balancer health. Cloud load balancers (ALB, NLB, GCE Ingress) have their own failure modes. A misconfigured health check on the LB side can drain all targets while pods report ready.
  • CDN and edge caching problems. A CDN in front of your cluster can serve stale errors, cache 5xx responses, or fail to connect to your origin -- none of which your probes will detect.

The rule of thumb: If a failure mode exists between the user's browser and your pod, internal probes will not catch it. External monitoring fills that gap. For more on SSL certificate monitoring specifically, see our dedicated guide.

What to monitor externally on a Kubernetes cluster

Not every Kubernetes resource needs an external monitor -- only the ones with public-facing endpoints. Here is what to target:

  • Ingress endpoints. Every host defined in your Ingress resources represents a URL your users hit. Monitor each one at its public domain.
  • API gateway routes. If you run an API gateway (Kong, Ambassador, Traefik as gateway) in front of your services, monitor the key routes through the gateway's external address.
  • Public-facing service URLs. Services exposed via LoadBalancer type or NodePort that users access directly.
  • Health check endpoints exposed through ingress. If you expose a /healthz or /ready endpoint through your ingress for external consumers, monitor it the same way your users would reach it -- through the public URL, not the internal pod IP.

For each of these, you want to verify not just that the endpoint returns a 200, but that the response content is correct. A misconfigured ingress might return the default backend's 200 page instead of your actual application. Keyword monitoring catches this by checking that the response body contains expected content.
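You can mimic what a keyword check does from your own terminal: fetch the body and verify it contains the expected string. A minimal sketch (the sample bodies and keyword are placeholders; in practice you would fetch the body with `curl -fsS "$URL"`):

```shell
#!/usr/bin/env sh
# check_keyword BODY KEYWORD -> prints PASS if BODY contains KEYWORD, else FAIL
check_keyword() {
  body="$1"; keyword="$2"
  case "$body" in
    *"$keyword"*) echo "PASS" ;;
    *)            echo "FAIL" ;;
  esac
}

# A healthy response passes; a default backend's page does not,
# even though both can return HTTP 200:
check_keyword '{"status":"ok"}' '"status":"ok"'       # PASS
check_keyword 'default backend - 404' '"status":"ok"' # FAIL
```

This is exactly the failure a status-code-only check misses: the default backend answers, the status looks fine, and only the body reveals the wrong service is being served.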

Setting up monitors with CronAlert

For each external endpoint on your cluster, create a monitor in CronAlert pointed at its public URL. Here is a practical setup for a typical cluster:

  1. List your ingress hosts. Run kubectl get ingress -A to see every host your cluster exposes. Each distinct host is a monitor candidate.
  2. Create a monitor per endpoint. In the CronAlert dashboard, add a monitor for each URL. Use the full public URL including the protocol (e.g., https://api.yourapp.com/healthz).
  3. Enable keyword monitoring. For critical services, add a keyword check. If your health endpoint returns {"status":"ok"}, set that as the expected keyword. This catches cases where the ingress returns a 200 but serves the wrong backend.
  4. Set appropriate intervals. The free plan checks every 3 minutes. For production services where minutes matter, the Pro plan checks every minute.

Naming convention: Use a consistent naming pattern like k8s-prod-api, k8s-prod-web, k8s-staging-api. This makes it easy to filter monitors in the dashboard and manage them via the API.
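Names in that pattern can be generated straight from your ingress hosts. A sketch, assuming the first DNS label of each host names the service (adjust to your own DNS scheme):

```shell
#!/usr/bin/env sh
# monitor_name ENV HOST -> k8s-ENV-<first DNS label of HOST>
# e.g. api.yourapp.com in prod -> k8s-prod-api
monitor_name() {
  env="$1"; host="$2"
  echo "k8s-${env}-${host%%.*}"   # %%.* strips everything after the first dot
}

# To list candidate hosts from the cluster:
#   kubectl get ingress -A -o jsonpath='{range .items[*]}{.spec.rules[*].host}{"\n"}{end}'

monitor_name prod api.yourapp.com       # k8s-prod-api
monitor_name staging web.yourapp.com    # k8s-staging-web
```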

Monitoring across environments

Most Kubernetes setups have at least two environments -- staging and production -- each with its own ingress. Monitor both, but route alerts differently:

  • Production monitors alert to your on-call channel (PagerDuty, Slack #incidents, or a webhook that pages someone).
  • Staging monitors alert to a lower-priority channel (Slack #staging-alerts, email digest) so failures are visible without waking anyone up.

This separation matters because staging endpoints break more often -- someone is testing a new ingress config, a cert is self-signed, a deploy is half-finished. You want visibility into staging health without alert fatigue. CronAlert lets you configure different alert channels per monitor, so the routing is straightforward.

Multi-region monitoring for Kubernetes

If your cluster serves traffic globally -- whether through a single cluster with a CDN in front, or a multi-cluster setup with geographic routing -- single-region monitoring has the same blind spots it always does. Your US-based check returns 200 while your EU users hit a broken edge node.

CronAlert's multi-region monitoring checks from 5 locations simultaneously: US East, US West, EU West, EU Central, and AP Southeast. For Kubernetes clusters behind a CDN or global load balancer, this catches region-specific routing failures that a single probe location would miss. Multi-region is available on Team and Business plans.

Integrating with your Kubernetes workflow

Manual monitor creation works for a handful of services. Once your cluster grows, you want monitoring to be part of the deployment lifecycle. CronAlert's REST API makes this possible:

  • Create monitors on deploy. Add an API call to your CI/CD pipeline or Helm post-install hook that creates a monitor for each new service. Check if the monitor already exists first to avoid duplicates.
  • Pause during rollouts. If you do rolling updates that temporarily return errors, use the API to pause the monitor before the rollout and resume it after. This prevents false alerts during planned deployments.
  • Delete on teardown. When you decommission a service and remove its ingress, delete the corresponding monitor. Orphaned monitors waste your monitor quota and generate noise.
  • Sync with GitOps. If you use ArgoCD or Flux, consider a post-sync hook that reconciles your monitor list with the current set of ingress hosts.
```shell
# Example: create a monitor after helm install
helm install my-service ./chart --wait

curl -X POST https://cronalert.com/api/v1/monitors \
  -H "Authorization: Bearer $CRONALERT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "k8s-prod-my-service",
    "url": "https://my-service.example.com/healthz",
    "method": "GET"
  }'
```

Alert routing for Kubernetes teams

Kubernetes clusters typically serve multiple teams. Each team owns a set of services and wants to be notified about their own services, not everyone else's. Set up alert channels to match your team structure:

  • Slack channel per team. Route each monitor's alerts to the owning team's Slack channel. The backend team gets backend alerts; the frontend team gets frontend alerts.
  • PagerDuty for on-call. Connect critical production monitors to PagerDuty so the on-call engineer gets paged for real outages, not staging hiccups.
  • Webhook for custom automation. Use webhook alerts to trigger custom runbooks -- automatically scale up a deployment, restart a pod, or open an incident in your incident management tool.
  • Email for audit trail. Keep email alerts enabled alongside real-time channels so you have a searchable record of every incident.

CronAlert supports Slack, Discord, PagerDuty, Teams, Telegram, email, and generic webhooks. On the free plan you get email, Slack, Discord, and webhook. PagerDuty, Teams, and Telegram are available on Pro and above.

Common Kubernetes failure modes that external monitoring catches

Here are real failure scenarios where internal probes report healthy but users cannot reach the service:

  • cert-manager renewal failure. The ClusterIssuer's ACME challenge fails silently. The certificate expires. NGINX ingress controller starts serving a self-signed cert or refusing connections. Pods are healthy. Users see ERR_CERT_AUTHORITY_INVALID.
  • Ingress controller crash. The ingress controller pods themselves crash or get evicted due to resource pressure. No ingress rules are being served. Every service behind the ingress is unreachable, but all backend pods pass their probes.
  • HPA scaling too slow. A traffic spike hits. The Horizontal Pod Autoscaler starts scaling, but new pods take 30 seconds to pass readiness checks. During that window, existing pods are overwhelmed and timing out. Probes can still pass because the kubelet often checks a lightweight health path, or a separate port, that is not saturated the way real user traffic is.
  • Node pool exhaustion. The cluster runs out of schedulable nodes. New pods sit in Pending state. Existing pods are fine, so probes pass, but the service is degraded because it is running at reduced capacity.
  • Cloud load balancer target deregistration. A node is drained, but the cloud LB's deregistration delay means it keeps sending traffic to the draining node for up to 300 seconds (a common default). Pods have already shut down. The LB returns 502s.

Every one of these scenarios has the same signature: kubectl get pods shows everything Running and Ready, but real users cannot reach the service. External monitoring is the only way to detect them without relying on user reports.
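For the cert-manager scenario in particular, you can check the remaining certificate lifetime from outside the cluster yourself. The helper below does the date arithmetic on a `notAfter` string as printed by `openssl x509 -enddate` (it relies on GNU `date -d`; the live openssl pipeline is commented out because it needs network access):

```shell
#!/usr/bin/env sh
# days_until NOTAFTER -> whole days until the given expiry date (GNU date)
days_until() {
  now=$(date -u +%s)
  end=$(date -u -d "$1" +%s)
  echo $(( (end - now) / 86400 ))
}

# Fetch the live expiry for a host (needs network):
#   end=$(echo | openssl s_client -connect api.yourapp.com:443 \
#           -servername api.yourapp.com 2>/dev/null \
#         | openssl x509 -noout -enddate | cut -d= -f2)
#   days_until "$end"

days_until "Jan  1 00:00:00 2099 GMT"   # large positive number of days
```

A negative result means the certificate is already expired, which is exactly the state cert-manager renewal failures leave you in.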

FAQ

Can CronAlert monitor services inside my Kubernetes cluster?

CronAlert monitors from outside your cluster, checking the same endpoints your users hit. It cannot reach ClusterIP services or pod IPs directly. To monitor an internal service, expose a health endpoint through your ingress controller or load balancer.
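A minimal Ingress rule for that (names, host, and port are placeholders, and it assumes the NGINX ingress class) publishes only the `/healthz` path of an internal service on the public domain:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service-health        # hypothetical name
spec:
  ingressClassName: nginx        # assumes the NGINX ingress controller
  rules:
    - host: api.yourapp.com      # placeholder domain
      http:
        paths:
          - path: /healthz       # expose only the health path, not the whole service
            pathType: Exact
            backend:
              service:
                name: my-service
                port:
                  number: 8080   # assumed service port
```

Because the monitor then traverses DNS, TLS, the load balancer, and the ingress controller on every check, it exercises the same path your users do.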

How is external monitoring different from liveness and readiness probes?

Liveness and readiness probes run inside the cluster and tell Kubernetes whether a pod is healthy. External monitoring checks the full request path from the public internet through DNS, load balancer, ingress controller, and service routing. A pod can pass its probes while being unreachable from outside due to ingress misconfiguration, certificate errors, or DNS issues.

Should I create a separate monitor for each Kubernetes service?

Yes. Each externally accessible service should have its own monitor pointed at its public URL. This gives you per-service incident history and lets you route alerts to the team that owns that service. On the free plan you get 25 monitors, which covers most clusters.

Can I automate monitor creation when I deploy a new service to Kubernetes?

Yes. Use CronAlert's REST API to create monitors as part of your CI/CD pipeline or Helm post-install hook. The API supports creating, updating, and deleting monitors programmatically. Write access requires a Pro plan or higher.

Start monitoring your cluster from the outside

Your Kubernetes probes tell you whether pods are alive. External monitoring tells you whether users can reach them. The gap between those two is where outages hide -- sometimes for hours, until a customer emails you.

Create a free CronAlert account to start monitoring your cluster's external endpoints. The free plan includes 25 monitors with 3-minute checks -- enough to cover most clusters. When you need 1-minute intervals or multi-region checks, paid plans start at $4/month.