gRPC has quietly become the default protocol for internal service-to-service traffic, and increasingly for public APIs too. It is faster than JSON-over-HTTP, has strong typing through Protobuf, and supports streaming out of the box. But it also breaks the assumptions baked into most uptime monitoring tools, which were designed for HTTP/1.1 request-response cycles and never quite caught up to HTTP/2 framing or gRPC's trailer-based status codes.

The result: teams point a standard HTTP monitor at their gRPC endpoint, see a green dashboard, and discover during an incident that every actual call has been returning UNAVAILABLE for an hour. The monitor was checking the wrong layer. This post walks through what to actually monitor in a gRPC setup, how to wire gRPC health checks into an HTTP-only uptime tool, and the failure modes that are unique to gRPC.

Why HTTP monitoring isn't enough for gRPC

A gRPC call is, at the transport level, an HTTP/2 request. The framing is HTTP/2, the path is the fully-qualified method name (/myco.billing.v1.Billing/CreateInvoice), and the body is a length-prefixed Protobuf message. The server returns HTTP/2 200 for the framing, then writes the actual gRPC status — OK, UNAVAILABLE, DEADLINE_EXCEEDED, PERMISSION_DENIED, and so on — in the HTTP/2 trailers at the end of the response.

This is a problem for HTTP-only monitors. The monitor sees the HTTP/2 200, declares the endpoint healthy, and moves on. It never reads the trailers, never parses Protobuf, and has no idea that grpc-status: 14 (UNAVAILABLE) was sent. Three concrete failure modes that HTTP monitoring misses on a gRPC endpoint:

  • Service-not-implemented. The server is up but the service registration is broken (build artifact mismatch, missing import). Every call returns UNIMPLEMENTED with HTTP 200.
  • Database-dependency outage. The server's database connection pool is exhausted. Every call returns UNAVAILABLE after a short delay, with HTTP 200.
  • Auth misconfiguration. A deploy introduces a bad interceptor that fails authentication. Every call returns UNAUTHENTICATED with HTTP 200.

From the HTTP monitor's perspective, all three look identical to "everything is fine." This is closely related to GraphQL's monitoring problem — where every error is also wrapped in a 200 — but gRPC has its own framing layer that makes naive monitoring even less informative.

The gRPC Health Checking Protocol

The gRPC project ships a standard health-checking service that every gRPC server should expose. The schema is small:

syntax = "proto3";

package grpc.health.v1;

message HealthCheckRequest {
  string service = 1;
}

message HealthCheckResponse {
  enum ServingStatus {
    UNKNOWN = 0;
    SERVING = 1;
    NOT_SERVING = 2;
    SERVICE_UNKNOWN = 3;
  }
  ServingStatus status = 1;
}

service Health {
  rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
  rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}

Most gRPC server libraries ship with a built-in implementation:

  • Go: google.golang.org/grpc/health and healthpb.RegisterHealthServer.
  • Node: grpc-health-check on npm.
  • Java: io.grpc.protobuf.services.HealthStatusManager.
  • Python: grpcio-health-checking.
  • C#: Grpc.HealthCheck on NuGet; Rust: tonic-health for tonic-based servers.

Register the health service alongside your application services on the same gRPC server. Set the overall status (empty service name) to SERVING on startup, and per-service status to SERVING for each service you actually want to track separately. When a critical dependency fails — database connection lost, downstream provider degraded — flip the relevant status to NOT_SERVING. External health checks read this and route traffic accordingly.
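
In Go, that registration is only a few lines. A minimal sketch, assuming the common healthpb alias for google.golang.org/grpc/health/grpc_health_v1 and the Billing service name used throughout this post:

import (
  "google.golang.org/grpc"
  "google.golang.org/grpc/health"
  healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

func newGRPCServer() (*grpc.Server, *health.Server) {
  grpcServer := grpc.NewServer()
  healthSrv := health.NewServer()

  // Register the standard health service next to the application services.
  healthpb.RegisterHealthServer(grpcServer, healthSrv)

  // Empty service name = overall server status.
  healthSrv.SetServingStatus("", healthpb.HealthCheckResponse_SERVING)
  // Per-service status for anything you want to track separately.
  healthSrv.SetServingStatus("myco.billing.v1.Billing", healthpb.HealthCheckResponse_SERVING)

  return grpcServer, healthSrv
}

Flipping a status later is the same SetServingStatus call with NOT_SERVING; the per-service section below sketches one way to drive it from a dependency check.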

What to monitor at the gRPC layer

1. Overall server health

Call Health.Check with an empty service name. SERVING means "the server thinks it can serve any of its services right now." This is the baseline alert; if this goes red, traffic should be drained from the instance. Run this check at the same cadence as your HTTP uptime checks — 1-minute intervals from multiple regions are standard.

2. Per-service health

Call Health.Check with the fully-qualified service name for each service whose fate is independent of the rest of the server. Common case: a server hosts Catalog (read-heavy, backed by a Postgres replica) and Billing (write-heavy, backed by the primary). When the primary fails, you want Billing to go NOT_SERVING while Catalog stays SERVING. Per-service health surfaces this distinction.

The triage payoff is real. An alert that says "Billing is NOT_SERVING, Catalog is SERVING" tells the on-call exactly where to look — it's a database-scoped issue, not a server-scoped one. An overall NOT_SERVING alert sends them on a wider hunt.
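
One way to drive that per-service flip, a sketch that assumes the healthSrv from the registration sketch above and a *sql.DB handle for the billing primary (names and intervals are illustrative):

import (
  "context"
  "database/sql"
  "time"

  "google.golang.org/grpc/health"
  healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

// watchBillingDB pings the billing primary and flips only the Billing
// service's health status; Catalog's status is left untouched.
func watchBillingDB(ctx context.Context, healthSrv *health.Server, db *sql.DB) {
  ticker := time.NewTicker(10 * time.Second)
  defer ticker.Stop()
  for {
    select {
    case <-ctx.Done():
      return
    case <-ticker.C:
      pingCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
      status := healthpb.HealthCheckResponse_SERVING
      if err := db.PingContext(pingCtx); err != nil {
        status = healthpb.HealthCheckResponse_NOT_SERVING
      }
      cancel()
      healthSrv.SetServingStatus("myco.billing.v1.Billing", status)
    }
  }
}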

3. TLS and ALPN negotiation

gRPC over TLS depends on ALPN negotiation selecting h2. A misconfigured ingress, a load balancer that strips ALPN, or a stale TLS certificate will all break gRPC even though a plain HTTPS check from a browser succeeds (browsers fall back to HTTP/1.1). Monitor the certificate explicitly — CronAlert's SSL certificate monitoring catches expiry — and ideally include a TLS check that asserts h2 was negotiated.
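
A quick way to assert that negotiation, a minimal Go sketch against the example endpoint (substitute your own host):

import (
  "crypto/tls"
  "fmt"
)

// checkALPN dials the endpoint over TLS, offers only h2, and reports what the
// server actually negotiated. gRPC needs h2; an empty or "http/1.1" result
// means something in front of the server is breaking gRPC even though plain
// HTTPS checks pass.
func checkALPN(addr string) error {
  conn, err := tls.Dial("tcp", addr, &tls.Config{NextProtos: []string{"h2"}})
  if err != nil {
    return err
  }
  defer conn.Close()
  if proto := conn.ConnectionState().NegotiatedProtocol; proto != "h2" {
    return fmt.Errorf("expected h2, negotiated %q", proto)
  }
  return nil
}

Run checkALPN("api.example.com:443") on a schedule, or fold the check into the /healthz wrapper described below.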

4. Real-call probes for critical RPCs

Health.Check tells you the server thinks it's healthy. It doesn't tell you that CreateInvoice actually works. For the small number of RPCs that are business-critical, run a synthetic probe that issues the real RPC with a known-safe payload (test customer, $0.01 invoice, idempotent flag set) and asserts the response. Keep the probe payload boring and deterministic — synthetic probes that occasionally fail because of test-data drift create more noise than signal.

This is the gRPC analogue of monitoring a critical REST endpoint with body validation. The API endpoint monitoring playbook applies; only the protocol changes.
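
A probe along those lines might look like the following sketch. Everything named billingpb is hypothetical generated client code, a stand-in for whatever your Protobuf schema produces; imports mirror the earlier sketches (context, fmt, time) plus that generated package. Only the shape of the check is the point:

// probeCreateInvoice issues a real CreateInvoice call with a boring,
// deterministic payload and asserts the response. The field names and test
// customer ID are illustrative, not part of any standard.
func probeCreateInvoice(ctx context.Context, client billingpb.BillingClient) error {
  ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
  defer cancel()
  resp, err := client.CreateInvoice(ctx, &billingpb.CreateInvoiceRequest{
    CustomerId:     "synthetic-probe-customer",
    AmountCents:    1,
    IdempotencyKey: "uptime-probe", // same key every run: no test-data drift
  })
  if err != nil {
    return err // carries the gRPC status: UNAVAILABLE, UNAUTHENTICATED, etc.
  }
  if resp.GetInvoiceId() == "" {
    return fmt.Errorf("CreateInvoice returned an empty invoice ID")
  }
  return nil
}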

5. Latency percentiles

gRPC failure modes often manifest as latency before they manifest as errors. P50 stays flat while P99 climbs from 50ms to 5 seconds; eventually a downstream timeout kicks in and the calls start failing outright. Monitoring P50/P95/P99 on the Health.Check call gives you early warning. Many uptime tools — CronAlert included — record response times on every check; alert on sustained P99 degradation, not just on outright failures.

Wiring gRPC health checks into HTTP uptime monitoring

Most uptime monitoring tools (CronAlert included) speak HTTP, not gRPC. There are three workable patterns to bridge the two:

Pattern 1: HTTP wrapper on the server

Expose an HTTP endpoint on the same process (or sidecar) that internally calls the gRPC Health.Check method and returns an HTTP status. Minimal example in Go:

http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
  ctx, cancel := context.WithTimeout(r.Context(), 2*time.Second)
  defer cancel()
  resp, err := healthClient.Check(ctx, &healthpb.HealthCheckRequest{Service: ""})
  if err != nil || resp.Status != healthpb.HealthCheckResponse_SERVING {
    http.Error(w, "NOT_SERVING", http.StatusServiceUnavailable)
    return
  }
  w.Write([]byte("SERVING"))
})

Then point a CronAlert HTTP monitor at /healthz. Use keyword monitoring to require the body to contain SERVING, so a misconfigured wrapper that returns 200 with the wrong body doesn't pass silently. This is the simplest pattern and works well for the overall server status; the trade-off is that the HTTP wrapper and the gRPC server share a process, so a process-wide hang takes both down (which is fine — that's the failure you want to detect anyway).

The pattern fits the broader HTTP health check endpoint guidance: shallow check on a fixed path, fast response, no side effects, no auth.

Pattern 2: Out-of-process gRPC probe

A more thorough approach: a tiny worker (Cloudflare Worker, Lambda, sidecar, or a cron-triggered container) that runs an actual gRPC client against the server and exposes the result over HTTP. The reference tool is grpc_health_probe, a small CLI from the gRPC project that does exactly this:

grpc_health_probe -addr=api.example.com:443 -tls -tls-server-name=api.example.com

Wrap this in a thin HTTP service that runs the probe on every request and returns 200/503 based on the exit code, host it on a different cloud account from the gRPC server (so they don't share fate), and point CronAlert at the wrapper. This catches transport-level failures the wrapper-on-the-server approach can't, because the probe genuinely speaks HTTP/2 + gRPC end-to-end.
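
One way that thin wrapper might look, a sketch that shells out to grpc_health_probe on each request (the binary is assumed to be on the PATH; port and target address are assumptions to adapt):

package main

import (
  "context"
  "net/http"
  "os/exec"
  "time"
)

func main() {
  http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
    ctx, cancel := context.WithTimeout(r.Context(), 5*time.Second)
    defer cancel()
    // Exit code 0 means the probe got SERVING back; anything else is a failure.
    cmd := exec.CommandContext(ctx, "grpc_health_probe",
      "-addr=api.example.com:443", "-tls", "-tls-server-name=api.example.com")
    if err := cmd.Run(); err != nil {
      http.Error(w, "NOT_SERVING", http.StatusServiceUnavailable)
      return
    }
    w.Write([]byte("SERVING"))
  })
  http.ListenAndServe(":8080", nil)
}

Point a CronAlert keyword monitor at this wrapper exactly as in Pattern 1.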

The trade-off is operational: one more thing to host, one more thing that can break. For most teams, the in-process wrapper is enough; for teams running gRPC as a public API with external SLAs, the out-of-process probe is worth the extra infrastructure.

Pattern 3: Service-mesh-native health

If you're already running a service mesh (Envoy, Linkerd, Istio, Consul), use the mesh's gRPC health check support directly. Mesh health checks are richer than HTTP probes — they understand gRPC framing, retry semantics, and circuit breaking. The mesh then exposes its own health surface for external monitoring. Don't replace the mesh's internal checks with CronAlert's; instead, use CronAlert to monitor the mesh's externally-facing endpoints and to provide an independent vote that's not subject to the mesh's own failure modes.

gRPC-specific failure modes worth watching

  • Deadline exceeded. Caller's deadline is too tight relative to server P99. Manifests as DEADLINE_EXCEEDED on the caller, no obvious problem on the server. Surface caller-side latency in your alerts.
  • Resource exhaustion. Connection pool, goroutine, or thread pool exhaustion. The server returns RESOURCE_EXHAUSTED; this is a load-shedding signal, so alert on a sustained non-trivial rate rather than on single occurrences.
  • Stream connection draining. Long-lived streaming RPCs hold connections open. During deploys with poor draining, in-flight streams die abruptly and clients see UNAVAILABLE. Monitor the per-deploy error spike.
  • Reflection-disabled probe failures. Many production deployments disable gRPC reflection for security. grpc_health_probe doesn't need reflection (it knows the Health schema statically), but other generic probes do. Don't confuse "reflection disabled" with "server unhealthy."
  • HTTP/2 PING failures. Some load balancers and proxies drop idle HTTP/2 connections without sending GOAWAY. Clients see "transient failure" errors that don't show up server-side. PING-based keepalive on the client and connection-recycling on the load balancer mitigate this; a client-side keepalive sketch follows this list.
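
A sketch of the client-side mitigation in Go; the intervals are assumptions to tune against your load balancer's idle timeout:

import (
  "time"

  "google.golang.org/grpc"
  "google.golang.org/grpc/keepalive"
)

// Dial options that send HTTP/2 PINGs on idle connections so the client
// notices a silently dropped connection instead of failing on the next RPC.
var keepaliveOpts = []grpc.DialOption{
  grpc.WithKeepaliveParams(keepalive.ClientParameters{
    Time:                30 * time.Second, // ping after 30s of inactivity
    Timeout:             10 * time.Second, // wait 10s for the ping ack
    PermitWithoutStream: true,             // ping even with no active RPCs
  }),
}

Make sure the server's keepalive enforcement policy permits pings at this cadence; a server that considers them too frequent will close the connection instead.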

Alerting strategy for gRPC monitoring

The general guidance from avoiding alert fatigue applies — but gRPC has a few specific routing rules worth setting up:

  • Overall NOT_SERVING from multiple regions pages immediately. This is the gRPC equivalent of "site is down."
  • Per-service NOT_SERVING routes to the team that owns the affected service rather than to a global on-call. Surface the service name in the alert payload so the routing is automatic.
  • TLS / ALPN negotiation failures page immediately — they take down the entire surface, not just one service.
  • Sustained P99 latency degradation on Health.Check opens a non-paging alert on a chat channel. This often precedes outright failures by ten or twenty minutes and lets the team get ahead of the incident.
  • Real-RPC probe failures page if the probe is on a critical revenue path, otherwise go to a chat channel. Critical-path probes (checkout, auth, payment) deserve their own escalation; everything else can wait for business hours.

Frequently asked questions

Why can't I just point an HTTP uptime monitor at my gRPC endpoint?

You can, but it only verifies the transport: TCP, TLS, and at best the HTTP/2 framing. gRPC's actual status lives in the HTTP/2 trailers; a framing-level 200 says nothing about whether RPCs are succeeding. Use the gRPC Health Checking Protocol via a wrapper or out-of-process probe instead.

What is the gRPC Health Checking Protocol?

A standard service at grpc.health.v1.Health with Check (unary) and Watch (server streaming) RPCs that return SERVING, NOT_SERVING, UNKNOWN, or SERVICE_UNKNOWN. Every major language has a library implementation.

How do I monitor a gRPC server externally if my uptime tool only speaks HTTP?

Expose a small /healthz HTTP endpoint that internally calls Health.Check and returns 200/503, or run grpc_health_probe in a small worker and point an HTTP monitor at the worker. Both are simple to set up.

Should I monitor every gRPC service individually or just the server?

Both. Overall server status as the baseline alert; per-service status for services with independent fate (different databases, different downstream dependencies). The granularity helps triage.

Can I monitor streaming gRPC calls?

Yes, but it's rarely worth it for uptime. Streaming RPCs are application-specific — "healthy" depends on the product. Use unary Health.Check for uptime; reserve streaming checks for synthetic monitoring of features where the streaming behavior is itself the product.

Add gRPC monitoring to your uptime stack

gRPC monitoring isn't a separate product — it's an HTTP monitor pointed at the right surface. Expose a thin /healthz wrapper around the gRPC Health Checking Protocol, point a CronAlert keyword monitor at it asserting SERVING, and you're done. Layer in per-service checks, TLS validation, and real-RPC probes as the setup matures. Create a free CronAlert account and add a gRPC health check alongside your HTTP monitors in the same dashboard.

Related reading: GraphQL endpoint monitoring, API endpoint monitoring, HTTP health check endpoints, microservices uptime monitoring, and SSL certificate monitoring.