Healthcare systems have a different relationship with downtime than most SaaS products. A 10-minute outage on a typical web app means refunds, complaints, and a flurry of support tickets. A 10-minute outage on a patient portal at 9am on a Monday means hundreds of pre-visit check-ins fail, prescription refills queue up, telehealth calls get dropped mid-visit, and the front desk reverts to paper while the phone lines pile up. The cost is operational, clinical, and contractual all at once.
On top of that, healthcare apps live under HIPAA — the Security Rule, the Privacy Rule, the Breach Notification Rule — and the rules cover not just user data but also the availability of the systems that handle it. Monitoring a healthcare app means making availability a first-class signal while staying inside the lines on data handling, audit logging, and vendor relationships.
This post walks through the practical version of that work: what to monitor in a healthcare app, how to design endpoints that monitoring tools can hit without ever touching PHI, when you need a BAA with your monitoring vendor, how to use uptime data in HIPAA risk assessments, and what the right alerting cadence looks like for clinical-grade systems.
Why healthcare uptime is different
The structural reasons healthcare uptime monitoring deviates from generic SaaS monitoring:
- Mixed-criticality endpoints. A marketing page going down is a marketing problem. A patient login going down is a clinical problem. The same monitoring tool covers both, but the alerts cannot be treated the same.
- Integration sprawl. Modern healthcare apps depend on EHR APIs (Epic, Cerner/Oracle, athenahealth, Allscripts), e-prescribing networks (Surescripts), insurance eligibility checks (Change Healthcare/Optum), labs (Quest, LabCorp), and identity providers. Any one of them being down breaks specific clinical workflows in ways the application can't internally detect.
- Regulatory exposure on the data path. Anywhere a monitor touches a request or response involving PHI, you have a vendor relationship that needs a Business Associate Agreement (BAA). Most monitoring tools don't sign BAAs without a specific enterprise tier; many will not sign them at all.
- Audit and risk-assessment requirements. The HIPAA Security Rule expects you to evaluate the availability of systems holding electronic PHI as part of ongoing risk analysis. The rule does not fix a cadence, though many organizations run the evaluation annually. Uptime data feeds directly into that evaluation, and the data needs to be retained long enough to be useful.
- Patient harm is in the failure mode. Most outages annoy customers. A telehealth platform dropping a video call during an emergency consult is a different kind of incident, and post-incident review reflects that.
None of this means uptime monitoring for healthcare is exotic. It just means the configuration choices that are nice-to-haves for ordinary SaaS are required here.
What to monitor in a healthcare app
A focused healthcare monitor list, in rough priority order:
1. The patient login endpoint
Login is the gateway to every patient-facing workflow. If patients can't log in, every downstream feature is functionally unavailable even if every other endpoint returns 200 OK. Monitor the login URL with a keyword monitor that confirms a known string ("Sign in," "Forgot password?") appears in the response — a 200 status alone won't catch a broken page that renders an error component instead of the form.
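A keyword check along those lines might look like the minimal sketch below. The function name and the sample keywords are illustrative assumptions, not part of any particular monitoring tool; the point is only that a check passes when the status is 200 and every expected string is present.

```python
from typing import Iterable


def login_page_looks_healthy(status_code: int, body: str, keywords: Iterable[str]) -> bool:
    """Return True only if the status is 200 and every keyword appears in the body."""
    if status_code != 200:
        return False
    lowered = body.lower()
    # Case-insensitive match so minor copy changes ("Sign In") do not fail the check.
    return all(keyword.lower() in lowered for keyword in keywords)
```

A page that returns 200 but renders an error component fails this check because the expected form text is missing.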
A 1-minute interval with multi-region quorum is right for patient login. False positives here are expensive (an engineer paged at 2am for a CDN blip wastes the on-call rotation), and missed real outages are expensive too. Checking from five regions and alerting only when at least two of them report a failure is the sweet spot.
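The quorum rule itself is simple enough to sketch. This is an assumption about how such a threshold could be evaluated, not a description of any vendor's implementation; the region names are made up.

```python
from typing import Mapping


def should_page(region_failed: Mapping[str, bool], threshold: int = 2) -> bool:
    """Page only when at least `threshold` regions report a failed check."""
    failures = sum(1 for failed in region_failed.values() if failed)
    return failures >= threshold


# Hypothetical result set: one region sees a blip, the other four are fine.
results = {"us-east": True, "us-west": False, "eu-west": False, "ap-south": False, "sa-east": False}
print(should_page(results))
```

With a single failing region the function returns False, so a localized network blip does not page anyone.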
2. The clinical-side login
If providers, schedulers, billers, and call center staff log in through a separate path, monitor that separately from the patient path. A patient-side outage is bad; a clinical-side outage means clinic operations stop. They often share infrastructure but fail independently — different SSO providers, different MFA requirements, different session policies.
3. The EHR integration health endpoint
Most healthcare apps connect to one or more EHRs through HL7 v2, FHIR, or proprietary APIs. Build an endpoint in your application — something like /api/integrations/ehr/health — that checks the integration's connectivity and authentication, then exposes a 200/503 to your monitor. Hit it on a 5-minute interval.
The health endpoint does the actual integration check and returns a status; the monitor just measures whether the endpoint is responding correctly. This pattern keeps the integration logic on your side, where you can iterate on it, and avoids exposing EHR endpoints directly to a monitoring vendor. The general pattern is documented in how to monitor a health endpoint.
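As a rough sketch of the logic behind such an endpoint, the function below maps two integration checks to a 200 or 503 and a small JSON body. The function name and the two callables are hypothetical placeholders for whatever connectivity and authentication checks your EHR integration actually supports.

```python
import json
from typing import Callable, Tuple


def ehr_health(check_connectivity: Callable[[], bool],
               check_auth: Callable[[], bool]) -> Tuple[int, str]:
    """Run the integration checks and return (http_status, json_body)."""
    checks = {"connectivity": check_connectivity, "authentication": check_auth}
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "fail"
        except Exception:
            # Deliberately drop the exception text: upstream errors can echo data.
            results[name] = "fail"
    healthy = all(value == "ok" for value in results.values())
    body = json.dumps({"status": "ok" if healthy else "degraded", "checks": results})
    return (200 if healthy else 503), body
```

The monitor only ever sees the status code and the pass/fail flags, never anything the EHR returned.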
4. Telehealth signaling and TURN servers
If you offer real-time video, the WebRTC signaling endpoint and TURN server health are far more critical than they look. A dropped TURN server means video calls behind corporate firewalls — including hospital networks — silently fail to connect. The application says "ready"; the patient sees a black screen.
Monitor the signaling websocket connect path and the TURN server allocation endpoint. For TURN specifically, a TCP probe to the TURN port is a more honest signal than an HTTPS health check: it catches firewall and routing issues that an HTTPS check on the same host wouldn't.
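A TCP probe of this kind can be sketched with the standard library. The host name below is a placeholder, and 3478 is the default TURN port, though your deployment may listen elsewhere. Note that this only confirms a TCP connection can be opened; it does not perform a TURN allocation or test UDP relay.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical TURN host for illustration only.
    print(tcp_probe("turn.example.com", 3478))
```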
5. e-Prescribing and refill flows
Surescripts and direct pharmacy integrations have their own outages and maintenance windows that aren't always communicated reliably. A failed prescription send often surfaces as a clinical workflow problem hours after it happened. To catch it sooner, monitor an application endpoint that exercises the e-prescribing flow by sending a synthetic test prescription to a sandbox pharmacy, on a 15-minute interval.
6. The status page
Healthcare incidents drive a lot of inbound calls. A working status page diverts those calls and informs hospital IT teams that the issue is yours, not theirs. Monitor your own status page as a separate site so you don't silently lose the channel that tells everyone what's going on.
7. Background jobs and overnight syncs
Eligibility refresh jobs, claims submission jobs, insurance verification batches — most run overnight on a cron schedule and fail silently. Use heartbeat monitoring with a generous grace window to catch jobs that stop running entirely. The signal you want is "the job didn't run last night," not "the job is two minutes late."
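The grace-window idea can be expressed in a few lines. This is a sketch under assumed names; a hosted heartbeat monitor would do the equivalent on its side after your job pings it.

```python
from datetime import datetime, timedelta, timezone


def heartbeat_missed(last_ping: datetime, expected_every: timedelta,
                     grace: timedelta, now: datetime) -> bool:
    """True once the job is overdue by more than the grace window."""
    return now > last_ping + expected_every + grace


# Hypothetical nightly job that last pinged at 02:00 UTC, with a 2-hour grace window.
last = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
nightly, grace = timedelta(hours=24), timedelta(hours=2)
print(heartbeat_missed(last, nightly, grace, datetime(2024, 1, 2, 2, 2, tzinfo=timezone.utc)))
print(heartbeat_missed(last, nightly, grace, datetime(2024, 1, 2, 5, 0, tzinfo=timezone.utc)))
```

The first call is two minutes past the expected run and stays quiet; the second is three hours past and reports a miss.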
How to monitor without touching PHI
The cleanest way to keep your monitoring vendor outside HIPAA scope is to design every monitored endpoint so it cannot return PHI. Three patterns work well:
The static health endpoint pattern
Expose a path like /health or /healthz that returns a small static JSON document — the application name, the build version, and a status field. No patient data, no user data, no debug logs. The monitor hits the path; the response is the same for every caller.
The complementary pattern is the deep health endpoint (/healthz/deep or /api/health/full), which actually exercises the database, cache, EHR integration, and other dependencies, then summarizes the result as an overall status plus a pass/fail flag per dependency. It still returns no PHI; it just tells you which dependency is broken when something is broken. See the complete guide to HTTP health check endpoints for the full pattern.
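For the static variant, a minimal standard-library sketch is shown below. The application name, version string, and port are placeholder assumptions; a real service would expose the same thing through whatever web framework it already uses.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical values, serialized once at startup so every caller gets identical bytes.
HEALTH_BODY = json.dumps(
    {"app": "patient-portal", "version": "1.4.2", "status": "ok"}
).encode("utf-8")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Cache-Control", "no-store")
            self.send_header("Content-Length", str(len(HEALTH_BODY)))
            self.end_headers()
            self.wfile.write(HEALTH_BODY)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; real services would log requests


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build a server bound to localhost; call serve_forever() to run it."""
    return ThreadingHTTPServer(("127.0.0.1", port), HealthHandler)
```

Because the body is built from constants, there is no code path by which user or patient data could reach the response.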
The synthetic-account pattern
For anything that requires authentication, create a fictitious test account whose data is fabricated, not real PHI. Log the monitor in as that user. The response body contains data, but it is synthetic data — no actual patient information ever flows through the monitor.
This is the right pattern for monitoring "list my appointments" or "show me my medications" endpoints, where you want to verify the data path works without sending real patient records to a third party. Mark the synthetic account internally so it doesn't pollute analytics, doesn't trigger marketing emails, and is excluded from HIPAA audit reports.
The header-aware sanitization pattern
For especially sensitive endpoints, your application can recognize the monitor's request — by user agent, custom header, or allowlisted IP — and return a sanitized response. The monitor exercises the full authentication, authorization, and routing stack, but the response body it sees contains no real data even when called with a real user's session.
This pattern lets you monitor production endpoints with real user contexts (you can verify a particular tenant's path is working) without ever sending PHI through the monitor. It does add complexity to the application, so reserve it for endpoints where the synthetic-account approach isn't enough.
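One way to sketch the sanitization step is shown below. The header name, the shared token, and the `load_appointments` callable are all hypothetical; the sketch assumes authentication and authorization have already run before this function is reached.

```python
import hmac
from typing import Any, Callable, Dict, List, Mapping

MONITOR_HEADER = "X-Monitor-Token"  # hypothetical header name


def is_monitor_request(headers: Mapping[str, str], expected_token: str) -> bool:
    """Constant-time comparison of the supplied token against the configured one."""
    supplied = headers.get(MONITOR_HEADER, "")
    # An empty configured token never matches, so a missing secret fails closed.
    return bool(expected_token) and hmac.compare_digest(
        supplied.encode("utf-8"), expected_token.encode("utf-8")
    )


def appointments_response(headers: Mapping[str, str], expected_token: str,
                          load_appointments: Callable[[], List[Any]]) -> Dict[str, Any]:
    """Run the real data path, but strip the data when the caller is the monitor."""
    appointments = load_appointments()  # database and integration path runs either way
    if is_monitor_request(headers, expected_token):
        return {"status": "ok", "sanitized": True}
    return {"status": "ok", "appointments": appointments}
```

Recognizing the monitor only ever reduces what is returned, so a spoofed header cannot expose more data than a normal request would. A plain dict lookup is case-sensitive; most web frameworks provide a case-insensitive header mapping.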
Business Associate Agreements: when you need one
The HIPAA threshold is simple: a monitoring vendor needs a BAA only if it creates, receives, maintains, or transmits Protected Health Information on your behalf. In practice:
- BAA not required: Static health endpoints. Endpoints with synthetic test data. Monitors that only check status codes and headers, not bodies.
- BAA required: Monitors authenticated as real patient or provider users that receive responses containing real PHI. Monitors that store request/response bodies that include patient data.
- Gray area: Monitors that capture response bodies for debugging — even if the endpoint usually returns no PHI, an error response (a stack trace, a database row dump) might. The safe path is either to suppress body capture for healthcare monitors or to design endpoints to never echo data in error responses.
You do not need a BAA with CronAlert when it monitors static health endpoints, synthetic test accounts, or sanitized responses, which covers the vast majority of healthcare uptime monitoring needs. If your monitoring strategy genuinely requires PHI to flow through a third party, you'll need to evaluate vendors that offer enterprise BAAs, and the list is shorter than you'd expect.
The architectural recommendation, every time: design the monitoring layer to never see PHI. It removes the BAA from the critical path, simplifies vendor selection, and makes uptime monitoring a tool you can iterate on without regulatory friction.
Alerting for healthcare on-call
Clinical-grade alerting differs from typical SaaS on-call in two ways: lower tolerance for noise and stricter routing rules.
- Severity tiers matter more. A patient login outage and a marketing page outage cannot share an alert channel. Route patient-facing and clinical-facing critical monitors to PagerDuty or Microsoft Teams with explicit on-call ownership; route lower-tier monitors to a non-paging channel.
- Multi-region quorum is mandatory. A single-region false positive that wakes up an on-call engineer once teaches them to ignore the next alert. Use a 2-of-5 region quorum on every paging monitor.
- Maintenance windows must be explicit. Healthcare apps have planned downtime — overnight database upgrades, weekly EHR sync windows. Configure maintenance windows for these so a planned outage doesn't fire alerts and dilute the signal.
- Alert payloads must not include PHI. The alert body that gets sent to Slack, email, or PagerDuty becomes a record outside your HIPAA-scoped systems. Never echo request URLs that contain patient identifiers, never include response bodies that might contain data. The monitor name, the endpoint path, and the status code are enough.
- Status page integration is critical. Patient-facing outages drive panic calls. A working status page that updates automatically on incident detection cuts inbound call volume materially. Many CronAlert healthcare customers wire incidents directly to a public status page with a delay of a few minutes — long enough to confirm the incident is real, short enough that the status page is useful.
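The tiering and maintenance-window rules above can be sketched as a small routing function. The tier names and the "page"/"chat" destinations are assumptions for illustration; windows are given as local start and end times and may cross midnight.

```python
from datetime import datetime, time
from typing import Optional, Sequence, Tuple

# Hypothetical tier labels attached to each monitor.
PAGING_TIERS = {"patient-facing", "clinical-facing"}


def in_window(now_t: time, start: time, end: time) -> bool:
    """True if now_t falls inside [start, end), including windows that cross midnight."""
    if start <= end:
        return start <= now_t < end
    return now_t >= start or now_t < end


def route_alert(tier: str, now: datetime,
                windows: Sequence[Tuple[time, time]]) -> Optional[str]:
    """Return 'page', 'chat', or None when a maintenance window suppresses the alert."""
    if any(in_window(now.time(), start, end) for start, end in windows):
        return None
    return "page" if tier in PAGING_TIERS else "chat"
```

For example, with a window from 23:00 to 01:00, a patient-facing failure at 23:30 is suppressed, while the same failure at 09:00 pages.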
Uptime data and HIPAA risk assessments
The HIPAA Security Rule (45 CFR 164.308(a)(1)(ii)(A)) requires covered entities and business associates to perform regular risk assessments. Uptime data feeds two parts of that assessment directly:
- Availability evaluation. The Security Rule lists "availability" as one of the three properties of electronic PHI to protect. A year of uptime history is the most concrete evidence you have that you actually delivered availability. Auditors increasingly ask for it.
- Incident timeline reconstruction. When a clinical-impact incident happens, the post-incident review reconstructs what was reachable, when, and from where. Independent third-party uptime data settles disputes that internal logs alone cannot — for example, "the EHR integration was reachable from our infrastructure but unreachable from end-user networks for 22 minutes."
Practical retention guidance: hold at least 90 days of detailed check results for active incident investigations and at least 1 year of summarized uptime data for annual risk assessments. CronAlert's Team plan retains 90 days of detailed results; Business retains 1 year. Many healthcare customers run on Business specifically for the retention.
For the specific mechanics of turning uptime data into compliance reports, see how to use uptime data for SLA reporting and how to read and use uptime reports.
Common healthcare monitoring mistakes
Monitoring only the homepage
The homepage stays up while the actual patient login is broken because of a misconfigured authentication route. A monitor on / alone tells you almost nothing about patient experience. Monitor the login page, the post-login dashboard, and the most-used clinical workflows specifically.
Skipping integrations because "they're not our problem"
EHR and pharmacy outages are still your customer's experience. If Epic's Hyperdrive is down for an hour, your application appears broken to users even though your code is fine. Monitor the integration so you can post a status update telling users the problem is upstream — and so you can communicate clearly with EHR support when they ask for evidence.
Putting PHI in alert payloads
A monitor that pulls an authenticated endpoint and includes the response body in the Slack alert when a check fails has just leaked PHI to Slack, to whoever is in the channel, and to Slack's logs. Configure alert payloads to include monitor name, URL path, and status code only — never response bodies, never request URLs with patient identifiers in the path or query string.
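A payload builder that enforces this might look like the sketch below. It drops the query string entirely and replaces path segments that look like numeric IDs or UUIDs; the function names and example URL are hypothetical. Identifiers in other formats would slip past this pattern, so sending the monitor's configured path template instead of the live URL is the safer option where available.

```python
import re
from urllib.parse import urlsplit

# Matches purely numeric segments and canonical UUIDs.
_ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def scrub_path(url: str) -> str:
    """Keep only the path, with ID-like segments replaced by ':id'."""
    path = urlsplit(url).path  # query string and fragment are discarded here
    segments = [":id" if _ID_SEGMENT.match(seg) else seg for seg in path.split("/")]
    return "/".join(segments) or "/"


def build_alert(monitor_name: str, url: str, status_code: int) -> dict:
    """Alert payload limited to monitor name, scrubbed path, and status code."""
    return {"monitor": monitor_name, "path": scrub_path(url), "status": status_code}
```

The builder has no parameter for a response body, so there is no way to attach one by accident.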
Treating downtime as purely technical
A 30-minute outage of a telehealth platform during business hours has clinical implications — interrupted visits, missed prescriptions, delayed care. The post-incident review needs a clinical perspective, not just an engineering one. Build the practice early; it gets harder to add later.
Monitoring without a BAA, then adding PHI
The most common compliance mistake: a team starts with a no-PHI monitoring strategy, then someone adds an authenticated production-user check three months later because debugging requires it. The monitoring vendor is now receiving PHI without a BAA. Either move the check off the third-party monitor, switch to a synthetic account, or add a BAA. Don't let it sit.
Frequently asked questions
Do I need a BAA with my uptime monitoring vendor?
Only if the monitor creates, receives, maintains, or transmits PHI. Static health endpoints and synthetic test accounts don't require a BAA. Authenticated checks that return real patient data do. Designing endpoints to never return PHI is the simplest way to keep monitoring outside HIPAA scope.
Is downtime a HIPAA breach?
Plain unavailability isn't a breach by itself; a breach is about impermissible acquisition, access, use, or disclosure of PHI. But the Security Rule requires you to protect availability, so sustained outages should be documented and fed into your annual risk assessment. SLA penalties with covered-entity customers are a separate issue.
What's the right uptime SLA for a patient portal?
99.9% (about 43 minutes of allowed downtime per month) for patient-facing features. 99.95% for EHR integrations and clinical decision support. 99.99% during business hours for telehealth-active windows. Set monitor thresholds tighter than your contractual SLA so you can investigate degradation before it counts.
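The downtime budget behind those percentages is straightforward arithmetic. The sketch below assumes a 30-day month (43,200 minutes); the function name is illustrative.

```python
def downtime_budget_minutes(sla_percent: float, window_minutes: float) -> float:
    """Minutes of downtime allowed in the window at the given SLA percentage."""
    return window_minutes * (1 - sla_percent / 100)


MONTH_MINUTES = 30 * 24 * 60  # assumes a 30-day month

for sla in (99.9, 99.95):
    print(sla, round(downtime_budget_minutes(sla, MONTH_MINUTES), 1))
```

At 99.9% the budget is about 43.2 minutes per month, and at 99.95% it is about 21.6 minutes. For the business-hours telehealth target, pass the number of minutes in your own business-hours window instead of the full month.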
Can I monitor authenticated endpoints without storing PHI?
Yes. Use a synthetic test account whose data is fictitious, or have the application return sanitized responses when the monitor's request is recognized. Both avoid exposing real PHI to the monitoring vendor.
Do uptime monitoring logs count as audit logs under HIPAA?
They aren't formal audit logs (which track PHI access by users) but they are valuable supplementary evidence. Auditors increasingly want third-party uptime data alongside internal logs. Retain at least 90 days for incident investigations and 1+ year for risk assessments.
Set up healthcare-grade monitoring in CronAlert
For a healthcare team, the high-leverage starting point is: patient login + clinical login + telehealth signaling on a 1-minute interval, plus the EHR integration health endpoint on a 5-minute interval, all with multi-region quorum, alerting to PagerDuty or Microsoft Teams with PHI scrubbed from payloads. Add maintenance windows for known overnight processes, attach an automatic status page for patient-facing outages, and retain check results for 90 days on Team or 1 year on Business for compliance review.
Create a CronAlert account and start with a single static health endpoint to verify the pipeline before expanding. For related playbooks, see SaaS uptime monitoring, incident response for small teams, and SLA reporting.