Every alerting setup that survives long enough ends up with a Slack channel where people say things like "ignore the alert, it always goes off when deploys happen" or "don't worry, the monitor flaps sometimes." That is the sound of trust decaying. Once operators learn to ignore alerts, real outages slip through, and the monitoring service becomes a liability.

We built CronAlert because most uptime tools produce too many false positives. The fixes -- consecutive-check verification, multi-region quorum, careful probe placement -- are not secrets, but they need to be on by default and tuned sensibly. This post walks through how our edge network actually reduces false positives, so you can decide whether our alerts would be trustworthy for your team.

Where false positives come from

Before we talk about fixes, it helps to look at the actual causes. A real outage looks like "every user on the internet cannot reach the site for several minutes." A false positive usually looks like one of these:

  • A single slow DNS lookup. The first hop of any check is DNS resolution. If the authoritative nameserver is briefly slow, the check times out even though the site would serve a request normally.
  • Transient packet loss on one path. BGP reconvergence, a congested peering link, or a backhoe in a different country can drop packets on one monitoring path while every other path works.
  • The monitoring provider's own region failing. If the service checks from one location and that location is unhealthy, every monitor appears to fail simultaneously.
  • Rate limits and WAF false triggers. Your WAF or a Cloudflare rule briefly classifies the monitoring IP as suspicious and returns a 403. Real users are unaffected.
  • Deploy-induced flapping. A rolling deploy briefly drops one instance. Real users see one slightly slow request; monitors see one failure.
  • TLS handshake timeouts. TLS handshakes occasionally exceed the default timeout on the first connection. The retry succeeds immediately, but a monitor that did not retry has already paged you.

None of these represent a user-visible outage. Every one of them will trigger a naive "check once, alert on failure" system. The job of a good monitoring service is to tell you about real problems while filtering out this noise.

Consecutive-check verification: the single biggest filter

The most effective filter is also the simplest: before alerting, check again. CronAlert runs consecutive-check verification on every monitor, every plan, by default. When a check fails, the worker re-runs it in quick succession. If the site recovers on the retry, the original failure is recorded in the log but no alert fires.

The thresholds depend on your check interval:

  • 1-minute checks (paid plans): 2 consecutive failures before alerting. Worst-case detection time is 2 minutes, but most false positives are filtered.
  • 3-minute checks (free plan): a single failed check (after its immediate in-line retry) triggers an alert. The 3-minute gap between checks is already long enough to confirm the state -- if the site is still down 3 minutes later, it is almost certainly a real issue.

This logic is why CronAlert does not flap on transient network blips. A single slow DNS lookup, a single TLS timeout, a single WAF false trigger -- all of these resolve on the retry and never reach your inbox. You still see them in the uptime log, so you can audit the noise floor, but they do not turn into alerts.

There is a subtlety here that matters: the retry uses the most recent previousCheckedAt timestamp to make sure it is checking the current state, not re-evaluating an older check. If your monitor just recovered on its own in the 30 seconds between the original check and the retry, the system correctly treats the outage as resolved rather than alerting on a stale failure.
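The decision logic above can be sketched in a few lines. This is a hypothetical illustration, not CronAlert's actual code -- the names `CheckResult` and `shouldAlert` are invented for the example:

```typescript
// Hypothetical sketch of consecutive-check verification.
interface CheckResult {
  ok: boolean;
  checkedAt: number; // epoch milliseconds
}

// Decide whether to alert, given the original failed check, its immediate
// retry, and the plan's consecutive-failure threshold
// (2 for 1-minute checks, 1 for 3-minute checks).
function shouldAlert(
  original: CheckResult,
  retry: CheckResult,
  consecutiveFailures: number,
  threshold: number
): boolean {
  // Stale-check guard: the retry must be newer than the original check,
  // otherwise we would be re-evaluating an older state.
  if (retry.checkedAt <= original.checkedAt) return false;
  // If the site recovered on the retry, log the blip but never alert.
  if (retry.ok) return false;
  // Sustained failure: alert only once the threshold is met.
  return consecutiveFailures >= threshold;
}
```

A single slow DNS lookup produces one failed check whose retry succeeds, so `shouldAlert` returns false and nothing pages.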

Multi-region quorum: eliminating path-level false positives

Consecutive-check verification catches transient failures, but it cannot distinguish between "the site is down" and "the path from one monitoring region to the site is down." For that you need multiple probes in different regions.

CronAlert runs probe workers in five geographic regions -- US East, US West, EU West, EU Central, and AP Southeast. For Team plan monitors and above, a single check fans out to all five probes in parallel. Each probe reports independently, and the orchestrator applies quorum logic before alerting:

  • Alert immediately: any region sees a failure. Use this when false positives are acceptable but every second of downtime matters.
  • Alert after N of M regions fail: require a quorum before alerting. This is the default for most users -- set it to 3 of 5 for a balanced setup, or 4 of 5 for the strictest possible filter.

With a 3-of-5 quorum, a single flaky region is never enough to trigger an alert. Peering issues, regional outages, and path-level noise all get filtered. If your site is actually down, all five regions will see it and you will be paged within one check interval.
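The quorum rule itself is a simple count. A minimal sketch, assuming illustrative names (`RegionResult`, `quorumReached`) rather than CronAlert's actual schema -- "alert immediately" is just a quorum of 1:

```typescript
// Hypothetical quorum evaluation over per-region probe results.
type RegionResult = { region: string; ok: boolean };

// Returns true when enough regions agree the site is down.
// required = 3 gives a 3-of-5 quorum; required = 1 is "alert immediately".
function quorumReached(results: RegionResult[], required: number): boolean {
  const failures = results.filter((r) => !r.ok).length;
  return failures >= required;
}
```

With `required = 3`, one flaky region out of five can never trigger an alert on its own; with `required = 1`, any single failure pages immediately.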

The incident log still records every per-region result, so you can investigate later: "EU Central saw elevated latency at 14:32 but the site was up from every other region." That kind of data is useful for debugging real path issues without the drama of a pager going off.

Read more about the setup in our multi-region monitoring guide.

Why run probes on Cloudflare's edge

Probe placement matters more than most people realize. If your probes run on AWS and your target runs on AWS, you are really testing the AWS internal network, not the user-facing internet. If probes run in the same datacenter as the target, you miss DNS and peering issues entirely.

CronAlert's probes run on Cloudflare Workers, which place code in 300+ data centers worldwide. We pin each probe worker to a specific regional zone using placement hints -- enam, wnam, weur, eeur, and apac. This gives us real geographic separation without having to operate physical infrastructure.

A few concrete benefits fall out of this architecture:

  • Real internet paths. Each probe hits your site over the public internet, through real DNS, real peering, real TLS handshakes. No VPC-to-VPC shortcuts.
  • No single point of failure. If one Cloudflare region has an issue, the other four probes still report correctly. The quorum logic handles the rest.
  • Probe isolation. Each probe is a separate Worker, stateless, and only reachable via Service Bindings from the orchestrator. A compromised probe cannot poison another region's result.
  • Low per-check latency. Workers cold-start in milliseconds and run close to the network edge, so even a 1-minute interval leaves plenty of budget for retries without stacking up overdue checks.
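The orchestrator's fan-out across probes can be sketched as below. In a real Worker each probe would be invoked through a Service Binding (something like `env.PROBE_ENAM.fetch(...)`); here probes are injected as plain async functions so the shape is self-contained -- the names are assumptions, not CronAlert's actual bindings:

```typescript
// Sketch of a parallel probe fan-out with per-region error isolation.
type Probe = () => Promise<boolean>; // resolves true if the check passed

async function fanOut(
  probes: Record<string, Probe>
): Promise<Record<string, boolean>> {
  const entries = await Promise.all(
    Object.entries(probes).map(async ([region, probe]) => {
      // A probe that throws (timeout, network error, its own region being
      // unhealthy) counts as a failure for that region only; the other
      // regions still report independently.
      const ok = await probe().catch(() => false);
      return [region, ok] as const;
    })
  );
  return Object.fromEntries(entries);
}
```

Because each probe's failure is caught independently, one unhealthy region degrades to a single `false` entry instead of taking down the whole check, which is what lets the quorum logic do its job.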

If you are running on Cloudflare yourself, note that we monitor from outside your own Cloudflare account. Our probes see your site the same way a real user would -- through the edge, DNS, and routing. A more detailed view lives in the Cloudflare Workers monitoring guide.

Separating warnings from alerts

Not every abnormal result should page you. CronAlert distinguishes a few severity levels and routes them differently so that alerts stay meaningful:

  • Failed check, recovered on retry. Logged, no alert. This is what consecutive-check verification filters out.
  • Failed check, below quorum. Logged per-region, no alert. This is what multi-region quorum filters out.
  • Elevated response time but still 200. Visible in the response-time chart, no alert unless you explicitly configure a latency threshold.
  • SSL certificate expires in under 14 days. Warning notification -- separate from a downtime alert. You want to know, but it is not a 3am page.
  • Sustained failure across regions. Full alert on every configured channel -- email, Slack, PagerDuty, webhook, push.
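The routing table above could be expressed as a single function. This is an illustrative sketch -- the severity names mirror the list, but the function and channel names are assumptions, not CronAlert's actual API:

```typescript
// Hypothetical severity-to-channel routing.
type Severity =
  | "recovered-on-retry"   // filtered by consecutive-check verification
  | "below-quorum"         // filtered by multi-region quorum
  | "elevated-latency"     // chart only, unless a latency threshold is set
  | "ssl-expiring"         // warning, not a page
  | "sustained-failure";   // the real thing

function routeNotification(severity: Severity): string[] {
  switch (severity) {
    case "recovered-on-retry":
    case "below-quorum":
      return []; // logged, no notification fires
    case "elevated-latency":
      return []; // visible in the response-time chart only
    case "ssl-expiring":
      return ["email"]; // low-priority warning channel
    case "sustained-failure":
      return ["email", "slack", "pagerduty", "webhook", "push"];
    default:
      return [];
  }
}
```

Only the last case reaches every configured channel; everything else stays out of your pager.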

If you have been bitten by alert fatigue before, this distinction is worth paying attention to. A monitoring tool that only has one severity level eventually becomes one that you mute.

What CronAlert cannot filter

Every monitoring tool has a noise floor, and honesty requires naming ours. Here are the things we cannot filter out, and how to handle them:

  • Your own WAF blocking monitoring IPs. If your WAF returns 403 to our probes, every check will fail. You need to allowlist our IP ranges or adjust your rules. We publish the ranges in the docs.
  • Very short outages under the check interval. If your site is down for 20 seconds and back up, a 1-minute monitor might never see it, and a 3-minute monitor almost certainly will not. Use heartbeat monitoring or shorter intervals if that matters.
  • Content regressions on 200 responses. A site that returns 200 OK with an error message in the body still looks "up" to an HTTP status check. Add keyword monitoring to catch these.
  • Sporadic slow checks from a single region. A probe on the other side of the world will occasionally see higher latency. We surface it in the response-time chart but do not alert unless you set a threshold.
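The content-regression case is worth making concrete. A minimal sketch of a keyword check -- the function name is hypothetical, not CronAlert's actual API:

```typescript
// A 200 status alone is not enough: the body must also contain the
// expected keyword, so a 200 response whose body is an error message
// still counts as down.
function keywordCheckPasses(
  status: number,
  body: string,
  mustContain: string
): boolean {
  return status === 200 && body.includes(mustContain);
}
```

A JSON endpoint that returns `{"error":"db unavailable"}` with status 200 looks healthy to a bare status check but fails this one.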

How to tune your setup

A sensible default that balances signal and noise:

  • 1-minute checks on paid plans; 3-minute checks on free.
  • Multi-region enabled for anything user-facing (Team plan and above).
  • 3-of-5 quorum for most monitors. Tighten to 4-of-5 for infrastructure that is tolerant of brief regional noise; loosen to "alert immediately" only for systems where false positives are cheaper than a one-minute delay.
  • SSL warnings at 14 days out, routed to a low-priority channel.
  • Keyword monitoring for critical JSON endpoints so 200-OK error responses do not slip through.
  • Maintenance windows scheduled during known deploys to prevent expected flapping.
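Put together, the recommended defaults above might look like this as a monitor configuration. The field names and the maintenance-window value are illustrative, not CronAlert's actual schema:

```typescript
// Hypothetical monitor config reflecting the recommended defaults.
const recommendedMonitor = {
  intervalMinutes: 1,              // 1-minute checks (paid plans)
  multiRegion: true,               // Team plan and above
  quorum: { required: 3, of: 5 },  // one flaky region can never page you
  sslWarningDays: 14,              // routed to a low-priority channel
  keyword: '"status":"ok"',        // catches 200-OK error bodies
  maintenanceWindows: ["tue 04:00-04:30 UTC"], // example deploy window
};
```

Loosening `quorum.required` to 1 gives "alert immediately"; tightening it to 4 gives the strictest filter described above.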

With this setup, if you get paged, something real is wrong. That is the whole point.

Frequently asked questions

What is a false positive alert in uptime monitoring?

A false positive is an alert that fires when your site is actually up. Common causes include transient network blips on the monitoring path, a single slow DNS resolution, a brief ISP route change, or the monitoring service itself having a regional hiccup. They train you to ignore alerts, which defeats the purpose of monitoring.

How does consecutive-check verification work?

Before alerting, CronAlert re-runs the failing check one or two times in quick succession. If the site recovers on the retry, the original failure is treated as a transient blip and no alert fires. Only sustained failures trigger notifications. For 1-minute checks the threshold is 2 consecutive failures; for 3-minute checks it is 1 failure, because 3 minutes is already a long enough window to confirm.

Why check from multiple regions?

A single monitoring location shares network infrastructure with your target. If the path between that location and your site has issues, you get an alert even though every other user can reach you. Multi-region monitoring checks from five geographically separate probes and requires a quorum before alerting -- if only one region sees a failure, it is almost certainly a path issue, not a site issue.

Does the free plan get false-positive protection?

Yes. Consecutive-check verification is on by default for every monitor on every plan, including the free plan. Multi-region quorum is available on the Team plan and above, since it requires probes in five regions. The free plan still benefits from Cloudflare's globally routed edge and the consecutive-check logic.

Start with monitors you can trust

Trustworthy alerts are not a luxury feature -- they are the product. If you are ignoring alerts from your current monitoring tool, that is not something you can fix with more alerts. It is something you fix by switching to a service that takes the noise problem seriously.

Create a free CronAlert account -- 25 monitors, consecutive-check verification, SSL monitoring, and free Slack/Discord/webhook alerts. Upgrade to Team for multi-region quorum across five regions. See the pricing page for full details.