Sometimes the Netreo System Timers, specifically Latency, can get really high, which then cascades into general Incident Management issues such as false positives.
I use a MySQL query to find the culprits and then remove those service checks to reduce overall oam latency. More often than not, the culprit is a service check on a device whose host is down.
AutoPilot should remove these service checks when their latency becomes unreasonably high and the host is down, with a safety exception to never remove PING or TCP checks.
After a device comes back up, Netreo should automatically re-poll the device, and the service check would get re-applied. One more safeguard would be to only auto-remediate if the check is controlled via a template, and to just create a findings card if the check was manually applied on the device itself.
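
To make the proposed rule concrete, here is a rough sketch of the decision logic in Python. It is only an illustration of the request, not how AutoPilot works today: the `ServiceCheck` fields, the `Action` names, and the 60-second threshold are all hypothetical placeholders, and the real values would have to come from Netreo.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"                    # leave the check alone
    REMOVE_CHECK = "remove_check"    # auto-remediate: remove the service check
    FINDINGS_CARD = "findings_card"  # only report it, no automatic change


# Safety exception from the request: never remove these check types.
PROTECTED_CHECK_TYPES = {"PING", "TCP"}


@dataclass
class ServiceCheck:
    # Hypothetical fields, not an actual Netreo data model.
    check_type: str
    latency_seconds: float
    host_is_down: bool
    from_template: bool


def decide_action(check: ServiceCheck, latency_threshold_seconds: float = 60.0) -> Action:
    """Return what AutoPilot would do with one service check.

    The threshold default is an assumed placeholder for "unreasonably high".
    """
    if not check.host_is_down:
        return Action.NONE
    if check.latency_seconds <= latency_threshold_seconds:
        return Action.NONE
    if check.check_type.upper() in PROTECTED_CHECK_TYPES:
        return Action.NONE
    if check.from_template:
        # The template re-applies the check once the device is re-polled.
        return Action.REMOVE_CHECK
    # Manually applied checks would not come back on their own, so only flag them.
    return Action.FINDINGS_CARD
```

For example, `decide_action(ServiceCheck("HTTP", 300.0, True, True))` returns `Action.REMOVE_CHECK`, while the same check applied manually (`from_template=False`) returns `Action.FINDINGS_CARD`. The reason for the split is that a template-controlled check gets re-applied after the re-poll, whereas a manually applied one would be lost for good.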