Increased query error rate in 3.0 Serverless US-EAST-1 region

Incident Report for InfluxDB Cloud

Resolved

This incident has been resolved.
Posted Jan 31, 2024 - 15:37 UTC

Monitoring

It has been over an hour since we last saw an availability alarm, and the querier metrics have returned to normal. We will continue to monitor.
Posted Jan 31, 2024 - 13:43 UTC

Update

Within the last hour the CPU usage has dropped again, and the timeout rate appears to have returned to normal. We continue to search for the root cause.
Posted Jan 31, 2024 - 13:18 UTC

Identified

We have increased the number of queriers in the system. This has alleviated the pressure slightly, but we are still processing queries more slowly than before the incident started.
Posted Jan 31, 2024 - 12:06 UTC

Update

We are still investigating the issue. The high CPU usage and error rate persist.
Posted Jan 31, 2024 - 10:17 UTC

Investigating

We have observed a reduction in query performance on InfluxDB Serverless prod101-us-east-1, causing an increased query failure rate. We are currently investigating the issue.
Posted Jan 31, 2024 - 09:17 UTC

This incident affected: Cloud Serverless: AWS, US-East-1 (API Queries).