Intermittent Authentication Issues for AWS EU-CENTRAL (read/write token authentication is not affected)

Incident Report for InfluxDB Cloud

Postmortem

Summary

On January 15, 2026, user login for Cloud 2 failed intermittently in the eu-central AWS cluster. Some users could not log in via the web UI. Others were able to log in but then could not see their resources (buckets, dashboards, etc.) in the web UI. The issue was intermittent, did not affect all users, and in some cases was recoverable by retrying. It was also isolated to the web UI and did not impact API-based writes and queries.

Cause of the Incident

The incident was caused by the clocks on some of the nodes being slightly behind the master clock. When a user logs in, a JWT session token is signed with an nbf (not-before) timestamp based on the signing server's clock. When the request hits a gateway pod, the token is checked, and one of the checks confirms that the token time is valid. If the gateway pod runs on a node whose clock is behind the signing node's clock, the JWT is rejected because, from that node's perspective, the token's nbf time is in the future.
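
The rejection described above can be illustrated with a minimal sketch. This is not the gateway's actual code: the Clock class, the nbf_is_valid function, and the 5-second lag are assumptions chosen for illustration (the report only says the clocks were "slightly behind").

```python
from dataclasses import dataclass


@dataclass
class Clock:
    """A node's view of time: true time plus a fixed offset in seconds."""
    offset_seconds: float

    def now(self, true_time: float) -> float:
        return true_time + self.offset_seconds


def nbf_is_valid(nbf: float, validator_now: float) -> bool:
    """Return True if the token is already usable from the validator's view."""
    return validator_now >= nbf


TRUE_TIME = 1_768_472_000.0  # arbitrary epoch seconds, illustrative only

signer = Clock(offset_seconds=0.0)         # signing node, assumed accurate
good_gateway = Clock(offset_seconds=0.0)   # gateway node in sync
slow_gateway = Clock(offset_seconds=-5.0)  # gateway node 5 s behind (assumed)

# The signer stamps nbf from its own clock at login time.
nbf = signer.now(TRUE_TIME)

# Both gateways check the same token at the same true instant.
accepted_on_good = nbf_is_valid(nbf, good_gateway.now(TRUE_TIME))
accepted_on_slow = nbf_is_valid(nbf, slow_gateway.now(TRUE_TIME))

print(accepted_on_good, accepted_on_slow)  # True False
```

In this sketch the lagging gateway would only start accepting the token once its own clock reaches the nbf value, which is 5 seconds of real time after the token was issued.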

The problem was intermittent because a load balancer distributed the authentication requests across pods, and not all of the pods had clock skew. The same token worked on some of the nodes (where the time was correct) and failed on a few nodes (where the time was incorrect).
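
The intermittent pattern can be reproduced with a small simulation. The pod names, the offsets, and the round-robin policy are all hypothetical; the report does not state which balancing strategy the load balancer used or how many pods were affected.

```python
from itertools import cycle

# Hypothetical clock offsets in seconds for four gateway pods; one lags.
POD_OFFSETS = {"gw-a": 0.0, "gw-b": 0.0, "gw-c": -5.0, "gw-d": 0.0}


def pod_accepts(nbf: float, true_time: float, offset: float) -> bool:
    """A pod accepts the token if its own clock has reached nbf."""
    return true_time + offset >= nbf


def send_requests(nbf: float, true_time: float, count: int):
    """Send `count` requests with the same token, round-robin across pods."""
    pods = cycle(POD_OFFSETS.items())
    results = []
    for _ in range(count):
        name, offset = next(pods)
        results.append((name, pod_accepts(nbf, true_time, offset)))
    return results


ISSUED_AT = 1_000.0  # illustrative timestamp, not real data

# The same token is presented eight times, one second after it was issued.
results = send_requests(nbf=ISSUED_AT, true_time=ISSUED_AT + 1.0, count=8)
for name, ok in results:
    print(name, "accepted" if ok else "rejected")
```

Under these assumptions every request routed to gw-c is rejected while the others succeed, so a retry that lands on a different pod goes through. That matches the observation that the issue was sometimes recoverable by retrying.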

Recovery

We identified the nodes whose clocks had drifted and drained them. Once the service was restarted on new nodes with the correct time, the problem was resolved.
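
The report does not describe how the drifted nodes were identified. As one possible approach, sketched under assumptions, a per-node clock offset measured against a trusted reference could be compared to a tolerance. The node names, offsets, and the 1-second threshold below are hypothetical.

```python
def find_drifted_nodes(offsets: dict[str, float], max_skew_seconds: float) -> list[str]:
    """Return the names of nodes whose absolute clock offset exceeds the tolerance."""
    return sorted(
        name for name, offset in offsets.items()
        if abs(offset) > max_skew_seconds
    )


# Hypothetical offsets in seconds relative to a reference clock
# (negative means the node's clock is behind).
measured_offsets = {
    "node-1": 0.002,
    "node-2": -4.8,
    "node-3": 0.010,
    "node-4": -6.1,
}

drain_candidates = find_drifted_nodes(measured_offsets, max_skew_seconds=1.0)
print(drain_candidates)  # ['node-2', 'node-4']
```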

Timeline

  • 2026-01-15 10:12 UTC - Customers reported errors when navigating to dashboards and tasks in the UI in the eu-central AWS cluster, and engineering began to investigate the issue.
  • 2026-01-15 14:34 UTC - Additional customers reported the same errors.
  • 2026-01-15 16:28 UTC - The issue was identified as an intermittent user login issue.
  • 2026-01-15 21:12 UTC - Further investigation revealed clock skew in some nodes on the cluster.
  • 2026-01-15 21:41 UTC - The nodes that were showing signs of clock skew were drained.
  • 2026-01-15 21:45 UTC - Confirmed that authentications were successful.
  • 2026-01-16 16:11 UTC - An additional node was showing signs of clock skew, so the node was drained and the incident was closed.

Future mitigation

We are still researching why the clocks drifted on these nodes, as these nodes use the AWS-provided time synchronization service, which should have kept the clocks in sync.

Posted Jan 16, 2026 - 23:08 UTC

Resolved

This incident has been resolved.
Posted Jan 16, 2026 - 16:21 UTC

Update

A fix has been implemented and we are monitoring the results. Please let us know via the support portal if you see any issues logging in to your cloud account.
Posted Jan 16, 2026 - 00:49 UTC

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Jan 15, 2026 - 22:12 UTC

Investigating

We are currently investigating intermittent authentication issues (read/write token authentication is not affected).
Posted Jan 15, 2026 - 17:18 UTC
This incident affected: Cloud Serverless: AWS, EU-Central (Web UI, Other).