Event Notification system issue

Incident Report for CVaaS

Postmortem

Summary

  • Date: May 9, 2025 (date of discovery); May 10, 2025 (date of initial disclosure).
  • Tracking Bug: BUG1164090
  • Impacted Clusters: All customers using TLS for syslog event notifications.
  • Authors: CVaaS (CloudVision as a Service) SRE Team
  • Status: Resolved
  • Summary: Syslog event notifications ignored ‘useTLS’ setting.
  • Impact: Syslog event messages may have been sent over a cleartext TCP or UDP connection (whichever the user configured) and could potentially have been intercepted.
  • RCA: Completed.
  • Detection: Detected by a CVaaS customer.

Root Cause Analysis

While refactoring the code that establishes syslog connections, we unintentionally changed how the user configuration is parsed. As a result, CVaaS ignored the ‘useTLS’ setting. The bug was caused by a misunderstanding of how function arguments would be processed.
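This report does not show the affected code, so the following is only a hypothetical sketch of one way to guard against this class of bug, not the CVaaS implementation. It assumes a settings dictionary carrying the ‘useTLS’ key named above; `SyslogSettings`, `parse_syslog_settings`, `plan_connection` and the `host`, `port` and `protocol` keys are illustrative names that do not come from the report. Keyword-only arguments make it harder for a boolean such as the TLS flag to be dropped or bound to the wrong parameter during a refactor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyslogSettings:
    """Illustrative container for parsed settings (not the real CVaaS type)."""
    host: str
    port: int
    protocol: str  # "tcp" or "udp", as chosen by the user
    use_tls: bool


def parse_syslog_settings(raw: dict) -> SyslogSettings:
    """Parse user configuration, rejecting a malformed 'useTLS' value."""
    use_tls = raw.get("useTLS", False)
    if not isinstance(use_tls, bool):
        raise ValueError(f"useTLS must be a boolean, got {use_tls!r}")
    return SyslogSettings(
        host=raw["host"],
        port=int(raw["port"]),
        protocol=raw.get("protocol", "tcp"),
        use_tls=use_tls,
    )


def plan_connection(*, host: str, port: int, protocol: str, use_tls: bool) -> dict:
    """Describe the connection to open; the bare '*' forces callers to name every argument."""
    return {"address": (host, port), "protocol": protocol, "encrypted": use_tls}


# Hypothetical example values.
settings = parse_syslog_settings(
    {"host": "syslog.example.com", "port": 6514, "protocol": "tcp", "useTLS": True}
)
plan = plan_connection(
    host=settings.host,
    port=settings.port,
    protocol=settings.protocol,
    use_tls=settings.use_tls,
)
```

In this sketch, calling `plan_connection` with positional arguments raises a `TypeError`, so a call site that changes shape during a refactor fails immediately instead of silently losing the TLS flag.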

Note that if the customer syslog server was configured to use TCP, and the TLS endpoint of that syslog server was configured to accept only TLS connections, then the CVaaS-to-endpoint connection could not have been established. In that case, no messages would have been sent on the wire. We recommend configuring your syslog endpoints to accept only TLS connections over TCP.

Timeline

Tenants in the regions ausoutheast-1, apnortheast-1 or us-central1-a were impacted by this issue from April 16th, 2025. All other regions were impacted from March 24th, 2025. We were notified about this issue on May 7th, 2025, and deployed the fix on May 9th, once we had identified and corrected the underlying problem. All customers were notified about this issue on the same day the fix was deployed.

Follow up Analysis

After fixing this issue in syslog events, we reviewed our other event notification systems affected by the refactor and confirmed that security-related settings were being respected. This issue was only possible due to a gap in our automated testing, and as such we are implementing more thorough tests across our event notification systems.
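As an illustration of the kind of automated check that can close such a gap, here is a minimal sketch of a regression test, written under stated assumptions rather than taken from the CVaaS test suite. `RecordingDialer` and `send_syslog_event` are hypothetical names: the dialer is a test double that records how a connection was requested, and the sender stands in for the code under test. Only the ‘useTLS’ key comes from this report.

```python
class RecordingDialer:
    """Test double: records connection requests instead of opening sockets."""

    def __init__(self) -> None:
        self.calls = []

    def dial(self, *, host: str, port: int, protocol: str, tls: bool) -> None:
        self.calls.append(
            {"host": host, "port": port, "protocol": protocol, "tls": tls}
        )


def send_syslog_event(config: dict, message: str, dialer: RecordingDialer) -> None:
    """Hypothetical stand-in for the sender; it must pass 'useTLS' through."""
    dialer.dial(
        host=config["host"],
        port=config["port"],
        protocol=config.get("protocol", "tcp"),
        tls=config.get("useTLS", False),
    )
    # Writing `message` to the connection is omitted in this sketch.


def check_tls_setting_is_respected() -> None:
    """Exercise both values of the setting so neither can be silently ignored."""
    for use_tls in (True, False):
        dialer = RecordingDialer()
        config = {"host": "syslog.example.com", "port": 6514, "useTLS": use_tls}
        send_syslog_event(config, "test event", dialer)
        assert len(dialer.calls) == 1
        assert dialer.calls[0]["tls"] is use_tls


check_tls_setting_is_respected()
```

Asserting on what reaches the connection layer, rather than on the parsed configuration alone, is a design choice: it would catch a setting that is parsed correctly but dropped somewhere between parsing and dialing.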

We have reached out to any affected customers directly.

We believe CloudVision as-a-Service should enforce best practices when it comes to security, and as such we will be deprecating the use of insecure transport settings for all types of alert configuration. This change will affect syslog, email, SNMP & webhooks, meaning all of them will require a secure endpoint.

Starting August 2025, we will officially begin deprecating support for insecure transport settings in CloudVision as-a-Service. Customers using insecure settings will be notified with instructions to ensure continued service. New configurations will require secure settings.

By the end of September, secure settings will be enforced, and any alerts sent via insecure transport settings will no longer function.
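Customers who want to review their own alert configurations ahead of enforcement could script a check along the following lines. This is a rough sketch under explicit assumptions: the report does not define the configuration schema or what counts as secure for each alert type, so the `type`, `name` and `url` keys are hypothetical, and the rules used here (syslog requires ‘useTLS’ to be true, webhooks require an `https` URL) are assumptions of the example. Email and SNMP are left out because the report gives no criteria for them.

```python
from urllib.parse import urlsplit


def find_insecure_settings(configs: list) -> list:
    """Return a human-readable finding for each config that looks insecure.

    The rules below are assumptions of this sketch, not documented CVaaS policy.
    """
    problems = []
    for cfg in configs:
        kind = cfg.get("type")
        name = cfg.get("name", "<unnamed>")
        if kind == "syslog" and cfg.get("useTLS") is not True:
            problems.append(f"{name}: syslog without useTLS")
        elif kind == "webhook" and urlsplit(cfg.get("url", "")).scheme != "https":
            problems.append(f"{name}: webhook URL is not https")
    return problems


# Hypothetical example configurations.
findings = find_insecure_settings(
    [
        {"type": "syslog", "name": "siem", "useTLS": True},
        {"type": "syslog", "name": "legacy", "useTLS": False},
        {"type": "webhook", "name": "pager", "url": "http://hooks.example.com/a"},
    ]
)
```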

If you have any questions, please reach out to Arista TAC at support@arista.com.

Posted May 20, 2025 - 19:29 UTC

Resolved

We discovered a security issue in the CloudVision Service that impacts customers who have configured the Event Notification system to send events to a syslog platform with TLS configured. The issue only impacts the event notification messages themselves sent over syslog to the syslog platform, and does not impact the security of any other part of the system. We have deployed a mitigation to all production clusters. This issue does not impact any on-prem CloudVision releases.

Only a limited number of customers were affected. Tenants in the regions ausoutheast-1, apnortheast-1 or us-central1-a have been impacted by this issue since April 16th, 2025. Tenants in all other regions have been impacted since March 24th, 2025.

At this time we are preparing a post-mortem and will provide a more detailed update in the future.
Posted May 09, 2025 - 20:14 UTC
This incident affected: us-central1-a (Events), us-central1-c (Events), euwest-2 (Events), apnortheast-1 (Events), ausoutheast-1 (Events), na-northeast1-b (Events), uk-1 (Events), us-central1-b (Events), and india-1 (Events).