AWS

Cloud

8 sections
32 source tickets

Last synthesized: 2026-02-13 02:51 | Model: gpt-5-mini
Table of Contents

1. SES account and SMTP user provisioning / onboarding

10 tickets

2. Postfix/mail-relay migrations from MX1 to AWS SES

4 tickets

3. Send-rate limits and SMTP 454 causing failover and delivery anomalies

1 ticket

4. SES IAM permissions and authorized sender identity failures (ses:SendRawEmail)

2 tickets

5. Third‑party application compatibility with SES SMTP credential entry formats

1 ticket

6. AWS IAM access, role transfers and service-account provisioning

12 tickets

7. Database network isolation and provider access controls for external consumers

1 ticket

8. ECS task CPU saturation causing Metabase report endpoints to hang

1 ticket

1. SES account and SMTP user provisioning / onboarding
90% confidence
Problem Pattern

Requests asked to provision or migrate AWS SES sending accounts and per-service SMTP/API credentials for multiple services. Symptoms included unverified sender identities causing SMTP errors such as "554 Message rejected: Email address is not verified" during migrations. Requesters also supplied the required sender/reply-to addresses, expected 24‑hour and monthly volumes, attachment behavior, cost‑centre assignment, and hard‑bounce contact details.

Solution

AWS SES sending identities and per-service SMTP/API credentials were created in the eu-central-1 SES tenants and provisioned to requesters. Sender addresses were administratively limited or changed on request, and primary contact persons and approval steps were recorded in Jira. Request metadata (expected 24‑hour and monthly volumes, attachment behavior, cost centre, and hard‑bounce contacts) was recorded and used to place accounts into operational categories (marketing vs. system) and to set suppression/notification expectations. Credentials and other sensitive data were delivered to teams via the organisation’s secure vaults (Safe App / 1Password / Teams). CloudWatch logging was enabled when requested. During migrations to SES, some test sends failed with the SMTP error “554 Message rejected: Email address is not verified” when identities had not been verified; onboarding/migration work therefore included verifying the listed sender identities, clarifying approved sender/wildcard noreply patterns for internal services, and confirming contacts and volume limits before enabling sending.
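
To make the verification step concrete, the following is a minimal sketch using boto3 and the SES v2 API in eu-central-1; the identity name is a placeholder, not an address from the tickets.

    import boto3

    sesv2 = boto3.client("sesv2", region_name="eu-central-1")

    # Create a sender identity; SES then sends a verification mail (for an
    # email-address identity) or requires DNS records (for a domain identity).
    sesv2.create_email_identity(EmailIdentity="noreply@example.com")

    # Confirm the identity is verified before enabling sending, so that test
    # sends do not fail with "554 Message rejected: Email address is not verified".
    identity = sesv2.get_email_identity(EmailIdentity="noreply@example.com")
    print(identity["VerifiedForSendingStatus"])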

2. Postfix/mail-relay migrations from MX1 to AWS SES
95% confidence
Problem Pattern

Hosts running Postfix and applications using SMTP credentials required migration of their outbound relay from the legacy MX1 service to AWS SES. Affected components included Postfix main.cf relayhost settings and smtp_sasl_password_maps entries, plus application-level SMTP user accounts and sender addresses. Requests were sometimes raised as planned migration tasks rather than failures, needing new SES credentials, cutover coordination, and clarification of cost-centre/ownership; there was a risk of test/dev mail reaching real recipients if environments were not restricted.

Solution

Postfix-based hosts and application SMTP users were migrated from MX1 to AWS SES by switching relayhost entries to the appropriate AWS SES SMTP endpoints and replacing MX1 SASL credentials with SES SMTP credentials in smtp_sasl_password_maps; the previous MX1 settings were retained as commented references for traceability. Application/sender migrations (for example the ernennungsportal sender) received provisioned SES SMTP credentials that were handed over securely: credentials were stored in 1Password and cutover coordination was handled via Microsoft Teams, with cost-centre/ownership clarified before handover. Delivery and cutover success were confirmed by reviewing AWS SES send logs and Postfix/OTRS logs on the hosts (example host: cpgbrh2otrs1). Migration activities were coordinated with application teams to avoid accidental sending from test/dev environments and to ensure SMTP users and sender addresses were updated in tandem.
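
As an illustration of the Postfix-side change, the sketch below renders the relayhost and SASL map entries described above; the SES endpoint is the standard eu-central-1 SMTP endpoint, while the MX1 hostname, credentials and file paths are placeholders rather than values from the tickets.

    # Renders the Postfix fragments for one migrated host (illustrative only).
    relayhost = "[email-smtp.eu-central-1.amazonaws.com]:587"
    smtp_user = "AKIAEXAMPLE"          # SES SMTP username (placeholder)
    smtp_pass = "ses-smtp-password"    # delivered via 1Password in practice

    main_cf_fragment = f"""
    # relayhost = [mx1.example.internal]:25   # previous MX1 relay, kept as commented reference
    relayhost = {relayhost}
    smtp_sasl_auth_enable = yes
    smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
    smtp_sasl_security_options = noanonymous
    smtp_tls_security_level = encrypt
    """

    sasl_passwd_entry = f"{relayhost} {smtp_user}:{smtp_pass}\n"

    print(main_cf_fragment)
    print(sasl_passwd_entry)
    # After updating /etc/postfix/sasl_passwd: run postmap /etc/postfix/sasl_passwd and
    # postfix reload, then confirm delivery in the SES send logs and host mail logs.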

3. Send-rate limits and SMTP 454 causing failover and delivery anomalies
80% confidence
Problem Pattern

High-volume SMTP submissions hit AWS SES per-second send-rate limits and produced SMTP 454 throttle responses; affected systems (Workday) exhibited failover between SMTP paths and some recipients experienced attachment filename/extension anomalies (e.g., PDF attachments delivered as .txt).

Solution

Investigation of SMTP and SES logs identified that the SES SMTP user exceeded the per-second send-rate threshold and received SMTP 454 responses; this triggered the sending application to alternate between SMTP routes and caused inconsistent sender addresses. Mitigation reduced the observed 454 responses and stabilized SMTP path usage; after the change the SMTP 454 errors ceased and invoice attachments were received with normal filenames. The attachment-extension anomaly was attributed to recipient-side mailserver/security handling rather than content corruption from SES.
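
A hedged sketch of rate-aware sending is shown below; it is not the mitigation recorded in the ticket (which is not detailed there), but it illustrates how the per-second ceiling reported by SES can be used to pace submissions and avoid SMTP 454 throttling. The send callback is hypothetical.

    import time
    import boto3

    ses = boto3.client("ses", region_name="eu-central-1")
    max_rate = ses.get_send_quota()["MaxSendRate"]  # messages per second allowed by SES

    def paced_send(messages, send_one):
        """Submit messages without exceeding the per-second rate that triggers SMTP 454."""
        interval = 1.0 / max_rate
        for msg in messages:
            send_one(msg)          # hypothetical callback that performs one SMTP/API send
            time.sleep(interval)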

Source Tickets (1)
4. SES IAM permissions and authorized sender identity failures (ses:SendRawEmail)
95% confidence
Problem Pattern

Applications and SMTP users (for example Auth0 and Metabase SMTP user migrations) experienced email send failures with explicit 554 Access denied errors stating an IAM principal (ARN) "is not authorized to perform ses:SendRawEmail" on a specific SES identity ARN. Symptoms included immediate inability to send mail from the affected application or SMTP account, repeated SES retry/delay behavior, and failures appearing after SMTP migrations or identity/host changes. Affected systems included AWS SES, AWS IAM, SMTP endpoints and client applications.

Solution

Failures were traced to IAM principals lacking the ses:SendRawEmail permission scoped to the SES identity ARN, and/or to the SES sender identity not being authorized or verified. The resolution corrected the application's SMTP/IAM principal policy to include ses:SendRawEmail for the target SES identity ARN and ensured the required sender addresses/identities were authorized in SES. In the SMTP migration case the SMTP user/principal mapping to the new server was reconciled with IAM/SES and the SES identity re-authorized. After the IAM policy and sender-identity updates the applications (Auth0, Metabase) regained sending capability and SES retry/delay behavior returned to normal.
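
A minimal sketch of the kind of policy correction described above follows, assuming boto3 and an inline IAM user policy; the user name, account id and identity ARN are placeholders, not values from the tickets.

    import json
    import boto3

    iam = boto3.client("iam")

    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["ses:SendRawEmail", "ses:SendEmail"],
            # The SES identity the application is allowed to send from:
            "Resource": "arn:aws:ses:eu-central-1:123456789012:identity/example.com",
        }],
    }

    iam.put_user_policy(
        UserName="ses-smtp-metabase",              # placeholder SMTP/IAM principal
        PolicyName="AllowSendRawEmailOnIdentity",
        PolicyDocument=json.dumps(policy),
    )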

Source Tickets (2)
5. Third‑party application compatibility with SES SMTP credential entry formats
50% confidence
Problem Pattern

Some third-party SaaS forms and flow tools required credentials to be entered in a single combined field (sender address + SMTP username), and standard AWS SES SMTP credentials did not match the tool's expected input format, causing send failures.

Solution

A credentials format compatible with the third-party product was supplied and delivered to the requester via the organization's secure vault channels (1Password / Teams). Providing SMTP credentials in a form the application accepted restored sending from the affected flow tools.

Source Tickets (1)
6. AWS IAM access, role transfers and service-account provisioning
91% confidence
Problem Pattern

Users were unable to access AWS due to administrative and provisioning gaps: role/group memberships that appeared active produced no console permissions or environment visibility, and some users lacked the Okta/SSO AWS application assignment required for access. Errors included botocore ForbiddenException from GetRoleCredentials and timeout/connect failures when accessing from remote/home networks while office access worked. Requests for AWS resources or temporary roles were delayed by ServiceDesk routing limits and pending approvals in MyAccess/Automation for Jira.

Solution

Administrative role and group ownership issues were resolved by assigning the appropriate AWS role or group inside AWS and verifying console access. In cases where SSO-level access was missing, AWS access was provisioned by assigning the AWS application to the user in the Okta portal. Requesters were routed to the Microsoft MyAccess portal for AWS role/admin requests (approvals tracked via Automation for Jira) and to the DevOps Service Portal for environment- or resource-specific work owned by Dev teams.

Service account identities required for deployments were synchronized into the target AWS accounts and access was validated with the teams. QuickSight access was fulfilled through the internal self-service portal by requesting AWSQuickSightAuthor or AWSQuickSightReadOnly and awaiting approver action. When S3 buckets did not exist and tickets could not be transferred between ServiceDesk systems, requesters opened DevOps tickets; DevOps created the buckets and then granted the requested write access. When assigned roles produced no apparent permissions or environment visibility, users removed themselves from the group and re-requested membership so the role was reprovisioned, which restored console permissions, portal visibility, and credential creation.

Production administrator removals and cleanups were performed by DevOps, and access changes were verified after removal. One ticket recorded botocore GetRoleCredentials ForbiddenException and timeout connection errors from a home network versus the office; no configuration change was documented for that case in the ticket set.
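
For the symptom where credentials appear to exist but calls are denied or time out, a small diagnostic sketch is shown below; it only confirms which principal the local credentials resolve to and whether a basic call succeeds, and does not replace the Okta/MyAccess provisioning steps. The profile name and probe call are placeholders.

    import boto3
    from botocore.exceptions import ClientError, BotoCoreError

    def check_access(profile_name="default"):
        session = boto3.Session(profile_name=profile_name)
        try:
            identity = session.client("sts").get_caller_identity()
            print("Authenticated as:", identity["Arn"])
            session.client("s3").list_buckets()   # lightweight permission probe
            print("Basic API access OK")
        except ClientError as err:
            # API-level denial, e.g. a role assignment that looks active but grants nothing.
            print("Access denied:", err)
        except BotoCoreError as err:
            # Credential or network problems, e.g. the GetRoleCredentials ForbiddenException
            # or the connection timeouts seen only from home/remote networks.
            print("Credential/connection problem:", err)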

7. Database network isolation and provider access controls for external consumers
80% confidence
Problem Pattern

External service providers needed read-only access to an internal CARE database in a new AWS environment. Providers were to receive DB replicas (some dedicated for high-load consumers) while the team debated whether network-level blocking of the DB master was required versus relying on password protection. Access sources were variable (Tailscale VPN clients and users via bastion/jump hosts) so source IPs could not be reliably used for filtering. ISO guidance recommended preventing access at the network level.

Solution

Access for external providers was constrained to dedicated read-only replica hosts; direct network-level access to the DB master was prevented. Provider connectivity was routed through controlled paths (replica subnets and bastion/jump hosts or Tailscale endpoints) and auditing/monitoring was enabled using CloudTrail, Datadog, CloudWatch and Microsoft Sentinel to capture access and detect anomalies. This approach separated heavy consumer traffic onto replicas while preserving master protection.
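
One way to express the network-level restriction is sketched below, assuming the replicas sit behind their own security group and providers reach them only via bastion/jump hosts; group IDs, port and database engine are placeholders, not values from the ticket.

    import boto3

    ec2 = boto3.client("ec2", region_name="eu-central-1")

    # Allow the database port from the bastion/jump-host security group to the
    # replica security group only; the master's group has no such ingress rule,
    # so providers cannot reach it at the network level regardless of passwords.
    ec2.authorize_security_group_ingress(
        GroupId="sg-0replicaexample",              # placeholder replica SG
        IpPermissions=[{
            "IpProtocol": "tcp",
            "FromPort": 5432,                      # placeholder DB port
            "ToPort": 5432,
            "UserIdGroupPairs": [{"GroupId": "sg-0bastionexample"}],
        }],
    )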

Source Tickets (1)
8. ECS task CPU saturation causing Metabase report endpoints to hang
90% confidence
Problem Pattern

The production Metabase instance became extremely slow or non-responsive when loading report/query data: user login worked but data request endpoints for reports hung for extended periods (>20 minutes). Non-production environments were unaffected. The outage correlated with very high CPU utilization on the production AWS ECS task hosting Metabase.

Solution

The production ECS task was replaced with a new task instance to remove the CPU-constrained process. Starting the newer ECS task restored normal responsiveness for report and data requests and resolved the performance degradation in production.
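
A minimal sketch of the replacement step follows, assuming Metabase runs as an ECS service; cluster, service and task identifiers are placeholders, not values from the ticket.

    import boto3

    ecs = boto3.client("ecs", region_name="eu-central-1")

    # Option A: let the service scheduler roll fresh tasks and drain the old one.
    ecs.update_service(
        cluster="prod-cluster",
        service="metabase",
        forceNewDeployment=True,
    )

    # Option B: stop the saturated task directly; the service starts a replacement.
    ecs.stop_task(
        cluster="prod-cluster",
        task="arn:aws:ecs:eu-central-1:123456789012:task/prod-cluster/EXAMPLE",
        reason="CPU saturation; replace task to restore Metabase responsiveness",
    )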

Source Tickets (1)