Imagine you wake up to a message:
“Why is prod down… and why did our cloud bill spike overnight?”
You open dashboards. CPU looks normal now. No obvious deploy. No incidents logged.
Then you notice something: a new admin user was created at 2:13 AM, in a region your company doesn’t even use. A firewall rule was opened. A storage bucket’s permissions changed. Logs were disabled… and re-enabled.
How do you prove what happened—quickly, confidently, and in a way that stands up in a security review?
That’s what cloud audit logging is for.
This guide is the “do it right once” blueprint:
- What to log (so you don’t miss the critical events)
- How long to keep logs (without drowning in cost)
- What alerts actually work (high-signal, low-noise)
No fluff. Just a practical system you can implement.

1) What “audit logs” really are (and what they are not)
Audit logs answer: “Who did what, where, and when?”
They capture control plane actions like:
- Creating/deleting resources
- Changing IAM permissions
- Modifying firewall rules
- Disabling encryption
- Turning off logging (yes, this happens)
Audit logs are not:
- Application logs (your app’s errors, traces)
- Network flow logs (who talked to whom over the network)
- OS logs (syslog, auth.log)
- Data content logs (actual database rows or file contents)
A good cloud logging strategy uses all of them, but audit logs are the non-negotiable foundation—because they explain changes to the environment itself.
2) The 3 outcomes your audit logging must deliver
If your audit logging does not achieve these, it’s just storage.
Outcome A — Fast incident investigation
When something goes wrong, you can answer within minutes:
- What changed?
- Who changed it?
- Was it accidental, automated, or malicious?
Outcome B — Continuous detection (not monthly surprises)
You don’t wait for a breach report. You get alerted when risky actions occur.
Outcome C — Compliance and forensics-ready evidence
If you ever need to prove “this is what happened,” your logs are:
- Complete enough
- Protected against tampering
- Retained long enough
3) Step-by-step setup: a “minimum viable audit logging” blueprint
Step 1 — Define your “crown jewels”
Write down what you must protect first:
- Production accounts/subscriptions/projects
- Identity system (IAM, SSO, roles)
- Secrets (KMS/HSM, secret manager)
- Databases and storage
- Network perimeter (firewalls, gateways)
- CI/CD pipelines
You’ll use this list to decide which logs get extra retention and extra alerts.
Step 2 — Inventory your audit log sources
Most teams enable “some logging,” but miss important log sources.
At minimum, you want audit logs from:
- Identity & access management (users, roles, policies, keys, MFA)
- Resource management (create/update/delete of compute, DB, storage)
- Network & perimeter (firewall rules, load balancers, routes, VPN, peering)
- Key management & secrets (KMS keys, key policies, secret access changes)
- Logging and security tools themselves (SIEM, log sinks, detectors)
Then decide if you also need data access audit logs for sensitive services:
- Object storage reads (who accessed what)
- Database audit logs (connections, schema changes)
- Secrets access events
Step 3 — Centralize logs (the “log archive” pattern)
This is where most setups fail.
If logs stay inside the same account where workloads run, an attacker who gains admin can try to:
- Disable logging
- Delete logs
- Change retention
- Alter sinks/exporters
Best practice: ship audit logs to a separate log archive environment with tighter controls.
Minimum safeguards:
- Workload accounts can write to archive, but cannot delete.
- Only a small security/admin group can manage retention and deletion.
- Logs are encrypted, and access is heavily monitored.
Think of it as a bank vault. You don’t store security camera footage in the cashier’s drawer.
Step 4 — Protect log integrity (tamper resistance)
Your audit logs should be:
- Write-once-ish (or at least delete-protected)
- Encrypted at rest
- Access-controlled
- Monitored for changes
Add alerts for:
- Retention reduced
- Log export disabled
- Log storage deleted
- Permissions changed on log archive
Because the first thing a skilled attacker does is try to erase the evidence.
Step 5 — Normalize the key fields (so you can actually alert)
Different clouds name fields differently, but your detection logic needs consistent concepts.
Normalize into these fields in your SIEM/log platform:
- timestamp
- actor (who did it)
- actor_type (user, role, service account, workload identity)
- action (what they did)
- target (resource affected)
- result (success/fail)
- source_ip
- user_agent / caller
- region / location
- auth_method (MFA? key? federated?)
- request_id (for correlation)
Once you have these, you can build alert rules that survive provider differences.
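As a concrete example, here is a minimal sketch of normalizing a raw AWS CloudTrail-style record into the common schema above. The input field names follow CloudTrail's JSON layout; mappings for other providers will differ, and you should verify each path against your own events.

```python
def normalize_cloudtrail(raw: dict) -> dict:
    """Map a CloudTrail-style event onto the provider-neutral schema."""
    identity = raw.get("userIdentity", {})
    return {
        "timestamp": raw.get("eventTime"),
        "actor": identity.get("arn"),
        "actor_type": identity.get("type"),          # e.g. IAMUser, AssumedRole
        "action": raw.get("eventName"),
        "target": raw.get("requestParameters", {}),  # resource details vary per API
        "result": "fail" if raw.get("errorCode") else "success",
        "source_ip": raw.get("sourceIPAddress"),
        "user_agent": raw.get("userAgent"),
        "region": raw.get("awsRegion"),
        "auth_method": identity.get("sessionContext", {})
                               .get("attributes", {})
                               .get("mfaAuthenticated"),
        "request_id": raw.get("requestID"),
    }
```

A matching `normalize_azure_activity` would map the same target fields from Azure's Activity Log schema, so downstream alert rules only ever see one shape.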
4) What to log: the practical “Tier 0 / Tier 1 / Tier 2” model
You don’t want to log “everything” blindly. You want everything that matters, and you want to treat critical events differently.
Tier 0 (must log, must alert, must retain longer)
These events are high impact and often high risk:
Identity & privilege
- New user / service account created
- Role assigned / elevated permissions
- Policy changes (especially admin permissions)
- Access keys created / rotated / disabled
- MFA disabled or bypassed
- SSO / federation settings changed
- “Break-glass” account used
Security posture
- Logging disabled, modified, or retention reduced
- Encryption disabled, key policy changed
- Public access enabled on storage
- Firewall/security rules opened broadly (0.0.0.0/0 or equivalent)
- Threat detection tool disabled
Network perimeter
- New ingress rules, VPN/peering changes
- Route table changes
- Load balancer listener changes (new ports exposed)
Data access (for crown jewels)
- Reads/downloads from sensitive storage buckets
- Database audit events (schema changes, new admin users)
- Secrets access policy changes
Tier 1 (log + alert in production; retain medium)
These drive most “how did this happen?” investigations:
- Compute instance/container created, deleted, resized
- Autoscaling configuration changed
- Database instance class/storage modified
- Backup policies changed
- Infrastructure-as-code pipeline changes (who approved/applied)
- Container registry settings changed
- Key service configuration changes (queues, topics, gateways)
Tier 2 (log for investigation; alert only if correlated)
These create noise if alerted on alone, but help in correlation:
- Read-only inventory events
- Frequent benign API calls from automation
- Routine deployment activities (if already covered by CI/CD logs)
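The three-tier model above can be sketched as a simple classifier. The action names below are a small illustrative sample (AWS-style API names), not a complete ruleset; in practice you would maintain these lists per provider.

```python
# Illustrative sample of Tier 0 / Tier 1 action names; extend per provider.
TIER0_ACTIONS = {
    "CreateUser", "AttachUserPolicy", "DeactivateMFADevice",
    "StopLogging", "PutBucketPolicy", "AuthorizeSecurityGroupIngress",
}
TIER1_ACTIONS = {
    "RunInstances", "TerminateInstances", "ModifyDBInstance",
    "UpdateAutoScalingGroup",
}

def classify(action: str) -> int:
    """Return the alerting tier (0, 1, or 2) for an audit action name."""
    if action in TIER0_ACTIONS:
        return 0
    if action in TIER1_ACTIONS:
        return 1
    return 2  # log for investigation; alert only when correlated
```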
5) Retention: how long should you keep audit logs?
Here’s the honest truth:
Retention isn’t a number. It’s a strategy.
You keep different levels of detail for different time windows.
The 3-tier retention strategy (simple and effective)
Hot (fast search, high cost): 30–90 days
- Full fidelity audit logs
- Indexed for quick investigation
- Used for day-to-day detection and incident response
Warm (searchable but cheaper): 3–12 months
- Still searchable, maybe slower
- Used for trend analysis, investigations that start late, internal audits
Cold (cheap archive): 1–7 years (or per compliance)
- Stored for compliance, legal holds, long investigations
- Not always instantly searchable, but retrievable
How to decide your numbers (beginner-friendly rule)
Ask these questions:
- How long do incidents typically go unnoticed? If the answer is “weeks,” your hot retention must cover it.
- What are your compliance requirements? Some industries require longer retention.
- How often do you need to investigate old events? If you routinely do postmortems or customer audits, keep warm longer.
A practical default (works for many teams)
- Hot: 60–90 days
- Warm: 12 months
- Cold: 3–7 years for Tier 0 events (or required systems)
And remember: you can keep Tier 0 longer than everything else.
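The defaults above can be expressed as one small lookup function. The day counts and the per-tier cold retention are assumptions taken from this section's suggested ranges; adjust them to your compliance needs.

```python
# Assumed defaults: hot 90 days, warm 12 months, cold 7 years for Tier 0.
HOT_DAYS = 90
WARM_DAYS = 365
COLD_DAYS = {0: 7 * 365, 1: WARM_DAYS, 2: WARM_DAYS}  # Tier 0 kept longest

def storage_class(age_days: int, tier: int) -> str:
    """Where a log record of a given age and tier lives, or 'expired'."""
    if age_days <= HOT_DAYS:
        return "hot"
    if age_days <= WARM_DAYS:
        return "warm"
    if age_days <= COLD_DAYS.get(tier, WARM_DAYS):
        return "cold"
    return "expired"
```

Note how a Tier 2 event expires after the warm window while a Tier 0 event of the same age stays in cold archive.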
6) Alerting use cases: high-signal alerts engineers won’t ignore
Most teams fail at alerting because:
- They alert on noisy events (“someone listed buckets”)
- They don’t include context (who, where, what changed)
- They don’t route alerts to owners with clear actions
A good alert rule is like a good bug report:
- What happened
- Why it matters
- Who did it
- What changed
- What to do next
Below is an “alert pack” you can implement.
Alert Pack A — Identity and privilege changes (Tier 0)
1) New admin granted
Trigger when:
- Any principal gets a high-privilege role or policy
Include in alert:
- Actor, target identity, new permission set, source IP, region
2) Access key created
Trigger when:
- New access key or credential generated
Extra rule:
- Higher severity if created outside business hours or from unusual IP
3) MFA disabled
Trigger when:
- MFA removed, disabled, or device changed
Extra:
- If it happens on a privileged account → critical
4) SSO / federation configuration changed
Trigger when:
- Identity provider settings changed
Why it matters:
- Attackers love rerouting auth.
Alert Pack B — “Covering tracks” events (Tier 0)
5) Audit logging disabled or modified
Trigger when:
- Logging stopped
- Log export/sink removed
- Retention reduced
- Log bucket permissions changed
This is usually critical. If someone can do this, they can hide everything else.
6) Log archive deletion attempts
Trigger when:
- Delete operations attempted on log storage, even if denied
Alert Pack C — Network and perimeter exposures (Tier 0)
7) Firewall opened to the world
Trigger when:
- Inbound rule allows wide-open ranges on sensitive ports
Severity guide:
- Critical for SSH/RDP/admin ports
- High for databases
- Medium for app ports depending on architecture
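The severity guide above is easy to encode as a lookup. The port groupings here are assumptions meant as a starting point; tune them to your own architecture.

```python
# Assumed port groupings; adjust to your environment.
ADMIN_PORTS = {22, 3389, 5985, 5986}        # SSH, RDP, WinRM
DB_PORTS = {3306, 5432, 1433, 27017, 6379}  # MySQL, Postgres, MSSQL, Mongo, Redis

def open_to_world_severity(port: int) -> str:
    """Severity when an inbound rule opens this port to 0.0.0.0/0."""
    if port in ADMIN_PORTS:
        return "critical"
    if port in DB_PORTS:
        return "high"
    return "medium"  # app ports: depends on architecture
```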
8) New public load balancer/listener created
Trigger when:
- Public-facing endpoints added
Include:
- Port, protocol, target resource, change actor
9) Route table or gateway changes
Trigger when:
- Routes modified in a way that changes egress path (exfiltration risk)
Alert Pack D — Data protection and encryption (Tier 0)
10) Encryption disabled / key policy changed
Trigger when:
- Encryption settings reduced
- Key access expanded
- Key rotation disabled
Why it matters:
- This impacts blast radius and compliance.
11) Storage bucket made public
Trigger when:
- Public ACL/policy allowed
Include:
- Bucket name, policy diff summary, actor
Alert Pack E — Suspicious behavior patterns (high value)
These require baselines, but pay off big.
12) Unusual region usage
Trigger when:
- Sensitive actions occur in regions you don’t operate in
13) Spike in failed auth
Trigger when:
- Many failures from one IP or many IPs for same identity
14) Privileged action from new IP / new user agent
Trigger when:
- Admin role used from never-seen network or device signature
15) Mass deletion or destructive burst
Trigger when:
- Delete operations exceed a threshold in a short time window
Examples:
- Many storage objects deleted
- Many instances terminated
- Many IAM policies removed
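A destructive burst is naturally detected with a sliding window. Below is a minimal sketch; the threshold and window size are illustrative defaults, and timestamps are epoch seconds.

```python
from collections import deque

class BurstDetector:
    """Trip when `threshold` delete events land within `window_seconds`."""

    def __init__(self, threshold: int = 20, window_seconds: int = 300):
        self.threshold = threshold
        self.window = window_seconds
        self.events: deque = deque()

    def record(self, ts: float) -> bool:
        """Record one delete event; return True when the burst trips."""
        self.events.append(ts)
        # Drop events that fell out of the window.
        while self.events and ts - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events) >= self.threshold
```

Run one detector per (actor, action-class) pair so a noisy cleanup job doesn't mask a separate malicious actor.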
7) Make alerts actionable: add context + a runbook snippet
Every alert should include a tiny “what to do now” section.
Example for “New admin granted”:
Immediate actions
- Verify change ticket / deployment record
- Confirm actor identity and auth method
- If unexpected: disable the credential/session and revoke the role
- Search for follow-on actions by the same actor in the next 15 minutes
- Open incident if any Tier 0 follow-up occurred (logging change, firewall open, key creation)
This one addition turns alerts from noise into response.
8) Real examples (what audit logs catch in the real world)
Example 1: “We got billed $8k overnight”
Audit logs reveal:
- A new compute resource created at 2 AM
- Region differs from normal
- Actor is an access key that hasn’t been used in months
Response:
- Disable the key
- Terminate the resources
- Backtrack all actions from that actor
Without audit logs, you’d be guessing.
Example 2: “Database became publicly accessible”
Audit logs show:
- Security rule changed to allow inbound from anywhere
- Change was made by a CI role used outside normal pipeline hours
Response:
- Revert rule
- Investigate CI credential misuse
- Add alert + policy guardrail to block future changes
Example 3: “We can’t find logs for the incident window”
Audit logs show:
- Logging was disabled for 12 minutes
- Log sink permissions were modified
Response:
- Treat as high-severity security incident
- Lock down log archive permissions
- Add “logging disabled” as critical alert
9) The most common mistakes (and how to avoid them)
Mistake 1: Only logging in one account/project
Fix: centralize across all environments, especially prod and identity.
Mistake 2: Keeping logs but not searching them
Fix: hot retention must be indexed and queryable.
Mistake 3: No alerts for disabling logging
Fix: alert on changes to logging configuration and retention.
Mistake 4: “Everything alerts all the time”
Fix: Tier your alerts and focus on Tier 0 first.
Mistake 5: No ownership
Fix: route alerts to teams who can act, not a generic inbox.
10) A simple “starter implementation” you can follow this week
Day 1: Enable + centralize
- Turn on cloud audit logs across prod accounts/projects
- Export/stream them to a log archive environment
Day 2: Lock down integrity
- Restrict deletion and retention changes
- Alert on log configuration changes
Day 3: Implement the Tier 0 alert pack
Start with:
- New admin granted
- Key created
- MFA disabled
- Logging disabled/modified
- Firewall opened broadly
- Storage made public
- Encryption disabled/key policy changed
Day 4: Test your alerts (yes, actually test)
Perform safe test actions in a sandbox:
- Create a test user
- Assign a test role
- Modify a test firewall rule
Confirm alerts fire with the right context.
Day 5: Add a weekly review
A 30-minute weekly routine:
- Top 10 risky actions
- Any anomalies
- Any unowned resources/identities
That’s how audit logging becomes a living system.
11) Quick checklist: “Audit logging done right”
Coverage
- Identity/IAM events
- Resource lifecycle events
- Network/perimeter events
- KMS/secrets events
- Logging configuration events
- Data access logs for crown jewels
Security
- Central log archive
- Least privilege access to logs
- Delete protection / tamper resistance
- Alerts on logging changes
Retention
- Hot searchable window (30–90 days)
- Warm medium-term (up to 12 months)
- Cold archive (years as needed)
- Longer retention for Tier 0
Alerting
- Tier 0 high-signal alerts implemented
- Alerts include actor + target + change + next steps
- Ownership and routing defined
- Alerts tested at least once
Final thought (the part most teams miss)
Audit logging isn’t about collecting evidence for “one day.”
It’s about creating a reality where:
- risky actions are visible,
- suspicious actions are loud,
- and mistakes are reversible before they become disasters.
Below is a cloud-specific, engineer-friendly “Tier 0” audit logging + alert pack for AWS and Azure, plus a dashboard layout and a retention plan you can implement with confidence. I’ll keep it practical, step-by-step, and packed with examples.
AWS + Azure Cloud Audit Logging (Tier 0): What to log, retention, and alert pack
The goal (what “good” looks like)
Within minutes of any risky change, you should be able to answer:
- Who did it (identity, role, service principal)
- What changed (policy, firewall, key, logging)
- Where it happened (account/subscription, region)
- How it happened (MFA? access key? workload identity? CI/CD?)
- What else they did next (follow-on actions)
If you can do that reliably, your audit logging is doing real work.
Part 1 — What to log (AWS vs Azure)
AWS: required audit log sources
A) CloudTrail management events (Tier 0 foundation)
These are control plane changes: IAM, EC2, VPC, S3 policy changes, etc.
Log these always:
- CloudTrail Management Events (Read can be optional; Write is mandatory)
- CloudTrail for all regions
- Prefer organization-level trails (all accounts)
B) CloudTrail data events (enable selectively for crown jewels)
Data events can get noisy/costly, so enable them for sensitive resources:
- S3 object-level access (GetObject/PutObject/DeleteObject)
- Lambda Invoke (if needed)
- DynamoDB data plane (as needed)
C) AWS Config (configuration history + drift)
Config gives you “state change over time.” CloudTrail gives “who did it.”
You want both.
D) VPC Flow Logs (not audit, but critical companion)
Use it when you investigate exfiltration or weird connections.
E) EKS audit logs (if Kubernetes runs your workloads)
Kubernetes API actions (RBAC changes, secret reads, etc.) can be its own security story.
Azure: required audit log sources
A) Azure Activity Log (Tier 0 foundation)
This is Azure’s control plane log: resource changes and administrative actions.
Log these always:
- Activity Log for all subscriptions
- Export/stream to centralized analytics + archive
B) Microsoft Entra ID (Azure AD) logs (identity is Tier 0)
You want:
- Sign-in logs
- Audit logs (directory changes, app/service principal changes)
Identity is the most common starting point for incidents.
C) Azure Resource Graph / change history (optional but helpful)
For “what changed where” queries.
D) Azure Policy events (policy changes + compliance drift)
Because policy changes can create “silent permission expansion.”
E) NSG flow logs (companion to investigate network anomalies)
Good for confirming exposure or exfil patterns.
Part 2 — Centralize logs (the “log archive account” pattern)
AWS recommended pattern
- Create a dedicated Log Archive account (separate from workloads)
- Send CloudTrail logs to an S3 bucket in Log Archive
- Lock it down so workload accounts can’t delete or reduce retention
- Send to your SIEM/search layer for hot queries
Azure recommended pattern
- Use a centralized Log Analytics workspace (security workspace)
- Export Activity Logs + Entra logs into it
- Configure long-term archive to storage (immutable policies if possible)
- Lock down who can change diagnostics/export settings
Why this matters: an attacker who compromises prod admin will try to delete evidence. Your design must assume that.
Part 3 — Tier 0 “What to log” list (AWS + Azure mapping)
Below are the highest-impact audit events you should log and alert on immediately.
1) Identity & privilege escalation (critical)
AWS events to alert on
- New IAM user created
- New access keys created
- Policy attached/detached to a user/role
- Role trust policy changed (who can assume role)
- MFA disabled or removed
- Root account activity (especially access key creation, login)
Example scenario
A developer role suddenly gains AdministratorAccess at 1:48 AM.
That’s not “ops.” That’s an incident until proven otherwise.
Azure events to alert on
- Role assignments created/updated (RBAC)
- Privileged role assignments (Owner, Contributor, User Access Administrator)
- Service principal credentials added (client secret/cert)
- New enterprise app or app consent granted with broad permissions
- Conditional Access policies modified
- MFA methods removed / security info changed (if tracked)
Example scenario
A service principal gets Owner role on a subscription.
Even if intended, this should alert and create an approval trail.
2) “Covering tracks” changes (critical)
These are the highest-signal alerts in any cloud.
AWS
- CloudTrail stopped, deleted, updated
- S3 bucket policy changed for the CloudTrail log bucket
- CloudTrail log file validation disabled
- KMS key policy changes protecting log bucket
- Config recorder disabled
Azure
- Diagnostic settings removed/changed (Activity Log export disabled)
- Log Analytics workspace retention reduced
- Storage account used for archive deleted or access reduced
- Sentinel/SIEM connectors disabled (if using)
Example scenario
Logging disabled for 6 minutes during a suspicious admin session.
Treat that as “active intrusion” until you know otherwise.
3) Network exposure (critical)
AWS
- Security group inbound opened broadly (0.0.0.0/0 or ::/0)
- NACL rules loosened
- Internet Gateway attached
- Route table changes that alter egress path
- New public load balancer listener
- VPC peering changes
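As a worked example for the first bullet, here is a hedged sketch that flags a CloudTrail `AuthorizeSecurityGroupIngress` event opening a rule to the world. The nested layout follows CloudTrail's `requestParameters` shape for this API, but verify it against your own events before relying on it.

```python
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}

def sg_opened_to_world(event: dict) -> bool:
    """True if this CloudTrail event opens a security group to any IP."""
    if event.get("eventName") != "AuthorizeSecurityGroupIngress":
        return False
    params = event.get("requestParameters", {})
    for perm in params.get("ipPermissions", {}).get("items", []):
        ranges = perm.get("ipRanges", {}).get("items", [])
        ranges += perm.get("ipv6Ranges", {}).get("items", [])
        for r in ranges:
            if r.get("cidrIp") in WORLD_CIDRS or r.get("cidrIpv6") in WORLD_CIDRS:
                return True
    return False
```

Combine this with a port-severity lookup so SSH/RDP exposures page immediately while app-port exposures file a ticket.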
Azure
- NSG inbound rule opened broadly
- Public IP created/attached to sensitive resources
- Firewall rules changed (Azure Firewall, app gateways)
- Route table changes, peering changes
- Key management ports exposed
Example scenario
Port 22 or 3389 opened to the world on any resource in prod.
Immediate action required.
4) Data exposure (critical for crown jewels)
AWS
- S3 bucket policy changed to public access
- Public access block disabled
- KMS encryption removed from bucket
- RDS public accessibility enabled
- Snapshot shared publicly or with unexpected account
Azure
- Storage container made public
- SAS tokens created with broad permissions (if logged/trackable)
- Key Vault access policies expanded
- SQL firewall rules opened broadly / public endpoint enabled
Example scenario
A storage bucket/container becomes public “by mistake.”
You want an alert within seconds—not after someone finds it on the internet.
Part 4 — Tier 0 Alert Pack (ready to implement)
I’ll give you a minimum set of alerts that provides maximum security coverage with low noise.
AWS Tier 0 Alerts (15 high-signal rules)
- Root account sign-in (any root login)
- Root access key created
- CloudTrail stop/delete/update
- Config disabled (recorder/aggregator stopped)
- IAM user created (outside automation allowlist)
- Access keys created (especially for privileged users)
- AdministratorAccess attached (or equivalent policy)
- Role trust policy changed (AssumeRole policy updated)
- MFA device removed/deactivated
- S3 bucket made public or PublicAccessBlock disabled
- KMS key policy changed (especially logs + secrets keys)
- Security group opened to world on sensitive ports
- RDS made publicly accessible
- Unusual region activity (write actions in a region you don’t use)
- Mass deletion burst (lots of deletes in short window)
Practical filter tip: Maintain an allowlist of known automation roles (Terraform/CI) and still alert if they do Tier 0 actions outside change windows.
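The filter tip above can be sketched as a small routing function: known automation roles are suppressed only inside approved change windows, and everything else on a Tier 0 action alerts. The role names and window hours are assumptions for illustration.

```python
# Assumed allowlist and change window; adapt to your org.
AUTOMATION_ROLES = {"terraform-apply", "ci-deployer"}
CHANGE_WINDOW_HOURS = range(9, 18)  # 09:00-17:59 local time

def should_alert(actor_role: str, action_tier: int, hour_of_day: int) -> bool:
    """Decide whether a Tier 0 action should page."""
    if action_tier != 0:
        return False  # this filter only routes Tier 0 actions
    in_window = hour_of_day in CHANGE_WINDOW_HOURS
    if actor_role in AUTOMATION_ROLES and in_window:
        return False  # known automation during an approved window
    return True       # humans, unknown actors, or out-of-window automation
```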
Azure Tier 0 Alerts (15 high-signal rules)
- Owner role assignment created at subscription/resource group scope
- User Access Administrator role assigned (RBAC management power)
- New service principal created or app registration created
- Credentials added to service principal (secret/cert)
- Conditional Access policy changed
- Entra ID admin role assigned (privileged directory roles)
- Activity Log export/diagnostic settings disabled
- Log Analytics retention reduced / workspace settings changed
- Storage container set to public / access level changed
- Key Vault access policy expanded / RBAC permission expanded
- NSG rule opened to world on sensitive ports
- Public IP attached to sensitive workloads
- SQL firewall opened broadly / public endpoint enabled
- Unusual location sign-in for privileged identities
- Sign-in risk / multiple failed logins burst (identity attack pattern)
Part 5 — Make alerts actionable (the template that stops alert fatigue)
Every alert should include:
A) What changed
Example: “Security group inbound rule updated: added 0.0.0.0/0 on port 22”
B) Who did it
Actor identity + role + auth type (MFA? key? service principal?)
C) Where
Account/subscription + region + resource name
D) Why this matters (one line)
“Exposes remote administration to the internet.”
E) What to do now (3 steps)
- Revert change (or isolate resource)
- Validate actor is legitimate (ticket, pipeline, approval)
- Search for follow-on actions by same actor for next 15 minutes
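The A-E template is worth enforcing in code so every alert ships the same context. A small formatter like the sketch below (field names are illustrative) makes it hard to emit an alert without the "what to do now" section.

```python
def format_alert(what: str, who: str, where: str, why: str,
                 steps: list) -> str:
    """Render an alert following the A-E template: what/who/where/why/next."""
    lines = [
        f"WHAT CHANGED: {what}",
        f"WHO: {who}",
        f"WHERE: {where}",
        f"WHY IT MATTERS: {why}",
        "WHAT TO DO NOW:",
    ]
    lines += [f"  {i}. {step}" for i, step in enumerate(steps, 1)]
    return "\n".join(lines)
```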
This prevents the classic “alert fired… nobody knew what to do… so it got ignored.”
Part 6 — Retention (AWS + Azure) you can use without guessing
A practical retention plan (works for most orgs)
Hot (fast searchable)
- 90 days for all Tier 0 + Tier 1 audit logs
Warm (searchable but cheaper)
- 12 months for management/control plane logs (CloudTrail management / Azure Activity)
Cold (archive)
- 3–7 years for Tier 0 events and identity logs (Entra audit/sign-in, root activity, RBAC changes)
- Keep longer if your industry requires it
How to control cost without losing security
- Keep Tier 0 fully searchable for 90 days
- Move everything else older than 90 days to cheaper storage/index tiers
- Enable data access logs only for crown jewels (S3/Storage/Key Vault/etc.)
- Reduce noise by filtering high-frequency read events unless required
Part 7 — Dashboard layout (what security + engineering will actually use)
Create one dashboard per cloud, but keep the same sections so teams learn it once.
Section 1: “Today’s Risk”
- Count of Tier 0 events today
- Highest severity events (top 10)
- Logging tamper events count
Section 2: Identity
- New privileged assignments (last 24h)
- New keys/secrets created (last 24h)
- Unusual sign-ins (geo, device, IP)
Section 3: Perimeter changes
- New public endpoints
- Firewall/SG/NSG rules opened broadly
- Route/gateway changes
Section 4: Data exposure
- Public storage changes
- Encryption/key policy changes
- DB public endpoint changes
Section 5: “Top Actors”
- Top identities performing changes (last 24h)
- New actors never seen before (last 7 days)
Section 6: “Unallocated/Unknown”
- Events from identities with no clear owner tag/label
- Resources modified without ownership metadata
This dashboard makes investigations fast and keeps your team curious because it constantly answers: “what changed, and should I care?”
Part 8 — The “Starter Plan” (5 days to a working system)
Day 1: Enable Tier 0 sources
- AWS: org-level CloudTrail management events (all regions)
- Azure: Activity Logs + Entra sign-in & audit logs
Day 2: Centralize + lock down
- AWS: Log Archive account + protected S3 bucket + restricted delete
- Azure: Central Log Analytics + archive to storage + restricted settings change
Day 3: Implement Tier 0 alerts (start with 8)
AWS first 8:
- root login, root access key created, CloudTrail changed, admin policy attached, access key created, SG opened broadly, S3 made public, unusual region activity
Azure first 8:
- Owner role assigned, SP credential added, diagnostics disabled, NSG open, storage public, Key Vault policy expanded, unusual sign-in, SQL firewall broad
Day 4: Test alerts
Do safe test actions in a sandbox subscription/account and confirm:
- alert fires
- contains actor + target + change + next steps
Day 5: Add weekly review routine
30 minutes:
- review Tier 0 list
- close false positives by allowlisting known automation identities
- create 1–3 backlog items to reduce recurring risk