
🔐 Data Security

Protecting data throughout its lifecycle — classification, encryption at rest/in transit/in use, data loss prevention, tokenization, backup strategies, and data governance.

Overview

Data security focuses on protecting digital information from unauthorized access, corruption, or theft throughout its entire lifecycle. It encompasses the policies, procedures, and technologies used to ensure data confidentiality, integrity, and availability — the CIA triad. With increasing regulations (GDPR, PCI-DSS, HIPAA) and the expanding attack surface, robust data security is critical for every organization.

Key Concepts

Backup & Recovery

  • 3-2-1 Rule: 3 copies of data, on 2 different media types, with 1 offsite/cloud copy.
  • RPO (Recovery Point Objective): Maximum acceptable data loss; determines backup frequency.
  • RTO (Recovery Time Objective): Maximum acceptable downtime; determines recovery strategy.
  • Immutable backups: Protect against ransomware (WORM storage, air-gapped backups).
  • Restoration testing: Test restores regularly; untested backups are not backups.
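The 3-2-1 rule and the RPO can both be checked mechanically against a backup inventory. The following is a minimal sketch under stated assumptions: the `BackupCopy` record, its field names, and the sample inventory are hypothetical, and the list of copies is assumed to include the production copy.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class BackupCopy:
    location: str       # hypothetical label, e.g. "primary-dc"
    media: str          # e.g. "disk", "tape", "object-storage"
    offsite: bool       # True if stored away from the primary site
    taken_at: datetime  # when this copy was last refreshed


def meets_3_2_1(copies):
    """At least 3 copies, on at least 2 media types, with at least 1 offsite."""
    media_types = {c.media for c in copies}
    return (
        len(copies) >= 3
        and len(media_types) >= 2
        and any(c.offsite for c in copies)
    )


def meets_rpo(copies, rpo, now):
    """True if the newest copy is no older than the RPO."""
    if not copies:
        return False
    newest = max(c.taken_at for c in copies)
    return now - newest <= rpo


# Illustrative inventory (made-up values).
now = datetime(2024, 1, 1, 12, 0)
copies = [
    BackupCopy("primary-dc", "disk", False, now - timedelta(hours=2)),
    BackupCopy("primary-dc", "tape", False, now - timedelta(hours=20)),
    BackupCopy("cloud-bucket", "object-storage", True, now - timedelta(hours=3)),
]
print(meets_3_2_1(copies), meets_rpo(copies, timedelta(hours=4), now))
```

A check like this only confirms that copies exist and are recent; it says nothing about whether a restore actually works, which is why restoration testing stays a separate control.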

Data Classification

Categorizing data by sensitivity and business impact. Common levels: Public (no restrictions), Internal (business use only), Confidential (restricted access — PII, financial data), and Restricted/Secret (highest protection — trade secrets, health records, encryption keys). Classification drives encryption requirements, access controls, retention policies, and handling procedures. Automated tools like Microsoft Purview, Varonis, and BigID help discover and classify data at scale.

Data Governance

Policies and processes for managing data as a strategic asset. Data lifecycle: Creation → Storage → Use → Sharing → Archival → Destruction. Key elements: Data ownership assignment, retention and disposal schedules, privacy impact assessments, data lineage tracking, consent management (GDPR), and cross-border transfer rules (SCCs, BCRs). Tools: Collibra, Alation, Microsoft Purview Governance.

Data Loss Prevention (DLP)

Preventing unauthorized data exfiltration across three vectors:

  • Endpoint DLP: Monitor clipboard, USB, print, and screen capture on endpoints (Microsoft Defender for Endpoint, Symantec DLP).
  • Network DLP: Inspect traffic leaving the network for sensitive patterns such as SSNs, credit card numbers, and source code (Palo Alto, Zscaler).
  • Cloud DLP: Monitor SaaS and cloud storage (Microsoft Purview DLP, Google Cloud DLP, Netskope).

DLP policies use regex patterns, machine learning classifiers, and fingerprinting to detect sensitive data.
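The regex-based detection mentioned above can be sketched as follows. This is an illustration only, not how any named DLP product works: the patterns, the `find_sensitive` helper, and the sample text are assumptions. A Luhn checksum is added because a bare digit pattern produces many false positives for card numbers.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")
# US SSN in the common 3-2-4 hyphenated layout.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive(text: str):
    """Return (kind, matched_text) pairs for likely card numbers and SSNs."""
    findings = []
    for m in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_valid(digits):
            findings.append(("credit_card", m.group().strip()))
    for m in SSN_PATTERN.finditer(text):
        findings.append(("ssn", m.group()))
    return findings


sample = "card 4111 1111 1111 1111, SSN 123-45-6789, ref 1234 5678 9012 3456."
print(find_sensitive(sample))
```

The reference number in the sample matches the digit pattern but fails the Luhn check, so it is not reported. Production DLP engines layer context keywords, classifiers, and fingerprinting on top of this kind of pattern matching.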

Encryption

  • At Rest: AES-256 encryption for databases, file systems, and backups using KMS (AWS KMS, Azure Key Vault, Cloud KMS). Full-disk encryption (BitLocker, LUKS).
  • In Transit: TLS 1.3 for all network communication, certificate pinning for APIs, mTLS for service-to-service.
  • In Use: Confidential computing with hardware enclaves (Intel SGX, AMD SEV), homomorphic encryption for processing encrypted data.
  • Key Management: HSM-backed keys, automatic rotation, separation of duties, and key escrow procedures.

Tokenization & Masking

  • Tokenization: Replacing sensitive data with random tokens that have no mathematical relationship to the original value; a token can be mapped back to the original only through a secure vault. Used for PCI-DSS compliance: tokenize credit card numbers so they never touch application code.
  • Data Masking: Replacing real data with realistic but fake data for non-production environments. Static masking for dev/test databases, dynamic masking for real-time query results.

Tools: Voltage, Protegrity, Delphix.
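A vault-based tokenizer and a simple masking function can be sketched in a few lines. This is a toy illustration, not the design of any tool named above: `TokenVault`, the `tok_` prefix, and `mask_pan` are hypothetical names, and a real vault would be a hardened, access-controlled service rather than an in-memory dictionary.

```python
import secrets


class TokenVault:
    """Toy in-memory vault: random tokens, with the mapping stored only here."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same value always maps to one token.
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(16)  # random, not derived from value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with vault access can recover the original value.
        return self._token_to_value[token]


def mask_pan(pan: str) -> str:
    """Mask all but the last four digits of a card number."""
    digits = [c for c in pan if c.isdigit()]
    return "*" * (len(digits) - 4) + "".join(digits[-4:])


vault = TokenVault()
token = vault.tokenize("4111111111111111")
print(token, mask_pan("4111 1111 1111 1111"))
```

Because the token is generated randomly, nothing about the card number can be computed from it; recovery depends entirely on access to the vault mapping.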

Three States of Data

Data exists in three states — at rest, in transit, and in use. Each state requires different security controls and encryption strategies. A comprehensive data security program must protect data across all three states.

💾 Data at Rest

Data stored on disk, databases, backup media, or cloud storage — not actively being transmitted or processed.

🔒 Encryption Methods

  • AES-256: Industry standard symmetric encryption for databases, files, and volumes
  • Full Disk Encryption (FDE): BitLocker (Windows), FileVault (macOS), LUKS (Linux)
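  • Database Encryption: TDE (Transparent Data Encryption) in SQL Server, Oracle, and MySQL; community PostgreSQL has no built-in TDE, so it typically relies on vendor distributions, extensions, or volume-level encryption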
  • File-Level Encryption: Per-file encryption for selective protection — VeraCrypt, 7-Zip AES
  • Cloud Storage: SSE-S3, SSE-KMS, SSE-C (AWS); Azure Storage Service Encryption; Google CMEK/CSEK

🛡️ Best Practices

  • Key Management: Use HSM-backed KMS (AWS KMS, Azure Key Vault, GCP Cloud KMS)
  • Key Rotation: Automatic rotation every 90-365 days; immediate rotation on compromise
  • Separation of Duties: Key custodians ≠ data administrators ≠ security team
  • Backup Encryption: Encrypt ALL backups — immutable/WORM storage for ransomware protection
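  • Secure Deletion: Cryptographic erasure, overwrite-based wiping (NIST SP 800-88 is the current media sanitization guidance; DoD 5220.22-M is a legacy overwrite pattern), physical destruction for decommissioned media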
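The key rotation practice above lends itself to a simple age check. The sketch below assumes a hypothetical inventory that maps key IDs to creation dates and uses the upper end of the 90-365 day range as the limit; a managed KMS would normally report key age and rotate keys itself.

```python
from datetime import date, timedelta

MAX_KEY_AGE = timedelta(days=365)  # upper end of the 90-365 day range


def keys_due_for_rotation(keys, today, max_age=MAX_KEY_AGE):
    """Return the sorted IDs of keys older than max_age.

    `keys` maps a key ID to its creation date (hypothetical inventory format).
    """
    return sorted(
        key_id for key_id, created in keys.items() if today - created > max_age
    )


# Illustrative inventory (made-up key IDs and dates).
inventory = {
    "db-key": date(2023, 1, 15),
    "backup-key": date(2024, 2, 1),
    "logs-key": date(2023, 9, 1),
}
print(keys_due_for_rotation(inventory, today=date(2024, 6, 1)))
```

Rotation on compromise is a separate trigger and should not wait for an age check like this one.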

🔄 Data in Transit

Data actively moving between systems — over the network, internet, APIs, or between services. Most vulnerable to interception and man-in-the-middle attacks.

🔒 Protocols & Encryption

  • TLS 1.3: Latest standard, with a faster handshake, forward secrecy by default, and legacy options removed (RC4, CBC-mode cipher suites, SHA-1-based MACs, static RSA key exchange)
  • mTLS (Mutual TLS): Both client and server authenticate — essential for service-to-service (microservices, service mesh)
  • IPsec VPN: Network-layer encryption for site-to-site and remote access VPNs (IKEv2, ESP)
  • SSH/SFTP: Secure remote access and file transfer — SSH keys over passwords, certificate-based auth
  • HTTPS Everywhere: Enforce HTTPS via HSTS headers, certificate pinning for mobile apps, TLS termination at load balancer
  • API Security: OAuth 2.0 + JWT tokens, API gateway TLS enforcement, certificate-based API authentication

🛡️ Best Practices

  • Certificate Management: Automated renewal (Let's Encrypt, ACME), certificate inventory, expiration monitoring
  • Perfect Forward Secrecy (PFS): ECDHE key exchange — compromised private key doesn't decrypt past traffic
  • Disable Legacy Protocols: No SSLv3, TLS 1.0/1.1; enforce TLS 1.2+ minimum
  • Network Segmentation: Encrypt east-west traffic between VLANs/segments, not just north-south
  • Email Encryption: S/MIME or PGP for sensitive emails; TLS for SMTP (STARTTLS enforcement)
  • DNS Security: DNSSEC, DNS-over-HTTPS (DoH), DNS-over-TLS (DoT)
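On the client side, the "TLS 1.2+ minimum" and certificate verification practices above can be expressed with Python's standard `ssl` module. This is a minimal sketch: the `make_client_context` helper name is an assumption, and the HSTS header value shown is one common choice rather than a requirement from this page.

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client TLS context that verifies certificates and refuses TLS < 1.2."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Reject SSLv3, TLS 1.0, and TLS 1.1 handshakes.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


# Example HSTS response header a server could send to enforce HTTPS in browsers.
HSTS_HEADER = ("Strict-Transport-Security", "max-age=31536000; includeSubDomains")

ctx = make_client_context()
print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

The context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=ctx)`.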

⚡ Data in Use

Data actively being processed in memory (RAM), CPU caches, or registers. The hardest state to protect — traditionally data must be decrypted to be processed. Confidential computing is changing this.

🔒 Protection Technologies

  • Confidential Computing: Hardware-based Trusted Execution Environments (TEEs); data is decrypted and processed only inside a hardware-isolated enclave whose memory stays encrypted to the rest of the system
  • Intel SGX: Software Guard Extensions — create encrypted memory enclaves, even OS/hypervisor cannot access
  • AMD SEV/SEV-SNP: Secure Encrypted Virtualization — encrypts entire VM memory with per-VM keys
  • ARM TrustZone: Hardware isolation between secure and non-secure worlds — used in mobile devices
  • Homomorphic Encryption (HE): Compute on encrypted data without decrypting — fully homomorphic (slow but maturing), partially homomorphic (practical for specific operations)
  • Secure Multi-Party Computation (MPC): Multiple parties jointly compute a function without revealing their individual inputs

🛡️ Best Practices

  • Memory Protection: ASLR (Address Space Layout Randomization), DEP/NX (Data Execution Prevention)
  • Process Isolation: Containers, sandboxing, mandatory access controls (SELinux, AppArmor)
  • Credential Guard: Windows Credential Guard uses virtualization-based security to keep LSA secrets in an isolated process that the normal OS cannot read
  • Memory Encryption: Total Memory Encryption (TME/MKTME) — encrypt all system memory transparently
  • Minimize Exposure: Decrypt data only when absolutely necessary, clear sensitive data from memory immediately after use
  • Cloud Confidential VMs: Azure Confidential VMs, GCP Confidential VMs, AWS Nitro Enclaves
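The "Minimize Exposure" practice above (clear sensitive data from memory after use) can be approximated in Python by holding secrets in a mutable `bytearray` and overwriting it. This is a best-effort sketch only: the helper names are hypothetical, and the Python runtime may still hold other copies of the data (for example, if the secret was ever a `str` or `bytes` object), so this does not give the guarantees of a hardware enclave.

```python
def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer with zeros in place."""
    for i in range(len(buf)):
        buf[i] = 0


def use_secret(secret: bytearray) -> int:
    """Process a secret, then zero it even if processing raises."""
    try:
        # Placeholder for real work on the decrypted value.
        return sum(secret) % 256
    finally:
        wipe(secret)


secret = bytearray(b"hunter2")
checksum = use_secret(secret)
print(checksum, secret)
```

The `try`/`finally` structure is the main point: the buffer is cleared on every exit path, not only on success.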

Data Security Architecture

📋 Discovery & Classification (Identify & Label)
↓
🔒 Protection (Encrypt + Tokenize + Mask)
↓
🛡️ Prevention (DLP + Access Control + RBAC)
↓
📡 Detection (Monitoring + Anomaly Detection + UEBA)
↓
💾 Recovery (Backup + DR + Immutable Storage)

Data Security Lifecycle

Layered defense from discovery to recovery — know your data, protect it, prevent loss, detect threats, and recover

Data Breach Response Lifecycle

A structured breach response minimizes damage, meets regulatory obligations, and preserves evidence for investigation. Every organization should have a tested Incident Response Plan (IRP) before a breach occurs.

🚨 Detection & Identification — SIEM alerts, DLP triggers, user reports, threat intel, dark web monitoring
↓
🔒 Containment — Isolate affected systems, revoke compromised credentials, block exfiltration channels, preserve evidence
↓
🔍 Investigation — Forensic analysis, scope assessment, root cause analysis, determine data types & records affected
↓
📢 Notification — Notify regulators (GDPR 72hrs, SEC 4 business days), affected individuals, law enforcement, and cyber insurance carrier
↓
🔧 Eradication & Recovery — Patch vulnerabilities, rebuild compromised systems, restore from clean backups, reset credentials
↓
📋 Lessons Learned — Post-incident review, update IR playbook, improve detection rules, conduct tabletop exercises

NIST SP 800-61 Incident Response Framework

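NIST SP 800-61 Rev. 2 groups incident response into four phases: Preparation → Detection & Analysis → Containment, Eradication & Recovery → Post-Incident Activity (lessons learned). Regular tabletop exercises ensure the team can execute under pressure.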

Breach Notification Requirements

Different regulations impose different notification timelines and requirements. Missing a notification deadline can result in additional fines on top of the breach penalty itself.

Regulation | ⏱️ Timeline | 📋 Who to Notify | 💰 Max Penalty
GDPR (EU) | 72 hours to DPA | Data Protection Authority + affected individuals (if high risk) | €20M or 4% of global annual revenue, whichever is higher
SEC Rule (US) | 4 business days after determining the incident is material | SEC via 8-K filing + investors | Enforcement actions + lawsuits
HIPAA (US Healthcare) | 60 days to HHS | HHS + affected individuals + media (if more than 500 residents of a state or jurisdiction) | $1.5M per violation category per year
PCI-DSS | Immediately | Payment brands (Visa, MC) + acquiring bank | $100K–$500K/month + loss of processing
US State Laws | 30–90 days (varies) | State AG + affected residents | $750/consumer (CCPA) + AG penalties
FFIEC / Banking | ASAP / 36 hours (OCC) | Primary regulator (OCC/FDIC/Fed) + customers | Consent orders + enforcement actions
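The timelines in the table can be turned into concrete deadlines once a trigger time is known. The sketch below is a simplification under stated assumptions: it uses a single trigger timestamp for all three rules (in practice GDPR runs from awareness, the SEC rule from the materiality determination, and HIPAA from discovery), it ignores public holidays, and the function names are hypothetical. It is not legal advice.

```python
from datetime import datetime, timedelta


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by N weekdays (Mon-Fri); public holidays are NOT handled."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def notification_deadlines(trigger: datetime) -> dict:
    """Deadlines for three rules from the table, using one shared trigger time."""
    return {
        "GDPR": trigger + timedelta(hours=72),
        "SEC": add_business_days(trigger, 4),
        "HIPAA": trigger + timedelta(days=60),
    }


# Thursday, 7 March 2024, 09:00 (illustrative trigger time).
deadlines = notification_deadlines(datetime(2024, 3, 7, 9, 0))
for rule, due in deadlines.items():
    print(rule, due.isoformat())
```

Note how the GDPR clock runs through the weekend while the SEC clock skips it, which is why the two deadlines fall on different days despite similar nominal lengths.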

Notable Data Breaches — Lessons Learned

Colonial Pipeline (2021)

Critical infrastructure shutdown. Ransomware (DarkSide) via compromised VPN password (no MFA). Led to US East Coast fuel shortage. Paid $4.4M ransom. Root cause: Legacy VPN without MFA, flat network, no segmentation between IT and OT. Lesson: MFA everywhere (especially VPN), network segmentation IT/OT, immutable backups, incident response drills for critical infrastructure.

Equifax (2017)

147M records. Unpatched Apache Struts vulnerability (CVE-2017-5638) exploited 2 months after patch was available. Attackers exfiltrated data for 76 days undetected. Root cause: Failed patch management + expired SSL certificate on monitoring tool (allowed exfiltration to go unnoticed). Lesson: Patch critical vulns within 48 hours, maintain certificate inventory, segment sensitive databases, monitor outbound traffic.

IBM Cost of a Breach Report

2024 average cost: $4.88M per breach. Key findings: Average time to identify + contain: 258 days. Breaches involving stolen credentials: 292 days. Cost reducers: AI & automation saved $2.2M, DevSecOps saved $1.7M, incident response planning saved $1.5M. Cost multipliers: Cloud migration breaches +$750K, skills shortage +$1.8M, compliance failures +$1.6M. Lesson: Invest in detection speed, automate response, and train incident response teams.

MOVEit (2023)

2,500+ organizations. Zero-day SQL injection in MOVEit Transfer file-sharing software exploited by Cl0p ransomware group. Mass exfiltration before patches available. Root cause: SQLi vulnerability in internet-facing file transfer appliance. Lesson: Minimize internet-facing attack surface, WAF for file transfer apps, monitor file transfer logs for bulk downloads, incident response readiness.

SolarWinds (2020)

18,000+ organizations. Nation-state supply chain attack — malicious code injected into Orion software build process (SUNBURST backdoor). Went undetected for 9+ months. Root cause: Compromised CI/CD pipeline, weak build security. Lesson: Secure build pipelines, implement SBOM, code signing, monitor for anomalous DNS/network behavior, Zero Trust architecture.

T-Mobile (2021–2023)

76M+ records across multiple breaches. Repeated breaches via API exploitation, credential stuffing, and insider threats. Root cause: Insufficient API security, inadequate access controls, lack of monitoring. Lesson: API security testing, rate limiting, UEBA for insider threat detection, breach is not a one-time event — continuous improvement is essential.

🎯 Data Plane Attack Tree — 7-Stage Threat Analysis

Data is the ultimate target of most cyber attacks. Attackers may compromise identities, APIs, or infrastructure — but their final goal is unauthorized access to sensitive data. This attack tree maps 7 stages where data can be exposed, accessed, manipulated, or exfiltrated.

Stage | Attack Surface | Key Risks
1. Data Creation | Where data is born | Sensitive data stored without classification, unstructured data proliferation, shadow datasets, sensitive logs, AI/analytics datasets with PII, improper data tagging, test data with production PII, over-collection beyond business need
2. Data Discovery & Exposure | Where data is found | Exposed storage buckets/databases, shadow data stores, unindexed assets, catalog misconfiguration, metadata leakage, search/index system exposure without restrictions, backup and snapshot discovery
3. Data Storage | Where data lives | Public object storage exposure, misconfigured database access policies, unencrypted storage systems, backup storage exposure, snapshot exposure, misconfigured file permissions, data lake misconfiguration, cross-account storage exposure. Encryption risks: KMS misconfiguration, improper key rotation, shared/hardcoded encryption keys
4. Data Access Controls | Who can touch data | Over-privileged database roles, weak ACLs, shared database credentials, token or API key misuse, service account abuse (accounts with too many rights), broken object-level authorization, privilege escalation through database roles, lack of row/column-level security
5. Data Processing | Where data is transformed | Compromised ETL pipelines, malicious transformations, data pipeline injection, unauthorized analytics queries, compromised processing clusters, SQL query manipulation, Spark/big data cluster exploitation. Inference & Query Abuse: sensitive data inference via queries, aggregation-based leakage, model-driven data exposure
6. Data Sharing & Distribution | Where data travels | API data overexposure, partner integration misuse, unrestricted data exports, public data sharing links, data replication misconfiguration, cross-region data exposure, third-party leakage (analytics chains), webhook/event-driven leakage
7. Impact & Exfiltration | Where data leaves | Mass data exfiltration, sensitive dataset extraction, customer data theft, intellectual property theft, regulated data exposure, data manipulation or corruption, data destruction/ransomware, stealth exfiltration over trusted channels

🛡️ Defense Controls & Architecture

🛡️ Access Control

Fine-grained RBAC/ABAC, row/column-level security, least privilege enforcement, and data monitoring/detection with access logging and anomaly detection.
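Row- and column-level security can be illustrated with a small in-application filter. This is a sketch of the idea only: the roles, policy table, and record fields are hypothetical, and in practice these controls are usually enforced by the database itself rather than in application code.

```python
# Hypothetical per-role policies: which columns are visible, which rows match.
ROLE_POLICIES = {
    "support": {
        "columns": {"id", "name", "region"},
        "row_filter": lambda row: row["region"] == "EU",
    },
    "analyst": {
        "columns": {"id", "region"},
        "row_filter": lambda row: True,
    },
}


def query(rows, role):
    """Return only the rows and columns the role is allowed to see."""
    policy = ROLE_POLICIES.get(role)
    if policy is None:
        # Least privilege: unknown roles get nothing.
        raise PermissionError(f"role {role!r} has no data access policy")
    return [
        {k: v for k, v in row.items() if k in policy["columns"]}
        for row in rows
        if policy["row_filter"](row)
    ]


customers = [
    {"id": 1, "name": "Ana", "region": "EU", "ssn": "123-45-6789"},
    {"id": 2, "name": "Bo", "region": "US", "ssn": "987-65-4321"},
]
print(query(customers, "support"))
print(query(customers, "analyst"))
```

Neither role ever receives the `ssn` column, and a role without a policy is denied outright instead of falling back to full access.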

📋 Data Governance

Classification, ownership assignment, and data minimization. Know what data you have, where it lives, and who owns it. Without governance, all other controls are guesswork.

🚫 Data Loss Prevention

DLP scanning and policy enforcement, tokenization/masking of sensitive fields, sensitive data detection across endpoints, network, and cloud. Controlled APIs with partner integration governance.

🔍 DSPM

Data Security Posture Management — sensitive data discovery, exposure risk identification, object storage scanning, and inventory mapping. Tools: Varonis, BigID, Normalyze, Sentra.

🔐 Encryption & Key Mgmt

Encryption at rest/transit, KMS configuration, key rotation, access control on keys, and separation of duties. Never share or hardcode encryption keys.

🚨 Incident Response

Breach detection, automated containment, forensic tracking, and backup protection. Treat every data plane anomaly as a potential exfiltration event until proven otherwise.

💡 Interview Question

Walk me through the 7 stages of a Data Plane Attack Tree — how would you defend an organization's data at each stage?

The Data Plane Attack Tree maps 7 stages where data is vulnerable:

1. DATA CREATION
  • Biggest risk is data born without classification
  • Implement automated classification at creation — Microsoft Purview, BigID
  • Enforce data minimization policies — don't collect what you don't need
  • Scan for PII in test/dev environments
2. DATA DISCOVERY & EXPOSURE
  • Attackers search for exposed storage buckets, shadow databases, and misconfigured catalogs
  • Defense: continuous DSPM scanning (Varonis, Normalyze), automated detection of public S3 buckets/Azure blobs, metadata access controls for data catalogs
3. DATA STORAGE
  • Encrypt everything — AES-256 at rest, KMS-managed keys with automatic rotation
  • Never hardcode encryption keys
  • Audit storage permissions weekly
  • Cross-account storage access should be exception-based and logged
4. DATA ACCESS CONTROLS
  • Implement least privilege at the database level — row/column-level security, not just table-level
  • Eliminate shared database credentials
  • Service accounts get minimum permissions with automated rotation
  • Monitor for privilege escalation through database roles
5. DATA PROCESSING
  • Secure ETL pipelines with integrity checks
  • Validate data transformations and detect injection in pipeline parameters
  • Monitor analytics queries for inference attacks — when someone runs queries that individually look innocent but together reconstruct sensitive data
6. DATA SHARING
  • API response filtering — never return more data than needed
  • Audit partner integrations quarterly
  • Disable unrestricted data exports
  • Monitor webhook payloads for sensitive data leakage
  • Implement data residency controls for cross-region flows
7. IMPACT & EXFILTRATION
  • DLP at all egress points — network, endpoint, cloud
  • Monitor for bulk data transfers, unusual download patterns, and exfiltration over trusted channels (DNS, HTTPS to legitimate-looking domains)
  • Immutable backups protect against ransomware/destruction

KEY TAKEAWAY: Protecting the data plane requires strong data governance, fine-grained access controls, secure data pipelines, continuous monitoring, and DLP at every boundary.
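
Monitoring for bulk data transfers, as described for stage 7, can start with a simple per-user baseline. The sketch below is one possible heuristic, not a product feature: the z-score threshold, the 100 MB floor, and the function name are all assumptions, and real UEBA tools use far richer models.

```python
from statistics import mean, stdev


def is_bulk_transfer(history_mb, today_mb, z_threshold=3.0, floor_mb=100.0):
    """Flag today's download volume if it is large and far above the baseline.

    history_mb: past daily download volumes for one user (hypothetical data).
    """
    if today_mb < floor_mb:
        return False  # ignore small volumes regardless of the baseline
    if len(history_mb) < 2:
        return True   # no usable baseline: treat a large transfer as suspicious
    mu, sigma = mean(history_mb), stdev(history_mb)
    if sigma == 0:
        return today_mb > mu
    return (today_mb - mu) / sigma > z_threshold


baseline = [20, 25, 22, 30, 18, 24, 21]  # MB per day, made-up values
print(is_bulk_transfer(baseline, 900))   # far above baseline
print(is_bulk_transfer(baseline, 28))    # below the floor
```

The floor keeps users with tiny, steady baselines from triggering alerts on trivially small increases; the z-score catches large jumps relative to each user's own history.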

Interview Preparation

💡 Interview Question

How would you implement a data classification program?

1) Define classification levels (Public, Internal, Confidential, Restricted) with clear criteria.

2) Assign data owners for each data domain.

3) Deploy automated discovery tools (Microsoft Purview, Varonis) to scan repositories.

4) Label data with metadata tags (sensitivity, retention, jurisdiction).

5) Map classification to security controls: encryption requirements, access levels, DLP policies.

6) Train employees on handling procedures for each level.

7) Audit regularly and measure: percentage of data classified, policy violations, and remediation time.
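
Step 5, mapping classification to controls, can be captured as a lookup table. The sketch below uses the four levels named above; the specific control values are illustrative assumptions, not a standard, and `required_controls` is a hypothetical helper that applies the strictest level present in a dataset.

```python
from enum import IntEnum


class Level(IntEnum):
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Illustrative control baseline per level (values are assumptions).
CONTROLS = {
    Level.PUBLIC: {"encrypt_at_rest": False, "dlp_monitoring": False, "access": "anyone"},
    Level.INTERNAL: {"encrypt_at_rest": True, "dlp_monitoring": False, "access": "employees"},
    Level.CONFIDENTIAL: {"encrypt_at_rest": True, "dlp_monitoring": True, "access": "need-to-know"},
    Level.RESTRICTED: {"encrypt_at_rest": True, "dlp_monitoring": True, "access": "named-individuals"},
}


def required_controls(labels):
    """A dataset inherits the controls of its most sensitive label."""
    return CONTROLS[max(labels)]


print(required_controls([Level.INTERNAL, Level.RESTRICTED]))
```

Taking the maximum label means that mixing one Restricted field into an otherwise Internal dataset raises the whole dataset to Restricted handling.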

💡 Interview Question

Explain the difference between tokenization and encryption.

Encryption transforms data using an algorithm and key; it is mathematically reversible with the correct key. Ciphertext normally does not keep the original format or length unless format-preserving encryption is used. Tokenization replaces data with a random token that has no mathematical relationship to the original; the mapping exists only in a secure token vault. Key difference: ciphertext can be decrypted by anyone who obtains or guesses the key, whereas a random token reveals nothing without access to the vault. Tokenization is preferred for PCI-DSS scope reduction because systems that handle only tokens, with no access to the vault, can generally be removed from cardholder data scope, shrinking the compliance boundary.

💡 Interview Question

Walk me through how you would respond to a data breach.

1) DETECT & ASSESS: Verify the alert is a true positive. Determine data types affected (PII, PHI, PCI, IP), volume of records, and whether exfiltration occurred. Activate the Incident Response Team (IRT).

2) CONTAIN: Isolate affected systems (network quarantine via EDR), revoke compromised credentials, block attacker IPs/domains, disable data exfiltration channels. Preserve forensic evidence; do NOT wipe systems yet.

3) INVESTIGATE: Conduct forensic analysis to determine root cause, attack vector, lateral movement, and full scope. Build a timeline. Engage an external forensics firm if needed (often required for PCI incidents).

4) NOTIFY: Engage legal counsel immediately. Notify regulators within required timeframes (GDPR 72hrs, SEC 4 business days). Prepare customer notification with a clear description of what happened, what data was affected, and what you're doing about it. Notify cyber insurance carrier.

5) ERADICATE & RECOVER: Patch the exploited vulnerability, rebuild compromised systems from clean images, force credential resets, restore from verified clean backups.

6) LESSONS LEARNED: Post-incident review within 2 weeks. Update IR playbook, improve detection rules, conduct tabletop exercise simulating the attack scenario. Report metrics: time to detect, contain, and recover.
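
The metrics named in the last step (time to detect, contain, and recover) fall out of four timestamps on the incident timeline. A minimal sketch, assuming hypothetical field names and made-up timestamps:

```python
from datetime import datetime


def response_metrics(occurred, detected, contained, recovered):
    """Durations between the four key points on an incident timeline."""
    return {
        "time_to_detect": detected - occurred,
        "time_to_contain": contained - detected,
        "time_to_recover": recovered - contained,
    }


# Illustrative timeline (made-up values).
metrics = response_metrics(
    occurred=datetime(2024, 5, 1, 8, 0),
    detected=datetime(2024, 5, 3, 14, 0),
    contained=datetime(2024, 5, 3, 20, 0),
    recovered=datetime(2024, 5, 6, 20, 0),
)
for name, duration in metrics.items():
    print(name, duration)
```

Tracking these three durations across incidents shows whether detection rules and playbook changes from each review are actually shortening the response.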

Related Domains

☁️

Cloud Security

Cloud data protection

📋

GRC

Governance, Risk & Compliance

🔑

IAM

Access controls for data
