Phase 2: Discovery & Classification

Identify and classify sensitive information across your organization, including Sri Lankan personal data patterns

Duration: 2-4 months
Team: 4-6 members

Phase Overview

The Discovery & Classification phase focuses on understanding where sensitive data resides within your organization. This includes deploying automated discovery tools, identifying Sri Lankan-specific personal data patterns (National Identity Card (NIC) numbers, passport numbers, bank details), and establishing a comprehensive data inventory.

Key Objective

Discover all locations of sensitive and personal data across the enterprise

Success Criteria

Complete data inventory with accurate classification and ownership mapping

Deliverables

Data inventory, classification schema, and Sri Lankan data pattern library

Key Activities

1. Deploy Discovery Tools

Set up and configure automated data discovery tools to scan your IT environment.

Tasks:

  • Evaluate and select DLP discovery tools (commercial or open-source)
  • Configure discovery agents for file servers, databases, and endpoints
  • Set up cloud scanning for SaaS applications (Google Workspace, Microsoft 365)
  • Test discovery tools in an isolated environment before production deployment

Tool Recommendation: Check our Tools & Scripts section for open-source DLP discovery options.

2. Define Classification Schema

Establish a data classification framework aligned with PDPA requirements and business needs.

Tasks:

  • Create classification levels (e.g., Public, Internal, Confidential, Restricted)
  • Define criteria for each classification level
  • Map PDPA personal data categories to classification levels
  • Document handling requirements for each classification level

Sample Classification Levels:

  • Restricted: Highly sensitive personal data, financial data, trade secrets
  • Confidential: Internal business data, employee information, customer data
  • Internal: Internal communications, operational documents
  • Public: Marketing materials, published information
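
The sample levels above can be captured in code so that scan results are labelled consistently. The following is a minimal sketch, not a prescribed schema: the data type names in DATA_TYPE_LEVELS and the choice to treat unknown or empty input as Internal are illustrative assumptions that should be replaced with your own mapping.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Sample levels, ordered from least to most sensitive."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


# Hypothetical mapping of detected data types to levels; adjust to your schema.
DATA_TYPE_LEVELS = {
    "marketing_material": Classification.PUBLIC,
    "operational_document": Classification.INTERNAL,
    "employee_record": Classification.CONFIDENTIAL,
    "customer_record": Classification.CONFIDENTIAL,
    "nic_number": Classification.RESTRICTED,
    "bank_account": Classification.RESTRICTED,
}


def classify(detected_types):
    """Return the highest level among the detected types.

    Unknown types and empty input fall back to INTERNAL (an assumed,
    conservative default).
    """
    levels = [DATA_TYPE_LEVELS.get(t, Classification.INTERNAL) for t in detected_types]
    return max(levels, default=Classification.INTERNAL)
```

Taking the maximum means a file holding both employee records and NIC numbers is handled at the stricter of the two levels.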

3. Create Sri Lankan Data Patterns

Develop detection patterns for Sri Lankan-specific personal data types.

Tasks:

  • Create regex patterns for Sri Lankan NIC numbers (old and new formats)
  • Develop patterns for Sri Lankan passport numbers
  • Configure detection for Sri Lankan bank account numbers and SWIFT codes
  • Add patterns for local phone numbers and addresses
  • Test pattern accuracy and adjust to reduce false positives

Pattern Library: Access our pre-built Sri Lankan Data Pattern Library with ready-to-use regex patterns and test cases.
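
To illustrate the pattern-building and accuracy-testing tasks above, here is a minimal sketch of a pattern set with positive and negative test cases. The regexes follow the formats described in this guide (9 digits plus V/X, 12 digits, one letter plus 7 digits); the names PATTERNS, find_matches, and failing_cases are hypothetical, the sample values are made up, and the patterns are not taken from the pre-built library.

```python
import re

# Lookarounds stop a match from starting or ending inside a longer token,
# which is a common source of false positives.
PATTERNS = {
    "nic_old": re.compile(r"(?<![0-9A-Za-z])\d{9}[VvXx](?![0-9A-Za-z])"),
    "nic_new": re.compile(r"(?<!\d)\d{12}(?!\d)"),
    "passport": re.compile(r"(?<![0-9A-Za-z])[A-Z]\d{7}(?![0-9A-Za-z])"),
}


def find_matches(text):
    """Return a sorted list of (pattern_name, matched_text) pairs."""
    hits = []
    for name, pattern in PATTERNS.items():
        hits.extend((name, m.group()) for m in pattern.finditer(text))
    return sorted(hits)


# Each case pairs an input with the matches we expect (empty list = no match).
TEST_CASES = [
    ("NIC 199912345678 on file", [("nic_new", "199912345678")]),
    ("old NIC 991231234V", [("nic_old", "991231234V")]),
    ("passport N1234567", [("passport", "N1234567")]),
    ("order ref 1234567890123", []),  # 13 digits: too long for a new NIC
    ("invoice INV1234567", []),       # letters directly before the candidate
]


def failing_cases():
    """Return the test inputs whose matches differ from the expectation."""
    return [text for text, expected in TEST_CASES if find_matches(text) != expected]
```

Growing TEST_CASES with real-world negatives (order numbers, invoice IDs) is one way to track false positives as patterns are adjusted.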

4. Conduct Discovery Scans

Execute comprehensive scans across all data repositories and systems.

Tasks:

  • Scan file servers and network shares
  • Scan databases (structured data repositories)
  • Scan endpoint devices (laptops, desktops)
  • Scan cloud storage and SaaS applications
  • Scan email systems and archives
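
As a rough illustration of what a file-share scan does internally, the sketch below walks a directory tree and reports NIC-like strings by file and line. It is a simplified stand-in for a real discovery tool: the function name scan_tree, the TEXT_SUFFIXES list, and the single combined NIC pattern are assumptions, and binary formats, databases, and mailboxes need dedicated connectors.

```python
import re
from pathlib import Path

# NIC-like tokens: 9 digits + V/X (old format) or 12 digits (new format).
NIC_PATTERN = re.compile(r"(?<![0-9A-Za-z])(?:\d{9}[VvXx]|\d{12})(?![0-9A-Za-z])")

# Assumed list of plain-text file types worth reading line by line.
TEXT_SUFFIXES = {".txt", ".csv", ".log", ".json"}


def scan_tree(root):
    """Yield (relative_path, line_number, match) for NIC-like strings under root."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in TEXT_SUFFIXES:
            continue
        try:
            with path.open("r", encoding="utf-8", errors="ignore") as fh:
                for line_no, line in enumerate(fh, start=1):
                    for m in NIC_PATTERN.finditer(line):
                        yield path.relative_to(root).as_posix(), line_no, m.group()
        except OSError:
            # Unreadable file: skipped here; a real scan should log it.
            continue
```

Recording the file and line rather than only a count makes it easier to hand findings to data owners later.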

5. Build Data Inventory

Compile a comprehensive inventory of sensitive data locations and classifications.

Tasks:

  • Document all locations where sensitive data was discovered
  • Record data volumes, types, and classification levels
  • Identify data owners and custodians for each repository
  • Map data flows and sharing patterns
  • Create visual data maps showing data distribution
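
One possible shape for inventory records is sketched below: individual scan findings are rolled up into one row per repository and data type, and rows without an owner are flagged for follow-up. The Finding fields and the function names are illustrative assumptions, not a required inventory format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    repository: str       # e.g. a file share or database name
    data_type: str        # e.g. "nic_number"
    classification: str   # e.g. "Restricted"
    owner: str            # empty string when no owner is known yet
    record_count: int


def build_inventory(findings):
    """Aggregate findings into one row per (repository, data_type)."""
    totals = defaultdict(int)
    meta = {}
    for f in findings:
        key = (f.repository, f.data_type)
        totals[key] += f.record_count
        meta[key] = (f.classification, f.owner)  # last finding wins
    return [
        {
            "repository": repo,
            "data_type": dtype,
            "classification": meta[(repo, dtype)][0],
            "owner": meta[(repo, dtype)][1],
            "records": totals[(repo, dtype)],
        }
        for repo, dtype in sorted(totals)
    ]


def unowned(inventory):
    """Return inventory rows that still need a data owner assigned."""
    return [row for row in inventory if not row["owner"]]
```

The resulting rows can be exported to a spreadsheet or fed into whatever tool produces the visual data maps.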

6. Remediate Discovery Findings

Address immediate risks identified during the discovery phase.

Tasks:

  • Identify high-risk data exposures requiring immediate action
  • Remove or secure orphaned sensitive data
  • Apply appropriate access controls to exposed data
  • Notify data owners of compliance issues
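
Deciding which exposures need immediate action usually requires some form of scoring. The sketch below shows one simple way to rank findings; the weights, the exposure categories, the orphan penalty, and the threshold of 15 are all made-up values for illustration and would need to be agreed with your risk team.

```python
from dataclasses import dataclass

# Hypothetical weights: higher means more sensitive / more widely exposed.
LEVEL_WEIGHT = {"Public": 0, "Internal": 1, "Confidential": 3, "Restricted": 5}
EXPOSURE_WEIGHT = {"restricted_group": 1, "all_staff": 3, "public_link": 5}


@dataclass
class Exposure:
    location: str
    classification: str
    exposure: str     # key of EXPOSURE_WEIGHT
    has_owner: bool


def risk_score(e):
    """Sensitivity times breadth of access, plus a penalty for orphaned data."""
    score = LEVEL_WEIGHT[e.classification] * EXPOSURE_WEIGHT[e.exposure]
    if not e.has_owner:
        score += 5  # orphaned data has nobody accountable for it
    return score


def prioritize(exposures, threshold=15):
    """Return exposures at or above the threshold, highest risk first."""
    urgent = [e for e in exposures if risk_score(e) >= threshold]
    return sorted(urgent, key=risk_score, reverse=True)
```

For example, Restricted data behind a public link scores 25 and lands at the top of the list, while Confidential data visible to all staff with no owner scores 14 and falls just below the assumed threshold.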

Sri Lankan Personal Data Types

Key data types to detect for PDPA compliance:

National Identity Card (NIC)

Old format: 9 digits followed by the letter V or X
New format: 12 digits

Example (new format): 199912345678
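
A bare 12-digit pattern matches many non-NIC numbers, so a structural check can help reduce false positives. The sketch below assumes the commonly described NIC layout: birth year first (two digits in the old format, four in the new), then a three-digit day of the year with 500 added for women. Treat that layout as an assumption to confirm against official Department for Registration of Persons guidance; parse_nic and NicInfo are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class NicInfo(NamedTuple):
    birth_year: int
    day_of_year: int
    sex: str  # "F" or "M"


_OLD = re.compile(r"(\d{2})(\d{3})\d{4}[VvXx]")  # 9 digits + V/X
_NEW = re.compile(r"(\d{4})(\d{3})\d{5}")        # 12 digits


def parse_nic(value: str) -> Optional[NicInfo]:
    """Return decoded fields, or None if the value is not a plausible NIC."""
    value = value.strip()
    m = _OLD.fullmatch(value)
    if m:
        year = 1900 + int(m.group(1))  # old-format NICs assume 19xx births
    else:
        m = _NEW.fullmatch(value)
        if not m:
            return None
        year = int(m.group(1))
    day = int(m.group(2))
    sex = "M"
    if day > 500:  # assumed convention: 500 is added for women
        day -= 500
        sex = "F"
    if not 1 <= day <= 366:
        return None  # day-of-year out of range: likely a false positive
    return NicInfo(year, day, sex)
```

Under these assumptions the example 199912345678 decodes to birth year 1999 and day 123, while a value such as 199940045678 is rejected because day 400 is out of range.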

Passport Numbers

Format: Letter followed by 7 digits

Example: N1234567

Bank Account Details

Account numbers and SWIFT codes for Sri Lankan banks

Example SWIFT code: BCEYLKLX (Bank of Ceylon)
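
SWIFT codes (BICs) have a fixed structure: a four-letter bank code, a two-letter country code, a two-character location code, and an optional three-character branch code. A minimal sketch for finding codes whose country code is LK follows; the function name find_lk_bics is hypothetical, and the pattern does not check that the bank code belongs to a real institution.

```python
import re

# 4-letter bank code, country code LK, 2-char location, optional 3-char branch.
LK_BIC = re.compile(r"\b[A-Z]{4}LK[A-Z0-9]{2}(?:[A-Z0-9]{3})?\b")


def find_lk_bics(text):
    """Return uppercase BIC-shaped tokens with country code LK."""
    return [m.group() for m in LK_BIC.finditer(text)]
```

Bank account number formats vary by bank, so those are better detected with per-bank patterns combined with nearby keywords than with one generic regex.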

Phone Numbers

Mobile and landline numbers in Sri Lankan format

Example: +94771234567
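
Phone numbers appear in several written forms (+94 prefix, 0094 prefix, or a leading 0), so it helps to normalize matches to one form before counting them. The sketch below assumes nine digits follow the country code or the leading 0, with optional spaces or hyphens between groups; normalize_phones is a hypothetical helper name.

```python
import re

# Prefix (+94, 0094, or trunk 0) followed by 2 + 3 + 4 digits,
# with an optional space or hyphen between groups.
PHONE = re.compile(
    r"(?<![\d+])(?:\+94|0094|0)[ -]?(\d{2})[ -]?(\d{3})[ -]?(\d{4})(?!\d)"
)


def normalize_phones(text):
    """Return phone-like matches rewritten in +94XXXXXXXXX form."""
    return ["+94" + "".join(m.groups()) for m in PHONE.finditer(text)]
```

Normalizing first means "077 123 4567" and "+94771234567" are counted as the same number in the inventory.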

Phase Deliverables

Comprehensive Data Inventory

Complete catalog of sensitive data locations, volumes, and classifications

Data Flow Maps

Visual representation of how data moves through your organization

Classification Schema

Documented classification framework with handling procedures

Sri Lankan Pattern Library

Custom detection patterns for Sri Lankan personal data types

Discovery Reports

Detailed scan results with statistics and risk analysis

Remediation Action Plan

Prioritized list of immediate actions to address high-risk findings

Ready for Phase 3?

Once you've completed data discovery and built your inventory, move to Policy & Controls development.