Implementing ISO 27001 Annex A 8.6 is a proactive operational discipline: it requires continuous capacity management of information processing facilities. By establishing technical performance baselines and deploying agent-based monitoring, organizations can predict resource exhaustion and scale infrastructure before outages impact business availability.
ISO 27001 Annex A 8.6 Capacity Management Implementation Checklist
Use this implementation checklist to achieve compliance with ISO 27001 Annex A 8.6. Capacity management is not about writing a document that says “we will buy more RAM”; it is the active, technical discipline of monitoring resource exhaustion before it causes an outage.
1. Establish Technical Performance Baselines
Control Requirement: The use of resources shall be monitored and adjusted in line with current and expected capacity requirements.
Required Implementation Step: Monitor your key assets (Servers, Firewalls, Internet Circuits) for 30 days to establish what “Normal” looks like. Document the average CPU, RAM, and IOPS usage during peak business hours to serve as the baseline for your alert thresholds.
Minimum Requirement: You cannot detect a capacity spike if you haven’t defined the baseline load.
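As a rough illustration of the baseline step, the sketch below reduces timestamped utilisation samples to peak-hour statistics. The function name `peak_hour_baseline`, the 09:00 to 17:00 peak window, and the choice of mean, 95th percentile, and maximum are assumptions made for this example; the control itself does not prescribe them.

```python
from datetime import datetime
from statistics import mean, quantiles


def peak_hour_baseline(samples, start_hour=9, end_hour=17):
    """Summarise utilisation samples taken during peak business hours.

    samples: iterable of (datetime, percent_used) pairs, for example an
    export from a monitoring tool. The 09:00-17:00 window is an assumed
    default and should match your own business hours.
    """
    peak = [value for ts, value in samples if start_hour <= ts.hour < end_hour]
    if len(peak) < 2:
        raise ValueError("need at least two peak-hour samples")
    return {
        "mean": mean(peak),
        # The last of the 19 cut points is the 95th percentile; the
        # "inclusive" method keeps it within the observed range.
        "p95": quantiles(peak, n=20, method="inclusive")[-1],
        "max": max(peak),
    }


# Illustrative data only: one overnight sample and three peak-hour samples.
cpu_samples = [
    (datetime(2024, 3, 4, 3, 0), 5.0),   # outside the peak window, ignored
    (datetime(2024, 3, 4, 10, 0), 40.0),
    (datetime(2024, 3, 4, 13, 0), 50.0),
    (datetime(2024, 3, 4, 16, 0), 60.0),
]
baseline = peak_hour_baseline(cpu_samples)
```

In practice the same summary would be computed per asset and per metric (CPU, RAM, IOPS) over the full 30-day window, and alert thresholds would then be set relative to the recorded values.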
2. Deploy Agent-Based Resource Monitoring
Control Requirement: Continuous monitoring of information processing facilities.
Required Implementation Step: Install lightweight monitoring agents (e.g., Zabbix, Datadog, Prometheus node_exporter) on every critical server. Do not rely on intermittent SNMP polls; get granular, second-by-second metrics on Disk Queue Length and Memory Pressure.
Minimum Requirement: Real-time visibility into the health of the operating system.
3. Configure “Soft” and “Hard” Threshold Alerts
Control Requirement: Detective controls to identify capacity issues.
Required Implementation Step: Configure your monitoring stack to send a “Warning” email at 80% usage (Soft Limit) and a “Critical” PagerDuty alert at 90% usage (Hard Limit). This ensures the Ops team has time to react (e.g., expand the volume) before the service crashes.
Minimum Requirement: Alerts must fire before the disk is 100% full.
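The soft and hard limits described above can be sketched as a small classification function. This is a minimal example using only the Python standard library; the names `classify_usage` and `disk_percent_used` are hypothetical, and routing the result to email or PagerDuty is deliberately left out because it depends on your monitoring stack.

```python
import shutil


def classify_usage(percent_used, soft=80.0, hard=90.0):
    """Map a utilisation percentage to an alert level.

    soft and hard default to the 80% and 90% limits from the checklist.
    """
    if percent_used >= hard:
        return "critical"
    if percent_used >= soft:
        return "warning"
    return "ok"


def disk_percent_used(path="."):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


# Classify the volume that holds the current working directory.
status = classify_usage(disk_percent_used("."))
```

Keeping the thresholds as parameters makes it easy to tune them per volume, since a 10 TB archive disk and a 50 GB system disk rarely need the same limits.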
4. Manage Database Transaction Logs
Control Requirement: Prevent specific application capacity failures.
Required Implementation Step: Configure the backup schedule so that SQL/database transaction logs are backed up and truncated frequently (e.g., every 15 minutes). Unmanaged transaction logs are a frequent cause of sudden database capacity outages.
Minimum Requirement: “Disk Full” on a SQL server is a failure of management, not hardware.
5. Implement Cloud Auto-Scaling Groups
Control Requirement: Adjust resources in line with changing demand.
Required Implementation Step: In AWS, configure Auto Scaling groups (ASGs) for your web tiers; in Azure, the equivalent is Virtual Machine Scale Sets. Set rules to automatically launch new instances when CPU exceeds 70% and terminate them when it drops below 30%. This automates compliance with the “adjustment” requirement.
Minimum Requirement: Static cloud infrastructure that crashes under load is non-compliant.
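The 70% and 30% CPU rules amount to a simple decision function. The sketch below shows that logic in isolation; it is not a cloud provider API, since in practice the provider's scaling service evaluates the rule for you. The function name, the one-instance step, and the minimum and maximum of 2 and 10 instances are assumptions for illustration.

```python
def desired_instance_count(current, avg_cpu_percent,
                           scale_out_above=70.0, scale_in_below=30.0,
                           minimum=2, maximum=10):
    """Return the instance count a simple step-scaling rule would target.

    Adds one instance above the upper CPU threshold, removes one below the
    lower threshold, and always stays within [minimum, maximum].
    """
    if avg_cpu_percent > scale_out_above:
        target = current + 1
    elif avg_cpu_percent < scale_in_below:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


# Three instances averaging 85% CPU: the rule asks for a fourth.
next_count = desired_instance_count(current=3, avg_cpu_percent=85.0)
```

The gap between the two thresholds matters: if they sit too close together, the group can oscillate between scaling out and scaling in. The upper bound is also a reminder to check account quotas, because a rule cannot launch instances the provider will not grant.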
6. Enforce Data Retention & Purging Policies
Control Requirement: Delete obsolete data to optimise storage.
Required Implementation Step: Script a “Cleanup Job” (e.g., PowerShell or Bash cron) that automatically deletes temporary files, old log archives, and “Deleted Items” older than the retention period. Do not pay to store digital waste.
Minimum Requirement: Automated housekeeping tasks running on all file servers.
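A cleanup job of this kind can be sketched in a few lines. The example below uses Python rather than PowerShell or Bash, and the function name `purge_old_files`, the use of file modification time as the age criterion, and the dry-run default are all assumptions made for this illustration. It defaults to a dry run so that a misconfigured path lists files instead of deleting them.

```python
import time
from pathlib import Path


def purge_old_files(root, retention_days, dry_run=True, now=None):
    """Delete files under `root` whose modification time exceeds the retention period.

    Returns the list of matching paths. With dry_run=True (the default)
    nothing is deleted; the list shows what would be removed.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_days * 86400  # retention period in seconds
    matched = []
    # Materialise the listing first so deletions do not disturb the walk.
    for path in list(Path(root).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            matched.append(path)
    return matched


# Example call (the path is hypothetical):
# purge_old_files("/var/tmp/app-cache", retention_days=30, dry_run=False)
```

Scheduling the script through cron or a task scheduler, and logging the returned list, gives an auditor evidence that the housekeeping actually runs.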
7. Monitor Network Bandwidth & Latency
Control Requirement: Ensure network capacity meets business needs.
Required Implementation Step: Use NetFlow or sFlow analysis on your core switches to identify “Top Talkers.” Set alerts for high interface utilisation on your uplink ports. If the backup job saturates the pipe at 9 AM, you have a capacity failure.
Minimum Requirement: Know exactly which application is hogging the bandwidth.
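To make the “Top Talkers” idea concrete, the sketch below aggregates flow records by source address and converts a byte count into link utilisation. The record shape (dictionaries with `src` and `bytes` keys) is an assumed, simplified stand-in for what a NetFlow or sFlow collector exports, and both function names are hypothetical.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Return the n sources that sent the most bytes, largest first."""
    totals = Counter()
    for flow in flows:
        totals[flow["src"]] += flow["bytes"]
    return totals.most_common(n)


def utilisation_percent(bytes_transferred, interval_seconds, link_bps):
    """Average utilisation of a link over an interval, as a percentage."""
    return 100.0 * (bytes_transferred * 8) / (interval_seconds * link_bps)


# Illustrative flow records; the addresses are examples only.
flows = [
    {"src": "10.0.0.5", "bytes": 60_000_000},   # e.g. a backup job
    {"src": "10.0.0.9", "bytes": 10_000_000},
    {"src": "10.0.0.5", "bytes": 5_000_000},
]
talkers = top_talkers(flows, n=2)

# 75 MB in 60 seconds on a 100 Mbit/s uplink.
uplink_load = utilisation_percent(75_000_000, 60, 100_000_000)
```

Feeding `uplink_load` into the same warning and critical thresholds used for disks keeps network alerts consistent with the rest of the monitoring stack.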
8. Plan for Physical Hardware Lead Times
Control Requirement: Projections of future capacity requirements.
Required Implementation Step: Maintain a hardware procurement register. If you use physical SANs or servers, account for vendor lead times (e.g., a 6-week shipping delay). Order the expansion shelf when the array hits 70%, not 95%.
Minimum Requirement: The procurement lead time must be shorter than the time remaining before capacity runs out.
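The relationship between growth rate and lead time is simple arithmetic, sketched below. The example assumes linear growth, a 42-day (6-week) lead time, and a 14-day safety margin; all three are illustrative assumptions, and the function names are hypothetical.

```python
def days_until_full(capacity_tb, used_tb, growth_tb_per_day):
    """Days until the array is full, assuming linear growth."""
    if growth_tb_per_day <= 0:
        return float("inf")  # flat or shrinking usage never fills the array
    return (capacity_tb - used_tb) / growth_tb_per_day


def must_order_now(capacity_tb, used_tb, growth_tb_per_day,
                   lead_time_days=42, safety_margin_days=14):
    """True when the remaining runway no longer covers lead time plus margin."""
    runway = days_until_full(capacity_tb, used_tb, growth_tb_per_day)
    return runway <= lead_time_days + safety_margin_days


# A 100 TB array at 70% growing 0.5 TB per day has 60 days of runway,
# which still exceeds the 56 days needed, so no order is due yet.
order_at_70 = must_order_now(100, 70, 0.5)

# At 75% the runway drops to 50 days, which is inside the 56-day window.
order_at_75 = must_order_now(100, 75, 0.5)
```

With these numbers the 70% trigger leaves very little slack, which is why ordering at 95% (10 days of runway) would guarantee an outage before the hardware arrives.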
9. Conduct Regular Load/Stress Testing
Control Requirement: Validate system ability to handle projected loads.
Required Implementation Step: Use tools like JMeter or k6 to simulate 2x your current user load on the staging environment. Verify that the application degrades gracefully (e.g., slows down) rather than crashing or corrupting data.
Minimum Requirement: Theoretical capacity calculations must be proven by actual load tests.
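For readers who want to see the shape of such a test before adopting a dedicated tool, the sketch below drives a callable from several concurrent threads and reports the error rate and worst latency. It is a standard-library illustration, not a substitute for JMeter or k6; `run_load` is a hypothetical helper, and `fake_request` stands in for whatever function issues one request against your staging environment.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(request_fn, concurrent_users, requests_per_user):
    """Call request_fn from several threads and summarise the outcome."""

    def one_user(_user_id):
        latencies, errors = [], 0
        for _ in range(requests_per_user):
            start = time.perf_counter()
            try:
                request_fn()
            except Exception:
                errors += 1  # count the failure and keep going
            latencies.append(time.perf_counter() - start)
        return latencies, errors

    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        results = list(pool.map(one_user, range(concurrent_users)))

    latencies = [t for user_latencies, _ in results for t in user_latencies]
    errors = sum(user_errors for _, user_errors in results)
    total = len(latencies)
    return {
        "requests": total,
        "error_rate": errors / total if total else 0.0,
        "max_latency_s": max(latencies, default=0.0),
    }


def fake_request():
    """Placeholder for a real request to the staging environment."""
    time.sleep(0.001)


# Compare current load with 2x load (user counts are illustrative).
baseline_run = run_load(fake_request, concurrent_users=5, requests_per_user=4)
doubled_run = run_load(fake_request, concurrent_users=10, requests_per_user=4)
```

Graceful degradation would show up as a higher `max_latency_s` in the doubled run while `error_rate` stays at or near zero; a jump in the error rate suggests the application is failing rather than slowing down.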
10. Review Human Resource Capacity
Control Requirement: Ensure sufficient human resources are available.
Required Implementation Step: Audit the on-call hours and ticket volume of your sysadmins. If a single engineer is working 60 hours a week to keep the lights on, you have a “Key Person” capacity risk. Cross-train staff to create redundancy.
Minimum Requirement: No single point of failure in the engineering team.
ISO 27001 Annex A 8.6 SaaS / GRC Platform Implementation Failure Checklist
| Control Requirement | The ‘Checkbox Compliance’ Trap | The Reality Check |
|---|---|---|
| Resource Monitoring | GRC tool asks: “Do you monitor capacity?” (Yes/No). | You clicked “Yes”, but nobody looks at the dashboard. The server hits 100% disk usage at 3 AM and the service dies. |
| Future Planning | “We have a budget meeting once a year.” | Data grows continuously, not once a year. By month 8, you are out of storage and have no budget left to fix it. |
| Cloud Scalability | “We moved to the cloud, so we have infinite capacity.” | You hit the default vCPU quota limit in your AWS region during a launch event. The “infinite” cloud rejects your request for new servers. |
| Alerting | “We get email alerts.” | The alerts go to a “System” folder that nobody reads. Critical warnings are buried under 5,000 “Info” logs. |
| Data Purging | “We keep everything just in case.” | Your backup server is full of 10-year-old temporary files, and the backup job has been failing for the last 3 weeks. |
| Network Capacity | “Our internet is fast.” | A user downloads a 50 GB dataset during the CEO’s video call. Without QoS (Quality of Service) rules, the call drops. |
| Human Capacity | “Bob handles the servers.” | Bob is burnt out and quits. Nobody else knows the root password or how the custom backup script works. |
