ISO 27001 Annex A 5.37 requires IT operating procedures to be standardised and documented so that operations stay consistent and knowledge is not lost. In practice this means detailed runbooks for system start-up, data recovery, and maintenance, which make operations more resilient and reduce dependence on key personnel.
ISO 27001 Annex A 5.37 Documented Operating Procedures Implementation Checklist
Use this implementation checklist to achieve compliance with ISO 27001 Annex A 5.37. This control requires the creation of detailed, step-by-step instructions for the correct and secure operation of information processing facilities, ensuring that critical technical knowledge is not lost with staff turnover.
1. Identify Critical Operational Tasks
Control Requirement: Determine which activities are essential for the continuous operation of systems. Required Implementation Step: Audit the daily, weekly, and monthly routine of your SysAdmin and DevOps teams. List every task required to “keep the lights on,” such as server patching, log review, database indexing, and certificate renewal. Do not assume knowledge; if it’s in a senior engineer’s head, it must be extracted.
Minimum Requirement: A prioritised list of at least 20 critical maintenance and operational tasks requiring documentation.
2. Standardise the Runbook Format
Control Requirement: Ensure procedures are consistent and easy to follow. Required Implementation Step: Abandon Microsoft Word. Create a standardised Markdown or Confluence template for “Runbooks”. This must include: Prerequisite Access, Estimated Time, Step-by-Step Commands, Expected Output, and Rollback Steps. Treat documentation as code, living where the engineers work.
Minimum Requirement: A standard Runbook template applied across the engineering department.
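One way to keep the template honest is to check each runbook for the required sections automatically. The sketch below is a minimal illustration, assuming runbooks are Markdown files that use `#`-style headings; the function name `missing_sections` and the sample runbook are hypothetical.

```python
import re

# Section names come from the template described above.
# Assumption: each section appears as a Markdown heading ("#" to "######").
REQUIRED_SECTIONS = [
    "Prerequisite Access",
    "Estimated Time",
    "Step-by-Step Commands",
    "Expected Output",
    "Rollback Steps",
]


def missing_sections(runbook_markdown: str) -> list[str]:
    """Return the required sections that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", runbook_markdown, re.MULTILINE
        )
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


# Hypothetical runbook with the rollback section left out.
example = """# Renew TLS certificate
## Prerequisite Access
## Estimated Time
## Step-by-Step Commands
## Expected Output
"""

print(missing_sections(example))  # ['Rollback Steps']
```

A check like this could run in a pre-merge pipeline so that incomplete runbooks are caught at review time rather than during an incident.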
3. Document System Start-up and Shut-down
Control Requirement: Define the correct sequence for booting and stopping complex systems. Required Implementation Step: Document the exact dependency order for restarting your infrastructure (e.g., “Database must be healthy before API Gateway is started”). Include specific CLI commands or AWS Console actions. This is critical for disaster recovery situations where standard automation may have failed.
Minimum Requirement: Tested start/stop procedures for the primary production environment.
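The dependency order itself can be recorded as data and the start and stop sequences derived from it. The following is a rough sketch using Python's standard `graphlib` module; the service names and the `DEPENDS_ON` mapping are hypothetical placeholders, not a recommended architecture.

```python
from graphlib import TopologicalSorter

# Hypothetical services. Each key lists the services that must be
# healthy before it is started.
DEPENDS_ON = {
    "database": [],
    "cache": [],
    "api-gateway": ["database", "cache"],
    "web-frontend": ["api-gateway"],
}


def startup_order(depends_on):
    """Return the services with every prerequisite listed before its dependant.

    static_order() raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on):
    """Stop services in the reverse of the start-up order."""
    return list(reversed(startup_order(depends_on)))


for step, service in enumerate(startup_order(DEPENDS_ON), start=1):
    print(f"{step}. start {service} and wait until it is healthy")
```

Keeping the mapping next to the runbook means the written sequence and the derived sequence can be compared whenever the architecture changes.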
4. Detail Backup and Recovery Operations
Control Requirement: Ensure data restoration is repeatable and reliable. Required Implementation Step: Do not just say “Restore from Backup.” Write the exact syntax to retrieve the artifact, decrypt it, and load it into a fresh instance. Include contact details for off-site storage providers or access paths to “Cold Storage” vaults (e.g., Amazon S3 Glacier).
Minimum Requirement: A “Recovery Guide” that a junior engineer could successfully follow without supervision.
5. Map Scheduled Automated Tasks
Control Requirement: Document the logic behind automated jobs. Required Implementation Step: Create a register of all Cron jobs, Windows Task Scheduler items, and Lambda triggers. Explain what the script does, when it runs, and who receives the alert if it fails. Undocumented automation is a major availability risk.
Minimum Requirement: A “Job Dictionary” mapping every automated script to its business function and owner.
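A job register is easier to keep complete if gaps can be found mechanically. The sketch below assumes the register is held as structured records; the `Job` fields mirror the items listed above, while the class name, the function `register_gaps`, and the sample jobs are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass
class Job:
    name: str
    schedule: str            # when it runs, e.g. a cron expression
    business_function: str   # what the script does and why
    owner: str
    alert_recipient: str     # who is notified if it fails


def register_gaps(jobs):
    """Map each job name to the fields that are still blank."""
    gaps = {}
    for job in jobs:
        blank = [name for name, value in asdict(job).items() if not value.strip()]
        if blank:
            gaps[job.name] = blank
    return gaps


# Hypothetical register entries.
register = [
    Job("nightly-db-backup", "0 2 * * *", "Back up the orders database",
        "platform-team", "on-call@example.com"),
    Job("cert-renewal", "0 6 1 * *", "Renew TLS certificates", "", ""),
]

print(register_gaps(register))  # {'cert-renewal': ['owner', 'alert_recipient']}
```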
6. Define Error Handling and Escalation
Control Requirement: Instruct staff on what to do when procedures fail. Required Implementation Step: In every operating procedure, include a “Troubleshooting” section. Define specific error codes or failure scenarios and the immediate escalation path (e.g., “If HTTP 500 persists for more than 5 minutes, page the on-call engineer”).
Minimum Requirement: Escalation trees embedded within technical documentation.
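An escalation rule such as the HTTP 500 example is unambiguous once it is written as a condition. The sketch below is one possible reading of that rule, assuming health-check results arrive as chronological `(timestamp, status)` pairs; the function name and the data shape are assumptions for illustration only.

```python
from datetime import datetime, timedelta

ESCALATION_THRESHOLD = timedelta(minutes=5)


def should_page_on_call(checks, threshold=ESCALATION_THRESHOLD):
    """Decide whether the current run of HTTP 500 responses warrants paging.

    checks: chronological list of (timestamp, http_status) results.
    Returns True when the latest unbroken run of 500s has lasted
    longer than the threshold.
    """
    run_start = None
    for timestamp, status in checks:
        if status == 500:
            if run_start is None:
                run_start = timestamp
        else:
            run_start = None  # any other response ends the failing run
    if run_start is None:
        return False
    return checks[-1][0] - run_start > threshold


# Hypothetical checks taken once a minute, all failing for six minutes.
start = datetime(2024, 1, 1, 3, 0)
failing = [(start + timedelta(minutes=m), 500) for m in range(7)]
print(should_page_on_call(failing))  # True
```

Writing the rule this way forces the procedure to state what counts as "persists", which is the detail an on-call engineer needs at 3 AM.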
7. Implement Configuration Management Guidelines
Control Requirement: Ensure secure baseline configurations are maintained. Required Implementation Step: Document the standard build process for new servers or containers (e.g., “The Gold Image”). Include instructions for applying the CIS Benchmark or security hardening scripts immediately upon provisioning.
Minimum Requirement: A “Server Build Standard” document referenced in the provisioning process.
8. Enforce Version Control on Procedures
Control Requirement: Ensure staff are using the current version of instructions. Required Implementation Step: Store operating procedures in a Version Control System (like Git) or a wiki with version history. Prohibit local PDF copies and printed manuals, which quickly become obsolete. Where Git is used, the “Master” branch is the only source of truth.
Minimum Requirement: Evidence of version history and change authorship for all critical runbooks.
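If runbooks live in Git, the version history and authorship evidence can be pulled straight from the repository. The following is a sketch only, assuming `git` is installed and the runbook path is tracked; the function names and the record layout are hypothetical.

```python
import subprocess

# git log placeholders: commit hash, author name, ISO 8601 author date,
# and subject line, separated by tabs (%x09).
LOG_FORMAT = "%H%x09%an%x09%aI%x09%s"


def parse_history(log_output):
    """Turn tab-separated git log output into a list of change records."""
    entries = []
    for line in log_output.splitlines():
        if not line.strip():
            continue
        commit, author, authored_at, subject = line.split("\t", 3)
        entries.append({
            "commit": commit,
            "author": author,
            "authored_at": authored_at,
            "subject": subject,
        })
    return entries


def runbook_history(repo_dir, runbook_path):
    """Return the change history of one runbook, newest change first."""
    result = subprocess.run(
        ["git", "log", "--follow", f"--format={LOG_FORMAT}", "--", runbook_path],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_history(result.stdout)
```

Calling `runbook_history(".", "runbooks/restore-database.md")` from a repository checkout would list each change to that (hypothetical) file, which can then be exported as audit evidence.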
9. Segregate Operational Duties
Control Requirement: Prevent a single person from carrying out a critical, high-risk task end to end. Required Implementation Step: Structure procedures so that high-impact tasks (e.g., deploying to Production or modifying root keys) require a second pair of eyes or a distinct approval step. Document where the hand-off occurs within the procedure.
Minimum Requirement: Procedures for high-risk tasks explicitly requiring a secondary approver.
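The approval hand-off can be expressed as a simple gate that tooling checks before a high-risk step runs. This is a minimal sketch under assumed field names (`high_risk`, `requested_by`, `approved_by`); it is not tied to any particular deployment tool.

```python
def can_execute(task):
    """Allow a high-risk task only when someone other than the requester approved it.

    task: dict with 'high_risk', 'requested_by', and optionally 'approved_by'.
    """
    if not task["high_risk"]:
        return True
    approver = task.get("approved_by")
    return bool(approver) and approver != task["requested_by"]


# Hypothetical requests.
self_approved = {"name": "deploy-to-production", "high_risk": True,
                 "requested_by": "alice", "approved_by": "alice"}
peer_approved = {"name": "deploy-to-production", "high_risk": True,
                 "requested_by": "alice", "approved_by": "bob"}

print(can_execute(self_approved))  # False: self-approval does not count
print(can_execute(peer_approved))  # True
```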
10. Conduct the ‘Bus Factor’ Test
Control Requirement: Verify that procedures are usable by others. Required Implementation Step: Once a year, assign a critical task to a competent engineer who does not usually perform it. Ask them to execute the task using only the documentation. If they have to ask questions, the documentation has failed and must be rewritten.
Minimum Requirement: Records of “Peer Review” execution of operating procedures.
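The test results can be kept as simple records and evaluated against the rule above. The sketch below assumes one record per test run; the `PeerTest` fields, the `review_outcome` function, and the 365-day interval used to represent "once a year" are illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class PeerTest:
    runbook: str
    tester: str            # an engineer who does not usually perform the task
    tested_on: date
    completed: bool        # task finished using only the documentation
    questions_asked: int


def review_outcome(record, today, max_age=timedelta(days=365)):
    """Return 'rewrite', 'retest', or None when no action is needed."""
    if not record.completed or record.questions_asked > 0:
        return "rewrite"   # the documentation failed the test
    if today - record.tested_on > max_age:
        return "retest"    # the last successful test is over a year old
    return None


# Hypothetical record: the tester finished but had to ask two questions.
record = PeerTest("restore-database", "bob", date(2024, 3, 1), True, 2)
print(review_outcome(record, today=date(2024, 6, 1)))  # rewrite
```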
ISO 27001 Annex A 5.37 SaaS / GRC Platform Implementation Failure Checklist
| Control Requirement | The ‘Checkbox Compliance’ Trap | The Reality Check |
|---|---|---|
| Operating Procedures | Uploading a 10-page high-level “Operations Policy” PDF. | A policy says “We do backups.” A procedure shows the CLI command to run them. GRC tools confuse the two; auditors need the latter. |
| Availability | Storing procedures inside the very system they describe. | If Confluence is down, can you access the “How to Restart Confluence” doc? You need an offline or alternative availability plan. |
| Version Control | Uploading “v1.0”, “v1.1”, “vFINAL” files to a portal. | Docs should be in Git alongside code. With static file uploads, your ops team is likely to be working from outdated instructions. |
| Automation Documentation | Ignoring scripts because “the code documents itself”. | Code explains how; documentation explains why. When the script fails at 3 AM, the on-call engineer needs the context, not just the syntax. |
| Testing | Assuming the procedure works because it was written by a Senior. | Seniors skip steps in their heads. Unless a Junior tests the doc, it likely contains huge assumptions and gaps. |
| Change Management | Updating the server but forgetting to update the doc. | GRC tools rarely link Jira tickets to documentation updates. This disconnect lets reality drift away from the manual. |
| Accessibility | Locking procedures behind a GRC login that only Compliance has. | SysAdmins don’t log into GRC tools. Put the docs in the wiki or the repo where the work actually happens. |