During the last week of 2017 I was working on an AD Disaster Recovery Plan (DRP) and a firedrill. Quite often I meet customers who don’t have a DRP for AD, or for any other technology, at all: backups are taken but recovery plans are missing. Maybe they rely entirely on Microsoft Support in disaster recovery scenarios.
If things hit the fan, you definitely should have a DRP for Active Directory in place because it’s the heart of most infrastructures. Knowing how to recover is the main thing for other technologies as well, for example ADFS, which plays quite a critical role nowadays because of cloud integrations and SSO scenarios.
ADDS Forest Recovery process and tasks
There isn’t guidance that fits everyone, because the needed actions really depend on the environment you have: how many forests, how many domains, hardening of Domain Controllers etc. If you are considering an ADDS forest recovery, always contact Microsoft Support before proceeding. Together you will try to find another solution to recover from the failure; keep in mind that forest recovery is always the last option. The tasks below are from docs.microsoft.co
Identify the problem
Analyze the current situation and identify whether forest recovery is needed. In some scenarios it’s the only option. Some examples of forest-wide failures:
- All DCs have been logically corrupted or physically damaged to a point that business continuity is impossible; for example, all business applications that depend on AD DS are nonfunctional.
- A rogue administrator has compromised the Active Directory environment.
- An attacker intentionally—or an administrator accidentally—runs a script that spreads data corruption across the forest.
- An attacker intentionally—or an administrator accidentally—extends the Active Directory schema with malicious or conflicting changes.
- An attacker has managed to install malicious software on DCs, and you have been advised by Microsoft Support to recover the forest from backup.
- None of the DCs can replicate with their replication partners.
- Changes cannot be made to AD DS at any domain controller.
- New DCs cannot be installed in any domain.
Decide how to recover the forest
After you determine that forest recovery is necessary, complete preliminary steps to prepare for it:
- Determine the current forest structure
- Identify the functions that each DC performs
- Decide which DC to restore for each domain
- Ensure that all writeable DCs are taken offline
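The decision of which DC to restore in each domain can be sketched as a small selection over an inventory. This is only a minimal Python sketch, not Microsoft's procedure: the `DomainController` record, its field names, the example host and domain names, and the rule "pick the writeable DC with the most recent trusted backup" are all assumptions made for illustration. A real decision also weighs things like DNS and FSMO roles and how far you trust each backup.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class DomainController:
    # Hypothetical inventory record; field names are illustrative only.
    name: str
    domain: str
    writeable: bool                       # False for a read-only DC
    last_trusted_backup: Optional[date]   # None if no trusted backup exists


def pick_restore_candidates(
    inventory: List[DomainController],
) -> Dict[str, DomainController]:
    """Return one restore candidate per domain.

    Assumed rule: among writeable DCs that have a trusted backup,
    take the one whose backup is the most recent.
    """
    candidates: Dict[str, DomainController] = {}
    for dc in inventory:
        if not dc.writeable or dc.last_trusted_backup is None:
            continue
        current = candidates.get(dc.domain)
        if current is None or dc.last_trusted_backup > current.last_trusted_backup:
            candidates[dc.domain] = dc
    return candidates


def domains_without_candidate(
    inventory: List[DomainController],
    candidates: Dict[str, DomainController],
) -> List[str]:
    """Domains that appear in the inventory but have no restorable DC."""
    return sorted({dc.domain for dc in inventory} - set(candidates))


# Made-up example inventory (names and dates are not from a real environment).
inventory = [
    DomainController("DC01", "corp.example", True, date(2017, 12, 20)),
    DomainController("DC02", "corp.example", True, date(2017, 12, 27)),
    DomainController("RODC01", "corp.example", False, date(2017, 12, 28)),
    DomainController("DC10", "child.corp.example", True, None),
]
candidates = pick_restore_candidates(inventory)
for domain, dc in candidates.items():
    print(f"{domain}: restore {dc.name} (backup {dc.last_trusted_backup})")
print("No candidate:", domains_without_candidate(inventory, candidates))
```

A domain that ends up in the "no candidate" list is exactly the kind of gap a firedrill should reveal before a real disaster does.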
Perform initial recovery
Best practice today is to recover the forest and one Domain Controller into an isolated network, verify forest/domain functionality, and then connect the isolated network to the production network. At a high level the tasks are:
- Restore the first writeable domain controller in each domain
- Reconnect each restored writeable domain controller to the network
- Add the global catalog to a domain controller in the forest root domain
Redeploy remaining DCs
After everything has been verified, it’s time to redeploy all the remaining DCs, which were cleaned in the initial recovery phase. If your PDC emulator runs Windows Server 2012 (or higher) and your DCs are virtualized, DC cloning can save you a lot of time in this phase.
Cleanup tasks that might be needed
- You might need to revert DNS configuration
- WINS cleanup
- Transfer FSMO roles, add more Global Catalogs
- Because the entire forest is restored to a previous state, any objects (such as users and computers) that were added and all updates (such as password changes) that were made to existing objects after this point are lost. Therefore, you should re-create these missing objects and reapply the missing updates as appropriate.
- External trust reconfiguration
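To plan the re-creation of lost objects and updates mentioned in the cleanup list, it helps to know what changed after the backup was taken. Here is a minimal Python sketch, assuming you still have some export of objects with creation and change timestamps from before the failure (a report, an HR system, a ticketing log). The `DirectoryObject` record and its fields are hypothetical and not an AD API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass
class DirectoryObject:
    # Hypothetical export row; not an Active Directory API type.
    dn: str
    when_created: datetime
    when_changed: datetime


def split_lost_changes(
    objects: List[DirectoryObject], backup_time: datetime
) -> Tuple[List[str], List[str]]:
    """Split objects into those to re-create and those to update again.

    - created after the backup: missing from the restored forest
    - created before but changed after: exist, but updates are lost
    """
    to_recreate = [o.dn for o in objects if o.when_created > backup_time]
    to_reapply = [
        o.dn
        for o in objects
        if o.when_created <= backup_time < o.when_changed
    ]
    return to_recreate, to_reapply


# Made-up example data.
backup_time = datetime(2017, 12, 28, 2, 0)
exported = [
    DirectoryObject("CN=Alice,OU=Users,DC=corp,DC=example",
                    datetime(2017, 11, 1), datetime(2017, 11, 2)),
    DirectoryObject("CN=Bob,OU=Users,DC=corp,DC=example",
                    datetime(2017, 11, 1), datetime(2017, 12, 29)),
    DirectoryObject("CN=PC42,OU=Computers,DC=corp,DC=example",
                    datetime(2017, 12, 30), datetime(2017, 12, 30)),
]
to_recreate, to_reapply = split_lost_changes(exported, backup_time)
print("Re-create:", to_recreate)
print("Reapply updates:", to_reapply)
```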
Disaster Recovery Plan (DRP)
That being said, a DRP can save you in case of a disaster. I prefer to write it “for dummies”, so that it contains step-by-step guidance with pictures. The reason is that in a disaster the pressure is extremely high (been there, done that) and you might not be thinking as clearly as you would in a normal situation. When you have a TESTED step-by-step guide in your hand and you know that it works, stress levels are much lower. I always recommend that my customers hold an AD DRP firedrill at least once a year, where the DRP is tested and the document updated if needed. Then you will have the confidence to perform in case of a disaster.
I have been using the following structure in the DRPs I have made:
- Forest Information
  - Detailed information about the forest and domains
  - Detailed information about the DC which is going to be restored
  - Information on where to find backups and passwords
  - Information about the isolated network, how to connect to it etc.
  - Communication plan in your organization
- Forest Recovery
  - Step-by-step plan for how to recover
  - Post-restore configuration including:
    - Resetting administrator, DC, krbtgt passwords
    - Verify DC functionality
    - Clean up metadata from DCs, sites and services
    - Raise and invalidate RID Pool
    - Configure and validate all the critical services
    - Verify forest recovery by promoting additional DCs
- SYSVOL Recovery
  - How to recover SYSVOL data
  - Describe common disaster recovery scenarios around SYSVOL
- Object Recovery
  - Describe common scenarios
- DNS Recovery
  - Provide information on how DNS zones are backed up
  - Provide information on how to recover
- Known Errors
  - Known errors found in this environment during firedrills
- Commands
  - List of commands and tools used during the firedrill. I have collected all the tools onto an ISO image which can be used during the process
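A structure like this is easy to check mechanically before each firedrill. Below is a minimal Python sketch, assuming the DRP is available as plain text and uses the section names from the list above. The function names, the sample text, and the 365-day threshold (taken from the "at least once a year" recommendation) are illustrative assumptions.

```python
from datetime import date
from typing import List

# Section names taken from the DRP structure listed above.
REQUIRED_SECTIONS = [
    "Forest Information",
    "Forest Recovery",
    "SYSVOL Recovery",
    "Object Recovery",
    "DNS Recovery",
    "Known Errors",
    "Commands",
]


def missing_sections(drp_text: str) -> List[str]:
    """Return required section names that do not appear in the DRP text."""
    lowered = drp_text.lower()
    return [s for s in REQUIRED_SECTIONS if s.lower() not in lowered]


def firedrill_overdue(last_drill: date, today: date, max_days: int = 365) -> bool:
    """True if more than max_days have passed since the last firedrill."""
    return (today - last_drill).days > max_days


# Made-up example: a draft DRP that is still missing two sections.
draft = """
Forest Information
Forest Recovery
SYSVOL Recovery
Object Recovery
Commands
"""
print("Missing:", missing_sections(draft))
print("Overdue:", firedrill_overdue(date(2017, 12, 28), date(2019, 1, 15)))
```

A check like this only proves the headings exist. Whether the steps under them actually work is what the firedrill itself is for.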
I highly recommend making a DRP for your on-premises ADDS. When you have it in place and it’s tested, you will be in much safer waters in case of a disaster. Below is a link to the Microsoft AD forest recovery guidance.