Simple SCR Lab Demo

    Introduction

    As you may already know, Microsoft has released a great new feature in Exchange 2007 SP1 called Standby Continuous Replication (SCR) to extend the recoverability of Exchange and add site resiliency. While deploying a CCR cluster gives hardware and volume resiliency, SCR goes further by providing a third, offsite copy of the Exchange mailbox databases. In this type of environment the servers involved are generally referred to as 'sources' and 'targets': the SCR server is the target, and the CCR servers (or SCC or stand-alone mailbox servers) are the sources. An SCR target server can have multiple sources, but in this case we will use a single stand-alone mailbox server as the source.

    The goal of this blog is to show you how to set up a simple lab to demonstrate the SCR feature and get familiar with the process for failover and failback of a database. It is not meant to be an all-inclusive step-by-step guide for every possible SCR scenario. I also assume you already know how to install an Exchange 2007 server and have the appropriate infrastructure in place to support it (e.g. global catalog servers, DNS, networking, etc.). There are several good sources of material covering this, from the TechNet articles to the MSExchangeTeam blog site and even the help file that comes with SP1. Thanks go out to Scott Schnoll from Microsoft for his blogs on this topic.

    The lab environment used for this demo consists of a single AD site with a global catalog server, a CAS/HT server, a stand-alone mailbox server, an SCR target server, and two workstations, one with Outlook 2003 and one with Outlook 2007. I won't get into the details of setting up the lab or machines. This was performed in a VMware Workstation 6 lab team environment.

    Here’s a diagram of the layout.

    Server and environmental requirements

    Since the primary function of the SCR server is to take over the role of providing mailbox access in the event of a disaster, it has certain requirements and guidelines.

    The SCR server (target) system requirements are the following:

    • The hardware should be the same or as close to the production mailbox servers with respect to system RAM, processors and hardware platform.
    • The drive space and volumes allocated need to match or exceed those of the source servers. Drive letters are important since the SCR target can have multiple sources, so proper planning needs to be done prior to deploying an SCR target. The drive letters and paths for each source must match those on the SCR server. For example, if the source server has databases and logs on F:\SG1\MBXDB1 and G:\Logs\SG1, the SCR target needs to have the same drive letters and paths.
    • The operating system needs to be at the same version and patch level as that of the Mailbox Servers.
    • Exchange 2007 needs to be the 64-bit Enterprise version with SP1.
    • The installation directory for Exchange needs to be the same on the source and target servers. Normally this is the default directory of "C:\Program Files\Microsoft\Exchange Server".
    • SCR is configured and managed solely through the Exchange Management Shell. You can use the Exchange Management Console (EMC) for other tasks, but it's easier to use the shell.
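    The drive letter and path requirement is easy to get wrong once a target has more than one source, so it can help to check the layout on paper first. The following is a minimal sketch in Python, not anything that ships with Exchange. It assumes you have already written down the database and log paths each source uses and the drive letters that exist on the SCR target. The function name, the MBX2 server and the backslashed paths are hypothetical and used only for illustration.

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def check_scr_paths(source_paths, target_drives):
    """Return a list of problems; an empty list means no conflicts were found.

    source_paths: dict mapping a source server name to the paths it uses.
    target_drives: drive letters present on the SCR target, e.g. {"F:", "G:"}.
    """
    problems = []
    owners = defaultdict(set)  # normalized path -> source servers that use it
    target = {drive.upper() for drive in target_drives}

    for server, paths in source_paths.items():
        for raw in paths:
            path = PureWindowsPath(raw)
            # The target must expose the same drive letter as the source.
            if path.drive.upper() not in target:
                problems.append(f"{server}: target has no {path.drive} volume for {raw}")
            owners[str(path).lower()].add(server)

    # Two sources cannot both replicate into the same path on one target.
    for path, servers in sorted(owners.items()):
        if len(servers) > 1:
            problems.append(f"{path} is used by more than one source: {sorted(servers)}")
    return problems


if __name__ == "__main__":
    sources = {
        "MBX1": [r"F:\SG1\MBXDB1", r"G:\Logs\SG1"],
        "MBX2": [r"F:\SG1\MBXDB1", r"H:\Logs\SG2"],  # hypothetical second source
    }
    for problem in check_scr_paths(sources, {"F:", "G:"}):
        print(problem)
```

    The check only compares strings. It does not confirm free space or that the folders exist, which still has to be verified on the servers themselves.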

    Installing SCR server

    Install the SCR server by following the general guidelines for an Exchange 2007 Mailbox server installation together with the SCR system requirements listed above. Skip the clustering piece, since this will be a stand-alone server. A quick summary of the installation steps is as follows:

    • Install the prerequisite software required for Exchange 2007 (.NET Framework 2.0, PowerShell, etc)
    • Install Exchange 2007 using the command line or install wizard
      • Run "Setup.com /mode:install /roles:m"

    Post-installation steps

    Enabling SCR

    1. Use the default storage group and database created when Exchange was installed on the SCR server, or remove them and create your own, as I did, to differentiate them from the defaults. The names do not matter since they are only placeholders.
    2. Prepare the placeholder storage groups and databases for recovery purposes:
      1. Create a placeholder storage group on the SCR server for each production storage group on the SCR source server. Their directories should not conflict with any paths or names of the production databases and logs, so create them on another volume. They are merely placeholders for the failover process and will not be used to mount actual databases.
      2. Create a single database in each placeholder storage group to represent each production database used as an SCR source. The name is irrelevant, but you should choose something meaningful and different for each storage group and database.
      3. Dismount each placeholder database and delete its database files (i.e. the *.edb files). WARNING: delete only the files on disk. DO NOT DELETE THE DATABASE OBJECTS LISTED IN THE MANAGEMENT CONSOLE. They must be retained so AD knows about the database configuration when a database is failed over.
    3. Enable SCR by running the following command for each storage group on the source. I set the ReplayLagTime and TruncationLagTime to the lowest level for this lab. Normally these would follow your required recovery times and parameters:

    Enable-StorageGroupCopy MBX1\SG1 -StandbyMachine E2K7SCR -ReplayLagTime 0.0:0:0 -TruncationLagTime 0.0:0:0

    4. Seed the database copy by running the following from the SCR target machine:

    Update-StorageGroupCopy MBX1\SG1 -StandbyMachine E2K7SCR

    Replay Lag Time[1]

    After the log files are copied to the SCR target computer, SCR does something LCR and CCR do not. Instead of immediately replaying the log files into the copy of the database, SCR enforces a built-in replay delay of 50 log files. SCR also allows you to specify an additional time delay on top of this built-in delay. Delaying replay activity is useful in a variety of scenarios. For example, in the event of logical corruption of an active database, a delay could prevent logical corruption of the SCR target database.

    The administrator-controlled replay delay is set using a parameter called ReplayLagTime, which dictates the amount of time the Exchange replication service should wait before replaying log files that have been copied to the SCR target computer. The format is Days.Hours:Minutes:Seconds, and the default value is 24 hours. The maximum allowable setting for this value is seven days. The minimum allowable setting is zero seconds, and setting this value to zero seconds effectively eliminates any delay in log replay activity above the default delay of 50 log files.
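    To make the Days.Hours:Minutes:Seconds format and its limits concrete, here is a small Python sketch that parses a value such as 0.0:0:0 and rejects anything above seven days. It is only an illustration of the rules stated above. The function name is hypothetical, and the shell itself may accept shorter forms of the value that this sketch does not handle.

```python
import re
from datetime import timedelta

MAX_LAG = timedelta(days=7)  # maximum allowed value stated in the text
_LAG_FORMAT = re.compile(r"^(\d+)\.(\d{1,2}):(\d{1,2}):(\d{1,2})$")


def parse_lag_time(value):
    """Convert 'Days.Hours:Minutes:Seconds' to a timedelta between 0 and 7 days."""
    match = _LAG_FORMAT.match(value)
    if match is None:
        raise ValueError(f"expected Days.Hours:Minutes:Seconds, got {value!r}")
    days, hours, minutes, seconds = (int(part) for part in match.groups())
    if hours > 23 or minutes > 59 or seconds > 59:
        raise ValueError(f"field out of range in {value!r}")
    lag = timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)
    if lag > MAX_LAG:
        raise ValueError(f"{value!r} exceeds the seven-day maximum")
    return lag
```

    With this sketch, parse_lag_time("0.0:0:0") is the zero delay used in this lab and parse_lag_time("1.0:0:0") is the 24-hour default.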

    In addition to ReplayLagTime, Exchange has a built-in, hardcoded delay of 50 log files, regardless of the value for ReplayLagTime. To determine when a log file should be replayed, Exchange uses the larger of ReplayLagTime or x log files, where x=50. This is an additional safeguard against the need to reseed a storage group in situations where an SCR source that uses continuous replication (for example, a clustered mailbox server in a CCR environment) experiences a failover and one or more storage groups need to be brought online using the Restore-StorageGroupCopy cmdlet. (Seeding is the process of using the Extensible Storage Engine (ESE) streaming backup APIs to make an online copy of the SCR source database on the SCR target computer.) By delaying replay activity on the SCR targets, when a lossy failover for an SCR source occurs, the chances of needing to reseed the SCR copies will be minimized because the nature of the data loss on the SCR source puts the two copies closer together in time.
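    One way to read "the larger of ReplayLagTime or 50 log files" is that a copied log is replayed only when both delays have passed. The Python sketch below models that reading. It is my interpretation of the rule, not the replication service's actual code, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

BUILT_IN_LOG_LAG = 50  # hardcoded delay of 50 log files described in the text


def is_replayable(log_generation: int, log_copied_at: datetime,
                  newest_generation: int, replay_lag: timedelta,
                  now: datetime) -> bool:
    """True when a copied log has cleared both the time delay and the 50-log delay."""
    old_enough = now - log_copied_at >= replay_lag
    far_enough_behind = newest_generation - log_generation >= BUILT_IN_LOG_LAG
    return old_enough and far_enough_behind
```

    Even with ReplayLagTime set to zero, as in this lab, a log that is only 20 generations behind the newest copied log would not be replayed yet under this model.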

    Truncation Lag Time

    In the RTM version of Exchange 2007, rules are enforced in a continuous replication environment so that a log file is not deleted unless it has been backed up and replayed into the copy of the database. When using SCR, this rule is modified. SCR (which introduces the concept of multiple database copies) allows log files to be truncated on the SCR source computer as soon as they are inspected by all SCR target computers. Log truncation at the SCR source server does not wait until all logs have been replayed into all SCR targets because SCR target copies can be configured with large log replay lag times.

    You can also add an additional delay to log truncation by using a new parameter called TruncationLagTime, which specifies how long the Exchange replication service should wait (in Days.Hours:Minutes:Seconds format) before truncating log files that have been copied to the SCR target computer and replayed into the copy of the database. The time period begins as soon as the log files have been successfully replayed into the copy of the database. The maximum allowable setting for this value is seven days. The minimum allowable setting is zero seconds, and setting the value to zero seconds effectively eliminates any delay in log truncation activity.

    In an SCR environment, a background thread runs every three minutes to determine if any log files need to be truncated. If the log file generation sequence is below the log file checkpoint for the storage group, and the log file is older than ReplayLagTime + TruncationLagTime, a log file on the SCR target will be truncated.
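    The two conditions in this paragraph can be written as a single check. The Python sketch below is a model of the rule as stated, with hypothetical names. The real background thread and its three-minute schedule are not reproduced here.

```python
from datetime import timedelta


def target_log_can_be_truncated(generation: int, checkpoint_generation: int,
                                log_age: timedelta, replay_lag: timedelta,
                                truncation_lag: timedelta) -> bool:
    """Apply the two conditions from the text to one log file on the SCR target."""
    below_checkpoint = generation < checkpoint_generation
    older_than_both_lags = log_age > replay_lag + truncation_lag
    return below_checkpoint and older_than_both_lags
```

    With both lag times set to zero, as in this lab, any log below the checkpoint qualifies as soon as it has any age at all.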

    In an LCR or CCR environment that is extended with SCR, a log file on the SCR source will be truncated if the following four criteria are met: the log file has been backed up, the log file generation sequence is below the log file checkpoint for the storage group, the passive copy of the storage group is in a state that allows the log file to be truncated, and all SCR targets have inspected the log file.
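    The four criteria only allow truncation when all of them hold at once. The following Python sketch shows that, including how "all SCR targets have inspected the log file" might be tracked per target. The data structures and names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogFile:
    generation: int
    backed_up: bool


def can_truncate(log, checkpoint_generation, passive_copy_allows, inspected_up_to):
    """Return True only when all four criteria from the text are met.

    inspected_up_to maps each SCR target name to the highest log generation
    that target has inspected so far.
    """
    all_targets_inspected = all(
        generation >= log.generation for generation in inspected_up_to.values()
    )
    return (
        log.backed_up
        and log.generation < checkpoint_generation
        and passive_copy_allows
        and all_targets_inspected
    )
```

    A single target that has fallen behind is enough to hold the log file, which is why a stalled SCR target can cause log files to build up.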

    Monitoring SCR

    Use the Get-StorageGroupCopyStatus command to verify that SCR replication is in a healthy state.

    Get-StorageGroupCopyStatus -Identity <SourceServer>\<StorageGroup> -StandbyMachine <SCR target machine>

    For example,

    Get-StorageGroupCopyStatus -Identity MBX1\SG1 -StandbyMachine E2K7SCR | fl Summary*,Copy*

    SummaryCopyStatus : Healthy

    CopyQueueLength : 0

    This shows that replication is healthy and that no log files are waiting to be copied to the target server. For a more detailed view, pipe the output to fl without a property list:

    Get-StorageGroupCopyStatus -Identity <SourceServer>\<StorageGroup> -StandbyMachine <SCR target machine> | fl

    Pay particular attention to the SummaryCopyStatus, Failed, CopyQueueLength, ReplayQueueLength, and LastInspectedLogTime fields. If the CopyQueueLength is more than 3 or the ReplayQueueLength is more than 20, you should investigate the replication further. If the LastInspectedLogTime is not current, a service may be stopped or replication may be taking a long time.
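    If you check these fields often, the thresholds above can be turned into a simple checklist. The Python sketch below assumes the status values have already been copied into a dictionary by hand or by a script of your own. It does not call Exchange. The 15-minute limit for LastInspectedLogTime is my own assumption, since "not current" is not given a number here.

```python
from datetime import datetime, timedelta


def review_copy_status(status: dict, now: datetime,
                       max_inspect_age: timedelta = timedelta(minutes=15)) -> list:
    """Return warnings for one storage group copy; an empty list means nothing stood out."""
    warnings = []
    if status["SummaryCopyStatus"] != "Healthy":
        warnings.append(f"SummaryCopyStatus is {status['SummaryCopyStatus']}")
    if status["Failed"]:
        warnings.append("Failed is set")
    if status["CopyQueueLength"] > 3:
        warnings.append(f"CopyQueueLength is {status['CopyQueueLength']} (more than 3)")
    if status["ReplayQueueLength"] > 20:
        warnings.append(f"ReplayQueueLength is {status['ReplayQueueLength']} (more than 20)")
    # Assumed threshold: treat the copy as stale when inspection lags this far behind.
    if now - status["LastInspectedLogTime"] > max_inspect_age:
        warnings.append("LastInspectedLogTime is not current")
    return warnings
```

    Any warning is a prompt to look closer, not a diagnosis. A stopped replication service and a slow link produce the same symptoms in these fields.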

    Activating SCR Server for database failover

    In the event of a site or database disaster, you can activate the SCR database for one or more databases and provide users continued access to their mailbox.

    To activate an SCR target database:

    1. If the source server is still reachable, dismount the source database using the command:

    Dismount-Database MBX1\SG1\DB1

    2. Run the following command from the Exchange Management Shell on the SCR server:

    Restore-StorageGroupCopy MBX1\SG1 -StandbyMachine E2K7SCR

    The command attempts to copy any remaining log files to the SCR target. If the source server cannot be reached and the command returns an error, run the same command again with the -Force switch.

    3. Check whether the database is in a clean shutdown state by running ESEUTIL against the database. For example,

    eseutil /mh C:\SG1\DB1.edb | findstr State

    State: Dirty Shutdown

    If the database is in a dirty shutdown state, you must bring it into a clean shutdown state by using eseutil. Change to the transaction logs directory for the storage group on the SCR server (e.g. cd \sg1\logs) and run the following command using the appropriate log file prefix (Exx):

    eseutil /r E00

    4. Re-verify that the database is in a clean shutdown state by running ESEUTIL against the database again.

    eseutil /mh C:\SG1\DB1.edb | findstr State

    State: Clean Shutdown

    5. Update Active Directory with the new locations of the storage group and database files by using the Move-StorageGroupPath and Move-DatabasePath commands.

    Move-StorageGroupPath E2K7SCR\TempSG -SystemFolderPath C:\SG1 -LogFolderPath C:\Logs -ConfigurationOnly

    Move-DatabasePath E2K7SCR\TempSG\TempDB -EdbFilePath C:\SG1\DB1.edb -ConfigurationOnly

    6. Configure the mailbox database on the SCR target so that it can be overwritten by a restore:

    Set-MailboxDatabase E2K7SCR\TempSG\TempDB -AllowFileRestore:$true

    7. Mount the SCR target database:

    Mount-Database E2K7SCR\TempSG\TempDB

    8. Re-home the mailboxes to the SCR target database:

    Get-Mailbox -Database MBX1\SG1\DB1 | where {$_.ObjectClass -NotMatch '(SystemAttendantMailbox|ExOleDbSystemMailbox)'} | Move-Mailbox -ConfigurationOnly -TargetDatabase E2K7SCR\TempSG\TempDB

    9. Wait for AD to replicate the changes across the organization. You should now be able to access the mailboxes homed on the recovered database on the SCR target. Update any Outlook 2003 client profiles to point to the SCR target server. OWA and Outlook 2007 clients will be updated automatically.
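    The shutdown-state check in the activation steps is the one decision point in the procedure: the State line decides whether eseutil /r has to be run before the database can be mounted. As a small illustration, the Python sketch below reads the text that eseutil /mh prints and reports whether log replay is still needed. It assumes the output contains a line in the "State: Clean Shutdown" or "State: Dirty Shutdown" form shown above. The function names are hypothetical.

```python
import re


def database_state(eseutil_output):
    """Return the text after 'State:' in `eseutil /mh` output, or None if it is absent."""
    match = re.search(r"^\s*State:\s*(.+?)\s*$", eseutil_output, re.MULTILINE)
    return match.group(1) if match else None


def needs_log_replay(eseutil_output):
    """True when the database is not in a clean shutdown state."""
    state = database_state(eseutil_output)
    if state is None:
        raise ValueError("no State line found in eseutil output")
    return state != "Clean Shutdown"
```

    Treating anything other than "Clean Shutdown" as needing attention is the cautious choice, since mounting a database that is not cleanly shut down will fail.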

    Failback to original mailbox server

    1. To fail back to the original site, you first need to prepare the original server. For my lab I assumed we had a volume failure (I know our database and logs are on C:, so let's pretend they are on some other drives with nothing on them). I deleted the original storage group and database and all of the database and log files. I then recreated a temporary directory for the database and logs, created a new storage group and database, and gave them their original names.
    2. Now we need to reseed the original server by enabling the storage group copy from the SCR server. Run the following command:

    Enable-StorageGroupCopy -Identity E2K7SCR\TempSG -StandbyMachine MBX1 -ReplayLagTime 0.0:0:0 -TruncationLagTime 0.0:0:0

    From the new SCR target (the original mailbox server, MBX1), run:

    Update-StorageGroupCopy -Identity E2K7SCR\TempSG -StandbyMachine MBX1

    3. Now perform the same procedure described above for failing over to the SCR target, except think of the original server as the new SCR target, because it is.

    From the new SCR target (the original server), run the following commands:

    Dismount-Database E2K7SCR\TempSG\TempDB

    Restore-StorageGroupCopy E2K7SCR\TempSG -StandbyMachine MBX1

    Move-StorageGroupPath MBX1\SG1 -SystemFolderPath C:\SG1 -LogFolderPath C:\Logs -ConfigurationOnly

    Move-DatabasePath e2k7mbxdb1 -EdbFilePath "E:\Database1\First Storage Group\mailbox database.edb" -ConfigurationOnly

    Set-MailboxDatabase MBX1\SG1\DB1 -AllowFileRestore:$true

    Mount-Database MBX1\SG1\DB1

    Get-Mailbox -Database E2K7SCR\TempSG\TempDB | where {$_.ObjectClass -NotMatch '(SystemAttendantMailbox|ExOleDbSystemMailbox)'} | Move-Mailbox -ConfigurationOnly -TargetDatabase MBX1\SG1\DB1

    4. Wait for AD to replicate the changes across the organization. You should now be able to access the mailboxes homed on the recovered database on the original server. Update any Outlook 2003 client profiles to point back to the original server. OWA and Outlook 2007 clients will be updated automatically.
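    Failover and failback use the same commands with the roles of the two servers swapped, which makes it easy to mistype an identity under pressure. The Python sketch below only builds the command text for one direction so it can be reviewed before anything is pasted into the shell. It runs nothing against Exchange. The function name is hypothetical, the Server\StorageGroup\Database identities and paths in the example are assumed values for this lab, and identities containing spaces would need quoting that this sketch does not add.

```python
MAILBOX_FILTER = (
    "where {$_.ObjectClass -NotMatch "
    "'(SystemAttendantMailbox|ExOleDbSystemMailbox)'}"
)


def failover_commands(active_sg, active_db, standby_server,
                      standby_sg, standby_db, system_path, log_path, edb_path):
    """Return the shell commands, in order, that move service from active to standby."""
    return [
        f"Dismount-Database {active_db}",
        f"Restore-StorageGroupCopy {active_sg} -StandbyMachine {standby_server}",
        f'Move-StorageGroupPath {standby_sg} -SystemFolderPath "{system_path}" '
        f'-LogFolderPath "{log_path}" -ConfigurationOnly',
        f'Move-DatabasePath {standby_db} -EdbFilePath "{edb_path}" -ConfigurationOnly',
        f"Set-MailboxDatabase {standby_db} -AllowFileRestore:$true",
        f"Mount-Database {standby_db}",
        f"Get-Mailbox -Database {active_db} | {MAILBOX_FILTER} | "
        f"Move-Mailbox -ConfigurationOnly -TargetDatabase {standby_db}",
    ]


if __name__ == "__main__":
    # Failback direction: E2K7SCR is currently active, MBX1 is the new target.
    for command in failover_commands(
        r"E2K7SCR\TempSG", r"E2K7SCR\TempSG\TempDB", "MBX1",
        r"MBX1\SG1", r"MBX1\SG1\DB1", r"C:\SG1", r"C:\Logs", r"C:\SG1\DB1.edb",
    ):
        print(command)
```

    Swapping the first five arguments gives the original failover direction. The eseutil shutdown-state check is left out of the list on purpose because it depends on what the database header reports.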

    Feedback is welcome. Your experience in your own lab might be slightly different, so please leave comments and I will amend this blog as needed.

    References

    [1] Standby Continuous Replication in Exchange Server 2007 Service Pack 1 (http://www.microsoft.com/technet/technetmag/issues/2007/12/SCR/default.aspx)

    MSExchangeTeam (http://www.msexchange.org)

    TechNet Magazine (http://www.microsoft.com/technet/technetmag/)
