Kill all waiting backup jobs in Bacula

If you are a Bacula user, you have most likely seen a Director clog up: one or more jobs get stuck while others keep piling up behind them. With the new directives for managing duplicate jobs this should no longer happen, but last night I found out that two of my backup servers had managed to deadlock.

The resulting job queue was over 300 jobs long, and restarting the Director did not seem to help. So I threw together a little shell script that uses bconsole to cancel all jobs in a waiting state.

As usual, use at your own risk. I only tested it on my servers, where it worked fine…

#!/bin/bash
# Collect the first field (the job ID) of every line reported as waiting.
jobIds=$(echo 'status dir running' | bconsole | grep -F 'is waiting' | awk '{print $1}')
for i in $jobIds
do
  # Only hand purely numeric job IDs to bconsole.
  if ! [[ "$i" =~ ^[0-9]+$ ]]
  then
    echo "Error: job ID $i is not a number!" >&2
  else
    echo "Killing waiting Bacula job $i"
    echo "cancel jobid=$i" | bconsole
  fi
done
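
Before cancelling several hundred jobs it can be reassuring to see what would be sent to the Director first. The following is a minimal sketch of such a dry run, not part of the script I actually ran: the function names waiting_job_ids and cancel_commands are made up for this example, and it relies on the same assumption as the script above, namely that a waiting job's line contains "is waiting" and starts with the job ID.

```shell
#!/bin/bash
# Read "status dir running" output on stdin and print the job ID of every
# line that contains "is waiting" and whose first field is purely numeric.
waiting_job_ids() {
  awk '/is waiting/ && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Read job IDs on stdin and print the bconsole command that would cancel
# each of them. Nothing is sent to the Director here.
cancel_commands() {
  local id
  while read -r id
  do
    echo "cancel jobid=$id"
  done
}
```

Running echo 'status dir running' | bconsole | waiting_job_ids | cancel_commands prints the cancel commands for review. If the list looks right, run the script above to actually cancel the jobs.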

Restoring Exchange 2003/2008 using Bacula

Because the instructions in the Bacula documentation left me hanging on how to actually restore the Exchange data from a backup, I am writing this little summary after extracting all information from the mailing lists.

In my case, I have one server called ‘axmail-fd’ running Exchange 2003 and another server called ‘axemail-fd’ running Exchange 2003 SP2. The following steps are needed to restore into the Recovery Storage Group on the new server in order to migrate the mail from the old to the new server.

To prepare, create the Recovery Storage Group on the target Exchange server. In it, create the database you want to recover, for example “Mailbox Store (AXMAIL)”, which will generate a “Mailbox Store (AXMAIL).edb” in the Recovery Storage Group folder. Clear out any log files or other remnants of previous restores, as Exchange tends to get confused if data from multiple databases is in there.

Start bconsole, select the restore mode and select the Exchange backup to restore:

*restore
First you select one or more JobIds that contain files to be restored. 
You will be presented several methods of specifying the JobIds. Then 
you will be allowed to select which files from those JobIds are to be restored.
To select the JobIds, you have the following choices:     
1: List last 20 Jobs run                             
2: List Jobs where a given File is saved             
3: Enter list of comma separated JobIds to select    
4: Enter SQL list command                            
5: Select the most recent backup for a client        
6: Select backup for a client before a specified time     
7: Enter a list of files to restore                       
8: Enter a list of files to restore before a specified time     
9: Find the JobIds of the most recent backup for a client      
10: Find the JobIds for a backup for a client before a specified time    
11: Enter a list of directories to restore for found JobIds              
12: Cancel                                                           
Select item:  (1-12): 5                                                 
 Defined Clients:                                                              
...     
4: axmail-fd     
...    
10: axemail-fd
Select the Client (1-10): 4

The defined FileSet resources are:
1: AXMAIL Full Data Set
2: Exchange
Select FileSet resource (1-2): 2
+-------+-------+----------+---------------+---------------------+-------------------------------+
| JobId | Level | JobFiles | JobBytes      | StartTime           | VolumeName                    |
+-------+-------+----------+---------------+---------------------+-------------------------------+
|    90 | F     |       13 | 5,313,968,371 | 2009-06-24 15:36:10 | Deventer_Exchange_Backup_0013 |
|    90 | F     |       13 | 5,313,968,371 | 2009-06-24 15:36:10 | Deventer_Exchange_Backup_0014 |
|    91 | I     |        5 |     2,671,174 | 2009-06-24 17:28:25 | Deventer_Exchange_Backup_0014 |
|    92 | I     |        5 |       233,882 | 2009-06-24 18:00:01 | Deventer_Exchange_Backup_0014 |
|   118 | I     |       17 |    40,099,025 | 2009-06-25 18:00:02 | Deventer_Exchange_Backup_0014 |
+-------+-------+----------+---------------+---------------------+-------------------------------+

You have selected the following JobIds: 90,91,92,118
Building directory tree for JobId(s) 90,91,92,118 ...
24 files inserted into the tree.

Now we want to select the entire First Storage Group to restore, except for the Public Folders store.

Note: If you have the mailbox store defined, the restoration may work as is. A mailing list conversation from 2008 stated that only one database can be restored at a time, which is why we unmark the Public Folder store here.

You are now entering file selection mode where you add (mark) and
remove (unmark) files to be restored. No files are initially added, unless
you used the "all" keyword on the command line.                           
Enter "done" to leave this mode.                                          
cwd is: /
$ mark *
29 files marked.                               
$ cd "@EXCHANGE/Microsoft Information Store/First Storage Group"
cwd is: /@EXCHANGE/Microsoft Information Store/First Storage Group/
$ unmark Public*                                                   
4 files unmarked.                                                  
$ lsmark                                                           
*C:\Program Files\Exchsrvr\mdbdata\E0002FC5.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FC6.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FC7.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FC8.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FC9.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FCA.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FCB.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FCC.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FCD.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FCE.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FCF.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD0.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD1.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD2.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD3.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD4.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD5.log                    
*C:\Program Files\Exchsrvr\mdbdata\E0002FD6.log                    
*Mailbox Store (AXMAIL)/                                           
*C:\Program Files\Exchsrvr\mdbdata\priv1.edb                       
*C:\Program Files\Exchsrvr\mdbdata\priv1.stm                       
*DatabaseBackupInfo                                                
$ done

Bootstrap records written to /var/bacula/axnet-dir.restore.23.bsr  
The job will require the following   
Volume(s)                 Storage(s)                SD Device(s)
===========================================================================
Deventer_Exchange_Backup_ File                      FileStorage

25 files selected to be restored.

Run Restore job
JobName:         RestoreFiles
Bootstrap:       /var/bacula/axnet-dir.restore.23.bsr
Where:           /tmp/bacula-restores
Replace:         always
FileSet:         Empty FileSet
Backup Client:   axmail-fd
Restore Client:  axmail-fd
Storage:         File
When:            2009-06-26 00:44:50
Catalog:         MyCatalog
Priority:        10
Plugin Options:  *None*

We now need to change the target to the new server and clear out the Where setting:

OK to run? (yes/mod/no): m
Parameters to modify:    
1: Level    
2: Storage    
3: Job    
4: FileSet    
5: Restore Client    
6: When    
7: Priority    
8: Bootstrap    
9: Where   
10: File Relocation   
11: Replace   
12: JobId   
13: Plugin Options
Select parameter to modify (1-13): 5                

The defined Client resources are:    
1: bartje-fd    
2: nakor-fd    
3: hermione-fd
4: snape-fd
5: hagrid-fd
6: axnet-fd
7: axweb-fd
8: axmail-fd
9: axexact-fd
10: axklant-fd
11: axemail-fd
Select Client (File daemon) resource (1-11): 11

Run Restore job
JobName:         RestoreFiles
Bootstrap:       /var/bacula/axnet-dir.restore.23.bsr
Where:           /tmp/bacula-restores
Replace:         always
FileSet:         Empty FileSet
Backup Client:   axmail-fd
Restore Client:  axemail-fd
Storage:         File
When:            2009-06-26 00:44:50
Catalog:         MyCatalog
Priority:        10
Plugin Options:  *None*
OK to run? (yes/mod/no): m

Parameters to modify:    
1: Level    
2: Storage    
3: Job    
4: FileSet    
5: Restore Client    
6: When    
7: Priority    
8: Bootstrap    
9: Where   
10: File Relocation   
11: Replace   
12: JobId   
13: Plugin Options
Select parameter to modify (1-13): 9
Please enter path prefix for restore (/ for none): /

Run Restore job
JobName:         RestoreFiles
Bootstrap:       /var/bacula/axnet-dir.restore.23.bsr
Where:
Replace:         always
FileSet:         Empty FileSet
Backup Client:   axmail-fd
Restore Client:  axemail-fd
Storage:         File
When:            2009-06-26 00:44:50
Catalog:         MyCatalog
Priority:        10
Plugin Options:  *None*
OK to run? (yes/mod/no): y

Job queued. JobId=132
You have messages.

Notice how we did not rename any of the database paths: if Exchange 2003 has a Recovery Storage Group defined, that group will receive any restores, so the paths do not need to be adjusted by hand.
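
For reference, the answers typed in the session above can be collected in one place. This is only a sketch: the function name restore_answers is made up for this example, and the menu numbers (5, 4, 2, 5, 11, 9) are the ones from my transcript, so they will differ on any other Director and have to be checked against your own menus.

```shell
#!/bin/bash
# Print, one per line, the answers typed in the interactive restore session
# above. The menu numbers are specific to that transcript and must be
# verified against your own Director before they are reused.
restore_answers() {
  cat <<'EOF'
restore
5
4
2
mark *
cd "@EXCHANGE/Microsoft Information Store/First Storage Group"
unmark Public*
done
m
5
11
m
9
/
y
EOF
}
```

Calling restore_answers just prints the list. Piping it into bconsole (restore_answers | bconsole) should replay the session, assuming bconsole handles the prompts the same way when reading from a pipe. I would only do that after walking through the restore interactively once.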

Troubleshooting

Error 0xc7fe1f42

If you get this error: “Error: HrESERestoreAddDatabase failed with error 0xc7fe1f42 – Database not found. Check that the Database you are trying to restore actually exists in the Storage Group you are restoring to”, then you have not created the database in the Recovery Storage Group in the Exchange manager. Make sure you created the database there and check that its name matches.

Error 0xc7ff1004

I ran into this message after figuring out how to restore the data: “Error: HrESERestoreComplete failed with error 0xc7ff1004 – Unknown error”. The FD reports this error after the data has been restored, and then crashes.

You can run eseutil.exe against the .edb file to check its state (use eseutil /mh filename.edb); you will probably see the state reported as “Dirty Shutdown”.

The cause is a problem during the backup, which makes the restore fail. If you have all the log files (E00xxxx.log in ‘restore’ in the Recovery Storage Group folder), you can use the eseutil /cc restore command to replay the log files and fix the database.

After replaying the logs, the database should mount fine and all the mail should be there.