Jul 17 2015
 

Intro

It took me a while to figure out the optimal configuration for a tape library with two streamers used with the Bacula backup software.

The exact model of the tape library is a Quantum Scalar i40 with two LTO5 streamers. It is hooked up directly to the main NFS server (so heavy backup traffic goes via localhost only), a server that runs only the bacula-sd and bacula-fd services. The Bacula director runs on a separate, dedicated backup server.

Currently there are around 20 other servers connected to this system as clients, with various daily Incremental, weekly Differential and monthly Full backup jobs scheduled.

Some additional info about this setup is in the previous post. Config files below:


 


Relevant config files from Backup server


/etc/bacula/bacula-dir.conf

Director {  
  Name = prod-backup-dir
  QueryFile = "/etc/bacula/scripts/query.sql"
  WorkingDirectory = "/var/lib/bacula"
  PidDirectory = "/var/run/bacula"
  Password = "xxxxx"
  Messages = Daemon
  DirAddress = prod-backup.domain.com
  Maximum Concurrent Jobs = 20
}
@/etc/bacula/JobDefs/JobDefs.conf
@|"sh -c 'cat /etc/bacula/Job/*'"
@|"sh -c 'cat /etc/bacula/FileSet/*'"
@|"sh -c 'cat /etc/bacula/Schedule/*'"
@|"sh -c 'cat /etc/bacula/Clients-enabled/*'"
@|"sh -c 'cat /etc/bacula/Storage/*'"
@|"sh -c 'cat /etc/bacula/Pool/*'"
Catalog {
  Name = MyCatalog
  dbaddress = prod-db.domain.com ;
  dbname = "bacula"; dbuser = "bacula"; dbpassword = "xxxxx"
}
Messages {
  Name = Standard
  mailcommand = "/usr/lib/bacula/bsmtp -h prod-mailhub.domain.com -f \"\(Bacula\) \<%r\>\" -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "/usr/lib/bacula/bsmtp -h prod-mailhub.domain.com  -f \"\(Bacula\) \<%r\>\" -s \"Bacula: Intervention needed for %j\" %r"
  mail = [email protected] = all, !skipped            
  operator = [email protected] = mount
  console = all, !skipped, !saved
  append = "/var/lib/bacula/log" = all, !skipped
  catalog = all
}
Messages {
  Name = Daemon
  mailcommand = "/usr/lib/bacula/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula client %c job %n exit code %e  \" %r"
  mail = [email protected] = all, !skipped            
  console = all, !skipped, !saved
  append = "/var/lib/bacula/log" = all, !skipped
}
Console {
  Name = prod-backup-mon
  Password = "xxxxxxxxxxx"
  CommandACL = status, .status
}

Example job definition /etc/bacula/Job/Studies2010-1.conf

#----------------------------------
Job {
  Name = Studies2010-1
  Type = Backup
  Client = nfs-prod-fd
  Schedule = MonthlyCycle
  Messages = Daemon
  FileSet = Studies2010-1
  Level = Full
  Pool = lto5-pool
  Priority = 12
  Max Run Time = 1555200 # default limit is 6 days, 518400sec. bumped 3x just in case
  Spool Data = yes
  Spool Attributes = yes

}
#----------------------------------
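
Max Run Time above is given in seconds. As a quick sanity check of the arithmetic in that comment, here is a small shell sketch (the variable names are made up for illustration):

```shell
# 6 days expressed in seconds, then tripled as in the Job comment
six_days=$((6 * 24 * 60 * 60))
max_run_time=$((six_days * 3))
echo "$six_days $max_run_time"   # prints: 518400 1555200
```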

Example fileset, /etc/bacula/FileSet/Studies2010-1.conf

#-------------------------------------------
FileSet {
  Name = "Studies2010-1"
  Include {
    Options {
      signature = MD5
      compression = GZIP5
      noatime = yes
      aclsupport = yes
      wilddir = "/export/studies/201007*"
      wilddir = "/export/studies/201008*"
    }
    Options {
      RegexDir = ".*"
      exclude = yes
    }
    File = "/export/studies"
  }
}

Example Schedule, /etc/bacula/Schedule/MonthlyCycle3.conf

Schedule {
  Name = MonthlyCycle3
  Run = Level=Full Pool=lto5-pool 3rd fri at 23:30
}

Tape library, storage definition:

Storage {
  Name = TapeLibrary
  Address = prod-tapelib.domain.com
  SDPort = 9103
  Password = "xxxxxx"
  Device = QuantumScalar-I40
  Media Type = LTO-5
  Autochanger = yes
  Maximum Concurrent Jobs = 4
}

Pool of tapes defined here:

Pool {
  Name = lto5-pool
  Pool Type = Backup
  Volume Retention = 6 months
  Recycle = yes
  AutoPrune = yes
  Label Format = LTO5
  Storage = TapeLibrary
}

 

Relevant config files from Tape Library server

 

Note that I spool data before saving to the tape – this prevents tape “shoe shine” during Incremental/Differential backups.

 

  
Storage { 
  Name = TapeLibrary
  WorkingDirectory = "/var/spool/bacula"
  Pid Directory = "/var/run"
}
Autochanger {
  Name = QuantumScalar-I40
  Device = Drive0
  Device = Drive1
  Changer Device = /dev/changer
  Changer Command = "/usr/libexec/bacula/mtx-changer %c %o %S %a %d"
}
Device {
  Name = Drive0
  Drive Index = 0
  Media Type = LTO-5
  Archive Device = /dev/nst0
  AutomaticMount = yes
  AlwaysOpen = yes
  RemovableMedia = yes
  RandomAccess = no
  AutoChanger = yes
  Alert Command = "sh -c 'smartctl -H -l error %c'"  
  Maximum Changer Wait = 600
  Maximum Rewind Wait = 600
  Maximum Open Wait = 600
  Spool Directory = /var/spool/bacula/Spool
  Maximum Spool Size = 45G
  Maximum Concurrent Jobs = 2
}
Device {
  Name = Drive1
  Drive Index = 1
  Media Type = LTO-5
  Archive Device = /dev/nst1
  AutomaticMount = yes
  AlwaysOpen = yes
  RemovableMedia = yes
  RandomAccess = no
  AutoChanger = yes
  Alert Command = "sh -c 'smartctl -H -l error %c'"
  Maximum Changer Wait = 600
  Maximum Rewind Wait = 600
  Maximum Open Wait = 600
  Spool Directory = /var/spool/bacula/Spool
  Maximum Spool Size = 45G
  Maximum Concurrent Jobs = 2
}
Messages {
  Name = Standard
  director = prod-backup-dir = all
}
Director {
  Name = prod-backup-dir
  Password = "xxxxxxxx"
}
Director {
  Name = prod-backup-mon
  Password = "xxxxxxxxxx"
  Monitor = yes
}
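
Both Device resources above point at the same Spool Directory and each may spool up to 45G, so in the worst case that filesystem needs roughly 90G free. A minimal sketch for checking this from the shell; `spool_headroom_ok` is a hypothetical helper name, and it assumes a POSIX `df` and `awk` are available:

```shell
# Hypothetical helper: succeed if the filesystem holding directory $1
# has at least ($2 drives) x ($3 GiB per drive) of free space.
spool_headroom_ok() {
    dir=$1
    drives=$2
    per_drive_gib=$3
    need_kb=$((drives * per_drive_gib * 1024 * 1024))
    # df -P prints a header line, then one line per filesystem;
    # the 4th column is the available space in 1K blocks.
    avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}

# Two drives with 45G each, as in the Device resources above
if spool_headroom_ok /var/spool/bacula/Spool 2 45; then
    echo "spool space OK"
else
    echo "not enough free spool space for both drives (or directory missing)"
fi
```

This only looks at free space at the moment it runs, so it is a rough pre-flight check rather than a guarantee.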

Thoughts

Implementing a Bacula-driven backup solution requires some time and effort, but what you get in the end is a sophisticated, enterprise-grade backup system, capable of backing up terabytes of data in an organised and efficient manner.

Used in conjunction with a monitoring system, it offers a fully automated backup solution with minimal operator effort required. Routine tasks boil down to:

 


Jan 05 2015
 

I’ve been experiencing problems with a Dell server running CentOS 6 and Bacula 5.2, hooked up to a Quantum Scalar i40 tape library with two LTO5 drives. The server has two HBAs: the first (PERC-310mini) is used for the server disks, and the second, an LSI SAS2008, has an external SAS port connected to the tape library. More info about this setup is in this post.

Problem: basically, after each server reboot the autochanger device was missing.

After spending endless hours on it I ended up with a workaround. It’s not ideal; well, to be honest it’s a dirty hack, so if there is a better way of doing this I would very much appreciate you dropping a quick comment!

So if you can’t find the tape library changer under CentOS 6 with a Quantum Scalar i40, then read on…

But first, a random picture from my library; it seems like she’s showing her pal a funny cat picture on her phone.

[Image: insta-20]

Background: the autochanger is managed via one of the drives. This is called the Control Path, and it is set once via the autochanger web interface.

From time to time the Quantum Scalar i40 autochanger is not detected after a server reboot. In order to detect it we need to rescan the SCSI bus.

Let’s say the tape drives are on controller 0, channel 0, with ID 0 LUN 0 and ID 1 LUN 0:

root@abc-jamno:~ # lsscsi
[0:0:0:0]    tape    HP       Ultrium 5-SCSI   Z64Z  /dev/st0 
[0:0:1:0]    tape    HP       Ultrium 5-SCSI   Z64Z  /dev/st1 
[1:0:32:0]   enclosu DP       BP12G+           1.00  -       
[1:2:0:0]    disk    DELL     PERC H310        2.12  /dev/sda

in which case we can find the changer (aka Control Path) on LUN 1 of one of the drives, but it is not being detected by the OS for some reason! This is the bit that puzzles me. I suspect that this is due to my Host Bus Adapters getting different IDs after reboot, i.e. sometimes the PERC gets detected as host 0 and sometimes the SAS2008 gets it; the quoted example shows the latter case. Writing "channel id lun" to the host's scan file asks the kernel to probe that address, so we try LUN 1 on both drives:

root@abc-jamno:~ # echo 0 0 1 >  /sys/class/scsi_host/host0/scan
root@abc-jamno:~ # echo 0 1 1 >  /sys/class/scsi_host/host0/scan

root@abc-jamno:~ # lsscsi
[0:0:0:0]    tape    HP       Ultrium 5-SCSI   Z64Z  /dev/st0 
[0:0:1:0]    tape    HP       Ultrium 5-SCSI   Z64Z  /dev/st1 
[0:0:1:1]    mediumx QUANTUM  Scalar i40-i80   153G  /dev/sch0
[1:0:32:0]   enclosu DP       BP12G+           1.00  -       
[1:2:0:0]    disk    DELL     PERC H310        2.12  /dev/sda 

Solution (aka dirty hack): upon reboot we grep the logs for the SCSI IDs of the tape drives and use them to rescan the bus at the same address, but with LUN 1 instead of LUN 0. This could be turned into an init script that runs just before bacula-sd starts, I guess…
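
One way such a pre-start script could decide whether a rescan is needed at all is to look for a `mediumx` entry in the `lsscsi` output. The sketch below covers only that detection step; `changer_present` is a hypothetical name, and it reads `lsscsi`-style text from stdin so it can be tried on saved output first:

```shell
# Hypothetical helper: exit 0 if any line on stdin describes a medium
# changer (lsscsi prints the device type in the second column).
changer_present() {
    awk '$2 == "mediumx" { found = 1 } END { exit !found }'
}

# Tried here on a line copied from the lsscsi listing in this post
sample='[0:0:1:1]    mediumx QUANTUM  Scalar i40-i80   153G  /dev/sch0'
if printf '%s\n' "$sample" | changer_present; then
    echo "changer visible"
else
    echo "changer missing, rescan needed"
fi
```

On the real server the input would come from `lsscsi | changer_present` instead of the sample line.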

This one-liner will generate the commands we need:

grep tape /var/log/messages*|cut -d" " -f7|awk -F: '{print "echo "$2" "$3" "1 " > /sys/class/scsi_host/host"$1"/scan"}'

Double-check those lines and run them.

Restart the Bacula storage daemon:
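
Since the position of the address in a log line depends on the syslog format, the same commands could also be derived from `lsscsi` output, which prints each address as `[host:channel:id:lun]`. This is a sketch, not a tested replacement for the one-liner; `gen_rescan_cmds` is a hypothetical name, and it only prints the commands so they can still be reviewed before running:

```shell
# Hypothetical helper: read lsscsi-style lines on stdin and, for every
# tape drive, print a command that scans LUN 1 at the same host/channel/id.
gen_rescan_cmds() {
    awk '$2 == "tape" {
        addr = $1
        gsub(/\[|\]/, "", addr)   # strip the surrounding brackets
        split(addr, a, ":")       # a[1]=host a[2]=channel a[3]=id a[4]=lun
        printf "echo %s %s 1 > /sys/class/scsi_host/host%s/scan\n", a[2], a[3], a[1]
    }'
}

# Tried on the two tape lines from the lsscsi listing in this post
printf '%s\n' \
    '[0:0:0:0]    tape    HP       Ultrium 5-SCSI   Z64Z  /dev/st0' \
    '[0:0:1:0]    tape    HP       Ultrium 5-SCSI   Z64Z  /dev/st1' |
    gen_rescan_cmds
```

For the sample input this prints the same two `echo` commands that were run by hand earlier; on the server the input would be `lsscsi | gen_rescan_cmds`.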

service bacula-sd restart

Useful commands:

cat /proc/scsi/sg/device_hdr /proc/scsi/sg/devices
host	chan	id	lun	type	opens	qdepth	busy	online
0	0	0	0	1	1	254	0	1
0	0	1	0	1	1	254	0	1
1	0	32	0	13	1	256	0	1
1	2	0	0	0	1	256	0	1
0	0	1	1	8	1	254	1	1
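
The `type` column in that listing is the numeric SCSI peripheral device type: 1 is a tape drive and 8 is a medium changer, so the last row is the autochanger on LUN 1. A small sketch for decoding the codes; `scsi_type_name` is a hypothetical helper and it only covers the four values that appear in this output:

```shell
# Hypothetical helper: map a SCSI peripheral device type number
# (as shown in /proc/scsi/sg/devices) to a short name.
scsi_type_name() {
    case "$1" in
        0)  echo "disk" ;;
        1)  echo "tape" ;;
        8)  echo "medium changer" ;;
        13) echo "enclosure" ;;
        *)  echo "other ($1)" ;;
    esac
}

scsi_type_name 8   # prints: medium changer
```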

root@abc-jamno:~ # sg_scan
/dev/sg0: scsi0 channel=0 id=32 lun=0
/dev/sg1: scsi0 channel=2 id=0 lun=0
/dev/sg2: scsi1 channel=0 id=5 lun=0
/dev/sg3: scsi1 channel=0 id=7 lun=0
/dev/sg4: scsi1 channel=0 id=7 lun=1

tapeinfo -f /dev/sg2

Source:
How do I rescan the SCSI bus to add or remove a SCSI device without rebooting the computer

https://access.redhat.com/site/solutions/3941

I know more about SCSI now than I ever wished to know.