The Non-Beginner's Guide To Syncing Data With Rsync PDF
rsync Commands: Simple to Advanced
Now that the Windows users are on the same page, let's take a look at a simple rsync command, and show how the use of some advanced switches can quickly make it complex.
Let's say you have a bunch of files that need to be backed up (who doesn't these days?). You plug in your portable hard drive so you can back up your computer's files, and issue the following command:
rsync -a /home/geek/files/ /mnt/usb/files/
Or, the way it would look on a Windows computer with Cygwin:
rsync -a /cygdrive/c/files/ /cygdrive/e/files/
Pretty simple, and at that point there's really no need to use rsync, since you could just drag and drop the files. However, if your other hard drive already has some of the files and just needs the updated versions plus the files that have been created since the last sync, this command is handy because it only sends the new data over to the hard drive. With big files, and especially when transferring files over the internet, that is a big deal.
Backing up your files to an external hard drive and then keeping the hard drive in the same location as your computer is a very bad idea, so let's take a look at what it would require to start sending your files over the internet to another computer (one you've rented, a family member's, etc.).
rsync -av --delete -e 'ssh -p 12345' /home/geek/files/ geek2@10.1.1.1:/home/geek2/files/
The above command would send your files to another computer with an IP address of 10.1.1.1. It would delete extraneous files from the destination that no longer exist in the source directory, output the filenames being transferred so you have an idea of what's going on, and tunnel rsync through SSH on port 12345.
The -a, -v, -e, and --delete switches are some of the most basic and commonly used; you should already know a good deal about them if you're reading this tutorial. Let's go over some other switches that are sometimes ignored but incredibly useful:
--progress: This switch allows us to see the transfer progress of each file. It's particularly useful when transferring large files over the internet, but can output a senseless amount of information when just transferring small files across a fast network.
[Screenshot: an rsync command with the --progress switch as a backup is in progress]
http://www.howtogeek.com/175008/the-non-beginners-guide-to-syncing-data-with-rsync/
2/2/2016
--partial: This is another switch that is particularly useful when transferring large files over the internet. If rsync gets interrupted for any reason in the middle of a file transfer, the partially transferred file is kept in the destination directory and the transfer is resumed where it left off once the rsync command is executed again. When transferring large files over the internet (say, a couple of gigabytes), there's nothing worse than having a few-second internet outage, blue screen, or human error trip up your file transfer and having to start all over again.
-P: This switch combines --progress and --partial, so use it instead and it will make your rsync command a little neater.
-z or --compress: This switch will make rsync compress file data as it's being transferred, reducing the amount of data that has to be sent to the destination. It's actually a fairly common switch but is far from essential, only really benefiting you on transfers over slow connections, and it does nothing for the following types of files: 7z, avi, bz2, deb, gz, iso, jpeg, jpg, mov, mp3, mp4, ogg, rpm, tbz, tgz, z, zip.
-h or --human-readable: If you're using the --progress switch, you'll definitely want to use this one as well. That is, unless you like to convert bytes to megabytes on the fly. The -h switch converts all outputted numbers to human-readable format, so you can actually make sense of the amount of data being transferred.
-n or --dry-run: This switch is essential to know when you're first writing your rsync script and testing it out. It performs a trial run but doesn't actually make any changes; the would-be changes are still output as normal, so you can read over everything and make sure it looks okay before rolling your script into production.
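-R or --relative: This switch sends the full source path to the destination rather than just the last part of each filename, so rsync recreates that directory structure there, creating directories that do not exist yet. We will use this option later in this guide so that we can make directories on the target machine with timestamps in the folder names.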
--exclude-from: This switch is used to link to an exclude list that contains directory paths that you don't want backed up. It just needs a plain text file with a directory or file path on each line.
--include-from: Similar to --exclude-from, but it links to a file that contains directories and file paths of data you want backed up.
--stats: Not really an important switch by any means, but if you are a sysadmin, it can be handy to know the detailed stats of each backup, just so you can monitor the amount of traffic being sent over your network and such.
--log-file: This lets you send the rsync output to a log file. We definitely recommend this for automated backups in which you aren't there to read through the output yourself. Always give log files a once-over in your spare time to make sure everything is working properly. Also, it's a crucial switch for a sysadmin to use, so you're not left wondering how your backups failed while you left the intern in charge.
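Giving log files a once-over can be partly scripted. The following is a minimal sketch, not part of the original guide: the check_logs helper is hypothetical, the two sample logs are made up for the demo, and it assumes that failed runs leave a line containing "rsync error" in the log.

```shell
#!/bin/bash
# Flag rsync log files that mention an error.
# Assumption: failures show up as lines containing "rsync error".
check_logs() {
    log_dir="$1"
    failed=0
    for log in "$log_dir"/*.log; do
        [ -e "$log" ] || continue      # the directory holds no logs at all
        if grep -q 'rsync error' "$log"; then
            echo "PROBLEM: $log"
            failed=1
        fi
    done
    return "$failed"
}

# Demo with two made-up log files in a temporary directory.
demo_dir="$(mktemp -d)"
echo 'sent 1.2K bytes  received 35 bytes' > "$demo_dir/good.log"
echo 'rsync error: some files/attrs were not transferred (code 23)' > "$demo_dir/bad.log"

status=0
report="$(check_logs "$demo_dir")" || status=$?
echo "$report"
```

Pointed at a real log directory such as /home/geek/rsynclogs, the function returns a non-zero status when any log needs a closer look, which makes it easy to hook into a mail or alert step.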
Let's take a look at our rsync command now that we have a few more switches added:
rsync -avzhP --delete --stats --log-file=/home/geek/rsynclogs/backup.log \
  --exclude-from '/home/geek/exclude.txt' -e 'ssh -p 12345' \
  /home/geek/files/ geek2@10.1.1.1:/home/geek2/files/
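To make the --exclude-from list concrete, here is a minimal sketch of what an exclude file might contain. The entries (Downloads/, .cache/, *.tmp) and the temporary file location are hypothetical, chosen only for illustration; the command above reads /home/geek/exclude.txt instead.

```shell
#!/bin/bash
# Hypothetical exclude list; the entries below are examples, not from the guide.
exclude_file="$(mktemp)"

cat > "$exclude_file" <<'EOF'
# rsync ignores blank lines and lines starting with # or ;
Downloads/
.cache/
*.tmp
EOF

# Count the active patterns (skipping comments and blank lines).
pattern_count=$(grep -cvE '^[[:space:]]*([#;]|$)' "$exclude_file")
echo "Exclude file $exclude_file holds $pattern_count patterns"
```

The file is then passed to rsync with --exclude-from "$exclude_file". Combining it with -n (dry run) is a low-risk way to confirm that the patterns skip what you expect.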
The command is still pretty simple, but we still haven't created a decent backup solution. Even though our files are now in two different physical locations, this backup does nothing to protect us from one of the main causes of data loss: human error.
Snapshot Backups
If you accidentally delete a file, a virus corrupts any of your files, or something else happens whereby your files are undesirably altered, and then you run your rsync backup script, your backed up data is overwritten with the undesirable changes. When such a thing occurs (not if, but when), your backup solution did nothing to protect you from your data loss.
The creator of rsync realized this, and added the --backup and --backup-dir arguments so users could run differential backups. The very first example on rsync's website shows a script where a full backup is run every seven days, and then the changes to those files are backed up in separate directories daily. The problem with this method is that to recover your files, you have to effectively recover them seven different times. Moreover, most geeks run their backups several times a day, so you could easily have 20+ different backup directories at any given time. Not only is recovering your files now a pain, but even just looking through your backed up data can be extremely time consuming: you'd have to know the last time a file was changed in order to find its most recent backed up copy. On top of all that, it's inefficient to run only weekly (or even less often in some cases) incremental backups.
Snapshot backups to the rescue! Snapshot backups are nothing more than incremental backups, but they utilize hard links to retain the file structure of the original source. That may be hard to wrap your head around at first, so let's take a look at an example.
Pretend we have a backup script running that automatically backs up our data every two hours.
Whenever rsync does this, it names each backup in the format of: Backup-month-day-year-time. So, at the end of a typical day, we'd have a list of folders in our destination directory, one for each run.
When traversing any of those directories, you'd see every file from the source directory exactly as it was at that time. Yet, there would be no duplicates across any two directories. rsync accomplishes this with the use of hard linking through the --link-dest=DIR argument.
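To see why hard-linked snapshots contain no duplicates, it can help to make a hard link by hand. The sketch below uses made-up file and directory names in a temporary location and plain ln rather than rsync; it only illustrates the idea that two directory entries can share one copy of the data.

```shell
#!/bin/bash
# Illustration only: hypothetical snapshot directories in a temp location.
work="$(mktemp -d)"
mkdir "$work/snapshot-1" "$work/snapshot-2"

echo "quarterly report" > "$work/snapshot-1/report.txt"

# An unchanged file is hard-linked into the next snapshot, not copied.
ln "$work/snapshot-1/report.txt" "$work/snapshot-2/report.txt"

# -ef is true when both names refer to the same file (same device and inode).
if [ "$work/snapshot-1/report.txt" -ef "$work/snapshot-2/report.txt" ]; then
    same_file=yes
else
    same_file=no
fi
echo "Same underlying file: $same_file"

# Deleting the older snapshot leaves the data reachable from the newer one.
rm -r "$work/snapshot-1"
remaining="$(cat "$work/snapshot-2/report.txt")"
echo "Still readable after deleting snapshot-1: $remaining"
```

With --link-dest, rsync does this linking for every file that has not changed since the previous snapshot, so each dated directory looks complete while only changed files take up new space.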
Of course, in order to have these nicely and neatly dated directory names, we're going to have to beef up our rsync script a bit. Let's take a look at what it would take to accomplish a backup solution like this, and then we'll explain the script in greater detail:
#!/bin/bash
# copy old time.txt to time2.txt
yes | cp ~/backup/time.txt ~/backup/time2.txt
# overwrite old time.txt file with new time
echo `date +%F-%I%p` > ~/backup/time.txt
# make the log file
echo > ~/backup/rsync-`date +%F-%I%p`.log
# rsync command ($HOME is used where the shell would not expand ~)
rsync -avzhPR --chmod=Du=rwx,Dgo=rx,Fu=rw,Fgo=r --delete --stats \
  --log-file="$HOME/backup/rsync-`date +%F-%I%p`.log" \
  --exclude-from "$HOME/exclude.txt" \
  --link-dest=/home/geek2/files/`cat ~/backup/time2.txt` \
  -e 'ssh -p 12345' /home/geek/files/ \
  geek2@10.1.1.1:/home/geek2/files/`date +%F-%I%p`/
# don't forget to scp the log file and put it with the backup
scp -P 12345 ~/backup/rsync-`cat ~/backup/time.txt`.log \
  geek2@10.1.1.1:/home/geek2/files/`cat ~/backup/time.txt`/rsync-`cat ~/backup/time.txt`.log
That would be a typical snapshot rsync script. In case we lost you somewhere, let's dissect it piece by piece:
The first line of our script copies the contents of time.txt to time2.txt. The yes pipe is to confirm that we want to overwrite the file. Next, we take the current time and put it into time.txt. These files will come in handy later.
The next line makes the rsync log file, naming it rsync-date.log (where date is the actual date and time).
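The dated names in the script come from date +%F-%I%p: %F is the full date (year-month-day), %I is the hour on a 12-hour clock, and %p is AM or PM. As a quick sketch for checking what a name will look like (the LC_ALL=C prefix is an addition here, forcing the C locale so that %p is rendered as AM or PM regardless of system language):

```shell
#!/bin/bash
# Print the timestamp and the log file name derived from it.
# LC_ALL=C is an assumption added for predictable AM/PM output.
stamp="$(LC_ALL=C date +%F-%I%p)"
log_name="rsync-$stamp.log"

echo "Timestamp: $stamp"
echo "Log file:  $log_name"
```

A run at four in the afternoon on 2 February 2016, for example, would give a name of the form rsync-2016-02-02-04PM.log.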
Now, the complex rsync command that we've been warning you about:
In the rsync command, the log file name is built with the date command. Alternatively, we could have specified it as:
~/backup/rsync-`cat ~/backup/time.txt`.log
Either way, the --log-file switch should be able to find the previously created dated log file and write to it.
The link destination directory is specified as:
--link-dest=/home/geek2/files/`cat ~/backup/time2.txt`
This means that the --link-dest switch is given the directory of the previous backup. If we are running backups every two hours, and it's 4:00PM at the time we ran this script, then the --link-dest switch looks for the directory created at 2:00PM and only transfers the data that has changed since then (if any).
To reiterate, that is why time.txt is copied to time2.txt at the beginning of the script, so the --link-dest switch can reference that time later.
The destination directory is specified as:
geek2@10.1.1.1:/home/geek2/files/`date +%F-%I%p`
This command simply puts the source files into a directory that has a title of the current date and time.
Finally, we make sure that a copy of the log file is placed inside the backup.
scp -P 12345 ~/backup/rsync-`cat ~/backup/time.txt`.log \
  geek2@10.1.1.1:/home/geek2/files/`cat ~/backup/time.txt`/rsync-`cat ~/backup/time.txt`.log
We use secure copy on port 12345 to take the rsync log and place it in the proper directory. To select the correct log file and make sure it ends up in the right spot, the time.txt file must be referenced via the cat command. If you're wondering why we decided to cat time.txt instead of just using the date command, it's because a lot of time could have transpired while the rsync command was running, so to make sure we have the right time, we just cat the text document we created earlier.
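One alternative to calling date and cat repeatedly is to capture both timestamps once and reuse them. The sketch below is a variant built on that assumption, not the guide's own script: snapshot_command is a hypothetical helper, the sample timestamps are made up, and it only prints the rsync command it would run so the result can be inspected first. The host, port, and paths mirror the ones used above.

```shell
#!/bin/bash
# Sketch: print the snapshot rsync command using variables for the timestamps.
# snapshot_command is a hypothetical helper; it prints, it does not transfer.
snapshot_command() {
    now="$1"        # timestamp of this run, e.g. from: date +%F-%I%p
    previous="$2"   # timestamp of the previous run, e.g. read from time.txt
    printf '%s ' rsync -avzhPR --chmod=Du=rwx,Dgo=rx,Fu=rw,Fgo=r \
        --delete --stats \
        "--log-file=$HOME/backup/rsync-$now.log" \
        "--exclude-from=$HOME/exclude.txt" \
        "--link-dest=/home/geek2/files/$previous" \
        -e "'ssh -p 12345'" /home/geek/files/
    printf '%s\n' "geek2@10.1.1.1:/home/geek2/files/$now/"
}

# Made-up timestamps matching the two-hour example in the text.
snapshot_command "2016-02-02-04PM" "2016-02-02-02PM"
```

Because the current timestamp is taken a single time, the log name, the destination directory, and the later scp step cannot disagree even if the transfer runs past the top of the hour.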
Automation
Use Cron on Linux or Task Scheduler on Windows to automate your rsync script. One thing you have to be careful of is making sure that you end any currently running rsync processes before continuing a new one. Task Scheduler seems to close any already running instances automatically, but for Linux you'll need to be a little more creative.
Most Linux distributions can use the pkill command, so just be sure to add the following to the beginning of your rsync script:
pkill -9 rsync
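Killing every rsync process is blunt, since it also stops any unrelated transfer the same user has running. A gentler alternative, which is not from the original guide and is swapped in here purely as a sketch, is to have the script skip its run while a previous one still holds a lock. The lock path and function names below are hypothetical; mkdir is used because creating a directory either succeeds or fails in one step.

```shell
#!/bin/bash
# Sketch of a simple lock; LOCK_DIR is a hypothetical path.
LOCK_DIR="${TMPDIR:-/tmp}/rsync-backup.lock"

acquire_lock() {
    # mkdir fails if the directory already exists, so only one caller wins.
    mkdir "$LOCK_DIR" 2>/dev/null
}

release_lock() {
    rmdir "$LOCK_DIR" 2>/dev/null
}

if acquire_lock; then
    echo "Lock acquired; the backup would run here."
    # ... rsync command goes here ...
    release_lock
else
    echo "Another backup is still running; skipping this run."
fi
```

One trade-off to keep in mind: if the script is killed before release_lock runs, the lock directory stays behind and has to be removed by hand before the next backup can start.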
Encryption
Nope, we're not done yet. We finally have a fantastic (and free!) backup solution in place, but all of our files are still susceptible to theft. Hopefully, you're backing up your files to some place hundreds of miles away. No matter how secure that faraway place is, theft and hacking can always be problems.
In our examples, we have tunneled all of our rsync traffic through SSH, so that means all of our files are encrypted while in transit to their destination. However, we need to make sure the destination is just as secure. Keep in mind that rsync over SSH only encrypts your data as it is being transferred, but the files are wide open once they reach their destination.
One of rsync's best features is that it only transfers the changes in each file. If you have all of your files encrypted and make one minor change, the entire file will have to be retransmitted as a result of the encryption completely randomizing all of the data after any change.
For this reason, it's best/easiest to use some type of disk encryption, such as BitLocker for Windows or dm-crypt for Linux. That way, your data is protected in the event of theft, but files can be transferred with rsync and your encryption won't hinder its performance. There are other options available that work similarly to rsync or even implement some form of it, such as Duplicity, but they lack some of the features that rsync has to offer.
After you've set up your snapshot backups at an offsite location and encrypted your source and destination hard drives, give yourself a pat on the back for mastering rsync and implementing the most foolproof data backup solution possible.