
The Non-Beginner's Guide To Syncing Data With Rsync

The article discusses using rsync to backup and sync files both locally and remotely. It covers basic rsync commands and many useful switches to optimize file transfers, handle errors, and view progress.

--progress, --partial, -z or --compress, -h or --human-readable, -n or --dry-run

Cron on Linux or Task Scheduler on Windows can be used to automate rsync backups on a schedule. Existing rsync processes may need to be killed first using pkill.

2/2/2016

The Non-Beginners Guide to Syncing Data with Rsync

rsync Commands: Simple to Advanced
Now that the Windows users are on the same page, let's take a look at a simple rsync command, and show how the use of some advanced switches can quickly make it complex.

Let's say you have a bunch of files that need backed up (who doesn't these days?). You plug in your portable hard drive so you can back up your computer's files, and issue the following command:
rsync -a /home/geek/files/ /mnt/usb/files/

Or, the way it would look on a Windows computer with Cygwin:
rsync -a /cygdrive/c/files/ /cygdrive/e/files/

Pretty simple, and at that point there's really no need to use rsync, since you could just drag and drop the files. However, if your other hard drive already has some of the files and just needs the updated versions plus the files that have been created since the last sync, this command is handy because it only sends the new data over to the hard drive. With big files, and especially transferring files over the internet, that is a big deal.
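To make the "only sends the new data" behavior concrete, here is a minimal local sketch. It assumes rsync is installed and uses throwaway directories from mktemp instead of the article's paths; the file names are made up for illustration. The -i (--itemize-changes) switch is not covered in this guide and is used here only so the second run lists what it actually updated.

```shell
#!/bin/bash
# Sketch: run the same sync twice and see that the second run only sends what changed.
# The temp directories and file names are illustrative, not from the article.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dst"
echo "first file"  > "$work/src/a.txt"
echo "second file" > "$work/src/b.txt"

have_rsync=no
second_run=""
if command -v rsync >/dev/null 2>&1; then
  have_rsync=yes
  # First sync copies everything.
  rsync -a "$work/src/" "$work/dst/"
  # Change one file (its size changes, so rsync's quick check notices it).
  echo "an extra line" >> "$work/src/b.txt"
  # Second sync: -i itemizes each item that rsync updates.
  second_run=$(rsync -ai "$work/src/" "$work/dst/")
  echo "$second_run"
fi
```

On the second run only b.txt should be itemized, since a.txt has the same size and modification time on both sides.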
Backing up your files to an external hard drive and then keeping the hard drive in the same location as your computer is a very bad idea, so let's take a look at what it would require to start sending your files over the internet to another computer (one you've rented, a family member's, etc).
rsync -av --delete -e 'ssh -p 12345' /home/geek/files/ \
  geek2@10.1.1.1:/home/geek2/files/

The above command would send your files to another computer with an IP address of 10.1.1.1. It would delete extraneous files from the destination that no longer exist in the source directory, output the filenames being transferred so you have an idea of what's going on, and tunnel rsync through SSH on port 12345.
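Because -e takes the whole remote shell command as one argument, the quoting around the ssh string matters. One way to keep that visible is to build the command as a bash array and print it before running it. This is only a sketch: the array name is made up, the host, port and paths are the article's example values, and the line that would actually start the transfer is left commented out.

```shell
#!/bin/bash
# Sketch: assemble the remote rsync command as an array so each argument stays intact.
# Host, user, port and paths are the article's example values; "cmd" is an arbitrary name.
cmd=(
  rsync -av --delete
  -e 'ssh -p 12345'            # one argument: the remote shell and its options
  /home/geek/files/
  geek2@10.1.1.1:/home/geek2/files/
)

# Print the command one argument per line for inspection.
printf '%s\n' "${cmd[@]}"

# To actually run it, uncomment the next line:
# "${cmd[@]}"
```

Printing one argument per line makes it easy to confirm that `ssh -p 12345` arrives as a single item and not as three separate words.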
The -a -v -e --delete switches are some of the most basic and commonly used; you should already know a good deal about them if you're reading this tutorial. Let's go over some other switches that are sometimes ignored but incredibly useful:

--progress: This switch allows us to see the transfer progress of each file. It's particularly useful when transferring large files over the internet, but can output a senseless amount of information when just transferring small files across a fast network.

[Figure: an rsync command with the --progress switch as a backup is in progress]

http://www.howtogeek.com/175008/the-non-beginners-guide-to-syncing-data-with-rsync/


--partial: This is another switch that is particularly useful when transferring large files over the internet. If rsync gets interrupted for any reason in the middle of a file transfer, the partially transferred file is kept in the destination directory and the transfer is resumed where it left off once the rsync command is executed again. When transferring large files over the internet (say, a couple of gigabytes), there's nothing worse than having a few-second internet outage, blue screen, or human error trip up your file transfer and having to start all over again.

-P: This switch combines --progress and --partial, so use it instead and it will make your rsync command a little neater.
-z or --compress: This switch will make rsync compress file data as it's being transferred, reducing the amount of data that has to be sent to the destination. It's actually a fairly common switch but is far from essential, only really benefiting you on transfers between slow connections, and it does nothing for the following types of files: 7z, avi, bz2, deb, gz, iso, jpeg, jpg, mov, mp3, mp4, ogg, rpm, tbz, tgz, z, zip.

-h or --human-readable: If you're using the --progress switch, you'll definitely want to use this one as well. That is, unless you like to convert bytes to megabytes on the fly. The -h switch converts all outputted numbers to human-readable format, so you can actually make sense of the amount of data being transferred.
-n or --dry-run: This switch is essential to know when you're first writing your rsync script and testing it out. It performs a trial run but doesn't actually make any changes; the would-be changes are still outputted as normal, so you can read over everything and make sure it looks okay before rolling your script into production.
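-R or --relative: This switch tells rsync to send the full path names given on the command line instead of just the last part of each name, creating any missing directories along that path at the destination. We will use this option later in this guide so that we can make directories on the target machine with timestamps in the folder names.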
--exclude-from: This switch is used to link to an exclude list that contains directory paths that you don't want backed up. It just needs a plain text file with a directory or file path on each line.

--include-from: Similar to --exclude-from, but it links to a file that contains directories and file paths of data you want backed up.
--stats: Not really an important switch by any means, but if you are a sysadmin, it can be handy to know the detailed stats of each backup, just so you can monitor the amount of traffic being sent over your network and such.

--log-file: This lets you send the rsync output to a log file. We definitely recommend this for automated backups in which you aren't there to read through the output yourself. Always give log files a once-over in your spare time to make sure everything is working properly. Also, it's a crucial switch for a sysadmin to use, so you're not left wondering how your backups failed while you left the intern in charge.
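As a small illustration of how --exclude-from and --dry-run fit together, the sketch below writes an exclude list and previews a sync without copying anything. It assumes rsync is installed; the temp directories, the file names, and the two patterns in the exclude list are invented for the example.

```shell
#!/bin/bash
# Sketch: preview a sync with an exclude list. Nothing is copied because of -n.
# All paths and file names below are illustrative.
work=$(mktemp -d)
mkdir -p "$work/src/cache" "$work/dst"
echo "keep me"  > "$work/src/notes.txt"
echo "scratch"  > "$work/src/cache/tmp.bin"
echo "old copy" > "$work/src/notes.bak"

# The exclude list is a plain text file with one entry per line.
cat > "$work/exclude.txt" <<'EOF'
cache/
*.bak
EOF

have_rsync=no
preview=""
if command -v rsync >/dev/null 2>&1; then
  have_rsync=yes
  # -n: trial run only; -v lists the files that would be sent.
  preview=$(rsync -avhn --exclude-from="$work/exclude.txt" "$work/src/" "$work/dst/")
  echo "$preview"
fi
```

The preview should mention notes.txt but neither the cache directory nor notes.bak, and the destination directory stays empty until the command is rerun without -n.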
Let's take a look at our rsync command now that we have a few more switches added:
rsync -avzhP --delete --stats --log-file=/home/geek/rsynclogs/backup.log \
  --exclude-from '/home/geek/exclude.txt' -e 'ssh -p 12345' /home/geek/files/ \
  geek2@10.1.1.1:/home/geek2/files/

The command is still pretty simple, but we still haven't created a decent backup solution. Even though our files are now in two different physical locations, this backup does nothing to protect us from one of the main causes of data loss: human error.

Snapshot Backups
If you accidentally delete a file, a virus corrupts any of your files, or something else happens whereby your files are undesirably altered, and then you run your rsync backup script, your backed up data is overwritten with the undesirable changes. When such a thing occurs (not if, but when), your backup solution did nothing to protect you from your data loss.

The creator of rsync realized this, and added the --backup and --backup-dir arguments so users could run differential backups. The very first example on rsync's website shows a script where a full backup is run every seven days, and then the changes to those files are backed up in separate directories daily. The problem with this method is that to recover your files, you have to effectively recover them seven different times. Moreover, most geeks run their backups several times a day, so you could easily have 20+ different backup directories at any given time. Not only is recovering your files now a pain, but even just looking through your backed up data can be extremely time consuming; you'd have to know the last time a file was changed in order to find its most recent backed up copy. On top of all that, it's inefficient to run only weekly (or even less often in some cases) incremental backups.
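For reference, the --backup and --backup-dir behavior described above can be tried locally. This is a sketch under assumptions: rsync is installed, the directories are temporary, and the backup directory name (changed-YYYY-MM-DD) is made up here and is not taken from the script on the rsync website.

```shell
#!/bin/bash
# Sketch: --backup moves replaced files into --backup-dir instead of discarding them.
# Directory and file names are illustrative.
work=$(mktemp -d)
mkdir -p "$work/src"
echo "version 1" > "$work/src/report.txt"

have_rsync=no
changed_dir="$work/changed-$(date +%F)"
if command -v rsync >/dev/null 2>&1; then
  have_rsync=yes
  # Initial full copy.
  rsync -a "$work/src/" "$work/full/"
  # Edit the file, then sync again; the replaced copy is moved into changed_dir.
  echo "version 2 with more text" > "$work/src/report.txt"
  rsync -a --delete --backup --backup-dir="$changed_dir" "$work/src/" "$work/full/"
  # The old contents now live in the dated directory.
  cat "$changed_dir/report.txt"
fi
```

Each run of this kind leaves behind one more directory holding only the files that changed, which is why recovery means walking through several directories to find the newest copy of a file.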
Snapshot backups to the rescue! Snapshot backups are nothing more than incremental backups, but they utilize hard links to retain the file structure of the original source. That may be hard to wrap your head around at first, so let's take a look at an example.

Pretend we have a backup script running that automatically backs up our data every two hours.

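Whenever rsync does this, it names each backup in the format of: Backup-month-day-year-time. So, at the end of a typical day, we'd have a list of folders in our destination directory like this:

[Figure: listing of the dated backup folders in the destination directory]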

When traversing any of those directories, you'd see every file from the source directory exactly as it was at that time. Yet, there would be no duplicates across any two directories. rsync accomplishes this with the use of hard linking through the --link-dest=DIR argument.
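A local sketch can show what --link-dest does to unchanged and changed files. It assumes rsync is installed and a filesystem that supports hard links; the snap1 and snap2 directory names and the sample files are invented for the demonstration.

```shell
#!/bin/bash
# Sketch: two local snapshots where unchanged files are hard links, not copies.
# All names below are illustrative.
work=$(mktemp -d)
mkdir -p "$work/src" "$work/snaps"
echo "unchanged" > "$work/src/a.txt"
echo "version 1" > "$work/src/b.txt"

linked=skipped
if command -v rsync >/dev/null 2>&1; then
  # First snapshot: a plain copy.
  rsync -a "$work/src/" "$work/snaps/snap1/"
  # Change one file before the second snapshot.
  echo "version 2, now longer" > "$work/src/b.txt"
  # Second snapshot: files identical to those in snap1 are hard-linked to them.
  rsync -a --link-dest="$work/snaps/snap1" "$work/src/" "$work/snaps/snap2/"
  # -ef is true when both names refer to the same file on disk.
  if [ "$work/snaps/snap1/a.txt" -ef "$work/snaps/snap2/a.txt" ]; then
    linked=yes
  else
    linked=no
  fi
  echo "a.txt shared between snapshots: $linked"
fi
```

Both snapshot directories look complete when browsed, but a.txt occupies disk space only once, while b.txt exists as two separate files because its contents differ.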
Of course, in order to have these nicely and neatly dated directory names, we're going to have to beef up our rsync script a bit. Let's take a look at what it would take to accomplish a backup solution like this, and then we'll explain the script in greater detail:
#!/bin/bash
# copy the previous run's timestamp from time.txt to time2.txt
yes | cp ~/backup/time.txt ~/backup/time2.txt
# overwrite the old time.txt file with the new time
echo `date +%F-%I%p` > ~/backup/time.txt
# make the log file
echo > ~/backup/rsync-`date +%F-%I%p`.log
# rsync command ($HOME is used where ~ would not be expanded: after "=" and inside quotes)
rsync -avzhPR --chmod=Du=rwx,Dgo=rx,Fu=rw,Fgo=r --delete --stats \
  --log-file="$HOME/backup/rsync-`date +%F-%I%p`.log" \
  --exclude-from="$HOME/exclude.txt" \
  --link-dest=/home/geek2/files/`cat ~/backup/time2.txt` \
  -e 'ssh -p 12345' /home/geek/files/ \
  geek2@10.1.1.1:/home/geek2/files/`date +%F-%I%p`/
# don't forget to scp the log file and put it with the backup
scp -P 12345 ~/backup/rsync-`cat ~/backup/time.txt`.log \
  geek2@10.1.1.1:/home/geek2/files/`cat ~/backup/time.txt`/rsync-`cat ~/backup/time.txt`.log

That would be a typical snapshot rsync script. In case we lost you somewhere, let's dissect it piece by piece:

The first line of our script copies the contents of time.txt to time2.txt. The yes pipe is to confirm that we want to overwrite the file. Next, we take the current time and put it into time.txt. These files will come in handy later.

The next line makes the rsync log file, naming it rsync-date.log (where date is the actual date and time).
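The time.txt and time2.txt bookkeeping is easier to follow when simulated across two runs. The sketch below is illustrative only: a temp directory stands in for ~/backup, the function name record_run is made up, the two timestamps are hard-coded sample values, and cp -f is used here in place of the yes pipe.

```shell
#!/bin/bash
# Sketch: how time.txt and time2.txt track the current and previous backup names.
# A temp directory stands in for ~/backup; the stamps are hard-coded sample values.
backup_dir=$(mktemp -d)

record_run() {
  # $1 is the timestamp for this run (the real script uses `date +%F-%I%p`).
  if [ -f "$backup_dir/time.txt" ]; then
    cp -f "$backup_dir/time.txt" "$backup_dir/time2.txt"   # previous run's stamp
  fi
  echo "$1" > "$backup_dir/time.txt"                        # this run's stamp
}

record_run "2016-02-02-02PM"   # simulated 2:00PM run
record_run "2016-02-02-04PM"   # simulated 4:00PM run

echo "current:  $(cat "$backup_dir/time.txt")"
echo "previous: $(cat "$backup_dir/time2.txt")"
```

After the second call, time.txt holds the 4:00PM stamp and time2.txt holds the 2:00PM stamp, which is the directory name the next backup needs to link against.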
Now, the complex rsync command that we've been warning you about:

-avzhPR, -e, --delete, --stats, --log-file, --exclude-from, --link-dest: Just the switches we talked about earlier; scroll up if you need a refresher.

--chmod=Du=rwx,Dgo=rx,Fu=rw,Fgo=r: These are the permissions for the destination directory. Since we are making this directory in the middle of our rsync script, we need to specify the permissions so that our user can write files to it.
The use of date and cat commands
We're going to go over each use of the date and cat commands inside the rsync command, in the order that they occur. Note: we're aware that there are other ways to accomplish this functionality, especially with the use of declaring variables, but for the purpose of this guide, we've decided to use this method.
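For comparison, the variable-based approach mentioned in the note might look like the sketch below. It is not the method this guide uses: the variable names are made up, LC_ALL=C is an added assumption so that %p always yields AM or PM, and nothing is transferred; the values are only printed.

```shell
#!/bin/bash
# Sketch: compute the timestamp once and reuse it, instead of calling date repeatedly.
# LC_ALL=C is an assumption added here so %p always yields AM or PM.
stamp=$(LC_ALL=C date +%F-%I%p)

# The same stamp feeds both the log file name and the remote directory name,
# so the two cannot drift apart if the clock rolls over mid-run.
log_file="$HOME/backup/rsync-$stamp.log"
destination="geek2@10.1.1.1:/home/geek2/files/$stamp/"

echo "log file:    $log_file"
echo "destination: $destination"
```

With a single variable, every later reference sees the same value, which removes the need to read the stamp back from a text file within one run. Remembering the previous run's stamp between runs still requires storing it somewhere, such as time.txt.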
The log file is specified as:
~/backup/rsync-`date +%F-%I%p`.log

Alternatively, we could have specified it as:
~/backup/rsync-`cat ~/backup/time.txt`.log

Either way, the --log-file switch should be able to find the previously created dated log file and write to it.

The link destination directory is specified as:
--link-dest=/home/geek2/files/`cat ~/backup/time2.txt`

This means that the --link-dest switch is given the directory of the previous backup. If we are running backups every two hours, and it's 4:00PM at the time we ran this script, then the --link-dest switch looks for the directory created at 2:00PM and only transfers the data that has changed since then (if any).

To reiterate, that is why time.txt is copied to time2.txt at the beginning of the script, so the --link-dest switch can reference that time later.

The destination directory is specified as:
geek2@10.1.1.1:/home/geek2/files/`date +%F-%I%p`

This command simply puts the source files into a directory that has a title of the current date and time.

Finally, we make sure that a copy of the log file is placed inside the backup.
scp -P 12345 ~/backup/rsync-`cat ~/backup/time.txt`.log \
  geek2@10.1.1.1:/home/geek2/files/`cat ~/backup/time.txt`/rsync-`cat ~/backup/time.txt`.log

We use secure copy on port 12345 to take the rsync log and place it in the proper directory. To select the correct log file and make sure it ends up in the right spot, the time.txt file must be referenced via the cat command. If you're wondering why we decided to cat time.txt instead of just using the date command, it's because a lot of time could have transpired while the rsync command was running, so to make sure we have the right time, we just cat the text document we created earlier.

Automation
Use Cron on Linux or Task Scheduler on Windows to automate your rsync script. One thing you have to be careful of is making sure that you end any currently running rsync processes before continuing a new one. Task Scheduler seems to close any already running instances automatically, but for Linux you'll need to be a little more creative.

Most Linux distributions can use the pkill command, so just be sure to add the following to the beginning of your rsync script:
pkill -9 rsync
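Signal 9 cannot be caught, so an rsync killed this way gets no chance to clean up after itself. As an alternative sketch (not this guide's method), the script can simply refuse to start while a previous run still holds a lock. The function name run_exclusive, the lock directory path, and the script path in the cron comment are all hypothetical.

```shell
#!/bin/bash
# Sketch: skip a backup run if the previous one is still going, using mkdir as an
# atomic lock. Names and paths here are illustrative assumptions.
#
# Example crontab entry (every two hours, hypothetical script path):
#   0 */2 * * * /home/geek/backup.sh

run_exclusive() {
  local lock_dir=$1
  shift
  # mkdir fails if the directory already exists, so only one caller gets the lock.
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "previous run still in progress, skipping" >&2
    return 75
  fi
  "$@"
  local status=$?
  rmdir "$lock_dir"
  return "$status"
}

# Demo with a harmless command standing in for the rsync script.
lock="${TMPDIR:-/tmp}/rsync-backup-demo.$$.lock"
run_exclusive "$lock" echo "backup would run here"
```

One trade-off of this design: if the script is killed before it reaches rmdir, the stale lock directory has to be removed by hand before the next run will start.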

Encryption
Nope, we're not done yet. We finally have a fantastic (and free!) backup solution in place, but all of our files are still susceptible to theft. Hopefully, you're backing up your files to some place hundreds of miles away. No matter how secure that faraway place is, theft and hacking can always be problems.

In our examples, we have tunneled all of our rsync traffic through SSH, so that means all of our files are encrypted while in transit to their destination. However, we need to make sure the destination is just as secure. Keep in mind that rsync only encrypts your data as it is being transferred, but the files are wide open once they reach their destination.

One of rsync's best features is that it only transfers the changes in each file. If you have all of your files encrypted and make one minor change, the entire file will have to be retransmitted as a result of the encryption completely randomizing all of the data after any change.
For this reason, it's best/easiest to use some type of disk encryption, such as BitLocker for Windows or dm-crypt for Linux. That way, your data is protected in the event of theft, but files can be transferred with rsync and your encryption won't hinder its performance. There are other options available that work similarly to rsync or even implement some form of it, such as Duplicity, but they lack some of the features that rsync has to offer.

After you've set up your snapshot backups at an offsite location and encrypted your source and destination hard drives, give yourself a pat on the back for mastering rsync and implementing the most foolproof data backup solution possible.
