Machine Learning Notes

The document provides a comprehensive guide on selecting the right machine learning algorithms based on the problem type, data preparation, and performance metrics. It covers various algorithms such as regression, classification, and clustering, along with techniques for feature selection and model evaluation. Additionally, it discusses applications of machine learning in areas like image recognition, speech recognition, and self-driving cars.

Uploaded by

Nilay
Copyright
© All Rights Reserved
Q. a) Attempt any four from the following.

Q. How to choose the right ML algorithm?

Ans:
1. Understand the problem
- First understand your problem and what you are solving. Is it (1) classification (predicting categories), (2) regression (predicting values), (3) clustering (grouping data), or something else?
- This will help you narrow down the possible algorithms.

2. Prepare and process the data
- Ensure that your data is clean and in the right format.
- This includes handling missing values, normalizing the data, and splitting the dataset into training and testing sets.
- Identify which features are important for the algorithm.

3. Explore and analyze the data
- Perform EDA (exploratory data analysis) using visualization and statistical summaries to understand patterns, correlations, and distributions in the data.
- This can reveal insights that help you choose the right algorithm (e.g. linear vs. non-linear).

4. Choose performance metrics
- Decide how you will evaluate the model. The metric depends on the type of problem (for example, regression versus classification).

5. Try multiple algorithms
- Test different models, including:
  * Decision trees: for classification and regression.
  * Random forests: an ensemble of decision trees to improve accuracy.
  * k-Nearest Neighbors (kNN): a simple algorithm that classifies based on proximity to other data points.
  * Support Vector Machine: for complex classification tasks.
  * Neural networks / deep learning: for large datasets or complex tasks.

6. Tune hyperparameters
- Use methods like grid search or random search to fine-tune the algorithm's parameters for better model performance without overfitting.

7. Cross-validation
- Apply techniques like k-fold cross-validation to test how well the model generalizes.
- It will help you reduce the risk of overfitting.

8. Compare models
- Compare the performance of all tested models and choose the better one.

9. Consider model complexity
- Balance complexity with performance: simpler models are easy to understand and interpret, while more complex models may provide better accuracy.
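The "try multiple algorithms, cross-validate, compare" steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the data is made up, and the two candidate models (a majority-class baseline and a 1-nearest-neighbour classifier) and all function names are hypothetical choices for the sketch, not something the notes prescribe.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and split them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def majority_class_model(train_x, train_y):
    """Baseline: always predict the most frequent training label."""
    most_common = max(set(train_y), key=train_y.count)
    return lambda x: most_common

def nearest_neighbour_model(train_x, train_y):
    """1-NN: predict the label of the closest training point."""
    def predict(x):
        closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
        return train_y[closest]
    return predict

def cross_val_accuracy(build_model, xs, ys, k=5):
    """Average accuracy over k folds; each fold serves once as the test set."""
    scores = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_x = [xs[i] for i in range(len(xs)) if i not in held_out]
        train_y = [ys[i] for i in range(len(xs)) if i not in held_out]
        model = build_model(train_x, train_y)
        correct = sum(model(xs[i]) == ys[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Toy data (made up): one feature, label 1 when the feature is above 10.
xs = [float(v) for v in range(20)]
ys = [1 if v > 10 else 0 for v in xs]

candidates = {
    "majority baseline": majority_class_model,
    "1-nearest neighbour": nearest_neighbour_model,
}
results = {name: cross_val_accuracy(build, xs, ys) for name, build in candidates.items()}
best = max(results, key=results.get)  # model with the highest mean accuracy
print(results, "->", best)
```

Every model is scored on the same folds, so the comparison is like for like. In practice a library routine would normally replace the hand-written fold logic.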
Q. Explain regression line, scatter plot, error in prediction, and best-fitting line.

Ans:
Regression line
- It is a statistical concept used to describe the relationship between two or more variables.
- A regression line is a straight line that shows how two things are related.
- It helps to predict one thing based on another.
- For example, say you want to know how much you might score on a test based on how many hours you studied. You have a bunch of data points (hours studied and test score), and when you plot them on a graph (a scatter plot), the regression line shows the general trend of the relationship. If the line goes up, it means that the more you study, the higher your score tends to be.

Error in prediction
- It refers to the difference between the actual value and the predicted value in a model.
- In the context of machine learning and statistics, it is important to measure and minimize this error to improve the accuracy of prediction.
- Types of prediction error:
  1. Residual / error term: in a regression model, the residual is the difference between the actual value and the predicted value.
     Formula: $e = y - \hat{y}$, where $y$ is the actual value and $\hat{y}$ is the predicted value.
     The smaller the error, the better the model fits the data.
  2. Mean absolute error (MAE): the average of the absolute differences between actual and predicted values.
     Formula: $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$
     It gives an overall measure of prediction accuracy.
  3. Mean squared error (MSE): the average of the squared differences between actual and predicted values.
     Formula: $\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
     It is useful when big errors are particularly bad.
  4. Root mean squared error (RMSE): the square root of the MSE.
     Formula: $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$
- Understanding the error in predictions is key to building a better model.

Best-fitting line
- It is also known as a trend line or linear regression line. It is a line used to approximate the relationship between two variables in a set of data.
- It is used in regression analysis to show the relationship between the variables.
- Goal: to minimize the error between the actual and the predicted values.
- The closer the data points are to the line, the better the fit and the more accurately the line predicts.
- It helps determine the strength and direction of the relationship.
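As a small worked illustration of the best-fitting line and the error measures above, the sketch below fits a line by ordinary least squares and then computes the residuals, MAE, MSE, and RMSE. The hours/score numbers are invented for the example, and the function names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def error_metrics(actual, predicted):
    """Residuals (actual - predicted) and the MAE, MSE, RMSE built from them."""
    residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]
    n = len(residuals)
    mae = sum(abs(r) for r in residuals) / n
    mse = sum(r * r for r in residuals) / n
    return {"residuals": residuals, "mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

# Made-up data: hours studied and the test score obtained.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]

slope, intercept = fit_line(hours, scores)
predicted = [slope * h + intercept for h in hours]
metrics = error_metrics(scores, predicted)
print(slope, intercept, metrics["mae"], metrics["rmse"])
```

A positive slope here matches the "line goes up" reading in the notes: more hours go with a higher predicted score.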
Q. Explain the concept of feature selection and extraction.

- Feature selection and feature extraction are both techniques that reduce the number of features in a dataset to improve model performance.

1. Feature selection
- It is the process of selecting a subset of the most relevant features from the original dataset to use in the model.
- It helps improve model accuracy, reduce overfitting, and speed up training.
- It keeps only the most important features, which have a strong relationship with the target variable.
- It discards irrelevant and weak features.
- The original features are not modified. Instead, some are selected and others are ignored.
- Types of feature selection:
  a) Filter method: selects features based on statistical properties, such as their correlation with the target variable. Features with a high ranking are selected.
  b) Wrapper method: selects features by training models on different subsets and comparing their performance. Ex: Recursive Feature Elimination.
  c) Embedded method: selects features while the model is in the training process.
- Example: when predicting house prices, features such as the number of bedrooms and the size of the house are kept, while irrelevant features are dropped.

2. Feature extraction
- It is the process of transforming the original features into a new set of features.
- The goal is to capture the essential information from the original features and represent it in a lower-dimensional space.
- Feature extraction techniques:
  a) Principal Component Analysis (PCA): reduces dimensionality by creating new features.
  b) Linear Discriminant Analysis (LDA): reduces dimensions while preserving the separation between the classes in the data.
  c) t-SNE: used to reduce high-dimensional data to 2D or 3D.
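Going back to the filter method of feature selection described above, one minimal way to sketch it is to rank features by the absolute Pearson correlation with the target and keep the top few. The house data, the feature names, and the choice of Pearson correlation as the ranking statistic are all assumptions made for this illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def select_top_features(features, target, k):
    """Rank features by |correlation with target| and keep the k best."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Made-up house data; door_colour_code is deliberately unrelated to price.
features = {
    "size_sqft": [800, 1000, 1200, 1500, 1800, 2000],
    "bedrooms": [1, 2, 2, 3, 3, 4],
    "door_colour_code": [3, 1, 2, 3, 1, 2],
}
price = [100, 125, 140, 180, 210, 240]  # in thousands, hypothetical

selected, scores = select_top_features(features, price, k=2)
print(selected)
```

Note that the selected features keep their original values. Nothing is transformed, which is the difference from feature extraction.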
Q. Explain the K-means algorithm.

- It is an unsupervised learning algorithm.
- It is used to group an unlabelled dataset into clusters.
- The goal: partition the data into K clusters, where each point belongs to the cluster with the nearest mean (centroid).

How K-means works:
- We are given a data set of items with certain features and values for these features. The task is to categorize these items into groups.
- We use the K-means algorithm, an unsupervised learning technique. The "K" in the algorithm's name represents the number of groups we want to classify our items into.
- First, we initialize K points, called means or cluster centroids.
- We assign each item to its closest mean and update that mean's coordinates, which are the averages of the items assigned to that cluster so far.
- We repeat the process for a given number of iterations, and at the end we have our clusters.

Q. Explain the concept of logistic regression.

- It is a statistical method used for binary classification problems.
- The goal is to predict one of two possible outcomes (e.g. yes/no, true/false).
- Despite its name, logistic regression is used for classification tasks.

How logistic regression works:
1. Linear combination of inputs: like linear regression, logistic regression starts by computing a linear combination of the input features:
   $z = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n$
   where $x_1, x_2, \dots, x_n$ are the input features, $b_0, b_1, \dots, b_n$ are the model parameters (coefficients), and $z$ is the linear combination of the input features.
2. Apply the sigmoid function: to convert the linear combination $z$ into a probability, logistic regression applies the sigmoid function:
   $P(y = 1 \mid x) = \frac{1}{1 + e^{-z}}$
   - $P(y = 1 \mid x)$ gives the probability of "yes" or the positive class.
   - The result is always between 0 and 1.
3. Make a prediction: if $P$ is greater than 0.5, the model predicts 1 (positive). If $P$ is less than 0.5, the model predicts 0 (negative).

Sigmoid function: the mathematical function used to map values to probabilities, $\sigma(z) = \frac{1}{1 + e^{-z}}$. Values close to 0 mean low probability, and values close to 1 mean high probability.
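The three logistic-regression steps above (linear combination, sigmoid, threshold at 0.5) can be sketched as follows. The coefficient values and the sample are hypothetical; in a real model the coefficients would be learned from training data, which this sketch does not attempt.

```python
import math

def sigmoid(z):
    """Map any real number z to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(features, weights, bias):
    """Step 1: z = b0 + b1*x1 + ... + bn*xn. Step 2: P(y=1|x) = sigmoid(z)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Step 3: predict 1 when the probability exceeds the threshold, else 0."""
    return 1 if predict_probability(features, weights, bias) > threshold else 0

# Hypothetical coefficients b1, b2 and intercept b0 (not learned here).
weights = [0.8, -0.5]
bias = -1.0

sample = [3.0, 1.0]  # z = -1.0 + 0.8*3.0 - 0.5*1.0 = 0.9
probability = predict_probability(sample, weights, bias)
label = predict_label(sample, weights, bias)
print(round(probability, 3), label)
```

The notes do not say what happens at exactly 0.5; this sketch treats that case as class 0, which is an arbitrary choice.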
Q. Explain the multivariate linear regression model.

- It is an extension of linear regression that uses multiple independent variables to predict a single dependent variable.
- This approach helps capture more complex relationships by considering multiple factors that influence the dependent variable.
- In simple linear regression (a single dependent variable and a single independent variable), the equation is:
  $y = \beta_0 + \beta_1 x + \epsilon$
  where $y$ = dependent variable, $\beta_0$ = intercept, $\beta_1$ = coefficient for $x$, $\epsilon$ = error term.
- In multivariate linear regression, we expand this to include multiple independent variables:
  $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n + \epsilon$
  where $x_1, x_2, \dots, x_n$ are the independent variables and $\beta_1, \beta_2, \dots, \beta_n$ are the coefficients for each of them.

Steps:
- Gather data: collect information on the factors you think are important for predicting your outcome (like house price: area, bedrooms, etc.).
- Train the model: feed this data to the model, and it will figure out the best weight for each factor to get close to the actual prices.
- Make predictions: once trained, you can input new data (area, bedrooms, etc.) and the model will calculate the predicted price.

Ex: predicting house price based on square footage, number of bedrooms, and age of the house. Used together, these factors provide a more accurate prediction than relying on just one factor.
  Price = (base price) + (rate per sq. ft) x (sq. ft) + 20000 x (bedrooms) - 500 x (age)
  Here each additional bedroom adds $20,000 to the price, and each year of age decreases the price by $500.

Q. Random forest algorithm.

- It is a powerful tree-learning technique in machine learning.
- It works by creating a number of decision trees during the training phase.
- It is an ensemble learning method used for classification and regression tasks.

Algorithm:
1. In the random forest model, a subset of data points and a subset of features is selected.
2. Individual decision trees are constructed for each sample.
3. Each decision tree generates an output.
4. The final output is obtained by majority voting (classification) or averaging (regression).

Disadvantages:
- Complexity
- Less interpretability

Applications:
- Email spam detection
- Image and document classification
- Predicting stock market values
- Feature selection

Process:
1. Data sampling (bootstrapping): the algorithm selects different samples from the original dataset, allowing repeats.
   - This means each sample (called a bootstrap sample) is slightly different and might contain some of the same data points multiple times.
2. Tree building: for each of these samples, the algorithm builds a decision tree.
   - Each tree learns something different, which helps to prevent overfitting.
3. Voting and averaging:
   - For classification: each tree makes a prediction, and the vote from most trees decides the final class label. Example: if most trees say "spam", then the email is classified as spam.
   - For regression: each tree gives a numerical prediction, and the final result is the average of all the tree predictions.

In short, random forest builds many decision trees on random parts of the data and then combines their predictions for a better overall result.

Q. Concept of bagging and boosting.

Bagging (bootstrap aggregating)
- Bagging builds many versions of the same model (like a decision tree) using random parts of the original data, then combines their predictions to get a better overall answer.
- It is the simplest way of combining predictions that belong to the same type.
- Aim: decrease variance, not bias.
- Each model receives equal weight.
- Each model is built independently.
- Bagging tries to solve the overfitting problem.
- Base learner type: homogeneous.
- Base learner training: parallel.
- Example: random forest model.
- Working:
  1. Take random samples from the original dataset, allowing repeats.
  2. Train a model on each sample separately.
  3. Combine the predictions to get the final answer.

Boosting
- Boosting focuses on improving accuracy by training new models to correct the errors of previous ones.
- It is a way of combining predictions that belong to different types.
- In boosting, models are weighted according to their performance.
- Each new model learns from the previous model.
- Iteratively, boosting builds models that correct the errors of the previous models.
- Boosting tries to reduce bias.
- If the classifier is stable and has high bias, then apply boosting.
- Base learner type: homogeneous.
- Base learner training: sequential.
- Example: AdaBoost.
- Working:
  1. Start by training a model on the data.
  2. Look at where it made mistakes, and focus more on those tough examples in the next model.
  3. Repeat the process, each time paying more attention to the mistakes from the previous round.
  4. Then combine the predictions of all the models.

Q. Describe multiclass classification.

- It is a type of machine learning task in which data points are classified into one of three or more classes.
- Unlike binary classification (where there are only two classes, like "yes" or "no"), multiclass classification deals with multiple classes. Ex: classifying types of fruit (like apple, banana, orange) or types of animals (like cat, dog).
- For example, let's say you have a bunch of pictures and each picture shows either a cat, a dog, or a rabbit. Multiclass classification is when a computer program learns to look at a new picture and decide if it shows a cat, a dog, or a rabbit.

Working:
1. To teach the computer how to classify pictures, you give it many labelled pictures: some labelled cat, some labelled dog, and some labelled rabbit.
2. The computer learns patterns for each class. For example, cats have pointy ears, dogs have longer snouts, and rabbits have longer ears.
3. Once trained, the computer can look at a new picture and decide if it shows a cat, a dog, or a rabbit.

Applications:
- Classifying emails into categories (such as family or spam)
- Plant identification
- Handwritten digit recognition

Q. Explain the concept of the Expectation-Maximization algorithm.

- The EM algorithm is a method for estimating model parameters when part of the data is missing or unobserved.
- It is an iterative approach, which means it happens in repeated steps.
- Let's understand it with an example:
  * Imagine you have a bag of mixed chocolates and caramels, and you are trying to figure out how many chocolates and how many caramels are mixed in the bag.
  * You can't tell without looking inside.
  * However, you know each candy has a different weight depending on whether it is a chocolate or a caramel.
- The EM algorithm has two main steps to solve this kind of problem:
  1. Expectation step (E-step):
     - In this first step we guess whether each candy in the bag is a chocolate or a caramel based on its weight.
     - For example, if a candy is heavier we might guess one type, and if it is lighter we might guess the other.
  2. Maximization step (M-step):
     - Now we use the guesses to improve our understanding of the average weight of each type (chocolate or caramel).
     - With these better averages, we guess again whether each candy is a chocolate or a caramel based on its weight.
- We keep repeating these steps, guessing and improving, until we have a good estimate for each type of candy.

Applications:
- Clustering and mixture models
- Speech recognition
- Medical imaging

Advantages:
- Works well with incomplete data.
- Useful for complex datasets.
- Widely applicable where there are complex data patterns.

Disadvantages:
- Sensitive to initialization.
- Slow convergence.

Steps: Step 1: initialization. Step 2: E-step. Step 3: M-step. Step 4: convergence.

Linear regression
- Linear regression draws a straight line through the data points, called a best-fit line. It represents the trend of the data.
- Equation: $y = mx + c$, where $y$ = predicted outcome, $x$ = input value, $m$ = slope of the line, $c$ = intercept.
- Applications: sports analytics, health statistics.
- Advantages: simple and easy to understand, fast computation, interpretable, requires less data.
- Disadvantages: only works with linear relationships, sensitive to outliers.

Q. Linear Discriminant Analysis (LDA) to reduce dimensions.

- Also known as Normal Discriminant Analysis or Discriminant Function Analysis, it is a dimensionality reduction technique.
- It is a supervised learning algorithm, used mainly in classification problems.
- It aims to find a linear combination of features that optimally separates the classes within a dataset.
- For example, we have two classes and we need to separate them. Classes can have multiple features. Using only one feature to classify them can result in some overlap, so we keep increasing the number of features for proper classification.
Working with example: step-by-step, classifying fruits with LDA
1. Understand the data: suppose we have a dataset of fruits with the following features:
   - Colour (measured as a numerical value, like 1 for red, 2 for yellow)
   - Size (measured in centimetres)
   - Weight (measured in grams)
2. Compute the means: LDA calculates the average of each feature for each fruit type.
3. Scatter matrices: next, LDA computes:
   * Within-class scatter: measures how spread out the data points are within each fruit type.
   * Between-class scatter: measures how different the means of the fruit types are from the overall mean.
   Ex: if apples are closely packed together and far from the oranges and bananas, the within-class scatter is low but the between-class scatter is high.
4. Finding the best line: LDA identifies the optimal line that separates the fruit types. This can be visualized as finding a line that best separates the apples, oranges, and bananas on a graph.
5. Reduce dimensions: by projecting the data onto the best line, LDA reduces the complexity of the dataset. Instead of analysing all the original features, one or two dimensions capture the essential information.

Applications:
- Face recognition
- Medical diagnosis
- Speech recognition

Advantages:
- Good for classification.
- Easy to interpret.

Disadvantages:
- Sensitive to outliers.
- Assumes normally distributed data.

Q. Clustering and DBSCAN.

Clustering: it is a way to group similar items together. E.g. if you have a bunch of fruits, you might group the apples together, the oranges together, and so on.

DBSCAN: Density-Based Spatial Clustering of Applications with Noise
- This is a popular clustering algorithm used in machine learning and data mining.
- It is a method used to find clusters in a set of data points: it groups points that are close together into clusters, while points that lie in low-density regions are treated as noise or outliers.

Required parameters and terms:
1. Epsilon ($\epsilon$): the maximum distance between two points for them to be considered neighbours.
2. MinPts: the minimum number of points required to form a cluster.
3. Core point: a point is a core point if there are at least MinPts points within its $\epsilon$-neighbourhood.
4. Border point: a point that is not a core point but is in the $\epsilon$-neighbourhood of a core point.
5. Noise point: a point that is neither a core point nor a border point. These points are considered outliers.

[Figure: core, border, and noise points for MinPts = 4]

Algorithm:
1. Start with an arbitrary point in the dataset.
2. If the point is a core point, create a new cluster and include all points density-reachable from that point.
3. For each point in the cluster, repeat the process and expand the cluster accordingly.
4. Repeat for all points in the dataset that have not been assigned to a cluster.

Disadvantages:
- Sensitive to the choice of parameters.
- Memory intensive.

Applications:
- Image processing
- Anomaly detection

Q. Explain any five performance measures along with examples.

1. Accuracy: it is the proportion of correctly predicted results out of the total predictions made by the model.
   Formula: Accuracy = (number of correct predictions) / (total number of predictions)
   Ex: imagine a model that classifies an email as spam or not. If it correctly identifies 90 out of 100 emails, the accuracy is 90/100 = 0.90 or 90%.
2. Precision: it measures the proportion of positive predictions that are actually correct. It is calculated only for the positive class.
   Formula: Precision = TP / (TP + FP), where TP = true positives and FP = false positives.
   Ex: a medical test predicts that 10 people have a disease, but only 8 actually have it. Precision = 8/10 = 0.8 or 80%.
3. Recall: it is the proportion of actual positives that were correctly identified by the model.
   Formula: Recall = TP / (TP + FN), where FN = false negatives.
   Ex: out of 12 diseased people, if the test correctly identifies 8, recall = 8/12 = 0.67 or 67%.
4. F1 score: it is the harmonic mean of precision and recall. It combines the two and is useful when both precision and recall are important.
   Formula: F1 = 2 x (Precision x Recall) / (Precision + Recall)
   Ex: if precision is 0.8 and recall is 0.67, F1 = 2 x (0.8 x 0.67) / (0.8 + 0.67) = 0.73 or 73%.
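The worked examples above can be checked with a few one-line functions. The counts below are the ones used in the notes (90 of 100 emails correct; 10 predicted positives of which 8 are right; 12 actual positives of which 8 are found); the function names are just illustrative.

```python
def accuracy(correct, total):
    """Share of all predictions that were correct."""
    return correct / total

def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Share of actual positives that the model found."""
    return tp / (tp + fn)

def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)

acc = accuracy(correct=90, total=100)  # spam example
p = precision(tp=8, fp=2)              # 10 predicted positive, 8 correct
r = recall(tp=8, fn=4)                 # 12 actual positives, 8 found
f1 = f1_score(p, r)
print(acc, p, round(r, 2), round(f1, 2))
```

Using the unrounded recall (8/12) gives an F1 of about 0.727, which rounds to the same 0.73 as in the notes.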
5. Mean absolute error (MAE): the average of the absolute differences between predicted and actual values.
   Formula: $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$
   Ex: a model predicts house prices, and the predicted prices differ from the actual prices by $10k, $15k, $15k, and $10k. MAE = (10 + 15 + 15 + 10) / 4 = 12.5, i.e. $12.5k.

Q. Difference between logistic regression and support vector machine.

   Logistic regression (LR) | Support vector machine (SVM)
1. An algorithm used for classification problems. | A model used for both classification and regression.
2. It does not try to find the best margin. | It tries to find the best margin between the classes.
3. Works well with structured data and already identified independent variables. | Works effectively with unstructured and semi-structured data.
4. Based on a statistical approach. | Based on geometrical properties of the data.
5. More vulnerable to overfitting. | Lower risk of overfitting.
6. Applications: cancer detection; predicting whether a student passed (1) or failed (0); predicting whether a customer will purchase or not. | Applications: image classification, handwriting recognition, face detection.

Q. Explain the following: ROC AUC.

- The AUC-ROC curve is a graphical representation of the performance of a binary classification model at various thresholds.
- It is commonly used in ML to assess the ability of a model to distinguish between two classes, typically the positive class (e.g. presence of a disease) and the negative class (e.g. absence of a disease).

ROC: Receiver Operating Characteristic curve.
- The ROC curve is the graphical representation of the effectiveness of a binary classification model.
- It plots the true positive rate (TPR) against the false positive rate (FPR) at different classification thresholds.

AUC: Area Under the Curve.
- It represents the area under the ROC curve.
- It measures the overall performance of the binary classification model.
- As both TPR and FPR range between 0 and 1, the area will always lie between 0 and 1. A greater value of AUC denotes better model performance.
- It represents the probability with which our model can distinguish between the two classes present in our target.

Evaluation metrics used:
1. True positive rate: TPR = TP / (TP + FN)
2. False positive rate: FPR = FP / (FP + TN) = 1 - specificity
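To make the ROC and AUC definitions above concrete, the sketch below sweeps the classification threshold over a handful of scores, collects the (FPR, TPR) points, and integrates them with the trapezoidal rule. The labels and scores are made up, and the function names are illustrative only.

```python
def roc_points(labels, scores):
    """(FPR, TPR) pairs obtained by lowering the threshold over every score."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Made-up example: 1 = positive class, scores are predicted probabilities.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = roc_points(labels, scores)
area = auc(curve)
print(curve, area)
```

Here one positive (score 0.6) is ranked below one negative (score 0.7), so 8 of the 9 positive/negative pairs are ordered correctly and the area comes out as 8/9, in line with the "probability of distinguishing the two classes" reading of AUC.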
