11-755 Machine Learning for Signal Processing
Latent Variable Models and Signal Separation
Class 13. 11 Oct 2012
11-755 MLSP: Bhiksha Raj

Sound separation and enhancement
A common problem: separate or enhance sounds. Speech from noise. Suppress "bleed" in music recordings. Separate music components.
A popular approach: can be done with pots, pans, marbles and expectation maximization. Probabilistic latent component analysis.
The tools are applicable to other forms of data as well.

Sounds: an example
A sequence of notes. Chords from the same notes. A piece of music from the same (and a few additional) notes.

Sounds: an example
A sequence of sounds. A proper speech utterance from the same sounds.

Template Sounds Combine to Form a Signal
The individual component sounds combine to form the final complex sounds that we perceive. Notes form music. Phoneme-like structures combine in utterances.
Sound in general is composed of such "building blocks" or themes, which can be simple (e.g. notes) or complex (e.g. phonemes).
Our definition of a building block: the entire structure occurs repeatedly in the process of forming the signal.
Claim: learning the building blocks enables us to manipulate sounds.

The Mixture Multinomial
[Figure: two urns of numbered balls and the sequence of numbers called out]
A person is drawing balls from a pair of urns. Each ball has a number marked on it. You only hear the number drawn; you have no idea which urn it came from.
Estimate various facets of this process.

More complex: TWO pickers
[Figure: two pickers, two pots, and the numbers each picker calls out]
Two different pickers are drawing balls from the same pots. After each draw they call out the number and replace the ball. They select the pots with different probabilities.
From the numbers they call we must determine:
the probabilities with which each of them selects the pots;
the distribution of balls within the pots.

Solution
Analyze each of the callers separately. Compute the probability of selecting pots separately for each caller.
But combine the counts of balls in the pots!!

Recap with only one picker and two pots

Called  P(red|X)  P(blue|X)
6       .80       .20
4       .33       .67
5       .33       .67
1       .57       .43
2       .14       .86
3       .33       .67
4       .33       .67
5       .33       .67
2       .14       .86
2       .14       .86
1       .57       .43
4       .33       .67
3       .33       .67
4       .33       .67
6       .80       .20
2       .14       .86
1       .57       .43
6       .80       .20
Total   7.31      10.69

Probability of Red urn:
P(1 | Red) = 1.71/7.31 = 0.234
P(2 | Red) = 0.56/7.31 = 0.077
P(3 | Red) = 0.66/7.31 = 0.090
P(4 | Red) = 1.32/7.31 = 0.181
P(5 | Red) = 0.66/7.31 = 0.090
P(6 | Red) = 2.40/7.31 = 0.328

Probability of Blue urn (numerators are the blue-column sums for each number; the denominator is the blue total 10.69):
P(1 | Blue) = 1.29/10.69 = 0.121
P(2 | Blue) = 3.44/10.69 = 0.322
P(3 | Blue) = 1.34/10.69 = 0.125
P(4 | Blue) = 2.68/10.69 = 0.251
P(5 | Blue) = 1.34/10.69 = 0.125
P(6 | Blue) = 0.60/10.69 = 0.056

P(Z=Red) = 7.31/18 = 0.41
P(Z=Blue) = 10.69/18 = 0.59

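The counting step of the recap can be sketched in a few lines of Python. This is only an illustration: the posteriors P(red|X) are copied from the table, while the names `soft_counts`, `CALLED` and `P_RED_GIVEN` are hypothetical.

```python
# Posteriors P(red | X) as listed in the recap table (values from the slide).
P_RED_GIVEN = {1: 0.57, 2: 0.14, 3: 0.33, 4: 0.33, 5: 0.33, 6: 0.8}
# The 18 numbers called out, in table order.
CALLED = [6, 4, 5, 1, 2, 3, 4, 5, 2, 2, 1, 4, 3, 4, 6, 2, 1, 6]


def soft_counts(called, p_red_given):
    """Split every call into a red fragment and a blue fragment and add them up."""
    red = {x: 0.0 for x in p_red_given}
    blue = {x: 0.0 for x in p_red_given}
    for x in called:
        red[x] += p_red_given[x]
        blue[x] += 1.0 - p_red_given[x]
    return red, blue


red, blue = soft_counts(CALLED, P_RED_GIVEN)
total_red = sum(red.values())    # fragment count for the red urn
total_blue = sum(blue.values())  # fragment count for the blue urn

# Re-estimated parameters: urn priors and per-urn number distributions.
p_red = total_red / len(CALLED)
p_blue = total_blue / len(CALLED)
p_x_given_red = {x: c / total_red for x, c in red.items()}
p_x_given_blue = {x: c / total_blue for x, c in blue.items()}
```

Each number's probability within an urn is its fragment count in that urn divided by the urn's total fragment count, so both distributions sum to one by construction.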
Two pickers
Probability of drawing a number X for the first picker:
$P_1(X) = P_1(red)\,P(X \mid red) + P_1(blue)\,P(X \mid blue)$
Probability of drawing X for the second picker:
$P_2(X) = P_2(red)\,P(X \mid red) + P_2(blue)\,P(X \mid blue)$
Note: $P(X \mid red)$ and $P(X \mid blue)$ are the same for both pickers. The pots are the same, and the probability of drawing a ball marked with a particular number is the same for both.
The probability of selecting a particular pot is different for both pickers: $P_1(X)$ and $P_2(X)$ are not related.

Two pickers
Probability of drawing a number X for the first picker:
$P_1(X) = P_1(red)\,P(X \mid red) + P_1(blue)\,P(X \mid blue)$
Probability of drawing X for the second picker:
$P_2(X) = P_2(red)\,P(X \mid red) + P_2(blue)\,P(X \mid blue)$
Problem: given the set of numbers called out by both pickers, estimate
$P_1(color)$ and $P_2(color)$ for both colors;
$P(X \mid red)$ and $P(X \mid blue)$ for all values of X.

With TWO pickers
Two tables.
PICKER 1: the 18-row table from the recap (totals: red 7.31, blue 10.69).
PICKER 2:

Called  P(red|X)  P(blue|X)
4       .57       .43
4       .57       .43
3       .57       .43
2       .27       .73
1       .75       .25
6       .90       .10
5       .57       .43
Total   4.20      2.80

The probability of selecting pots is independently computed for the two pickers.

With TWO pickers
(Tables for PICKER 1 and PICKER 2 as on the previous slide.)
P(RED | PICKER1) = 7.31 / 18
P(BLUE | PICKER1) = 10.69 / 18
P(RED | PICKER2) = 4.2 / 7
P(BLUE | PICKER2) = 2.8 / 7

With TWO pickers
(Tables for PICKER 1 and PICKER 2 as before.)
To compute probabilities of numbers, combine the tables.
Total count of Red: 7.31 + 4.20 = 11.51
Total count of Blue: 10.69 + 2.80 = 13.49

With TWO pickers: The SECOND picker
(Tables for PICKER 1 and PICKER 2 as before, combined.)
Total count for Red: 11.51
Red:
Total count for 1: 2.46
Total count for 2: 0.83
Total count for 3: 1.23
Total count for 4: 2.46
Total count for 5: 1.23
Total count for 6: 3.30
P(6 | RED) = 3.3 / 11.51 = 0.29

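In Squiggles
Given a sequence of observations $O_{k,1}, O_{k,2}, \ldots$ from the k-th picker.
$N_{k,X}$ is the number of times the k-th picker called the number X.
Initialize $P_k(z)$ and $P(X \mid z)$ for pots z and numbers X.
Iterate:
For each number X, for each pot z and each picker k:
$P_k(z \mid X) = \frac{P_k(z)\,P(X \mid z)}{\sum_{z'} P_k(z')\,P(X \mid z')}$
Update the probability of numbers for the pots (counts pooled over all pickers):
$P(X \mid z) = \frac{\sum_k N_{k,X}\,P_k(z \mid X)}{\sum_k \sum_{X'} N_{k,X'}\,P_k(z \mid X')}$
Update the mixture weights, i.e. the probability of urn selection for each picker:
$P_k(z) = \frac{\sum_X N_{k,X}\,P_k(z \mid X)}{\sum_{z'} \sum_X N_{k,X}\,P_k(z' \mid X)}$
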
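A minimal sketch of this iteration, assuming random positive initialization and a fixed number of iterations (neither is specified on the slide). The function name `em_pickers` and the 0-based indexing of numbers are illustrative choices; the count vectors are tallied from the two pickers' call sequences above.

```python
import random


def em_pickers(counts, n_pots, n_iter=200, seed=0):
    """EM for several pickers sharing the same pots.

    counts[k][x] is N_{k,X}: how often picker k called number x (0-based).
    Returns (p_z, p_x_z) with p_z[k][z] = P_k(z) and p_x_z[z][x] = P(X|z).
    """
    rng = random.Random(seed)
    n_pickers, n_numbers = len(counts), len(counts[0])

    def normalized(values):
        total = sum(values)
        return [v / total for v in values]

    # Assumption: random positive starting point.
    p_z = [normalized([rng.random() + 0.1 for _ in range(n_pots)])
           for _ in range(n_pickers)]
    p_x_z = [normalized([rng.random() + 0.1 for _ in range(n_numbers)])
             for _ in range(n_pots)]

    for _ in range(n_iter):
        pz_acc = [[0.0] * n_pots for _ in range(n_pickers)]
        pxz_acc = [[0.0] * n_numbers for _ in range(n_pots)]
        for k in range(n_pickers):
            for x in range(n_numbers):
                if counts[k][x] == 0:
                    continue
                joint = [p_z[k][z] * p_x_z[z][x] for z in range(n_pots)]
                total = sum(joint)
                for z in range(n_pots):
                    # Fragment N_{k,X} * P_k(z|X) goes to pot z.
                    frag = counts[k][x] * joint[z] / total
                    pz_acc[k][z] += frag    # kept separate per picker
                    pxz_acc[z][x] += frag   # pooled over all pickers
        p_z = [normalized(row) for row in pz_acc]
        p_x_z = [normalized(row) for row in pxz_acc]
    return p_z, p_x_z


# Counts of the numbers 1..6 called by picker 1 (18 calls) and picker 2 (7 calls).
counts = [[3, 4, 2, 4, 2, 3], [1, 1, 1, 2, 1, 1]]
p_z, p_x_z = em_pickers(counts, n_pots=2)
```

The only coupling between pickers is the shared accumulator for P(X|z); the pot-selection probabilities are accumulated and normalized per picker.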
Signal Separation with the Urn model
What does the probability of drawing balls from urns have to do with sounds? Or images?
We shall see..

The representation
[Figure: waveform (AMPL vs TIME) and spectrogram (FREQ vs TIME)]
We represent signals spectrographically: a sequence of magnitude spectral vectors estimated from (overlapping) segments of signal, computed using the short-time Fourier transform.
Note: only the magnitude of the STFT is retained for operations. We will need the phase later for conversion back to a signal.

A Multinomial Model for Spectra
A generative model for one frame of a spectrogram.
A magnitude spectral vector obtained from a DFT represents spectral magnitude against discrete frequencies. This may be viewed as a histogram of draws from a multinomial.
[Figure: FRAME t; HISTOGRAM = power spectrum of frame t; $P_t(f)$ = probability distribution underlying the t-th spectral vector]
The balls are marked with discrete frequency indices from the DFT.

A more complex model
A "picker" has multiple urns. In each draw he first selects an urn, and then a ball from the urn.
The overall probability of drawing f is a mixture multinomial, since several multinomials (urns) are combined.
Two aspects: the probability with which he selects any urn, and the probability of frequencies within the urns.
[Figure: HISTOGRAM built from multiple draws]

The Picker Generates a Spectrogram
The picker has a fixed set of urns. Each urn has a different probability distribution over f.
He draws the spectrum for the first frame, in which he selects urns according to some probability $P_0(z)$.
Then he draws the spectrum for the second frame, in which he selects urns according to some probability $P_1(z)$.
And so on, until he has constructed the entire spectrogram.
The number of draws in each frame represents the RMS energy in that frame.

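The drawing procedure can be simulated directly. The sketch below assumes toy urn distributions and per-frame urn probabilities invented for illustration; `draw_spectrogram` is a hypothetical name.

```python
import random


def draw_spectrogram(bases, weights, draws_per_frame, seed=0):
    """Simulate the picker.

    bases[z][f]        = P(f|z), the distribution over frequencies in urn z
    weights[t][z]      = P_t(z), the urn-selection probabilities for frame t
    draws_per_frame[t] = number of draws made for frame t
    Returns a list of per-frame histograms over frequency bins.
    """
    rng = random.Random(seed)
    n_urns, n_freq = len(bases), len(bases[0])
    spectrogram = []
    for w_t, n_t in zip(weights, draws_per_frame):
        hist = [0] * n_freq
        for _ in range(n_t):
            z = rng.choices(range(n_urns), weights=w_t)[0]       # pick an urn
            f = rng.choices(range(n_freq), weights=bases[z])[0]  # pick a ball
            hist[f] += 1
        spectrogram.append(hist)
    return spectrogram


# Toy example: two urns with disjoint frequency support, three frames.
bases = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
spec = draw_spectrogram(bases, weights, draws_per_frame=[100, 50, 200])
```

Only the urn-selection probabilities change from frame to frame; the urns themselves are shared by all frames.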
The Picker Generates a Spectrogram
The URNS are the same for every frame. These are the component multinomials, or bases, for the source that generated the signal.
The only difference between frames is the probability with which he selects the urns:
$P_t(f) = \sum_z P_t(z)\,P(f \mid z)$
$P_t(f)$: frame-specific spectral distribution.
$P_t(z)$: frame (time) specific mixture weight.
$P(f \mid z)$: SOURCE-specific bases.

Spectral View of Component Multinomials
[Figure: the urns shown as spectra]
Each component multinomial (urn) is actually a normalized histogram over frequencies, $P(f \mid z)$, i.e. a spectrum.
Component multinomials represent latent spectral structures (bases) for the given sound source.
The spectrum for every analysis frame is explained as an additive combination of these latent spectral structures.

Spectral View of Component Multinomials
By "learning" the mixture multinomial model for any sound source we "discover" these latent spectral structures for the source.
The model can be learnt from spectrograms of a small amount of audio from the source, using the EM algorithm.

EM learning of bases
Initialize bases $P(f \mid z)$ for all z, for all f. Must decide on the number of urns.
For each frame t, initialize $P_t(z)$.

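EM Update Equations
Iterative process:
Compute the a posteriori probability of the z-th urn for the source, for each f:
$P_t(z \mid f) = \frac{P_t(z)\,P(f \mid z)}{\sum_{z'} P_t(z')\,P(f \mid z')}$
Compute the mixture weight of the z-th urn:
$P_t(z) = \frac{\sum_f P_t(z \mid f)\,S_t(f)}{\sum_{z'} \sum_f P_t(z' \mid f)\,S_t(f)}$
Compute the probabilities of the frequencies for the z-th urn:
$P(f \mid z) = \frac{\sum_t P_t(z \mid f)\,S_t(f)}{\sum_{f'} \sum_t P_t(z \mid f')\,S_t(f')}$
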
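The three updates can be sketched as follows. This is an illustration under stated assumptions, not the lecture's implementation: random positive initialization, a fixed iteration count, every frame having nonzero energy, and a toy 4-frame spectrogram. The names `plca` and `log_likelihood` are hypothetical.

```python
import math
import random


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def plca(S, n_bases, n_iter=100, seed=0):
    """Learn P_t(z) and P(f|z) from a magnitude spectrogram S[t][f] >= 0.

    Returns (weights, bases): weights[t][z] = P_t(z), bases[z][f] = P(f|z).
    Assumes every frame of S has nonzero energy.
    """
    rng = random.Random(seed)
    n_t, n_f = len(S), len(S[0])
    bases = [normalize([rng.random() + 0.1 for _ in range(n_f)])
             for _ in range(n_bases)]
    weights = [normalize([rng.random() + 0.1 for _ in range(n_bases)])
               for _ in range(n_t)]
    for _ in range(n_iter):
        w_acc = [[0.0] * n_bases for _ in range(n_t)]
        b_acc = [[0.0] * n_f for _ in range(n_bases)]
        for t in range(n_t):
            for f in range(n_f):
                if S[t][f] == 0:
                    continue
                joint = [weights[t][z] * bases[z][f] for z in range(n_bases)]
                total = sum(joint)
                for z in range(n_bases):
                    frag = S[t][f] * joint[z] / total  # P_t(z|f) * S_t(f)
                    w_acc[t][z] += frag
                    b_acc[z][f] += frag
        weights = [normalize(row) for row in w_acc]
        bases = [normalize(row) for row in b_acc]
    return weights, bases


def log_likelihood(S, weights, bases):
    """sum_t sum_f S_t(f) * log P_t(f), skipping empty bins."""
    ll = 0.0
    for t, frame in enumerate(S):
        for f, s in enumerate(frame):
            if s > 0:
                p = sum(w * b[f] for w, b in zip(weights[t], bases))
                ll += s * math.log(p)
    return ll


# Toy spectrogram: two "notes" played alone, together, and louder.
S = [[4, 2, 0, 0], [0, 0, 1, 3], [4, 2, 1, 3], [8, 4, 0, 0]]
weights, bases = plca(S, n_bases=2)
```

Because each EM iteration cannot decrease the likelihood, comparing `log_likelihood` after few and many iterations from the same seed is a cheap sanity check on an implementation.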
How the bases compose the signal
[Figure: spectrogram = contribution of urn 1 + contribution of urn 2 + ...]
The overall signal is the sum of the contributions of individual urns. Each urn contributes a different amount to each frame.
The contribution of the z-th urn to the t-th frame is given by $P(f \mid z)\,P_t(z)\,S_t$, where $S_t = \sum_f S_t(f)$.

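A small numeric sketch of this decomposition, with made-up weights and bases; `urn_contributions` is a hypothetical helper name.

```python
def urn_contributions(weights_t, bases, frame):
    """Contribution of each urn z to one frame: P(f|z) * P_t(z) * S_t."""
    energy = sum(frame)  # S_t = sum_f S_t(f)
    return [[energy * w * b_f for b_f in basis]
            for w, basis in zip(weights_t, bases)]


# Toy frame that is exactly explained by two bases.
bases = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
weights_t = [0.25, 0.75]
frame = [10, 40, 30]

contributions = urn_contributions(weights_t, bases, frame)
# Summing the contributions over urns gives the modeled spectrum of the frame.
modeled = [sum(c[f] for c in contributions) for f in range(len(frame))]
```

In this example the modeled spectrum equals the frame exactly because the frame was constructed from the bases; for real data the sum only approximates the observed spectrum.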
Learning Structures
[Figure: speech signal; bases $P(f \mid z)$ plotted against frequency; basis-specific spectrograms $P_t(z)$ plotted against time]
[Figure: the same analysis for an excerpt from Bach's Fugue in Gm]

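Bag of Spectrograms PLCA Model
[Figure: urns Z=1, Z=2, ..., Z=M, each holding a time pot $P(T \mid Z)$ and a frequency pot $P(F \mid Z)$]
Compose the entire spectrogram all at once.
Urns include two types of balls: one set of balls represents frequency F, the second has a distribution over time T.
Each draw:
Select an urn.
Draw F from the frequency pot.
Draw T from the time pot.
Increment the histogram at (T,F).
$P(t,f) = \sum_z P(z)\,P(t \mid z)\,P(f \mid z)$
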
The bag of spectrograms
[Figure: drawing procedure. Select an urn Z, DRAW (T,F) from its two pots, increment the histogram at (T,F). Repeat N times.]
Fundamentally equivalent to the bag of frequencies model, with some minor differences in estimation.

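Estimating the bag of spectrograms
EM update rules:
$P(z \mid t,f) = \frac{P(z)\,P(f \mid z)\,P(t \mid z)}{\sum_{z'} P(z')\,P(f \mid z')\,P(t \mid z')}$
$P(z) = \frac{\sum_t \sum_f P(z \mid t,f)\,S_t(f)}{\sum_{z'} \sum_t \sum_f P(z' \mid t,f)\,S_t(f)}$
$P(f \mid z) = \frac{\sum_t P(z \mid t,f)\,S_t(f)}{\sum_{f'} \sum_t P(z \mid t,f')\,S_t(f')}$
$P(t \mid z) = \frac{\sum_f P(z \mid t,f)\,S_t(f)}{\sum_{t'} \sum_f P(z \mid t',f)\,S_{t'}(f)}$
Can learn all parameters.
Can learn $P(t \mid z)$ and $P(z)$ only, given $P(f \mid z)$.
Can learn only $P(z)$.
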
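A sketch of these updates when all three sets of parameters are learned. Initialization, iteration count, the toy data and the name `bag_of_spectrograms` are assumptions made for illustration.

```python
import random


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


def bag_of_spectrograms(S, n_z, n_iter=100, seed=0):
    """EM for P(t,f) = sum_z P(z) P(t|z) P(f|z) on a spectrogram S[t][f] >= 0.

    Returns (p_z, p_t, p_f) with p_z[z] = P(z), p_t[z][t] = P(t|z),
    p_f[z][f] = P(f|z).
    """
    rng = random.Random(seed)
    n_t, n_f = len(S), len(S[0])
    p_z = normalize([rng.random() + 0.1 for _ in range(n_z)])
    p_t = [normalize([rng.random() + 0.1 for _ in range(n_t)]) for _ in range(n_z)]
    p_f = [normalize([rng.random() + 0.1 for _ in range(n_f)]) for _ in range(n_z)]
    for _ in range(n_iter):
        z_acc = [0.0] * n_z
        t_acc = [[0.0] * n_t for _ in range(n_z)]
        f_acc = [[0.0] * n_f for _ in range(n_z)]
        for t in range(n_t):
            for f in range(n_f):
                if S[t][f] == 0:
                    continue
                joint = [p_z[z] * p_t[z][t] * p_f[z][f] for z in range(n_z)]
                total = sum(joint)
                for z in range(n_z):
                    frag = S[t][f] * joint[z] / total  # P(z|t,f) * S_t(f)
                    z_acc[z] += frag
                    t_acc[z][t] += frag
                    f_acc[z][f] += frag
        p_z = normalize(z_acc)
        p_t = [normalize(row) for row in t_acc]
        p_f = [normalize(row) for row in f_acc]
    return p_z, p_t, p_f


S = [[4, 2, 0, 0], [0, 0, 1, 3], [4, 2, 1, 3], [8, 4, 0, 0]]
p_z, p_t, p_f = bag_of_spectrograms(S, n_z=2)
# Model of the whole normalized spectrogram.
model = [[sum(p_z[z] * p_t[z][t] * p_f[z][f] for z in range(2))
          for f in range(4)] for t in range(4)]
```

To learn only a subset of the parameters, the corresponding normalization step would simply be skipped and the known distribution kept fixed.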
How meaningful are these structures?
Are these really the "notes" of sound?
To investigate, let's go back in time..

The Engineer and the Musician
Once upon a time a rich potentate discovered a previously unknown recording of a beautiful piece of music. Unfortunately it was badly damaged.
He greatly wanted to find out what it would sound like if it were not.
So he hired an engineer and a musician to solve the problem..

The Engineer and the Musician
The engineer worked for many years. He spent much money and published many papers. Finally he had a somewhat scratchy restoration of the music..
The musician listened to the music carefully for a day, transcribed it, broke out his trusty keyboard and replicated the music.

The Prize
Who do you think won the princess?

The Engineer and the Musician
The engineer works on the signal: restore it.
The musician works on his familiarity with music. He knows how music is composed. He can identify notes and their cadence (but it took many, many years to learn these skills). He uses these skills to recompose the music.

What the musician can do
Notes are distinctive. The musician knows notes (of all instruments).
He can:
Detect notes in the recording, even if it is scratchy.
Reconstruct damaged music.
Transcribe individual components.
Reconstruct separate portions of the music.

Music over a telephone
The King actually got music over a telephone. The musician must restore it..
Bandwidth Expansion
Problem: a given speech signal only has frequencies in the 300 Hz to 3.5 kHz range (telephone-quality speech).
Can we estimate the rest of the frequencies?

Bandwidth Expansion
The picker has drawn the histograms for every frame in the signal.
However, we are only able to observe the number of draws of some frequencies and not the others.
We must estimate the draws of the unseen frequencies.

Bandwidth Expansion: Step 1 Learning
From a collection of full-bandwidth training data that are similar to the bandwidth-reduced data, learn spectral bases, using the procedure described earlier.
Each magnitude spectral vector is a mixture of a common set of bases. Use EM to learn the bases from them.
Basically learning the "notes".

Bandwidth Expansion: Step 2 Estimation
[Figure: the learned bases and the per-frame mixture weights $P_1(z), P_2(z), \ldots, P_t(z)$]
Using only the observed frequencies in the bandwidth-reduced data, estimate the mixture weights $P_t(z)$ for the bases learned in Step 1.
Find out which notes were active at what time.

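Step 2
Iterative process ("Transcribe"):
Compute the a posteriori probability of the z-th urn for the speaker, for each observed f:
$P_t(z \mid f) = \frac{P_t(z)\,P(f \mid z)}{\sum_{z'} P_t(z')\,P(f \mid z')}$
Compute the mixture weight of the z-th urn for each frame t:
$P_t(z) = \frac{\sum_{f \in \text{observed frequencies}} P_t(z \mid f)\,S_t(f)}{\sum_{z'} \sum_{f \in \text{observed frequencies}} P_t(z' \mid f)\,S_t(f)}$
$P(f \mid z)$ was obtained from training data and will not be re-estimated.
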
Step 3 and Step 4: Recompose
Compose the complete probability distribution for each frame, using the mixture weights estimated in Step 2:
$P_t(f) = \sum_z P_t(z)\,P(f \mid z)$
Note that we are using mixture weights estimated from the reduced set of observed frequencies. This also gives us estimates of the probabilities of the unobserved frequencies.
Use the complete probability distribution $P_t(f)$ to predict the unobserved frequencies!

Predicting from $P_t(f)$: Simplified Example
A single urn with only red and blue balls.
Given that out of an unknown number of draws exactly m were red, how many were blue?
One simple solution:
Total number of draws N = m / P(red)
The number of blue balls drawn = N * P(blue)
The actual multinomial solution is only slightly more complex.

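The simple solution amounts to two lines of arithmetic. In this sketch the function name `predict_unseen` and the numbers in the example are invented for illustration.

```python
def predict_unseen(m_red, p_red):
    """Estimate the total number of draws and the unseen blue draws.

    m_red: number of red balls observed
    p_red: known probability of drawing red, so P(blue) = 1 - p_red
    """
    n_total = m_red / p_red              # N = m / P(red)
    n_blue = n_total * (1.0 - p_red)     # N * P(blue)
    return n_total, n_blue


# Example: 30 red draws were seen and P(red) = 0.6.
n_total, n_blue = predict_unseen(30, 0.6)
```

With these numbers the estimate is 50 draws in total, 20 of them blue.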
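The negative multinomial
Given $P(X)$ for all outcomes X.
Observed: $n(X_1), n(X_2), \ldots, n(X_k)$. What are $n(X_{k+1}), n(X_{k+2}), \ldots$?
$N_o$ is the total number of observed counts: $n(X_1) + n(X_2) + \ldots + n(X_k)$.
$P_o$ is the total probability of observed events: $P(X_1) + P(X_2) + \ldots + P(X_k)$.
$P\big(n(X_{k+1}), n(X_{k+2}), \ldots\big) = \frac{\Gamma\big(N_o + \sum_{i>k} n(X_i)\big)}{\Gamma(N_o)\,\prod_{i>k} \Gamma\big(n(X_i)+1\big)}\; P_o^{N_o} \prod_{i>k} P(X_i)^{n(X_i)}$
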
Estimating unobserved frequencies
Expected value of the number of draws from a negative multinomial:
$\hat{N}_t = \frac{\sum_{f \in \text{observed frequencies}} S_t(f)}{\sum_{f \in \text{observed frequencies}} P_t(f)}$
Estimated spectrum in unobserved frequencies:
$\hat{S}_t(f) = \hat{N}_t\,P_t(f)$

Overall Solution
Learn the "urns" for the signal source from broadband training data.
For each frame of the reduced-bandwidth test utterance, find mixture weights $P_t(z)$ for the urns. Ignore (marginalize) the unseen frequencies.
Given the complete mixture multinomial distribution for each frame, estimate the spectrum (histogram) at the unseen frequencies.

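Steps 2 to 4 for a single frame can be sketched as below, assuming the bases are already known. The name `expand_frame`, the uniform initialization, the iteration count and the toy bases are all illustrative assumptions.

```python
def expand_frame(s_obs, observed, bases, n_iter=200):
    """Bandwidth-expand one frame.

    s_obs[i]    = observed magnitude at frequency bin observed[i]
    bases[z][f] = P(f|z) over ALL bins, learned beforehand and kept fixed
    Returns (weights, spectrum): the estimated P_t(z) and the estimated
    spectrum N_hat * P_t(f) over all bins.
    """
    n_z, n_f = len(bases), len(bases[0])
    weights = [1.0 / n_z] * n_z  # assumption: uniform starting point
    for _ in range(n_iter):
        acc = [0.0] * n_z
        for s, f in zip(s_obs, observed):
            joint = [weights[z] * bases[z][f] for z in range(n_z)]
            total = sum(joint)
            if s == 0 or total == 0.0:
                continue
            for z in range(n_z):
                acc[z] += s * joint[z] / total  # P_t(z|f) * S_t(f), observed f only
        norm = sum(acc)
        weights = [a / norm for a in acc]
    # Complete distribution over all bins, including the unseen ones.
    p_t = [sum(weights[z] * bases[z][f] for z in range(n_z)) for f in range(n_f)]
    n_hat = sum(s_obs) / sum(p_t[f] for f in observed)
    return weights, [n_hat * p for p in p_t]


# Toy frame generated with weights (0.75, 0.25) and 1000 draws; bin 3 is unseen.
bases = [[0.5, 0.2, 0.1, 0.2], [0.1, 0.2, 0.5, 0.2]]
weights, spectrum = expand_frame([400, 200, 200], [0, 1, 2], bases)
```

In this toy setup both bases put the same total mass on the observed bins, so the generating weights are recovered and the unseen bin is predicted as 200; with bases whose observed mass differs, the weight estimate is only approximate.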
Prediction of Audio
An example with random spectral holes.

Predicting frequencies
[Figure panels: reduced-bandwidth data; bases learned from this; bandwidth-expanded version]

Resolving the components
The musician wants to follow the individual tracks in the recording..
Effectively "separate" or "enhance" them against the background.

Signal Separation from Monaural Recordings
Multiple sources are producing sound simultaneously. The combined signals are recorded over a single microphone.
The goal is to selectively separate out the signal for a target source in the mixture, or at least to enhance the signals from a selected source.

Supervised separation: Example with two sources
[Figure: bases of source 1 and bases of source 2]
Each source has its own bases, which can be learned from unmixed recordings of the source.
All bases combine to generate the mixed signal.
Goal: estimate the contribution of individual sources.

Supervised separation: Example with two sources
The bases $P(f \mid z)$ of both sources are KNOWN A PRIORI.
$P_t(f) = \sum_{\text{all } z} P_t(z)\,P(f \mid z)$
$P_t(f) = \sum_{z \text{ for source1}} P_t(z)\,P(f \mid z) + \sum_{z \text{ for source2}} P_t(z)\,P(f \mid z)$
Find mixture weights for all bases for each frame.
Segregate the contribution of bases from each source:
$P_t^{\text{source1}}(f) = \sum_{z \text{ for source1}} P_t(z)\,P(f \mid z)$
$P_t^{\text{source2}}(f) = \sum_{z \text{ for source2}} P_t(z)\,P(f \mid z)$

Separating the Sources: Cleaner Solution
For each frame:
Given $S_t(f)$, the spectrum at frequency f of the mixed signal,
estimate $S_{t,i}(f)$, the spectrum of the separated signal for the i-th source at frequency f.
A simple maximum a posteriori estimator:
$\hat{S}_{t,i}(f) = S_t(f)\,\frac{\sum_{z \text{ for source } i} P_t(z)\,P(f \mid z)}{\sum_{\text{all } z} P_t(z)\,P(f \mid z)}$

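The estimator distributes each bin of the mixed spectrum among the sources in proportion to their modeled share. A sketch for one frame, assuming the mixture weights have already been estimated; `separate_frame` and the toy values are hypothetical.

```python
def separate_frame(s_mix, weights, bases, source_of_basis, n_sources):
    """Split one mixed frame among sources.

    s_mix[f]           = S_t(f), the mixed magnitude spectrum
    weights[z]         = P_t(z) for every basis of every source
    bases[z][f]        = P(f|z)
    source_of_basis[z] = index of the source that basis z belongs to
    Returns estimates[i][f], the separated spectrum of source i.
    """
    n_f = len(s_mix)
    per_source = [[0.0] * n_f for _ in range(n_sources)]
    for z, (w, basis) in enumerate(zip(weights, bases)):
        for f in range(n_f):
            per_source[source_of_basis[z]][f] += w * basis[f]
    total = [sum(per_source[i][f] for i in range(n_sources)) for f in range(n_f)]
    return [[s_mix[f] * per_source[i][f] / total[f] if total[f] > 0 else 0.0
             for f in range(n_f)]
            for i in range(n_sources)]


# Toy example: one basis for source 0, two bases for source 1.
bases = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5], [0.25, 0.25, 0.25, 0.25]]
weights = [0.5, 0.25, 0.25]
estimates = separate_frame([10, 10, 6, 6], weights, bases, [0, 1, 1], n_sources=2)
```

By construction the per-source estimates add up to the mixed spectrum in every bin where the model assigns nonzero probability.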
Semi-supervised separation: Example with two sources
The bases $P(f \mid z)$ of source 2 are KNOWN A PRIORI; the bases of source 1 are UNKNOWN.
$P_t(f) = \sum_{\text{all } z} P_t(z)\,P(f \mid z) = \sum_{z \text{ for source1}} P_t(z)\,P(f \mid z) + \sum_{z \text{ for source2}} P_t(z)\,P(f \mid z)$
Estimate the unknown $P(f \mid z)$ for source 1 from the mixed signal, in addition to all $P_t(z)$.

Separating Mixed Signals: Examples
"Raise my rent" by David Gilmour:
Background music "bases" learnt from 5 seconds of music-only segments within the song.
Lead guitar "bases" learnt from the rest of the song.
Norah Jones singing "Sunrise":
A more difficult problem: original audio clipped!
Background music bases learnt from 5 seconds of music-only segments.

Where it works
When the spectral structures of the two sound sources are distinct, i.e. they don't look much like one another.
E.g. vocals and music.
E.g. lead guitar and music.
Not as effective when the sources are similar: voice on voice.

Separate overlapping speech
Bases for both speakers learnt from 5-second recordings of individual speakers.
Shows improvement of about 5 dB in Speaker-to-Speaker ratio for both speakers.
Improvements are worse for same-gender mixtures.

Can it be improved?
Yes.
Tweaking:
More training data per source.
More bases per source (typically about 40, but going up helps).
Adjusting FFT sizes and windows in the signal processing.
And / or algorithmic improvements:
Sparse overcomplete representations.
Nearest-neighbor representations.
Etc..

More on the topic
Shift-invariant representations

Patterns extend beyond a single frame
[Figure: four bars from a music example]
The spectral patterns are actually patches: not all frequencies fall off in time at the same rate.
The basic unit is a spectral patch, not a spectrum.
Extend the model to consider this phenomenon.

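Shift-Invariant Model
[Figure: super-urns Z=1, Z=2, ..., Z=M, each with a patch sub-urn $P(\tau, f \mid Z)$ and a location sub-urn $P(T \mid Z)$]
Employs the bag of spectrograms model.
Each "super-urn" (z) has two sub-urns.
One sub-urn now stores a bivariate distribution: each ball has a $(\tau, f)$ pair marked on it (the bases), where $\tau$ is the time offset within the patch.
Balls in the other sub-urn merely have a time T marked on them (the location).
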
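The shift-invariant model
[Figure: drawing procedure]
Select a super-urn Z. DRAW $(\tau, f)$ from its patch sub-urn and T from its location sub-urn. Increment the histogram at $(T + \tau, f)$. Repeat N times.
$P(t,f) = \sum_z P(z) \sum_{T} P(T \mid z)\,P(t - T, f \mid z)$, where $P(t - T, f \mid z)$ is the patch distribution evaluated at offset $\tau = t - T$.
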
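The distribution implied by this drawing procedure can be computed by placing every patch at every location, weighted by the location probability. The sketch below assumes toy patch and location distributions; `shift_invariant_model` is a hypothetical name.

```python
def shift_invariant_model(p_z, patches, locations):
    """Compute P(t,f) = sum_z P(z) sum_T P(T|z) P(t-T, f|z).

    p_z[z]            = P(z)
    patches[z][tau][f] = P(tau, f | z), the time-frequency patch of super-urn z
    locations[z][T]   = P(T | z), the distribution over patch onsets
    Returns model[t][f], long enough to hold every shifted patch.
    """
    n_f = len(patches[0][0])
    n_t = max(len(loc) + len(pat) - 1 for loc, pat in zip(locations, patches))
    model = [[0.0] * n_f for _ in range(n_t)]
    for pz, patch, loc in zip(p_z, patches, locations):
        for T, p_loc in enumerate(loc):
            for tau, row in enumerate(patch):
                for f, p_patch in enumerate(row):
                    model[T + tau][f] += pz * p_loc * p_patch
    return model


# One super-urn whose 2x2 patch always starts at frame 1.
patches = [[[0.4, 0.1], [0.3, 0.2]]]
locations = [[0.0, 1.0, 0.0]]
model = shift_invariant_model([1.0], patches, locations)
```

With an impulse as the location distribution the model is just the patch shifted to that frame; a spread-out location distribution superimposes shifted copies of the patch.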
Estimating Parameters
The maximum likelihood estimate follows the fragmentation and counting strategy.
Two-step fragmentation:
Each instance is fragmented into the super-urns.
The fragment in each super-urn is further fragmented into each time shift, since one can arrive at a given t by selecting any T from $P(T \mid Z)$ and the appropriate shift $t - T$ from $P(\tau, f \mid Z)$.

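Shift invariant model: Update Rules
Given data (spectrogram) $S(t,f)$.
Initialize $P(z)$, $P(\tau, f \mid z)$, $P(T \mid z)$.
Iterate:
Fragment:
$P(z \mid t,f) = \frac{P(z) \sum_{T} P(T \mid z)\,P(t-T, f \mid z)}{\sum_{z'} P(z') \sum_{T} P(T \mid z')\,P(t-T, f \mid z')}$
$P(T \mid z,t,f) = \frac{P(T \mid z)\,P(t-T, f \mid z)}{\sum_{T'} P(T' \mid z)\,P(t-T', f \mid z)}$
Count:
$P(z) = \frac{\sum_t \sum_f P(z \mid t,f)\,S(t,f)}{\sum_{z'} \sum_t \sum_f P(z' \mid t,f)\,S(t,f)}$
$P(T \mid z) = \frac{\sum_t \sum_f P(z \mid t,f)\,P(T \mid z,t,f)\,S(t,f)}{\sum_{T'} \sum_t \sum_f P(z \mid t,f)\,P(T' \mid z,t,f)\,S(t,f)}$
$P(\tau, f \mid z) = \frac{\sum_T P(z \mid T+\tau, f)\,P(T \mid z, T+\tau, f)\,S(T+\tau, f)}{\sum_{\tau'} \sum_{f'} \sum_T P(z \mid T+\tau', f')\,P(T \mid z, T+\tau', f')\,S(T+\tau', f')}$
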
An Example
Two distinct sounds occurring with different repetition rates within a signal.
[Figure panels: INPUT SPECTROGRAM; discovered "patch" bases; contribution of individual bases to the recording]

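Another example: Dereverberation
[Figure: the reverberant spectrogram modeled as one time-frequency basis combined with a distribution over time shifts, Z=1]
Assume generation by a single latent variable (one super-urn).
The t-f basis is the "clean" spectrogram.
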
Dereverberation: an example
The "basis" spectrum must be made sparse for effectiveness.
Dereverberation of gammatone spectrograms is also particularly effective for speech recognition.

Shift-Invariance in Two dimensions
Patterns may be substructures: repeating patterns that may occur anywhere, not just in the same frequency or time location.
More apparent in image data.

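The Two-D Shift-Invariant Model
[Figure: super-pots Z=1, Z=2, ..., Z=M, each with two sub-pots]
Both sub-pots are distributions over (time, frequency) pairs.
One sub-pot, $P(\tau, \phi \mid Z)$, represents the basic pattern (the basis).
The other sub-pot, $P(T, F \mid Z)$, represents the location.
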
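The shift-invariant model
[Figure: drawing procedure]
Select a super-pot Z. DRAW $(\tau, \phi)$ from its pattern sub-pot and $(T, F)$ from its location sub-pot. Increment the histogram at $(T + \tau, F + \phi)$. Repeat N times.
$P(t,f) = \sum_z P(z) \sum_{T,F} P(T, F \mid z)\,P(t - T, f - F \mid z)$
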
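Two-D Shift Invariance: Estimation
Fragment and count strategy.
Fragment into super-pots, but also into each T and F, since a given (t,f) can be obtained from any (T,F).
Fragment:
$P(z \mid t,f) = \frac{P(z) \sum_{T,F} P(T,F \mid z)\,P(t-T, f-F \mid z)}{\sum_{z'} P(z') \sum_{T,F} P(T,F \mid z')\,P(t-T, f-F \mid z')}$
$P(T,F \mid z,t,f) = \frac{P(T,F \mid z)\,P(t-T, f-F \mid z)}{\sum_{T',F'} P(T',F' \mid z)\,P(t-T', f-F' \mid z)}$
Count:
$P(z) = \frac{\sum_{t,f} P(z \mid t,f)\,S(t,f)}{\sum_{z'} \sum_{t,f} P(z' \mid t,f)\,S(t,f)}$
$P(T,F \mid z) = \frac{\sum_{t,f} P(z \mid t,f)\,P(T,F \mid z,t,f)\,S(t,f)}{\sum_{T',F'} \sum_{t,f} P(z \mid t,f)\,P(T',F' \mid z,t,f)\,S(t,f)}$
$P(\tau,\phi \mid z) = \frac{\sum_{T,F} P(z \mid T+\tau, F+\phi)\,P(T,F \mid z, T+\tau, F+\phi)\,S(T+\tau, F+\phi)}{\sum_{\tau',\phi'} \sum_{T,F} P(z \mid T+\tau', F+\phi')\,P(T,F \mid z, T+\tau', F+\phi')\,S(T+\tau', F+\phi')}$
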
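Shift-Invariance: Comments
The location distribution $P(T, F \mid Z)$ and the pattern distribution $P(\tau, \phi \mid Z)$ are symmetric: we cannot control which of them learns patterns and which the locations.
Answer: constraints.
Constrain the size of $P(\tau, \phi \mid Z)$, i.e. the size of the basic patch.
Other tricks, e.g. sparsity.
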
Shift-Invariance in Many Dimensions
The generic notion of shift-invariance can be extended to multivariate data, not just two-D data like images and spectrograms.
Shift invariance can be applied to any subset of variables.

Example: 2-D shift invariance

Example: 3-D shift invariance
[Figure panels: input data; discovered patches; patch locations]
The original figure has multiple handwritten renderings of three characters, in different colours.
The algorithm learns the three characters and identifies their locations in the figure.

The constant Q transform
[Figure: a bank of band-pass filters]
Spectrographic analysis with a bank of constant-Q filters.
The bandwidth of the filters increases with center frequency.
The spacing between filter center frequencies increases with frequency: logarithmic spacing.

Constant Q representation of Speech
Energy at the output of a bank of filters with logarithmically spaced center frequencies.
Like a spectrogram with a non-linear frequency axis.
Changes in pitch become vertical translations of the spectrogram: different notes of an instrument will have the same patterns at different vertical locations.

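Why a pitch change becomes a vertical translation can be checked numerically. The sketch assumes the common geometric spacing of center frequencies, $f_k = f_{min} \cdot 2^{k/B}$ with B bins per octave, and takes the bandwidth of a filter to be the gap to the next center frequency; both choices, and all function names, are assumptions made for illustration rather than details given on the slides.

```python
import math


def constant_q_centers(f_min, bins_per_octave, n_bins):
    """Geometrically spaced center frequencies f_k = f_min * 2**(k / B)."""
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


def q_factor(bins_per_octave):
    """Q = f_k / bandwidth_k, with bandwidth_k = f_{k+1} - f_k (assumption)."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def bin_position(freq, f_min, bins_per_octave):
    """Fractional bin index of a frequency on the log-frequency axis."""
    return bins_per_octave * math.log2(freq / f_min)


centers = constant_q_centers(f_min=55.0, bins_per_octave=12, n_bins=49)
q = q_factor(12)

# A harmonic series at pitch 440 Hz and the same series one octave higher.
harmonics_a = [bin_position(h * 440.0, 55.0, 12) for h in (1, 2, 3)]
harmonics_b = [bin_position(h * 880.0, 55.0, 12) for h in (1, 2, 3)]
shifts = [b - a for a, b in zip(harmonics_a, harmonics_b)]
```

Every harmonic moves by the same number of bins (here 12, one octave), so the whole harmonic pattern is translated vertically rather than stretched as it would be on a linear frequency axis.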
Pitch Tracking
Changing pitch becomes a vertical shift in the location of a basis.
The constant-Q spectrogram is modeled as a single pattern modulated by a vertical shift:
$P_t(f) = \sum_z P_t(z) \sum_{F} P_t(F \mid z)\,P(f - F \mid z)$
$P(f \mid z)$ is the "kernel" shown to the left; $P_t(F \mid z)$ is the distribution over vertical shifts in frame t.

Pitch Tracking
Left: a vocalized song.
Right: chord sequence.
The "impulse" distribution captures the "melody"!

Pitch Tracking
Having more than one basis (z) allows simultaneous pitch tracking of multiple sources.
Example: a voice and an instrument overlaid.
The "impulse" distribution shows the pitch of both separately.

In Conclusion
Surprising use of EM for audio analysis.
Various extensions: sparse estimation, exemplar-based methods..
Related deeply to non-negative matrix factorization.
TBD..