Journal of the Acoustical Society of America, Oct 1, 2014
The recently introduced shift method is applied to detect and characterize burst-pulse vocalizations produced by marine mammals. To this end, burst pulses are modeled as sequences of click-like events that repeat after a certain inter-click interval (ICI). The shift method is first used to emphasize repeating events within an input signal; the ICI can then be estimated. A qualitative comparison against the classical cepstrum is made using real data, and detection performance is measured using random trials of simulated data with impulsive noise. It is shown that although the cepstrum performs better in Gaussian noise at low signal-to-noise ratio, the shift method performs significantly better in impulsive noise.
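As a rough illustration of the detection idea (a stand-in for the shift method, not the method itself), the ICI of a click train can be estimated from the dominant peak of a plain autocorrelation, searched over a plausible ICI range. All function and parameter names below are invented for the sketch.

```python
import numpy as np

def estimate_ici(x, fs, min_ici=0.001, max_ici=0.05):
    """Estimate the inter-click interval (ICI) of a click train from the
    dominant peak of its autocorrelation, searched over a plausible
    ICI range (bounds in seconds)."""
    x = x - np.mean(x)
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]  # non-negative lags
    lo, hi = int(min_ici * fs), int(max_ici * fs)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return lag / fs

# Synthetic burst pulse: clicks every 10 ms, sampled at 10 kHz
fs = 10000
x = np.zeros(2000)
x[::100] = 1.0          # click train with ICI = 100 samples = 10 ms
ici = estimate_ici(x, fs)
print(ici)              # -> 0.01
```

A real burst-pulse detector would of course contend with noise and amplitude variation; this sketch only shows how a repetition interval surfaces as an autocorrelation peak.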
We review maximum entropy (MaxEnt) PDF projection, a method with wide potential applications in statistical inference. The method constructs a sampling distribution for a high-dimensional vector x based on knowing the sampling distribution p(z) of a lower-dimensional feature z = T(x). Under mild conditions, the distribution p(x) having the highest possible entropy among all distributions consistent with p(z) may be readily found. Furthermore, the MaxEnt p(x) may be sampled, making the approach useful in Monte Carlo methods. We review the theorem and present a case study in model order selection and classification for handwritten character recognition.
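The projection construction admits a compact numerical sketch. Below, a hypothetical 2-D example uses a standard-normal reference p0, the feature z = x1 + x2 (under which p0(z) = N(0, 2) is known in closed form), and an assumed feature density p(z) = N(1, 2); the projected PDF p(x) = p0(x) p(z)/p0(z) is then checked to integrate to one. The reference and feature choices are illustrative, not from the paper.

```python
import numpy as np

def normal_pdf(v, mean=0.0, var=1.0):
    return np.exp(-(v - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def projected_pdf(x, T, p0_x, p0_z, p_z):
    """PDF projection: p(x) = p0(x) / p0(T(x)) * p(T(x)), a PDF on x
    consistent with the feature density p(z) under reference p0."""
    z = T(x)
    return p0_x(x) / p0_z(z) * p_z(z)

# 2-D example: feature z = x1 + x2, reference p0 = standard normal,
# so z ~ N(0, 2) under p0; assumed feature density p(z) = N(1, 2).
T = lambda x: x[..., 0] + x[..., 1]
p0_x = lambda x: normal_pdf(x[..., 0]) * normal_pdf(x[..., 1])
p0_z = lambda z: normal_pdf(z, 0.0, 2.0)
p_z = lambda z: normal_pdf(z, 1.0, 2.0)

# Verify on a grid that the projected PDF integrates to ~1
g = np.linspace(-8.0, 8.0, 400)
X = np.stack(np.meshgrid(g, g), axis=-1)
total = np.sum(projected_pdf(X, T, p0_x, p0_z, p_z)) * (g[1] - g[0]) ** 2
print(total)  # ~ 1.0
```

The integral equals one for any valid p(z), since integrating p0(x) p(z)/p0(z) over x reduces to integrating p(z) over z.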
The projected belief network (PBN) is a deep layered generative network with a tractable likelihood function (LF). It can be used as a Bayesian classifier by training a separate model on each data class and classifying based on maximum likelihood (ML). Unlike other generative models with a tractable LF, the PBN can share an embodiment with a feed-forward classifier network. By training a PBN with a cost function that combines the LF with classifier cross-entropy, its network weights can be “aligned” to the decision boundaries separating each data class from the others. The result is a Bayesian classifier that rivals state-of-the-art discriminative classifiers. These claims are backed up by classification experiments involving spectrograms of spoken keywords and handwritten characters.
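The combined training cost described here can be sketched generically as a weighted sum of the negative log-likelihood (generative term) and the classifier cross-entropy (discriminative term). This is a minimal stand-in, not the PBN implementation; `alpha` and all other names are illustrative.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def combined_cost(log_likelihood, logits, labels, alpha=0.5):
    """Hybrid GEN/DISC cost: convex combination of the mean negative
    log-likelihood and the classifier cross-entropy. alpha trades off
    the generative and discriminative objectives."""
    nll = -np.mean(log_likelihood)
    probs = softmax(logits)
    ce = -np.mean(np.log(probs[np.arange(len(labels)), labels]))
    return (1.0 - alpha) * nll + alpha * ce

# Toy usage: 3 samples, 2 classes, all correctly classified
ll = np.array([-1.0, -2.0, -1.5])
logits = np.array([[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]])
labels = np.array([0, 1, 0])
cost = combined_cost(ll, logits, labels)
print(cost)
```

Minimizing such a cost pulls the weights toward both high data likelihood and low classification error, which is the "alignment" effect described in the abstract.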
The paper addresses the problem of detecting echoes from linear frequency modulated (LFM) active sonar transmissions when the propagation channel exhibits time spreading distortion (TSD) and fast fading distortion (FFD). Both distortion mechanisms reduce the performance of "high resolution" LFM replica correlators, which are designed to improve detection performance in reverberation, so robust detectors are required. The paper compares two robust detectors: the segmented replica correlator (SRC) and the replica correlator followed by incoherent integration (RCI). The author shows that for LFM, (1) TSD and FFD have an equivalent effect on the signal, (2) the RCI and SRC have equivalent reverberation-only output statistics, and (3) the RCI and SRC have equivalent performance against either TSD or FFD. The author points out that while virtually all robust active sonar detectors use the SRC, the RCI is preferable when the amount of distortion is not known a priori.
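A minimal sketch of the segmented-replica-correlator idea: split the replica into segments, correlate each segment separately, align the segment outputs to the start of the full replica, and sum their squared magnitudes incoherently. This is a simplified illustration, not the paper's detector.

```python
import numpy as np

def src(x, replica, n_seg):
    """Segmented replica correlator: correlate the received signal with
    each replica segment separately, align the outputs to the start of
    the full replica, and sum their squared magnitudes incoherently."""
    seg_len = len(replica) // n_seg
    n_out = len(x) - len(replica) + 1
    out = np.zeros(n_out)
    for k in range(n_seg):
        seg = replica[k * seg_len:(k + 1) * seg_len]
        c = np.correlate(x, seg, mode="valid")
        out += np.abs(c[k * seg_len:k * seg_len + n_out]) ** 2
    return out

# Embed a replica at sample 50 in weak noise; the SRC output
# should peak at the embedding position.
rng = np.random.default_rng(0)
replica = rng.standard_normal(64)
x = 0.1 * rng.standard_normal(256)
x[50:114] += replica
out = src(x, replica, n_seg=4)
peak = int(np.argmax(out))
print(peak)  # -> 50
```

Because each segment is integrated coherently only over its own (short) duration, the detector tolerates phase decorrelation across the pulse, which is why segmented processing is robust to time spreading and fast fading.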
Journal of the Acoustical Society of America, Oct 1, 2019
Despite the success of discriminative (DISC) classifiers, there remain serious flaws in DISC approaches. Using adversarial sampling, a DISC network can be fooled into producing any desired classifier output by subtle changes in the data sample. The problem lies in the goal of DISC methods: assigning class identity without further considerations. In contrast, generative (GEN) methods model the underlying data generation process and can be interrogated and groomed by researchers, so they have the potential to operate well with little training data. But the GEN task is more difficult, so performance lags behind. Given time and effort, however, GEN performance can even surpass that of DISC methods, as shown by Hinton's deep belief network in 2006, which performed better than comparable fully-connected (non-CONV) DISC networks. As CONV DISC networks (CNNs) have seen a quantum leap in performance, GEN methods have once again fallen behind. This talk details unexplored avenues to greatly improve CONV GEN models (CGMs). One avenue, “max-pooling for CGMs,” is backed up by promising experiments. Max-pooling is a dimension-reduction step that is partly responsible for the success of CNNs, but it presents a severe obstacle for CGMs by discarding the pooling positioning information (PPI). We show that encoding PPI into the features greatly improves the quality of GEN auto-encoders. When PPI together with added GEN neurons are “appended” to existing DISC CNNs, a hybrid GEN/DISC CNN is created with the best qualities of both approaches.
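The role of PPI can be illustrated with a toy 2x2 max-pooling that also returns the within-window argmax positions, allowing an exact unpooling of the maxima. This is a schematic of the idea only; the function names and layout are invented for the sketch.

```python
import numpy as np

def max_pool_with_ppi(x, k=2):
    """k-by-k max-pooling that also returns the pooling positioning
    information (PPI): the within-window argmax of each pool."""
    h, w = x.shape
    xr = (x.reshape(h // k, k, w // k, k)
           .transpose(0, 2, 1, 3)
           .reshape(h // k, w // k, k * k))
    return xr.max(axis=-1), xr.argmax(axis=-1)

def unpool_with_ppi(pooled, ppi, k=2):
    """Place each pooled value back at its PPI position; without PPI
    the position would have to be guessed, losing information."""
    h, w = pooled.shape
    out = np.zeros((h, w, k * k))
    idx = np.indices((h, w))
    out[idx[0], idx[1], ppi] = pooled
    return (out.reshape(h, w, k, k)
               .transpose(0, 2, 1, 3)
               .reshape(h * k, w * k))

x = np.arange(16.0).reshape(4, 4)
pooled, ppi = max_pool_with_ppi(x)
y = unpool_with_ppi(pooled, ppi)
# each window maximum is restored to its exact original position
print(y[1, 1], y[3, 3])  # -> 5.0 15.0
```

An autoencoder that carries PPI through the bottleneck can thus invert the pooling stage exactly at the maxima, rather than smearing each pooled value over its window.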
International Conference on Information Fusion, Aug 4, 2016
We review recent theoretical results in maximum entropy (MaxEnt) PDF projection that provide a theoretical framework for fusing the information from multiple features for the purpose of general statistical inference. Given a high-dimensional input data vector x and several dimension-reducing feature transformations zi = Ti(x), we consider the problem of estimating the probability density function (PDF) of x by fusing the information in the various features. When the PDF of one feature p(zi) is known or has been estimated, the PDF pi(x) that has maximum entropy among all PDFs consistent with p(zi) can be constructed. This is called the maximum entropy projected PDF, and it can serve as a generative model from which random samples can be drawn. The information from all the features can be fused into a common classifier structure either by testing each hypothesis with a different feature or by combining the various projected PDFs into a mixture PDF. We review related theoretical and experimental results and provide a simulated classification experiment to highlight the potential of the method.
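The mixture-fusion option mentioned above can be sketched in a few lines: evaluate the projected PDFs pi(x) on common points and form a weighted mixture. The 1-D stand-in densities below are purely illustrative; real projected PDFs would come from the projection construction itself.

```python
import numpy as np

def gaussian(v, mean, var):
    return np.exp(-(v - mean) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

def fuse_mixture(pdf_values, weights=None):
    """Combine projected PDFs p_i(x), evaluated at common points,
    into a mixture PDF sum_i w_i p_i(x)."""
    P = np.asarray(pdf_values)          # shape (n_features, n_points)
    if weights is None:
        weights = np.full(P.shape[0], 1.0 / P.shape[0])
    return np.asarray(weights) @ P

grid = np.linspace(-10.0, 10.0, 2001)
dx = grid[1] - grid[0]
p1 = gaussian(grid, -2.0, 1.0)   # stand-in projected PDF from feature 1
p2 = gaussian(grid, 3.0, 0.5)    # stand-in projected PDF from feature 2
mix = fuse_mixture([p1, p2])
total = np.sum(mix) * dx
print(total)  # ~ 1.0 (mixture of normalized PDFs is normalized)
```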
Journal of the Acoustical Society of America, Oct 1, 2003
This talk describes a new probabilistic method for classification called the ‘‘class-specific method’’ (CSM). CSM is able to avoid the ‘‘curse of dimensionality’’ that plagues most classifiers, which attempt to determine decision boundaries in a high-dimensional feature space. Using CSM, it is possible to build a theoretically optimum classifier without a common feature space. Separate low-dimensional feature sets may be defined for each class, while the decision functions are projected back to the common raw data space. CSM effectively extends classical classification theory to handle multiple feature spaces. It is completely general and requires no simplifying assumptions such as Gaussianity or that the data lie in linear subspaces. In real-data problems, CSM has shown orders-of-magnitude reductions in the false-alarm rate. CSM achieves this gain because it is able to make use of partial prior knowledge about the data classes, whereas the existing theory can only make use of full knowledge—that is, when the parametric forms of the data probability density functions (PDFs) are known.
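The flavor of class-specific classification can be conveyed by a toy example with two classes and a common noise reference H0 (x ~ N(0, I)). Each class has its own one-dimensional sufficient statistic, and each score is the log-likelihood ratio of that class's feature against H0, which makes scores from different feature spaces directly comparable. All distributions, closed-form ratios, and parameters below are invented for the illustration.

```python
import numpy as np

def llr_mean(z, n, mu=2.0):
    """log[ p(z | x~N(mu,1)) / p(z | H0) ] for z = sample mean;
    closed form: n*mu*z - n*mu^2/2."""
    return n * mu * z - n * mu ** 2 / 2.0

def llr_power(v, n, var=9.0):
    """log[ p(v | x~N(0,var)) / p(v | H0) ] for v = mean of x^2;
    closed form: n*v*(1 - 1/var)/2 - (n/2)*log(var)."""
    return n * v * (1.0 - 1.0 / var) / 2.0 - (n / 2.0) * np.log(var)

def csm_classify(x):
    """Class-specific classifier: class 0 (mean shift) scored by its
    mean feature, class 1 (power increase) by its power feature,
    both as likelihood ratios against the common reference H0."""
    n = len(x)
    scores = [llr_mean(np.mean(x), n), llr_power(np.mean(x ** 2), n)]
    return int(np.argmax(scores))

c0 = csm_classify(np.full(100, 2.0))          # mean-shift-like input
c1 = csm_classify(np.tile([3.0, -3.0], 50))   # high-power input
print(c0, c1)  # -> 0 1
```

Because both scores are ratios against the same H0, the comparison is valid even though the two features live in different (here one-dimensional) spaces, which is the core mechanism CSM generalizes.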
Papers by Paul Baggenstoss