Search | arXiv e-print repository

Building Trust in Mental Health Chatbots: Safety Metrics and LLM-Based Evaluation Tools

Authors: Jung In Park, Mahyar Abbasian, Iman Azimi, Dawn Bounds, Angela Jun, Jaesu Han, Robert McCarron, Jessica Borelli, Jia Li, Mona Mahmoudi, Carmen Wiedenhoeft, Amir Rahmani

Abstract: Objective: This study aims to develop and validate an evaluation framework to ensure the safety and reliability of mental health chatbots, which are increasingly popular due to their accessibility, human-like interactions, and context-aware support. Materials and Methods: We created an evaluation framework with 100 benchmark questions and ideal responses, and five guideline questions for chatbot responses. This framework, validated by mental health experts, was tested on a GPT-3.5-turbo-based chatbot. Automated evaluation methods explored included large language model (LLM)-based scoring, an agentic approach using real-time data, and embedding models to compare chatbot responses against ground truth standards. Results: The results highlight the importance of guidelines and ground truth for improving LLM evaluation accuracy. The agentic method, dynamically accessing reliable information, demonstrated the best alignment with human assessments. Adherence to a standardized, expert-validated framework significantly enhanced chatbot response safety and reliability. Discussion: Our findings emphasize the need for comprehensive, expert-tailored safety evaluation metrics for mental health chatbots. While LLMs have significant potential, careful implementation is necessary to mitigate risks. The superior performance of the agentic approach underscores the importance of real-time data access in enhancing chatbot reliability. Conclusion: The study validated an evaluation framework for mental health chatbots, proving its effectiveness in improving safety and reliability. Future work should extend evaluations to accuracy, bias, empathy, and privacy to ensure holistic assessment and responsible integration into healthcare. Standardized evaluations will build trust among users and professionals, facilitating broader adoption and improved mental health support through technology.

Submitted 3 August, 2024; originally announced August 2024.

arXiv:2408.00109

Back to the Continuous Attractor

Authors: Ábel Ságodi, Guillermo Martín-Sánchez, Piotr Sokół, Il Memming Park

Abstract: Continuous attractors offer a unique class of solutions for storing continuous-valued variables in recurrent system states for indefinitely long time intervals. Unfortunately, continuous attractors suffer from severe structural instability in general--they are destroyed by most infinitesimal changes of the dynamical law that defines them. This fragility limits their utility especially in biological systems as their recurrent dynamics are subject to constant perturbations. We observe that the bifurcations from continuous attractors in theoretical neuroscience models display various structurally stable forms. Although their asymptotic behaviors to maintain memory are categorically distinct, their finite-time behaviors are similar. We build on the persistent manifold theory to explain the commonalities between bifurcations from and approximations of continuous attractors. Fast-slow decomposition analysis uncovers the persistent manifold that survives the seemingly destructive bifurcation. Moreover, recurrent neural networks trained on analog memory tasks display approximate continuous attractors with predicted slow manifold structures. Therefore, continuous attractors are functionally robust and remain useful as a universal analogy for understanding analog memory.

Submitted 31 July, 2024; originally announced August 2024.

arXiv:2406.07488

ReduceFormer: Attention with Tensor Reduction by Summation

Authors: John Yang, Le An, Su Inn Park

Abstract: Transformers have excelled in many tasks including vision. However, efficient deployment of transformer models in low-latency or high-throughput applications is hindered by the computation in the attention mechanism which involves expensive operations such as matrix multiplication and Softmax. To address this, we introduce ReduceFormer, a family of models optimized for efficiency with the spirit of attention. ReduceFormer leverages only simple operations such as reduction and element-wise multiplication, leading to greatly simplified architecture and improved inference performance, with up to 37% reduction in latency and 44% improvement in throughput, while maintaining competitive accuracy comparable to other recent methods. The proposed model family is suitable for edge devices where compute resource and memory bandwidth are limited, as well as for cloud computing where high throughput is sought after.

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2406.06004

FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model

Authors: Yebin Lee, Imseong Park, Myungjoo Kang

Abstract: Most existing image captioning evaluation metrics focus on assigning a single numerical score to a caption by comparing it with reference captions. However, these methods do not provide an explanation for the assigned score. Moreover, reference captions are expensive to acquire. In this paper, we propose FLEUR, an explainable reference-free metric to introduce explainability into image captioning evaluation metrics. By leveraging a large multimodal model, FLEUR can evaluate the caption against the image without the need for reference captions, and provide the explanation for the assigned score. We introduce score smoothing to align as closely as possible with human judgment and to be robust to user-defined grading criteria. FLEUR achieves high correlations with human judgment across various image captioning evaluation benchmarks and reaches state-of-the-art results on Flickr8k-CF, COMPOSITE, and Pascal-50S within the domain of reference-free evaluation metrics. Our source code and results are publicly available at: https://github.com/Yebin46/FLEUR.

Submitted 9 June, 2024; originally announced June 2024.

Comments: Accepted at ACL (Main) 2024

arXiv:2405.03958

Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

Authors: Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

Abstract: Current state-of-the-art diffusion models employ U-Net architectures containing convolutional and (qkv) self-attention layers. The U-Net processes images while being conditioned on the time embedding input for each sampling step and the class or caption embedding input corresponding to the desired conditional generation. Such conditioning involves scale-and-shift operations to the convolutional layers but does not directly affect the attention layers. While these standard architectural choices are certainly effective, not conditioning the attention layers feels arbitrary and potentially suboptimal. In this work, we show that simply adding LoRA conditioning to the attention layers without changing or tuning the other parts of the U-Net architecture improves the image generation quality. For example, a drop-in addition of LoRA conditioning to EDM diffusion model yields FID scores of 1.91/1.75 for unconditional and class-conditional CIFAR-10 generation, improving upon the baseline of 1.97/1.79.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.11615

Factorized Diffusion: Perceptual Illusions by Noise Decomposition

Authors: Daniel Geng, Inbum Park, Andrew Owens

Abstract: Given a factorization of an image into a sum of linear components, we present a zero-shot method to control each individual component through diffusion model sampling. For example, we can decompose an image into low and high spatial frequencies and condition these components on different text prompts. This produces hybrid images, which change appearance depending on viewing distance. By decomposing an image into three frequency subbands, we can generate hybrid images with three prompts. We also use a decomposition into grayscale and color components to produce images whose appearance changes when they are viewed in grayscale, a phenomenon that naturally occurs under dim lighting. And we explore a decomposition by a motion blur kernel, which produces images that change appearance under motion blurring. Our method works by denoising with a composite noise estimate, built from the components of noise estimates conditioned on different prompts. We also show that for certain decompositions, our method recovers prior approaches to compositional generation and spatial control. Finally, we show that we can extend our approach to generate hybrid images from real images. We do this by holding one component fixed and generating the remaining components, effectively solving an inverse problem.

Submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.01371

eXponential FAmily Dynamical Systems (XFADS): Large-scale nonlinear Gaussian state-space modeling

Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

Abstract: State-space graphical models and the variational autoencoder framework provide a principled apparatus for learning dynamical systems from data. State-of-the-art probabilistic approaches are often able to scale to large problems at the cost of flexibility of the variational posterior or expressivity of the dynamics model. However, those consolidations can be detrimental if the ultimate goal is to learn a generative model capable of explaining the spatiotemporal structure of the data and making accurate forecasts. We introduce a low-rank structured variational autoencoding framework for nonlinear Gaussian state-space graphical models capable of capturing dense covariance structures that are important for learning dynamical systems with predictive capabilities. Our inference algorithm exploits the covariance structures that arise naturally from sample based approximate Gaussian message passing and low-rank amortized posterior updates -- effectively performing approximate variational smoothing with time complexity scaling linearly in the state dimensionality. In comparisons with other deep state-space model architectures our approach consistently demonstrates the ability to learn a more predictive generative model. Furthermore, when applied to neural physiological recordings, our approach is able to learn a dynamical system capable of forecasting population spiking and behavioral correlates from a small portion of single trials.

Submitted 31 May, 2024; v1 submitted 2 March, 2024; originally announced March 2024.

arXiv:2401.16553

SelectLLM: Can LLMs Select Important Instructions to Annotate?

Authors: Ritik Sachin Parkar, Jaehyung Kim, Jong Inn Park, Dongyeop Kang

Abstract: Instruction tuning benefits from large and diverse datasets; however, creating such datasets involves a high cost of human labeling. While synthetic datasets generated by large language models (LLMs) have partly solved this issue, they often contain low-quality data. One effective solution is selectively annotating unlabelled instructions, especially given the relative ease of acquiring unlabeled instructions or texts from various sources. However, how to select unlabelled instructions is not well-explored, especially in the context of LLMs. Therefore, we introduce SelectLLM, an alternative framework that leverages the capabilities of LLMs to select unlabeled instructions more effectively. Specifically, SelectLLM consists of two key steps: Coreset-based clustering of unlabelled instructions for enlarging diversity and prompting of LLM to identify the most beneficial instructions within each cluster. We evaluate SelectLLM on AlpacaEval2 and MT-Bench, demonstrating its ability to outperform state-of-the-art methods like Alpagasus. In addition, we compare the performance and compatibility of SelectLLM with various LLMs, such as ChatGPT, LLaMA-3.1-70B, and Gemma-2-27b. SelectLLM's adaptability and robustness are further evidenced by its ability to maintain high performance across both human and synthetic datasets. All code and data are publicly available (https://github.com/minnesotanlp/select-llm).

Submitted 27 August, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

Comments: First Authors: Ritik Sachin Parkar and Jaehyung Kim | Second Author: Jong Inn Park | PI: Dongyeop Kang

arXiv:2401.08655

SAiD: Speech-driven Blendshape Facial Animation with Diffusion

Authors: Inkyu Park, Jaewoong Cho

Abstract: Speech-driven 3D facial animation is challenging due to the scarcity of large-scale visual-audio datasets despite extensive research. Most prior works, typically focused on learning regression models on a small dataset using the method of least squares, encounter difficulties generating diverse lip movements from speech and require substantial effort in refining the generated outputs. To address these issues, we propose a speech-driven 3D facial animation with a diffusion model (SAiD), a lightweight Transformer-based U-Net with a cross-modality alignment bias between audio and visual to enhance lip synchronization. Moreover, we introduce BlendVOCA, a benchmark dataset of pairs of speech audio and parameters of a blendshape facial model, to address the scarcity of public resources. Our experimental results demonstrate that the proposed approach achieves comparable or superior performance in lip synchronization to baselines, ensures more diverse lip movements, and streamlines the animation editing process.

Submitted 24 January, 2024; v1 submitted 24 December, 2023; originally announced January 2024.

Comments: Fix bug related to the font size

arXiv:2311.17919

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Authors: Daniel Geng, Inbum Park, Andrew Owens

Abstract: We address the problem of synthesizing multi-view optical illusions: images that change appearance upon a transformation, such as a flip or rotation. We propose a simple, zero-shot method for obtaining these illusions from off-the-shelf text-to-image diffusion models. During the reverse diffusion process, we estimate the noise from different views of a noisy image, and then combine these noise estimates together and denoise the image. A theoretical analysis suggests that this method works precisely for views that can be written as orthogonal transformations, of which permutations are a subset. This leads to the idea of a visual anagram--an image that changes appearance under some rearrangement of pixels. This includes rotations and flips, but also more exotic pixel permutations such as a jigsaw rearrangement. Our approach also naturally extends to illusions with more than two views. We provide both qualitative and quantitative results demonstrating the effectiveness and flexibility of our method. Please see our project webpage for additional visualizations and results: https://dangeng.github.io/visual_anagrams/

Submitted 2 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: CVPR 2024 camera ready

arXiv:2309.17012

Benchmarking Cognitive Biases in Large Language Models as Evaluators

Authors: Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang

Abstract: Large Language Models (LLMs) have recently been shown to be effective as automatic evaluators with simple prompting and in-context learning. In this work, we assemble 15 LLMs of four different size ranges and evaluate their output responses by preference ranking from the other LLMs as evaluators, such as System Star is better than System Square. We then evaluate the quality of ranking outputs introducing the Cognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr), a benchmark to measure six different cognitive biases in LLM evaluation outputs, such as the Egocentric bias where a model prefers to rank its own outputs highly in evaluation. We find that LLMs are biased text quality evaluators, exhibiting strong indications on our bias benchmark (average of 40% of comparisons across all models) within each of their evaluations that question their robustness as evaluators. Furthermore, we examine the correlation between human and machine preferences and calculate the average Rank-Biased Overlap (RBO) score to be 49.6%, indicating that machine preferences are misaligned with humans. According to our findings, LLMs may still be unable to be utilized for automatic annotation aligned with human preferences. Our project page is at: https://minnesotanlp.github.io/cobbler.

Submitted 12 August, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

Comments: Published in 2024. 29 pages, 9 figures, 14 tables

ACM Class: I.2.7

arXiv:2308.12585

Persistent learning signals and working memory without continuous attractors

Authors: Il Memming Park, Ábel Ságodi, Piotr Aleksander Sokół

Abstract: Neural dynamical systems with stable attractor structures, such as point attractors and continuous attractors, are hypothesized to underlie meaningful temporal behavior that requires working memory. However, working memory may not support useful learning signals necessary to adapt to changes in the temporal structure of the environment. We show that in addition to the continuous attractors that are widely implicated, periodic and quasi-periodic attractors can also support learning arbitrarily long temporal relationships. Unlike the continuous attractors that suffer from the fine-tuning problem, the less explored quasi-periodic attractors are uniquely qualified for learning to produce temporally structured behavior. Our theory has broad implications for the design of artificial learning systems and makes predictions about observable signatures of biological neural dynamics that can support temporal dependence learning and working memory. Based on our theory, we developed a new initialization scheme for artificial recurrent neural networks that outperforms standard methods for tasks that require learning temporal dynamics. Moreover, we propose a robust recurrent memory mechanism for integrating and maintaining head direction without a ring attractor.

Submitted 24 August, 2023; originally announced August 2023.

arXiv:2308.05542

Robust Asymmetric Loss for Multi-Label Long-Tailed Learning

Authors: Wongi Park, Inhyuk Park, Sungeun Kim, Jongbin Ryu

Abstract: In real medical data, training samples typically show long-tailed distributions with multiple labels. Class distribution of the medical data has a long-tailed shape, in which the incidence of different diseases is quite varied, and at the same time, it is not unusual for images taken from symptomatic patients to be multi-label diseases. Therefore, in this paper, we concurrently address these two issues by putting forth a robust asymmetric loss on the polynomial function. Since our loss tackles both long-tailed and multi-label classification problems simultaneously, it leads to a complex design of the loss function with a large number of hyper-parameters. Although a model can be highly fine-tuned due to a large number of hyper-parameters, it is difficult to optimize all hyper-parameters at the same time, and there might be a risk of overfitting a model. Therefore, we regularize the loss function using the Hill loss approach, which is beneficial to be less sensitive against the numerous hyper-parameters so that it reduces the risk of overfitting the model. For this reason, the proposed loss is a generic method that can be applied to most medical image classification tasks and does not make the training process more time-consuming. We demonstrate that the proposed robust asymmetric loss performs favorably against the long-tailed with multi-label medical image classification in addition to the various long-tailed single-label datasets. Notably, our method achieves Top-5 results on the CXR-LT dataset of the ICCV CVAMD 2023 competition. We open-source our implementation of the robust asymmetric loss in the public repository: https://github.com/kalelpark/RAL.

Submitted 10 August, 2023; originally announced August 2023.

Journal ref: ICCVW 2023

arXiv:2306.13776

Swin-Free: Achieving Better Cross-Window Attention and Efficiency with Size-varying Window

Authors: Jinkyu Koo, John Yang, Le An, Gwenaelle Cunha Sergio, Su Inn Park

Abstract: Transformer models have shown great potential in computer vision, following their success in language tasks. Swin Transformer is one of them that outperforms convolution-based architectures in terms of accuracy, while improving efficiency when compared to Vision Transformer (ViT) and its variants, which have quadratic complexity with respect to the input size. Swin Transformer features shifting windows that allow cross-window connection while limiting self-attention computation to non-overlapping local windows. However, shifting windows introduces memory copy operations, which account for a significant portion of its runtime. To mitigate this issue, we propose Swin-Free in which we apply size-varying windows across stages, instead of shifting windows, to achieve cross-connection among local windows. With this simple design change, Swin-Free runs faster than the Swin Transformer at inference with better accuracy. Furthermore, we also propose a few Swin-Free variants that are faster than their Swin Transformer counterparts.

Submitted 23 June, 2023; originally announced June 2023.

Comments: 8 pages, 3 figures

arXiv:2306.01802

Linear Time GPs for Inferring Latent Trajectories from Neural Spike Trains

Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

Abstract: Latent Gaussian process (GP) models are widely used in neuroscience to uncover hidden state evolutions from sequential observations, mainly in neural activity recordings. While latent GP models provide a principled and powerful solution in theory, the intractable posterior in non-conjugate settings necessitates approximate inference schemes, which may lack scalability. In this work, we propose cvHM, a general inference framework for latent GP models leveraging Hida-Matérn kernels and conjugate computation variational inference (CVI). With cvHM, we are able to perform variational inference of latent neural trajectories with linear time complexity for arbitrary likelihoods. The reparameterization of stationary kernels using Hida-Matérn GPs helps us connect the latent variable models that encode prior assumptions through dynamical systems to those that encode trajectory assumptions through GPs. In contrast to previous work, we use bidirectional information filtering, leading to a more concise implementation. Furthermore, we employ the Whittle approximate likelihood to achieve highly efficient hyperparameter learning.

Submitted 1 June, 2023; originally announced June 2023.

Comments: Published at ICML 2023

arXiv:2305.11278

Real-Time Variational Method for Learning Neural Trajectory and its Dynamics

Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

Abstract: Latent variable models have become instrumental in computational neuroscience for reasoning about neural computation. This has fostered the development of powerful offline algorithms for extracting latent neural trajectories from neural recordings. However, despite the potential of real time alternatives to give immediate feedback to experimentalists, and enhance experimental design, they have received markedly less attention. In this work, we introduce the exponential family variational Kalman filter (eVKF), an online recursive Bayesian method aimed at inferring latent trajectories while simultaneously learning the dynamical system generating them. eVKF works for arbitrary likelihoods and utilizes the constant base measure exponential family to model the latent state stochasticity. We derive a closed-form variational analogue to the predict step of the Kalman filter which leads to a provably tighter bound on the ELBO compared to another online variational method. We validate our method on synthetic and real-world data, and, notably, show that it achieves competitive performance

Submitted 18 May, 2023; originally announced May 2023.

Comments: Published at ICLR 2023

arXiv:2305.04468

AnomalyBERT: Self-Supervised Transformer for Time Series Anomaly Detection using Data Degradation Scheme

Authors: Yungi Jeong, Eunseok Yang, Jung Hyun Ryu, Imseong Park, Myungjoo Kang

Abstract: Mechanical defects in real situations affect observation values and cause abnormalities in multivariate time series, such as sensor values or network data. To perceive abnormalities in such data, it is crucial to understand the temporal context and interrelation between variables simultaneously. The anomaly detection task for time series, especially for unlabeled data, has been a challenging problem, and we address it by applying a suitable data degradation scheme to self-supervised model training. We define four types of synthetic outliers and propose the degradation scheme in which a portion of input data is replaced with one of the synthetic outliers. Inspired by the self-attention mechanism, we design a Transformer-based architecture to recognize the temporal context and detect unnatural sequences with high efficiency. Our model converts multivariate data points into temporal representations with relative position bias and yields anomaly scores from these representations. Our method, AnomalyBERT, shows a great capability of detecting anomalies contained in complex time series and surpasses previous state-of-the-art methods on five real-world benchmarks. Our code is available at https://github.com/Jhryu30/AnomalyBERT.

Submitted 8 May, 2023; originally announced May 2023.

Comments: 11 pages, Presented at ICLR 2023 workshop on Machine Learning for IoT

arXiv:2303.02060 [pdf, other]

Spectral learning of Bernoulli linear dynamical systems models

Authors: Iris R. Stone, Yotam Sagiv, Il Memming Park, Jonathan W. Pillow

Abstract: Latent linear dynamical systems with Bernoulli observations provide a powerful modeling framework for identifying the temporal dynamics underlying binary time series data, which arise in a variety of contexts such as binary decision-making and discrete stochastic processes (e.g., binned neural spike trains). Here we develop a spectral learning method for fast, efficient fitting of probit-Bernoulli latent linear dynamical system (LDS) models. Our approach extends traditional subspace identification methods to the Bernoulli setting via a transformation of the first and second sample moments. This results in a robust, fixed-cost estimator that avoids the hazards of local optima and the long computation time of iterative fitting procedures like the expectation-maximization (EM) algorithm. In regimes where data is limited or assumptions about the statistical structure of the data are not met, we demonstrate that the spectral estimate provides a good initialization for Laplace-EM fitting. Finally, we show that the estimator provides substantial benefits to real world settings by analyzing data from mice performing a sensory decision-making task.

Submitted 26 July, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

Comments: Published in Transactions on Machine Learning Research (https://jmlr.org/tmlr/papers/)

Journal ref: Transactions on Machine Learning Research (2023)

arXiv:2212.04319 [pdf, other]

On the Robustness of Normalizing Flows for Inverse Problems in Imaging

Authors: Seongmin Hong, Inbum Park, Se Young Chun

Abstract: Conditional normalizing flows can generate diverse image samples for solving inverse problems. Most normalizing flows for inverse problems in imaging employ the conditional affine coupling layer that can generate diverse images quickly. However, unintended severe artifacts are occasionally observed in their output. In this work, we address this critical issue by investigating the origins of these artifacts and proposing the conditions to avoid them. First of all, we empirically and theoretically reveal that these problems are caused by "exploding inverse" in the conditional affine coupling layer for certain out-of-distribution (OOD) conditional inputs. Then, we further validated that the probability of causing erroneous artifacts in pixels is highly correlated with a Mahalanobis distance-based OOD score for inverse problems in imaging. Lastly, based on our investigations, we propose a remark to avoid exploding inverse and then based on it, we suggest a simple remedy that substitutes the affine coupling layers with the modified rational quadratic spline coupling layers in normalizing flows, to encourage the robustness of generated image samples. Our experimental results demonstrated that our suggested methods effectively suppressed critical artifacts occurring in normalizing flows for super-resolution space generation and low-light image enhancement.

Submitted 16 March, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

Comments: 16 pages

arXiv:2211.07077 [pdf, other]

IFQA: Interpretable Face Quality Assessment

Authors: Byungho Jo, Donghyeon Cho, In Kyu Park, Sungeun Hong

Abstract: Existing face restoration models have relied on general assessment metrics that do not consider the characteristics of facial regions. Recent works have therefore assessed their methods using human studies, which is not scalable and involves significant effort. This paper proposes a novel face-centric metric based on an adversarial framework where a generator simulates face restoration and a discriminator assesses image quality. Specifically, our per-pixel discriminator enables interpretable evaluation that cannot be provided by traditional metrics. Moreover, our metric emphasizes facial primary regions considering that even minor changes to the eyes, nose, and mouth significantly affect human cognition. Our face-oriented metric consistently surpasses existing general or facial image quality assessment metrics by impressive margins. We demonstrate the generalizability of the proposed strategy in various architectural designs and challenging scenarios. Interestingly, we find that our IFQA can lead to performance improvement as an objective function.

Submitted 16 November, 2022; v1 submitted 13 November, 2022; originally announced November 2022.

Comments: WACV 2023, Code: https://github.com/VCLLab/IFQA

arXiv:2208.08005 [pdf, other]

Transformer Encoder for Social Science

Authors: Haosen Ge, In Young Park, Xuancheng Qian, Grace Zeng

Abstract: High-quality text data has become an important data source for social scientists. We have witnessed the success of pretrained deep neural network models, such as BERT and RoBERTa, in recent social science research. In this paper, we propose a compact pretrained deep neural network, Transformer Encoder for Social Science (TESS), explicitly designed to tackle text processing tasks in social science research. Using two validation tests, we demonstrate that TESS outperforms BERT and RoBERTa by 16.7% on average when the number of training samples is limited (<1,000 training instances). The results display the superiority of TESS over BERT and RoBERTa on social science text processing tasks. Lastly, we discuss the limitation of our model and present advice for future researchers.

Submitted 16 August, 2022; originally announced August 2022.

arXiv:2204.13791 [pdf, other]

Depth Estimation with Simplified Transformer

Authors: John Yang, Le An, Anurag Dixit, Jinkyu Koo, Su Inn Park

Abstract: Transformer and its variants have shown state-of-the-art results in many vision tasks recently, ranging from image classification to dense prediction. Despite their success, limited work has been reported on improving the model efficiency for deployment in latency-critical applications, such as autonomous driving and robotic navigation. In this paper, we aim at improving upon the existing transformers in vision, and propose a method for self-supervised monocular Depth Estimation with Simplified Transformer (DEST), which is efficient and particularly suitable for deployment on GPU-based platforms. Through strategic design choices, our model leads to significant reduction in model size, complexity, as well as inference latency, while achieving superior accuracy as compared to state-of-the-art. We also show that our design generalizes well to other dense prediction tasks without bells and whistles.

Submitted 27 May, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

Comments: Accepted for the CVPR 2022 Transformers For Vision (T4V) workshop

arXiv:2204.01264 [pdf, other]

Probabilistic Implicit Scene Completion

Authors: Dongsu Zhang, Changwoon Choi, Inbum Park, Young Min Kim

Abstract: We propose a probabilistic shape completion method extended to the continuous geometry of large-scale 3D scenes. Real-world scans of 3D scenes suffer from a considerable amount of missing data cluttered with unsegmented objects. The problem of shape completion is inherently ill-posed, and high-quality results require scalable solutions that consider multiple possible outcomes. We employ the Generative Cellular Automata that learns the multi-modal distribution and transform the formulation to process large-scale continuous geometry. The local continuous shape is incrementally generated as a sparse voxel embedding, which contains the latent code for each occupied cell. We formally derive that our training objective for the sparse voxel embedding maximizes the variational lower bound of the complete shape distribution and therefore our progressive generation constitutes a valid generative model. Experiments show that our model successfully generates diverse plausible scenes faithful to the input, especially when the input suffers from a significant amount of missing data. We also demonstrate that our approach outperforms deterministic models even in less ambiguous cases with a small amount of missing data, which implies that probabilistic formulation is crucial for high-quality geometry completion on input scans exhibiting any level of completeness.

Submitted 4 April, 2022; originally announced April 2022.

Comments: Accepted to ICLR 2022 as spotlight, code available at https://github.com/96lives/gca

arXiv:2111.12911 [pdf, other]

doi 10.1109/ACCESS.2021.3123059

Human and Scene Motion Deblurring using Pseudo-blur Synthesizer

Authors: Jonathan Samuel Lumentut, In Kyu Park

Abstract: Present-day deep learning-based motion deblurring methods utilize the pair of synthetic blur and sharp data to regress any particular framework. This task is designed for directly translating a blurry image input into its restored version as output. The aforementioned approach relies heavily on the quality of the synthetic blurry data, which are only available before the training stage. Handling this issue by providing a large amount of data is expensive for common usage. We answer this challenge by providing an on-the-fly blurry data augmenter that can be run during training and test stages. To fully utilize it, we incorporate an unorthodox scheme of deblurring framework that employs the sequence of blur-deblur-reblur-deblur steps. The reblur step is assisted by a reblurring module (synthesizer) that provides the reblurred version (pseudo-blur) of its sharp or deblurred counterpart. The proposed module is also equipped with hand-crafted prior extracted using the state-of-the-art human body statistical model. This prior is employed to map human and non-human regions during adversarial learning to fully perceive the characteristics of human-articulated and scene motion blurs. By engaging this approach, our deblurring module becomes adaptive and achieves superior outcomes compared to recent state-of-the-art deblurring algorithms.

Submitted 24 November, 2021; originally announced November 2021.

arXiv:2109.04463 [pdf, other]

Neural Latents Benchmark '21: Evaluating latent variable models of neural population activity

Authors: Felix Pei, Joel Ye, David Zoltowski, Anqi Wu, Raeed H. Chowdhury, Hansem Sohn, Joseph E. O'Doherty, Krishna V. Shenoy, Matthew T. Kaufman, Mark Churchland, Mehrdad Jazayeri, Lee E. Miller, Jonathan Pillow, Il Memming Park, Eva L. Dyer, Chethan Pandarinath

Abstract: Advances in neural recording present increasing opportunities to study neural activity in unprecedented detail. Latent variable models (LVMs) are promising tools for analyzing this rich activity across diverse neural systems and behaviors, as LVMs do not depend on known relationships between the activity and external experimental variables. However, progress with LVMs for neuronal population activity is currently impeded by a lack of standardization, resulting in methods being developed and compared in an ad hoc manner. To coordinate these modeling efforts, we introduce a benchmark suite for latent variable modeling of neural population activity. We curate four datasets of neural spiking activity from cognitive, sensory, and motor areas to promote models that apply to the wide variety of activity seen across these areas. We identify unsupervised evaluation as a common framework for evaluating models across datasets, and apply several baselines that demonstrate benchmark diversity. We release this benchmark through EvalAI. http://neurallatents.github.io

Submitted 17 January, 2022; v1 submitted 9 September, 2021; originally announced September 2021.

arXiv:2107.07098 [pdf, other]

Hida-Matérn Kernel

Authors: Matthew Dowling, Piotr Sokół, Il Memming Park

Abstract: We present the class of Hida-Matérn kernels, which is the canonical family of covariance functions over the entire space of stationary Gauss-Markov Processes. It extends upon Matérn kernels, by allowing for flexible construction of priors over processes with oscillatory components. Any stationary kernel, including the widely used squared-exponential and spectral mixture kernels, is either directly within this class or an appropriate asymptotic limit, demonstrating the generality of this class. Taking advantage of its Markovian nature we show how to represent such processes as state space models using only the kernel and its derivatives. In turn this allows us to perform Gaussian Process inference more efficiently and sidestep the usual computational burdens. We also show how exploiting special properties of the state space representation enables improved numerical stability in addition to further reductions of computational complexity.

Submitted 27 December, 2021; v1 submitted 14 July, 2021; originally announced July 2021.

arXiv:2106.03806 [pdf, other]

doi 10.18653/v1/2021.acl-short.63

Deep Context- and Relation-Aware Learning for Aspect-based Sentiment Analysis

Authors: Shinhyeok Oh, Dongyub Lee, Taesun Whang, IlNam Park, Gaeun Seo, EungGyun Kim, Harksoo Kim

Abstract: Existing works for aspect-based sentiment analysis (ABSA) have adopted a unified approach, which allows the interactive relations among subtasks. However, we observe that these methods tend to predict polarities based on the literal meaning of aspect and opinion terms and mainly consider relations implicitly among subtasks at the word level. In addition, identifying multiple aspect-opinion pairs with their polarities is much more challenging. Therefore, a comprehensive understanding of contextual information w.r.t. the aspect and opinion is further required in ABSA. In this paper, we propose Deep Contextualized Relation-Aware Network (DCRAN), which allows interactive relations among subtasks with deep contextual information based on two modules (i.e., Aspect and Opinion Propagation and Explicit Self-Supervised Strategies). Especially, we design novel self-supervised strategies for ABSA, which have strengths in dealing with multiple aspects. Experimental results show that DCRAN significantly outperforms previous state-of-the-art methods by large margins on three widely used benchmarks.

Submitted 7 June, 2021; originally announced June 2021.

Comments: Accepted to ACL-IJCNLP 2021

arXiv:2103.16851 [pdf, other]

Attention Map-guided Two-stage Anomaly Detection using Hard Augmentation

Authors: Jou Won Song, Kyeongbo Kong, Ye In Park, Suk-Ju Kang

Abstract: Anomaly detection is a task that recognizes whether an input sample is included in the distribution of a target normal class or an anomaly class. Conventional generative adversarial network (GAN)-based methods utilize an entire image including foreground and background as an input. However, in these methods, a useless region unrelated to the normal class (e.g., unrelated background) is learned as normal class distribution, thereby leading to false detection. To alleviate this problem, this paper proposes a novel two-stage network consisting of an attention network and an anomaly detection GAN (ADGAN). The attention network generates an attention map that can indicate the region representing the normal class distribution. To generate an accurate attention map, we propose the attention loss and the adversarial anomaly loss based on synthetic anomaly samples generated from hard augmentation. By applying the attention map to an image feature map, ADGAN learns the normal class distribution from which the useless region is removed, and it is possible to greatly reduce the problem difficulty of the anomaly detection task. Additionally, the estimated attention map can be used for anomaly segmentation because it can distinguish between normal and anomaly regions. As a result, the proposed method outperforms the state-of-the-art anomaly detection and anomaly segmentation methods for widely used datasets.

Submitted 31 March, 2021; originally announced March 2021.

arXiv:2102.11517 [pdf, other]

SliceNStitch: Continuous CP Decomposition of Sparse Tensor Streams

Authors: Taehyung Kwon, Inkyu Park, Dongjin Lee, Kijung Shin

Abstract: Consider traffic data (i.e., triplets in the form of source-destination-timestamp) that grow over time. Tensors (i.e., multi-dimensional arrays) with a time mode are widely used for modeling and analyzing such multi-aspect data streams. In such tensors, however, new entries are added only once per period, which is often an hour, a day, or even a year. This discreteness of tensors has limited their usage for real-time applications, where new data should be analyzed instantly as it arrives. How can we analyze time-evolving multi-aspect sparse data 'continuously' using tensors where time is 'discrete'? We propose SLICENSTITCH for continuous CANDECOMP/PARAFAC (CP) decomposition, which has numerous time-critical applications, including anomaly detection, recommender systems, and stock market prediction. SLICENSTITCH changes the starting point of each period adaptively, based on the current time, and updates factor matrices (i.e., outputs of CP decomposition) instantly as new data arrives. We show, theoretically and experimentally, that SLICENSTITCH is (1) 'Any time': updating factor matrices immediately without having to wait until the current time period ends, (2) Fast: with constant-time updates up to 464x faster than online methods, and (3) Accurate: with fitness comparable (specifically, 72 ~ 100%) to offline methods.

Submitted 2 March, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

Comments: Updated Figures 4, 5, 6, 7, and 8 after fixing a bug in preprocessing the Divvy dataset. To appear at the 37th IEEE International Conference on Data Engineering (ICDE '21)

ACM Class: H.2.8

arXiv:2012.04729 [pdf, other]

On 1/n neural representation and robustness

Authors: Josue Nassar, Piotr Aleksander Sokol, SueYeon Chung, Kenneth D. Harris, Il Memming Park

Abstract: Understanding the nature of representation in neural networks is a goal shared by neuroscience and machine learning. It is therefore exciting that both fields converge not only on shared questions but also on similar approaches. A pressing question in these areas is understanding how the structure of the representation used by neural networks affects both their generalization, and robustness to perturbations. In this work, we investigate the latter by juxtaposing experimental results regarding the covariance spectrum of neural representations in the mouse V1 (Stringer et al) with artificial neural networks. We use adversarial robustness to probe Stringer et al's theory regarding the causal role of a 1/n covariance spectrum. We empirically investigate the benefits such a neural code confers in neural networks, and illuminate its role in multi-layer architectures. Our results show that imposing the experimentally observed structure on artificial neural networks makes them more robust to adversarial attacks. Moreover, our findings complement the existing theory relating wide neural networks to kernel methods, by showing the role of intermediate representations.

Submitted 8 December, 2020; originally announced December 2020.

arXiv:2010.12362 [pdf, other]

Rescuing neural spike train models from bad MLE

Authors: Diego M. Arribas, Yuan Zhao, Il Memming Park

Abstract: The standard approach to fitting an autoregressive spike train model is to maximize the likelihood for one-step prediction. This maximum likelihood estimation (MLE) often leads to models that perform poorly when generating samples recursively for more than one time step. Moreover, the generated spike trains can fail to capture important features of the data and even show diverging firing rates. To alleviate this, we propose to directly minimize the divergence between neural recorded and model generated spike trains using spike train kernels. We develop a method that stochastically optimizes the maximum mean discrepancy induced by the kernel. Experiments performed on both real and synthetic neural data validate the proposed approach, showing that it leads to well-behaving models. Using different combinations of spike train kernels, we show that we can control the trade-off between different features which is critical for dealing with model-mismatch.

Submitted 23 October, 2020; originally announced October 2020.

Comments: To appear in Advances in Neural Information Processing 2020

arXiv:2009.01362 [pdf, other]

Non-parametric generalized linear model

Authors: Matthew Dowling, Yuan Zhao, Il Memming Park

Abstract: A fundamental problem in statistical neuroscience is to model how neurons encode information by analyzing electrophysiological recordings. A popular and widely-used approach is to fit the spike trains with an autoregressive point process model. These models are characterized by a set of convolutional temporal filters, whose subsequent analysis can help reveal how neurons encode stimuli, interact with each other, and process information. In practice a sufficiently rich but small ensemble of temporal basis functions needs to be chosen to parameterize the filters. However, obtaining a satisfactory fit often requires burdensome model selection and fine tuning the form of the basis functions and their temporal span. In this paper we propose a nonparametric approach for jointly inferring the filters and hyperparameters using the Gaussian process framework. Our method is computationally efficient taking advantage of the sparse variational approximation while being flexible and rich enough to characterize arbitrary filters in continuous time lag. Moreover, our method automatically learns the temporal span of the filter. For the particular application in neuroscience, we designed priors for stimulus and history filters useful for the spike trains. We compare and validate our method on simulated and real neural spike train data.

Submitted 2 September, 2020; originally announced September 2020.

arXiv:2004.05744 [pdf, other]

doi 10.1109/ACCESS.2021.3085864

Integrated Eojeol Embedding for Erroneous Sentence Classification in Korean Chatbots

Authors: DongHyun Choi, IlNam Park, Myeong Cheol Shin, EungGyun Kim, Dong Ryeol Shin

Abstract: This paper attempts to analyze the Korean sentence classification system for a chatbot. Sentence classification is the task of classifying an input sentence based on predefined categories. However, spelling or spacing errors contained in the input sentence cause problems in morphological analysis and tokenization. This paper proposes a novel approach of Integrated Eojeol (Korean syntactic word separated by space) Embedding to reduce the effect that poorly analyzed morphemes may have on sentence classification. It also proposes two noise insertion methods that further improve classification performance. Our evaluation results indicate that the proposed system classifies erroneous sentences more accurately than the baseline system by 17%p.

Submitted 12 April, 2020; originally announced April 2020.

Comments: 9 pages, 2 figures

Journal ref: IEEE Access, 2021

arXiv:1912.10687 [pdf, other]

5D Light Field Synthesis from a Monocular Video

Authors: Kyuho Bae, Andre Ivan, Hajime Nagahara, In Kyu Park

Abstract: Commercially available light field cameras have difficulty in capturing 5D (4D + time) light field videos. They can only capture still light field images or are excessively expensive for normal users to capture the light field video. To tackle this problem, we propose a deep learning-based method for synthesizing a light field video from a monocular video. We propose a new synthetic light field video dataset that renders photorealistic scenes using UnrealCV rendering engine because no light field dataset is available. The proposed deep learning framework synthesizes the light field video with a full set (9$\times$9) of sub-aperture images from a normal monocular video. The proposed network consists of three sub-networks, namely, feature extraction, 5D light field video synthesis, and temporal consistency refinement. Experimental results show that our model can successfully synthesize the light field video for synthetic and actual scenes and outperforms the previous frame-by-frame methods quantitatively and qualitatively. The synthesized light field can be used for conventional light field applications, namely, depth estimation, viewpoint change, and refocusing.

Submitted 23 December, 2019; originally announced December 2019.

arXiv:1912.10427 [pdf, other]

Joint Face Super-Resolution and Deblurring Using a Generative Adversarial Network

Authors: Jung Un Yun, In Kyu Park

Abstract: Facial image super-resolution (SR) is an important preprocessing step for facial image analysis, face recognition, and image-based 3D face reconstruction. Recent convolutional neural network (CNN)-based methods have shown excellent performance by learning the mapping relation using pairs of low-resolution (LR) and high-resolution (HR) facial images. However, since the HR facial image reconstruction using CNN is conventionally aimed to increase the PSNR and SSIM metrics, the reconstructed HR image might not be realistic even with high scores. An adversarial framework is proposed in this study to reconstruct the HR facial image by simultaneously generating an HR image with and without blur. First, the spatial resolution of the LR facial image is increased by eight times using a five-layer CNN. Then, the encoder extracts the features of the up-scaled image. These features are finally sent to two branches (decoders) to generate an HR facial image with and without blur. In addition, local and global discriminators are combined to focus on the reconstruction of HR facial structures. Experiment results show that the proposed algorithm generates a realistic HR facial image. Furthermore, the proposed method can generate a variety of different facial images.

Submitted 22 December, 2019; originally announced December 2019.

arXiv:1911.11619 [pdf, other]

doi 10.1109/ACCESS.2020.3002921

Joint Spatial and Angular Super-Resolution from a Single Image

Authors: Andre Ivan, Williem, In Kyu Park

Abstract: Synthesizing a densely sampled light field from a single image is highly beneficial for many applications. Moreover, jointly solving both angular and spatial super-resolution problem also introduces new possibilities in light field imaging. The conventional method relies on physical-based rendering and a secondary network to solve the angular super-resolution problem. In addition, pixel-based loss limits the network capability to infer scene geometry globally. In this paper, we show that both super-resolution problems can be solved jointly from a single image by proposing a single end-to-end deep neural network that does not require a physical-based approach. Two novel loss functions based on known light field domain knowledge are proposed to enable the network to preserve the spatio-angular consistency between sub-aperture images. Experimental results show that the proposed model successfully synthesizes dense high resolution light field and it outperforms the state-of-the-art method in both quantitative and qualitative criteria. The method can be generalized to arbitrary scenes, rather than focusing on a particular subject. The synthesized light field can be used for various applications, such as depth estimation and refocusing.

Submitted 27 June, 2020; v1 submitted 22 November, 2019; originally announced November 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1903.12364

Journal ref: IEEE Access, vol. 8, June 2020 page(s): 112562-112573

arXiv:1906.01549 [pdf, other]

doi 10.1109/TPAMI.2022.3153225

Streaming Variational Monte Carlo

Authors: Yuan Zhao, Josue Nassar, Ian Jordan, Mónica Bugallo, Il Memming Park

Abstract: Nonlinear state-space models are powerful tools to describe dynamical structures in complex time series. In a streaming setting where data are processed one sample at a time, simultaneous inference of the state and its nonlinear dynamics has posed significant challenges in practice. We develop a novel online learning framework, leveraging variational inference and sequential Monte Carlo, which enables flexible and accurate Bayesian joint filtering. Our method provides an approximation of the filtering posterior which can be made arbitrarily close to the true filtering distribution for a wide class of dynamics models and observation models. Specifically, the proposed framework can efficiently approximate a posterior over the dynamics using sparse Gaussian processes, allowing for an interpretable model of the latent dynamics. Constant time complexity per sample makes our approach amenable to online learning scenarios and suitable for real-time applications.

Submitted 8 November, 2021; v1 submitted 4 June, 2019; originally announced June 2019.

arXiv:1906.01005 [pdf, other]

doi 10.3389/fncom.2021.678158

Gated recurrent units viewed through the lens of continuous time dynamical systems

Authors: Ian D. Jordan, Piotr Aleksander Sokol, Il Memming Park

Abstract: Gated recurrent units (GRUs) are specialized memory elements for building recurrent neural networks. Despite their incredible success on various tasks, including extracting dynamics underlying neural data, little is understood about the specific dynamics representable in a GRU network. As a result, it is both difficult to know a priori how successful a GRU network will perform on a given task, and also their capacity to mimic the underlying behavior of their biological counterparts. Using a continuous time analysis, we gain intuition on the inner workings of GRU networks. We restrict our presentation to low dimensions, allowing for a comprehensive visualization. We found a surprisingly rich repertoire of dynamical features that includes stable limit cycles (nonlinear oscillations), multi-stable dynamics with various topologies, and homoclinic bifurcations. At the same time we were unable to train GRU networks to produce continuous attractors, which are hypothesized to exist in biological neural networks. We contextualize the usefulness of different kinds of observed dynamics and support our claims experimentally.

Submitted 28 July, 2021; v1 submitted 3 June, 2019; originally announced June 2019.

Journal ref: Frontiers in Computational Neuroscience, 2021

arXiv:1904.06109 [pdf, other]

Face De-occlusion using 3D Morphable Model and Generative Adversarial Network

Authors: Xiaowei Yuan, In Kyu Park

Abstract: In recent decades, 3D morphable model (3DMM) has been commonly used in image-based photorealistic 3D face reconstruction. However, face images are often corrupted by serious occlusion by non-face objects including eyeglasses, masks, and hands. Such objects block the correct capture of landmarks and shading information. Therefore, the reconstructed 3D face model is hardly reusable. In this paper, a novel method is proposed to restore de-occluded face images based on inverse use of 3DMM and generative adversarial network. We utilize the 3DMM prior to the proposed adversarial network and combine a global and local adversarial convolutional neural network to learn face de-occlusion model. The 3DMM serves not only as geometric prior but also proposes the face region for the local discriminator. Experiment results confirm the effectiveness and robustness of the proposed algorithm in removing challenging types of occlusions with various head poses and illumination. Furthermore, the proposed method reconstructs the correct 3D face model with de-occluded textures.

Submitted 6 September, 2019; v1 submitted 12 April, 2019; originally announced April 2019.

Comments: Presented in ICCV 2019

arXiv:1904.03326 [pdf, other]

360 Panorama Synthesis from a Sparse Set of Images with Unknown Field of View

Authors: Julius Surya Sumantri, In Kyu Park

Abstract: 360 images represent scenes captured in all possible viewing directions and enable viewers to navigate freely around the scene thereby providing an immersive experience. Conversely, conventional images represent scenes in a single viewing direction with a small or limited field of view (FOV). As a result, only certain parts of the scenes are observed, and valuable information about the surroundings is lost. In this paper, a learning-based approach that reconstructs the scene in 360 x 180 from a sparse set of conventional images (typically 4 images) is proposed. The proposed approach first estimates the FOV of input images relative to the panorama. The estimated FOV is then used as the prior for synthesizing a high-resolution 360 panoramic output. The proposed method overcomes the difficulty of learning-based approach in synthesizing high resolution images (up to 512$\times$1024). Experimental results demonstrate that the proposed method produces 360 panorama with reasonable quality. Results also show that the proposed method outperforms the alternative method and can be generalized for non-panoramic scenes and images captured by a smartphone camera.

Submitted 22 December, 2019; v1 submitted 5 April, 2019; originally announced April 2019.

Comments: Presented in WACV 2020

arXiv:1904.00352 [pdf, other]

doi 10.1109/LSP.2019.2947379

Fast and Full-Resolution Light Field Deblurring using a Deep Neural Network

Authors: Jonathan Samuel Lumentut, Tae Hyun Kim, Ravi Ramamoorthi, In Kyu Park

Abstract: Restoring a sharp light field image from its blurry input has become essential due to the increasing popularity of parallax-based image processing. State-of-the-art blind light field deblurring methods suffer from several issues such as slow processing, reduced spatial size, and a limited motion blur model. In this work, we address these challenging problems by generating a complex blurry light field dataset and proposing a learning-based deblurring approach. In particular, we model the full 6-degree of freedom (6-DOF) light field camera motion, which is used to create the blurry dataset using a combination of real light fields captured with a Lytro Illum camera, and synthetic light field renderings of 3D scenes. Furthermore, we propose a light field deblurring network that is built with the capability of large receptive fields. We also introduce a simple strategy of angular sampling to train on the large-scale blurry light field effectively. We evaluate our method through both quantitative and qualitative measurements and demonstrate superior performance compared to the state-of-the-art method with a massive speedup in execution time. Our method is about 16K times faster than Srinivasan et al. [22] and can deblur a full-resolution light field in less than 2 seconds.

Submitted 31 March, 2019; originally announced April 2019.

Comments: 9 pages, 8 figures

Journal ref: IEEE Signal Processing Letters, vol. 26, no. 12, pp. 1788-1792, December 2019

arXiv:1903.12364 [pdf, other]

Synthesizing a 4D Spatio-Angular Consistent Light Field from a Single Image

Authors: Andre Ivan, Williem, In Kyu Park

Abstract: Synthesizing a densely sampled light field from a single image is highly beneficial for many applications. The conventional method reconstructs a depth map and relies on physical-based rendering and a secondary network to improve the synthesized novel views. Simple pixel-based loss also limits the network by making it rely on pixel intensity cue rather than geometric reasoning. In this study, we show that a different geometric representation, namely, appearance flow, can be used to synthesize a light field from a single image robustly and directly. A single end-to-end deep neural network that does not require a physical-based approach nor a post-processing subnetwork is proposed. Two novel loss functions based on known light field domain knowledge are presented to enable the network to preserve the spatio-angular consistency between sub-aperture images effectively. Experimental results show that the proposed model successfully synthesizes dense light fields and qualitatively and quantitatively outperforms the previous model. The method can be generalized to arbitrary scenes, rather than focusing on a particular class of object. The synthesized light field can be used for various applications, such as depth estimation and refocusing.

Submitted 29 March, 2019; originally announced March 2019.

arXiv:1811.12386 [pdf, other]

Tree-Structured Recurrent Switching Linear Dynamical Systems for Multi-Scale Modeling

Authors: Josue Nassar, Scott W. Linderman, Monica Bugallo, Il Memming Park

Abstract: Many real-world systems studied are governed by complex, nonlinear dynamics. By modeling these dynamics, we can gain insight into how these systems work, make predictions about how they will behave, and develop strategies for controlling them. While there are many methods for modeling nonlinear dynamical systems, existing techniques face a trade off between offering interpretable descriptions and making accurate predictions. Here, we develop a class of models that aims to achieve both simultaneously, smoothly interpolating between simple descriptions and more complex, yet also more accurate models. Our probabilistic model achieves this multi-scale property through a hierarchy of locally linear dynamics that jointly approximate global nonlinear dynamics. We call it the tree-structured recurrent switching linear dynamical system. To fit this model, we present a fully-Bayesian sampling procedure using Polya-Gamma data augmentation to allow for fast and conjugate Gibbs sampling. Through a variety of synthetic and real examples, we show how these models outperform existing methods in both interpretability and predictive capability.

Submitted 4 June, 2019; v1 submitted 29 November, 2018; originally announced November 2018.

arXiv:1811.12069 [pdf, other]

doi 10.3938/jkps.72.1292

Multi-Scale Distributed Representation for Deep Learning and its Application to b-Jet Tagging

Authors: Jason Lee, Inkyu Park, Sangnam Park

Abstract: Recently machine learning algorithms based on deep layered artificial neural networks (DNNs) have been applied to a wide variety of high energy physics problems such as jet tagging or event classification. We explore a simple but effective preprocessing step which transforms each real-valued observational quantity or input feature into a binary number with a fixed number of digits. Each binary digit represents the quantity or magnitude in different scales. We have shown that this approach improves the performance of DNNs significantly for some specific tasks without any further complication in feature engineering. We apply this multi-scale distributed binary representation to deep learning on b-jet tagging using daughter particles' momenta and vertex information.

Submitted 29 November, 2018; originally announced November 2018.

Comments: 13 pages, 8 figures

Journal ref: J.Korean Phys.Soc. 72 (2018) no.11, 1292-1300

arXiv:1810.03785 [pdf, other]

Information Geometry of Orthogonal Initializations and Training

Authors: Piotr A. Sokol, Il Memming Park

Abstract: Recently mean field theory has been successfully used to analyze properties of wide, random neural networks. It gave rise to a prescriptive theory for initializing feed-forward neural networks with orthogonal weights, which ensures that both the forward propagated activations and the backpropagated gradients are near $\ell_2$ isometries and as a consequence training is orders of magnitude faster. Despite strong empirical performance, the mechanisms by which critical initializations confer an advantage in the optimization of deep neural networks are poorly understood. Here we show a novel connection between the maximum curvature of the optimization landscape (gradient smoothness) as measured by the Fisher information matrix (FIM) and the spectral radius of the input-output Jacobian, which partially explains why more isometric networks can train much faster. Furthermore, given that orthogonal weights are necessary to ensure that gradient norms are approximately preserved at initialization, we experimentally investigate the benefits of maintaining orthogonality throughout training, from which we conclude that manifold optimization of weights performs well regardless of the smoothness of the gradients. Moreover, motivated by experimental results we show that a low condition number of the FIM is not predictive of faster learning.

Submitted 4 June, 2019; v1 submitted 8 October, 2018; originally announced October 2018.

Comments: 10 pages and 5 figures; 5 page appendix

arXiv:1711.10918 [pdf, other]

Joint Blind Motion Deblurring and Depth Estimation of Light Field

Authors: Dongwoo Lee, Haesol Park, In Kyu Park, Kyoung Mu Lee

Abstract: Removing camera motion blur from a single light field is a challenging task since it is a highly ill-posed inverse problem. The problem becomes even worse when blur kernel varies spatially due to scene depth variation and high-order camera motion. In this paper, we propose a novel algorithm to estimate all blur model variables jointly, including latent sub-aperture image, camera motion, and scene depth from the blurred 4D light field. Exploiting multi-view nature of a light field relieves the inverse property of the optimization by utilizing strong depth cues and multi-view blur observation. The proposed joint estimation achieves high quality light field deblurring and depth estimation simultaneously under arbitrary 6-DOF camera motion and unconstrained scene depth. Intensive experiment on real and synthetic blurred light field confirms that the proposed algorithm outperforms the state-of-the-art light field deblurring and depth estimation methods.

Submitted 14 June, 2018; v1 submitted 29 November, 2017; originally announced November 2017.

arXiv:1310.5347 [pdf, other]

Bayesian Extensions of Kernel Least Mean Squares

Authors: Il Memming Park, Sohan Seth, Steven Van Vaerenbergh

Abstract: The kernel least mean squares (KLMS) algorithm is a computationally efficient nonlinear adaptive filtering method that "kernelizes" the celebrated (linear) least mean squares algorithm. We demonstrate that the least mean squares algorithm is closely related to the Kalman filtering, and thus, the KLMS can be interpreted as an approximate Bayesian filtering method. This allows us to systematically develop extensions of the KLMS by modifying the underlying state-space and observation models. The resulting extensions introduce many desirable properties such as "forgetting", and the ability to learn from discrete data, while retaining the computational simplicity and time complexity of the original algorithm.

Submitted 20 October, 2013; originally announced October 2013.

Comments: 7 pages, 4 figures

arXiv:1302.0328 [pdf, other]

Bayesian Entropy Estimation for Countable Discrete Distributions

Authors: Evan Archer, Il Memming Park, Jonathan Pillow

Abstract: We consider the problem of estimating Shannon's entropy $H$ from discrete data, in cases where the number of possible symbols is unknown or even countably infinite. The Pitman-Yor process, a generalization of Dirichlet process, provides a tractable prior distribution over the space of countably infinite discrete distributions, and has found major applications in Bayesian non-parametric statistics and machine learning. Here we show that it also provides a natural family of priors for Bayesian entropy estimation, due to the fact that moments of the induced posterior distribution over $H$ can be computed analytically. We derive formulas for the posterior mean (Bayes' least squares estimate) and variance under Dirichlet and Pitman-Yor process priors. Moreover, we show that a fixed Dirichlet or Pitman-Yor process prior implies a narrow prior distribution over $H$, meaning the prior strongly determines the entropy estimate in the under-sampled regime. We derive a family of continuous mixing measures such that the resulting mixture of Pitman-Yor processes produces an approximately flat prior over $H$. We show that the resulting Pitman-Yor Mixture (PYM) entropy estimator is consistent for a large class of distributions. We explore the theoretical properties of the resulting estimator, and show that it performs well both in simulation and in application to real data.

Submitted 9 April, 2014; v1 submitted 1 February, 2013; originally announced February 2013.

Comments: 38 pages LaTeX. Revised and resubmitted to JMLR

arXiv:1202.2143 [pdf, other]

Active Bayesian Optimization: Minimizing Minimizer Entropy

Authors: Il Memming Park, Marcel Nassar, Mijung Park

Abstract: The ultimate goal of optimization is to find the minimizer of a target function. However, typical criteria for active optimization often ignore the uncertainty about the minimizer. We propose a novel criterion for global optimization and an associated sequential active learning strategy using Gaussian processes. Our criterion is the reduction of uncertainty in the posterior distribution of the function minimizer. It can also flexibly incorporate multiple global minimizers. We implement a tractable approximation of the criterion and demonstrate that it obtains the global minimizer accurately compared to conventional Bayesian optimization criteria.

Submitted 9 February, 2012; originally announced February 2012.

arXiv:0901.3475 [pdf, ps, other]

Efficient decoding algorithm using triangularity of $\mathbf{R}$ matrix of QR-decomposition

Authors: In Sook Park

Abstract: An efficient decoding algorithm named `divided decoder' is proposed in this paper. Divided decoding can be combined with any decoder using QR-decomposition and offers different pairs of performance and complexity. Divided decoding provides various combinations of two or more different searching algorithms. Hence it makes flexibility in error rate and complexity for the algorithms using it. We calculate diversity orders and upper bounds of error rates for typical models when these models are solved by divided decodings with sphere decoder, and discuss about the effects of divided decoding on complexity. Simulation results of divided decodings combined with a sphere decoder according to different splitting indices correspond to the theoretical analysis.

Submitted 22 January, 2009; originally announced January 2009.

Comments: This paper is submitted to IEEE transactions on Information theory

Showing 1–50 of 50 results for author: Park, I