Showing 1–45 of 45 results for author: Foroosh, H

Searching in archive cs.
  1. arXiv:2406.12084  [pdf, other]

    cs.CL cs.AI

    When Reasoning Meets Information Aggregation: A Case Study with Sports Narratives

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Wenlin Yao, Hassan Foroosh, Dong Yu, Fei Liu

    Abstract: Reasoning is most powerful when an LLM accurately aggregates relevant information. We examine the critical role of information aggregation in reasoning by requiring the LLM to analyze sports narratives. To succeed at this task, an LLM must infer points from actions, identify related entities, attribute points accurately to players and teams, and compile key statistics to draw conclusions. We condu…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2403.04031  [pdf, other]

    cs.CL cs.AI

    Can Large Language Models do Analytical Reasoning?

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu

    Abstract: This paper explores the cutting-edge Large Language Model with analytical reasoning on sports. Our analytical reasoning embodies the tasks of letting large language models count how many points each team scores in a quarter in the NBA and NFL games. Our major discoveries are twofold. Firstly, we find among all the models we employed, GPT-4 stands out in effectiveness, followed by Claude-2.1,…

    Submitted 6 March, 2024; originally announced March 2024.

  3. arXiv:2402.10979  [pdf, other]

    cs.CL cs.AI

    SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu

    Abstract: Large language models hold significant potential for integrating various data types, such as text documents and database records, for advanced analytics. However, blending text and numerical data presents substantial challenges. LLMs need to process and cross-reference entities and numbers, handle data inconsistencies and redundancies, and develop planning capabilities such as building a working m…

    Submitted 16 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: ACL 2024 Long Paper

  4. arXiv:2306.12525  [pdf, other]

    cs.CV

    LPFormer: LiDAR Pose Estimation Transformer with Multi-Task Network

    Authors: Dongqiangzi Ye, Yufei Xie, Weijia Chen, Zixiang Zhou, Lingting Ge, Hassan Foroosh

    Abstract: Due to the difficulty of acquiring large-scale 3D human keypoint annotation, previous methods for 3D human pose estimation (HPE) have often relied on 2D image features and sequential 2D annotations. Furthermore, the training of these networks typically assumes the prediction of a human bounding box and the accurate alignment of 3D point clouds with 2D images, making direct application in real-worl…

    Submitted 2 March, 2024; v1 submitted 21 June, 2023; originally announced June 2023.

    Comments: ICRA 2024. Top solution for the Waymo Open Dataset Challenges 2023 - Pose Estimation. CVPR 2023 Workshop on Autonomous Driving

  5. arXiv:2305.17529  [pdf, other]

    cs.CL

    MeetingBank: A Benchmark Dataset for Meeting Summarization

    Authors: Yebowen Hu, Tim Ganter, Hanieh Deilamsalehy, Franck Dernoncourt, Hassan Foroosh, Fei Liu

    Abstract: As the number of recorded meetings increases, it becomes increasingly important to utilize summarization technology to create useful summaries of these recordings. However, there is a crucial lack of annotated meeting corpora for developing this technology, as it can be hard to collect meetings, especially when the topics discussed are confidential. Furthermore, meeting summaries written by experi…

    Submitted 27 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Long Paper

  6. arXiv:2305.14702  [pdf, other]

    cs.CL

    DecipherPref: Analyzing Influential Factors in Human Preference Judgments via GPT-4

    Authors: Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Fei Liu

    Abstract: Human preference judgments are pivotal in guiding large language models (LLMs) to produce outputs that align with human values. Human evaluations are also used in summarization tasks to compare outputs from various systems, complementing existing automatic metrics. Despite their significance, however, there has been limited research probing these pairwise or $k$-wise comparisons. The collective im…

    Submitted 27 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  7. Personalizing Task-oriented Dialog Systems via Zero-shot Generalizable Reward Function

    Authors: A. B. Siddique, M. H. Maqbool, Kshitija Taywade, Hassan Foroosh

    Abstract: Task-oriented dialog systems enable users to accomplish tasks using natural language. State-of-the-art systems respond to users in the same way regardless of their personalities, although personalizing dialogues can lead to higher levels of adoption and better user experiences. Building personalized dialog systems is an important, yet challenging endeavor and only a handful of works took on the ch…

    Submitted 24 March, 2023; originally announced March 2023.

    Comments: 11 pages, 4 tables, 31st ACM International Conference on Information and Knowledge Management (CIKM'22)

  8. arXiv:2303.12194  [pdf, other]

    cs.CV

    LiDARFormer: A Unified Transformer-based Multi-task Network for LiDAR Perception

    Authors: Zixiang Zhou, Dongqiangzi Ye, Weijia Chen, Yufei Xie, Yu Wang, Panqu Wang, Hassan Foroosh

    Abstract: There is a recent trend in the LiDAR perception field towards unifying multiple tasks in a single strong network with improved performance, as opposed to using separate networks for each task. In this paper, we introduce a new LiDAR multi-task learning paradigm based on the transformer. The proposed LiDARFormer utilizes cross-space global contextual feature information and exploits cross-task syne…

    Submitted 2 March, 2024; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: ICRA 2024

  9. arXiv:2303.06588  [pdf, other]

    cs.IR cs.LG cs.SE

    MobileRec: A Large-Scale Dataset for Mobile Apps Recommendation

    Authors: M. H. Maqbool, Umar Farooq, Adib Mosharrof, A. B. Siddique, Hassan Foroosh

    Abstract: Recommender systems have become ubiquitous in our digital lives, from recommending products on e-commerce websites to suggesting movies and music on streaming platforms. Existing recommendation datasets, such as Amazon Product Reviews and MovieLens, greatly facilitated the research and development of recommender systems in their respective domains. While the number of mobile users and applications…

    Submitted 12 March, 2023; originally announced March 2023.

    Comments: 10 pages, 4 tables, 4 figures, Under submission at SIGIR'23

  10. arXiv:2209.09385  [pdf, other]

    cs.CV

    LidarMultiNet: Towards a Unified Multi-Task Network for LiDAR Perception

    Authors: Dongqiangzi Ye, Zixiang Zhou, Weijia Chen, Yufei Xie, Yu Wang, Panqu Wang, Hassan Foroosh

    Abstract: LiDAR-based 3D object detection, semantic segmentation, and panoptic segmentation are usually implemented in specialized networks with distinctive architectures that are difficult to adapt to each other. This paper presents LidarMultiNet, a LiDAR-based multi-task network that unifies these three major LiDAR perception tasks. Among its many benefits, a multi-task network can reduce the overall cost…

    Submitted 21 March, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: Accepted to AAAI 2023 (Oral). Full-length paper extending our previous technical report of the 1st place solution of the 2022 Waymo Open Dataset 3D Semantic Segmentation challenge, including evaluations on 5 major benchmarks. arXiv admin note: text overlap with arXiv:2206.11428

  11. arXiv:2209.05588  [pdf, other]

    cs.CV

    CenterFormer: Center-based Transformer for 3D Object Detection

    Authors: Zixiang Zhou, Xiangchen Zhao, Yu Wang, Panqu Wang, Hassan Foroosh

    Abstract: Query-based transformer has shown great potential in constructing long-range attention in many image-domain tasks, but has rarely been considered in LiDAR-based 3D object detection due to the overwhelming size of the point cloud data. In this paper, we propose CenterFormer, a center-based transformer network for 3D object detection. CenterFormer first uses a center heatmap to select center candida…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: Accepted to ECCV 2022 (oral)

  12. arXiv:2206.11428  [pdf, other]

    cs.CV

    LidarMultiNet: Unifying LiDAR Semantic Segmentation, 3D Object Detection, and Panoptic Segmentation in a Single Multi-task Network

    Authors: Dongqiangzi Ye, Weijia Chen, Zixiang Zhou, Yufei Xie, Yu Wang, Panqu Wang, Hassan Foroosh

    Abstract: This technical report presents the 1st place winning solution for the Waymo Open Dataset 3D semantic segmentation challenge 2022. Our network, termed LidarMultiNet, unifies the major LiDAR perception tasks such as 3D semantic segmentation, object detection, and panoptic segmentation in a single framework. At the core of LidarMultiNet is a strong 3D voxel-based encoder-decoder network with a novel…

    Submitted 23 June, 2022; v1 submitted 22 June, 2022; originally announced June 2022.

    Comments: Official 1st Place Solution for the Waymo Open Dataset Challenges 2022 - 3D Semantic Segmentation. Official leaderboard: https://waymo.com/open/challenges/2022/3d-semantic-segmentation/. CVPR 2022 Workshop on Autonomous Driving: http://cvpr2022.wad.vision/

  13. Near-Infrared Depth-Independent Image Dehazing using Haar Wavelets

    Authors: Sumit Laha, Ankit Sharma, Shengnan Hu, Hassan Foroosh

    Abstract: We propose a fusion algorithm for haze removal that combines color information from an RGB image and edge information extracted from its corresponding NIR image using Haar wavelets. The proposed algorithm is based on the key observation that NIR edge features are more prominent in the hazy regions of the image than the RGB edge features in those same regions. To combine the color and edge informat…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: Accepted in 25th International Conference on Pattern Recognition (ICPR 2020)

    Journal ref: 2020 25th International Conference on Pattern Recognition (ICPR) (2021) 5384-5390

  14. arXiv:2109.05160  [pdf, other]

    cs.CL

    StreamHover: Livestream Transcript Summarization and Annotation

    Authors: Sangwoo Cho, Franck Dernoncourt, Tim Ganter, Trung Bui, Nedim Lipka, Walter Chang, Hailin Jin, Jonathan Brandt, Hassan Foroosh, Fei Liu

    Abstract: With the explosive growth of livestream broadcasting, there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge. However, the problem is nontrivial due to the informal nature of spoken language. Further, there has been a shortage of annotated datasets that are necessary for transcript summarization. In thi…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 (Long Paper)

  15. arXiv:2103.14962  [pdf, other]

    cs.CV

    Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation

    Authors: Zixiang Zhou, Yang Zhang, Hassan Foroosh

    Abstract: Panoptic segmentation presents a new challenge in exploiting the merits of both detection and segmentation, with the aim of unifying instance segmentation and semantic segmentation in a single framework. However, an efficient solution for panoptic segmentation in the emerging domain of LiDAR point cloud is still an open research problem and is very much under-explored. In this paper, we present a…

    Submitted 27 March, 2021; originally announced March 2021.

    Comments: Accepted by CVPR 2021

  16. arXiv:2010.10566  [pdf, other]

    cs.CL

    Better Highlighting: Creating Sub-Sentence Summary Highlights

    Authors: Sangwoo Cho, Kaiqiang Song, Chen Li, Dong Yu, Hassan Foroosh, Fei Liu

    Abstract: Amongst the best means to summarize is highlighting. In this paper, we aim to generate summary highlights to be overlaid on the original documents to make it easier for readers to sift through a large amount of text. The method allows summaries to be understood in context to prevent a summarizer from distorting the original meaning, of which abstractive summarizers usually fall short. In particula…

    Submitted 20 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020 (Long Paper)

  17. arXiv:2008.08281  [pdf, other]

    cs.CV

    CCA: Exploring the Possibility of Contextual Camouflage Attack on Object Detection

    Authors: Shengnan Hu, Yang Zhang, Sumit Laha, Ankit Sharma, Hassan Foroosh

    Abstract: Deep neural network based object detection has become the cornerstone of many real-world applications. Along with this success come concerns about its vulnerability to malicious attacks. To gain more insight into this issue, we propose a contextual camouflage attack (CCA for short) algorithm to influence the performance of object detectors. In this paper, we use an evolutionary search strategy and ad…

    Submitted 19 August, 2020; originally announced August 2020.

  18. arXiv:2003.14032  [pdf, other]

    cs.CV

    PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

    Authors: Yang Zhang, Zixiang Zhou, Philip David, Xiangyu Yue, Zerong Xi, Boqing Gong, Hassan Foroosh

    Abstract: The need for fine-grained perception in autonomous driving systems has resulted in recently increased research on online semantic segmentation of single-scan LiDAR. Despite the emerging datasets and technological advancements, it remains challenging due to three reasons: (1) the need for near-real-time latency with limited hardware; (2) uneven or even long-tailed distribution of LiDAR points acros…

    Submitted 26 April, 2020; v1 submitted 31 March, 2020; originally announced March 2020.

    Comments: Accepted by CVPR 2020; Code at https://github.com/edwardzhou130/PolarSeg

  19. arXiv:1912.08435  [pdf, other]

    cs.CV

    Self-Attention Network for Skeleton-based Human Action Recognition

    Authors: Sangwoo Cho, Muhammad Hasan Maqbool, Fei Liu, Hassan Foroosh

    Abstract: Skeleton-based action recognition has recently attracted a lot of attention. Researchers are coming up with new approaches for extracting spatio-temporal relations and making considerable progress on large-scale skeleton-based datasets. Most of the architectures being proposed are based upon recurrent neural networks (RNNs), convolutional neural networks (CNNs) and graph-based CNNs. When it comes…

    Submitted 18 December, 2019; originally announced December 2019.

    Comments: WACV 2020 Paper

  20. arXiv:1910.11411  [pdf, other]

    cs.CL

    Multi-Document Summarization with Determinantal Point Processes and Contextualized Representations

    Authors: Sangwoo Cho, Chen Li, Dong Yu, Hassan Foroosh, Fei Liu

    Abstract: Emerged as one of the best performing techniques for extractive summarization, determinantal point processes select the most probable set of sentences to form a summary according to a probability measure defined by modeling sentence prominence and pairwise repulsion. Traditionally, these aspects are modelled using shallow and linguistically informed features, but the rise of deep contextualized re…

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: EMNLP 2019 Workshop on New Frontiers in Summarization

  21. arXiv:1910.09417  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Maximum Probability Theorem: A Framework for Probabilistic Learning

    Authors: Amir Emad Marvasti, Ehsan Emad Marvasti, Ulas Bagci, Hassan Foroosh

    Abstract: We present a theoretical framework of probabilistic learning derived by Maximum Probability (MP) Theorem shown in the current paper. In this probabilistic framework, a model is defined as an event in the probability space, and a model or the associated event -- either the true underlying model or the parameterized model -- have a quantified probability measure. This quantification of a model's pro…

    Submitted 14 June, 2021; v1 submitted 21 October, 2019; originally announced October 2019.

    Comments: in IEEE Transactions on Artificial Intelligence

  22. arXiv:1907.02157  [pdf, other]

    cs.CV

    Slim-CNN: A Light-Weight CNN for Face Attribute Prediction

    Authors: Ankit Sharma, Hassan Foroosh

    Abstract: We introduce a computationally-efficient CNN micro-architecture Slim Module to design a lightweight deep neural network Slim-Net for face attribute prediction. Slim Modules are constructed by assembling depthwise separable convolutions with pointwise convolution to produce a computationally efficient module. The problem of facial attribute prediction is challenging because of the large variations…

    Submitted 3 July, 2019; originally announced July 2019.

  23. Spatio-Temporal Fusion Networks for Action Recognition

    Authors: Sangwoo Cho, Hassan Foroosh

    Abstract: The video based CNN works have focused on effective ways to fuse appearance and motion networks, but they typically lack utilizing temporal information over video frames. In this work, we present a novel spatio-temporal fusion network (STFN) that integrates temporal dynamics of appearance and motion information from entire videos. The captured temporal dynamic information is then aggregated for a…

    Submitted 16 June, 2019; originally announced June 2019.

    Journal ref: Asian Conference on Computer Vision (2018) 347-364

  24. A Temporal Sequence Learning for Action Recognition and Prediction

    Authors: Sangwoo Cho, Hassan Foroosh

    Abstract: In this work (supported in part by the National Science Foundation under grant IIS-1212948), we present a method to represent a video with a sequence of words, and learn the temporal sequencing of such words as the key information for predicting and recognizing human actions. We leverage core concepts from the Natural Language Processing (NLP) literature used in sentence cl…

    Submitted 16 June, 2019; originally announced June 2019.

    Comments: 10 pages, 8 figures, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV)

    Journal ref: IEEE Winter Conference on Applications of Computer Vision, 2018, 352-361

  25. arXiv:1906.00072  [pdf, other]

    cs.CL

    Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization

    Authors: Sangwoo Cho, Logan Lebanoff, Hassan Foroosh, Fei Liu

    Abstract: The most important obstacles facing multi-document summarization include excessive redundancy in source descriptions and the looming shortage of training data. These obstacles prevent encoder-decoder models from being used directly, but optimization-based methods such as determinantal point processes (DPPs) are known to handle them well. In this paper we seek to strengthen a DPP-based method for e…

    Submitted 31 May, 2019; originally announced June 2019.

    Comments: ACL 2019 (Long Paper)

  26. arXiv:1901.02338  [pdf, other]

    cs.LG cs.CV

    Sparse One-Time Grab Sampling of Inliers

    Authors: Maryam Jaberi, Marianna Pensky, Hassan Foroosh

    Abstract: Estimating structures in "big data" and clustering them are among the most fundamental problems in computer vision, pattern recognition, data mining, and many other research fields. Over the past few decades, many studies have been conducted focusing on different aspects of these problems. One of the main approaches that is explored in the literature to tackle the problems of size and dimens…

    Submitted 21 December, 2018; originally announced January 2019.

    Comments: WiML2017

  27. arXiv:1812.09953  [pdf, other]

    cs.CV

    A Curriculum Domain Adaptation Approach to the Semantic Segmentation of Urban Scenes

    Authors: Yang Zhang, Philip David, Hassan Foroosh, Boqing Gong

    Abstract: During the last half decade, convolutional neural networks (CNNs) have triumphed over semantic segmentation, which is one of the core tasks in many applications such as autonomous driving and augmented reality. However, to train CNNs requires a considerable amount of data, which is difficult to collect and laborious to annotate. Recent advances in computer graphics make it possible to train CNNs o…

    Submitted 9 January, 2019; v1 submitted 24 December, 2018; originally announced December 2018.

    Comments: This is the journal version of arXiv:1707.09465

  28. arXiv:1811.12673  [pdf, other]

    cs.CV

    ComDefend: An Efficient Image Compression Model to Defend Adversarial Examples

    Authors: Xiaojun Jia, Xingxing Wei, Xiaochun Cao, Hassan Foroosh

    Abstract: Deep neural networks (DNNs) have been demonstrated to be vulnerable to adversarial examples. Specifically, adding imperceptible perturbations to clean images can fool the well trained deep neural networks. In this paper, we propose an end-to-end image compression model to defend adversarial examples: ComDefend. The proposed model consists of a compression convolutional neural network (Com…

    Submitted 1 July, 2019; v1 submitted 30 November, 2018; originally announced November 2018.

    Journal ref: CVPR 2019

  29. arXiv:1809.10073  [pdf, other]

    cs.LG stat.ML

    Rediscovering Deep Neural Networks Through Finite-State Distributions

    Authors: Amir Emad Marvasti, Ehsan Emad Marvasti, George Atia, Hassan Foroosh

    Abstract: We propose a new way of thinking about deep neural networks, in which the linear and non-linear components of the network are naturally derived and justified in terms of principles in probability theory. In particular, the models constructed in our framework assign probabilities to uncertain realizations, leading to Kullback-Leibler Divergence (KLD) as the linear layer. In our model construction,…

    Submitted 9 October, 2019; v1 submitted 26 September, 2018; originally announced September 2018.

  30. arXiv:1808.09574  [pdf, other]

    stat.ML cs.CV cs.LG

    Probabilistic Sparse Subspace Clustering Using Delayed Association

    Authors: Maryam Jaberi, Marianna Pensky, Hassan Foroosh

    Abstract: Discovering and clustering subspaces in high-dimensional data is a fundamental problem of machine learning with a wide range of applications in data mining, computer vision, and pattern recognition. Earlier methods divided the problem into two separate stages of finding the similarity matrix and finding clusters. Similar to some recent works, we integrate these two steps using a joint optimization…

    Submitted 28 August, 2018; originally announced August 2018.

    Journal ref: ICPR 2018

  31. arXiv:1708.05464  [pdf, other]

    cs.CV

    Simultaneous Detection and Quantification of Retinal Fluid with Deep Learning

    Authors: Dustin Morley, Hassan Foroosh, Saad Shaikh, Ulas Bagci

    Abstract: We propose a new deep learning approach for automatic detection and segmentation of fluid within retinal OCT images. The proposed framework utilizes both ResNet and Encoder-Decoder neural network architectures. When training the network, we apply a novel data augmentation method called myopic warping together with standard rotation-based augmentation to increase the training set size to 45 times t…

    Submitted 17 August, 2017; originally announced August 2017.

  32. arXiv:1705.08293  [pdf, other]

    cs.CV

    An Invariant Model of the Significance of Different Body Parts in Recognizing Different Actions

    Authors: Yuping Shen, Hassan Foroosh

    Abstract: In this paper, we show that different body parts do not play equally important roles in recognizing a human action in video data. We investigate to what extent a body part plays a role in recognition of different actions and hence propose a generic method of assigning weights to different body points. The approach is inspired by the strong evidence in the applied perception community that humans p…

    Submitted 22 May, 2017; originally announced May 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1705.04641, arXiv:1705.05741, arXiv:1705.04433

  33. arXiv:1705.07609  [pdf, other]

    cs.CV

    View-Invariant Recognition of Action Style Self-Dissimilarity

    Authors: Yuping Shen, Hassan Foroosh

    Abstract: Self-similarity was recently introduced as a measure of inter-class congruence for classification of actions. Herein, we investigate the dual problem of intra-class dissimilarity for classification of action styles. We introduce self-dissimilarity matrices that discriminate between same actions performed by different subjects regardless of viewing direction and camera parameters. We investigate tw…

    Submitted 22 May, 2017; originally announced May 2017.

  34. arXiv:1705.07340  [pdf, other]

    cs.CV

    Phase-Shifting Separable Haar Wavelets and Applications

    Authors: Mais Alnasser, Hassan Foroosh

    Abstract: This paper presents a new approach for tackling the shift-invariance problem in the discrete Haar domain, without trading off any of its desirable properties, such as compression, separability, orthogonality, and symmetry. The paper presents several key theoretical contributions. First, we derive closed form expressions for phase shifting in the Haar domain both in partially decimated and fully de…

    Submitted 20 May, 2017; originally announced May 2017.

  35. arXiv:1705.07272  [pdf, other]

    cs.CV

    Non-Linear Phase-Shifting of Haar Wavelets for Run-Time All-Frequency Lighting

    Authors: Mais Alnasser, Hassan Foroosh

    Abstract: This paper focuses on real-time all-frequency image-based rendering using an innovative solution for run-time computation of light transport. The approach is based on new results derived for non-linear phase shifting in the Haar wavelet domain. Although image-based methods for real-time rendering of dynamic glossy objects have been proposed, they do not truly scale to all possible frequencies and…

    Submitted 20 May, 2017; originally announced May 2017.

  36. arXiv:1705.05745  [pdf, other]

    cs.CV

    Volumetric Super-Resolution of Multispectral Data

    Authors: Vildan Atalay Aydin, Hassan Foroosh

    Abstract: Most multispectral remote sensors (e.g. QuickBird, IKONOS, and Landsat 7 ETM+) provide low-spatial high-spectral resolution multispectral (MS) or high-spatial low-spectral resolution panchromatic (PAN) images, separately. In order to reconstruct a high-spatial/high-spectral resolution multispectral image volume, either the information in MS and PAN images is fused (i.e. pansharpening) or super-re…

    Submitted 13 May, 2017; originally announced May 2017.

    Comments: arXiv admin note: text overlap with arXiv:1705.01258

  37. arXiv:1705.05741  [pdf, other]

    cs.CV

    Motion-Compensated Temporal Filtering for Critically-Sampled Wavelet-Encoded Images

    Authors: Vildan Atalay Aydin, Hassan Foroosh

    Abstract: We propose a novel motion estimation/compensation (ME/MC) method for wavelet-based (in-band) motion compensated temporal filtering (MCTF), with application to low-bitrate video coding. Unlike the conventional in-band MCTF algorithms, which require redundancy to overcome the shift-variance problem of critically sampled (complete) discrete wavelet transforms (DWT), we perform ME/MC steps directly on…

    Submitted 13 May, 2017; originally announced May 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1705.04433, arXiv:1705.04641

  38. arXiv:1705.05102  [pdf, other]

    cs.CV

    Learning Semantics for Image Annotation

    Authors: Amara Tariq, Hassan Foroosh

    Abstract: Image search and retrieval engines rely heavily on textual annotation in order to match word queries to a set of candidate images. A system that can automatically annotate images with meaningful text can be highly beneficial for such engines. Currently, the approaches to develop such systems try to establish relationships between keywords and visual features of images. In this paper, we make three…

    Submitted 15 May, 2017; originally announced May 2017.

  39. arXiv:1705.04927  [pdf, other]

    cs.CV

    A Closed-Form Model for Image-Based Distant Lighting

    Authors: Mais Alnasser, Hassan Foroosh

    Abstract: In this paper, we present a new mathematical foundation for image-based lighting. Using a simple manipulation of the local coordinate system, we derive a closed-form solution to the light integral equation under distant environment illumination. We derive our solution for different BRDFs such as Lambertian and Phong-like. The method is free of noise, and provides the possibility of using the full…

    Submitted 14 May, 2017; originally announced May 2017.

  40. arXiv:1705.04641  [pdf, other]

    cs.CV

    Single Image Action Recognition by Predicting Space-Time Saliency

    Authors: Marjaneh Safaei, Hassan Foroosh

    Abstract: We propose a novel approach based on deep Convolutional Neural Networks (CNN) to recognize human actions in still images by predicting the future motion, and detecting the shape and location of the salient parts of the image. We make the following major contributions to this important area of research: (i) We use the predicted future motion in the static image (Walker et al., 2015) as a means of c…

    Submitted 12 May, 2017; originally announced May 2017.

  41. arXiv:1705.04433  [pdf, other]

    cs.CV

    View-Invariant Template Matching Using Homography Constraints

    Authors: Sina Lotfian, Hassan Foroosh

    Abstract: Change in viewpoint is one of the major factors for variation in object appearance across different images. Thus, view-invariant object recognition is a challenging and important image understanding task. In this paper, we propose a method that can match objects in images taken under different viewpoints. Unlike most methods in the literature, no restriction on camera orientations or internal came…

    Submitted 11 May, 2017; originally announced May 2017.

  42. arXiv:1705.02460  [pdf, other]

    cs.CV

    Image Annotation using Multi-Layer Sparse Coding

    Authors: Amara Tariq, Hassan Foroosh

    Abstract: Automatic annotation of images with descriptive words is a challenging problem with vast applications in the areas of image search and retrieval. This problem can be viewed as a label-assignment problem by a classifier dealing with a very large set of labels, i.e., the vocabulary set. We propose a novel annotation method that employs two layers of sparse coding and performs coarse-to-fine labeling…

    Submitted 6 May, 2017; originally announced May 2017.

  43. arXiv:1705.01258  [pdf, other]

    cs.CV

    Super-Resolution of Wavelet-Encoded Images

    Authors: Vildan Atalay Aydin, Hassan Foroosh

    Abstract: Multiview super-resolution image reconstruction (SRIR) is often cast as a resampling problem by merging non-redundant data from multiple low-resolution (LR) images on a finer high-resolution (HR) grid, while inverting the effect of the camera point spread function (PSF). One main problem with multiview methods is that resampling from nonuniform samples (provided by LR images) and the inversion of…

    Submitted 3 May, 2017; originally announced May 2017.

  44. arXiv:1705.00430  [pdf, other]

    cs.CV

    Sub-Pixel Registration of Wavelet-Encoded Images

    Authors: Vildan Atalay Aydin, Hassan Foroosh

    Abstract: Sub-pixel registration is a crucial step for applications such as super-resolution in remote sensing, motion compensation in magnetic resonance imaging, and non-destructive testing in manufacturing, to name a few. Recently, these technologies have been trending towards wavelet encoded imaging and sparse/compressive sensing. The former plays a crucial role in reducing imaging artifacts, while the l…

    Submitted 1 May, 2017; originally announced May 2017.

  45. arXiv:1608.04337  [pdf, other]

    cs.CV

    Design of Efficient Convolutional Layers using Single Intra-channel Convolution, Topological Subdivisioning and Spatial "Bottleneck" Structure

    Authors: Min Wang, Baoyuan Liu, Hassan Foroosh

    Abstract: Deep convolutional neural networks achieve remarkable visual recognition performance, at the cost of high computational complexity. In this paper, we present a new design of efficient convolutional layers based on three schemes. The 3D convolution operation in a convolutional layer can be considered as performing spatial convolution in each channel and linear projection across channels simultaneously…

    Submitted 24 January, 2017; v1 submitted 15 August, 2016; originally announced August 2016.