Showing 1–50 of 66 results for author: Saito, K

Searching in archive cs.

  1. arXiv:2407.15296  [pdf, other]

    cs.CV cs.CL cs.LG

    Weak-to-Strong Compositional Learning from Generative Models for Language-based Object Detection

    Authors: Kwanyong Park, Kuniaki Saito, Donghyun Kim

    Abstract: Vision-language (VL) models often exhibit a limited understanding of complex expressions of visual objects (e.g., attributes, shapes, and their relations), given complex and diverse language queries. Traditional approaches attempt to improve VL models using hard negative synthetic text, but their effectiveness is limited. In this paper, we harness the exceptional compositional understanding capabi…

    Submitted 21 July, 2024; originally announced July 2024.

    Comments: ECCV 2024

  2. arXiv:2406.17672  [pdf, other]

    cs.SD eess.AS

    SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond

    Authors: Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji

    Abstract: Recent advances in generative models that iteratively synthesize audio clips sparked great success to text-to-audio synthesis (TTA), but with the cost of slow synthesis speed and heavy computation. Although there have been attempts to accelerate the iterative procedure, high-quality TTA systems remain inefficient due to hundreds of iterations required in the inference phase and large amount of mod…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 6 pages, 8 figures, 8 tables. Audio samples: https://zzaudio.github.io/SpecMaskGIT/index.html

  3. arXiv:2405.18503  [pdf, other]

    cs.SD cs.LG eess.AS

    SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation

    Authors: Koichi Saito, Dongjun Kim, Takashi Shibuya, Chieh-Hsin Lai, Zhi Zhong, Yuhta Takida, Yuki Mitsufuji

    Abstract: Sound content is an indispensable element for multimedia works such as video games, music, and films. Recent high-quality diffusion-based sound generation models can serve as valuable tools for the creators. However, despite producing high-quality sounds, these models often suffer from slow inference speeds. This drawback burdens creators, who typically refine their sounds through trial and error…

    Submitted 10 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: Audio samples: https://koichi-saito-sony.github.io/soundctm/. Codes: https://github.com/sony/soundctm. Checkpoints: https://huggingface.co/Sony/soundctm

  4. arXiv:2403.11686  [pdf, other]

    cs.LG cond-mat.mtrl-sci physics.comp-ph

    Crystalformer: Infinitely Connected Attention for Periodic Structure Encoding

    Authors: Tatsunori Taniai, Ryo Igarashi, Yuta Suzuki, Naoya Chiba, Kotaro Saito, Yoshitaka Ushiku, Kanta Ono

    Abstract: Predicting physical properties of materials from their crystal structures is a fundamental problem in materials science. In peripheral areas such as the prediction of molecular properties, fully connected attention networks have been shown to be successful. However, unlike these finite atom arrangements, crystal structures are infinitely repeating, periodic arrangements of atoms, whose fully conne…

    Submitted 18 March, 2024; originally announced March 2024.

    Comments: 13 main pages, 3 figures, 4 tables, 10 appendix pages. Published as a conference paper at ICLR 2024. For more information, see https://omron-sinicx.github.io/crystalformer/

  5. arXiv:2402.12170  [pdf, other]

    cs.CL cs.AI

    Where is the answer? Investigating Positional Bias in Language Model Knowledge Extraction

    Authors: Kuniaki Saito, Kihyuk Sohn, Chen-Yu Lee, Yoshitaka Ushiku

    Abstract: Large language models require updates to remain up-to-date or adapt to new domains by fine-tuning them with new documents. One key is memorizing the latest information in a way that the memorized information is extractable with a query prompt. However, LLMs suffer from a phenomenon called perplexity curse; despite minimizing document perplexity during fine-tuning, LLMs struggle to extract informat…

    Submitted 23 May, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  6. arXiv:2401.13313  [pdf, other]

    cs.CV cs.CL

    InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions

    Authors: Ryota Tanaka, Taichi Iki, Kyosuke Nishida, Kuniko Saito, Jun Suzuki

    Abstract: We study the problem of completing various visual document understanding (VDU) tasks, e.g., question answering and information extraction, on real-world documents through human-written instructions. To this end, we propose InstructDoc, the first large-scale collection of 30 publicly available VDU datasets, each with diverse instructions in a unified format, which covers a wide range of 12 tasks an…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI2024; project page: https://github.com/nttmdlab-nlp/InstructDoc

  7. arXiv:2310.12401  [pdf]

    cs.CR

    Privacy-Preserving Hierarchical Anonymization Framework over Encrypted Data

    Authors: Jing Jia, Kenta Saito, Hiroaki Nishi

    Abstract: Smart cities, which can monitor the real world and provide smart services in a variety of fields, have improved people's living standards as urbanization has accelerated. However, there are security and privacy concerns because smart city applications collect large amounts of privacy-sensitive information from people and their social circles. Anonymization, which generalizes data and reduces data…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 8 pages, 12 figures, submitted to IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS and under review

    ACM Class: E.3

  8. arXiv:2310.10076  [pdf, other]

    cs.CL cs.AI

    Verbosity Bias in Preference Labeling by Large Language Models

    Authors: Keita Saito, Akifumi Wachi, Koki Wataoka, Youhei Akimoto

    Abstract: In recent years, Large Language Models (LLMs) have witnessed a remarkable surge in prevalence, altering the landscape of natural language processing and machine learning. One key factor in improving the performance of LLMs is alignment with humans achieved with Reinforcement Learning from Human Feedback (RLHF), as for many LLMs such as GPT-4, Bard, etc. In addition, recent studies are investigatin…

    Submitted 16 October, 2023; originally announced October 2023.

  9. arXiv:2309.11394  [pdf, ps, other]

    cs.CY econ.GN

    Is Ethereum Proof of Stake Sustainable? $-$ Considering from the Perspective of Competition Among Smart Contract Platforms $-$

    Authors: Kenji Saito, Yutaka Soejima, Toshihiko Sugiura, Yukinobu Kitamura, Mitsuru Iwamura

    Abstract: Since the Merge update upon which Ethereum transitioned to Proof of Stake, it has been touted that it resulted in lower power consumption and increased security. However, even if that is the case, can this state be sustained? In this paper, we focus on the potential impact of competition with other smart contract platforms on the price of Ethereum's native currency, Ether (ETH), thereby raising…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: 30 pages, 1 figure

  10. arXiv:2309.06934  [pdf, other]

    eess.AS cs.SD

    VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance

    Authors: Carlos Hernandez-Olivan, Koichi Saito, Naoki Murata, Chieh-Hsin Lai, Marco A. Martínez-Ramirez, Wei-Hsiang Liao, Yuki Mitsufuji

    Abstract: Restoring degraded music signals is essential to enhance audio quality for downstream music manipulation. Recent diffusion-based music restoration methods have demonstrated impressive performance, and among them, diffusion posterior sampling (DPS) stands out given its intrinsic properties, making it versatile across various restoration tasks. In this paper, we identify that there are potential iss…

    Submitted 13 September, 2023; originally announced September 2023.

  11. arXiv:2304.01973  [pdf, other]

    cs.LG cs.CV

    ERM++: An Improved Baseline for Domain Generalization

    Authors: Piotr Teterwak, Kuniaki Saito, Theodoros Tsiligkaridis, Kate Saenko, Bryan A. Plummer

    Abstract: Domain Generalization (DG) measures a classifier's ability to generalize to new distributions of data it was not trained on. Recent work has shown that a hyperparameter-tuned Empirical Risk Minimization (ERM) training procedure, that is simply minimizing the empirical risk on the source domains, can outperform most existing DG methods. ERM has achieved such strong results while only tuning hyper-p…

    Submitted 26 March, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: An improved baseline for Domain Generalization

  12. arXiv:2303.14744  [pdf, other]

    cs.CV

    Mind the Backbone: Minimizing Backbone Distortion for Robust Object Detection

    Authors: Kuniaki Saito, Donghyun Kim, Piotr Teterwak, Rogerio Feris, Kate Saenko

    Abstract: Building object detectors that are robust to domain shifts is critical for real-world applications. Prior approaches fine-tune a pre-trained backbone and risk overfitting it to in-distribution (ID) data and distorting features useful for out-of-distribution (OOD) generalization. We propose to use Relative Gradient Norm (RGN) as a way to measure the vulnerability of a backbone to feature distortion…

    Submitted 15 May, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: Project page: http://ai.bu.edu/mind_back/

  13. arXiv:2302.03084  [pdf, other]

    cs.CV

    Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval

    Authors: Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister

    Abstract: In Composed Image Retrieval (CIR), a user combines a query image with text to describe their intended target. Existing methods rely on supervised learning of CIR models using labeled triplets consisting of the query image, text specification, and the target image. Labeling such triplets is expensive and hinders broad applicability of CIR. In this work, we propose to study an important task, Zero-S…

    Submitted 15 May, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: CVPR2023

  14. arXiv:2301.12686  [pdf, other]

    cs.LG cs.AI cs.CV cs.SD eess.AS

    GibbsDDRM: A Partially Collapsed Gibbs Sampler for Solving Blind Inverse Problems with Denoising Diffusion Restoration

    Authors: Naoki Murata, Koichi Saito, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Yuki Mitsufuji, Stefano Ermon

    Abstract: Pre-trained diffusion models have been successfully used as priors in a variety of linear inverse problems, where the goal is to reconstruct a signal from noisy linear measurements. However, existing approaches require knowledge of the linear operator. In this paper, we propose GibbsDDRM, an extension of Denoising Diffusion Restoration Models (DDRM) to a blind setting in which the linear measureme…

    Submitted 27 June, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  15. arXiv:2301.04883  [pdf, other]

    cs.CL cs.CV

    SlideVQA: A Dataset for Document Visual Question Answering on Multiple Images

    Authors: Ryota Tanaka, Kyosuke Nishida, Kosuke Nishida, Taku Hasegawa, Itsumi Saito, Kuniko Saito

    Abstract: Visual question answering on document images that contain textual, visual, and layout information, called document VQA, has received much attention recently. Although many datasets have been proposed for developing document VQA systems, most of the existing datasets focus on understanding the content relationships within a single image and not across multiple images. In this study, we propose a ne…

    Submitted 12 January, 2023; originally announced January 2023.

    Comments: Accepted by AAAI2023

  16. arXiv:2212.13120  [pdf, other]

    cond-mat.mtrl-sci cs.LG physics.comp-ph

    Neural Structure Fields with Application to Crystal Structure Autoencoders

    Authors: Naoya Chiba, Yuta Suzuki, Tatsunori Taniai, Ryo Igarashi, Yoshitaka Ushiku, Kotaro Saito, Kanta Ono

    Abstract: Representing crystal structures of materials to facilitate determining them via neural networks is crucial for enabling machine-learning applications involving crystal structure estimation. Among these applications, the inverse design of materials can contribute to explore materials with desired properties without relying on luck or serendipity. We propose neural structure fields (NeSF) as an accu…

    Submitted 13 December, 2023; v1 submitted 8 December, 2022; originally announced December 2022.

    Comments: 17 pages, 7 figures, 4 tables. 15 pages Supplementary Information

    Journal ref: Communications Materials (2023)

  17. arXiv:2211.04124  [pdf, other]

    eess.AS cs.LG cs.SD

    Unsupervised vocal dereverberation with diffusion-based generative models

    Authors: Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji

    Abstract: Removing reverb from reverberant music is a necessary technique to clean up audio for downstream music manipulations. Reverberation of music contains two categories, natural reverb, and artificial reverb. Artificial reverb has a wider diversity than natural reverb due to its various parameter setups and reverberation types. However, recent supervised dereverberation methods may fail because they r…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: 6 pages, 2 figures, submitted to ICASSP 2023

  18. arXiv:2206.01125  [pdf, other]

    cs.CV

    Prefix Conditioning Unifies Language and Label Supervision

    Authors: Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, Chun-Liang Li, Chen-Yu Lee, Kate Saenko, Tomas Pfister

    Abstract: Image-classification datasets have been used to pretrain image recognition models. Recently, web-scale image-caption datasets have emerged as a source of powerful pretraining alternative. Image-caption datasets are more "open-domain", containing a wider variety of scene types and vocabulary words than traditional classification datasets, and models trained on these datasets have demonstrated str…

    Submitted 15 May, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: CVPR2023

  19. arXiv:2203.03119  [pdf, other]

    cs.DC cs.CR cs.CY

    Fabchain: Managing Audit-able 3D Print Job over Blockchain

    Authors: Ryosuke Abe, Shigeya Suzuki, Kenji Saito, Hiroya Tanaka, Osamu Nakamura, Jun Murai

    Abstract: Improvements in fabrication devices such as 3D printers are becoming possible for personal fabrication to freely fabricate any products. To clarify who is liable for the product, the fabricator should keep the fabrication history in an immutable and sustainably accessible manner. In this paper, we propose a new scheme, "Fabchain," that can record the fabrication history in such a manner. By utiliz…

    Submitted 6 March, 2022; originally announced March 2022.

  20. arXiv:2112.01698  [pdf, other]

    cs.CV

    Learning to Detect Every Thing in an Open World

    Authors: Kuniaki Saito, Ping Hu, Trevor Darrell, Kate Saenko

    Abstract: Many open-world applications require the detection of novel objects, yet state-of-the-art object detection and instance segmentation networks do not excel at this task. The key issue lies in their assumption that regions without any annotations should be suppressed as negatives, which teaches the model to treat the unannotated objects as background. To address this issue, we propose a simple yet s…

    Submitted 12 April, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Project page is available at https://ksaito-ut.github.io/openworld_ldet/

  21. arXiv:2108.12544  [pdf, ps, other]

    cs.IT math.CO

    Construction for both self-dual codes and LCD codes

    Authors: Keita Ishizuka, Ken Saito

    Abstract: From a given $[n, k]$ code $C$, we give a method for constructing many $[n, k]$ codes $C'$ such that the hull dimensions of $C$ and $C'$ are identical. This method can be applied to constructions of both self-dual codes and linear complementary dual codes (LCD codes for short). Using the method, we construct 661 new inequivalent extremal doubly even $[56, 28, 12]$ codes. Furthermore, constructing…

    Submitted 27 August, 2021; originally announced August 2021.

    MSC Class: 94B05

  22. arXiv:2108.10860  [pdf, other]

    cs.CV

    Tune it the Right Way: Unsupervised Validation of Domain Adaptation via Soft Neighborhood Density

    Authors: Kuniaki Saito, Donghyun Kim, Piotr Teterwak, Stan Sclaroff, Trevor Darrell, Kate Saenko

    Abstract: Unsupervised domain adaptation (UDA) methods can dramatically improve generalization on unlabeled target domains. However, optimal hyper-parameter selection is critical to achieving high accuracy and avoiding negative transfer. Supervised hyper-parameter validation is not possible without labeled target data, which raises the question: How can we validate unsupervised adaptation techniques in a re…

    Submitted 24 August, 2021; originally announced August 2021.

    Comments: ICCV2021

  23. arXiv:2107.11011  [pdf, other]

    cs.LG

    VisDA-2021 Competition Universal Domain Adaptation to Improve Performance on Out-of-Distribution Data

    Authors: Dina Bashkirova, Dan Hendrycks, Donghyun Kim, Samarth Mishra, Kate Saenko, Kuniaki Saito, Piotr Teterwak, Ben Usman

    Abstract: Progress in machine learning is typically measured by training and testing a model on the same distribution of data, i.e., the same domain. This over-estimates future accuracy on out-of-distribution data. The Visual Domain Adaptation (VisDA) 2021 competition tests models' ability to adapt to novel test distributions and handle distributional shift. We set up unsupervised domain adaptation challeng…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: Neurips 2021 Competition Track

  24. arXiv:2105.14148  [pdf, other]

    cs.CV

    OpenMatch: Open-set Consistency Regularization for Semi-supervised Learning with Outliers

    Authors: Kuniaki Saito, Donghyun Kim, Kate Saenko

    Abstract: Semi-supervised learning (SSL) is an effective means to leverage unlabeled data to improve a model's performance. Typical SSL methods like FixMatch assume that labeled and unlabeled data share the same label space. However, in practice, unlabeled data can contain categories unseen in the labeled set, i.e., outliers, which can significantly harm the performance of SSL algorithms. To address this pr…

    Submitted 24 August, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: Code https://github.com/VisionLearningGroup/OP_Match

  25. arXiv:2105.12315  [pdf, other]

    eess.AS cs.LG cs.SD

    Training Speech Enhancement Systems with Noisy Speech Datasets

    Authors: Koichi Saito, Stefan Uhlich, Giorgio Fabbro, Yuki Mitsufuji

    Abstract: Recently, deep neural network (DNN)-based speech enhancement (SE) systems have been used with great success. During training, such systems require clean speech data - ideally, in large quantity with a variety of acoustic conditions, many different speaker characteristics and for a given sampling rate (e.g., 48kHz for fullband SE). However, obtaining such clean speech data is not straightforward -…

    Submitted 25 May, 2021; originally announced May 2021.

    Comments: 5 pages, 3 figures, submitted to WASPAA2021

  26. arXiv:2105.04079  [pdf, ps, other]

    cs.SD cs.LG eess.AS

    Sampling-Frequency-Independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method

    Authors: Koichi Saito, Tomohiko Nakamura, Kohei Yatabe, Yuma Koizumi, Hiroshi Saruwatari

    Abstract: Audio source separation is often used as preprocessing of various applications, and one of its ultimate goals is to construct a single versatile model capable of dealing with the varieties of audio signals. Since sampling frequency, one of the audio signal varieties, is usually application specific, the preceding audio source separation model should be able to deal with audio signals of all sampli…

    Submitted 9 May, 2021; originally announced May 2021.

    Comments: 5 pages, 3 figures, accepted for European Signal Processing Conference 2021 (EUSIPCO 2021)

  27. arXiv:2104.07432  [pdf, ps, other]

    cs.IT math.CO

    On the existence of quaternary Hermitian LCD codes with Hermitian dual distance $1$

    Authors: Keita Ishizuka, Ken Saito

    Abstract: For $k \ge 2$ and a positive integer $d_0$, we show that if there exists no quaternary Hermitian linear complementary dual $[n,k,d]$ code with $d \ge d_0$ and Hermitian dual distance greater than or equal to $2$, then there exists no quaternary Hermitian linear complementary dual $[n,k,d]$ code with $d \ge d_0$ and Hermitian dual distance $1$. As a consequence, we generalize a result by Araya, Har…

    Submitted 15 April, 2021; originally announced April 2021.

    MSC Class: 94B05

  28. arXiv:2104.03344  [pdf, other]

    cs.CV

    OVANet: One-vs-All Network for Universal Domain Adaptation

    Authors: Kuniaki Saito, Kate Saenko

    Abstract: Universal Domain Adaptation (UNDA) aims to handle both domain-shift and category-shift between two datasets, where the main challenge is to transfer knowledge while rejecting unknown classes which are absent in the labeled source data but present in the unlabeled target data. Existing methods manually set a threshold to reject unknown samples based on validation or a pre-defined ratio of unknown s…

    Submitted 24 August, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Accepted by ICCV2021. Code: https://github.com/VisionLearningGroup/OVANet

  29. arXiv:2103.16141  [pdf, ps, other]

    stat.ML cs.AR cs.LG

    Structured Inverted-File k-Means Clustering for High-Dimensional Sparse Data

    Authors: Kazuo Aoyama, Kazumi Saito

    Abstract: This paper presents an architecture-friendly k-means clustering algorithm called SIVF for a large-scale and high-dimensional sparse data set. Algorithm efficiency on time is often measured by the number of costly operations such as similarity calculations. In practice, however, it depends greatly on how the algorithm adapts to an architecture of the computer system which it is executed on. Our pro…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: 10 pages, 12 figures

  30. arXiv:2103.07669  [pdf, ps, other]

    cs.CR cs.CY

    Privacy-Preserving Infection Exposure Notification without Trust in Third Parties

    Authors: Kenji Saito, Mitsuru Iwamura

    Abstract: In response to the COVID-19 pandemic, Bluetooth-based contact tracing has been deployed in many countries with the help of the developers of smartphone operating systems that provide APIs for privacy-preserving exposure notification. However, it has been assumed by the design that the OS developers, smartphone vendors, or governments will not violate people's privacy. We propose a privacy-preservi…

    Submitted 13 March, 2021; originally announced March 2021.

    Comments: 17 pages, 5 figures, 1 table. Submitted to Journal of Communications and Networks (JCN)

  31. Lightweight Selective Disclosure for Verifiable Documents on Blockchain

    Authors: Kenji Saito, Satoki Watanabe

    Abstract: To achieve lightweight selective disclosure for protecting privacy of document holders, we propose an XML format for documents that can hide arbitrary elements using a cryptographic hash function and salts, which allows to be partially digitally signed and efficiently verified, as well as a JSON format that can be converted to such XML. The documents can be efficiently proven to exist by represent…

    Submitted 9 October, 2021; v1 submitted 13 March, 2021; originally announced March 2021.

    Comments: 10 pages, 4 figures, 1 table

    Journal ref: ICT Express, Volume 7, Issue 3, September 2021, Pages 290-294

  32. arXiv:2103.03209  [pdf, ps, other]

    cs.CR cs.CY

    Requirement Analyses and Evaluations of Blockchain Platforms per Possible Use Cases

    Authors: Kenji Saito, Akimitsu Shiseki, Mitsuyasu Takada, Hiroki Yamamoto, Masaaki Saitoh, Hiroaki Ohkawa, Hirofumi Andou, Naotake Miyamoto, Kazuaki Yamakawa, Kiyoshi Kurakawa, Tomohiro Yabushita, Yuji Yamada, Go Masuda, Kazuyuki Masuda

    Abstract: It is said that blockchain will contribute to the digital transformation of society in a wide range of ways, from the management of public and private documents to the traceability in various industries, as well as digital currencies. A number of so-called blockchain platforms have been developed, and experiments and applications have been carried out on them. But are these platforms really conduc…

    Submitted 4 March, 2021; originally announced March 2021.

    Comments: 50 pages, 3 figures

  33. arXiv:2011.05442  [pdf, ps, other]

    cs.CR cs.CY

    Proof of Authenticity of Logistics Information with Passive RFID Tags and Blockchain

    Authors: Hiroshi Watanabe, Kenji Saito, Satoshi Miyazaki, Toshiharu Okada, Hiroyuki Fukuyama, Tsuneo Kato, Katsuo Taniguchi

    Abstract: In tracing the (robotically automated) logistics of large quantities of goods, inexpensive passive RFID tags are preferred for cost reasons. Accordingly, security between such tags and readers have primarily been studied among many issues of RFID. However, the authenticity of data cannot be guaranteed if logistics services can give false information. Although the use of blockchain is often discuss…

    Submitted 10 November, 2020; originally announced November 2020.

    Comments: 30 pages, 11 figures

  34. arXiv:2008.00348  [pdf, other]

    cs.CV

    Self-supervised Visual Attribute Learning for Fashion Compatibility

    Authors: Donghyun Kim, Kuniaki Saito, Samarth Mishra, Stan Sclaroff, Kate Saenko, Bryan A Plummer

    Abstract: Many self-supervised learning (SSL) methods have been successful in learning semantically meaningful visual representations by solving pretext tasks. However, prior work in SSL focuses on tasks like object recognition or detection, which aim to learn object shapes and assume that the features should be invariant to concepts like colors and textures. Thus, these SSL methods perform poorly on downst…

    Submitted 11 August, 2021; v1 submitted 1 August, 2020; originally announced August 2020.

    Comments: Accepted to VIPriors Workshop ICCV 2021

  35. arXiv:2007.07431  [pdf, other]

    cs.CV

    COCO-FUNIT: Few-Shot Unsupervised Image Translation with a Content Conditioned Style Encoder

    Authors: Kuniaki Saito, Kate Saenko, Ming-Yu Liu

    Abstract: Unsupervised image-to-image translation intends to learn a mapping of an image in a given domain to an analogous image in a different domain, without explicit supervision of the mapping. Few-shot unsupervised image-to-image translation further attempts to generalize the model to an unseen domain by leveraging example images of the unseen domain provided at inference time. While remarkably successf…

    Submitted 28 July, 2020; v1 submitted 14 July, 2020; originally announced July 2020.

    Comments: The paper will be presented at the European Conference on Computer Vision (ECCV) 2020

  36. arXiv:2003.08264  [pdf, other]

    cs.CV

    Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels

    Authors: Donghyun Kim, Kuniaki Saito, Tae-Hyun Oh, Bryan A. Plummer, Stan Sclaroff, Kate Saenko

    Abstract: Existing unsupervised domain adaptation methods aim to transfer knowledge from a label-rich source domain to an unlabeled target domain. However, obtaining labels for some source domains may be very expensive, making complete labeling as used in prior work impractical. In this work, we investigate a new domain adaptation scenario with sparsely labeled source data, where only a few examples in the…

    Submitted 18 March, 2020; originally announced March 2020.

  37. arXiv:2002.09094  [pdf, ps, other]

    stat.ML cs.LG

    Inverted-File k-Means Clustering: Performance Analysis

    Authors: Kazuo Aoyama, Kazumi Saito, Tetsuo Ikeda

    Abstract: This paper presents an inverted-file k-means clustering algorithm (IVF) suitable for a large-scale sparse data set with potentially numerous classes. Given such a data set, IVF efficiently works at high-speed and with low memory consumption, which keeps the same solution as a standard Lloyd's algorithm. The high performance arises from two distinct data representations. One is a sparse expression…

    Submitted 20 February, 2020; originally announced February 2020.

    Comments: 15 pages, 20 figures

  38. arXiv:2002.07953  [pdf, other]

    cs.CV

    Universal Domain Adaptation through Self Supervision

    Authors: Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Kate Saenko

    Abstract: Unsupervised domain adaptation methods traditionally assume that all source categories are present in the target domain. In practice, little may be known about the category overlap between the two domains. While some methods address target settings with either partial or open-set categories, they assume that the particular setting is known a priori. We propose a more universally applicable domain…

    Submitted 5 October, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: Accepted to NeurIPS2020

  39. arXiv:1909.03493  [pdf, other]

    cs.CV cs.CL

    MULE: Multimodal Universal Language Embedding

    Authors: Donghyun Kim, Kuniaki Saito, Kate Saenko, Stan Sclaroff, Bryan A. Plummer

    Abstract: Existing vision-language methods typically support two languages at a time at most. In this paper, we present a modular approach which can easily be incorporated into existing vision-language methods in order to support many languages. We accomplish this by learning a single shared Multimodal Universal Language Embedding (MULE) which has been visually-semantically aligned across all languages. The…

    Submitted 28 December, 2019; v1 submitted 8 September, 2019; originally announced September 2019.

    Comments: Accepted as an oral at AAAI 2020

  40. Remark on subcodes of linear complementary dual codes

    Authors: Masaaki Harada, Ken Saito

    Abstract: We show that any ternary Euclidean (resp. quaternary Hermitian) linear complementary dual $[n,k]$ code contains a Euclidean (resp. Hermitian) linear complementary dual $[n,k-1]$ subcode for $2 \le k \le n$. As a consequence, we derive a bound on the largest minimum weights among ternary Euclidean linear complementary dual codes and quaternary Hermitian linear complementary dual codes.

    Submitted 23 August, 2019; originally announced August 2019.

    Comments: 7 pages

    Journal ref: Information Processing Letters 159-160 (2020) 105963

  41. arXiv:1908.08661  [pdf, ps, other]

    cs.IT math.CO

    On the minimum weights of binary LCD codes and ternary LCD codes

    Authors: Makoto Araya, Masaaki Harada, Ken Saito

    Abstract: Linear complementary dual (LCD) codes are linear codes that intersect with their dual codes trivially. We study the largest minimum weight $d_2(n,k)$ among all binary LCD $[n,k]$ codes and the largest minimum weight $d_3(n,k)$ among all ternary LCD $[n,k]$ codes. The largest minimum weights $d_2(n,5)$ and $d_3(n,4)$ are partially determined. We also determine the largest minimum weights…

    Submitted 18 November, 2020; v1 submitted 23 August, 2019; originally announced August 2019.

    Comments: 25 pages

  42. arXiv:1908.03294  [pdf, ps, other]

    math.CO cs.IT

    Characterization and classification of optimal LCD codes

    Authors: Makoto Araya, Masaaki Harada, Ken Saito

    Abstract: Linear complementary dual (LCD) codes are linear codes that intersect with their dual trivially. We give a characterization of LCD codes over $\mathbb{F}_q$ having large minimum weights for $q \in \{2,3\}$. Using the characterization, we determine the largest minimum weights among LCD $[n,k]$ codes over $\mathbb{F}_q$ for $(q,k) \in \{(2,4), (3,2),(3,3)\}$. Moreover, we give a complete classificat…

    Submitted 4 January, 2021; v1 submitted 8 August, 2019; originally announced August 2019.

    Comments: 33 pages

  43. Quaternary Hermitian linear complementary dual codes

    Authors: Makoto Araya, Masaaki Harada, Ken Saito

    Abstract: The largest minimum weights among quaternary Hermitian linear complementary dual codes are known for dimension $2$. In this paper, we give some conditions for the nonexistence of quaternary Hermitian linear complementary dual codes with large minimum weights. As an application, we completely determine the largest minimum weights for dimension $3$, by using a classification of some quaternary codes…

    Submitted 27 December, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

    Comments: 24 pages, some corrections are made

    Journal ref: IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 66, 2020 2751-2759

  44. arXiv:1904.06487  [pdf, other]

    cs.CV

    Semi-supervised Domain Adaptation via Minimax Entropy

    Authors: Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Trevor Darrell, Kate Saenko

    Abstract: Contemporary domain adaptation methods are very effective at aligning feature distributions of source and target domains without any target supervision. However, we show that these techniques perform poorly when even a few labeled examples are available in the target. To address this semi-supervised domain adaptation (SSDA) setting, we propose a novel Minimax Entropy (MME) approach that adversaria…

    Submitted 14 September, 2019; v1 submitted 13 April, 2019; originally announced April 2019.

    Comments: accepted to ICCV2019. ICCV paper version

  45. arXiv:1812.07405  [pdf, ps, other]

    cs.LG stat.ML

    TWINs: Two Weighted Inconsistency-reduced Networks for Partial Domain Adaptation

    Authors: Toshihiko Matsuura, Kuniaki Saito, Tatsuya Harada

    Abstract: The task of unsupervised domain adaptation is proposed to transfer the knowledge of a label-rich domain (source domain) to a label-scarce domain (target domain). Matching feature distributions between different domains is a widely applied method for the aforementioned task. However, the method does not perform well when classes in the two domains are not identical. Specifically, when the classes o…

    Submitted 18 December, 2018; originally announced December 2018.

  46. arXiv:1812.04798  [pdf, other]

    cs.CV

    Strong-Weak Distribution Alignment for Adaptive Object Detection

    Authors: Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada, Kate Saenko

    Abstract: We propose an approach for unsupervised adaptation of object detectors from label-rich to label-poor domains which can significantly reduce annotation costs associated with detection. Recently, approaches that align distributions of source and target images using an adversarial loss have been proven effective for adapting object classifiers. However, for object detection, fully matching the entire…

    Submitted 5 April, 2019; v1 submitted 11 December, 2018; originally announced December 2018.

    Comments: Accepted to CVPR2019, project page http://cs-people.bu.edu/keisaito/research/CVPR2019.html

  47. arXiv:1812.04351  [pdf, other]

    cs.CV

    Multichannel Semantic Segmentation with Unsupervised Domain Adaptation

    Authors: Kohei Watanabe, Kuniaki Saito, Yoshitaka Ushiku, Tatsuya Harada

    Abstract: Most contemporary robots have depth sensors, and research on semantic segmentation with RGBD images has shown that depth images boost the accuracy of segmentation. Since it is time-consuming to annotate images with semantic labels per pixel, it would be ideal if we could avoid this laborious work by utilizing an existing dataset or a synthetic dataset which we can generate on our own. Robot motion…

    Submitted 11 December, 2018; originally announced December 2018.

    Comments: published on AUTONUE Workshops of ECCV 2018

  48. arXiv:1809.01924  [pdf, other]

    physics.med-ph cs.CV

    Dynamic Block Matching to assess the longitudinal component of the dense motion field of the carotid artery wall in B-mode ultrasound sequences -- Association with coronary artery disease

    Authors: Guillaume Zahnd, Kozue Saito, Kazuyuki Nagatsuka, Yoshito Otake, Yoshinobu Sato

    Abstract: Purpose: The motion of the common carotid artery tissue layers along the vessel axis during the cardiac cycle, observed in ultrasound imaging, is associated with the presence of established cardiovascular risk factors. However, the vast majority of the methods are based on the tracking of a single point, thus failing to capture the overall motion of the entire arterial wall. The aim of this work i…

    Submitted 18 May, 2020; v1 submitted 6 September, 2018; originally announced September 2018.

  49. arXiv:1806.09755  [pdf, other]

    cs.CV

    Syn2Real: A New Benchmark for Synthetic-to-Real Visual Domain Adaptation

    Authors: Xingchao Peng, Ben Usman, Kuniaki Saito, Neela Kaushik, Judy Hoffman, Kate Saenko

    Abstract: Unsupervised transfer of object recognition models from synthetic to real data is an important problem with many potential applications. The challenge is how to "adapt" a model trained on simulated images so that it performs well on real-world data without any additional supervision. Unfortunately, current benchmarks for this problem are limited in size and task diversity. In this paper, we presen…

    Submitted 25 June, 2018; originally announced June 2018.

  50. arXiv:1804.10427  [pdf, other]

    cs.CV

    Open Set Domain Adaptation by Backpropagation

    Authors: Kuniaki Saito, Shohei Yamamoto, Yoshitaka Ushiku, Tatsuya Harada

    Abstract: Numerous algorithms have been proposed for transferring knowledge from a label-rich domain (source) to a label-scarce domain (target). Almost all of them are proposed for a closed-set scenario, where the source and the target domain completely share the class of their samples. We call the shared class the "known class." However, in practice, when samples in target domain are not labele…

    Submitted 6 July, 2018; v1 submitted 27 April, 2018; originally announced April 2018.

    Comments: Accepted by ECCV2018