
Showing 1–50 of 50 results for author: Sato, H

Searching in archive cs. Search in all archives.
  1. arXiv:2407.01857  [pdf, other]

    eess.AS cs.SD eess.SP

    SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

    Authors: Hiroshi Sato, Takafumi Moriya, Masato Mimura, Shota Horiguchi, Tsubasa Ochiai, Takanori Ashihara, Atsushi Ando, Kentaro Shinayama, Marc Delcroix

    Abstract: Real-time target speaker extraction (TSE) is intended to extract the desired speaker's voice from the observed mixture of multiple speakers in a streaming manner. Implementing real-time TSE is challenging as the computational complexity must be reduced to provide real-time operation. This work introduces to Conv-TasNet-based TSE a new architecture based on state space modeling (SSM) that has been…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted to Interspeech 2024

  2. 1-D CNN-Based Online Signature Verification with Federated Learning

    Authors: Lingfeng Zhang, Yuheng Guo, Yepeng Ding, Hiroyuki Sato

    Abstract: Online signature verification plays a pivotal role in security infrastructures. However, conventional online signature verification models pose significant risks to data privacy, especially during training processes. To mitigate these concerns, we propose a novel federated learning framework that leverages 1-D Convolutional Neural Networks (CNN) for online signature verification. Furthermore, our…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 8 pages, 11 figures, 1 table

  3. arXiv:2406.03713  [pdf]

    cs.RO

    Gait-Adaptive Navigation and Human Searching in field with Cyborg Insect

    Authors: Phuoc Thanh Tran-Ngoc, Huu Duoc Nguyen, Duc Long Le, Rui Li, Bing Sheng Chong, Hirotaka Sato

    Abstract: This study focuses on improving the ability of cyborg insects to navigate autonomously during search and rescue missions in outdoor environments. We propose an algorithm that leverages data from an IMU to calculate orientation and position based on the insect's walking gait. These computed factors serve as essential feedback channels across 3 phases of our exploration. Our method functions without…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: 35 pages, 9 figures

  4. Model-Driven Security Analysis of Self-Sovereign Identity Systems

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Best practices of self-sovereign identity (SSI) are being intensively explored in academia and industry. Reusable solutions obtained from best practices are generalized as architectural patterns for systematic analysis and design reference, which significantly boosts productivity and increases the dependability of future implementations. For security-sensitive projects, architects make architectur…

    Submitted 2 June, 2024; originally announced June 2024.

  5. arXiv:2405.09033  [pdf, other]

    quant-ph cs.DC

    Accelerating Decision Diagram-based Multi-node Quantum Simulation with Ring Communication and Automatic SWAP Insertion

    Authors: Yusuke Kimura, Shaowen Li, Hiroyuki Sato, Masahiro Fujita

    Abstract: An N-bit quantum state requires a vector of length $2^N$, leading to an exponential increase in the required memory with N in conventional statevector-based quantum simulators. A proposed solution to this issue is the decision diagram-based quantum simulator, which can significantly decrease the necessary memory and is expected to operate faster for specific quantum circuits. However, decision dia…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted at IEEE QSW 2024

  6. arXiv:2404.14860  [pdf, other]

    eess.AS cs.SD

    Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance

    Authors: Tsubasa Ochiai, Kazuma Iwamoto, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

    Abstract: It is challenging to improve automatic speech recognition (ASR) performance in noisy conditions with a single-channel speech enhancement (SE) front-end. This is generally attributed to the processing distortions caused by the nonlinear processing of single-channel SE front-ends. However, the causes of such degraded ASR performance have not been fully investigated. How to design single-channel SE f…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 13 pages, 6 figures, Submitted to IEEE/ACM Trans. Audio, Speech, and Language Processing

  7. arXiv:2404.10376  [pdf, other]

    cs.SE

    Hunting DeFi Vulnerabilities via Context-Sensitive Concolic Verification

    Authors: Yepeng Ding, Arthur Gervais, Roger Wattenhofer, Hiroyuki Sato

    Abstract: Decentralized finance (DeFi) is revolutionizing the traditional centralized finance paradigm with its attractive features such as high availability, transparency, and tamper-proofing. However, attacks targeting DeFi services have severely damaged the DeFi market, as evidenced by our investigation of 80 real-world DeFi incidents from 2017 to 2022. Existing methods, based on symbolic execution, mode…

    Submitted 16 April, 2024; originally announced April 2024.

  8. arXiv:2403.17496  [pdf, other]

    cs.CV cs.GR

    Dr.Hair: Reconstructing Scalp-Connected Hair Strands without Pre-training via Differentiable Rendering of Line Segments

    Authors: Yusuke Takimoto, Hikari Takehara, Hiroyuki Sato, Zihao Zhu, Bo Zheng

    Abstract: In the film and gaming industries, achieving a realistic hair appearance typically involves the use of strands originating from the scalp. However, reconstructing these strands from observed surface images of hair presents significant challenges. The difficulty in acquiring Ground Truth (GT) data has led state-of-the-art learning-based methods to rely on pre-training with manually prepared synthet…

    Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

    Comments: CVPR 2024

  9. arXiv:2403.17392  [pdf, other]

    cs.RO eess.SY nlin.AO

    Natural-artificial hybrid swarm: Cyborg-insect group navigation in unknown obstructed soft terrain

    Authors: Yang Bai, Phuoc Thanh Tran Ngoc, Huu Duoc Nguyen, Duc Long Le, Quang Huy Ha, Kazuki Kai, Yu Xiang See To, Yaosheng Deng, Jie Song, Naoki Wakamiya, Hirotaka Sato, Masaki Ogura

    Abstract: Navigating multi-robot systems in complex terrains has always been a challenging task. This is due to the inherent limitations of traditional robots in collision avoidance, adaptation to unknown environments, and sustained energy efficiency. In order to overcome these limitations, this research proposes a solution by integrating living insects with miniature electronic controllers to enable roboti…

    Submitted 27 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  10. arXiv:2401.17053  [pdf, other]

    cs.CV cs.AI cs.GR

    BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation

    Authors: Zhennan Wu, Yang Li, Han Yan, Taizhang Shang, Weixuan Sun, Senbo Wang, Ruikai Cui, Weizhe Liu, Hiroyuki Sato, Hongdong Li, Pan Ji

    Abstract: We present BlockFusion, a diffusion-based model that generates 3D scenes as unit blocks and seamlessly incorporates new blocks to extend the scene. BlockFusion is trained using datasets of 3D blocks that are randomly cropped from complete 3D scene meshes. Through per-block fitting, all training blocks are converted into the hybrid neural fields: with a tri-plane containing the geometry features, f…

    Submitted 23 May, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: ACM Transactions on Graphics (SIGGRAPH'24). Code: https://yang-l1.github.io/blockfusion

  11. arXiv:2401.05111  [pdf, other]

    cs.SD cs.CL cs.LG eess.AS

    Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters

    Authors: Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima

    Abstract: The zero-shot text-to-speech (TTS) method, based on speaker embeddings extracted from reference speech using self-supervised learning (SSL) speech representations, can reproduce speaker characteristics very accurately. However, this approach suffers from degradation in speech synthesis quality when the reference speech contains noise. In this paper, we propose a noise-robust zero-shot TTS method.…

    Submitted 10 January, 2024; originally announced January 2024.

    Comments: 5 pages, 3 figures, Accepted to IEEE ICASSP 2024

  12. arXiv:2312.14511  [pdf]

    cs.RO eess.SY

    3D Programming of Patterned Heterogeneous Interface for 4D Smart Robotics

    Authors: Kewei Song, Chunfeng Xiong, Ze Zhang, Kunlin Wu, Weiyang Wan, Yifan Wang, Shinjiro Umezu, Hirotaka Sato

    Abstract: Shape memory structures are playing an important role in many cutting-edge intelligent fields. However, the existing technologies can only realize 4D printing of a single polymer or metal, which limits practical applications. Here, we report a construction strategy for TSMP/M heterointerface, which uses Pd2+-containing shape memory polymer (AP-SMR) to induce electroless plating reaction and relies…

    Submitted 22 December, 2023; originally announced December 2023.

    Comments: 37 Pages, 11 Figures

  13. Parallelizing quantum simulation with decision diagrams

    Authors: Shaowen Li, Yusuke Kimura, Hiroyuki Sato, Junwei Yu, Masahiro Fujita

    Abstract: Recent technological advancements show promise in leveraging quantum mechanical phenomena for computation. This brings substantial speed-ups to problems that were once considered to be intractable in the classical world. However, the physical realization of quantum computers is still far away from us, and a majority of research work is done using quantum simulators running on classical computers. C…

    Submitted 3 December, 2023; originally announced December 2023.

  14. arXiv:2309.14364  [pdf, other]

    cs.HC cs.GR cs.MA cs.NE cs.SE

    Automata Quest: NCAs as a Video Game Life Mechanic

    Authors: Hiroki Sato, Tanner Lund, Takahide Yoshida, Atsushi Masumori

    Abstract: We study life over the course of video game history as represented by their mechanics. While there have been some variations depending on genre or "character type", we find that most games converge to a similar representation. We also examine the development of Conway's Game of Life (one of the first zero player games) and related automata that have developed over the years. With this history in m…

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: This article was submitted to and presented at Alife for and from Video Games Workshop at ALIFE2023, Sapporo (Japan)

    Journal ref: Alife for and from Video Games Workshop at ALIFE2023

  15. Online Estimation of Self-Body Deflection With Various Sensor Data Based on Directional Statistics

    Authors: Hiroya Sato, Kento Kawaharazuka, Tasuku Makabe, Kei Okada, Masayuki Inaba

    Abstract: In this paper, we propose a method for online estimation of the robot's posture. Our method uses von Mises and Bingham distributions as probability distributions of joint angles and 3D orientation, which are used in directional statistics. We constructed a particle filter using these distributions and configured a system to estimate the robot's posture from various sensor information (e.g., joint…

    Submitted 6 June, 2023; originally announced June 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  16. arXiv:2306.02273  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    End-to-End Joint Target and Non-Target Speakers ASR

    Authors: Ryo Masumura, Naoki Makishima, Taiga Yamane, Yoshihiko Yamazaki, Saki Mizuno, Mana Ihori, Mihiro Uchida, Keita Suzuki, Hiroshi Sato, Tomohiro Tanaka, Akihiko Takashima, Satoshi Suzuki, Takafumi Moriya, Nobukatsu Hojo, Atsushi Ando

    Abstract: This paper proposes a novel automatic speech recognition (ASR) system that can transcribe individual speaker's speech while identifying whether they are target or non-target speakers from multi-talker overlapped speech. Target-speaker ASR systems are a promising way to only transcribe a target speaker's speech by enrolling the target speaker's information. However, in conversational ASR applicatio…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: Accepted at Interspeech 2023

  17. arXiv:2305.18947  [pdf, other]

    cs.CV

    A Probabilistic Rotation Representation for Symmetric Shapes With an Efficiently Computable Bingham Loss Function

    Authors: Hiroya Sato, Takuya Ikeda, Koichi Nishiwaki

    Abstract: In recent years, a deep learning framework has been widely used for object pose estimation. While quaternion is a common choice for rotation representation, it cannot represent the ambiguity of the observation. In order to handle the ambiguity, the Bingham distribution is one promising solution. However, it requires complicated calculation when yielding the negative log-likelihood (NLL) loss. An a…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: substantial text overlap with arXiv:2203.04456

  18. arXiv:2305.14723  [pdf, other]

    eess.AS cs.SD

    Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss

    Authors: Hiroshi Sato, Ryo Masumura, Tsubasa Ochiai, Marc Delcroix, Takafumi Moriya, Takanori Ashihara, Kentaro Shinayama, Saki Mizuno, Mana Ihori, Tomohiro Tanaka, Nobukatsu Hojo

    Abstract: Self-supervised learning (SSL) is the latest breakthrough in speech processing, especially for label-scarce downstream tasks by leveraging massive unlabeled audio data. The noise robustness of the SSL is one of the important challenges to expanding its application. We can use speech enhancement (SE) to tackle this issue. However, the mismatch between the SE model and SSL models potentially limits…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 4 pages, 2 figures, Accepted to Interspeech 2023

  19. Leveraging Self-Sovereign Identity in Decentralized Data Aggregation

    Authors: Yepeng Ding, Hiroyuki Sato, Maro G. Machizawa

    Abstract: Data aggregation has been widely implemented as an infrastructure of data-driven systems. However, a centralized data aggregation model requires a set of strong trust assumptions to ensure security and privacy. In recent years, decentralized data aggregation has become realizable based on distributed ledger technology. Nevertheless, the lack of appropriate centralized mechanisms like identity mana…

    Submitted 21 March, 2023; originally announced March 2023.

  20. arXiv:2303.10990  [pdf]

    cs.RO

    Resilient conductive membrane synthesized by in-situ polymerisation for wearable non-invasive electronics on moving appendages of cyborg insect

    Authors: Qifeng Lin, Rui Li, Feilong Zhang, Kai Kazuki, Ong Zong Chen, Xiaodong Chen, Hirotaka Sato

    Abstract: By leveraging their high mobility and small size, insects have been combined with microcontrollers to build up cyborg insects for various practical applications. Unfortunately, all current cyborg insects rely on implanted electrodes to control their movement, which causes irreversible damage to their organs and muscles. Here, we develop a non-invasive method for cyborg insects to address above iss…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 27 pages

  21. arXiv:2210.15937  [pdf, other]

    cs.CL cs.SD eess.AS

    On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis

    Authors: Atsushi Ando, Ryo Masumura, Akihiko Takashima, Satoshi Suzuki, Naoki Makishima, Keita Suzuki, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato

    Abstract: This paper investigates the effectiveness and implementation of modality-specific large-scale pre-trained encoders for multimodal sentiment analysis (MSA). Although the effectiveness of pre-trained encoders in various fields has been reported, conventional MSA methods employ them for only linguistic modality, and their application has not been investigated. This paper compares the features yielded…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: Accepted to SLT 2022

  22. arXiv:2209.04175  [pdf, other]

    eess.AS cs.SD

    Streaming Target-Speaker ASR with Neural Transducer

    Authors: Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takahiro Shinozaki

    Abstract: Although recent advances in deep learning technology have boosted automatic speech recognition (ASR) performance in the single-talker case, it remains difficult to recognize multi-talker speech in which many voices overlap. One conventional approach to tackle this problem is to use a cascade of a speech separation or target speech extraction front-end with an ASR back-end. However, the extra compu…

    Submitted 19 September, 2022; v1 submitted 9 September, 2022; originally announced September 2022.

    Comments: Accepted to Interspeech 2022

  23. arXiv:2206.09628  [pdf, other]

    cs.LG cs.CR cs.CV

    Diversified Adversarial Attacks based on Conjugate Gradient Method

    Authors: Keiichiro Yamamura, Haruki Sato, Nariaki Tateiwa, Nozomi Hata, Toru Mitsutake, Issa Oe, Hiroki Ishikura, Katsuki Fujisawa

    Abstract: Deep learning models are vulnerable to adversarial examples, and adversarial attacks used to generate such examples have attracted considerable research interest. Although existing methods based on the steepest descent have achieved high attack success rates, ill-conditioned problems occasionally reduce their performance. To address this limitation, we utilize the conjugate gradient (CG) method, w…

    Submitted 19 July, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: Proceedings of the 39th International Conference on Machine Learning (ICML 2022)

  24. arXiv:2206.08174  [pdf, other]

    eess.AS cs.SD eess.SP

    Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations

    Authors: Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Ryo Masumura

    Abstract: Target speech extraction is a technique to extract the target speaker's voice from mixture signals using a pre-recorded enrollment utterance that characterizes the voice characteristics of the target speaker. One major difficulty of target speech extraction lies in handling variability in "intra-speaker" characteristics, i.e., characteristics mismatch between target speech and an enrollment utter…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 5 pages, 2 figures, 3 tables, Submitted to Interspeech 2022

  25. arXiv:2206.07319  [pdf]

    cs.RO

    Toward the smooth mesh climbing of a miniature robot using bioinspired soft and expandable claws

    Authors: Hong Wang, Peng Liu, Phuoc Thanh Tran Ngoc, Bing Li, Yao Li, Hirotaka Sato

    Abstract: While most micro-robots face difficulty traveling on rugged and uneven terrain, beetles can walk smoothly on the complex substrate without slipping or getting stuck on the surface due to their stiffness-variable tarsi and expandable hooks on the tip of tarsi. In this study, we found that beetles actively bent and expanded their claws regularly to crawl freely on mesh surfaces. Inspired by the craw…

    Submitted 15 June, 2022; originally announced June 2022.

  26. Self-Sovereign Identity as a Service: Architecture in Practice

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Self-sovereign identity (SSI) has gained a large amount of interest. It enables physical entities to retain ownership and control of their digital identities, which naturally forms a conceptual decentralized architecture. With the support of the distributed ledger technology (DLT), it is possible to implement this conceptual decentralized architecture in practice and further bring technical advant…

    Submitted 2 June, 2022; v1 submitted 17 May, 2022; originally announced May 2022.

  27. Efficient Autonomous Navigation for Terrestrial Insect-Machine Hybrid Systems

    Authors: Huu Duoc Nguyen, Van Than Dung, Hirotaka Sato, T. Thang Vo-Doan

    Abstract: While bio-inspired and biomimetic systems draw inspiration from living materials, biohybrid systems incorporate them with synthetic devices, allowing the exploitation of both organic and artificial advantages inside a single entity. In the challenging development of centimeter-scaled mobile robots serving unstructured territory navigations, biohybrid systems appear as a potential solution in the f…

    Submitted 19 November, 2022; v1 submitted 28 April, 2022; originally announced April 2022.

    Comments: Demonstration video can be found at http://youtu.be/p00mfxFo7VY

    Journal ref: Sensors and Actuators B: Chemical 376(A) (2023) 132988

  28. arXiv:2204.04811  [pdf, other]

    eess.AS cs.SD

    Listen only to me! How well can target speech extraction handle false alarms?

    Authors: Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Katerina Zmolikova, Hiroshi Sato, Tomohiro Nakatani

    Abstract: Target speech extraction (TSE) extracts the speech of a target speaker in a mixture given auxiliary clues characterizing the speaker, such as an enrollment utterance. TSE addresses thus the challenging problem of simultaneously performing separation and speaker identification. There has been much progress in extraction performance following the recent development of neural networks for speech enha…

    Submitted 14 July, 2022; v1 submitted 10 April, 2022; originally announced April 2022.

    Comments: Accepted to Interspeech 2022

  29. arXiv:2204.01386  [pdf, other]

    cs.GR cs.CV

    Dressi: A Hardware-Agnostic Differentiable Renderer with Reactive Shader Packing and Soft Rasterization

    Authors: Yusuke Takimoto, Hiroyuki Sato, Hikari Takehara, Keishiro Uragaki, Takehiro Tawara, Xiao Liang, Kentaro Oku, Wataru Kishimoto, Bo Zheng

    Abstract: Differentiable rendering (DR) enables various computer graphics and computer vision applications through gradient-based optimization with derivatives of the rendering equation. Most rasterization-based approaches are built on general-purpose automatic differentiation (AD) libraries and DR-specific modules handcrafted using CUDA. Such a system design mixes DR algorithm implementation and algorithm…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: 13 pages, 17 figures, EUROGRAPHICS 2022

  30. A robotic leg inspired from an insect leg

    Authors: P. Thanh Tran-Ngoc, Leslie Ziqi Lim, Jia Hui Gan, Hong Wang, T. Thang Vo-Doan, Hirotaka Sato

    Abstract: While most insect-inspired robots come with a simple tarsus such as a hemispherical foot tip, insect legs have complex tarsal structures and claws, which enable them to walk on complex terrain. Their sharp claws can smoothly attach and detach on plant surfaces by actuating a single muscle. Thus, installing insect-inspired tarsus on legged robots would improve their locomotion on complex terrain. T…

    Submitted 11 May, 2022; v1 submitted 21 March, 2022; originally announced March 2022.

    Comments: 17 pages, 10 figures

    Journal ref: Bioinspir. Biomim. 17 (2022) 056008

  31. arXiv:2203.04456  [pdf, other]

    cs.CV cs.RO

    Probabilistic Rotation Representation With an Efficiently Computable Bingham Loss Function and Its Application to Pose Estimation

    Authors: Hiroya Sato, Takuya Ikeda, Koichi Nishiwaki

    Abstract: In recent years, a deep learning framework has been widely used for object pose estimation. While quaternion is a common choice for rotation representation of 6D pose, it cannot represent an uncertainty of the observation. In order to handle the uncertainty, Bingham distribution is one promising solution because this has suitable features, such as a smooth representation over SO(3), in addition to…

    Submitted 8 March, 2022; originally announced March 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  32. arXiv:2201.06685  [pdf, other]

    eess.AS cs.SD

    How Bad Are Artifacts?: Analyzing the Impact of Speech Enhancement Errors on ASR

    Authors: Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

    Abstract: It is challenging to improve automatic speech recognition (ASR) performance in noisy conditions with single-channel speech enhancement (SE). In this paper, we investigate the causes of ASR performance degradation by decomposing the SE errors using orthogonal projection-based decomposition (OPD). OPD decomposes the SE errors into noise and artifact components. The artifact component is defined as t…

    Submitted 30 March, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

    Comments: 5 pages, 5 figures, submitted to Interspeech 2022

  33. Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition

    Authors: Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Naoyuki Kamo, Takafumi Moriya

    Abstract: The combination of a deep neural network (DNN)-based speech enhancement (SE) front-end and an automatic speech recognition (ASR) back-end is a widely used approach to implement overlapping speech recognition. However, the SE front-end generates processing artifacts that can degrade the ASR performance. We previously found that such performance degradation can occur even under fully overlapping co…

    Submitted 11 January, 2022; originally announced January 2022.

    Comments: 5 pages, 2 figures

    Journal ref: In 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6287-6291

  34. arXiv:2112.12530  [pdf, other]

    eess.SY cs.DM

    Long-Term Optimal Delivery Planning for Replacing the Liquefied Petroleum Gas Cylinder

    Authors: Akihiro Yoshida, Haruki Sato, Shiori Uchiumi, Nariaki Tateiwa, Daisuke Kataoka, Akira Tanaka, Nozomi Hata, Yousuke Yatsushiro, Ayano Ide, Hiroki Ishikura, Shingo Egi, Miyu Fujii, Hiroki Kai, Katsuki Fujisawa

    Abstract: In the daily operation of liquefied petroleum gas service, gas providers visit customers and replace cylinders if the gas is about to run out. For a long time, frequent visits to customers were required because they could not determine the amount of remaining gas without a staff visit and observation. To solve this problem, smart meters are started to be employed to acquire gas consumption more fr…

    Submitted 20 June, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

    Comments: 25 pages

    MSC Class: 90C90 (Primary) 90C27; 90C15 (Secondary)

    ACM Class: G.1.6; G.2.3

  35. arXiv:2112.11661  [pdf]

    cs.RO eess.SY physics.app-ph physics.chem-ph

    New metal-plastic hybrid additive manufacturing strategy: Fabrication of arbitrary metal-patterns on external and even internal surfaces of 3D plastic structures

    Authors: Kewei Song, Yue Cui, Tiannan Tao, Xiangyi Meng, Michinari Sone, Masahiro Yoshino, Shinjiro Umezu, Hirotaka Sato

    Abstract: Constructing precise micro-nano metal patterns on complex three-dimensional (3D) plastic parts allows the fabrication of functional devices for advanced applications. However, this patterning is currently expensive and requires complex processes with long manufacturing lead time. The present work demonstrates a process for the fabrication of micro-nano 3D metal-plastic composite structures with ar…

    Submitted 21 December, 2021; originally announced December 2021.

  36. Braking and Body Angles Control of an Insect-Computer Hybrid Robot by Electrical Stimulation of Beetle Flight Muscle in Free Flight

    Authors: T. Thang Vo-Doan, V. Than Dung, Hirotaka Sato

    Abstract: While engineers put lots of effort, resources, and time in building insect scale micro aerial vehicles (MAVs) that fly like insects, insects themselves are the real masters of flight. What if we would use living insect as platform for MAV instead? Here, we reported a flight control via electrical stimulation of a flight muscle of an insect-computer hybrid robot, which is the interface of a mountab…

    Submitted 28 November, 2021; originally announced November 2021.

    Comments: 9 pages, 7 figures, supplemental video: https://youtu.be/P9dxsSf14LY . Cyborg and Bionic Systems 2022

    Journal ref: Cyborg and Bionic Systems, vol. 2022, Article ID 9780504, 11 pages

  37. Sunspot: A Decentralized Framework Enabling Privacy for Authorizable Data Sharing on Transparent Public Blockchains

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Blockchain technologies have been boosting the development of data-driven decentralized services in a wide range of fields. However, with the spirit of full transparency, many public blockchains expose all types of data to the public such as Ethereum. Besides, the on-chain persistence of large data is significantly expensive technically and economically. These issues lead to the difficulty of shar…

    Submitted 12 May, 2022; v1 submitted 6 November, 2021; originally announced November 2021.

  38. Should We Always Separate?: Switching Between Enhanced and Observed Signals for Overlapping Speech Recognition

    Authors: Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Takafumi Moriya, Naoyuki Kamo

    Abstract: Although recent advances in deep learning technology improved automatic speech recognition (ASR), it remains difficult to recognize speech when it overlaps other people's voices. Speech separation or extraction is often used as a front-end to ASR to handle such overlapping speech. However, deep neural network-based speech enhancement can generate 'processing artifacts' as a side effect of the enha…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: 5 pages, 1 figure

    Journal ref: in Proc. Interspeech 2021, 1149-1153

  39. arXiv:2105.10869  [pdf]

    cs.RO

    Insect-Computer Hybrid System for Autonomous Search and Rescue Mission

    Authors: P. Thanh Tran-Ngoc, D. Long Le, Bing Sheng Chong, H. Duoc Nguyen, V. Than Dung, Feng Cao, Yao Li, Kazuki Kai, Jia Hui Gan, T. Thang Vo-Doan, T. Luan Nguyen, Hirotaka Sato

    Abstract: There is still a long way to go before artificial mini robots are really used for search and rescue missions in disaster-hit areas due to hindrance in power consumption, computation load of the locomotion, and obstacle-avoidance system. Insect-computer hybrid system, which is the fusion of living insect platform and microcontroller, emerges as an alternative solution. This study demonstrates the f…

    Submitted 21 June, 2021; v1 submitted 23 May, 2021; originally announced May 2021.

    Comments: Videos are available at https://hirosatontu.wordpress.com/research/

  40. arXiv:2103.02587  [pdf]

    cs.NI

    Reconstructed spatial receptive field structures by reverse correlation technique explains the visual feature selectivity of units in deep convolutional neural networks

    Authors: Yoshiyuki R Shiraishi, Hiromichi Sato, Takahisa M Sanada, Tomoyuki Naito

    Abstract: An important issue in dealing with Deep Convolutional Neural Networks (DCNN) is the 'black box problem', which represents the unknowns about internal information representation and processing, especially in the middle and higher layers. In this study, we adopted a systems neuroscience methodology to measure the visual feature selectivity and visualize the spatial receptive field of the units in VG…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: 28 pages, 7 figures, 1 table

  41. arXiv:2102.01326  [pdf, other]

    eess.AS cs.LG cs.SD

    Multimodal Attention Fusion for Target Speaker Extraction

    Authors: Hiroshi Sato, Tsubasa Ochiai, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Shoko Araki

    Abstract: Target speaker extraction, which aims at extracting a target speaker's voice from a mixture of voices using audio, visual or locational clues, has received much interest. Recently an audio-visual target speaker extraction has been proposed that extracts target speech by using complementary audio and visual clues. Although audio-visual target speaker extraction offers a more stable performance than…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: 7 pages, 5 figures

    Journal ref: in IEEE Spoken Language Technology Workshop (SLT), 2021, pp. 778-784

  42. arXiv:2012.04185  [pdf, other]

    cs.SE

    Formalism-Driven Development of Decentralized Systems

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Decentralized systems have been widely developed and applied to address security and privacy issues in centralized systems, especially since the advancement of distributed ledger technology. However, it is challenging to ensure their correct functioning with respect to their designs and minimize the technical risk before the delivery. Although formal methods have made significant progress over the…

    Submitted 30 January, 2022; v1 submitted 7 December, 2020; originally announced December 2020.

    Comments: To appear in ICECCS 2022

  43. arXiv:2008.08245  [pdf, other]

    cs.DC cs.LO

    Formalizing and Verifying Decentralized Systems with Extended Concurrent Separation Logic

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Decentralized techniques are becoming crucial and ubiquitous with the rapid advancement of distributed ledger technologies such as the blockchain. Numerous decentralized systems have been developed to address security and privacy issues with great dependability and reliability via these techniques. Meanwhile, formalization and verification of the decentralized systems is the key to ensuring correc…

    Submitted 18 August, 2020; originally announced August 2020.

  44. arXiv:2007.13685  [pdf, ps, other]

    cs.LO

    Extending Concurrent Separation Logic to Enhance Modular Formalization

    Authors: Yepeng Ding, Hiroyuki Sato

    Abstract: Nowadays, numerous services based on large-scale distributed systems have been developed to boost the convenience of human life. On the other hand, it has become a significant challenge to ensure the correctness and properties of these systems due to the complex and nested architecture. Although concurrent separation logic (CSL) has partially tackled the problem by specifying systems and verifying th…

    Submitted 27 July, 2020; originally announced July 2020.

  45. arXiv:1703.04890  [pdf, other]

    cs.LG math.NA math.OC stat.ML

    Riemannian stochastic quasi-Newton algorithm with variance reduction and its convergence analysis

    Authors: Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra

    Abstract: Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite number of loss functions. The present paper proposes a Riemannian stochastic quasi-Newton algorithm with variance reduction (R-SQN-VR). The key challenges of averaging, adding, and subtracting multiple gradients are addressed with notions of retraction and vector transport. We pr…

    Submitted 16 September, 2017; v1 submitted 14 March, 2017; originally announced March 2017.

  46. arXiv:1702.05594  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Riemannian stochastic variance reduced gradient algorithm with retraction and vector transport

    Authors: Hiroyuki Sato, Hiroyuki Kasai, Bamdev Mishra

    Abstract: In recent years, stochastic variance reduction algorithms have attracted considerable attention for minimizing the average of a large but finite number of loss functions. This paper proposes a novel Riemannian extension of the Euclidean stochastic variance reduced gradient (R-SVRG) algorithm to a manifold search space. The key challenges of averaging, adding, and subtracting multiple gradients are…

    Submitted 31 May, 2019; v1 submitted 18 February, 2017; originally announced February 2017.

    Comments: Published in SIAM Journal on Optimization. Extended and revised version of arXiv:1605.07367

    Journal ref: SIAM Journal on Optimization 29 (2019) 1444-1472

  47. arXiv:1605.07367  [pdf, other]

    cs.LG math.NA math.OC stat.ML

    Riemannian stochastic variance reduced gradient on Grassmann manifold

    Authors: Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra

    Abstract: Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite, number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a compact manifold search space. To this end, we show the developments on the Grassmann manifold. The key challenges of av…

    Submitted 9 April, 2017; v1 submitted 24 May, 2016; originally announced May 2016.

  48. arXiv:1402.1865  [pdf, ps, other]

    math.NT cs.CR

    Some properties of $τ$-adic expansions on hyperelliptic Koblitz curves

    Authors: Keisuke Hakuta, Hisayoshi Sato, Tsuyoshi Takagi

    Abstract: This paper explores two techniques on a family of hyperelliptic curves that have been proposed to accelerate computation of scalar multiplication for hyperelliptic curve cryptosystems. In elliptic curve cryptosystems, it is known that Koblitz curves admit fast scalar multiplication, namely, the $τ$-adic non-adjacent form ($τ$-NAF). It is shown that the $τ$-NAF has the three properties: (1) existen…

    Submitted 8 February, 2014; originally announced February 2014.

    Comments: 100 pages

    MSC Class: 11A63 (Primary); 94A60 (Secondary)

  49. arXiv:cs/0306092  [pdf]

    cs.DC

    Building A High Performance Parallel File System Using Grid Datafarm and ROOT I/O

    Authors: Y. Morita, H. Sato, Y. Watase, O. Tatebe, S. Sekiguchi, S. Matsuoka, N. Soda, A. Dell'Acqua

    Abstract: The sheer amount of petabyte-scale data foreseen in the LHC experiments requires careful consideration of the persistency design and the system design in worldwide distributed computing. Event parallelism of the HENP data analysis enables us to take maximum advantage of high-performance cluster computing and networking when we keep the parallelism both in the data processing phase, in the…

    Submitted 14 June, 2003; originally announced June 2003.

    Comments: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 4 pages, PDF. PSN TUDT010

    ACM Class: J.2

  50. arXiv:cs/0306051  [pdf, ps, other]

    cs.DC

    A data Grid testbed environment in Gigabit WAN with HPSS

    Authors: Atsushi Manabe, Kohki Ishikawa, Yoshihiko Itoh, Setsuya Kawabata, Tetsuro Mashimo, Youhei Morita, Hiroshi Sakamoto, Takashi Sasaki, Hiroyuki Sato, Junichi Tanaka, Ikuo Ueda, Yoshiyuki Watase, Satomi Yamamoto, Shigeo Yashiro

    Abstract: For data analysis of large-scale experiments such as LHC Atlas and other Japanese high energy and nuclear physics projects, we have constructed a Grid test bed at ICEPP and KEK. These institutes are connected to a national scientific gigabit network backbone called SuperSINET. In our test bed, we have installed NorduGrid middleware based on Globus, and connected 120TB HPSS at KEK as a large scale…

    Submitted 3 September, 2003; v1 submitted 12 June, 2003; originally announced June 2003.

    Comments: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 5 pages, LaTeX, 9 figures, PSN THCT002

    ACM Class: C.2.4; J.2; H.3.4