Search Results (24)

Search Parameters:
Keywords = wavenet

20 pages, 3042 KiB  
Article
Voice-Controlled Intelligent Personal Assistant for Call-Center Automation in the Uzbek Language
by Abdinabi Mukhamadiyev, Ilyos Khujayarov and Jinsoo Cho
Electronics 2023, 12(23), 4850; https://doi.org/10.3390/electronics12234850 - 30 Nov 2023
Cited by 1 | Viewed by 1118
Abstract
The demand for customer support call centers has surged across various sectors due to the pandemic. Yet, the constraints of round-the-clock human services and fluctuating wait times pose challenges in fully meeting customer needs. In response, there’s a growing need for automated customer service systems that can provide responses tailored to specific domains and in the native languages of customers, particularly in developing nations like Uzbekistan where call center usage is on the rise. Our system, “UzAssistant,” is designed to recognize user voices and accurately present customer issues in standardized Uzbek, as well as vocalize the responses to voice queries. It employs feature extraction and recurrent neural network (RNN)-based models for effective automatic speech recognition, achieving an impressive 96.4% accuracy in real-time tests with 56 participants. Additionally, the system incorporates a sentence similarity assessment method and a text-to-speech (TTS) synthesis feature specifically for the Uzbek language. The TTS component utilizes the WaveNet architecture to convert text into speech in Uzbek.
(This article belongs to the Section Computer Science & Engineering)

14 pages, 666 KiB  
Article
Any-to-One Non-Parallel Voice Conversion System Using an Autoregressive Conversion Model and LPCNet Vocoder
by Kadria Ezzine, Joseph Di Martino and Mondher Frikha
Appl. Sci. 2023, 13(21), 11988; https://doi.org/10.3390/app132111988 - 2 Nov 2023
Viewed by 772
Abstract
We present an any-to-one voice conversion (VC) system, using an autoregressive model and LPCNet vocoder, aimed at enhancing the converted speech in terms of naturalness, intelligibility, and speaker similarity. As the name implies, non-parallel any-to-one voice conversion does not require paired source and target speeches and can be employed for arbitrary speech conversion tasks. Recent advancements in neural-based vocoders, such as WaveNet, have improved the efficiency of speech synthesis. However, in practice, we find that the trajectory of some generated waveforms is not consistently smooth, leading to occasional voice errors. To address this issue, we propose to use an autoregressive (AR) conversion model along with the high-fidelity LPCNet vocoder. This combination not only solves the problems of waveform fluidity but also produces more natural and clear speech, with the added capability of real-time speech generation. To precisely represent the linguistic content of a given utterance, we use speaker-independent PPG features (SI-PPG) computed from an automatic speech recognition (ASR) model trained on a multi-speaker corpus. Next, a conversion model maps the SI-PPG to the acoustic representations used as input features for the LPCNet. The proposed autoregressive structure enables our system to produce the following prediction step outputs from the acoustic features predicted in the previous step. We evaluate the effectiveness of our system by performing any-to-one conversion pairs between native English speakers. Experimental results show that the proposed method outperforms state-of-the-art systems, producing higher speech quality and greater speaker similarity.
(This article belongs to the Section Acoustics and Vibrations)

20 pages, 1450 KiB  
Article
Detection of Ocean Internal Waves Based on Modified Deep Convolutional Generative Adversarial Network and WaveNet in Moderate Resolution Imaging Spectroradiometer Images
by Zhongyi Jiang, Xing Gao, Lin Shi, Ning Li and Ling Zou
Appl. Sci. 2023, 13(20), 11235; https://doi.org/10.3390/app132011235 - 12 Oct 2023
Viewed by 1043
Abstract
The generation and propagation of internal waves in the ocean are a common phenomenon that plays a pivotal role in the transport of mass, momentum, and energy, as well as in global climate change. Internal waves serve as a critical component of oceanic processes, contributing to the redistribution of heat and nutrients in the ocean, which, in turn, has implications for global climate regulation. However, the automatic identification of internal waves in oceanic regions from remote sensing images has presented a significant challenge. In this research paper, we address this challenge by designing a data augmentation approach grounded in a modified deep convolutional generative adversarial network (DCGAN) to enrich MODIS remote sensing image data for the automated detection of internal waves in the ocean. Utilizing t-distributed stochastic neighbor embedding (t-SNE) technology, we demonstrate that the feature distribution of the images produced by the modified DCGAN closely resembles that of the original images. By using t-SNE dimensionality reduction technology to map high-dimensional remote sensing data into a two-dimensional space, we can better understand, visualize, and analyze the quality of data generated by the modified DCGAN. The images generated by the modified DCGAN not only expand the dataset’s size but also exhibit diverse characteristics, enhancing the model’s generalization performance. Furthermore, we have developed a deep neural network named “WaveNet,” which incorporates a channel-wise attention mechanism to effectively handle complex remote sensing images, resulting in high classification accuracy and robustness. It is important to note that this study has limitations, such as the reliance on specific remote sensing data sources and the need for further validation across various oceanic regions. These limitations are essential to consider in the broader context of oceanic research and remote sensing applications. We initially pre-train WaveNet using the EuroSAT remote sensing dataset and subsequently employ it to identify internal waves in MODIS remote sensing images. Experiments show the highest average recognition accuracy achieved is an impressive 98.625%. When compared to traditional data augmentation training sets, utilizing the training set generated by the modified DCGAN leads to a 5.437% enhancement in WaveNet’s recognition rate.
(This article belongs to the Special Issue Remote Sensing Image Processing and Application)

13 pages, 2155 KiB  
Article
Multitask Attention-Based Neural Network for Intraoperative Hypotension Prediction
by Meng Shi, Yu Zheng, Youzhen Wu and Quansheng Ren
Bioengineering 2023, 10(9), 1026; https://doi.org/10.3390/bioengineering10091026 - 31 Aug 2023
Viewed by 1216
Abstract
Timely detection of and response to Intraoperative Hypotension (IOH) during surgery are crucial to avoid severe postoperative complications. Although several methods have been proposed to predict IOH using machine learning, their performance still has room for improvement. In this paper, we propose a ResNet-BiLSTM model based on multitask training and an attention mechanism for IOH prediction. We trained and tested our proposed model using bio-signal waveforms obtained from patient monitoring of non-cardiac surgery. We selected three models (WaveNet, CNN, and TCN) that process time-series data for comparison. The experimental results demonstrate that our proposed model has the best MSE (43.83) and accuracy (0.9224) compared to the other models, including WaveNet (51.52, 0.9087), CNN (318.52, 0.5861), and TCN (62.31, 0.9045), which suggests that our proposed model has better regression and classification performance. We conducted ablation experiments on the multitask and attention mechanisms, and the experimental results demonstrated that the multitask and attention mechanisms improved MSE and accuracy. The results demonstrate the effectiveness and superiority of our proposed model in predicting IOH.
(This article belongs to the Special Issue Monitoring and Analysis of Human Biosignals, Volume II)

22 pages, 2839 KiB  
Article
Wind Power Forecasting Based on WaveNet and Multitask Learning
by Hao Wang, Chen Peng, Bolin Liao, Xinwei Cao and Shuai Li
Sustainability 2023, 15(14), 10816; https://doi.org/10.3390/su151410816 - 10 Jul 2023
Cited by 1 | Viewed by 1483
Abstract
Accurately predicting the power output of wind turbines is crucial for ensuring the reliable and efficient operation of large-scale power systems. To address the inherent limitations of physical models, statistical models, and machine learning algorithms, we propose a novel framework for wind turbine power prediction. This framework combines a special type of convolutional neural network, WaveNet, with a multigate mixture-of-experts (MMoE) architecture. The integration aims to overcome the inherent limitations by effectively capturing and utilizing complex patterns and trends in the time series data. First, the maximum information coefficient (MIC) method is applied to handle data features, and the wavelet transform technique is employed to remove noise from the data. Subsequently, WaveNet utilizes its scalable convolutional network to extract representations of wind power data and effectively capture long-range temporal information. These representations are then fed into the MMoE architecture, which treats multistep time series prediction as a set of independent yet interrelated tasks, allowing for information sharing among different tasks to prevent error accumulation and improve prediction accuracy. We conducted predictions for various forecasting horizons and compared the performance of the proposed model against several benchmark models. The experimental results confirm the strong predictive capability of the WaveNet–MMoE framework.

16 pages, 844 KiB  
Article
Automatic Detection of Abnormal EEG Signals Using WaveNet and LSTM
by Hezam Albaqami, Ghulam Mubashar Hassan and Amitava Datta
Sensors 2023, 23(13), 5960; https://doi.org/10.3390/s23135960 - 27 Jun 2023
Cited by 4 | Viewed by 2394
Abstract
Neurological disorders have an extreme impact on global health, affecting an estimated one billion individuals worldwide. According to the World Health Organization (WHO), these neurological disorders contribute to approximately six million deaths annually, representing a significant burden. Early and accurate identification of brain pathological features in electroencephalogram (EEG) recordings is crucial for the diagnosis and management of these disorders. However, manual evaluation of EEG recordings is not only time-consuming but also requires specialized skills. This problem is exacerbated by the scarcity of trained neurologists in the healthcare sector, especially in low- and middle-income countries. These factors emphasize the necessity for automated diagnostic processes. With the advancement of machine learning algorithms, there is a great interest in automating the process of early diagnoses using EEGs. Therefore, this paper presents a novel deep learning model consisting of two distinct paths, WaveNet–Long Short-Term Memory (LSTM) and LSTM, for the automatic detection of abnormal raw EEG data. Through multiple ablation experiments, we demonstrated the effectiveness and importance of all parts of our proposed model. The performance of our proposed model was evaluated using TUH abnormal EEG Corpus V.2.0.0. (TUAB) and achieved a high classification accuracy of 88.76%, which is higher than in the existing state-of-the-art research studies. Moreover, we demonstrated the generalization of our proposed model by evaluating it on another independent dataset, TUEP, without any hyperparameter tuning or adjustment. The obtained accuracy was 97.45% for the classification between normal and abnormal EEG recordings, confirming the robustness of our proposed model.
(This article belongs to the Section Biomedical Sensors)

15 pages, 5480 KiB  
Article
Time Series Forecasting Performance of the Novel Deep Learning Algorithms on Stack Overflow Website Data
by Mesut Guven and Fatih Uysal
Appl. Sci. 2023, 13(8), 4781; https://doi.org/10.3390/app13084781 - 11 Apr 2023
Cited by 3 | Viewed by 5455
Abstract
Time series forecasting covers a wide range of topics, such as predicting stock prices, estimating solar wind, estimating the number of scientific papers to be published, etc. Among the machine learning models, in particular, deep learning algorithms are the most used and successful ones. This is why we only focus on deep learning models. Even though it is a hot topic, there are only a few comprehensive studies, and in many studies, there is not much detail about the tested models, which makes it impossible to constitute a comparison chart. Thus, one of the main motivations for this work is to present comprehensive research by providing details about the tested models. In this study, a corpus of the asked questions and their metadata were extracted from the software development and troubleshooting website. Then, univariate time series data were created from the frequency of the questions that included the word “python” as the tag information. In the experiments, deep learning models were trained on the extracted time series, and their prediction performances are presented. Among the tested models, the model using convolutional neural network (CNN) layers in the form of wavenet architecture achieved the best result.
(This article belongs to the Section Computing and Artificial Intelligence)

25 pages, 7503 KiB  
Article
Automatic Assessment of Piano Performances Using Timbre and Pitch Features
by Varinya Phanichraksaphong and Wei-Ho Tsai
Electronics 2023, 12(8), 1791; https://doi.org/10.3390/electronics12081791 - 10 Apr 2023
Cited by 1 | Viewed by 1365
Abstract
To assist piano learners with the improvement of their skills, this study investigates techniques for automatically assessing piano performances based on timbre and pitch features. The assessment is formulated as a classification problem that classifies piano performances as “Good”, “Fair”, or “Poor”. For timbre-based approaches, we propose Timbre-based WaveNet, Timbre-based MLNet, Timbre-based CNN, and Timbre-based CNN Transformers. For pitch-based approaches, we propose Pitch-based CNN and Pitch-based CNN Transformers. Our experiments indicate that both Pitch-based CNN and Pitch-based CNN Transformers, which attained classification accuracies of 96.87% and 97.5%, respectively, are superior to the timbre-based approaches.
(This article belongs to the Special Issue Machine Learning in Music/Audio Signal Processing)

18 pages, 3296 KiB  
Article
Mandarin Electro-Laryngeal Speech Enhancement Using Cycle-Consistent Generative Adversarial Networks
by Zhaopeng Qian, Kejing Xiao and Chongchong Yu
Appl. Sci. 2023, 13(1), 537; https://doi.org/10.3390/app13010537 - 30 Dec 2022
Viewed by 1573
Abstract
Electro-laryngeal (EL) speech has poor intelligibility and naturalness, which hampers the popular use of the electro-larynx. Voice conversion (VC) can enhance EL speech. However, if the EL speech to be enhanced follows the complicated tone variation rules of Mandarin, the enhancement will be less effective. This is because the source speech (Mandarin EL speech) and the target speech (normal speech) are not strictly parallel. We propose using cycle-consistent generative adversarial networks (CycleGAN, a parallel-free VC framework) to enhance continuous Mandarin EL speech, which can solve the above problem. In the proposed framework, the generator is designed based on the neural networks of a 2D-Conformer-1D-Transformer-2D-Conformer. Then, we used the Mel-Spectrogram instead of traditional acoustic features (fundamental frequency, Mel-Cepstrum parameters and aperiodicity parameters). Finally, we converted the enhanced Mel-Spectrogram into waveform signals using WaveNet. We undertook both subjective and objective tests to evaluate the proposed approach. Compared with traditional approaches to enhance continuous Mandarin EL speech with variable tone (the average tone accuracy being 71.59% and average word error rate being 10.85%), our framework increases the average tone accuracy by 12.12% and reduces the average errors of word perception by 9.15%. Compared with the approaches towards continuous Mandarin EL speech with fixed tone (the average tone accuracy being 29.89% and the average word error rate being 10.74%), our framework increases the average tone accuracy by 42.38% and reduces the average errors of word perception by 8.59%. Our proposed framework can effectively address the problem that the source and target speech are not strictly parallel. The intelligibility and naturalness of Mandarin EL speech have been further improved.
(This article belongs to the Special Issue AI-Based Biomedical Signal Processing)

27 pages, 95183 KiB  
Article
HRpI System Based on Wavenet Controller with Human Cooperative-in-the-Loop for Neurorehabilitation Purposes
by Juan Daniel Ramirez-Zamora, Omar Arturo Dominguez-Ramirez, Luis Enrique Ramos-Velasco, Gabriel Sepulveda-Cervantes, Vicente Parra-Vega, Alejandro Jarillo-Silva and Eduardo Alejandro Escotto-Cordova
Sensors 2022, 22(20), 7729; https://doi.org/10.3390/s22207729 - 12 Oct 2022
Cited by 2 | Viewed by 1767
Abstract
There exist several methods aimed at human–robot physical interaction (HRpI) to provide physical therapy to patients. The use of haptics has become an option to display forces along a given path so that it guides the physiotherapist's protocol. Critical in this regard is the motion control for haptic guidance to convey the specifications of the clinical protocol. Given the inherent patient variability, a key requirement of these HRpI methods is to modify their response online, neither rejecting nor neglecting interaction forces but processing them as patient interaction. In this paper, considering the nonlinear dynamics of the robot interacting bilaterally with a patient, we propose a novel adaptive control to guarantee stable haptic guidance by processing the causality of patient interaction forces, despite unknown robot dynamics and uncertainties. The controller implements a radial basis neural network with daughter RASP1 wavelet activation functions to identify the coupled interaction dynamics. For an efficient online implementation, an output infinite impulse response filter prunes negligible signals and nodes to deal with overparametrization. This helps adapt online the feedback gains of a globally stable discrete PID regulator to yield stiffness control, so the user is guided within a perceptual force field. The effectiveness of the proposed method is verified in real-time bimanual human-in-the-loop experiments.
(This article belongs to the Special Issue Robot Assistant for Human-Robot Interaction and Healthcare)

18 pages, 2395 KiB  
Article
Spatial and Temporal Normalization for Multi-Variate Time Series Prediction Using Machine Learning Algorithms
by Alimasi Mongo Providence, Chaoyu Yang, Tshinkobo Bukasa Orphe, Anesu Mabaire and George K. Agordzo
Electronics 2022, 11(19), 3167; https://doi.org/10.3390/electronics11193167 - 1 Oct 2022
Cited by 2 | Viewed by 2927
Abstract
Multi-variable time series (MTS) information is a typical type of data inference in the real world. Every instance of MTS is produced via a hybrid dynamical scheme, the dynamics of which are often unknown. The hybrid species of this dynamical service are the outcome of high-frequency and low-frequency external impacts, as well as global and local spatial impacts. These influences impact MTS’s future growth; hence, they must be incorporated into time series forecasts. Two types of normalization modules, temporal and spatial normalization, are recommended to accomplish this. Each boosts the original data’s local and high-frequency processes distinctly. In addition, all components are easily incorporated into well-known deep learning techniques, such as Wavenet and Transformer. However, existing methodologies have inherent limitations when it comes to isolating the variables produced by each sort of influence from the real data. Consequently, the study encompasses conventional neural networks, such as the multi-layer perceptron (MLP), complex deep learning methods such as LSTM, two recurrent neural networks, support vector machines (SVM), and their application for regression, XGBoost, and others. Extensive experimental work on three datasets shows that the effectiveness of canonical frameworks could be greatly improved by adding more normalization components to how the MTS is used. This would make it as effective as the best MTS designs are currently available. Recurrent models, such as LSTM and RNN, attempt to recognize the temporal variability in the data; however, as a result, their effectiveness might soon decline. Last but not least, it is claimed that training a temporal framework that utilizes recurrence-based methods such as RNN and LSTM approaches is challenging and expensive, while the MLP network structure outperformed other models in terms of time series predictive performance.
(This article belongs to the Special Issue Advanced Machine Learning Applications in Big Data Analytics)

17 pages, 3081 KiB  
Article
A Deep Learning Method Based on Bidirectional WaveNet for Voltage Sag State Estimation via Limited Monitors in Power System
by Yaping Deng, Lu Wang, Hao Jia, Xiaohui Zhang and Xiangqian Tong
Energies 2022, 15(6), 2273; https://doi.org/10.3390/en15062273 - 21 Mar 2022
Cited by 4 | Viewed by 1800
Abstract
Voltage sag state estimation on the basis of a limited number of installed monitors is essential to dividing the responsibility for the voltage sag and taking corresponding measures to improve voltage quality. Therefore, a deep learning methodology via bidirectional WaveNet for the voltage sag state estimation is proposed in this paper. The presented method can simultaneously estimate voltage sag state at non-monitored buses via limited monitors. In particular, the proposed deep learning method using the bidirectional WaveNet is designed to explore the long-term and long-range temporal dependencies in both the forward and backward directions. In this way, using only the original voltages measured by the monitors, high accuracy for voltage sag state estimation can be achieved without restructuring or redesigning the raw monitored data. An excellent advantage of the presented algorithm is that it can be implemented without system parameters or operating conditions or any other prior information. The presented methodology was verified by the IEEE 30-bus benchmark system. The experimental results illustrated that the accuracy of the voltage sag state estimation results was over 99.83%. Furthermore, a comparison among different models, including the bidirectional GRU-based model, one-way WaveNet-based model, and bidirectional WaveNet-based model, was also conducted. The results illustrated that the proposed bidirectional WaveNet-based model achieved the highest accuracy and quickest convergence speed.
(This article belongs to the Section F5: Artificial Intelligence and Smart Energy)

29 pages, 1237 KiB  
Article
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
by Axel Roebel and Frederik Bous
Information 2022, 13(3), 103; https://doi.org/10.3390/info13030103 - 23 Feb 2022
Cited by 5 | Viewed by 5146
Abstract
The use of the mel spectrogram as a signal parameterization for voice generation is quite recent and linked to the development of neural vocoders. These are deep neural networks that allow reconstructing high-quality speech from a given mel spectrogram. While initially developed for speech synthesis, now neural vocoders have also been studied in the context of voice attribute manipulation, opening new means for voice processing in audio production. However, to be able to apply neural vocoders in real-world applications, two problems need to be addressed: (1) To support use in professional audio workstations, the computational complexity should be small, (2) the vocoder needs to support a large variety of speakers, differences in voice qualities, and a wide range of intensities potentially encountered during audio production. In this context, the present study will provide a detailed description of the Multi-band Excited WaveNet, a fully convolutional neural vocoder built around signal processing blocks. It will evaluate the performance of the vocoder when trained on a variety of multi-speaker and multi-singer databases, including an experimental evaluation of the neural vocoder trained on speech and singing voices. Addressing the problem of intensity variation, the study will introduce a new adaptive signal normalization scheme that allows for robust compensation for dynamic and static gain variations. Evaluations are performed using objective measures and a number of perceptual tests including different neural vocoder algorithms known from the literature. The results confirm that the proposed vocoder compares favorably to the state-of-the-art in its capacity to generalize to unseen voices and voice qualities. The remaining challenges will be discussed.
(This article belongs to the Special Issue Signal Processing Based on Convolutional Neural Network)

17 pages, 1087 KiB  
Article
A Wavenet-Based Virtual Sensor for PM10 Monitoring
by Claudio Carnevale, Enrico Turrini, Roberta Zeziola, Elena De Angelis and Marialuisa Volta
Electronics 2021, 10(17), 2111; https://doi.org/10.3390/electronics10172111 - 30 Aug 2021
Cited by 4 | Viewed by 1715
Abstract
In this work, a virtual sensor for PM10 concentration monitoring is presented. The sensor is based on wavenet models and uses daily mean NO2 concentration and meteorological variables (wind speed and rainfall) as input. The methodology has been applied to the reconstruction of PM10 levels measured at 14 monitoring stations in the Lombardy region (Italy). This region, usually affected by high levels of PM10, is a challenging benchmarking area for the implemented sensors. Nevertheless, the performance is good, with relatively low bias and high correlation.
(This article belongs to the Special Issue Theory and Applications of Fuzzy Systems and Neural Networks)

16 pages, 3166 KiB  
Article
Short-Term Load Forecasting Using Encoder-Decoder WaveNet: Application to the French Grid
by Fernando Dorado Rueda, Jaime Durán Suárez and Alejandro del Real Torres
Energies 2021, 14(9), 2524; https://doi.org/10.3390/en14092524 - 28 Apr 2021
Cited by 30 | Viewed by 3131
Abstract
The prediction of time series data applied to the energy sector (prediction of renewable energy production, forecasting prosumers’ consumption/generation, forecast of country-level consumption, etc.) has numerous useful applications. Nevertheless, the complexity and non-linear behaviour associated with such energy systems hinder the development of accurate algorithms. In such a context, this paper investigates the use of a state-of-the-art deep learning architecture in order to perform precise 24-h-ahead load demand forecasting for the whole country of France using RTE data. To this end, the authors propose an encoder-decoder architecture inspired by WaveNet, a deep generative model initially designed by Google DeepMind for raw audio waveforms. WaveNet uses dilated causal convolutions and skip connections to utilise long-term information. This kind of novel ML architecture presents different advantages over other statistical algorithms. On the one hand, the proposed deep learning model’s training process can be parallelized on GPUs, which is an advantage in terms of training times compared to recurrent networks. On the other hand, the model prevents degradation problems (exploding and vanishing gradients) due to the residual connections. In addition, this model can learn from an input sequence to produce a forecast sequence in a one-shot manner. For comparison purposes, a comparative analysis between the best-performing state-of-the-art deep learning models and traditional statistical approaches is presented: Autoregressive-Integrated Moving Average (ARIMA), Long-Short-Term-Memory, Gated-Recurrent-Unit (GRU), Multi-Layer Perceptron (MLP), causal 1D-Convolutional Neural Networks (1D-CNN) and ConvLSTM (Encoder-Decoder). The values of the evaluation indicators reveal that WaveNet exhibits superior performance in both forecasting accuracy and robustness.
(This article belongs to the Special Issue Time Series Forecasting for Energy Consumption)
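
As a rough illustration of the dilated causal convolutions mentioned in the abstract above, the following Python sketch shows how stacking layers with doubling dilations lets each output sample depend on a long window of past inputs and on no future inputs. It is not taken from the paper: the function names `causal_dilated_conv`, `stack_response`, and `receptive_field`, the kernel size of 2, and the dilation schedule 1, 2, 4, 8 are all assumptions chosen for illustration, and gated activations, residual connections, and skip connections are omitted.

```python
def causal_dilated_conv(x, weights, dilation):
    """1-D causal convolution with left zero-padding.

    out[t] = sum_j weights[j] * x[t - j * dilation], so out[t] never
    depends on samples later than t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:  # samples before the start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def stack_response(x, dilations, weights=(1.0, 1.0)):
    """Apply one causal layer per dilation, feeding each output forward."""
    for d in dilations:
        x = causal_dilated_conv(x, weights, d)
    return x


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Hypothetical dilation schedule: doubling per layer.
DILATIONS = [1, 2, 4, 8]

# An impulse at t = 0 spreads over exactly receptive_field(...) samples.
impulse = [1.0] + [0.0] * 19
response = stack_response(impulse, DILATIONS)
span = sum(1 for v in response if v != 0.0)
```

With four layers of kernel size 2, the receptive field is 1 + (1 + 2 + 4 + 8) = 16 samples, whereas four undilated layers of the same kernel size would cover only 5. This exponential growth with depth is the usual motivation for the doubling schedule.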