1 Introduction

In the present era of research and technology, multimedia systems have seen tremendous growth due to the integration of several technologies, including Big Data, IoT, and networking technologies. Multimedia-based networking applications have grown significantly in the past few years, and advances in communication technologies continue to drive a great deal of research. Given the widespread use of fourth-generation (4G) networks and the prospect of 5G networks in the near future, there is considerable scope for research on meeting 5G-based demands for better multimedia systems. There is also a growing demand for infrastructure that provides on-demand access to a shared pool of configurable computing resources (e.g., networks, storage, servers, applications, and services) for efficient decision-making as well as quality care.

A number of new ideas, including Wireless Sensor Networks (WSN), Body Wireless Sensor Networks (BWSN), IoT, Cloud, Fog, Edge, SDN, and Big Data Analytics, have emerged in the past decade and are contributing in various dimensions. These ideas can be used to design and develop intelligent systems across a variety of sectors, including transportation, education, business, and industry. In addition, the Internet of Things (IoT) is used in a variety of healthcare applications, including tele-healthcare systems for chronic diseases, support for medication intake management, and homecare.

In terms of data capture, data storage, data management, and communications, emerging computing technologies (Cloud, Fog, Edge, SDN, Big Data, IoT, Deep Learning) offer scalability, flexibility, agility, and ubiquity. The integration of multimedia and cloud for healthcare, smart cities, and surveillance is applied in many settings and helps address technical challenges in media-rich applications such as video streaming, serious games, rehabilitative exercise, health sports, and e-healthcare. A few of the challenging open research problems are unhindered access to medical media content from heterogeneous devices (such as mobile phones, laptops, and IPTV), resource capacity demands (such as bandwidth, memory, storage, and processors), the need for medical multimedia quality of service, experience, and context (m-QoS/m-QoE/m-QoC), and dynamic resource allocation for processing media content.

The goal of this special issue is to bring forward recent advances in multimedia systems in the context of healthcare services, smart-world applications, and precision agriculture. More specifically, it seeks contributions on the state of the art, approaches, methodologies, systems, and innovative uses of multimedia-based emerging technology services in different application areas. This special issue on Futuristic Trends and Innovations in Multimedia Systems Using Big Data, IoT and Cloud Technologies received a total of 102 submissions that qualified for review after screening. Each paper was rigorously reviewed by at least three reviewers, followed by review by the guest editors and the editor-in-chief, after which a few of the papers were accepted for final publication. Below, we briefly summarize the main contribution of each paper.

Sehgal et al. propose a model that helps prevent human casualties by providing early warning of disasters such as storms, the collapse of old buildings and bridges, earthquakes, and landslides (https://doi.org/10.1007/s11042-021-11486-8).

Roy et al. note that although there are many powerful impulse denoising techniques, performing data cleaning operations on constrained IoT gateways is computationally demanding (https://doi.org/10.1007/s11042-021-11832-w). The proposed work focuses on an impulse noise removal technique that is less computationally complex than other established techniques and is therefore well suited for implementation at the IoT gateway level. The results show that the proposed filter, HDFF, offers a 3% improvement in PSNR compared to state-of-the-art filters.
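
For readers unfamiliar with the metric, PSNR (peak signal-to-noise ratio) compares a denoised image against a clean reference on a logarithmic scale. The following is a minimal sketch of the standard definition only; it is not taken from the paper, and the function name `psnr`, the nested-list image representation, and the default peak value of 255 (8-bit grayscale) are assumptions made for illustration.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Return the peak signal-to-noise ratio in dB between two images.

    Both images are nested lists of pixel values (rows of numbers).
    max_value is the largest possible pixel value; 255 assumes 8-bit data.
    """
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    if len(ref_flat) != len(test_flat):
        raise ValueError("images must have the same number of pixels")

    # Mean squared error between corresponding pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded

    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [[100, 100], [100, 100]]
    noisy = [[110, 90], [100, 100]]
    print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

A higher PSNR means the filtered image is closer to the reference, which is why a relative gain in PSNR is read as better denoising.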

Kaur et al. contributed a paper titled “Hidden Markov Model for short-term Churn Forecast in the Structured Overlay Networks” (https://doi.org/10.1007/s11042-021-11831-x). This study used RapidMiner to perform predictive analytics on machine uptime patterns identified in Microsoft trace files, achieving 95% predictive accuracy. Simulation results show that the proposed approach increased the average lookup success rate for code-based overlay networks by 54.23% and reduced maintenance requirements by 59%.

Ghobaei et al. demonstrate an Observe-Orient-Decide-Act (OODA) approach to enhancing the resource elasticity of cloud-based multimedia storage systems (https://doi.org/10.1007/s11042-021-11021-9). In the proposed work, elastic management is performed using OODA loops and fuzzy logic theory. The simulation results reported in the paper indicate that the proposed solution reduces read time, write time, and response time by 7.2%, 6.9%, and 8.4%, respectively, compared to existing elastic cloud-based storage mechanisms.

Clark et al. propose a hybrid technique that combines low-cost sensory hardware with intelligent character recognition software to improve the accuracy of software-based gesture recognition systems (https://doi.org/10.1007/s11042-021-11830-y). Their approach is to develop software-based modules using convolutional neural network (CNN) and recurrent neural network (RNN) methods, and to train myo-based modules using support vector machine methods. Based on a dataset of nine motions derived from numerous videos, each gesture was tested 45 times across both software-based and hardware-based modules. The software-based module had an accuracy rate of 89% compared to the myo-based module’s 49%.

Malik et al. state that drowsy driving has become a serious problem because it is frequently ignored and often results in fatal crashes. Their work proposes MADDOKE, a real-time driver drowsiness detection framework based on microsleeps and yawning that runs on devices with low computational power to detect signs of drowsiness and alert the driver to take a break. The proposed framework achieved an accuracy above 90% without requiring intensive computational power.

Singh et al. use chaotic maps and diffusion circuits to propose a lightweight encryption technique for images (https://doi.org/10.1007/s11042-021-11657-7). To further reduce time complexity, the chaotic maps are employed to regulate the generation of random number sequences for image permutation and substitution. Several statistical and security tests are performed on the scheme to verify its resistance to attacks.

The work of Khanna et al. is titled “improved method for analyzing electrical data obtained from EEG for better diagnosis of brain related disorders”. The work uses brainwave datasets from Kaggle and IEEE DataPort. Based on data analysis, the proposed method improves the accuracy of estimating the current state of the brain and of diagnosing related disorders.

Gupta et al. cover the fabrication of the FBG sensor and the bonding of the sensing element to the suspension bridge structure, along with experimental details (https://doi.org/10.1007/s11042-021-11565-w). The article also discusses the scalable architecture of the proposed Smart Distributed Sensing (SDS) model using FBG sensors. Experimental validation is performed with an IoT-based FBG sensing method by estimating, from a central location, the strain distribution profile at the bonding region of the base plate.

Kumar et al. propose “a hybrid deep learning model based on long short-term memory (LSTM) and artificial bee colony (ABC) algorithms” (https://doi.org/10.1007/s11042-021-11029-1). Performance analysis demonstrates that ABC-optimized LSTMs with emotional polarity achieve improved prediction accuracy over the corresponding models.

The study by Sangwan et al. is motivated by the need to model automatic rumor detection and combines deep learning with an optimized filter-wrapper approach (https://doi.org/10.1007/s11042-021-11340-x). It provides a hybrid model for rumor classification, tested on the PHEME rumor dataset. Text features are learned using CNNs and combined with optimized feature vectors generated using the IGACO filter-wrapper technique. The suggested classifier performs better than earlier works.

Varshney et al. present a paper titled “Human activity recognition by combining external features with accelerometer sensor data using deep learning network model” (https://doi.org/10.1007/s11042-021-11313-0). The results of the paper show that the performance of all three models (LSTM, CNN, and ConvLSTM) is superior to state-of-the-art methods on activity datasets.

Velpula et al. report their work titled “EBGO: An Optimal Load Balancing Algorithm, A solution for existing tribulation to balance the load efficiently on cloud servers” (https://doi.org/10.1007/s11042-021-11012-w). According to important metrics including reaction time, energy consumption, weighted total cost, and processing time, the authors’ EBGO algorithm outperforms numerous load balancing techniques. In data centres, the model effectively balances server demand, resulting in optimal resource use. The proposed work may be appropriate for healthcare systems, where systems handling large data volumes share the load.

Safara et al. discuss their algorithmic work titled “ISHO: Improved Spotted Hyena Optimization Algorithm for Detecting Phishing Websites” (https://doi.org/10.1007/s11042-021-10678-6). The paper reports that the ISHO algorithm achieves higher accuracy than the standard spotted hyena optimization algorithm. Moreover, results on the tested dataset indicate the superiority of the proposed algorithm over particle swarm optimization, the firefly algorithm, and the bat algorithm.

Tyagi et al. present their work titled “SSEER: Segmented Sectors in Energy Efficient Routing for Wireless Sensor Network” (https://doi.org/10.1007/s11042-021-11829-5). In this paper, direct diffusion and clustering are integrated for WSNs, which improves network lifetime and stability. Simulation results show that, compared to the Z-SEP protocol, the proposed work improves energy efficiency by 11% and stability by 19%.

Satapathy et al. report their work titled “Effect of learning parameters on the performance of U-Net Model in segmentation of Brain tumor” (https://doi.org/10.1007/s11042-021-11273-5). The study examines how different learning parameters affect the performance of the deep U-Net model for segmenting brain tumours using two publicly accessible data sets: “BraTS 2017” and “BraTS 2018”. According to the simulation results, using the base model with adjusted parameters to segment the entire tumour increases the AUC by 2%.

Sharma et al. propose a framework titled “A real time cloud-based framework for glaucoma screening using EfficientNet” (https://doi.org/10.1007/s11042-021-11559-8). Deep learning techniques based on convolutional neural networks, specifically the EfficientNet and UNet++ models, are used to analyse retinal fundus images for glaucoma. Several state-of-the-art models are assessed quantitatively on benchmark datasets including RIM-ONE and DRISHTI-GS1, and, owing to the cloud platform, the suggested framework is found to be scalable, location independent, and easily available to all.

Tong et al. propose an approach titled “Protecting image privacy through adversarial perturbation” (https://doi.org/10.1007/s11042-021-11394-x). The technique presented in this study is used to stop DNN detectors from detecting private content, particularly human bodies. With minimal changes to pixel values, the suggested technique lowers the recall of human detection from 81.1% to 18.0%. The findings demonstrate that, with only minor loss of visual quality, the technique performs extremely well in preventing private content from being exposed by DNN detectors.

Sharma et al. present their work on “Image captioning improved visual question answering” (https://doi.org/10.1007/s11042-021-11276-2). The proposed model applies the knowledge gained from the image captioning task to the VQA task. According to the testing results, the answer generation accuracy exceeds that of state-of-the-art VQA models by 3.45% on the VQA 1.0 dataset, 3.33% on the VQA 2.0 dataset, and 1.73% on the VQA-CP v2 dataset.

Tanwar et al. present their findings on “A secure data analytics scheme for multimedia communication in a decentralized smart grid” (https://doi.org/10.1007/s11042-021-10512-z). This work proposes ChoIce, a data analytics system that offers secure data collection, analysis, and decision support for smart grid (SG) systems. According to the results, ChoIce outperforms other state-of-the-art methods.

Singh et al. present “EMM: Extended Matching Market based Scheduling for Big Data Platform Hadoop” (https://doi.org/10.1007/s11042-021-11283-3). The paper suggests a pluggable scheduling method for effective and flexible job processing. The experimental findings show higher cluster performance, better resource efficiency, and a general decrease in makespan. Cluster efficiency increased by 31%, a considerable improvement for the proposed system.

Kumar et al. discuss “A high gain UWB human face shaped MIMO microstrip printed antenna with high isolation” (https://doi.org/10.1007/s11042-021-11827-7). In this study, a dual-polarized MIMO monopole patch antenna in the shape of a human face is designed, simulated, manufactured, and measured for use in UWB applications. The HFS MIMO MP antenna is optimized and simulated using the CST microwave suite. According to the simulation and measurement results, the suggested high-gain HFS MP antenna with good isolation is suitable for UWB MIMO systems.

Khare et al. present their work on “Multi-resolution approach to human activity recognition in video sequence based on combination of complex wavelet transform, Local Binary Pattern and Zernike moment” (https://doi.org/10.1007/s11042-021-11828-6). This research suggests a set of features in a multiresolution framework for recognizing human activities. A multi-class support vector machine classifier is used to categorize the recognized human activities. The experimental findings show that the suggested method outperforms several existing state-of-the-art methods in terms of various quantitative performance indicators, and that it works well for multitier human tasks.

Singh et al. present “Vehicle Identification Using Modified Region Based Convolution Network for Intelligent Transportation System” (https://doi.org/10.1007/s11042-020-10366-x). The suggested approach is evaluated on the MIO-TCD vehicle dataset and the EBVT video dataset. Results are reported using three metrics: average accuracy, mean precision, and mean recall. They are also compared with those of other state-of-the-art techniques and show a considerable improvement, demonstrating how well the approach works for video analysis.

Ahmad et al. present “An Overview of Rate Control Techniques in HEVC and SHVC Video Encoding” (https://doi.org/10.1007/s11042-021-11249-5). The paper offers a further classification of rate control methods based on their underlying theory and mechanics. It also describes SHVC, the scalable extension of HEVC, and highlights some potential difficulties in SHVC rate control design. The authors conclude by presenting some of the remaining research problems in HEVC rate control and outlining potential future research directions.

Shankar et al. discuss “Hyperparameter Search Based Convolution Neural Network with Bi-LSTM Model for Intrusion Detection System in Multimedia Big Data Environment” (https://doi.org/10.1007/s11042-021-11271-7). The paper presents the HPS-CBL model, which is based on deep learning and is designed for intrusion detection in big data environments. With a maximum precision of 99.24%, recall of 98.69%, F-score of 98.97%, and accuracy of 98.18%, the experimental results demonstrate the superiority of the HPS-CBL model over the compared approaches.

Singh et al. present their findings on “An Efficient Verifiable (t;n)-Threshold Secret Image Sharing Scheme with Ultralight Shares” (https://doi.org/10.1007/s11042-021-10523-w). This research suggests a verifiable (t, n)-threshold secret image sharing (VSIS) technique. The key benefit of the proposed scheme is that it represents the public shares as integers, which are substantially smaller than the secret image and, unlike in earlier SIS designs, are not image matrices. Additionally, it creates a public share image with exactly the same dimensions as the secret image. As a result, the public shares can be stored efficiently in memory and sent over a public network, and all communications can be conducted safely through open channels.
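
To make the (t, n)-threshold idea concrete, the sketch below implements classic Shamir secret sharing over a prime field for a single integer secret. It is a generic illustration only, not the authors' VSIS scheme: it omits the verifiability and image-handling parts, and the field size `PRIME` and the function names `make_shares` and `recover_secret` are assumptions chosen for this example.

```python
import random

# Field modulus (2**31 - 1, a Mersenne prime). The choice is an assumption;
# any prime larger than both the secret and the number of shares works.
PRIME = 2_147_483_647


def make_shares(secret, t, n, rng=None):
    """Split an integer secret into n shares; any t of them recover it."""
    if not 1 <= t <= n < PRIME:
        raise ValueError("require 1 <= t <= n < PRIME")
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    rng = rng or random.SystemRandom()

    # Random polynomial of degree t - 1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]

    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    shares = make_shares(123456, t=3, n=5)
    print(recover_secret(shares[:3]))
```

Each share here is a pair of small integers, which gives a rough sense of why integer shares are cheap to store and transmit compared with image-sized shares.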

Pal et al. conduct an extensive survey titled “A comprehensive survey of image segmentation: clustering methods, performance parameters, and benchmark datasets” (https://doi.org/10.1007/s11042-021-10594-9). Hierarchical and partitional clustering are the two main families of clustering techniques covered in this article. Because partitional clustering is computationally more efficient, the survey then focuses on techniques from this family. Following the literature, partitional clustering techniques are further divided into three categories: K-means based techniques, histogram based techniques, and meta-heuristic based techniques. The survey also gives a brief overview of the publicly accessible benchmark datasets for image segmentation.

Kumar et al. present their work on “Low bandwidth Data Hiding for Multimedia Systems based on Bit Redundancy” (https://doi.org/10.1007/s11042-021-10832-0). In contrast to conventional AMBTC-based techniques, this study provides a high-capacity data hiding scheme that effectively uses complex image blocks for embedding the hidden data without any reduction in image quality. Experimental findings are compared with existing methodologies to validate the proposed strategy, which shows a significant improvement in embedding capacity while maintaining stego-image quality.

Bakshi et al. present their findings on “An Efficient Face Anti-Spoofing and Detection Model Using Image Quality Assessment Parameters” (https://doi.org/10.1007/s11042-020-10045-x). The main goal of the research is to suggest a low-complexity fake biometric detection method that applies various image quality assessment metrics, such as Mean Square Error, Signal to Noise Ratio, SC, and others, to the extracted features of the images. The validity of the proposed model is demonstrated by examining the MSE values, which are 5.8% and 8.49% above the threshold value for the features of the nose and eyes, respectively. The database used for the tests included 500 male and female students aged 20 to 30.

Tsai et al. present their findings on “A Cooperative Mechanism for Managing Multimedia Project Documentation” (https://doi.org/10.1007/s11042-021-10521-y). The goal of this study is to express this context clearly in project documents that are useful in practice. The findings can be applied to the practical management of multimedia project documentation.

Mehta et al. discuss their findings on “Hierarchical WSN Protocol with Fuzzy Multi-criteria Clustering and Bio-inspired Energy-efficient Routing (FMCB-ER)” (https://doi.org/10.1007/s11042-020-09633-8). The suggested method is assessed against existing routing approaches in terms of energy consumption, lifetime, throughput, end-to-end delay, jitter, packet delivery ratio, latency, and the number of dead and alive nodes. According to the simulation findings, the proposed strategy performs better than the compared approaches on these parameters. In particular, it extends the network lifetime by 8% and cuts energy consumption by 13%.

Acharjya et al. present “An integrated fuzzy rough set and real coded genetic algorithm approach for crop identification in smart agriculture” (https://doi.org/10.1007/s11042-021-10518-7). This study presents a model that combines a fuzzy rough set, a real-coded genetic algorithm, and linear regression. The viability of the proposed model was evaluated using agricultural information systems collected from the Krishi Vigyan Kendra in the Thiruvannamalai district of Tamil Nadu, India. The accuracy of the proposed model is also compared with that of existing methods.

Ibrahim et al. present a review titled “Futuristic CRISPR-based biosensing in the Cloud and Internet of Things Era: An Overview” (https://doi.org/10.1007/s11042-020-09010-5). The review focuses on the creation of cutting-edge CRISPR biosensors built on microchips using machine learning techniques, and on the use of the Internet of Things to transmit signals wirelessly to the cloud to enhance decision-making. The paper also describes the current limitations of CRISPR-based biosensing applications and the unresolved research questions.

Chilamkurti et al. discuss their work on “A voice-based Real-Time Emotion Detection Technique using Recurrent Neural Network Empowered Feature Modelling” (https://doi.org/10.1007/s11042-022-13363-4). This research proposes a novel feature extraction method for conversational audio data using feature embeddings based on Bag-of-Audio-Words (BoAW). The performance of the suggested approach and model is assessed on two benchmark datasets and through an empirical evaluation of real-time prediction capabilities. The suggested method considerably outperformed current state-of-the-art models, reporting 60.87% weighted accuracy and 60.97% unweighted accuracy for six fundamental emotions on the IEMOCAP dataset.