central processor
Recently Published Documents


TOTAL DOCUMENTS

132
(FIVE YEARS 26)

H-INDEX

13
(FIVE YEARS 1)

2021 ◽  
Vol 4 ◽  
pp. 10-15
Author(s):  
Gennadii Malaschonok ◽  
Serhii Sukharskyi

With the growth of Big Data and of the fields related to artificial intelligence, fast and efficient computing has become one of the most important tasks. For this reason, graphics processing unit (GPU) computing has developed rapidly over the past decade, letting scientists and developers use the thousands of cores in a GPU for intensive computations. The goal of this research is to implement the orthogonal decomposition of a matrix by applying a series of Householder transformations in Java, using the JCuda library, and to study its benefits. Several related papers were examined. Malaschonok and Savchenko introduced an improved version of the QR algorithm for this purpose [4] and achieved better results; however, according to Lahabar and Narayanan [6], the Householder algorithm is more promising for GPUs. They used Float numbers, whereas we use Double, and we are also working on a new BigDecimal type for CUDA. In addition, there is still no solution for handling huge matrices, where calculation errors may occur. This work studies and implements the orthogonal matrix decomposition algorithm, which is the first part of the SVD algorithm. We present an implementation of matrix bidiagonalization and of the calculation of the orthogonal factors by the Householder method in the JCuda environment on a graphics processor, together with a central processor implementation for comparison. We experimentally measured the acceleration obtained on the graphics processor relative to the central processor implementation. We show a speedup of up to 53 times over the CPU implementation for a large matrix (size 2048), and even better results with more advanced GPUs.
At the same time, we still observe larger calculation errors on graphics processing units due to synchronization problems. We compared execution on different platforms (Windows 10 and Arch Linux) and found almost the same computation speed on both. The results show that the GPU achieves better performance, although this approach involves more implementation difficulties.
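
The bidiagonalization step described in this abstract can be illustrated with a minimal sequential sketch. It is written in plain Python rather than the authors' Java/JCuda code, it computes only the bidiagonal matrix and not the orthogonal factors, and the function names `householder_vector` and `bidiagonalize` are hypothetical.

```python
import math

def householder_vector(x):
    """Unit vector v such that (I - 2 v v^T) x is a multiple of e_1, or None if x is zero."""
    norm_x = math.sqrt(sum(t * t for t in x))
    if norm_x == 0.0:
        return None
    v = list(x)
    # Add the norm with the sign of x[0] to avoid cancellation.
    v[0] += math.copysign(norm_x, x[0])
    norm_v = math.sqrt(sum(t * t for t in v))
    return [t / norm_v for t in v]

def bidiagonalize(a):
    """Return an upper-bidiagonal matrix with the same singular values as a (rows >= columns)."""
    b = [row[:] for row in a]
    m, n = len(b), len(b[0])
    for k in range(n):
        # Left reflection: zero column k below the diagonal.
        v = householder_vector([b[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                dot = sum(v[i - k] * b[i][j] for i in range(k, m))
                for i in range(k, m):
                    b[i][j] -= 2.0 * v[i - k] * dot
        # Right reflection: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v = householder_vector(b[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    dot = sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        b[i][j] -= 2.0 * v[j - k - 1] * dot
    return b

def frobenius(matrix):
    """Frobenius norm, which orthogonal transformations leave unchanged."""
    return math.sqrt(sum(t * t for row in matrix for t in row))

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0], [2.0, 1.0, 0.0]]
b = bidiagonalize(a)
```

In a GPU version, the inner dot products and row or column updates are the parts that would typically be distributed across threads; how the authors do this is not stated in the abstract.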


Photonics ◽  
2021 ◽  
Vol 8 (12) ◽  
pp. 527
Author(s):  
Vladimir Y. Zaitsev ◽  
Sergey Y. Ksenofontov ◽  
Alexander A. Sovetsky ◽  
Alexander L. Matveyev ◽  
Lev A. Matveev ◽  
...  

We present a real-time realization of OCT-based elastographic mapping of local strains and of the distribution of the Young’s modulus in biological tissues, which is in high demand for biomedical usage. The described variant exploits the principle of Compression Optical Coherence Elastography (C-OCE) and uses processing of phase-sensitive OCT signals. The strain is estimated by finding local axial gradients of interframe phase variations. Instead of the popular least-squares method for finding these gradients, we use the vector approach, one of its advantages being increased computational efficiency. Here, we present a modified, especially fast variant of this approach. In contrast to conventional correlation-based methods and previously used phase-resolved methods, the described method does not use any search operations or local calculations over a sliding window. Rather, it obtains local strain maps (and then elasticity maps) using several transformations represented as matrix operations applied to entire complex-valued OCT scans. We first elucidate how the proposed method differs from the previously used correlational and phase-resolved methods and then describe its realization in a medical OCT device, in which a “typical” central processor (e.g., Intel Core i7-8850H) is sufficient for real-time processing. Representative examples of elastographic images obtained on the fly are given. These results open prospects for broad use of affordable OCT devices for high-resolution elastographic visualization in numerous biomedical applications, including use in the clinic.
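
The idea of estimating an axial phase gradient from complex-valued signals without explicit phase unwrapping can be sketched as follows. This is a one-dimensional illustration under simplifying assumptions (a single A-scan, no averaging, no conversion from phase gradient to strain), not the authors' algorithm; the function name `axial_phase_gradient` is hypothetical.

```python
import cmath

def axial_phase_gradient(reference, deformed):
    """Phase gradient along depth between two complex A-scans (radians per sample).

    The interframe phase variation is carried by deformed * conj(reference).
    Multiplying each depth sample by the conjugate of its neighbour yields a
    complex number whose argument is the local phase increment, so the
    accumulated phase never has to be unwrapped.
    """
    interframe = [d * r.conjugate() for r, d in zip(reference, deformed)]
    return [
        cmath.phase(interframe[z + 1] * interframe[z].conjugate())
        for z in range(len(interframe) - 1)
    ]

# Synthetic check: a uniform phase slope of 0.3 rad per sample.
depth = 40
reference = [cmath.rect(1.0, 0.7 * z) for z in range(depth)]
deformed = [r * cmath.rect(1.0, 0.3 * z) for z, r in enumerate(reference)]
gradients = axial_phase_gradient(reference, deformed)
```

The total interframe phase in this synthetic example exceeds π many times over the depth range, yet each local increment is recovered directly, which is the property that makes vector-style processing attractive.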


Electronics ◽  
2021 ◽  
Vol 10 (19) ◽  
pp. 2335
Author(s):  
Dong-Suk Ryu ◽  
Yeung-Mo Yeon ◽  
Seung-Hee Kim

As the growth rate of the internet-of-things (IoT) sensor market is expected to exceed 30%, a technology that can easily collect and process a large number of various types of sensor data is increasingly required. However, conventional multilink IoT sensor communication based on Bluetooth low energy (BLE) enables the processing of only up to 19 peripheral nodes per central device. This study suggests an alternative way to increase the number of IoT sensor nodes while minimizing the addition of central processors: a new BLE-based group-switching algorithm expands the number of peripheral nodes that can be processed per central device. The study also verifies the relevance of applying it in industry. This device environment lowers the possibility of data errors and equipment trouble caused by communication interference between central processors, which is a critical advantage in industrial applications. The scalability and various benefits of the group-switching algorithm are expected to help accelerate various services that apply BLE 5 wireless communication by removing the constraint of accessing at most 19 nodes per central device in conventional multilink IoT sensor communication.
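
The abstract does not give the details of the group-switching algorithm. As a rough illustration of the general idea only, the sketch below partitions peripherals into groups no larger than the 19-node limit and lets one central device serve the groups in turn. The round-robin order and the names `make_groups` and `schedule` are assumptions, and no real BLE API is used.

```python
from itertools import cycle, islice

MAX_LINKS = 19  # per-central limit quoted in the abstract

def make_groups(node_ids, max_links=MAX_LINKS):
    """Split the peripheral IDs into consecutive groups of at most max_links nodes."""
    return [node_ids[i:i + max_links] for i in range(0, len(node_ids), max_links)]

def schedule(groups, slots):
    """Assumed round-robin policy: the group served in each of the next `slots` time slots."""
    return list(islice(cycle(groups), slots))

nodes = list(range(50))        # 50 hypothetical peripheral IDs
groups = make_groups(nodes)    # three groups: 19, 19 and 12 nodes
plan = schedule(groups, 4)     # the fourth slot returns to the first group
```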


2021 ◽  
Vol 5 (2(61)) ◽  
pp. 21-25
Author(s):  
Yaroslav Sokolovskyy ◽  
Denys Manokhin ◽  
Yaroslav Kaplunsky ◽  
Olha Mokrytska

The object of research is the parallelization of the training process of artificial neural networks to automate medical image analysis using the Python programming language, the PyTorch framework and Compute Unified Device Architecture (CUDA) technology. The operation of this framework is based on the Define-by-Run model. The available cloud technologies for the task and the training algorithms for artificial neural networks are analyzed. A modified U-Net architecture from the MedicalTorch library was used. It was chosen because the task needs a network that can learn effectively from small data sets: in medicine, one of the main problems is the availability of large datasets, owing to the confidentiality requirements for data of this nature. The resulting information system is able to perform the tasks set for it and contains a user-friendly interface and all the necessary tools to simplify and automate the visualization and analysis of data. The efficiency of neural network training on the central processor (CPU) and on the graphics processor (GPU) with CUDA is compared. Cloud technology was used in the study. Google Colab and Microsoft Azure were considered among the cloud services. Colab was first used to build a prototype; the Azure service was then used to train the finished neural network architecture efficiently. Measurements were performed in both services. The Adam optimizer was used to train the model. CPU run times were also measured to assess the acceleration provided by CUDA, and the acceleration obtained through GPU computing and cloud technologies was estimated.
The model developed during the research showed satisfactory results on the Jaccard and Dice metrics. A key factor in the success of this study was cloud computing services.
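
For reference, the two segmentation metrics named in this abstract can be computed for binary masks as in the following minimal sketch. Flattened 0/1 lists stand in for image masks, and the convention of returning 1.0 when both masks are empty is an assumption.

```python
def jaccard(pred, truth):
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0  # assumed convention for two empty masks

def dice(pred, truth):
    """Dice coefficient: twice the overlap divided by the total foreground count."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    return 2 * inter / total if total else 1.0  # assumed convention for two empty masks

pred = [1, 1, 0, 0]
truth = [1, 0, 1, 0]
j = jaccard(pred, truth)  # 1 overlapping pixel out of 3 in the union
d = dice(pred, truth)     # 2 * 1 / (2 + 2)
```

The two metrics are monotonically related by D = 2J / (1 + J), so they rank segmentations identically but differ in scale.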


Author(s):  
Alaa Alameer Ahmad ◽  
Hayssam Dahrouj ◽  
Anas Chaaban ◽  
Tareq Y. Al-Naffouri ◽  
Aydin Sezgin ◽  
...  

Minimizing the power consumption in mobile communication networks while ensuring a minimum quality of service (QoS) for applications is essential in light of the unprecedented expected increase in the number of connected devices and the associated data traffic beyond the fifth generation of wireless networks (B5G). This paper considers a cloud-radio access network (C-RAN) model where a central processor (CP) is connected to the base stations (BSs) via limited capacity fronthaul links. In the context of our C-RAN setting, we consider the practical case where the CP has only statistical knowledge of channel state information (CSI). While conventional wireless systems adopt the treating interference as noise (TIN) strategy to deal with the interference in the network, this paper instead considers that the CP applies the rate splitting (RS) strategy by dividing each user’s message into two parts: a private part to be decoded by the intended user only and a common part to be decoded by a subset of users, for the sole reason of interference mitigation in the network. To best account for the channel estimation errors, this paper addresses the problem of transmit power minimization under minimum QoS constraints on the achievable ergodic rate per user, so as to determine the beamforming vectors of the private and common messages as well as the rate allocated to all the users. The considered problem is of stochastic, complex, and non-convex nature. This paper addresses the problem intricacies through an iterative approach that leverages both the sample average approximation (SAA) technique and the weighted minimum mean squared error (WMMSE) algorithm to obtain a stationary point of the optimization problem in the asymptotic regime. The numerical results demonstrate the gain achieved with the RS strategy as compared to TIN, especially under high QoS requirements.
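
The sample-average step underlying SAA can be illustrated in isolation: an expectation over the random channel is replaced by an average over drawn samples. The sketch below estimates a single-link ergodic rate under an assumed Rayleigh-fading model with unit mean channel power. It does not implement the beamforming optimization, rate splitting or WMMSE, and the function name `ergodic_rate_saa` is hypothetical.

```python
import math
import random

def ergodic_rate_saa(snr, num_samples, seed=0):
    """Sample-average estimate of E[log2(1 + snr * |h|^2)] in bits per channel use.

    Assumption: Rayleigh fading, so |h|^2 is exponentially distributed with mean 1.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        gain = rng.expovariate(1.0)  # one draw of the channel power |h|^2
        total += math.log2(1.0 + snr * gain)
    return total / num_samples

estimate = ergodic_rate_saa(snr=1.0, num_samples=20000)
```

By Jensen's inequality, the estimate should fall below log2(1 + snr), the rate of a non-fading channel with the same mean power.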


Author(s):  
M. Sarsembayev ◽  
B. Urmashev ◽  
O. Mamyrbayev ◽  
M. Turdalyuly ◽  
T. Sarsembayeva

The main idea of the implementation is to reduce calculation time and thereby support a multi-user mode by hosting the application on a server accessed through a web browser. Fourth- and fifth-order Runge-Kutta methods were used to model the kinetics of chemically reacting systems. To quantify the gain, programs were written in C# for sequential computation on a central processor, and the CUDA platform was used for parallel computation on graphics processors. On the GPU, the computation was parallelized by assigning each reaction to an individual thread, which calculates the change in concentration of a given substance over a given time interval. Parallelization is performed over all elementary reactions, so as the number of reactions in the mechanism increases, the GPU computation shows a noticeable gain in time.
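
As an illustration of the numerical method mentioned here, the following sketch applies the classical fourth-order Runge-Kutta scheme to a single first-order reaction A → B. The reaction, the rate constant and the function names are assumptions chosen for the example; the authors' C# and CUDA code is not shown in the abstract.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for the system y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [
        yi + h / 6 * (a + 2 * b + 2 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]

def first_order_reaction(rate_constant):
    """Right-hand side for A -> B: d[A]/dt = -k[A], d[B]/dt = +k[A]."""
    def rhs(t, y):
        conc_a, _conc_b = y
        return [-rate_constant * conc_a, rate_constant * conc_a]
    return rhs

def integrate(f, y0, t_end, steps):
    """Integrate from t = 0 to t_end with a fixed step size."""
    h = t_end / steps
    y = list(y0)
    for i in range(steps):
        y = rk4_step(f, i * h, y, h)
    return y

final = integrate(first_order_reaction(1.0), [1.0, 0.0], t_end=1.0, steps=100)
```

In a mechanism with many elementary reactions, each reaction's rate contribution can be evaluated independently inside `rhs`, which is the part the abstract describes as being distributed over GPU threads.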


2021 ◽  
Vol 3 (1) ◽  
pp. 31-39
Author(s):  
Subarna Shakya

This work considers a multi-cell Fog-Radio Access Network (F-RAN) architecture that takes into account the noisy interference from Internet of Things (IoT) devices, with transmission taking place in the uplink using grant-free access. An edge node is used to connect the devices present in every cell and will hold a reasonable capacity in the central processor. The readings obtained from the IoT devices are used to determine the field of correlated Quality of Interests in every cell and are transmitted using the Type-Based Multiple Access (TBMA) protocol. This is in contrast to the conventional protocols that are used for diagnostic purposes. In the proposed work, we implement the multi-cell F-RAN using cloud or edge detection and analyse this form of information-centric radio access. Model-based detectors are implemented, and the asymptotic behavior of the probability of error is determined for both edge and cloud detection. Similarly, data-driven cloud and edge detectors are used when statistical models are not available.


2021 ◽  
Vol 24 (1) ◽  
pp. 42-56
Author(s):  
Татьяна Петровна Баранова ◽  
Александр Борисович Бугеря ◽  
Кирилл Николаевич Ефимкин

The paper considers the distribution of computations within one node of a hybrid computing system for applied programs with computation-intensive operations. A method is proposed for the static distribution of computations, as well as a method for automatic balancing of the computational load during program execution, which is based on periodically analyzing the CPU load produced by the running program and deciding whether to redistribute the computational load. The proposed methods are implemented in an applied program that solves a gas dynamics problem using the computing resources of a multicore central processor and graphics accelerators. The results of program execution with various data distributions were obtained and analyzed, both with and without the mechanism for automatic balancing of the computational load.
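
The two mechanisms described here, a fixed initial split and a periodic rebalancing decision, can be sketched generically. The abstract bases its decision on the measured CPU load; the sketch below instead uses measured per-step run times as a stand-in, and the function names, the 5% tolerance and the proportional rule are all assumptions.

```python
def static_split(total_cells, cpu_share):
    """Static distribution: give the CPU a fixed fraction of the grid cells."""
    cpu_cells = round(total_cells * cpu_share)
    return cpu_cells, total_cells - cpu_cells

def rebalance(cpu_cells, gpu_cells, cpu_time, gpu_time, tolerance=0.05):
    """Return a new split proportional to measured throughput.

    The current split is kept when the two step times differ by less than
    `tolerance` (relative). Both times are assumed to be positive.
    """
    if abs(cpu_time - gpu_time) <= tolerance * max(cpu_time, gpu_time):
        return cpu_cells, gpu_cells
    cpu_rate = cpu_cells / cpu_time  # cells processed per second on the CPU
    gpu_rate = gpu_cells / gpu_time  # cells processed per second on the GPU
    total = cpu_cells + gpu_cells
    new_cpu = round(total * cpu_rate / (cpu_rate + gpu_rate))
    return new_cpu, total - new_cpu

initial = static_split(1000, 0.5)                       # equal split to start
adjusted = rebalance(*initial, cpu_time=2.0, gpu_time=1.0)
settled = rebalance(*adjusted, cpu_time=1.0, gpu_time=1.0)
```

The tolerance band avoids moving data between devices for small timing fluctuations, since each redistribution has its own transfer cost.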


Author(s):  
О. П. Кравченко ◽  
Е. Г. Манойлов ◽  
Г. О. Бабич ◽  
Я. С. Малий

Development of an electronic monitoring and control system for achieving an effective ratio between electrical energy generation and consumption in the local object power supply system. Methodology. The theory of electrical circuits and electronic circuits was used. Obtained results. An electronic system for monitoring and controlling power supply in the local object power system was developed. The system comprises three modules: a central processor, a module for monitoring environment parameters, and an executive module that consists of measuring (current, voltage) and relay blocks. The central processor processes signals from the monitoring and measuring blocks and issues commands to the relay block to switch consumer loads and electric generators on or off. The developed system allows both maximal power take-off from distributed (renewable) energy sources and flexible regulation of power consumption, in order to achieve an effective ratio between the generation of electrical energy provided by renewable energy sources and the general distribution network, and the total load consumption in the local object power system. Originality. The electronic monitoring and control system in the local object power system allows monitoring of generated and consumed loads in real time. The system can form real-time energy profiles, on the basis of which the control algorithm for the executive block is formed in order to achieve an effective ratio between generation and consumption of electricity in the power system of the local facility. A power consumption control system has been developed, which consists of a central processor and monitoring and executive units.
The monitoring unit allows energy profiles to be created in real time, on the basis of which the control algorithm in the executive unit is formed in order to achieve an effective ratio between electricity generation and consumption in the local object power system. Practical value. As a result of the presented work, an electronic system for monitoring and controlling electricity supply in the local object power system, with defined real-time profiles of distributed energy source generation and required consumption, was developed to provide efficient energy consumption according to the concepts of distributed electrical networks with renewable energy sources and the Smart House.


The tasks related to the construction of a unified semi-active system for damping vibrations of the supporting platform (chassis) of a wheeled vehicle (WV), taking into account the real road profile, were considered. The influence of the network on the resulting performance of the entire unified damping system was estimated. A networked setup combining the model of one wheelset, a possible suspension control law, the central processor and a physical model of the CAN network was modeled using National Instruments equipment. The results of the experiments, both purely mathematical and with a physical network model, confirmed that the proposed solutions work. Keywords: CAN bus; semi-active suspension system; identification; modeling

