Semi-Distributed Control For FPGA-based Reconfigurable Systems

Chiraz Trabelsi, Samy Meftali and Jean-Luc Dekeyser
INRIA Lille Nord Europe - LIFL - Universite Lille1, Lille, France
Email: {Firstname.Lastname}@inria.fr
hal-00703093, version 1 - 31 May 2012
Abstract: Due to the growing complexity of the applications targeted by FPGA-based reconfigurable systems, the control design of such systems is becoming one of the main hurdles faced by designers. In this paper, we propose a semi-distributed control model based on the separation of control concerns (monitoring, decision-making and reconfiguration) and on formalism-oriented design, in order to decrease the design complexity of the control and to facilitate design verification, reuse and scalability. This model is composed of distributed controllers, each handling the self-adaptivity of one reconfigurable region of the system, and a coordinator that coordinates their reconfiguration decisions in order to respect global system constraints. Implementations on FPGA showed that our semi-distributed control model is more flexible, reusable and scalable than the centralized one, at the cost of a slight increase in required hardware resources.
I. INTRODUCTION

Thanks to their ability to be reconfigured an arbitrary number of times, FPGAs offer high flexibility for modern embedded systems design. Partial Dynamic Reconfiguration (PDR), supported by several FPGAs, offers more flexibility by allowing portions of the FPGA to be reconfigured at runtime, in order to load different functionalities and adapt to runtime changes, while the rest keeps operating [1]. Progress in FPGA technology has made it possible to embed a growing number of computing resources on one chip, targeting increasingly sophisticated applications. However, this has led to growing design complexity, since design tools do not evolve at the same pace as hardware technology, resulting in a productivity gap. One of the most complex design tasks for reconfigurable SoCs (RSoCs) is control design, since it has to handle different aspects related to runtime adaptivity. In this context, autonomy, modularity and formalism-oriented design can be viewed as an effective combination to deal with the growing RSoC design complexity.

In this paper, we propose a semi-distributed control model for FPGA-based reconfigurable systems. This model divides the control problem between autonomous controllers, each handling the self-adaptivity of one reconfigurable region of the system through three major tasks (monitoring, decision-making and reconfiguration) carried out by three different modules. In order to respect global system constraints and correlations between reconfigurable regions, the controllers' reconfiguration decisions are coordinated by a coordinator before reconfigurations are launched. This two-layer decision-making is well adapted to single-FPGA systems as well as to multi-FPGA systems, where the coordinator is implemented on a master FPGA. The semi-distributed control is also well adapted to control architectures with a deeper hierarchy, by organizing the controllers into clusters coordinated by local coordinators and implemented either on the same FPGA or on different FPGAs. The proposed semi-distributed decision-making model is based on the mode-automata formalism, which abstracts the control problem and gives clear control semantics, decreasing design complexity. Such a splitting of the control problem offers high design flexibility, facilitating reuse and scalability. Implementation results showed that our semi-distributed control model is more flexible, reusable and scalable than the centralized one, at the cost of a slight increase in required hardware resources.

The rest of this paper is organized as follows. Section II gives a summary of related works. Section III presents the proposed control model. In Section IV, an example of the control implementation for a video processing application is presented. The last section concludes this paper and outlines future work.

II. RELATED WORKS

Distributed control for the adaptivity of FPGA-based systems has been proposed in several works. In [2], a hardware controller was allocated to each reconfigurable region in order to control the tasks it runs throughout the application execution. However, reconfiguration decisions depended only on a task graph, and the correlation between regions was not treated. In [3], distributed control was used for the reconfiguration of an organic computing system. However, this work focused more on the distributed access to the configuration port (ICAP), which accelerates the reconfiguration process compared to centralized access, without giving details about the control components used (monitoring, decisions, etc.). In [4], [5], the authors propose a general model of networked entities, each handling computing, monitoring, control and communication. Nevertheless, the decision to reconfigure such entities is made in a centralized way. Distributed control was also used for multi-FPGA systems, which have started to gain interest and have been investigated in several works. However, only one controller was used per FPGA [6], [7], which implies a high design complexity of the controllers, and no formalism was proposed to model the control system. To master design complexity, formal control models are important since they enable a shorter design cycle by improving
[Figure 1: block diagram of the control model. Each of the n controllers (Controller_1 ... Controller_n) contains a monitoring module, a decision module and a reconfiguration module attached to one controlled region. The monitoring module receives external events and local monitored data and sends monitoring data to the decision module. The decision module sends load(mode) to the reconfiguration module, which applies reconfiguration actions to the controlled region and returns loaded_mode. The decision modules send reconfiguration requests, acceptances and refusals to the coordinator, which returns its suggestions and responses. The three layers are labeled distributed monitoring, semi-distributed decision-making and distributed reconfiguration.]
Fig. 1: Overview of the proposed control model

[Figure 2: mode-automaton with modes Mode_i^1, Mode_i^2, ..., Mode_i^{M_i}.
Header: controller_decision_i(monitoring_data_i, loaded_mode_i, coord_inprogress, coord_suggestion_i, coord_decision_i) = (controller_i_request, controller_i_response, load_mode_i)
Transition labels recovered from the figure:
  condition_i(mode_i^j) / send_decision(mode_i^j) (*)
  coord_decision_i(mode_i^j) = refusal
  coord_decision_i(mode_i^j) = authorization / load_mode_i = mode_i^j
  loaded_mode_i = mode_i^1, loaded_mode_i = mode_i^2]
(*) condition_i(mode_i^j) / send_decision(mode_i^j) <=> [condition_i_request(mode_i^j) = true / controller_i_request = mode_i^j, or condition_i_acceptance(mode_i^j) = true / controller_i_response = acceptance, or condition_i_refusal(mode_i^j) = true / controller_i_response = refusal]
Fig. 2: The controller mode-automaton
design reuse as well as offering formal verification. The mode-automata formalism [8], on which the semi-distributed decision-making of our control model is based, is a syntactically simplified version of Statecharts [9]. It has been adopted as a specification language for control-oriented reactive systems [10]. It has also been used to model control for FPGA-based reconfigurable systems [11]. However, the proposed models targeted a centralized controller.

III. SEMI-DISTRIBUTED CONTROL MODEL

The main objective of the control model proposed in this paper is to solve design problems related to design complexity, verification, reuse and scalability. To achieve this objective, the proposed control model combines three main points: autonomy, modularity and formalism.

A. Autonomous modular distributed controllers for self-adaptivity

Each distributed controller is composed of three modules handling monitoring, reconfiguration decision-making and reconfiguration realization for a given reconfigurable region, as Figure 1 shows. The monitoring module collects information from the behavior of the controlled region and from external events, such as those sent by sensors. The monitoring data is sent to the decision module, which makes reconfiguration decisions accordingly. Each decision module decides locally whether or not a reconfiguration of the controlled region is required. Because each controller has only a local vision, launching a reconfiguration of its controlled region without checking whether it can coexist with the current configurations of the other regions might cause problems such as safety violations, or might not respect the global control constraints, such as those related to performance, temperature or energy consumption. Therefore, before launching a reconfiguration that it judges necessary according to the monitoring data, the controller has to send a reconfiguration request to the coordinator. If the coordinator authorizes the requested reconfiguration, the decision module notifies the reconfiguration module in order to launch the required configuration. The role of the reconfiguration module is to apply reconfiguration actions on the controlled region, which consist in loading the required configuration data (bitstream) into the reconfigurable region through a configuration port, such as the ICAP for Xilinx FPGAs. After loading the required bitstream, the reconfiguration module notifies the decision module so that it updates its automaton's current mode.

B. Mode-oriented decision-making

The proposed decision-making model is based on the mode-automata formalism [8]. It is composed of the distributed controllers' automata and the coordinator's automaton. These automata communicate through coordination information whenever one of the controllers decides that a reconfiguration of its controlled region is required. In this case, it sends a reconfiguration request to the coordinator and waits for its decision. If the request also implies the reconfiguration of other regions in order to respect the global system constraints, the coordinator sends reconfiguration suggestions to the concerned controllers. These controllers can accept or refuse those suggestions. After treating the controllers' responses, the coordinator gives its decision, which can be either the authorization or the refusal of the requested reconfiguration.
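The request, suggestion and decision exchange described above can be sketched, from the point of view of one controller, as a small step function. This is only an illustrative software model written in Python, not the hardware design of the paper: the class name `Controller`, the callbacks `wanted_mode` and `accepts`, and the encoding of a coordinator decision as a `(mode, verdict)` pair are assumptions made for the example. Only the signal names in the comments come from the automaton header of Figure 2.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

AUTHORIZATION, REFUSAL, ACCEPTANCE = "authorization", "refusal", "acceptance"


@dataclass
class Outputs:
    request: Optional[int] = None    # controller_i_request
    response: Optional[str] = None   # controller_i_response
    load_mode: Optional[int] = None  # load_mode_i


class Controller:
    """Hypothetical software model of one controller's decision module."""

    def __init__(self, mode: int,
                 wanted_mode: Callable[[int, dict], Optional[int]],
                 accepts: Callable[[int, dict], bool]) -> None:
        self.mode = mode                # current mode of the controlled region
        self.wanted_mode = wanted_mode  # local rule: (mode, data) -> target mode or None
        self.accepts = accepts          # local rule: (suggested mode, data) -> bool

    def step(self, monitoring_data: dict, loaded_mode: Optional[int] = None,
             coord_inprogress: bool = False,
             coord_suggestion: Optional[int] = None,
             coord_decision: Optional[Tuple[int, str]] = None) -> Outputs:
        out = Outputs()
        if loaded_mode is not None:
            # The reconfiguration module reports a loaded bitstream.
            self.mode = loaded_mode
        if coord_decision is not None:
            mode, verdict = coord_decision
            if verdict == AUTHORIZATION:
                out.load_mode = mode    # command the reconfiguration module
        elif coord_suggestion is not None:
            ok = self.accepts(coord_suggestion, monitoring_data)
            out.response = ACCEPTANCE if ok else REFUSAL
        elif not coord_inprogress:
            # Requests are only sent when no coordination is in progress.
            target = self.wanted_mode(self.mode, monitoring_data)
            if target is not None and target != self.mode:
                out.request = target
        return out
```

Note that the current mode changes only when `loaded_mode` is reported, mirroring the text: an authorization triggers a load command, and the automaton's mode is updated after the bitstream has actually been loaded.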
Global configuration   Region_1        Region_2        ...   Region_n
1                      mode_1^{j1.1}   mode_2^{j2.1}   ...   mode_n^{jn.1}
2                      mode_1^{j1.2}   mode_2^{j2.2}   ...   mode_n^{jn.2}
...                    ...             ...             ...   ...
K                      mode_1^{j1.K}   mode_2^{j2.K}   ...   mode_n^{jn.K}
Fig. 4: Global configurations table (GC)

[Figure 3: mode-automaton with modes Idle, TreatRequests (running treat_requests()) and TreatResponses (running treat_responses()).
Header: coordinator({controller_i_request}, {controller_i_response}) = (coord_inprogress, {coord_suggestion_i}, {coord_response_i})
Idle -> Idle: {controller_i_request} = {}
Idle -> TreatRequests: {controller_i_request} != {} / coord_inprogress = true
TreatRequests -> TreatResponses: involve_others = true / send_suggestions()
TreatRequests -> Idle: involve_others = false / coord_inprogress = false, send_decision(), final_decision = null
TreatResponses -> TreatResponses: final_decision = null
TreatResponses -> Idle: final_decision != null / coord_inprogress = false, send_decision(), final_decision = null]
Fig. 3: The coordinator mode-automaton

The controller mode-automaton

The decision module of each controller is modeled by a mode-automaton. Figure 2 shows an example of this mode-automaton. Each mode $Mode_i^j$ corresponds to a given configuration/mode of the controlled $region_i$, where $i \in [1..n]$, $n$ is the number of the system's reconfigurable regions, $j \in [1..M_i]$ and $M_i$ is the number of modes/configurations of $region_i$. Here, we assume that each reconfigurable region has a set of configuration possibilities predefined at design time. The inputs and outputs of the mode-automaton are shown in its header. The inputs are monitoring data (monitoring_data_i) sent by the monitoring module, a reconfiguration success notification (loaded_mode_i) from the reconfiguration module, and coordination information from the coordinator. This information includes a notification at the beginning and end of each coordination process (coord_inprogress), reconfiguration suggestions (coord_suggestion_i), and the coordinator's decision at the end of a coordination process (coord_decision_i). Based on monitoring data, the current mode of the controlled region, and coordination information (coord_inprogress and coord_decision_i), the controller decides whether a reconfiguration of the controlled region is required. If so, it sends a reconfiguration request to the coordinator (controller_i_request). During a coordination process, a controller might receive reconfiguration suggestions from the coordinator, to which it can respond (controller_i_response) with acceptance or refusal depending on monitoring data. If the coordinator authorizes a reconfiguration ($coord\_decision_i(mode_i^j) = authorization$), the controller sends a reconfiguration command to the reconfiguration module (load_mode_i), indicating the mode to be loaded. If the coordinator refuses a reconfiguration ($coord\_decision_i(mode_i^j) = refusal$), it cannot be launched because it cannot coexist with the current configurations of the other regions, according to the global system constraints, as will be explained later.

The coordinator mode-automaton

The role of the coordinator is to coordinate the reconfiguration decisions of the controllers in order to guarantee that the system configuration respects the global system constraints. For this, it uses a table containing the system configurations allowed by the constraints of safety, performance, consumption, etc. This table is determined at design time and can be filled manually by the designer or generated from high-level languages such as those based on contract mechanisms [12]. We call this table GC (Global Configurations). It is shown in Figure 4, where each row corresponds to a global configuration, which is a combination of partial configurations of the reconfigurable regions. This table is defined as follows:

$GC[i, k] = mode_i^{j_{i.k}}$, with $i \in [1..n]$, $j_{i.k} \in [1..M_i]$ and $k \in [1..K]$

where $mode_i^{j_{i.k}}$ is the mode of $region_i$ corresponding to the global configuration $k$; $n$ is the number of reconfigurable regions and of controllers; $K$ is the number of possible global configurations; and $M_i$ is the number of modes/configurations of $region_i$.

The exchanges between the controllers and the coordinator are not continuous in time. They only happen when one of the controllers decides that a reconfiguration of its controlled region is required. This reduces the impact of the communication latency inside the control model on the overall system performance. The coordination algorithm is executed by a three-mode automaton, as Figure 3 shows. The Idle mode corresponds to a coordinator waiting for reconfiguration requests coming from controllers. The TreatRequests and TreatResponses modes correspond to a coordinator that is treating the controllers' reconfiguration requests and their responses to reconfiguration suggestions, respectively. The coordinator starts in the Idle mode. Whenever it receives a reconfiguration request ({controller_i_request} != {}), it notifies the controllers that a coordination is in progress (coord_inprogress = true), so that they stop sending reconfiguration requests. It then moves to the TreatRequests mode. Due to its local vision of the system, a request sent by a controller concerns only its controlled region, which corresponds here to a cell of the GC table. Note that the number of requests received at the same time depends on the instants at which the controllers make reconfiguration decisions, as well as on the type of communication between the controllers and the coordinator (through a bus, point-to-point, etc.). In this paper, we used point-to-point communication, as will be shown in the case study. This implementation has the advantage of accelerating the coordination process, since more than one request or response can be received from the controllers at the same time. Other communication architectures remain possible by modifying the communication part of the coordinator without modifying the coordination algorithm. Once in the TreatRequests mode, the coordinator checks whether the configuration(s) requested by the controller(s) at a given time, combined with the current configurations
of the regions handled by other controllers exist as a possible global configuration in the GC table. If not, it checks which other reconfigurations, combined with the requested one, lead to one or more global configurations that respect the global system constraints. Here, if more than one global configuration satisfies the requested reconfiguration(s), the possibilities are ordered according to the control strategy/objective. In the present implementation of the coordinator, we order the possibilities in a list according to the number of partial reconfigurations they require, and we give the highest priority to the global configuration that requires the fewest partial reconfigurations, in order to minimize reconfiguration time. Other algorithms can also be used here to order the configuration possibilities. If the first possibility in the ordered list satisfies the requests and does not require the reconfiguration of other regions (involve_others = false), the coordinator ends the coordination process (coord_inprogress = false) and directly sends its decision (final_decision = authorization) to the controllers requesting the reconfigurations, as Figure 3 shows. Otherwise, the coordination process is divided into steps. Each coordination step is related to one possibility of the ordered list. The coordinator begins by sending reconfiguration suggestions to the controllers whose regions have to be reconfigured in order to move to the first possibility. Then it goes to the TreatResponses mode. If a coordination step ends with positive responses from all the concerned controllers, the coordination process ends with an authorization (final_decision = authorization). In this case, the coordinator notifies the controllers (coord_inprogress = false), sends its reconfiguration authorization to the controllers requesting the reconfigurations as well as to the other concerned controllers, and goes back to the Idle mode.
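The selection and ordering of global configurations used by the coordinator can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the coordinator's hardware implementation: the function names `candidate_configs` and `coordinate`, the 0-based region indices, and the `ask` callback standing in for a controller's answer to a suggestion are all hypothetical. Suggestions are also queried one after the other here, whereas the paper's point-to-point implementation can receive several responses at the same time.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

GlobalConfig = Tuple[int, ...]  # one GC row: the mode of each region


def candidate_configs(gc: Sequence[GlobalConfig], current: GlobalConfig,
                      requests: Dict[int, int]) -> List[GlobalConfig]:
    """GC rows satisfying every request, fewest partial reconfigurations first."""
    matches = [row for row in gc
               if all(row[region] == mode for region, mode in requests.items())]
    # Number of regions whose mode differs from the current configuration.
    return sorted(matches, key=lambda row: sum(a != b for a, b in zip(row, current)))


def coordinate(gc: Sequence[GlobalConfig], current: GlobalConfig,
               requests: Dict[int, int],
               ask: Callable[[int, int], bool]) -> Optional[GlobalConfig]:
    """Return the authorized global configuration, or None for a refusal.

    ask(region, mode) models the response of a controller to a suggestion.
    """
    for row in candidate_configs(gc, current, requests):
        # Regions that did not send a request but would have to change mode.
        others = [r for r in range(len(row))
                  if r not in requests and row[r] != current[r]]
        if all(ask(r, row[r]) for r in others):  # empty list: involve_others = false
            return row
    return None  # every possibility was refused, or no row matched
```

For example, with `gc = [(1, 1), (2, 1), (2, 2)]`, a current configuration `(1, 1)` and a request `{0: 2}`, the row `(2, 1)` is tried first because it needs a single partial reconfiguration and involves no other region.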
If a coordination step does not end with positive responses from all the concerned controllers, the coordinator remains in the TreatResponses mode and considers the next possibility. The coordination ends with a refusal (final_decision = refusal) if all the possibilities have been refused by the controllers. In this case, the coordinator notifies the controllers (coord_inprogress = false) and sends a reconfiguration refusal to the controllers requesting the reconfigurations. Then it goes back to the Idle mode.

Advantages compared to the centralized model

Thanks to the distribution of the control problem between controllers, the proposed control model provides higher design flexibility than centralized control, facilitating reuse and scalability. In the case of centralized decision-making, the decision module of the centralized controller can be modeled by a mode-automaton where each mode corresponds to a global configuration of the system (a combination of the configurations of the reconfigurable regions). Transitions from one mode to another thus depend on a global vision of the system, using monitoring data as well as global constraints (the same as those used by the coordinator in the semi-distributed model). This makes the centralized decision module tightly dependent on the implemented system, which is an obstacle
to design reuse. Indeed, when designers want to add regions to a previous system design, the whole decision model has to be rewritten (both modes and transitions), because each mode has to take the new regions into account. With the semi-distributed decision-making model, on the other hand, adding new regions simply requires adding a controller for each new region. The monitoring and decision modules of the controllers can easily be reused, since they depend only on the monitoring data related to the controlled region. The coordination algorithm does not change. Only the number of controllers to be coordinated has to be increased, and the global constraints checked by the coordinator have to be modified. This is done simply by modifying the GC table.

IV. CASE STUDY

This case study explains the use of the semi-distributed control for a video scaling application targeting a single-FPGA system. It then evaluates its efficiency in terms of design complexity, reuse, scalability and resource overhead compared to the centralized model.

A. The application

The application consists of a classical downscaler composed of two main tasks: a horizontal filter and a vertical filter applied to a sequence of video frames. Each filter is composed of a repetition of an elementary task on a block of the frame. An elementary task of a filter is executed by a hardware accelerator in order to guarantee high performance. Using data-parallelism, each task can be implemented with a number of hardware accelerators performing the same elementary task in parallel on different frame blocks, which reduces execution time. In our case study, we assume that each hardware accelerator is implemented in a reconfigurable region in order to adapt to runtime changes, as we will explain later. The variety of parallelism possibilities of the considered application makes it possible to test the scalability of our control model by varying the number of reconfigurable regions and thus the number of controllers. Using similar accelerators to implement data-parallelism for each filter task makes it possible to reuse the same controller for similar regions, thus reducing the design time of the control model. In our case study, the objective of the control model is to adapt the downscaler application to changes in performance and power requirements. For this, we assume that the elementary tasks of both the horizontal and vertical filters are implemented in three versions of hardware accelerators available in an IP (Intellectual Property) library, which differ in terms of performance and power consumption. Switching between these versions at runtime gives each reconfigurable region three different modes (HFilter_mode_j / VFilter_mode_j), $j \in \{1, 2, 3\}$. Modes HFilter_mode1 and VFilter_mode1 give the highest performance but also the highest consumption for the horizontal and vertical filters respectively. HFilter_mode3 and VFilter_mode3 have the lowest performance but also the lowest consumption. Our semi-distributed control model allocates a controller to each region in order to control its
behavior, switching between modes depending on the requirements in terms of performance and power consumption. The role of the coordinator is to verify that the global configuration of the system respects the constraints indicated in the GC table, as explained previously in Section III-B, by coordinating the reconfiguration decisions made by the controllers.

B. Hardware design of the semi-distributed control

Our control model is designed entirely in hardware in order to avoid the execution time overhead of a software implementation. This design follows the control model described in Figure 1. The inputs of this model are performance and consumption information. The objective of the control in this case study is to make a trade-off between the performance and the consumption constraints. We define the performance requirements, for both filters, as three performance levels given by the user, where level 1 corresponds to the highest performance. As for the consumption requirements, we assume that a battery sensor gives the battery level at each clock cycle. This information is monitored by the monitoring modules of the distributed controllers in order to be taken into account in reconfiguration decisions. Given that all reconfigurable regions have three configuration possibilities, each controller uses a decision module modeled by a three-mode automaton. Figure 5 shows the mode-automaton of the controller related to the horizontal filter, through three different control aspects. The controller related to the vertical filter follows the same concepts. As stated previously, the decision-making of each controller depends on a local vision of the system, which decreases its design complexity. This case study shows how the local-vision decisions of the controllers can be coordinated in order to respect global system constraints.
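As an illustration of the kind of local rule each controller evaluates, the battery-threshold conditions (1) and (2) stated in the rules of this subsection can be sketched in Python with the coefficients of Figure 5(a). The function names and the per-mode energy values `h` used in the usage note are hypothetical and chosen only for the example; the paper's controllers are implemented in hardware.

```python
# Coefficients a_{j1,j2} taken from Figure 5(a); B is the margin b_{j2,j1}.
A = {(1, 2): 0.75, (2, 3): 0.75 * 0.75}
B = 0.05


def must_step_down(ab: float, fb: float, h: dict, j1: int, j2: int) -> bool:
    """Condition (1): AB / H_j1 < a_{j1,j2} * FB / H_1.

    True when the available energy no longer allows staying in mode j1,
    so a request for the less consuming mode j2 is justified.
    """
    return ab / h[j1] < A[(j1, j2)] * fb / h[1]


def may_step_up(ab: float, fb: float, h: dict, j2: int, j1: int) -> bool:
    """Condition (2): AB / H_j1 >= (a_{j1,j2} + b_{j2,j1}) * FB / H_1.

    True when the available energy allows moving from mode j2 to the
    more consuming mode j1; the margin B provides hysteresis.
    """
    return ab / h[j1] >= (A[(j1, j2)] + B) * fb / h[1]
```

With hypothetical energy values `h = {1: 4.0, 2: 3.0, 3: 2.0}` and `fb = 100.0`, an available energy of 77 satisfies neither `must_step_down(77.0, fb, h, 1, 2)` nor `may_step_up(77.0, fb, h, 2, 1)`. This band between the two thresholds is what keeps a controller from switching back and forth between two modes.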
The controllers' decisions (reconfiguration requests and responses to suggestions) are based on the following rules:

- No request is sent while a coordination process is in progress (coord_inprogress = true).
- If a request for mode $j$ has been refused by the coordinator (refused_mode_j = true), no request is sent for the same mode, since it would be refused again. Moving to the requested mode is still possible in a later coordination process, if the coordinator sends the controller a suggestion for the same mode and the coordination process ends with an authorization.
- Being in H/VFilter_mode_$j_1$, a controller decides that a reconfiguration to a less consuming mode H/VFilter_mode_$j_2$ is required only if the user requires a lower performance level, or if the consumption constraints do not allow it to stay in H/VFilter_mode_$j_1$, which is the case when the following condition holds:

$AB / H_{j_1} < a_{j_1,j_2} \cdot FB / H_1$   (1)

as Figure 5(a) shows, where $H_j$ ($V_j$ for the vertical filter) is the energy consumption per cycle of the controlled region's mode H/VFilter_mode_$j$, $AB$ is the available battery energy at a given clock cycle and $FB$ is the energy of a full battery. This constraint checks whether the available energy is under the threshold (determined by $a_{j_1,j_2}$) that allows the region to stay in H/VFilter_mode_$j_1$, taking the highest consumption ($H_1$) as a reference. In Figure 5(a), $a_{1,2} = 75\%$ and $a_{2,3} = 75\% \cdot 75\%$, which give the thresholds to move from H/VFilter_mode1 to H/VFilter_mode2 and from H/VFilter_mode2 to H/VFilter_mode3 respectively.

- In order to move from a mode H/VFilter_mode_$j_2$ to a mode H/VFilter_mode_$j_1$ that consumes more, the user must require a performance level higher than the previous one and the consumption constraints must allow the move to the target mode, which is the case when

$AB / H_{j_1} \geq (a_{j_1,j_2} + b_{j_2,j_1}) \cdot FB / H_1$   (2)

Note that we add the term $b_{j_2,j_1}$ to prevent the consumption constraints in (1) from leading the controller, once in H/VFilter_mode_$j_1$, to go back to H/VFilter_mode_$j_2$ almost immediately, which would lead to an infinite loop of reconfigurations. $b_{j_2,j_1}$ has to be chosen carefully in order to avoid this problem. In our case study, we take $b_{j_2,j_1} = 5\%$, as Figure 5(a) shows.

If the controller receives a reconfiguration suggestion from the coordinator, it treats it as follows. If the suggestion requires moving to a less consuming mode, the controller accepts directly. Otherwise, the controller checks the consumption constraints in (2) in order to accept or refuse, as Figure 5(b) shows.

The coordinator's automaton was implemented according to the description in Figure 3. After a reconfiguration is authorized by the coordinator, the decision modules of the concerned controllers notify the reconfiguration modules. Partial reconfigurations can then be launched either in parallel or sequentially. Parallel reconfigurations are only possible for systems having more than one configuration port (ICAP for Xilinx FPGAs), such as multi-FPGA systems, but not for single-FPGA systems, because current FPGAs have only one ICAP. Therefore, in our single-FPGA system, the distributed reconfiguration model was implemented so that reconfigurations are carried out sequentially. Each reconfiguration module contains a dedicated register indicating which mode is to be loaded in the controlled region. These registers are then read by a processor, which communicates with the ICAP port in order to load the required bitstreams into the reconfigurable regions. When a required configuration is loaded, the processor notifies the reconfiguration module. The reconfiguration module then notifies the decision module, which updates the current mode of its mode-automaton accordingly.

C. Design reusability and scalability

In order to evaluate the efficiency of our control model compared to the centralized one in terms of design reusability and scalability, we designed the semi-distributed and centralized control models for different numbers of controlled
[Figure 5: three views of the same three-mode automaton (HFilter_mode1, HFilter_mode2, HFilter_mode3). Mode changes are taken when loaded(mode_j) = true.
a) Controller's reconfiguration requests. Guards recovered from the figure, all conditioned on coord_inprogress = false and refused_mode_j = false for the requested mode j:
  performance_level = 2 or (AB / H1 < 75% . FB / H1) / request(mode2)
  performance_level = 3 or (AB / H2 < 75% . 75% . FB / H1) / request(mode3)
  performance_level = 3 / request(mode3)
  performance_level = 1 and (AB / H1 >= (75% + 5%) . FB / H1) / request(mode1)
  performance_level = 2 and (AB / H2 >= ((75% . 75%) + 5%) . FB / H1) / request(mode2)
b) Treating the coordinator's suggestions. A suggestion for a less consuming mode is accepted directly (accept(mode2), accept(mode3)). A suggestion for mode1 is accepted if AB / H1 >= (75% + 5%) . FB / H1 and refused otherwise. A suggestion for mode2 coming from mode3 is accepted if AB / H2 >= ((75% . 75%) + 5%) . FB / H1 and refused otherwise.
c) Treating the coordinator's response. coord_response(mode_j) = authorization / load(mode_j); coord_response(mode_j) = refusal / refused_mode_j = true.]
Fig. 5: Mode-automaton of the HFilter's controller
regions (up to n = 10 regions, where n/2 regions implement the horizontal filter and the rest the vertical filter). We started the semi-distributed model design with one controller for each type of filter. Later, we reused these controllers to compose bigger control models. To scale the coordination, we only modified the number of coordinated controllers, given as a parameter to the coordinator, as well as the GC table. Splitting the control problem between the controllers and the coordinator allowed us to test them separately, which significantly facilitated the design phase. On the other hand, adapting the centralized controller to different numbers of regions was more complicated. The centralized controllers were implemented using one mode-automaton, as explained previously. The controller was rewritten each time to adapt to the system implementation, which led to longer design phases.

D. The semi-distributed control through a simulation scenario

Our semi-distributed control model was simulated using Xilinx ISE 12.4 for different numbers of controllers. To explain how the semi-distributed decision-making evolves at runtime, we consider a simulation scenario for a control model with 4 controllers and a coordinator. The first two controllers control regions HFilter1 and HFilter2, which implement the horizontal filter. The two others control regions VFilter1 and VFilter2, which implement the vertical filter. In this case study, we suppose that the global system constraints require all the regions to implement the same mode number, as Table I shows. By restricting the possible global configurations, this choice makes it possible to test both acceptance and refusal in coordination processes. The inputs of the control model are the available battery signal sent by the battery sensor, and the processor commands. These commands are used to send the performance level required by the user, to read the reconfiguration registers of the reconfiguration modules and to notify these modules at the end of the reconfigurations. These inputs were simulated using VHDL processes. Here, we suppose that the processor reads the reconfiguration registers after each frame downscaling. Table II gives the characteristics of the simulated system for the studied scenario, in terms of performance (frames/s) and power consumption. These values are used to determine the energy consumption per cycle of the controlled regions, the decrementation step of the battery, and the instants at which the processor reads the reconfiguration registers. Figure 6 describes the simulation scenario as a chart, where the x axis gives the instants at which different events occur. These events are related to the current battery level, the performance level required by the user and the coordination processes. The y axis gives the available battery energy and indicates the thresholds used by the different controllers to make reconfiguration decisions. The chart points are labeled with numbers corresponding to the global configuration at different instants of the simulation. At $t < t_1$, the current global configuration is number 1 and the performance level required by the user is 1. At $t = t_1$, the available battery energy (AB) reaches 75% of a fully charged battery (FB). In this case, all controllers send reconfiguration requests to the coordinator asking to move to H/VFilter_mode2, as we have seen in Figure 5(a). The coordinator notifies the controllers that a coordination process is in progress. Then, it looks for the global configuration(s) that satisfy the received
            Global configuration number
            1               2               3
Region 1    HFilter mode1   HFilter mode2   HFilter mode3
Region 2    HFilter mode1   HFilter mode2   HFilter mode3
Region 3    VFilter mode1   VFilter mode2   VFilter mode3
Region 4    VFilter mode1   VFilter mode2   VFilter mode3
TABLE I: GC table for the 4-region system

Global configuration   Consumption of H/VFilter (mW)   Performance (frames/s)
1                      60/70                           10
2                      40/50                           8
3                      20/30                           5
TABLE II: Power consumption and performance for different configurations of the 4-region system
requests. Here, only one global configuration satisfies the requests: the one in column 2 of Table I. Since the current coordination process does not involve any controllers other than those that sent the requests, a reconfiguration authorization is sent to the controllers together with a notification of the end of the coordination process. The decision modules of the controllers send reconfiguration commands to the reconfiguration modules, asking them to load H/VFilter mode2, as shown in Figure 5(c). Then, when the processor launches its read commands, it finds that the controllers require a move to H/VFilter mode2. After loading the required partial configurations, the processor notifies the controllers (loaded(mode2) = true in Figure 5(a)) so that they update the current modes of their mode-automata to H/VFilter mode2. The whole process ends at t = t1 + p1, when the global configuration becomes 2. Note that pj denotes the duration of coordination process number j, including the time required to load the partial bitstreams if the reconfiguration has been authorized.

At t = t2, the available battery is less than (75% . 75%) . FB . V2/V1. Controllers 3 and 4 send reconfiguration requests to the coordinator for VFilter mode3. The coordinator then sends reconfiguration suggestions (moving to HFilter mode3) to controllers 1 and 2. These controllers accept the suggestions according to Figure 5(b). The coordinator then authorizes the reconfiguration for all the controllers. The whole process ends at t = t2 + p2, when the global configuration becomes 3.
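The coordination step described above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the authors' VHDL implementation: the names `GC_TABLE`, `coordinate` and `accepts` are hypothetical, regions are indexed from 0, and each global configuration is reduced to the mode number of each region as in Table I. The `accepts` callback stands in for the controllers' battery-dependent answer to a suggestion (Figure 5(b)).

```python
from typing import Callable, Dict, Optional, Tuple

# Table I reduced to mode numbers: global configuration -> mode of regions 1..4
# (stored at indices 0..3). Hypothetical encoding chosen for this sketch.
GC_TABLE: Dict[int, Tuple[int, ...]] = {
    1: (1, 1, 1, 1),
    2: (2, 2, 2, 2),
    3: (3, 3, 3, 3),
}


def coordinate(
    current: Tuple[int, ...],
    requests: Dict[int, int],
    accepts: Callable[[int, int], bool],
) -> Optional[int]:
    """Return the authorized global configuration number, or None for a refusal.

    current  -- mode currently loaded in each region
    requests -- region index -> requested mode, for the requesting controllers
    accepts  -- stand-in for a controller's answer to a suggestion (region, mode)
    """
    for gc_number, modes in GC_TABLE.items():
        # Keep only global configurations that satisfy every received request.
        if any(modes[region] != mode for region, mode in requests.items()):
            continue
        # Controllers that did not send a request but would have to change
        # mode receive a suggestion and must all accept it.
        involved = [
            region
            for region in range(len(modes))
            if region not in requests and modes[region] != current[region]
        ]
        if all(accepts(region, modes[region]) for region in involved):
            return gc_number
    return None


# Replaying the coordination processes at t1 and t2 of the scenario:
at_t1 = coordinate((1, 1, 1, 1), {0: 2, 1: 2, 2: 2, 3: 2}, lambda r, m: True)
at_t2 = coordinate((2, 2, 2, 2), {2: 3, 3: 3}, lambda r, m: True)
print(at_t1, at_t2)
```

In this sketch, the process at t1 involves no suggestions because every controller sent a request, while the process at t2 relies on controllers 1 and 2 accepting the suggested mode. Passing a callback that returns False models a refusal by a suggested controller, in which case the function returns None.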
At t = t3, the battery is flat, so it moves to the charging mode. Since the regions implementing the horizontal filter consume less than the other regions, controllers 1 and 2 are the first to reach a battery threshold allowing their regions to move to mode2, provided that the performance level required by the user is 2. For this reason, we simulate a change of the user performance level to 2 at t = t4. At t = t5, the battery threshold is reached for controllers 1 and 2 (AB > (75% . 75% + 5%) . FB . H2/H1), so they send reconfiguration requests to the coordinator in order to move to HFilter mode2. The coordinator suggests VFilter mode2 to controllers 3 and 4. Since the available battery does not allow a move to VFilter mode2 (AB < (75% . 75% + 5%) . FB . V2/V1), controllers 3 and 4 send a refusal to the coordinator. The coordination process ends at t = t5 + p3 with a reconfiguration refusal sent by the coordinator to controllers 1 and 2. Those controllers will not send further requests to move to mode2, since such requests would be refused again. Instead, they wait until the coordinator sends them a reconfiguration suggestion to move to mode2.

At t = t6, there is enough battery to move to VFilter mode2. Controllers 3 and 4 send reconfiguration requests to the coordinator. The coordinator sends suggestions to controllers 1 and 2 to move to HFilter mode2. These controllers accept, and the whole process ends at t = t6 + p4, when the global configuration becomes 2.

E. Resource and power overheads

After simulating both control models, we synthesized them for a Virtex-6 (xc6vlx240t) in order to estimate their overheads in terms of hardware resources. Figure 7 shows that the overhead of the distributed controllers is linear in the number of reconfigurable regions because, as we explained in Section IV-C, the controllers are reused to move from one parallelism degree to another. The overhead of the controllers is up to 0.57% of slice registers and 1.36% of slice LUTs, which is acceptable compared to the overhead of reconfigurable regions as presented in works such as [13] and [14].

The coordinator's overhead is also linear in the number of regions. The main reason is that the implemented coordinator uses point-to-point communication, which allows it to handle many requests and responses from and to the controllers at the same time, so the required resources increase with the number of distributed controllers. Other communication types would decrease the overhead of the coordinator at the cost of longer coordination processes. However, the communication architecture (a bus, a NoC, etc.) would then add its own overhead. Up to 10 controlled regions, the overhead of the implemented version of the coordinator is acceptable (0.035% of slice registers and 0.29% of slice LUTs).

Table III compares the overhead of the semi-distributed and the centralized models. The semi-distributed control model has an overhead that is almost twice the centralized model overhead for different numbers of regions. This difference is mainly due to the resources required for the coordination between controllers, as well as to the higher modularity of the semi-distributed control.
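The comparison can be checked directly from the synthesis figures of Table III. The short Python sketch below is not part of the original evaluation; the names `SEMI`, `CENTRAL` and `overhead_ratios` are introduced here for illustration only. It computes, for each number of regions, the ratio of the semi-distributed overhead to the centralized overhead.

```python
# Absolute resource counts taken from Table III (2, 4, 6, 8 and 10 regions).
REGIONS = [2, 4, 6, 8, 10]
SEMI = {
    "slice registers": [375, 744, 1112, 1479, 1847],
    "slice LUTs": [474, 907, 1388, 2010, 2499],
}
CENTRAL = {
    "slice registers": [290, 434, 576, 720, 862],
    "slice LUTs": [286, 545, 736, 964, 1207],
}


def overhead_ratios(resource: str) -> list:
    """Semi-distributed / centralized overhead for each number of regions."""
    return [s / c for s, c in zip(SEMI[resource], CENTRAL[resource])]


for resource in SEMI:
    ratios = overhead_ratios(resource)
    for regions, ratio in zip(REGIONS, ratios):
        print(f"{resource:16s} {regions:2d} regions: x{ratio:.2f}")
```

With these figures the ratio ranges from roughly 1.3 (slice registers, 2 regions) to roughly 2.1 (slice registers, 10 regions), so "almost twice" describes the larger configurations best.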
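[Figure 6: chart of the available battery energy (y axis, from 0%.FB to 100%.FB, marked with the thresholds used by the controllers) against simulation time (x axis, with the instants t1, t1 + p1, t2, t2 + p2, t3, t4, t5, t5 + p3, t6 and t6 + p4). The required performance level is 1 until t4 and 2 from t4 onward. Chart points are labeled with the current global configuration number.]
Fig. 6: Simulation scenario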
Number of            Semi-distributed control model    Centralized control model
controlled regions   Slice registers   Slice LUTs      Slice registers   Slice LUTs
2                    375 (0.12%)       474 (0.31%)     290 (0.1%)        286 (0.19%)
4                    744 (0.25%)       907 (0.6%)      434 (0.14%)       545 (0.36%)
6                    1112 (0.37%)      1388 (0.92%)    576 (0.19%)       736 (0.49%)
8                    1479 (0.49%)      2010 (1.33%)    720 (0.24%)       964 (0.64%)
10                   1847 (0.61%)      2499 (1.66%)    862 (0.29%)       1207 (0.8%)
TABLE III: Synthesis details of the semi-distributed and centralized control models
showed that our decentralized control model is more flexible, reusable and scalable than the centralized one, at the cost of a slight increase in required hardware resources. As future work, we plan to integrate our control model into a complete reconfigurable system in order to further evaluate its efficiency. Our control model can also be used in a Model-Driven Engineering based SoC design flow in order to generate its code automatically, taking advantage of the high level of abstraction offered by the control formalism.
Fig. 7: Resource overhead variation of the semi-distributed model with the number of controlled regions
REFERENCES
[1] P. Lysaght, B. Blodget, J. Mason, J. Young, and B. Bridgford, "Invited paper: Enhanced architectures, design methodologies and CAD tools for dynamic reconfiguration of Xilinx FPGAs," in FPL, 2006, pp. 1-6.
[2] S. Toscher, T. Reinemann, and R. Kasper, "An adaptive FPGA-based mechatronic control system supporting partial reconfiguration of controller functionalities," in Proceedings of the first NASA/ESA conference on Adaptive Hardware and Systems, ser. AHS 06, Washington, DC, USA, 2006, pp. 225-228.
[3] C. Schuck, B. Haetzer, and J. Becker, "An interface for a decentralized 2D reconfiguration on Xilinx Virtex-FPGAs for organic computing," Int. J. Reconfig. Comput., vol. 2009, pp. 7:3-7:3, January 2009.
[4] C. Gamrat, J.-M. Philippe, C. Jesshope, A. Shafarenko, L. Bisdounis, U. Bondi, A. Ferrante, J. Cabestany, M. Hubner, J. Parsinnen, J. Kadlec, M. Danek, B. Tain, S. Eisenbach, M. Auguin, J.-P. Diguet, E. Lenormand, and J.-L. Roux, "Aether: Self-adaptive networked entities: Autonomous computing elements for future pervasive applications and technologies," in Reconfigurable Computing: From FPGAs to Hardware/Software Codesign, vol. Chapter 7. Springer, 2011, pp. 149-184.
[5] J.-M. Philippe, B. Tain, and C. Gamrat, "A self-reconfigurable FPGA-based platform for prototyping future pervasive systems," in Evolvable Systems: From Biology to Hardware, ser. Lecture Notes in Computer Science. Springer Berlin / Heidelberg, 2010, vol. 6274, pp. 262-273.
[6] A. Akoglu, A. Sreeramareddy, and J. G. Josiah, "FPGA based distributed self healing architecture for reusable systems," Cluster Computing, vol. 12, pp. 269-284, September 2009.
[7] X. Y. Niu, K. H. Tsoi, and W. Luk, "Reconfiguring distributed applications in FPGA accelerated cluster with wireless networking," in International Conference on Field Programmable Logic and Applications (FPL), 2011, pp. 545-550.
[8] F. Maraninchi and Y. Remond, "Mode-automata: a new domain-specific construct for the development of safe critical systems," Science of Computer Programming, vol. 46, 2003.
[9] D. Harel, "Statecharts: A visual formalism for complex systems," 1987.
[10] E. Borde, G. Haik, and L. Pautet, "Mode-based reconfiguration of critical software component architectures," in Design, Automation Test in Europe Conference Exhibition, ser. DATE 09, 2009, pp. 1160-1165.
[11] S. Guillet, F. de Lamotte, E. Rutten, G. Gogniat, and J.-P. Diguet, "Modeling and formal control of partial dynamic reconfiguration," Reconfigurable Computing and FPGAs, International Conference on, vol. 0, pp. 31-36, 2010.
[12] G. Delaval, H. Marchand, and E. Rutten, "Contracts for modular discrete controller synthesis," SIGPLAN Not., vol. 45, pp. 57-66, 2010.
[13] M. Rummele-Werner, T. Perschke, L. Brauna, M. Hubner, and J. Becker, "A FPGA based fast runtime reconfigurable real-time multi-object-tracker," in IEEE International Symposium on Circuits and Systems (ISCAS), 2011.
Fig. 8: Power overhead of both semi-distributed and centralized control models
However, the overhead of the semi-distributed model (up to 2499 slice LUTs) is quite acceptable for large FPGAs such as the Virtex-6 used here and the even larger Virtex-7 [15]. As Figure 8 shows, the power overhead of the semi-distributed model, at a frequency of 100 MHz, is also almost twice the centralized model overhead for different numbers of regions. Up to 10 regions, this overhead does not exceed 5 mW, which is acceptable compared to the consumption of the reconfigurable regions.

V. CONCLUSION

In this paper, we proposed a semi-distributed control model that aims to decrease the complexity of the control design and to enhance its reusability and scalability. This control model is well adapted to both single-FPGA and multi-FPGA systems. It is composed of distributed controllers, each controlling the runtime adaptivity of one region of the system, and a coordinator for the controllers' reconfiguration decisions. The semi-distributed decision-making model is based on the mode-automata formalism, which decreases its design complexity and facilitates its reuse. Implementation on FPGA
[14] J. Huang, M. Parris, J. Lee, and R. F. DeMara, "Scalable FPGA-based architecture for DCT computation using dynamic partial reconfiguration," ACM Transactions on Embedded Computing Systems, vol. V, pp. 1-18, 2008.
[15] Xilinx, "7 series FPGAs overview, advance product specification," Tech. Rep. DS180 (v1.8), 2011.
