Transaction level (TL) modeling is regarded today as the next step in design entry for complex integrated circuits and systems. This means that, as the definition of this modeling level evolves, automated synthesis tools will increasingly support it, allowing design capture to start at a higher abstraction level than today. This work compares traditional register transfer level (RTL) modeling and transaction level modeling through the implementation of a simple processor case study. SystemC is a language that naturally supports hardware transaction level descriptions. The R8 processor was described in SystemC TL and RTL versions, and these were compared to an equivalent hand-coded VHDL RTL description on some key points, such as simulation efficiency and implementation results. The experiments indicate that TL descriptions present a faster path to system validation and that it is possible to envisage the automation of the design flow from this level of abstractio...
6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC), 2011
MPSoCs are widely used in embedded systems, allowing the design of complex systems within a short time-to-market. The shift in the communication infrastructure, from buses to networks-on-chip (NoCs), adds new design challenges. Standard directory-based cache coherence protocols represent a performance bottleneck due to the number of transactions in the network, reducing performance and increasing energy consumption. State-of-the-art works investigate new protocols at abstract levels (e.g. TLM) to optimize the performance of the memory organization. Unlike previous works, we investigate the benefits NoCs can bring to directory-based cache coherence protocols using RTL modeling. The main functionality NoCs may provide for the protocols is the way messages are sent through the network. Most NoCs support multicast as a set of unicast messages. Such a method is not suitable for cache coherence protocols, because transactions such as block invalidate and block update are naturally multicast. This work proposes the use of multicast messages to reduce the number of transactions and improve the performance of cache coherence protocols in NoC-based MPSoCs. Results show that the performance of some transactions is improved by up to 32% when using multicast messages.
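The multicast argument in this abstract can be illustrated with a small cost model. The sketch below is not the paper's protocol or router design: it assumes a 2D mesh with XY routing and a hypothetical tree-style multicast that reuses shared links, and it only counts injected packets and link traversals for a block invalidate sent from a directory node to a set of sharers. All function names and coordinates are illustrative.

```python
def xy_path(src, dst):
    """Return the directed links crossed by an XY-routed packet (X first, then Y)."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def unicast_cost(src, sharers):
    """Multicast emulated as one unicast packet per sharer."""
    packets = len(sharers)
    link_traversals = sum(len(xy_path(src, s)) for s in sharers)
    return packets, link_traversals


def multicast_cost(src, sharers):
    """Assumed tree multicast: one injected packet, each shared link crossed once."""
    links = set()
    for s in sharers:
        links.update(xy_path(src, s))
    return 1, len(links)


directory = (0, 0)                     # hypothetical directory position
sharers = [(2, 0), (2, 1), (2, 2)]     # hypothetical caches holding the block
print(unicast_cost(directory, sharers))    # (3, 9)
print(multicast_cost(directory, sharers))  # (1, 4)
```

In this toy case the invalidate needs three packets and nine link traversals when emulated with unicasts, against one packet and four link traversals with the assumed multicast. The numbers only show the direction of the saving; they say nothing about the 32% figure reported in the abstract.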
Proceedings of the 24th symposium on Integrated circuits and systems design - SBCCI '11, 2011
As the number of cores and functionalities integrated in embedded devices increases, the amount of memory used on these devices also increases, justifying the development of memory architectures presenting scalability, low energy consumption and low latency. To implement memory solutions, most works adopting NoC-based MPSoCs only employ basic communication services, such as send/receive, without exploring the services NoCs can offer, for instance connection, priorities and multicast communication. Multicast can be used to optimize the cache coherence protocol, reducing both traffic and energy consumption. The goal of this work is to optimize a directory-based cache coherence protocol by exploiting specific NoC services, such as multicast and priorities. To demonstrate our proposal, an MPSoC described at the RTL level is used, enabling accurate performance and energy evaluation. Results show a reduction of 17% in the number of clock cycles and a reduction of up to 86% (average reduction: 39%) in energy consumption for some memory transactions.
2012 VIII Southern Conference on Programmable Logic, 2012
The design of a Multiprocessor System-on-Chip (MPSoC) is a complex task, including steps such as application development, platform configuration, code generation, task mapping onto the platform and debugging. An integrated environment covering most of these steps is a gap in the literature. The present work first details an MPSoC architecture that supports the execution of distributed applications, including an operating system enabling multitask execution at each processing element. The MPSoC is heterogeneous, because it supports different processor architectures. Then, a framework able to cover the design steps previously mentioned is presented. The framework enables design space exploration for applications to be executed in the MPSoC, varying, for example, the number and type of processors, the memory size and the task mapping. Results demonstrate correct operation for different MPSoC configurations generated from the proposed framework. This open-source framework enables the research community to investigate new subjects related to MPSoC and Network on Chip (NoC) design, as well as to evaluate distributed applications in a multiprocessor environment.
2014 IEEE International Symposium on Circuits and Systems (ISCAS), 2014
Software development has become an important issue in today's MPSoC design. Due to their inherently non-deterministic behavior, MPSoCs are prone to concurrency bugs. Debugging tools for MPSoCs may be grouped into the following classes: simulators, parallel software development environments, and NoC debuggers. An important gap is observed concerning a complete NoC-based MPSoC: tools to inspect the traffic exchanged between processing elements (PEs) at a higher abstraction level, and not simply as raw data. The goal of the paper is to propose a new class of debugging tools, able to trace the messages exchanged between PEs, enabling debugging at the protocol level. Examples of protocols include communication between tasks, mapping heuristics and monitoring schemes for QoS, among others. The paper presents the proposed debug framework, as well as a task migration protocol as a case study.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2014
With the significant increase in the number of processing elements in NoC-based MPSoCs, communication increasingly becomes a critical resource for performance gains and quality-of-service (QoS) guarantees. The main gap observed in the NoC-based MPSoC literature is the lack of runtime adaptive techniques to meet QoS. In the absence of such techniques, the system user must statically define, for example, the scheduling policy, communication priorities, and the communication switching mode of applications. The goal of this paper is to investigate the runtime adaptation of the NoC resources, according to the QoS requirements of each application running in the MPSoC. This paper adopts an NoC architecture with duplicated physical channels, adaptive routing, support for flow priorities and simultaneous packet and circuit switching. The monitoring and adaptation management is performed at the operating system level, ensuring QoS for the monitored applications. The QoS adaptation acts on the flow priority and the switching mode. Monitoring and QoS adaptation were implemented in software, resulting in flexibility to apply the techniques to other platforms or include other adaptive techniques, such as task migration or DVFS. Applications with latency and throughput deadlines run concurrently with best-effort applications. Results with synthetic and real applications show an average reduction of 60% in latency violations, ensuring smaller jitter and throughput. The execution time of applications is not penalized by applying the proposed QoS adaptation methods.
2012 IEEE International Symposium on Circuits and Systems, 2012
Task migration is a well-known strategy adopted in distributed systems for load balancing, but the adoption of such a strategy in NoC-based MPSoCs is scarce in the literature. This paper proposes a complete task migration protocol for NoC-based MPSoCs. The migration transfers the task code, data and context to another PE. The paper presents the communication strategy to ensure coherence in message delivery, the heuristic to compute the new task location, and the procedure to inform the new task position. Results evaluate the cost of task migration using a real MPSoC (described in synthesizable VHDL), demonstrating that the cost to migrate a given task has a small impact on system performance, enabling its use to improve the overall system performance.
18th IEEE/IFIP International Workshop on Rapid System Prototyping (RSP '07), 2007
Networks-on-chip, or NoCs, are one candidate communication architecture for present and future SoCs, due to their scalability, reusability and performance. The focus of this paper is the analysis of IP communication models in NoCs. Employing ...
2011 IEEE International Symposium of Circuits and Systems (ISCAS), 2011
This paper proposes a novel strategy for optimizing resources in Multi-Processor Systems-on-Chip (MPSoC). The approach is based on using a control-loop feedback mechanism to maximize the efficiency in exploiting available resources such as CPU time, operating frequency, etc. Each Processing Element (PE) in the architecture is equipped with a frequency scaling module responsible for tuning the frequency of processors at run-time according to the application requirements. Results show the system's capability of adapting to disturbing conditions. For validation purposes we have implemented a multi-threaded MJPEG decoder together with an ADPCM audio decoder and a FIR filter.
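The control-loop idea in this abstract can be sketched as a simple feedback controller. The example below is an assumption-laden illustration, not the controller used in the paper: it uses a purely integral update, a hypothetical linear model in which throughput is proportional to frequency, and made-up frequency limits, gain and target values.

```python
F_MIN_MHZ, F_MAX_MHZ = 10.0, 100.0   # hypothetical limits of the scaling module


def next_frequency(freq_mhz, measured_fps, target_fps, gain=2.0):
    """One control step: move the frequency in proportion to the throughput error."""
    error = target_fps - measured_fps
    return min(F_MAX_MHZ, max(F_MIN_MHZ, freq_mhz + gain * error))


def simulate(freq_mhz, fps_per_mhz, target_fps=20.0, steps=30):
    """Run the loop against an assumed plant where fps = frequency * fps_per_mhz."""
    for _ in range(steps):
        measured_fps = freq_mhz * fps_per_mhz
        freq_mhz = next_frequency(freq_mhz, measured_fps, target_fps)
    return freq_mhz


# Nominal workload: 0.25 frames per second per MHz, so 80 MHz meets the 20 fps target.
settled = simulate(20.0, fps_per_mhz=0.25)

# Disturbance: the workload gets heavier (0.20 fps per MHz), so more frequency is needed.
after_disturbance = simulate(settled, fps_per_mhz=0.20)

print(round(settled, 3), round(after_disturbance, 3))
```

Under these assumptions the loop settles near 80 MHz for the nominal workload and moves toward 100 MHz once the disturbance is applied, which is the kind of adaptation to disturbing conditions the abstract describes. A real design would choose the gain and the measured quantity from the actual application and hardware.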
2009 IEEE International Symposium on Circuits and Systems, 2009
Multi-Processor Systems-on-Chip (MPSoCs) are increasingly popular in embedded systems. Due to the complexity of such systems and the huge design space to explore, CAD tools and frameworks to customize MPSoCs are mandatory. Some academic and industrial frameworks are available to support bus-based MPSoCs, but few works target NoCs as the underlying communication architecture. A framework targeting MPSoC customization must provide abstract models to enable fast design space exploration and flexible application mapping strategies, all coupled to features to evaluate the performance of running applications. This paper proposes a framework to customize NoC-based MPSoCs with support for static and dynamic task mapping and C/SystemC simulation models for processors and memories. A simple, specifically designed microkernel executes in each processor, enabling multitasking at the processor level. Graphical tools enable debug and system verification, individualizing data for each task. Practical results highlight the benefit of using dynamic mapping strategies (total execution time reduction) and abstract models (total simulation time reduction without losing accuracy).
Proceedings of the twenty-first annual symposium on Integrated circuits and system design - SBCCI '08, 2008
High-speed networks used to interconnect computers advance at an extraordinary pace, driven by the evolution of several contributing technologies. Due to the ever-increasing complexity of designing parts and equipment for these networks, design complexity management makes scalability and reusability more important issues than performance, in most cases. This paper describes MOTIM, a scalable and reusable architecture enabling the implementation of Ethernet switches with low latency and high throughput. The architecture is built around a network-on-chip-based switch fabric, which guarantees scalability. The architecture has been validated by functional simulation and prototyped in FPGAs. The experimental results show that even under severe traffic conditions the architecture achieves packet transmission with low latencies.
2009 17th IFIP International Conference on Very Large Scale Integration (VLSI-SoC), 2009
The use of NoCs in complex MPSoCs is a reality in academic research and industrial designs. A lot of research effort has been devoted in recent years to NoC and MPSoC designs, but few works address the gap between the NoC infrastructure and the MPSoC software applications. An important issue in MPSoC design is QoS, since applications running in such systems may have tight timing constraints, such as video processing or fast communication protocols. This work bridges the hardware/software gap, exploring the integration of low-level NoC services into an application programming interface (API). Such an API hides the interconnection complexity from the programmer and provides efficient design space exploration to meet the QoS application requirements. Results show that, even with the huge available bandwidth offered by NoCs, such an interconnection architecture is not capable of meeting QoS constraints when flows compete for common resources inside the NoC. Using the priority scheme developed in this work, applications executing in the MPSoC achieve the performance requirements. This work highlights the need to integrate NoC and MPSoC design efforts in a unified framework.
Proceedings of the 20th annual conference on Integrated circuits and systems design - SBCCI '07, 2007
A considerable number of NoC designs are available, focusing on different aspects of this type of communication infrastructure. Examples of relevant aspects considered during NoC design are quality-of-service achievement, the choice of synchronization method to employ between routers, power consumption reduction and the mapping of application modules. However, some design choices are common to many if not most NoC proposals: wormhole packet switching and the use of virtual channels. This work discusses trade-offs between circuit and packet switching, arguing in favor of the former with fixed packet size. Next, it proposes and justifies the replacement of virtual channels by replicated channels, based on the abundance of wires expected in current and future deep sub-micron technologies. Finally, the work proposes the use of a session layer coupled to circuit switching. Results point to reduced latency and router area, leading to a router architecture adapted for high-performance NoCs.
Several NoC routing scheme proposals targeting overall performance optimization are available in the literature. However, such proposals do not differentiate the application flows. The goal here is to demonstrate that adaptive routing algorithms can be used in flows with temporal constraints, enabling an enhanced degree of path exploration. The main contribution of this work is to expose the routing algorithm at the IP level. Results show gains in latency, throughput and jitter for hotspot scenarios, with minimal area overhead.
Recent works propose Networks on Chip (NoC) as the communication architecture that will be able to provide scalability and performance for communication in future SoCs. Even if NoC performance easily exceeds the performance of buses, NoCs have throughput limited to a fraction of the nominal network capacity. This limitation comes from phenomena like packet collision during routing and buffer space scarcity. One method to increase NoC throughput is to employ virtual channels. Virtual channels are a time ...
For almost a decade now, Network on Chip (NoC) concepts have evolved to provide an interesting alternative to more traditional intrachip communication architectures (e.g. shared buses) for the design of complex Systems on Chip (SoCs). A considerable number of NoC proposals are available, focusing on different sets of optimization aspects, related to specific classes of applications. Each such application employs a NoC as part of its underlying implementation infrastructure. Many of the mentioned optimization aspects target results such as Quality of Service (QoS) achievement and/or power consumption reduction. On the other hand, the use of NoCs requires solving new design problems, such as the choice of synchronization method to employ between NoC routers and the mapping of application modules. Although the availability of NoC structures is already rather ample, some design choices are at the base of many, if not most, NoC proposals. These include the use of wormhole packet switching and virtual channels. This work argues against this practice. It discusses trade-offs of using circuit or packet switching, arguing in favor of the former with fixed-size packets (cells). Quantitative data supports the argument. Also, the work proposes and justifies replacing virtual channels by replicated channels, based on the abundance of wires in current and expected deep sub-micron technologies. Finally, the work proposes a transmission method coupling the use of session layer structures to circuit switching to better support application implementation. The main reported result is the availability of a router with reduced latency and area, a communication architecture adapted for high-performance applications.
Transaction level (TL) modeling is regarded today as the next step in the direction of complex in... more Transaction level (TL) modeling is regarded today as the next step in the direction of complex integrated circuits and systems design entry. This means that as this modeling level definition evolves, automated synthesis tools will increasingly support it, allowing design capture to start at a higher abstraction level than today. This work presents a comparison of traditional register transfer level (RTL) modeling and transaction level modeling through the implementation of a simple processor case study. SystemC is a language that naturally supports hardware transaction level descriptions. The R8 processor was described in SystemC TL and RTL versions and these were compared to an equivalent hand-coded VHDL RTL description in some key points, such as simulation efficiency and implementation results. The experiments indicate that TL descriptions present a faster path to system validation and that it is possible to envisage the automation of the design flow from this level of abstractio...
6th International Workshop on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC), 2011
MPSoCs are largely used in embedded systems, allowing the design of complex systems within short ... more MPSoCs are largely used in embedded systems, allowing the design of complex systems within short time-tomarket. The shift in the communication infrastructure, from buses to networks-on-chip (NoCs), adds new design challenges. Standard directory-based cache coherence protocols represent a performance bottleneck due to number of transactions in the network, reducing performance and increasing the energy consumption. State-of-the-art works investigate new protocols, at abstract levels (e.g. TLM), to optimize the performance of the memory organization. Differently from previous works, we investigate the benefits NoCs can bring to directory-based cache coherence protocols using RTL modeling. The main functionality NoCs may provide for the protocols is the way messages are sent through the network. Most NoCs support multicast as a set of unicast messages. Such method is not suitable for cache coherence protocols, because transactions as block invalidate and block update are naturally multicast. This work proposes the use of multicast messages to reduce the number of transactions to improve the performance of cache coherence protocols in NoCbased MPSoCs. Results show that performance of some transactions is improved up to 32% when using multicast messages.
Proceedings of the 24th symposium on Integrated circuits and systems design - SBCCI '11, 2011
As the number of cores and functionalities integrated in embedded devices increases, the amount o... more As the number of cores and functionalities integrated in embedded devices increases, the amount of memory used on these devices also increases, justifying the development of memory architectures presenting scalability, low energy consumption and low latency. To implement memory solutions, most works adopting NoC-based MPSoCs only employ basic communication services, such as send/receive, without exploring the services NoCs can offer, for instance connection, priorities and multicast communication. Multicast can be used to optimize the cache coherence protocol, leading to both traffic and energy consumption reduction. The goal of this work is to optimize a directory-based cache coherence protocol exploiting specific NoC services, as multicast and priorities. To demonstrate our proposal, an MPSoC described at the RTL level is used, enabling accurate performance and energy evaluation. Results show a reduction of 17% in the number of clock cycles and a reduction up to 86% (average reduction: 39%) in energy consumption for some memory transactions.
2012 VIII Southern Conference on Programmable Logic, 2012
The design of a Multiprocessor System-on-Chip (MPSoC) is a complex task, including steps as appli... more The design of a Multiprocessor System-on-Chip (MPSoC) is a complex task, including steps as application development, platform configuration, code generation, task mapping onto the platform and debugging. An integrated environment covering most of these steps is a gap in the literature. The present work first details an MPSoC architecture, which supports the execution of distributed applications, including an operating system enabling multitask execution at each processing element. The MPSoC is heterogeneous, due to the support to different processor architectures. Then, a framework able to cover the design steps previously mentioned is presented. The framework enables the design space exploration for applications to be executed in the MPSoC, varying for example the number and type of processors, the memory size, the task mapping. Results demonstrate the correct operation for different MPSoC configurations, generated from the proposed framework. Such open-source framework enables the research community to investigate new subjects related to MPSoC and Network on Chip (NoC) design, as well as evaluate distributed applications in a multiprocessor environment.
2014 IEEE International Symposium on Circuits and Systems (ISCAS), 2014
ABSTRACT Software development becomes an important issue in today's MPSoC design. Due to ... more ABSTRACT Software development becomes an important issue in today's MPSoC design. Due to the inherent non-deterministic behavior of MPSoCs, they are prone to concurrency bugs. Debugging tools for MPSoC may be grouped in the following classes: simulators, parallel software development environments, NoC debuggers. An important gap is observed concerning a complete NoC-based MPSoC: tools to inspect the traffic exchanged between processing elements in a higher abstraction level, and not simply as raw data. This is the goal of the paper: propose a new class of debugging tools, able to trace the messages exchanged between PEs, enabling debugging at the protocol level. Examples of protocols include communication between tasks, mapping heuristics, monitoring schemes for QoS, among others. The paper presents the proposed debug framework, as well as a task migration protocol as case study.
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2014
ABSTRACT With the significant increase in the number of processing elements in NoC-based MPSoCs, ... more ABSTRACT With the significant increase in the number of processing elements in NoC-based MPSoCs, communication becomes, increasingly, a critical resource for performance gains and quality-of-service (QoS) guarantees. The main gap observed in the NoC-based MPSoCs literature is the runtime adaptive techniques to meet QoS. In the absence of such techniques, the system user must statically define, for example, the scheduling policy, communication priorities, and the communication switching mode of applications. The goal of this paper is to investigate the runtime adaptation of the NoC resources, according to the QoS requirements of each application running in the MPSoC. This paper adopts an NoC architecture with duplicated physical channels, adaptive routing, support to flow priorities and simultaneous packet and circuit switching. The monitoring and adaptation management is performed at the operating system level, ensuring QoS to the monitored applications. The QoS acts in the flow priority and the switching mode. Monitoring and QoS adaptation were implemented in software, resulting in flexibility to apply the techniques to other platforms or include other adaptive techniques, as task migration or DVFS. Applications with latency and throughput deadlines run concurrently with best-effort applications. Results with synthetic and real application reduced in average 60% the latency violations, ensuring smaller jitter and throughput. The execution time of applications is not penalized applying the proposed QoS adaptation methods.
2012 IEEE International Symposium on Circuits and Systems, 2012
ABSTRACT Task migration is a well-known strategy adopted in distributed systems for load balancin... more ABSTRACT Task migration is a well-known strategy adopted in distributed systems for load balancing. but the adoption of such strategy in NoC-based MPSoC is scarce in the literature. This paper proposes a complete task migration protocol for NoC-based MPSoCs. The migration transfers the task code, data and context to another PE. The paper presents the communication strategy to ensure coherence in the messages delivery, the heuristic to compute the new task location, and the procedure to inform the new task position. Results evaluate the cost of the task migration using a real MPSoC (described in synthesizable VHDL), demonstrating that the cost to migrate a given task has a small impact in the system performance, enabling its use to improve the overall system performance.
18th IEEE/IFIP International Workshop on Rapid System Prototyping (RSP '07), 2007
Abstract Networks-on-chip, or NoCs, are one communication architecture candidate to be used in pr... more Abstract Networks-on-chip, or NoCs, are one communication architecture candidate to be used in present and future SoCs, due to its scalability, reusability and performance. The focus of this paper is the analysis of IP communication models in NoCs. Employing ...
2011 IEEE International Symposium of Circuits and Systems (ISCAS), 2011
This paper proposes a novel strategy for optimizing resources in Multi-Processor Systems-on-Chip ... more This paper proposes a novel strategy for optimizing resources in Multi-Processor Systems-on-Chip (MPSoC). The approach is based on using control-loop feedback mechanism to maximize the efficiency on exploiting available resources such as CPU time, operating frequency, etc. Each Processing Element (PE) in the architecture is equipped with a frequency scaling module responsible for tuning the frequency of processors at run-time according to the application requirements. Results show the system's capability of adapting to disturbing conditions. For validation purposes we have implemented a multi-threaded MJPEG decoder together with an ADPCM audio decoder and a FIR.
2009 IEEE International Symposium on Circuits and Systems, 2009
Multi-Processor Systems-on-Chip (MPSoCs) are increasingly popular in embedded systems. Due to the... more Multi-Processor Systems-on-Chip (MPSoCs) are increasingly popular in embedded systems. Due to their complexity and huge design space to explore for such systems, CAD tools and frameworks to customize MPSoCs are mandatory. Some academic and industrial frameworks are available to support bus-based MPSoCs, but few works target NoCs as underlying communication architecture. A framework targeting MPSoC customization must provide abstract models to enable fast design space exploration, flexible application mapping strategies, all coupled to features to evaluate the performance of running applications. This paper proposes a framework to customize NoC-based MPSoCs with support to static and dynamic task mapping and C/SystemC simulation models for processors and memories. A simple, specifically designed microkernel executes in each processor, enabling multitasking at the processor level. Graphical tools enable debug and system verification, individualizing data for each task. Practical results highlight the benefit of using dynamic mapping strategies (total execution time reduction) and abstract models (total simulation time reduction without losing accuracy).
Proceedings of the twenty-first annual symposium on Integrated circuits and system design - SBCCI '08, 2008
High-speed networks used to interconnect computers advance at an extraordinary pace, driven by th... more High-speed networks used to interconnect computers advance at an extraordinary pace, driven by the evolution of several contributing technologies. Due to the ever-increasing complexity of designing parts and equipments for these networks, design complexity management makes scalability and reusability more important issues than performance, in most cases. This paper describes MOTIM, a scalable and reusable architecture enabling the implementation of Ethernet switches with low latency and high throughput. The architecture is built around a network-on-chip-based switch fabric, which guarantees scalability. The architecture has been validated by functional simulation and prototyped in FPGAs. The experimental results show that even under severe traffic conditions the architecture achieves packet transmission with low latencies.
2009 17th IFIP International Conference on Very Large Scale Integration (VLSI-SoC), 2009
The use of NoCs in complex MPSoCs is a reality in academic researches and industrial designs. A l... more The use of NoCs in complex MPSoCs is a reality in academic researches and industrial designs. A lot of research effort has been conducted in the last years in NoC and MPSoC designs, but few works address the gap between the NoC infrastructure and the MPSoC software applications. An important issue in MPSoC design is QoS, since applications running in such systems may have tight timing constraints, as video processing or fast communication protocols. This work bridges the hardware/software gap, exploring the integration of low-level NoC services into an application programming interface (API). Such API hides the interconnection complexity from programmer and provides efficient design space exploration to meet the QoS application requirements. Results shows that, even with the huge available bandwidth offered by NoCs, such interconnection architecture is not capable to meet QoS constraints when flows compete for common resources inside the NoC. Using the priority scheme developed in this work, applications executing in the MPSoC achieve the performance requirements. This work highlights the need to integrate NoC and MPSoC design efforts in a unified framework. (Abstract)
Proceedings of the 20th annual conference on Integrated circuits and systems design - SBCCI '07, 2007
A considerable number of NoC designs are available, focusing on different aspects of this type of communication infrastructure. Examples of relevant aspects considered during NoC design are quality-of-service achievement, the choice of synchronization method to employ between routers, power consumption reduction and application module mapping. However, some design choices are common to many if not most NoC proposals: wormhole packet switching and the use of virtual channels. This work discusses trade-offs of using circuit and packet switching, arguing in favor of the former with fixed packet size. Next, it proposes and justifies the replacement of virtual channels by replicated channels, based on the abundance of wires expected in current and future deep sub-micron technologies. Finally, the work proposes the use of a session layer coupled to circuit switching. Results point to reduced latency and router area, leading to a router architecture adapted for high-performance NoCs.
Several proposals of NoC routing schemes targeting overall performance optimization are available in the literature. However, such proposals do not differentiate the application flows. The goal here is to demonstrate that adaptive routing algorithms can be used in flows with temporal constraints, enabling an enhanced degree of path exploration. The main contribution of this work is to expose the routing algorithm at the IP level. Results show gains in latency, throughput and jitter for hotspot scenarios, with minimal area overhead.
Recent works propose Networks on Chip (NoCs) as the communication architecture that will be able to provide scalability and performance for communication in future SoCs. Even if NoC performance easily exceeds the performance of buses, NoCs have throughput limited to a fraction of the nominal network capacity. This limitation comes from phenomena like packet collision during routing and buffer space scarcity. One method to increase NoC throughput is to employ virtual channels. Virtual channels are a time ...
For almost a decade now, Network on Chip (NoC) concepts have evolved to provide an interesting alternative to more traditional intrachip communication architectures (e.g. shared buses) for the design of complex Systems on Chip (SoCs). A considerable number of NoC proposals are available, focusing on different sets of optimization aspects, related to specific classes of applications. Each such application employs a NoC as part of its underlying implementation infrastructure. Many of the mentioned optimization aspects target results such as Quality of Service (QoS) achievement and/or power consumption reduction. On the other hand, the use of NoCs brings new design problems to solve, such as the choice of synchronization method to employ between NoC routers and application module mapping. Although the availability of NoC structures is already rather ample, some design choices are at the base of many, if not most, NoC proposals. These include the use of wormhole packet switching and virtual channels. This work argues against this practice. It discusses trade-offs of using circuit or packet switching, arguing in favor of the former with fixed-size packets (cells). Quantitative data support the argument. Also, the work proposes and justifies replacing virtual channels by replicated channels, based on the abundance of wires in current and expected deep sub-micron technologies. Finally, the work proposes a transmission method coupling the use of session layer structures to circuit switching to better support application implementation. The main reported result is the availability of a router with reduced latency and area, a communication architecture adapted for high-performance applications.
Transaction level (TL) modeling is regarded today as the next step in the direction of complex integrated circuits and systems design entry. This means that as this modeling level definition evolves, automated synthesis tools will increasingly support it, allowing design capture to start at a higher abstraction level than today. This work presents a comparison of traditional register transfer level (RTL) modeling and transaction level modeling through the implementation of a simple processor case study. SystemC is a language that naturally supports hardware transaction level descriptions. The R8 processor was described in SystemC TL and RTL versions and these were compared to an equivalent hand-coded VHDL RTL description in some key points, such as simulation efficiency and implementation results. The experiments indicate that TL descriptions present a faster path to system validation and that it is possible to envisage the automation of the design flow from this level of abstraction without significant impact on the quality of the final implementation.
Papers by Everton Carara