ECE 530 High Performance VLSI Homework 1
Sorting Algorithm
A)
The following five algorithms were developed in this homework.
3. Insert Sort:- Insertion sort places one element at a time into its correct position, much like sorting by hand.
It works well for small sets of elements, but on large inputs it is slower than heap sort, shell sort, quick sort, and merge sort.
5. Heap Sort:- Heap sort starts by building a heap from the given elements, then repeatedly removes the largest
element from the heap and places it at the end of the partially sorted array.
Requires two arrays, one to hold the heap and the other to hold the sorted elements.
Extract the largest element from the heap.
Place it in the next open position from the end of the partially sorted array.
Repeat until there is nothing left in the heap and the array is full.
6. Shell Sort:- Shell sort is based on an increment sequence. The increment size is reduced after each pass until
it reaches 1. With an increment size of 1 the sort is a basic insertion sort, but by this time the data is
almost sorted, which is close to insertion sort's best case.
7. Selection sort:- Selection sort first finds the smallest element in the array and exchanges it with the element in the
first position, then finds the second smallest element and exchanges it with the element in the second position,
and continues in this way until the entire array is sorted.
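The four sorts described above can be sketched in C as follows. This is a minimal illustration, not the code submitted for the homework. The function names are hypothetical, the shell sort uses a simple halving increment sequence as an assumption, and the heap sort uses the common in-place variant rather than the two-array version described above.

```c
#include <stddef.h>

/* Insertion sort: grow a sorted prefix by inserting one element at a time. */
void insertion_sort(int a[], size_t n)
{
    for (size_t i = 1; i < n; i++) {
        int key = a[i];
        size_t j = i;
        while (j > 0 && a[j - 1] > key) {   /* shift larger elements right */
            a[j] = a[j - 1];
            j--;
        }
        a[j] = key;
    }
}

/* Shell sort: insertion sort over elements `gap` apart; the gap is halved
 * after each pass (assumed sequence) until the final pass runs with gap 1. */
void shell_sort(int a[], size_t n)
{
    for (size_t gap = n / 2; gap > 0; gap /= 2) {
        for (size_t i = gap; i < n; i++) {
            int key = a[i];
            size_t j = i;
            while (j >= gap && a[j - gap] > key) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = key;
        }
    }
}

/* Selection sort: swap the smallest remaining element into position i. */
void selection_sort(int a[], size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        size_t min = i;
        for (size_t j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
    }
}

/* Restore the max-heap property for the subtree rooted at `root`,
 * considering only the first n elements of a. */
static void sift_down(int a[], size_t root, size_t n)
{
    for (;;) {
        size_t child = 2 * root + 1;
        if (child >= n)
            return;
        if (child + 1 < n && a[child + 1] > a[child])
            child++;                        /* pick the larger child */
        if (a[root] >= a[child])
            return;
        int tmp = a[root]; a[root] = a[child]; a[child] = tmp;
        root = child;
    }
}

/* Heap sort (in-place variant): build a max-heap, then repeatedly move the
 * largest element to the end of the shrinking heap region. */
void heap_sort(int a[], size_t n)
{
    for (size_t i = n / 2; i-- > 0; )
        sift_down(a, i, n);
    for (size_t end = n; end > 1; end--) {
        int tmp = a[0]; a[0] = a[end - 1]; a[end - 1] = tmp;
        sift_down(a, 0, end - 1);
    }
}
```

Each function sorts the array in ascending order in place, so any of them can be called as, for example, `shell_sort(data, count)`.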
Task 1) Use a software transformation technique (for example, loop unrolling) to modify ‘slack.c’ with the default cache
configuration.
Task 2) Use a software transformation technique (for example, loop unrolling) to modify ‘slack.c’ with your own cache
configuration; the ‘my_config’ file must be modified for the cache settings. Find the optimal cache
parameters for the level-1 and level-2 caches in terms of power.
The default values of the level-1 instruction cache, level-1 data cache and level-2 cache are as follows.
A)
Task 1)
Without changing the default cache configuration given above, we modify the given slack.c file
using a software transformation technique known as loop unrolling and measure the power consumption
for this case. Figure 8 shows the modifications made to the ‘slack.c’ file in order to get minimum power.
Procedure to be followed:-
1. Study the two reference files given to us, ‘roll.c’ and ‘unroll.c’.
2. The slack code given to us is in rolled form, that is, its loops are not unrolled.
3. We need to convert the rolled form to the unrolled form.
4. Looking at the slack code given to us, we can see that it contains many functions.
5. We need to apply unrolling to only one of these functions. If we apply unrolling to almost all of the
functions, the power will increase, which is not what we want.
6. We must also make sure that the ‘output.txt’ file shows the same results as it did before
unrolling.
7. Here I have applied the unrolling technique to the last loop in the code, because only this change gave
the same results in the ‘output.txt’ file as before.
Fig 8. Changes made in the slack.c file
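As a generic illustration of the transformation (not the actual change made to ‘slack.c’, which is the subject of Fig 8), a rolled loop and its unrolled equivalent might look like the sketch below. The function names and the unroll factor of 4 are assumptions chosen for the example.

```c
#include <stddef.h>

/* Rolled form: one element per iteration, one loop test per element. */
long sum_rolled(const int *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by 4 (assumed factor): four elements per iteration, so the loop
 * counter update and branch execute about a quarter as often. */
long sum_unrolled4(const int *v, size_t n)
{
    long s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s += v[i];
        s += v[i + 1];
        s += v[i + 2];
        s += v[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s += v[i];
    return s;
}
```

Both functions must return the same value for any input, which corresponds to the check in step 6 that ‘output.txt’ is unchanged. The unrolled body is larger, which is one plausible reason why unrolling many functions at once can raise instruction-cache activity and power.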
Here we change the default cache configuration discussed earlier to the combination
that gives the minimum power. This normally requires trial and error, but
the lab 1 tutorial can also be used as a guide. I found that setting all
4 caches to a cache size of 4K gives the minimum power, which is what we want.
Procedure to be followed:-
B) Output results
Fig 11 shows that keeping the default cache configuration values gives an average power of 14.6365.
D) We can see that setting D1, L1, L2 and D2 to 128:32:4 gives the least power.
The power is 10.847 for slack without unrolling and 10.842 for slack with unrolling.
The cache configuration values are independent of any software transformation applied.
The minimum power is obtained when the modified cache configuration and the software
transformation are applied together.
Conclusion:- Five sorting algorithms were successfully implemented and the power
required by each was analysed, giving an understanding of why they consume this much power. The
concept of cache configuration was also understood.