
Code Analysis: CSE 373 Data Structures and Algorithms

The document summarizes key points from a lecture on data structures and algorithms: 1. It discusses the importance of testing code as it is written and provides strategies for effective testing, such as isolating modules, incrementally building code, and testing edge cases. 2. It introduces code modeling, which mathematically represents how many operations a piece of code will perform based on input size, to analyze runtime efficiency without extensive testing. 3. It provides an example comparing two algorithms for detecting duplicates in a sorted array, modeling their runtime as a function of array size to determine asymptotic complexity.

Uploaded by

Tawsif

Lecture 3: Code Analysis

CSE 373 Data Structures and Algorithms

CSE 373 SU 19 - ROBBIE WEBER 1


Administrivia
Three forms are going out later today:
- Partner form for Project 1 (due Monday night)
- Course background survey (help us optimize for your background/goals)
- Canvas quiz
- Make sure you understand important details of the syllabus
- Get extra credit!

Lecture 2 slides are updated on the webpage.
- We will usually do this silently if bugs were pointed out during lecture.



Testing Wrap Up
Computers don’t make mistakes; people do!
“I’m almost done, I just need to make sure it works”
– Naive 14Xers

Software Test: a separate piece of code that exercises the code you are assessing by providing
input to your code and finishes with an assertion of what the result should be.

1. Isolate - break your code into small modules
2. Build in increments - Make a plan from simplest to most complex cases
3. Test as you go - As your code grows, so should your tests
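As a rough illustration of the definition above, here is a minimal sketch of a software test in plain Java. The class `MaxFinder`, its `max` method, and the `check` helper are hypothetical names invented for this example; a real project would more likely use a testing framework. The test cases run from the simplest input to more complex ones.

```java
// Hypothetical example: a small method under test and a test that exercises it.
class MaxFinder {
    // Code being assessed: returns the largest value in a non-empty array.
    static int max(int[] values) {
        int best = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] > best) {
                best = values[i];
            }
        }
        return best;
    }
}

class MaxFinderTest {
    // Small helper so the checks do not depend on the JVM's -ea flag.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Each check provides input and asserts what the result should be.
    static void runAll() {
        check(MaxFinder.max(new int[] {7}) == 7, "single element");
        check(MaxFinder.max(new int[] {1, 2, 3}) == 3, "max at the end");
        check(MaxFinder.max(new int[] {-5, -2, -9}) == -2, "all negative");
        check(MaxFinder.max(new int[] {4, 4, 4}) == 4, "all equal");
    }

    public static void main(String[] args) {
        runAll();
        System.out.println("all tests passed");
    }
}
```

Keeping `max` in its own small module is what makes it possible to test it in isolation like this.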



Testing Strategies
You can’t test everything
- Break inputs into categories
- What are the most important pieces of code?

Test behavior in combination
- Call multiple methods one after the other
- Call the same method multiple times

Trust no one!
- How can the user mess up?

If you messed up, someone else might
- Test the complex logic



Algorithm Analysis



Code Analysis
How do we compare two pieces of code?
Mathematically:
- Time needed to run
- Memory used
- Number of network calls
- Amount of data saved to disk
The tools we use can be applied to any aspect of your code you can measure.
With Words:
- Specialized vs generalized
- Code reusability
- Security


Comparing Algorithms with Mathematical Models
Consider overall trends as input/data set gets bigger
- Computers are fast – anything you do on your 5 item dataset is going to finish in the blink of an eye.
- Large inputs differentiate.

Identify trends without investing in testing
- Estimate how big of a dataset you can handle

asymptotic analysis – the process of mathematically representing runtime of an algorithm as a function of the number/size of inputs as the input grows (arbitrarily large)



Code Modeling



Disclaimer
This topic has lots of details/subtle relationships between concepts.
I’m going to try to introduce things one at a time (all at once can be overwhelming).

“We’ll see that later” might be the answer to a lot of questions.



Code Modeling
code modeling – the process of mathematically representing how many operations a piece of code will run in relation to the number of inputs
What counts as an “operation”? Assume all basic operations run in equivalent time.

Basic operations
- Adding ints or doubles
- Variable assignment
- Variable update
- Return statement
- Accessing array index or object field

Consecutive statements
- Sum time of each statement

Function calls
- Count runtime of function body
- Remember that new calls a function!

Conditionals
- Time of test + appropriate branch
- We’ll talk about which branch to analyze when we get to cases.

Loops
- Number of iterations of loop body x runtime of loop body
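The rules above can be tried on a very small method. The sketch below is an assumed example (the class `SumModel` and its methods are not from the lecture), and the per-line counts are one reasonable way of applying the rules, not the only one.

```java
// Hypothetical example: applying the counting rules to a simple loop.
class SumModel {
    // Method being modeled; comments give counts under the rules above.
    static int sum(int[] array) {
        int total = 0;                            // +1 (assignment)
        for (int i = 0; i < array.length; i++) {  // +1 for i = 0, then +2 per iteration (test, update)
            total += array[i];                    // +2 per iteration (array access, update)
        }
        return total;                             // +1 (return)
    }

    // Resulting model: 3 operations outside the loop plus 4 per iteration,
    // so f(n) = 4n + 3 for an array of length n.
    static long model(int n) {
        return 4L * n + 3;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
        System.out.println(model(3));
    }
}
```

Counting the loop test or the array access differently would change the constants 4 and 3, but not the fact that the model grows linearly in n.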



Modeling Case Study
Goal: return ‘true’ if a sorted array of ints contains duplicates
Solution 1: compare each pair of elements
public boolean hasDuplicate1(int[] array) {
    boolean found = false;
    int failedChecks = 0;
    for (int i = 0; i < array.length; i++) {
        for (int j = 0; j < array.length; j++) {
            if (i != j && array[i] == array[j])
                found = true;
            else
                failedChecks++;
        }
    }
    return found;
}

Solution 2: compare each consecutive pair of elements
public boolean hasDuplicate2(int[] array) {
    boolean found = false;
    int failedChecks = 0;
    for (int i = 0; i < array.length - 1; i++) {
        if (array[i] == array[i + 1])
            found = true;
        else
            failedChecks++;
    }
    return found;
}
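To get a feel for how differently the two solutions scale before modeling them, one option is to count how many times each solution's `if` test runs. The sketch below assumes a hypothetical helper class `DuplicateCounts`; its loops have the same shape as the two solutions, but they only count the tests.

```java
// Hypothetical instrumentation: count how often each solution's 'if' test executes.
class DuplicateCounts {
    // Same loop shape as hasDuplicate1: every (i, j) pair is tested.
    static long checksSolution1(int[] array) {
        long checks = 0;
        for (int i = 0; i < array.length; i++) {
            for (int j = 0; j < array.length; j++) {
                checks++;
            }
        }
        return checks;
    }

    // Same loop shape as hasDuplicate2: only consecutive pairs are tested.
    static long checksSolution2(int[] array) {
        long checks = 0;
        for (int i = 0; i < array.length - 1; i++) {
            checks++;
        }
        return checks;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 10_000; n *= 10) {
            int[] array = new int[n];
            System.out.println("n=" + n
                    + " solution1=" + checksSolution1(array)
                    + " solution2=" + checksSolution2(array));
        }
    }
}
```

For an array of length n, the first count is n * n and the second is n - 1, which matches the quadratic and linear models derived for the two solutions.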



Modeling Case Study: Solution 2
Goal: produce a mathematical function $f(n)$ representing runtime, where $n$ is the size of the array
Solution 2: compare each consecutive pair of elements
public boolean hasDuplicate2(int[] array) {
    boolean found = false;                          // +1
    int failedChecks = 0;                           // +1
    for (int i = 0; i < array.length - 1; i++) {    // +1 (int i = 0); loop = (n - 1)(body)
        if (array[i] == array[i + 1])               // +4 (test)
            found = true;                           // +1
        else
            failedChecks++;                         // +1
    }
    return found;                                   // +1
}
If statement: 4 (test) + 1 (either branch) = 5
Loop body: 5 (if) + 2 (loop checks) = 7
$f(n) = 7(n - 1) + 4$
linear -> $O(n)$

Approach
-> start with basic operations, work inside out for control structures
- Each basic operation = +1
- Conditionals = test operations + appropriate branch (today branches equivalent)
- Loop = #iterations * (operations in loop body)

Finding a Big-O
We have an expression for $f(n)$: $f(n) = 7(n - 1) + 4$
How do we get the big-O?
1. Make $f(n)$ “look nice”: $f(n) = 7(n - 1) + 4 = 7n - 7 + 4 = 7n - 3$
2. Find the “dominating term” and delete all others: $f(n) = 7n - 3 \approx 7n$
- The “dominating” term is the one that is largest as $n$ gets bigger. In this class, often the largest power of $n$.
3. Remove any constant factors: $f(n) \approx 7n \approx n$
4. Write the final big-O: $f(n)$ is $O(n)$
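One informal way to see why the lower-order term can be dropped is to evaluate the model for growing n. The sketch below assumes a hypothetical class `DominantTerm`; it only evaluates the formula $f(n) = 7(n - 1) + 4$ and does not measure real running time.

```java
// Hypothetical sketch: watch f(n) / n settle near the constant factor.
class DominantTerm {
    // The model for Solution 2: f(n) = 7(n - 1) + 4.
    static long f(long n) {
        return 7 * (n - 1) + 4;
    }

    public static void main(String[] args) {
        // As n grows, the "- 3" in 7n - 3 becomes negligible next to 7n,
        // so the ratio f(n) / n approaches 7.
        for (long n = 10; n <= 1_000_000; n *= 10) {
            System.out.printf("n=%d f(n)=%d f(n)/n=%.5f%n", n, f(n), (double) f(n) / n);
        }
    }
}
```

The ratio approaching a constant is what "linear" means here; step 3 then discards that constant.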



Wait, what?
Why did we just throw out all of that information?
Big-O is the “significant digits” of computer science.

We care about what happens when $n$ gets bigger
- All code is “fast enough” for small $n$ in practice

For large enough $n$ the dominant term decides how big the function is.

Why get rid of constants – we were counting “basic operations”
There is not a strong correlation between the number of basic operations and the time code actually takes to run.



Why aren’t they significant?
public static void method1(int[] input) {
    int n = input.length;
    input[n-1] = input[3] + input[4];
    input[0] += input[1];
}

public static void method2(int[] input) {
    int five = 5;
    input[five] = input[five] + 1;
    input[five]--;
}

public static void method1(int[]); Code:
 0: aload_0
 1: arraylength
 2: istore_1
 3: aload_0
 4: iload_1
 5: iconst_1
 6: isub
 7: aload_0
 8: iconst_3
 9: iaload
10: aload_0
11: iconst_4
12: iaload
13: iadd
14: iastore
15: aload_0
16: iconst_0
17: dup2
18: iaload
19: aload_0
20: iconst_1
21: iaload
22: iadd
23: iastore
24: return

public static void method2(int[]); Code:
 0: iconst_5
 1: istore_1
 2: aload_0
 3: iload_1
 4: aload_0
 5: iload_1
 6: iaload
 7: iconst_1
 8: iadd
 9: iastore
10: aload_0
11: iload_1
12: dup2
13: iaload
14: iconst_1
15: isub
16: iastore
17: return


Why aren’t they significant?
It goes deeper.

The Java bytecode is converted (compiled) into your own machine’s assembly code
- Might change the number of lines again.

The number of lines still isn’t a perfect reflection of time taken by your laptop.
- The amount of time it takes to look up a value in memory is wildly variable
- Recently used values are probably “cached” and will have a quick lookup
- If a value hasn’t been used in a long time, might have to wait for main memory, which takes thousands of times as long.

Modern computers do lots of crazy things to speed up code.
- “pipelining” (execute parts of multiple instructions simultaneously)
- “branch prediction” (guess whether you’re about to go down the if or else branch before it actually gets there)



Code Modeling
We can’t accurately model the constant factors just by staring at the code.
And the lower-order terms matter even less than the constant factors.

So we just ignore them for the big-O.

If we ask for a model, we won’t care about whether you count 4 operations per loop or 5 (or 10 or 1 or 28).
We want to be able to see your numbers weren’t guesses and that you get the right big-O.

This does not mean you shouldn’t care about constant factors ever – they are important in real code!
- Our theoretical tools aren’t precise enough to analyze them well.



Modeling Case Study: Solution 1
Solution 1: compare each pair of elements
public boolean hasDuplicate1(int[] array) {
    boolean found = false;
    int failedChecks = 0;
    for (int i = 0; i < array.length; i++) {          // xn: n(8n + 1)
        for (int j = 0; j < array.length; j++) {      // xn: 8n
            if (i != j && array[i] == array[j])       // +5 (test)
                found = true;                         // +1
            else
                failedChecks++;                       // +1
        }
    }
    return found;
}
If statement: 5 (test) + 1 (either branch) = 6
Inner loop: $8n$
Outer loop: $n(8n + 1)$
$f(n) = n(8n + 1) + 4$
quadratic -> $O(n^2)$

Approach
-> start with basic operations, work inside out for control structures
- Each basic operation = +1
- Conditionals = test operations + appropriate branch (today branches equivalent)
- Loop = #iterations * (operations in loop body)
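Applying the same four big-O steps used for Solution 2 to this model gives the quadratic result. This is a worked sketch of the simplification, using only the model $f(n) = n(8n + 1) + 4$ stated above:

```latex
\begin{align*}
f(n) &= n(8n + 1) + 4 \\
     &= 8n^2 + n + 4      && \text{(make it ``look nice'')} \\
     &\approx 8n^2        && \text{(keep the dominating term)} \\
     &\approx n^2         && \text{(remove the constant factor)}
\end{align*}
% so f(n) is O(n^2)
```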

Your turn! (5 minutes)
Write the specific mathematical code model for the following code and indicate the big-O runtime in terms of $k$.
public void foobar(int k) {
    int j = 0;                                    // +1
    while (j < k) {                               // +k/5 (body)
        for (int i = 0; i < k; i++) {             // +k (body)
            System.out.println("Hello world");    // +1
        }
        j = j + 5;                                // +2
    }
}
$f(k) = \frac{k(k + 2)}{5}$
quadratic -> $O(k^2)$

Approach
-> start with basic operations, work inside out for control structures
- Each basic operation = +1
- Conditionals = test operations + appropriate branch (today branches equivalent)
- Loop = #iterations * (operations in loop body)
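The model for this exercise can be sanity-checked by counting how many times the print statement would run. The sketch below assumes a hypothetical class `FoobarCount` that keeps the loop structure of `foobar` but increments a counter instead of printing.

```java
// Hypothetical instrumentation of foobar: count println calls instead of printing.
class FoobarCount {
    static long printCount(int k) {
        long prints = 0;
        int j = 0;
        while (j < k) {                    // runs about k / 5 times (rounded up)
            for (int i = 0; i < k; i++) {  // runs k times per outer iteration
                prints++;
            }
            j = j + 5;
        }
        return prints;
    }

    public static void main(String[] args) {
        for (int k = 10; k <= 10_000; k *= 10) {
            System.out.println("k=" + k + " prints=" + printCount(k));
        }
    }
}
```

When k is a multiple of 5 the count is exactly k * k / 5; otherwise the outer loop runs one extra partial step, which the model's k/5 approximates. Either way the count grows quadratically in k.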

More Practice
Let myLL be a linked list (like we saw in lecture 1) with $n$ nodes.
Suppose we’re a client class. Let’s try to print every element of the list.
Assume get(i) takes $i$ steps.
for (int i = 0; i < myLL.size(); i++) {
    System.out.println(myLL.get(i));
}

The number of operations changes each time through the loop.

Summations to the rescue!
$\sum_{i=0}^{n-1} i = \frac{n(n - 1)}{2}$
Summations review and a bunch of identities:
https://courses.cs.washington.edu/courses/cse373/19su/resources/
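The summation can be checked against a small experiment. The sketch below assumes a hypothetical singly linked list, `StepCountingList`, whose `get(i)` follows `i` links from the front and records each link it follows; it is not the list class from lecture 1.

```java
// Hypothetical linked list that counts how many links get(i) follows.
class StepCountingList {
    private static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    private Node front;
    private int size;
    long steps; // total links followed by all get calls so far

    void addFront(int value) {
        Node node = new Node(value);
        node.next = front;
        front = node;
        size++;
    }

    int size() {
        return size;
    }

    // Walks from the front, following exactly 'index' links.
    int get(int index) {
        Node curr = front;
        for (int i = 0; i < index; i++) {
            curr = curr.next;
            steps++;
        }
        return curr.data;
    }

    // Client-style traversal: calls get(i) for every index and returns the total steps.
    static long stepsToVisitAll(int n) {
        StepCountingList list = new StepCountingList();
        for (int i = 0; i < n; i++) {
            list.addFront(i);
        }
        for (int i = 0; i < list.size(); i++) {
            list.get(i);
        }
        return list.steps;
    }

    public static void main(String[] args) {
        int n = 100;
        System.out.println(stepsToVisitAll(n) + " vs " + (long) n * (n - 1) / 2);
    }
}
```

Under these assumptions the total number of steps equals n(n - 1)/2, so the client-side traversal is quadratic even though it looks like a single loop.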

Iterators


Traversing Data
We could get through the data much more efficiently in the Linked List class itself.
Node curr = this.front;
while (curr != null) {
    System.out.println(curr.data);
    curr = curr.next;
}
What if the client wants to do something other than just print?
We should provide access to each element, in order, as a service to client classes.

Iterator!
for (T item : list) {
    System.out.println(item);
}



Review: Iterators
iterator: a Java interface that dictates how a collection of data should be traversed. Can only move in the forward direction and in a single pass.

Iterator Interface supported operations:
hasNext() – returns true if the iteration has more elements yet to be examined
next() – returns the next element in the iteration and moves the iterator forward to the next item

With an explicit iterator:
ArrayList<Integer> list = new ArrayList<Integer>();
//fill up list

Iterator<Integer> itr = list.iterator();
while (itr.hasNext()) {
    int item = itr.next();
}

With a for-each loop:
ArrayList<Integer> list = new ArrayList<Integer>();
//fill up list

for (int i : list) {
    int item = i;
}

Implementing an Iterator
Usually: you’ll have a private class for the iterator object.
That iterator class will have a field (instance variable) to remember where you are.
hasNext() – check if there’s something left by examining the field.
next() – return the current thing and update the field.

You have a choice:
- The field might point to the thing you just processed
- Or the next thing that would be returned.

Both will work; one might be easier to think about/code up in some instances than the other.
Punchline: Iterators make your client’s code more efficient (which is what they care about)
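The description above can be sketched as a small generic linked list with a private iterator class. The names `SimpleLinkedList`, `Node`, and `NodeIterator` are assumptions made for this example, not the course's project code. This version takes the second option: the iterator's field points at the next node to be returned.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical list used only to illustrate a private iterator class.
class SimpleLinkedList<T> implements Iterable<T> {
    private static class Node<E> {
        E data;
        Node<E> next;
        Node(E data) { this.data = data; }
    }

    private Node<T> front;
    private Node<T> back;

    // Appends a value at the end of the list.
    void add(T value) {
        Node<T> node = new Node<>(value);
        if (front == null) {
            front = node;
        } else {
            back.next = node;
        }
        back = node;
    }

    @Override
    public Iterator<T> iterator() {
        return new NodeIterator();
    }

    // Private iterator class: one field remembers where we are.
    private class NodeIterator implements Iterator<T> {
        private Node<T> curr = front; // the next node to be returned

        @Override
        public boolean hasNext() {
            return curr != null;
        }

        @Override
        public T next() {
            if (curr == null) {
                throw new NoSuchElementException();
            }
            T value = curr.data;
            curr = curr.next; // one link per call, no re-walking from the front
            return value;
        }
    }

    public static void main(String[] args) {
        SimpleLinkedList<String> list = new SimpleLinkedList<>();
        list.add("a");
        list.add("b");
        for (String item : list) {
            System.out.println(item);
        }
    }
}
```

Because each call to `next()` follows a single link, a client's for-each loop over n elements does work proportional to n, instead of the quadratic cost of repeated `get(i)` calls.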
