Unit Testing Python-Chapter1

Unit testing automates repetitive testing of functions to validate their expected behavior and outputs. It saves time compared to manual testing by running tests repeatedly during development. The document demonstrates unit testing a function for parsing rows from a housing data file, showing examples of valid and invalid inputs and their expected outputs to test the function's robustness.


Why unit test?

UNIT TESTING FOR DATA SCIENCE IN PYTHON

Dibya Chakravorty
Test Automation Engineer
How can we test an implementation?
def my_function(argument):
    ...

my_function(argument_1)

return_value_1

my_function(argument_2)

return_value_2

my_function(argument_3)

return_value_3

UNIT TESTING FOR DATA SCIENCE IN PYTHON


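Life cycle of a function
Figure: flowchart of the life cycle. An implementation is tested. A FAIL leads to a bugfix and another test; a PASS leads to an accepted implementation. A feature request, a refactoring, or a newly found bug sends the accepted implementation back for more testing, so a function may be tested 100 times.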


Example
def row_to_list(row):
    ...

File: housing_data.txt

area (sq. ft.)    price (dollars)
2,081             314,942
1,059             186,606
                  293,410
1,148             206,186
1,506             248,419
1,210             214,114
1,697             277,794
1,268             194,345
2,318             372,162
1,463238,765
1,468             239,007


Data format
def row_to_list(row):
    ...

Argument              Type     Return value
"2,081\t314,942\n"    Valid    ["2,081", "314,942"]


Data isn't clean
def row_to_list(row):
    ...

Argument              Type       Return value
"2,081\t314,942\n"    Valid      ["2,081", "314,942"]
"\t293,410\n"         Invalid    None
"1,463238,765\n"      Invalid    None

In housing_data.txt, the row "293,410" is missing its area and the row "1,463238,765" is missing its tab.
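
The argument table pins down the expected behavior for three kinds of rows, so a minimal sketch of row_to_list() is possible. The slides only show `...` for the body, so the implementation below is an assumption that merely satisfies the three documented cases; it is not the course's code.

```python
def row_to_list(row):
    """Split a tab-separated row into [area, price]; return None for dirty rows.

    Hypothetical implementation: the slides elide the real function body.
    """
    # Drop the trailing newline, then split on the tab separator.
    fields = row.rstrip("\n").split("\t")
    # A clean row has exactly two non-empty fields.
    if len(fields) != 2 or any(field == "" for field in fields):
        return None
    return fields


print(row_to_list("2,081\t314,942\n"))  # clean row
print(row_to_list("\t293,410\n"))       # missing area
print(row_to_list("1,463238,765\n"))    # missing tab
```

Returning None for both kinds of invalid row keeps the contract simple: callers only need one check to skip dirty data.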

Time spent in testing this function
Testing row_to_list() by hand means calling it in the interpreter with each argument from the table and checking every return value:

row_to_list("2,081\t314,942\n")

["2,081", "314,942"]

row_to_list("\t293,410\n")

None

row_to_list("1,463238,765\n")

None

The function is tested about 100 times over its life cycle, and these manual checks are repeated every time.


Manual testing vs. unit tests
Unit tests automate the repetitive testing process and save time.


Learn unit testing - with a data science spin
Figure: the housing data (area in sq. ft., price in dollars) and a linear regression of housing price against area.


GitHub repository of the course

Implementation of functions like row_to_list().


Develop a complete unit test suite
data/
src/
|-- data/
|-- features/
|-- models/
|-- visualization/
tests/ # Test suite
|-- data/
|-- features/
|-- models/
|-- visualization/

Write unit tests for your own projects.
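
The layout above has tests/ mirror src/ directory by directory. As a rough sketch of that idea, the snippet below builds the mirrored folders in a temporary directory. The helper name create_mirrored_tests and the use of a temporary project root are assumptions made for illustration only.

```python
import tempfile
from pathlib import Path

# Subdirectories of src/ as listed on the slide.
SUBPACKAGES = ["data", "features", "models", "visualization"]


def create_mirrored_tests(project_root):
    """Create tests/<name>/ for every src/<name>/ under project_root (hypothetical helper)."""
    created = []
    for package_dir in sorted((project_root / "src").iterdir()):
        if package_dir.is_dir():
            test_dir = project_root / "tests" / package_dir.name
            test_dir.mkdir(parents=True, exist_ok=True)
            created.append(test_dir.name)
    return created


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Build the src/ tree first, then mirror it under tests/.
    for name in SUBPACKAGES:
        (root / "src" / name).mkdir(parents=True)
    mirrored = create_mirrored_tests(root)

print(mirrored)
```

Mirroring the source tree makes it easy to find the test module that belongs to a given source module.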

Let's practice these concepts!

Write a simple unit test using pytest
Testing on the console
row_to_list("2,081\t314,942\n")

["2,081", "314,942"]

row_to_list("\t293,410\n")

None

row_to_list("1,463238,765\n")

None

Unit tests improve this process.

Python unit testing libraries
pytest

unittest

nosetests

doctest

We will use pytest!

Has all essential features.

Easiest to use.

Most popular.

Step 1: Create a file
Create test_row_to_list.py.

The test_ prefix indicates that the file contains unit tests (naming convention).

Such files are also called test modules.

Step 2: Imports
Test module: test_row_to_list.py

import pytest
from row_to_list import row_to_list  # assumes row_to_list() is defined in row_to_list.py

Step 3: Unit tests are Python functions
Test module: test_row_to_list.py

import pytest
from row_to_list import row_to_list  # assumes row_to_list() is defined in row_to_list.py

def test_for_clean_row():
    ...

Argument              Type     Return value
"2,081\t314,942\n"    Valid    ["2,081", "314,942"]

Theoretical structure of an assertion
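assert boolean_expression

If the expression is True, the assert statement passes silently.

assert True

If the expression is False, it raises an AssertionError.

assert False

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError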
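
The two outcomes of an assert statement can also be observed without a traceback. The sketch below wraps the statement in a small helper; the name check is hypothetical and exists only for this illustration.

```python
def check(expression):
    """Return "passed" if `assert expression` succeeds, else the exception's name."""
    try:
        assert expression
    except AssertionError as error:
        return type(error).__name__
    return "passed"


print(check(1 + 1 == 2))  # the expression is True
print(check(1 + 1 == 3))  # the expression is False
```

pytest relies on this behavior: a test function that finishes without an exception counts as passed.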

Step 4: Assertion
Test module: test_row_to_list.py

import pytest
from row_to_list import row_to_list  # assumes row_to_list() is defined in row_to_list.py

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == \
        ["2,081", "314,942"]

Argument              Type     Return value
"2,081\t314,942\n"    Valid    ["2,081", "314,942"]

A second unit test
Test module: test_row_to_list.py

import pytest
from row_to_list import row_to_list  # assumes row_to_list() is defined in row_to_list.py

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == \
        ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None

Argument              Type       Return value
"2,081\t314,942\n"    Valid      ["2,081", "314,942"]
"\t293,410\n"         Invalid    None

Checking for None values
Do this to check whether var is None.

assert var is None

Do not do this.

assert var == None
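
One reason to prefer `is None` is that `==` can be overridden by the object being compared, while `is` always checks identity. The class below is a deliberately contrived, hypothetical example that shows how `== None` can give a misleading answer.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


var = AlwaysEqual()

# The equality check calls AlwaysEqual.__eq__ and is fooled.
equals_none = (var == None)  # noqa: E711
# The identity check cannot be overridden, so it gives the right answer.
is_none = (var is None)

print(equals_none, is_none)  # prints: True False
```

PEP 8 makes the same recommendation: comparisons to singletons like None should use `is` or `is not`.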

A third unit test
Test module: test_row_to_list.py

import pytest
from row_to_list import row_to_list  # assumes row_to_list() is defined in row_to_list.py

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == \
        ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None

def test_for_missing_tab():
    assert row_to_list("1,463238,765\n") is None

Argument              Type       Return value
"2,081\t314,942\n"    Valid      ["2,081", "314,942"]
"\t293,410\n"         Invalid    None
"1,463238,765\n"      Invalid    None

Step 5: Running unit tests
Run this on the command line.

pytest test_row_to_list.py

Running unit tests in DataCamp exercises

Next lesson: test result report


Let's write some unit tests!

Understanding test result report
Test result report
!pytest test_row_to_list.py

============================= test session starts ==============================
platform linux -- Python 3.6.7, pytest-4.0.1, py-1.8.0, pluggy-0.9.0
rootdir: /tmp/tmpvdblq9g7, inifile:
plugins: mock-1.10.0
collecting ...
collected 3 items

test_row_to_list.py .F.                                                  [100%]

=================================== FAILURES ===================================
____________________________ test_for_missing_area _____________________________

    def test_for_missing_area():
>       assert row_to_list("\t293,410\n") is None


Section 1: general information
============================= test session starts ==============================
platform linux -- Python 3.6.7, pytest-4.0.1, py-1.8.0, pluggy-0.9.0
rootdir: /tmp/tmpvdblq9g7, inifile:
plugins: mock-1.10.0

Section 2: Test result
collecting ...
collected 3 items

test_row_to_list.py .F.                                                  [100%]

Character    Meaning    When                                              Action
F            Failure    An exception is raised when running unit test.    Fix the function or unit test.
.            Passed     No exception raised when running unit test.       Everything is fine. Be happy!

A failure can come from an assertion that raises AssertionError.

def test_for_missing_area():
    assert row_to_list("\t293,410") is None    # AssertionError from this line

A failure can also come from another exception.

def test_for_missing_area():
    assert row_to_list("\t293,410") is none    # NameError from this line
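
Both kinds of failure can be reproduced outside pytest. The sketch below uses a hypothetical stand-in function that returns a list where None is expected, plus a small helper (exception_name, also hypothetical) that reports which exception a test body raises.

```python
def returns_a_list(row):
    # Hypothetical stand-in for a function that fails to return None for a dirty row.
    return ["", "293,410"]


def failing_assertion_case():
    # The assertion itself is False, so AssertionError is raised.
    assert returns_a_list("\t293,410") is None


def typo_case():
    # `none` (lowercase) is not defined, so NameError is raised instead.
    assert returns_a_list("\t293,410") is none


def exception_name(test_body):
    """Run a test body and return the name of the exception it raises, if any."""
    try:
        test_body()
    except Exception as error:
        return type(error).__name__
    return None


print(exception_name(failing_assertion_case))
print(exception_name(typo_case))
```

In the first case the function under test may be wrong; in the second the test itself is wrong, which is why the suggested action is to fix the function or the unit test.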

Section 3: Information on failed tests
=================================== FAILURES ===================================
____________________________ test_for_missing_area _____________________________

    def test_for_missing_area():
>       assert row_to_list("\t293,410\n") is None
E       AssertionError: assert ['', '293,410'] is None
E        +  where ['', '293,410'] = row_to_list('\t293,410\n')

test_row_to_list.py:7: AssertionError

The line raising the exception is marked by >.

>       assert row_to_list("\t293,410\n") is None

The exception is an AssertionError.

E       AssertionError: assert ['', '293,410'] is None

The line containing where displays return values.

E        +  where ['', '293,410'] = row_to_list('\t293,410\n')
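
The report does not show the implementation of row_to_list(), but the return value ['', '293,410'] is what a plain split on the tab character would produce. The sketch below reproduces it under that assumption; naive_row_to_list is a hypothetical name and not the course's code.

```python
def naive_row_to_list(row):
    # Hypothetical buggy version: splits on the tab without validating the fields.
    return row.rstrip("\n").split("\t")


returned = naive_row_to_list("\t293,410\n")
print(returned)  # ['', '293,410']
```

The empty string in the first position is the missing area, which is why the function returns a list where the test expects None.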

Section 4: Test result summary
====================== 1 failed, 2 passed in 0.03 seconds ======================

Result summary from all unit tests that ran: 1 failed, 2 passed.

Total time for running tests: 0.03 seconds.

Much faster than testing in the interpreter!


Let's practice reading test result reports

More benefits and test types
Unit tests serve as documentation
Test module: test_row_to_list.py

import pytest
from row_to_list import row_to_list  # assumes row_to_list() is defined in row_to_list.py

def test_for_clean_row():
    assert row_to_list("2,081\t314,942\n") == \
        ["2,081", "314,942"]

def test_for_missing_area():
    assert row_to_list("\t293,410\n") is None

def test_for_missing_tab():
    assert row_to_list("1,463238,765\n") is None

Created from the test module:

Argument              Return value
"2,081\t314,942\n"    ["2,081", "314,942"]
"\t293,410\n"         None
"1,463238,765\n"      None

Guess function's purpose by reading unit tests
!cat test_row_to_list.py


More trust
Users can run tests and verify that the package works.

Reduced downtime

All benefits
Time savings.

Improved documentation.

More trust.

Reduced downtime.

Tests we already wrote
row_to_list()

convert_to_int()

Data module
Figure: pipeline diagram. The data module turns raw data into clean data and contains row_to_list() and convert_to_int().

Feature module
Figure: the pipeline continues. The feature module turns clean data into features.

Models module
Figure: the full pipeline. The models module provides a predictive model that takes housing area and outputs housing price.

Unit test
Figure: the same pipeline diagram (raw data, clean data, features, predictive model).


What is a unit?
Small, independent piece of code.

Python function or class.

Integration test
Checks that several units work together.

Figure: the same pipeline diagram.

End to end test
Checks the whole pipeline at once, from raw data to predicted housing price.

Figure: the same pipeline diagram.


This course focuses on unit tests
Writing unit tests is the best way to learn pytest.

In Chapter 2...
Learn more pytest.

Write more advanced unit tests.

Work with functions in the features and models modules.

Let's practice these concepts!