From Scratch: Writing Your Own Functions

The document discusses writing custom functions in Python. It explains that functions allow code to be reused by defining blocks of code that are called by name. Functions can take arguments as input and return values. The document provides examples of defining, calling, and testing functions. It also covers improving functions by adding flexibility through default arguments, keyword arguments, and encapsulation.

Uploaded by

Anas Jamshed
Copyright
© © All Rights Reserved

from scratch

A primer for scientists working with Next-Generation-Sequencing data

CHAPTER 4

Writing your own functions


Why functions?

Often the same task needs to be performed at different times in a program.

Copying and pasting lines is error-prone and leads to bloated code.

Functions enable you to reuse bits of code without the need to copy them.

What is a function?

We have seen several examples:

● print
● open
● ...

Essentially a function is

● a block of code
● identified and called by a name

Functions can have

● arguments (input values)
● a return value (output)

Defining a function

def get_gc_content(dna):
    length = len(dna)
    g_count = dna.count('G')
    c_count = dna.count('C')
    gc_content = (g_count + c_count) / length
    return gc_content

head: the def line defines the function name (get_gc_content) and the input parameter (dna)

body: the indented lines define the code to run and the return value

Note: This function will not work correctly with Python 2 unless you include the following import at the top:

from __future__ import division
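
As a minimal sketch of why the division note matters, the following block writes out the function from this slide and compares true division with floor division. The variable names true_result and floor_result are illustrative only, and floor division (//) is used here to imitate what Python 2 does when dividing two integers without the import.

```python
def get_gc_content(dna):
    """Return the fraction of G and C bases in dna (upper-case input assumed)."""
    length = len(dna)
    g_count = dna.count('G')
    c_count = dna.count('C')
    gc_content = (g_count + c_count) / length
    return gc_content

# "ATAGGCACATT" contains 2 G and 2 C in 11 bases.
true_result = (2 + 2) / 11    # true division: a float between 0 and 1
floor_result = (2 + 2) // 11  # floor division: 0, as in Python 2 without the import

print(true_result, floor_result)
print(get_gc_content("ATAGGCACATT"))
```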

Calling a function

● functions are called by their defined name:

gc_content = get_gc_content("ATAGGCACATT")

● the function's code is executed only when the function is called
● if the function returns a value, it can be assigned to a variable
  – if the function has no return statement, it returns the value None
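
The points above can be illustrated with a small sketch. The functions log_and_return and log_only and the list calls are hypothetical names made up for this example; they are not part of the original slides.

```python
calls = []  # records every value a function was called with

def log_and_return(value):
    """Record the call and return twice the value."""
    calls.append(value)
    return value * 2

def log_only(value):
    """Record the call; there is no return statement."""
    calls.append(value)

# Defining the functions did not run their bodies, so nothing is recorded yet.
before = len(calls)

doubled = log_and_return(21)  # the return value is assigned to a variable
nothing = log_only(1)         # no return statement, so the result is None

print(before, doubled, nothing)  # prints: 0 42 None
```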

Using our function in a program

● a program can comprise several functions and calls to those functions
● programs are usually considered to be more complex and structured than scripts, which often only implement a small subtask

def get_gc_content(dna):
    [...]
    return gc_content

my_gc_content = get_gc_content("ATGCGCGATCGATCGAATCG")
print(str(my_gc_content))
print(get_gc_content("ATGCATGCAACTGTAGC"))
print(get_gc_content("aactgtagctagctagcagcgta"))
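
For reference, here is a runnable sketch of this program, assuming the elided body is the one shown under "Defining a function". The variable lower_case_result is added only so the last value can be inspected.

```python
def get_gc_content(dna):
    """Return the fraction of upper-case G and C characters in dna."""
    length = len(dna)
    g_count = dna.count('G')
    c_count = dna.count('C')
    gc_content = (g_count + c_count) / length
    return gc_content

my_gc_content = get_gc_content("ATGCGCGATCGATCGAATCG")
print(str(my_gc_content))  # prints 0.55 (11 of 20 bases are G or C)
print(get_gc_content("ATGCATGCAACTGTAGC"))

# The lower-case sequence contains no upper-case 'G' or 'C', so the count is zero.
lower_case_result = get_gc_content("aactgtagctagctagcagcgta")
print(lower_case_result)   # prints 0.0
```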

Improving our function

● we can make the output nicer by rounding the return value:

def get_gc_content(dna):
    [...]
    return round(gc_content, 2)

● Still, the result for the lower-case sequence is incorrect, because count('G') and count('C') only match upper-case letters. We can make the function ignore case by converting the input to upper case before counting:

def get_gc_content(dna):
    g_count = dna.upper().count('G')
    c_count = dna.upper().count('C')
    [...]
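
Combining both improvements gives a version along these lines. This is a sketch that assumes the elided parts match the earlier definition; converting the sequence once and reusing the result is a small variation on the slide, which calls upper() twice.

```python
def get_gc_content(dna):
    """Return the GC fraction of dna, ignoring case, rounded to 2 decimal places."""
    dna = dna.upper()  # convert once so that 'g' and 'c' are counted as well
    length = len(dna)
    g_count = dna.count('G')
    c_count = dna.count('C')
    gc_content = (g_count + c_count) / length
    return round(gc_content, 2)

lower = "aactgtagctagctagcagcgta"
# Upper-case and lower-case input now give the same result.
print(get_gc_content(lower), get_gc_content(lower.upper()))
```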

Improving our function

● More flexibility can be achieved by making the number of decimal places an input parameter:

def get_gc_content(dna, ndigits):
    length = len(dna)
    g_count = dna.upper().count('G')
    c_count = dna.upper().count('C')
    gc_content = (g_count + c_count) / length
    return round(gc_content, ndigits)

Keyword arguments

● A slightly different way of calling a function (or method):

get_gc_content(dna="ATAGGCACATT", ndigits=3)

● identical to the positional call:

get_gc_content("ATAGGCACATT", 3)

● BUT now we can change the parameter order:

get_gc_content(ndigits=3, dna="ATAGGCACATT")

● Keyword arguments can make function calls more descriptive
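
The three call styles can be checked against each other in a short sketch. It uses the two-parameter version of get_gc_content from the previous slide; the variable names positional, by_keyword and reordered are illustrative.

```python
def get_gc_content(dna, ndigits):
    """Return the GC fraction of dna, ignoring case, rounded to ndigits decimal places."""
    length = len(dna)
    g_count = dna.upper().count('G')
    c_count = dna.upper().count('C')
    gc_content = (g_count + c_count) / length
    return round(gc_content, ndigits)

positional = get_gc_content("ATAGGCACATT", 3)
by_keyword = get_gc_content(dna="ATAGGCACATT", ndigits=3)
reordered = get_gc_content(ndigits=3, dna="ATAGGCACATT")

# All three calls bind the same values to the same parameters.
print(positional == by_keyword == reordered)  # prints True
```

Positional and keyword arguments can be mixed, as in get_gc_content("ATAGGCACATT", ndigits=3), but positional arguments must come first; placing a positional argument after a keyword argument is a syntax error.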


Parameter defaults

● Some parameters are provided for flexibility, but in most cases a reasonable default value can be assumed:

def get_gc_content(dna, ndigits=2):
    [...]
    return round(gc_content, ndigits)

● Now we can omit the parameter if we are fine with the default value:

gc_content = get_gc_content("ATAGGCACATT")

● Especially useful for functions with many input parameters
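
A runnable sketch of the default in action, assuming the elided body matches the earlier definition. The variable names default_result and precise_result are illustrative.

```python
def get_gc_content(dna, ndigits=2):
    """Return the GC fraction of dna, rounded to ndigits decimal places (default 2)."""
    length = len(dna)
    g_count = dna.upper().count('G')
    c_count = dna.upper().count('C')
    gc_content = (g_count + c_count) / length
    return round(gc_content, ndigits)

default_result = get_gc_content("ATAGGCACATT")             # ndigits falls back to 2
precise_result = get_gc_content("ATAGGCACATT", ndigits=4)  # the default is overridden

print(default_result, precise_result)
```

In a function definition, parameters with default values have to follow those without, which is why dna comes first here.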


Encapsulation

● putting bits of code that perform a defined subtask into a function is called encapsulation
● encapsulation enables you to deal with only part of the code at a time; it is a method of abstraction
● well-structured code is much easier to maintain and to understand
● encapsulation is also one of the core concepts of object-oriented programming
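
One possible illustration of encapsulation, not taken from the slides: the counting subtask is moved into its own function so that other functions can build on it without repeating it. The helper count_bases and the function get_at_content are hypothetical names introduced for this sketch.

```python
def count_bases(dna, bases):
    """Count how often any of the given bases occurs in dna, ignoring case."""
    dna = dna.upper()
    return sum(dna.count(base) for base in bases)

def get_gc_content(dna, ndigits=2):
    """Return the fraction of G and C bases, rounded to ndigits decimal places."""
    return round(count_bases(dna, "GC") / len(dna), ndigits)

def get_at_content(dna, ndigits=2):
    """Return the fraction of A and T bases, rounded to ndigits decimal places."""
    return round(count_bases(dna, "AT") / len(dna), ndigits)

# Callers only need the function names, not the details of the counting.
print(get_gc_content("ATAGGCACATT"), get_at_content("ATAGGCACATT"))
```

If the counting logic ever needs to change, for example to handle ambiguous bases, only count_bases has to be edited.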

Testing functions

● the assert statement is a means of testing a function's output:

assert get_gc_content("ACGT") == 0.5
assert get_gc_content("acgt") == 0.5
assert get_gc_content("ACT", 3) == 0.333

● assert raises an exception (AssertionError) if the comparison evaluates to False; otherwise it does nothing
● for complex programs with many functions it is good practice to write so-called unit tests to make sure functions work as expected at all times
● unit tests are usually stored in separate test scripts
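
The following sketch shows a failing assert and one way the same checks could be written as unit tests with Python's standard unittest module. The class name GcContentTest, the test method names and the failure message are made up for this example, and the function body is assumed to be the version with the default ndigits=2. In practice the test class would usually live in its own test script.

```python
import unittest

def get_gc_content(dna, ndigits=2):
    """Return the GC fraction of dna, rounded to ndigits decimal places."""
    length = len(dna)
    g_count = dna.upper().count('G')
    c_count = dna.upper().count('C')
    return round((g_count + c_count) / length, ndigits)

# A failing assert raises AssertionError; the optional message describes the problem.
failure_message = None
try:
    assert get_gc_content("ACGT") == 1.0, "GC content of ACGT should not be 1.0"
except AssertionError as error:
    failure_message = str(error)

class GcContentTest(unittest.TestCase):
    """The assertions from the slide, grouped as unit tests."""

    def test_upper_case(self):
        self.assertEqual(get_gc_content("ACGT"), 0.5)

    def test_lower_case(self):
        self.assertEqual(get_gc_content("acgt"), 0.5)

    def test_ndigits(self):
        self.assertAlmostEqual(get_gc_content("ACT", 3), 0.333)

# Run the tests explicitly instead of calling unittest.main(), which would exit.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GcContentTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful(), failure_message)
```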

Recap

In this unit you learned about:

● defining functions
● calling functions
● passing input values to functions as parameters
  – defining default values for parameters
  – passing parameters by name as keyword arguments
● returning output values from within functions
● testing functions
● the concept of encapsulation

Exercise 4-1: Amino Acid Content (pt. 1)

● Write a function aa_content that takes two arguments:
  – a protein sequence
  – an amino acid residue code (single letter)
● The function should return the percentage of the protein that the amino acid makes up.
● Use the following assertions to test your function:

assert aa_content("MSRSLLLRFLLFLLLLPPLP", "M") == 5
assert aa_content("MSRSLLLRFLLFLLLLPPLP", "r") == 10
assert aa_content("MSRSLLLRFLLFLLLLPPLP", "L") == 50
assert aa_content("MSRSLLLRFLLFLLLLPPLP", "Y") == 0

Exercise 4-2: Amino Acid Content (pt. 2)

● Modify the function from part one so that it accepts a list of amino acid residues rather than a single one.
● If no list is given, the function should return the percentage of hydrophobic amino acid residues (A, I, L, M, F, W, Y, V).
● Your function should pass the following assertions:

assert aa_content("MSRSLLLRFLLFLLLLPPLP", ['M']) == 5
assert aa_content("MSRSLLLRFLLFLLLLPPLP", ['M','L']) == 55
assert aa_content("MSRSLLLRFLLFLLLLPPLP", ['F','S','L']) == 70
assert aa_content("MSRSLLLRFLLFLLLLPPLP") == 65
