
04 Version Control


Data Science

Survival Skills

Version Control and Python Package Management


Exercise: Opening JPGs and opening JPGs

Decoding: trade accuracy for speed...

imageio vs. OpenCV: looks the same, but not px by px!!
Exercises **UPDATE**
● You have 1 full week (Monday after the lecture to Monday before the next lecture)
● If you’re too late… you’re too late. No exceptions. You had enough time.
● You only get the bonus point if you made a serious attempt at all tasks.
● On Friday we will have Q&As, further examples, tips & tricks etc.
● The solution will be provided as soon as possible after the deadline.

Bonus points are indicated in StudOn using the “Bestanden/Passed” option for each exercise.
Agenda
● The dilemma of code versions
● Version control concepts
● Git
● GitHub/GitLab
● PyPI
● A Python package
● Documentation (!)
“Sure, I just need
10 lines of code…”
Here you go!
Here you go!!
Versioning
● Thesis.docx
● Thesis_1.docx
● Thesis_1_anki.docx

?!?
● Thesis_2.docx
● Thesis_2_anki.docx
● Thesis_2_AB.docx
● …..
● Thesis_final.docx
● Thesis_final_anki.docx
● Thesis_final2.docx
● Thesis_final2_fix.docx
Software versioning

Semantic versioning

Example: Python versioning, e.g. 2.7 and 3.10 → major change may indicate incompatibilities
and breaking changes!
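A minimal sketch of how such version numbers can be compared in code. The helper names `parse_version` and `is_breaking_change` are made up for this illustration, and padding short versions such as "3.10" with zeros is an assumption of the sketch, not part of any standard. It also shows why versions should be compared as numbers, not as strings.

```python
def parse_version(text):
    """Turn '3.10' or '1.2.123' into a tuple (major, minor, patch) of ints."""
    parts = [int(p) for p in text.split(".")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"expected MAJOR[.MINOR[.PATCH]], got {text!r}")
    # Assumption of this sketch: missing components count as 0.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_breaking_change(old, new):
    """A higher major number signals possible incompatibilities."""
    return parse_version(new)[0] > parse_version(old)[0]


# Comparing strings goes wrong: "1" sorts before "9".
print("3.10" < "3.9")                                # True (misleading)
# Comparing tuples of ints gives the intended order.
print(parse_version("3.10") > parse_version("3.9"))  # True
print(is_breaking_change("2.7", "3.10"))             # True
print(is_breaking_change("3.9", "3.10"))             # False
```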
More on SemVer
Alpha, Beta, Release Candidate

Alpha
● Internally tested
● Not feature complete
● Serious performance issues
● (Un)known bugs
● Software may crash often

Beta
● Feature complete
● Contains significantly fewer bugs
● Still performance and speed issues
● Can also crash, and data loss may occur
● Interaction with users → usability testing

Release Candidate (SILVER)
● Final beta product with acceptable bugs
● Minimal interaction with source code and documentation

Release (GOLD)
● Release to Manufacturing (RTM)
● Digitally signed → knowing product state
Glottis Analysis Tools
How can I track more meaningful file versions?
Is there some kind of “version control”?

Already developed in the 80s.


They store changes using delta compression!
Delta compression

https://ably.com/blog/practical-guide-to-diff-algorithms
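The idea can be sketched in a few lines of Python: instead of storing the new file in full, store only instructions for rebuilding it from the old one. The functions `make_delta` and `apply_delta` are hypothetical helpers written for this illustration on top of the standard-library `difflib.SequenceMatcher`; this is not the algorithm of any particular version control system.

```python
import difflib


def make_delta(old, new):
    """Describe `new` as copies from `old` plus newly added lines."""
    delta = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))      # reuse old[i1:i2]
        elif tag in ("replace", "insert"):
            delta.append(("add", new[j1:j2]))   # store only the new lines
        # "delete": nothing to store, the old lines are simply skipped
    return delta


def apply_delta(old, delta):
    """Rebuild the new version from the old version and the delta."""
    result = []
    for op in delta:
        if op[0] == "copy":
            result.extend(old[op[1]:op[2]])
        else:
            result.extend(op[1])
    return result


old = ["Title\n", "First draft of the intro.\n", "Methods\n", "Results\n"]
new = ["Title\n", "Revised intro.\n", "Methods\n", "Results\n", "Discussion\n"]

delta = make_delta(old, new)
print(delta)
print(apply_delta(old, delta) == new)  # True
```

Only the changed and added lines end up in the delta; unchanged lines are referenced by position.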
Apache Subversion (SVN)

● Intended as the successor to CVS


● Is used in the following projects:
Clang, FreeBSD, GCC…
● Fixed a lot of previous bugs in CVS and implemented more features

Issue:
● Renaming is implemented as copy & delete, which is fed back into the complete
file history → could break things in older versions
The BitKeeper controversy (early 2000s)

BitKeeper: “You can use BitKeeper free of charge for cool freeware and open source project”*

*if you are not actively supporting any competitor to BitKeeper

Used in the Linux kernel by some people…

Andrew Tridgell reverse engineered the BitKeeper protocol to create “SourcePuller” -> BitKeeper revoked free licenses……..
Git

https://github.blog/2020-12-17-commits-are-snapshots-not-diffs/

Developed in 1 month (April 2005)


Powered the kernel 2.6.12 release (June 2005)
The Git principle
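A toy model of the snapshot idea, assuming a simplified in-memory store: every commit records the full set of files, but each file's content is stored only once under a hash of that content. `git_blob_id` follows the way Git computes blob IDs (SHA-1 over a `blob <size>\0` header plus the content); the `SnapshotStore` class is a made-up illustration and leaves out trees, commit objects, compression and packfiles.

```python
import hashlib


def git_blob_id(content):
    """Hash file content the way Git hashes a blob object."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


class SnapshotStore:
    """Hypothetical mini store: commits are snapshots, content is deduplicated."""

    def __init__(self):
        self.objects = {}   # blob id -> content
        self.commits = []   # list of {path: blob id}

    def commit(self, files):
        snapshot = {}
        for path, content in files.items():
            blob_id = git_blob_id(content)
            self.objects.setdefault(blob_id, content)  # store new content only
            snapshot[path] = blob_id
        self.commits.append(snapshot)
        return snapshot


store = SnapshotStore()
store.commit({"README.md": b"# Demo\n", "main.py": b"print('v1')\n"})
store.commit({"README.md": b"# Demo\n", "main.py": b"print('v2')\n"})

# Two snapshots with two files each, but only three stored objects,
# because README.md did not change between the commits.
print(len(store.commits), len(store.objects))  # 2 3
```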
The branching principle
Where to store these “repositories”?
Example repository
https://github.com/anki-xyz/pipra
Git software
Storing Python code

https://docs.python-guide.org/writing/structure/#structure-of-code-is-key
Content of repo
Licenses

https://choosealicense.com/
Licenses on GitHub
Documentation
● Sphinx
● Read The Docs

https://www.writethedocs.org/guide/writing/beginners-guide-to-docs/

● You will be using your code in 6 months


● You want people to use your code
● You want people to help out
● You want your code to be better
● You want to be a better writer
How to document a function
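One possible way to do this is a docstring directly below the function signature. The sketch below uses the NumPy docstring style, which is one of several styles Sphinx can render; the function `moving_average` is invented for this example.

```python
def moving_average(values, window=3):
    """Compute the simple moving average of a sequence.

    Parameters
    ----------
    values : list of float
        The input numbers, in order.
    window : int, optional
        Number of consecutive values to average, by default 3.

    Returns
    -------
    list of float
        One average per full window; empty if `values` is shorter
        than `window`.

    Raises
    ------
    ValueError
        If `window` is smaller than 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


print(moving_average([1, 2, 3, 4], window=2))  # [1.5, 2.5, 3.5]
help(moving_average)  # shows the docstring in the interpreter
```

The docstring documents what goes in, what comes out and what can go wrong; comments inside the function body would instead explain how the code works.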
Tip: Autodocs
Every IDE has an autodoc format,
I’ll show you!

Also use pylint and similar tools,
especially ones that automatically fix formatting for you:

https://github.com/psf/black
Use Cookiecutter for your projects
Caveat: I have never used this myself, though...

For Python: https://github.com/audreyfeldroy/cookiecutter-pypackage


A new Python package with Cookiecutter
PyPI: make your package pip-installable

pip install myfancypackage
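After installing, it can be useful to check from Python which version ended up in the environment. A small sketch using the standard-library `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up here, and `myfancypackage` is the placeholder name from the slide, so it will normally be reported as not installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


for name in ("myfancypackage", "pip"):
    found = installed_version(name)
    print(name, "->", found if found is not None else "not installed")
```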


The last slide
● Version control keeps track of your code
○ Versions can be stored as delta or snapshot
○ Branching is important for adding features
○ Merging combines features into a common master branch
● Your packages should have a consistent version scheme (1.2.123)
● Licenses specify how others can use your code
● Documentation is an integral part of your code
○ Distinct from commenting
○ Docs for code/functions/classes
○ Docs for tutorials, setting up, getting started, …
● A well-organized Python package should be pip-installable
Exercise
Description of the exercise
For this exercise, we will deal with version control using GitHub and package
management using pip. Furthermore, we are going to learn how to create and install
our own packages using Git and pip.

Creating a repo and installing it with pip
The first step is to create an account on GitHub. Then, your task is to set up a new public
repository!

This repo will contain the necessary structure to be installed using pip. You will create a
package containing a function that we will install and check directly in our virtual
environment.
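A rough sketch of what such a structure might look like, written as a script that creates it in a temporary folder. Everything here is an assumption for illustration and not the official exercise solution: the package name `myfancypackage`, the function `greet`, the version `0.1.0` and the choice of a `pyproject.toml` with the setuptools build backend.

```python
import tempfile
from pathlib import Path

# Minimal build configuration (assumed: setuptools as build backend).
PYPROJECT = """\
[build-system]
requires = ["setuptools>=61"]
build-backend = "setuptools.build_meta"

[project]
name = "{name}"
version = "0.1.0"
description = "Example package for the exercise"
readme = "README.md"
"""

INIT_PY = '''\
def greet(who="world"):
    """Return a short greeting."""
    return f"Hello, {who}!"
'''


def create_skeleton(root, name="myfancypackage"):
    """Create a minimal pip-installable layout below `root`."""
    root = Path(root)
    package_dir = root / name
    package_dir.mkdir(parents=True, exist_ok=True)
    (package_dir / "__init__.py").write_text(INIT_PY)
    (root / "pyproject.toml").write_text(PYPROJECT.format(name=name))
    (root / "README.md").write_text(f"# {name}\n\nWhat it does, how to install it.\n")
    (root / "LICENSE").write_text("Put the text of your chosen license here.\n")
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(tmp)
    for path in created:
        print(path)
```

With such a layout, running `pip install .` in the repository root should install the package into the active environment. Once the repository is pushed, pip can also install straight from it with `pip install git+https://github.com/<user>/<repo>.git`, where `<user>` and `<repo>` are placeholders for your own account and repository.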
Description of the exercise
● Create a GitHub account if you don’t have one
● You get a script idea from us with the instructions to follow.
● Create a public repository for the script and make it pip-installable.
● Ensure that you have a proper README, a license file, etc.
