A Gentle introduction to Python

Nicolas P. Rougier - [email protected]
Lecture notes from the EDMI course taught at the University of Bordeaux for the academic year 2015/16.
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/).

Objectives

The primary goal of this lesson is twofold:

  • To ensure you have a clean Python installation
  • To discover Python basic syntax through the interpreter

Note: This lesson only covers the very basics of Python. If you're already familiar with Python, you can probably skip it and target the next lesson, but who knows, you might discover some tips while reading this lesson. In any case, make sure to have all necessary packages installed before the next lesson.

Installation

As of today (2016), Python exists mainly in two flavors: Python 2.x and Python 3.x. On most systems, a 2.x version is already installed, and on some systems 3.x is installed as well. However, because we don't want to mess with the system installation, we'll install our own private version using either:

  • Enthought Canopy, a comprehensive Python analysis environment that provides easy installation of the core scientific and analytic Python packages, creating a robust platform you can explore, develop, and visualize on.

  • Anaconda by Continuum Analytics, a completely free Python distribution (including for commercial use and redistribution). It includes more than 400 of the most popular Python packages for science, math, engineering, and data analysis.

For this lesson, we'll go for Anaconda, which is available for several architectures.

Anaconda comes with a lot of nice features, but we won't use them (if you're interested, have a look at their starter guide). The next step is to test the installation. To do that, open a terminal and type:

$ conda --version
conda 3.19.1

If this works, your conda version should be equal to or greater than 3.19.1. We can now try to start the Python interpreter.

$ python
Python 3.5.1 (default, Dec 26 2015, 18:11:22)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> |

Anaconda also comes with the IPython interpreter, which is far more powerful than the vanilla interpreter.

$ ipython
Python 3.5.1 |Anaconda 2.5.0 (x86_64)| (default, Dec  7 2015, 11:24:55)
Type "copyright", "credits" or "license" for more information.

IPython 4.0.3 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]: |

Now let's check our installation to see if everything is OK. At this point, you don't need to understand what you're typing, but we need to check that some important packages are present with the proper version. In the following examples, >>> is the prompt and does not need to be typed. If you chose the IPython interpreter, your prompt is more likely something like In [12]:. There is also a second prompt (...) meaning the previous line is not finished and needs to be terminated. This is the case, for example, when you enter an expression with unbalanced parentheses (i.e. the number of opening parentheses is greater than the number of closing parentheses).

Checking for numpy

>>> import numpy
>>> print(numpy.__version__)
1.10.4
>>> numpy.test() # This can last a few seconds
..........................

Checking for scipy

>>> import scipy
>>> print(scipy.__version__)
0.17.0
>>> scipy.test() # This can last a few seconds
...........................

Checking for matplotlib

>>> import matplotlib
>>> print(matplotlib.__version__)
1.5.1
>>> matplotlib.test() # This can last a few minutes
...........................

Checking for cython

>>> import cython
>>> print(cython.__version__)
0.23.4

Checking for opengl

>>> import OpenGL
>>> print(OpenGL.__version__)
3.1.0

For each of these packages, the x.y.z version should be equal to or greater than the displayed version. If this is not the case, then maybe your conda installation is not up to date. You can upgrade all packages at once using:

$ conda update --all

If something goes wrong, you'll have to check on the Anaconda help page.
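
The individual checks above can also be gathered into a short script. This is only a sketch: the helper name installed_version is hypothetical (it does not come from conda or Anaconda), and it relies on the common convention, not a guarantee, that a package exposes a __version__ attribute.

```python
import importlib


def installed_version(name):
    """Return the __version__ of a package, or None if it cannot be imported.

    Hypothetical helper for illustration; not part of conda or Anaconda.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every module defines __version__, so fall back to a placeholder.
    return getattr(module, "__version__", "unknown")


# The same packages that were checked by hand above.
for name in ["numpy", "scipy", "matplotlib", "cython", "OpenGL"]:
    print(name, installed_version(name))
```

A package reported as None is not installed in the environment that runs the script.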

Python is a calculator

Arithmetic computation

Now it's time to experiment a little bit with Python. Let's start with simple arithmetic operations, because Python can be used as a regular calculator with the standard arithmetic operations (addition, subtraction, multiplication, division, etc.).

Addition

>>> 2 + 3
5

Subtraction

>>> 11 - 3
8

Multiplication

>>> 3 * 4
12

Division

>>> 11 / 5
2.2

Integer division

>>> 11 // 5
2

Modulo operation

>>> 11 % 5
1

Power

>>> 2**3
8

Note that you cannot have spaces between digits of a number:

>>> 1 0 + 2 
  File "<stdin>", line 1
    1 0 + 2
      ^
SyntaxError: invalid syntax

In such a case, Python complains about a syntax error and points at the position of the error in the expression (using the ^ character). Why does Python point at the zero and not the space? Because you could have written 1 + 2 and the space would have been legal. The interpreter can only find the error after it discovers the extra digit and consequently points at it when reporting the error.

Of course, you can compose any number of operations in order to compute a more complex operation:

>>> 11 - (5 * (11//5)) #  = 11 % 5
1
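
The relation between integer division and modulo used above can be checked with the built-in divmod function. A minimal sketch, with variable names chosen only for illustration:

```python
a, b = 11, 5

# divmod returns the integer quotient and the remainder in one call.
quotient, remainder = divmod(a, b)
print(quotient, remainder)            # 2 1

# The two results always satisfy a == b * quotient + remainder.
print(a == b * quotient + remainder)  # True
```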

Native numeric types

Python natively offers four main numeric types: bool, integer, float and complex. But always keep in mind that they are the poor man's version of their mathematical equivalents (Boolean ($\mathbb{B}$), Integer ($\mathbb{Z}$), Real ($\mathbb{R}$) and Complex ($\mathbb{C}$)). In Python 3, integers are limited only by the available memory, whereas float and complex have limited range and precision.

In the case of float and complex, this has very important consequences.

>>> 0.3 == 0.1*3
False
>>> 0.5 == 0.1*5
True

The reason is that the decimal number 0.1 cannot be represented exactly in binary floating point and is only approximated (see footnote 1). On most machines, if Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display

>>> 0.1
0.1000000000000000055511151231257827021181583404541015625

Not very convenient... Consequently, Python (as many other languages) chose to display a rounded value instead. If you want to know more, have a look at the Floating Point Arithmetic: Issues and Limitations chapter in the official Python 3 tutorial. An immediate and practical consequence is that what you see in the console is not always what you get in memory, even if they're reasonably close.
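
If you want to see the stored value for yourself, the standard decimal module can convert a float exactly, and math.isclose (available since Python 3.5) compares floats with a tolerance instead of ==. A minimal sketch:

```python
import math
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, without rounding.
stored = Decimal(0.1)
print(stored)  # 0.1000000000000000055511151231257827021181583404541015625

# Compare floats with a tolerance rather than with ==.
print(0.1 * 3 == 0.3)              # False
print(math.isclose(0.1 * 3, 0.3))  # True
```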

For each type, there exist many ways to specify the same number.

>>> True              # Boolean
>>> 0b1010            # Integer (base  2: binary)
>>> 0o12              # Integer (base  8: octal)
>>> 10                # Integer (base 10: decimal)
>>> 0x0a              # Integer (base 16: hexadecimal)
>>> 10.0              # Float
>>> 1e1               # Float (scientific notation)
>>> float('inf')      # Float (infinity +∞)
>>> float('nan')      # Float (Not A Number: nan)
>>> 10 + 0j           # Complex
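
As a quick check that these spellings really denote the same number, the sketch below compares them and lists their types (the name ten_variants is made up for the example):

```python
ten_variants = [0b1010, 0o12, 10, 0x0a, 10.0, 1e1, 10 + 0j]

# Every spelling compares equal to the integer 10...
print(all(value == 10 for value in ten_variants))  # True

# ...but the types differ.
type_names = [type(value).__name__ for value in ten_variants]
print(type_names)
# ['int', 'int', 'int', 'int', 'float', 'float', 'complex']

# nan is the one value that is not equal to itself.
not_a_number = float('nan')
print(not_a_number == not_a_number)  # False
```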

But you can also force the type of any quantity by casting it.

>>> bool(0)
False
>>> int(0)
0
>>> float(0)
0.0
>>> complex(0)
0j
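
Casting a non-zero value is more instructive than casting 0. A few cases worth trying, as a sketch:

```python
print(int(3.9))       # 3    (truncates toward zero, does not round)
print(int(-3.9))      # -3
print(float(True))    # 1.0  (True behaves like the integer 1)
print(bool(0.0))      # False (zero is false)
print(bool(-2))       # True  (any non-zero number is true)
print(complex(1, 2))  # (1+2j)
```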

Beyond simple arithmetic

If you want to use more elaborate functions, you'll need the help of the math module for real numbers and the cmath module for complex numbers.

Power and logarithmic functions

>>> from math import *
>>> log(exp(1.234))
1.234

Trigonometric functions

>>> from math import *
>>> asin(sin(1.234))
1.234

Hyperbolic functions

>>> from math import *
>>> asinh(sinh(1.234))
1.234

Special functions

>>> from math import *
>>> gamma(2.0)
1.0

Constants

>>> from math import *
>>> pi
3.141592653589793
>>> e
2.718281828459045
>>> nan
nan
>>> inf
inf
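
The examples above all use the math module. To illustrate how its complex counterpart cmath differs, here is a small sketch that imports both modules by name rather than with a star import, so that the two sqrt functions do not overwrite each other:

```python
import cmath
import math

# math works on real numbers only and rejects a negative square root.
try:
    math.sqrt(-1)
except ValueError as error:
    print("math.sqrt(-1) failed:", error)

# cmath accepts the same input and returns a complex result.
root = cmath.sqrt(-1)
print(root)  # 1j

# Each function stays in its own namespace.
print(math.sqrt(4), cmath.sqrt(4))  # 2.0 (2+0j)
```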

Logical operations

Logic is an important part of Python because it allows us to manipulate and compare quantities, including numbers, and we'll see later that it works for all kinds of objects.

>>> True and True # Logical and
>>> 42 or 57      # Logical or
>>> 1 == 2-1      # Equality test
>>> 1 != 2        # Inequality test
>>> 1 is 2-1      # Identity test
>>> not 24        # Negation

Note that the is keyword really means identity, it is not a test for equality.

>>> 1 is 1.0
False
>>> True is 1
False
>>> True and 1  # and returns its last evaluated operand
1
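
Two points are easy to miss here: and and or return one of their operands rather than a strict boolean, and identity is stricter than equality. A short sketch, using variables (with made-up names) instead of literals for the identity test:

```python
# `and` / `or` return one of their operands, not necessarily a bool.
print(42 or 57)    # 42 (first truthy operand)
print(0 or 57)     # 57
print(42 and 57)   # 57 (last operand when all are truthy)
print(0 and 57)    # 0  (first falsy operand)
print(not 24)      # False

# Equality compares values; identity compares objects.
one_int = 1
one_float = 1.0
print(one_int == one_float)  # True
print(one_int is one_float)  # False
```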

Bitwise operations

Bitwise operations are logical operations that operate at the bit level. They might be useful in some situations but we won't use them in this course.

>>> 1 | 2   # bitwise or
>>> 1 & 2   # bitwise and
>>> 1 ^ 2   # bitwise xor
>>> 8 << 2  # bitwise left shift
>>> 8 >> 2  # bitwise right shift
>>> ~8      # bitwise negation
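
For reference, here are the results of these operations, with the reasoning in comments. The built-in bin function shows the binary form of an integer.

```python
print(1 | 2)    # 3   (0b01 | 0b10 == 0b11)
print(1 & 2)    # 0   (no bit in common)
print(1 ^ 2)    # 3   (bits that differ)
print(8 << 2)   # 32  (same as 8 * 2**2)
print(8 >> 2)   # 2   (same as 8 // 2**2)
print(~8)       # -9  (~x == -x - 1)

print(bin(8 << 2))  # 0b100000
```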

Python is much more

Besides being a convenient calculator, Python is also (and mostly) a powerful programming language with an elegant and intuitive syntax. Furthermore, you have to know that Python is an interpreted language, meaning that each time you enter a set of instructions, they need to be interpreted by the Python interpreter. This can make Python quite slow in some situations, but we'll see later how to overcome most of Python's slowness.

Variables

Until now, we have been playing in the console, throwing some expressions at the interpreter and checking the result. The problem is that those expressions cannot be reused. It's thus time to save ourselves some trouble and assign those expressions to variables. This can be done quite naturally.

>>> width = 1
>>> height = 2

What is really cool though is that you can assign several variables at once:

>>> width, height = 1, 2
>>> width
1
>>> height
2

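However, you cannot refer to a variable that is being assigned on the same line, because the whole right-hand side is evaluated before any assignment takes place. For example, in a fresh session where width is not yet defined: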

>>> width, height = 2, 2*width
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'width' is not defined

In this case, you have to split the expression in two distinct lines.

>>> width = 2
>>> height = 2*width
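
A related subtlety: if width already exists, the one-line form does not fail, it silently uses the old value. A minimal sketch with made-up values:

```python
width = 5

# The right-hand side (2, 2 * width) is computed first, with the old width.
width, height = 2, 2 * width
print(width, height)  # 2 10
```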

Variables can be manipulated just as any expression but they need to have been defined previously.

>>> a = 2*b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'b' is not defined

In interactive mode, that is, the console mode we've been using from the start, there is a special variable whose name is '_' and that contains the last printed expression.

>>> very_long_name = 10
>>> very_long_name
10
>>> b = 2 + _ # here, _ = very_long_name
>>> b
12

Don't explicitly assign a value to the _ variable or you'll kill the magic, and you don't want to do that, do you?

Containers

Besides the numeric types, Python also offers native container types (also known as collections). One is dedicated to storing ordered sequences of characters (i.e. strings), while the others can store just about anything and offer different properties. In a nutshell:

>>> "1,2,3,4"  # String (ordered, characters only, immutable, indexable)
>>> (1,2,3,4)  # Tuple (ordered, immutable, indexable)
>>> [1,2,3,4]  # List (ordered, mutable, indexable)
>>> {1,2,3,4}  # Set (unordered, mutable, unique elements)
>>> {1:2,3:4}  # Dictionary (mutable, indexed by hashable keys)
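
To make the mutable/immutable distinction concrete, the sketch below tries to change the first item of a list, a tuple and a string (variable names are illustrative only):

```python
items = [1, 2, 3, 4]    # list: mutable
items[0] = 10
print(items)            # [10, 2, 3, 4]

frozen = (1, 2, 3, 4)   # tuple: immutable
try:
    frozen[0] = 10
except TypeError as error:
    print("tuple:", error)

text = "1,2,3,4"        # string: immutable as well
try:
    text[0] = "9"
except TypeError as error:
    print("string:", error)

print(len(text), len(frozen), len(items))  # 7 4 4
```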

Strings

Strings are expressed by enclosing text in pairs of " or ' characters. Depending on the character you choose, you can use the other one inside the string.

>>> "Hello 'world!'"
"Hello 'world!'"
>>> 'Hello "world!"'
'Hello "world!"'
>>> "" # empty string
''

Note that you can split a string into several adjacent literals separated by spaces, and Python will concatenate all the pieces together.

>>> "Hello"  " "  "world!"
'Hello world!'

But you can also explicitly concatenate the different pieces together.

>>> "Hello" + " " + "world!"
'Hello world!'

For multiline strings, you need to triple the enclosing quotes or to use parentheses.

>>> """Hello
... world!"""
'Hello\nworld!'

>>> '''Hello
... world!'''
'Hello\nworld!'

>>> ("Hello"
... " world!")
'Hello world!'

In the output above, you can see that the "\n" character has been added; it represents the newline character. There are several such backslash escape sequences:

Escape Sequence   Meaning
\\                Backslash (\)
\'                Single quote (')
\"                Double quote (")
\a                ASCII Bell (BEL)
\b                ASCII Backspace (BS)
\n                ASCII Linefeed (LF)
\r                ASCII Carriage Return (CR)
\t                ASCII Horizontal Tab (TAB)
\v                ASCII Vertical Tab (VT)

This means that '\n' will be interpreted as a newline character, but what if we really want a backslash followed by n in our string, as in 'C:\some\name'? Either we "escape" all the backslashes, or we prefix the string with r, meaning that special characters won't be interpreted.

>>> print('C:\some\name')
C:\some
ame
>>> print('C:\\some\\name')
C:\some\name
>>> print(r'C:\some\name')
C:\some\name
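
A quick way to convince yourself that the two spellings produce the same string is to compare them and look at their lengths:

```python
escaped = 'C:\\some\\name'    # every backslash doubled
raw = r'C:\some\name'         # raw string: backslashes kept as typed
print(escaped == raw)         # True
print(len(raw))               # 12

# '\n' is a single newline character; r'\n' is a backslash followed by n.
print(len('\n'), len(r'\n'))  # 1 2
```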

Strings in Python 3 are Unicode strings (and source files are read as UTF-8 by default), meaning you can use pretty much any glyph from any language (and emojis as well).

>>> "ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ ⊂ ℂ" 
'ℕ ⊂ ℤ ⊂ ℚ ⊂ ℝ ⊂ ℂ'
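
To see the difference between the characters of a string and their UTF-8 bytes, here is a minimal sketch using the standard str.encode and bytes.decode methods:

```python
symbol = "ℕ"
encoded = symbol.encode("utf-8")          # str -> bytes
print(len(symbol), len(encoded))          # 1 3 (one character, three bytes)
print(encoded.decode("utf-8") == symbol)  # True (bytes -> str round trip)
```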

Tuples

Tuples are immutable containers, meaning they cannot be changed after they've been created. They can store pretty much anything, including other tuples. To create a tuple, you simply write a comma-separated list of values, optionally enclosed in parentheses.

>>> 1,2,3
(1, 2, 3)
>>> 1, "2", (1,2) 
(1, '2', (1, 2))
>>> () # empty tuple
()

Lists

Lists are mutable containers quite similar to tuples, but they can be modified after creation. They can also store pretty much anything. To create a list, you write a comma-separated list of values enclosed in square brackets.

>>> [1,2,3]
[1, 2, 3]

>>> [1, "2", (1,2)]
[1, '2', (1, 2)]

>>> [] # empty list
[]

Sets

Sets are mutable containers (they can be modified) that contain only unique elements, i.e. they prevent duplicated elements.

>>> {1, 2, 2}
{1, 2}

>>> {1, 2, "2"}
{1, 2, '2'}
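
Beyond removing duplicates, sets support membership tests and the usual set operations. A short sketch with made-up contents (sorted is used only to print the elements in a predictable order):

```python
primes = {2, 3, 5, 7}
odds = {1, 3, 5, 7, 9}

primes.add(11)      # sets are mutable
primes.add(2)       # adding an existing element changes nothing
print(len(primes))  # 5

print(3 in primes)            # True (membership test)
print(sorted(primes & odds))  # [3, 5, 7] (intersection)
print(sorted(primes | odds))  # [1, 2, 3, 5, 7, 9, 11] (union)
```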

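Dictionaries

A dictionary can be considered as a kind of associative memory where items are indexed by a key (instead of an integer position). The key can be of pretty much any type, as long as it is hashable (numbers, strings and tuples of hashable items are; lists, sets and dictionaries are not).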

>>> {"item 1": 1, "item 2": 2}
{'item 1': 1, 'item 2': 2}
>>> { 1 : 2, 3 : 4}
{1: 2, 3: 4}
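
Here is a sketch of the most common dictionary operations: looking up a key, adding or overwriting an entry, and testing for a key. The data is made up for illustration.

```python
ages = {"alice": 30, "bob": 25}

print(ages["alice"])   # 30 (lookup by key)
ages["carol"] = 41     # add a new key
ages["bob"] = 26       # overwrite an existing one
print(len(ages))       # 3
print("dave" in ages)  # False (membership tests look at keys)

# A tuple can be used as a key, but a mutable list cannot.
grid = {(0, 0): "origin"}
try:
    grid[[1, 1]] = "invalid"
except TypeError as error:
    print("list as key:", error)
```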

Indexing and slicing

Individual items of a list, tuple or string can be accessed individually using their position as an index. Note that the first element has index 0.

>>> d = [1,2,3,4,5]
>>> d[0]
1

This also works with negative indices, meaning the position is counted from the end. Note that the last element has index -1.

>>> s = "Hello world!"
>>> s[-1]
'!'

Furthermore, we can also access a range of items using the slice notation start:end. Note that both start and end are optional.

>>> d = [1,2,3,4,5]
>>> d[1:3]
[2, 3]

If start is missing, Python will implicitly replace it by the start of the list. If end is missing, Python will implicitly replace it by the end of the list.

>>> d = [1,2,3,4,5]

>>> d[1:]
[2, 3, 4, 5]

>>> d[:2]
[1, 2]

>>> d[:]
[1, 2, 3, 4, 5]

We can further refine our slice by giving the step between elements. The syntax is thus start:end:step.

>>> d = [1,2,3,4,5,6]
>>> d[0:5:2]
[1, 3, 5]

>>> d[::2]  # abbreviated form, using the default start and end
[1, 3, 5]

What if we use a negative step? We get the reversed sequence.

>>> d = [1,2,3,4,5,6]
>>> d[::-1] 
[6, 5, 4, 3, 2, 1]
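
One difference between indexing and slicing deserves a note: an out-of-range index is an error, whereas an out-of-range slice is silently clipped. The sketch below also shows that a slice is a new object of the same type as the original.

```python
d = [1, 2, 3, 4, 5]

# Indexing past the end is an error...
try:
    d[10]
except IndexError as error:
    print("d[10]:", error)

# ...but slices are clipped to the available range.
print(d[3:10])  # [4, 5]
print(d[10:])   # []

# Slicing a tuple gives a tuple, slicing a list gives a new list.
t = (1, 2, 3, 4, 5)
print(t[1:3])                # (2, 3)
print(d[:] == d, d[:] is d)  # True False
```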

Because strings are indexable, indexing and slicing work just the same.

>>> s = "Hello world!"
>>> s[6:-1]
'world'
>>> s[6:]
'world!'

Adding and removing

Adding an element to a mutable container can be done in two distinct ways: either by creating a new container that is the original container plus the new item, or by inserting the new item into the container, hence modifying it.

>>> l1 = [1, 2, 3]
>>> l2 = l1 + [4]
>>> l2
[1, 2, 3, 4]
>>> l1.append(4)
>>> l1
[1, 2, 3, 4]

To remove items, we can use the del statement and give the index of the item to delete.

>>> l1 = [1, 2, 3, 4, 5, 6]
>>> del l1[0]
>>> l1
[2, 3, 4, 5, 6]

But we can also delete a whole slice at once.

>>> l1 = [1, 2, 3, 4, 5, 6]
>>> del l1[::2]
>>> l1
[2, 4, 6]
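
The difference between building a new list and modifying one in place becomes visible as soon as a second name refers to the same list. The sketch below also uses three standard list methods (insert, pop and remove) as alternatives to del; the name alias is illustrative only.

```python
l1 = [1, 2, 3]
alias = l1        # a second name for the same list

l2 = l1 + [4]     # builds a new list; l1 is untouched
print(l1, l2)     # [1, 2, 3] [1, 2, 3, 4]

l1.append(4)      # modifies the list in place
print(alias)      # [1, 2, 3, 4] (the alias sees the change)

l1.insert(0, 0)   # insert the value 0 at position 0
print(l1)         # [0, 1, 2, 3, 4]

last = l1.pop()   # remove and return the last item
l1.remove(2)      # remove the first item equal to 2
print(last, l1)   # 4 [0, 1, 3]
```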

Running scripts

Until now, we have mostly been typing Python expressions directly into the interpreter, meaning we had to type and retype the same expressions again and again. To save some time, let's put all these commands into a file that will be run by the Python interpreter. First, choose a text editor you like and fill a new file with some Python commands.

print("Hello world!")

Then save it with the name script.py. The .py is the regular file extension used for Python programs. You are free to use any extension you like, but using .py is a damn good idea since the operating system can make the connection with Python based on this extension.

To run this script, we have two options. First, we can start it using the regular python interpreter.

$ python script.py
Hello world!
$

If you run the above command, Python will terminate as soon as your program has ended. If you want to stay within the Python interpreter, you'll have to use the -i switch (interactive mode) that tells Python to not exit once the program has finished.

$ python -i script.py
Hello world!
>>> 

Another option is to use IPython, which allows you to run a script from within the interpreter.

In [1]: %run script.py
Hello world!
In [2]: 

Indentation

Something you will discover soon enough (or maybe you've already discovered it) is that Python is rather picky about indentation.

>>> a = 1
>>>  a = 1
  File "<stdin>", line 1
    a = 1
    ^
IndentationError: unexpected indent

The reason is that indentation has a semantic meaning but we'll see that later.
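
As a small preview, and only as a sketch since conditionals are not covered in this lesson, here is how indentation decides which lines belong to a block (the values are made up):

```python
temperature = 25

if temperature > 20:
    # These two indented lines form the body of the `if`.
    message = "warm"
    advice = "open the window"
else:
    message = "cold"
    advice = "close the window"

# Back at the left margin: this line runs in both cases.
print(message, "->", advice)  # warm -> open the window
```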

The Jupyter notebook

The Jupyter notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include (but are not limited to):

  • data cleaning and transformation
  • numerical simulation
  • statistical modeling
  • machine learning

The Jupyter notebook is not restricted to Python: it supports over 40 programming languages, including those popular in data science such as Python, R, Julia and Scala.

You can try it online but you can also start a new one locally:

$ jupyter notebook

This should open a new tab in your browser (or open a new browser if it was closed) showing your available notebooks. There should be none, so you can create a new one and start typing Python code.

We won't explore everything you can do with notebooks during this course, but you can have a look at the nbviewer to see what can be done with them.

Exercises

We'll stop here with this very short introduction to Python. It's now time to move on to programming with Python. So now, shut down your computer, grab a pen and a sheet of paper, and try to answer these exercises without typing them into a Python interpreter. Once you've answered every question, you can check them.

Find the type of the following expressions

.0
-1
1,
'4.0 + 5.0'
1e2
1j
[{()}]
float('nan')

Are these legal python expressions?

1 + 1 == 2
1 = 2
(1,)[0] 
1 + 1i
1 <- 2
0.+.0
3***3
3 <<2>> 3
[({})]

Find the result of the following expressions

1.+1,
1,+1.
(1,)*3
1e1000 - 1e1000
'abc'*3
3 or 10
3 <2> 3

How to...

  • Build the empty set?
  • Get the list of unique letters composing "abracadabra"?
  • Build a tuple of two empty lists?
  • Generate a list of all even numbers between 20 and 40 (included)?
  • Check if a word is a palindrome?
  • Transform the string "1.2" into a float?
  • Count the number of unique elements in a list?
  • Find the maximum representable integer?
  • Print the string "Isn't, he said" (including quotes)?
  • Swap the content of two variables?

Resources

Here is a set of resources for those who want to go further in their knowledge of Python.

  • The (official) Python tutorial does not attempt to be comprehensive and cover every single feature, or even every commonly used feature. Instead, it introduces many of Python's most noteworthy features, and will give you a good idea of the language's flavor and style.

  • Dive Into Python is a teach-by-example guide to the paradigms of programming in Python and modern software development techniques. It assumes some preexisting knowledge of programming, although not necessarily in Python.

  • Programming with Python are the lecture notes from the course taught at the University of Manchester in the academic year 2014/15. The aim of the course is to lay a strong foundation for your future programming requirements, be they in Python or some other language or a similar tool.

  • Scipy Lecture Notes are a set of tutorials on the scientific Python ecosystem: a quick introduction to central tools and techniques. The different chapters each correspond to a 1 to 2 hour course with an increasing level of expertise, from beginner to expert.

Footnotes

  1. What Every Computer Scientist Should Know About Floating-Point Arithmetic
    David Goldberg, Computing Surveys, 1991.