5. Python Essentials

5.1Overview

We have covered a lot of material quite quickly, with a focus on examples.

Now let’s cover some core features of Python in a more systematic way.

This approach is less exciting but helps clear up some details.

5.2Data Types

Computer programs typically keep track of a range of data types.

For example, 1.5 is a floating point number, while 1 is an integer.

Programs need to distinguish between these two types for various reasons.

One is that they are stored in memory differently.

Another is that arithmetic operations are different

  • For example, floating point arithmetic is implemented on most machines by a specialized Floating Point Unit (FPU).

In general, floats are more informative but arithmetic operations on integers are faster and more accurate.
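
As a small illustration of the accuracy point, the sketch below compares integer and floating point arithmetic. The variable names are ours, and math.isclose is one common way (not the only one) to compare floats with a tolerance.

```python
import math

# Integer arithmetic is exact
int_sum = 1 + 2
print(int_sum == 3)

# Floating point arithmetic is subject to rounding error
float_sum = 0.1 + 0.2
print(float_sum == 0.3)              # False on standard IEEE 754 doubles
print(math.isclose(float_sum, 0.3))  # Compare with a tolerance instead
```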

Python provides numerous other built-in data types, some of which we’ve already met

  • strings, lists, etc.

Let’s learn a bit more about them.

5.2.1Primitive Data Types

5.2.1.1Boolean Values

One simple data type is Boolean values, which can be either True or False

x = True
x
True

We can check the type of any object in memory using the type() function.

type(x)
bool

In the next line of code, the interpreter evaluates the expression on the right of = and binds y to this value

y = 100 < 10
y
False
type(y)
bool

In arithmetic expressions, True is converted to 1 and False is converted to 0.

This is called Boolean arithmetic and is often useful in programming.

Here are some examples

x + y
1
x * y
0
True + True
2
bools = [True, True, False, True]  # List of Boolean values

sum(bools)
3

5.2.1.2Numeric Types

Numeric types are also important primitive data types.

We have seen integer and float types before.

Complex numbers are another primitive data type in Python

x = complex(1, 2)
y = complex(2, 1)
print(x * y)

type(x)
5j
complex
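
Complex numbers carry a few attributes and methods that are handy to know. The following is a minimal sketch using an illustrative value z of our own choosing.

```python
z = complex(3, 4)      # The complex number 3 + 4j
print(z.real)          # Real part, stored as a float
print(z.imag)          # Imaginary part, stored as a float
print(abs(z))          # Modulus
print(z.conjugate())   # Complex conjugate
```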

5.2.2Containers

Python has several basic types for storing collections of (possibly heterogeneous) data.

We’ve already discussed lists.

A related data type is tuples, which are “immutable” lists

x = ('a', 'b')  # Parentheses instead of the square brackets
x = 'a', 'b'    # Or no brackets --- the meaning is identical
x
('a', 'b')
type(x)
tuple

In Python, an object is called immutable if, once created, the object cannot be changed.

Conversely, an object is mutable if it can still be altered after creation.

Python lists are mutable

x = [1, 2]
x[0] = 10
x
[10, 2]

But tuples are not

x = (1, 2)
x[0] = 10
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[13], line 2
      1 x = (1, 2)
----> 2 x[0] = 10

TypeError: 'tuple' object does not support item assignment

We’ll say more about the role of mutable and immutable data a bit later.

Tuples (and lists) can be “unpacked” as follows

integers = (10, 20, 30)
x, y, z = integers
x
10
y
20

You’ve actually seen an example of this already.

Tuple unpacking is convenient and we’ll use it often.
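
Two common uses of unpacking are sketched below: swapping two names and collecting "the rest" of a sequence with a starred name. The names a, b, first and rest are illustrative only.

```python
a, b = 1, 2
a, b = b, a                   # Swap without a temporary variable
print(a, b)

first, *rest = (10, 20, 30)   # The starred name collects the remaining items in a list
print(first, rest)
```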

5.2.2.1Slice Notation

To access multiple elements of a sequence (a list, a tuple or a string), you can use Python’s slice notation.

For example,

a = ["a", "b", "c", "d", "e"]
a[1:]
['b', 'c', 'd', 'e']
a[1:3]
['b', 'c']

The general rule is that a[m:n] returns n - m elements, starting at a[m].

Negative numbers are also permissible

a[-2:]  # Last two elements of the list
['d', 'e']

You can also use the format [start:end:step] to specify the step

a[::2]
['a', 'c', 'e']

Using a negative step, you can return the sequence in a reversed order

a[-2::-1] # Walk backwards from the second last element to the first element
['d', 'c', 'b', 'a']

The same slice notation works on tuples and strings

s = 'foobar'
s[-3:]  # Select the last three elements
'bar'
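
The rules above can be checked directly. This is a small sketch with illustrative values of m and n; the n - m rule is verified here for indices satisfying 0 <= m <= n <= len(a).

```python
a = ["a", "b", "c", "d", "e"]
m, n = 1, 4
print(len(a[m:n]) == n - m)   # The slice a[m:n] has n - m elements

print(a[::-1])                # A reversed copy of the whole list

t = (10, 20, 30, 40)
print(t[1:3])                 # Slicing a tuple returns a tuple
```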

5.2.2.2Sets and Dictionaries

Two other container types we should mention before moving on are sets and dictionaries.

Dictionaries are much like lists, except that the items are named instead of numbered

d = {'name': 'Frodo', 'age': 33}
type(d)
dict
d['age']
33

The names 'name' and 'age' are called the keys.

The objects that the keys are mapped to ('Frodo' and 33) are called the values.
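
Here is a brief sketch of some other common dictionary operations. The 'home' entry is a hypothetical addition used only for illustration.

```python
d = {'name': 'Frodo', 'age': 33}

d['home'] = 'The Shire'          # Add a new key-value pair
print(list(d.keys()))
print(list(d.values()))

for key, value in d.items():     # Loop over key-value pairs
    print(f'{key}: {value}')

print('age' in d)                # Membership tests look at the keys
```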

Sets are unordered collections without duplicates, and set methods provide the usual set-theoretic operations

s1 = {'a', 'b'}
type(s1)
set
s2 = {'b', 'c'}
s1.issubset(s2)
False
s1.intersection(s2)
{'b'}

The set() function creates sets from sequences

s3 = set(('foo', 'bar', 'foo'))
s3
{'bar', 'foo'}
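
A few more set operations are sketched below, along with their operator forms. The comparisons print True when the two ways of writing each operation agree.

```python
s1 = {'a', 'b'}
s2 = {'b', 'c'}

print(s1.union(s2) == {'a', 'b', 'c'})
print(s1.difference(s2) == {'a'})

# Operator forms of the same operations
print((s1 | s2) == s1.union(s2))
print((s1 & s2) == s1.intersection(s2))
print((s1 - s2) == s1.difference(s2))
```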

5.3Input and Output

Let’s briefly review reading and writing to text files, starting with writing

f = open('newfile.txt', 'w')   # Open 'newfile.txt' for writing
f.write('Testing\n')           # Here '\n' means new line
f.write('Testing again')
f.close()

Here

  • The built-in function open() creates a file object for writing to.
  • Both write() and close() are methods of file objects.

Where is this file that we’ve created?

Recall that Python maintains a concept of the present working directory (pwd) that can be located from within Jupyter or IPython via

%pwd
'/home/runner/work/lecture-python-programming.myst/lecture-python-programming.myst/lectures'

If a path is not specified, then this is where Python writes to.

We can also use Python to read the contents of newfile.txt as follows

f = open('newfile.txt', 'r')
out = f.read()
f.close()                      # Release the file once we are done reading
out
'Testing\nTesting again'
print(out)
Testing
Testing again

In fact, the recommended approach in modern Python is to use a with statement to ensure the files are properly acquired and released.

Containing the operations within the same block also improves the clarity of your code.

Let’s try to convert the two examples above into a with statement.

We change the writing example first

with open('newfile.txt', 'w') as f:  
    f.write('Testing\n')         
    f.write('Testing again')

Note that we do not need to call the close() method since the with block will ensure the stream is closed at the end of the block.

With slight modifications, we can also read files using with

with open('newfile.txt', 'r') as fo:
    out = fo.read()
    print(out)
Testing
Testing again

Now suppose that we want to read input from one file and write output to another. Here’s how we could accomplish this task while correctly acquiring and returning resources to the operating system using with statements:

with open("newfile.txt", "r") as f:
    file = f.readlines()
    with open("output.txt", "w") as fo:
        for i, line in enumerate(file):
            fo.write(f'Line {i}: {line} \n')

The output file will be

with open('output.txt', 'r') as fo:
    print(fo.read())
Line 0: Testing
 
Line 1: Testing again 

We can simplify the example above by grouping the two with statements into one line

with open("newfile.txt", "r") as f, open("output2.txt", "w") as fo:
    for i, line in enumerate(f):
        fo.write(f'Line {i}: {line} \n')

The output file will be the same

with open('output2.txt', 'r') as fo:
    print(fo.read())
Line 0: Testing
 
Line 1: Testing again 

Suppose we want to continue to write into the existing file instead of overwriting it.

We can switch the mode to 'a', which stands for append mode

with open('output2.txt', 'a') as fo:
    fo.write('\nThis is the end of the file')
with open('output2.txt', 'r') as fo:
    print(fo.read())
Line 0: Testing
 
Line 1: Testing again 

This is the end of the file
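
To see the difference between the 'w' and 'a' modes side by side, here is a minimal sketch. It works in a temporary directory (our choice, via the standard library tempfile module) and uses a hypothetical file name so that no existing files are touched.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'modes_demo.txt')   # Hypothetical file name

    with open(path, 'w') as fo:
        fo.write('first\n')
    with open(path, 'w') as fo:      # 'w' truncates the existing file
        fo.write('second\n')
    with open(path, 'a') as fo:      # 'a' adds to the end of the file
        fo.write('third\n')

    with open(path, 'r') as fo:
        contents = fo.read()

print(contents)
```

Only the second and third writes survive, since the second call with 'w' discards what the first call wrote.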

5.3.1Paths

Note that if newfile.txt is not in the present working directory then a call such as open('newfile.txt', 'r') fails with a FileNotFoundError.

In this case, you can move the file to the pwd or specify the full path to the file

f = open('insert_full_path_to_file/newfile.txt', 'r')
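
One way to build and inspect full paths is the standard library pathlib module, which is not used elsewhere in this lecture. The sketch below is only illustrative: it constructs the full path that newfile.txt would have in the present working directory, without assuming that the file exists.

```python
from pathlib import Path

cwd = Path.cwd()                 # Present working directory as a Path object
target = cwd / 'newfile.txt'     # Full path to a file in that directory

print(target.is_absolute())
print(target.name)
print(target.exists())           # True only if the file has been created
```

A Path object such as target can be passed to open() in place of a string.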

5.4Iterating

One of the most important tasks in computing is stepping through a sequence of data and performing a given action.

One of Python’s strengths is its simple, flexible interface to this kind of iteration via the for loop.

5.4.1Looping over Different Objects

Many Python objects are “iterable”, in the sense that they can be looped over.

To give an example, let’s write the file us_cities.txt, which lists US cities and their population, to the present working directory.

%%writefile us_cities.txt
new york: 8244910
los angeles: 3819702
chicago: 2707120
houston: 2145146
philadelphia: 1536471
phoenix: 1469471
san antonio: 1359758
san diego: 1326179
dallas: 1223229
Overwriting us_cities.txt

Here %%writefile is an IPython cell magic.

Suppose that we want to make the information more readable, by capitalizing names and adding commas to mark thousands.

The program below reads the data in and makes the conversion:

data_file = open('us_cities.txt', 'r')
for line in data_file:
    city, population = line.split(':')         # Tuple unpacking
    city = city.title()                        # Capitalize city names
    population = f'{int(population):,}'        # Add commas to numbers
    print(city.ljust(15) + population)
data_file.close()
New York       8,244,910
Los Angeles    3,819,702
Chicago        2,707,120
Houston        2,145,146
Philadelphia   1,536,471
Phoenix        1,469,471
San Antonio    1,359,758
San Diego      1,326,179
Dallas         1,223,229

Here the f prefix before the opening quote marks an f-string, which is used for inserting variables into strings.

The reformatting of each line is the result of three different string methods, the details of which can be left till later.

The interesting part of this program for us is line 2, which shows that

  1. The file object data_file is iterable, in the sense that it can be placed to the right of in within a for loop.
  2. Iteration steps through each line in the file.

This leads to the clean, convenient syntax shown in our program.

Many other kinds of objects are iterable, and we’ll discuss some of them later on.
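
As a quick preview, here is a sketch of looping over two other iterable types, strings and dictionaries. The data in ages is hypothetical.

```python
for char in 'abc':                 # Strings yield one character at a time
    print(char)

ages = {'Frodo': 33, 'Sam': 38}
for name in ages:                  # Dictionaries yield their keys
    print(name, ages[name])

chars = list('abc')                # list() accepts any iterable
keys = list(ages)
print(chars, keys)
```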

5.4.2Looping without Indices

One thing you might have noticed is that Python tends to favor looping without explicit indexing.

For example,

x_values = [1, 2, 3]  # Some iterable x
for x in x_values:
    print(x * x)
1
4
9

is preferred to

for i in range(len(x_values)):
    print(x_values[i] * x_values[i])
1
4
9

When you compare these two alternatives, you can see why the first one is preferred.

Python provides some facilities to simplify looping without indices.

One is zip(), which is used for stepping through pairs from two sequences.

For example, try running the following code

countries = ('Japan', 'Korea', 'China')
cities = ('Tokyo', 'Seoul', 'Beijing')
for country, city in zip(countries, cities):
    print(f'The capital of {country} is {city}')
The capital of Japan is Tokyo
The capital of Korea is Seoul
The capital of China is Beijing

The zip() function is also useful for creating dictionaries --- for example

names = ['Tom', 'John']
marks = ['E', 'F']
dict(zip(names, marks))
{'Tom': 'E', 'John': 'F'}

If we actually need the index from a list, one option is to use enumerate().

To understand what enumerate() does, consider the following example

letter_list = ['a', 'b', 'c']
for index, letter in enumerate(letter_list):
    print(f"letter_list[{index}] = '{letter}'")
letter_list[0] = 'a'
letter_list[1] = 'b'
letter_list[2] = 'c'
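
Under the hood, enumerate() produces (index, item) pairs, and it accepts an optional start argument for the first index. A brief sketch:

```python
letter_list = ['a', 'b', 'c']
numbered = list(enumerate(letter_list))
print(numbered)

# The optional start argument shifts the counter
numbered_from_one = list(enumerate(letter_list, start=1))
for position, letter in numbered_from_one:
    print(f'Item {position} is {letter}')
```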

5.4.3List Comprehensions

We can also simplify the code for generating the list of random draws considerably by using something called a list comprehension.

List comprehensions are an elegant Python tool for creating lists.

Consider the following example, where the list comprehension is on the right-hand side of the second line

animals = ['dog', 'cat', 'bird']
plurals = [animal + 's' for animal in animals]
plurals
['dogs', 'cats', 'birds']

Here’s another example

range(8)
range(0, 8)
doubles = [2 * x for x in range(8)]
doubles
[0, 2, 4, 6, 8, 10, 12, 14]
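
List comprehensions can also filter items with a trailing if clause. The sketch below builds the same list twice, once with a comprehension and once with an explicit loop, to show what the comprehension is shorthand for.

```python
even_squares = [x**2 for x in range(8) if x % 2 == 0]
print(even_squares)

# Equivalent explicit loop
even_squares_loop = []
for x in range(8):
    if x % 2 == 0:
        even_squares_loop.append(x**2)

print(even_squares_loop == even_squares)
```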

5.5Comparisons and Logical Operators

5.5.1Comparisons

Many different kinds of expressions evaluate to one of the Boolean values (i.e., True or False).

A common type is comparisons, such as

x, y = 1, 2
x < y
True
x > y
False

One of the nice features of Python is that we can chain inequalities

1 < 2 < 3
True
1 <= 2 <= 3
True
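
A chained comparison is shorthand for joining the individual comparisons with and. Here is a small sketch with an illustrative value of x.

```python
x = 5
chained = 1 < x < 10
expanded = 1 < x and x < 10     # Equivalent expanded form
print(chained, expanded)

outside = 1 < 20 < 10           # False, because 20 < 10 fails
print(outside)
```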

As we saw earlier, when testing for equality we use ==

x = 1    # Assignment
x == 2   # Comparison
False

For “not equal” use !=

1 != 2
True

Note that when testing conditions, we can use any valid Python expression

x = 'yes' if 42 else 'no'
x
'yes'
x = 'yes' if [] else 'no'
x
'no'

What’s going on here?

The rule is:

  • Expressions that evaluate to zero, empty sequences or containers (strings, lists, etc.) and None are all equivalent to False.
    • for example, [] and () are equivalent to False in an if clause
  • All other values are equivalent to True.
    • for example, 42 is equivalent to True in an if clause
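
One way to explore this rule is to pass values to the built-in bool() function, which returns the Boolean value Python would use in an if clause. The list of test values below is our own selection.

```python
values = [0, 0.0, '', [], (), {}, None, 42, 'no', [0]]
for v in values:
    print(repr(v).ljust(6), bool(v))

# Keep only the values that count as True
truthy = [v for v in values if v]
print(truthy)
```

Note that 'no' and [0] count as True, since a nonempty string or list is "truthy" regardless of its contents.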

5.5.2Combining Expressions

We can combine expressions using and, or and not.

These are the standard logical connectives (conjunction, disjunction and negation)

1 < 2 and 'f' in 'foo'
True
1 < 2 and 'g' in 'foo'
False
1 < 2 or 'g' in 'foo'
True
not True
False
not not True
True

Remember

  • P and Q is True if both are True, else False
  • P or Q is False if both are False, else True

We can also use all() and any() to test a sequence of expressions

all([1 <= 2 <= 3, 5 <= 6 <= 7])
True
all([1 <= 2 <= 3, "a" in "letter"])
False
any([1 <= 2 <= 3, "a" in "letter"])
True
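
Both all() and any() accept any iterable, including generator expressions. The sketch below uses hypothetical data and also shows the behavior on an empty list.

```python
numbers = [2, 4, 6, 7]
all_even = all(n % 2 == 0 for n in numbers)
any_odd = any(n % 2 == 1 for n in numbers)
print(all_even, any_odd)

# Edge cases for an empty sequence
all_empty = all([])
any_empty = any([])
print(all_empty, any_empty)
```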

5.6Coding Style and Documentation

A consistent coding style and the use of documentation can make the code easier to understand and maintain.

5.6.1Python Style Guidelines: PEP8

You can find Python programming philosophy by typing import this at the prompt.

Among other things, Python strongly favors consistency in programming style.

We’ve all heard the saying about consistency and little minds.

In programming, as in mathematics, the opposite is true

  • A mathematical paper where the symbols $\cup$ and $\cap$ were reversed would be very hard to read, even if the author told you so on the first page.

In Python, the standard style is set out in PEP8.

(Occasionally we’ll deviate from PEP8 in these lectures to better match mathematical notation)

5.6.2Docstrings

Python has a system for adding comments to modules, classes, functions, etc. called docstrings.

The nice thing about docstrings is that they are available at run-time.

Try running this

def f(x):
    """
    This function squares its argument
    """
    return x**2

After running this code, the docstring is available

f?
Type:       function
String Form:<function f at 0x2223320>
File:       /home/john/temp/temp.py
Definition: f(x)
Docstring:  This function squares its argument
f??
Type:       function
String Form:<function f at 0x2223320>
File:       /home/john/temp/temp.py
Definition: f(x)
Source:
def f(x):
    """
    This function squares its argument
    """
    return x**2

With one question mark we bring up the docstring, and with two we get the source code as well.

You can find conventions for docstrings in PEP257.
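
The ? syntax is specific to IPython and Jupyter. In plain Python, one way to get at the same information is sketched below, using the __doc__ attribute and the standard library function inspect.getdoc().

```python
import inspect

def f(x):
    """
    This function squares its argument
    """
    return x**2

raw_doc = f.__doc__               # The docstring exactly as written
clean_doc = inspect.getdoc(f)     # The docstring with indentation cleaned up
print(clean_doc)
```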

5.7Exercises

Solve the following exercises.

(For some, the built-in function sum() comes in handy).

Solution to Exercise 1

Part 1 Solution:

Here’s one possible solution

x_vals = [1, 2, 3]
y_vals = [1, 1, 1]
sum([x * y for x, y in zip(x_vals, y_vals)])
6

This also works

sum(x * y for x, y in zip(x_vals, y_vals))
6

Part 2 Solution:

One solution is

sum([x % 2 == 0 for x in range(100)])
50

This also works:

sum(x % 2 == 0 for x in range(100))
50

Some less natural alternatives that nonetheless help to illustrate the flexibility of list comprehensions are

len([x for x in range(100) if x % 2 == 0])
50

and

sum([1 for x in range(100) if x % 2 == 0])
50

Part 3 Solution:

Here’s one possibility

pairs = ((2, 5), (4, 2), (9, 8), (12, 10))
sum([x % 2 == 0 and y % 2 == 0 for x, y in pairs])
2
Solution to Exercise 2

Here’s a solution:

def p(x, coeff):
    return sum(a * x**i for i, a in enumerate(coeff))
p(1, (2, 4))
6
Solution to Exercise 3

Here’s one solution:

def f(string):
    count = 0
    for letter in string:
        if letter == letter.upper() and letter.isalpha():
            count += 1
    return count

f('The Rain in Spain')
3

An alternative, more pythonic solution:

def count_uppercase_chars(s):
    return sum([c.isupper() for c in s])

count_uppercase_chars('The Rain in Spain')
3
Solution to Exercise 4

Here’s a solution:

def f(seq_a, seq_b):
    for a in seq_a:
        if a not in seq_b:
            return False
    return True

# == test == #
print(f("ab", "cadb"))
print(f("ab", "cjdb"))
print(f([1, 2], [1, 2, 3]))
print(f([1, 2, 3], [1, 2]))
True
False
True
False

An alternative, more pythonic solution using all():

def f(seq_a, seq_b):
    return all([i in seq_b for i in seq_a])

# == test == #
print(f("ab", "cadb"))
print(f("ab", "cjdb"))
print(f([1, 2], [1, 2, 3]))
print(f([1, 2, 3], [1, 2]))
True
False
True
False

Of course, if we use the sets data type then the solution is easier

def f(seq_a, seq_b):
    return set(seq_a).issubset(set(seq_b))
Solution to Exercise 5

Here’s a solution:

def linapprox(f, a, b, n, x):
    """
    Evaluates the piecewise linear interpolant of f at x on the interval
    [a, b], with n evenly spaced grid points.

    Parameters
    ==========
        f : function
            The function to approximate

        x, a, b : scalars (floats or integers)
            Evaluation point and endpoints, with a <= x <= b

        n : integer
            Number of grid points

    Returns
    =======
        A float. The interpolant evaluated at x

    """
    length_of_interval = b - a
    num_subintervals = n - 1
    step = length_of_interval / num_subintervals

    # === find first grid point larger than x === #
    point = a
    while point <= x:
        point += step

    # === x must lie between the gridpoints (point - step) and point === #
    u, v = point - step, point

    return f(u) + (x - u) * (f(v) - f(u)) / (v - u)
Solution to Exercise 6

Here’s one solution.

import numpy as np

n = 100
ϵ_values = [np.random.randn() for i in range(n)]

Creative Commons License – This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International.