Dictionaries and Functions¶
Karën Fort (CC BY-NC-SA) -- 2024
Slightly modified by Fanny Ducel (recap part, output, good practice).
First of all, time for your weekly quiz and recap:¶
- What's a loop?
- What's the difference between a for and a while loop?
- What is a list? What can be done with a list? (What methods do you remember?)
- What will happen after using a split()?
- How to use a file in your code?
For vs. while loops¶
| | for loops | while loops |
|---|---|---|
| Syntax | for <element> in <collection/iterable>: | while <condition (is True)>: |
| Example | for word in word_list: | while counter < limit: |
| Iterates... | through a collection/iterable (= e.g. a list, a string) | as long as (= while 😉) the condition is True, stops when it is False |
| Nb of iterations | pre-determined (= the length of the collection/iterable) | undetermined (depending on when the condition becomes False) |
| When/why to use it? | To run the same instructions on all the elements of a collection/iterable | To keep doing something until a condition is met |
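To make the comparison concrete, here is a minimal sketch of both kinds of loop. The names word_list, lengths, counter and limit are only illustrative.

```python
word_list = ["dictionaries", "and", "functions"]

# for loop: one iteration per element of the list, so 3 iterations here
lengths = []
for word in word_list:
    lengths.append(len(word))
print(lengths)  # [12, 3, 9]

# while loop: keeps running as long as the condition is True
counter = 0
limit = 3
while counter < limit:
    counter += 1  # without this line the condition would never become False
print(counter)  # 3
```

The for loop stops by itself at the end of the list, whereas the while loop only stops because its body changes counter.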
Lists¶
A list is a collection of elements (= str, int, booleans, ...).
You can access specific items based on their index: my_list[i].
Some methods you have already seen:
- my_list.append(item), my_list.insert(index, item), my_list.extend(another_list)
- my_list.remove(item), my_list.pop(index)
- my_list.sort() (or sorted(my_list)).
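As a reminder of what each of these methods does, here is a short sketch on a toy list (the values are arbitrary):

```python
my_list = ["b", "d"]

my_list.append("e")            # add one item at the end
my_list.insert(0, "a")         # add one item at a given index
my_list.extend(["c", "z"])     # add all the items of another list
print(my_list)                 # ['a', 'b', 'd', 'e', 'c', 'z']

my_list.remove("z")            # remove the first item equal to "z"
removed = my_list.pop(0)       # remove the item at index 0 and return it
print(removed, my_list)        # a ['b', 'd', 'e', 'c']

in_order = sorted(my_list)     # returns a new sorted list, my_list is unchanged
print(in_order, my_list)       # ['b', 'c', 'd', 'e'] ['b', 'd', 'e', 'c']

my_list.sort()                 # sorts my_list itself (and returns None)
print(my_list)                 # ['b', 'c', 'd', 'e']
```

Note the difference between the last two: sorted() gives you a new list, while sort() modifies the list in place.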
Split()¶
You can use split() on a string, and it will "cut" the string on whitespace (by default) or on any other separator you specify between the parentheses. The output will be a list:
sentence = "This is a standard sentence with spaces. How can I turn it into a list of words?"
# TODO together: turn "sentence" into a list of words
csv_data = """Student nb,Name,Masters,Group,Preferred lge
123456,Emma,NLP,Beginners,French
567890,Fantine,SC,Beginners,French
234567,Binesh,NLP,Beginners,English"""
# TODO together: turn "csv_data" into a list of words/values
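Before doing the TODOs, it may help to see the behaviour of split() on smaller strings. This is only a sketch on made-up strings; the variable names are illustrative.

```python
sentence = "Dictionaries  store keys and values"   # note the double space

words = sentence.split()       # no argument: cut on any run of whitespace
print(words)                   # ['Dictionaries', 'store', 'keys', 'and', 'values']

pieces = sentence.split(" ")   # explicit separator: cut on every single space
print(pieces)                  # ['Dictionaries', '', 'store', 'keys', 'and', 'values']

table = "Name,Group\nEmma,Beginners"
rows = []
for line in table.split("\n"):     # first cut the text into lines...
    rows.append(line.split(","))   # ...then cut each line into values
print(rows)                    # [['Name', 'Group'], ['Emma', 'Beginners']]
```

With an explicit separator, two separators in a row produce an empty string in the result, which is why the default form is usually more convenient for plain sentences.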
Files¶
A file is like a box. It has a label (= name = path), it is located somewhere/in a room (= in a folder on your machine), and it contains stuff.
Step 1) Open the box to see what is inside.
with open(path) as f:
Step 2) Oh, it's text/data/..., let's take it outside of the box and read it!
with open(path) as f:
    text = f.read() # if you want to read everything at once OR
    line = f.readline() # if you want to read one line at a time OR
    line_list = f.readlines() # if you want to read everything but separate the lines (= a list of lines)
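The three reading methods are alternatives, not steps to chain. The sketch below shows what each one returns. It first writes a small throwaway file (demo.txt in a temporary folder, a name chosen only for this example) so that it can run anywhere.

```python
import os
import tempfile

# Create a small throwaway file so that the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as w:
    w.write("first line\nsecond line\n")

with open(path) as f:
    text = f.read()            # the whole content, as one string
with open(path) as f:
    line = f.readline()        # only the first line (with its "\n")
with open(path) as f:
    line_list = f.readlines()  # all the lines, as a list of strings

print(repr(text))   # 'first line\nsecond line\n'
print(repr(line))   # 'first line\n'
print(line_list)    # ['first line\n', 'second line\n']

# Once everything has been read, there is nothing left to read in the same "with" block
with open(path) as f:
    f.read()
    leftover = f.readline()
print(repr(leftover))  # ''
```

This is why the file is reopened for each method here: after a read(), the "box" has already been emptied for that opening.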
Now, let's talk about last week's lab...¶
- What was difficult?
- Let's have a look at a proposed solution (there's not only one correct answer!) for ex 2, "part 2"
# TODO create a file in which you'll print the names of all the Iranian athletes (without duplicates)
with open("../athlete_events.csv") as f:
    # Create a variable to store what's inside the file, as a list of lines
    data = f.readlines()
# We want to store the names of the Iranian athletes in a list, so we have to create this (for now, empty) list
iranian_athletes = []
for line in data[1:]: # do not take into account the first line (= column names)
    name = line.split(",")[1]
    clean_name = name.replace("'", "") # remove the extra quotation marks
    nationality = line.split(",")[6] # = team
    clean_nationality = nationality.replace("'", "")
    clean_nationality = clean_nationality.replace('"', "") # remove all the quotes
    # if the nationality is Iran, we add the names to the list
    if clean_nationality == 'Iran':
        iranian_athletes.append(clean_name)
#print(iranian_athletes)
# Now we can write this list in a file BUT we convert the list into a *set* => it will remove duplicates
with open("iranian_athletes.txt", "w") as w:
    for athlete in set(iranian_athletes):
        w.write(athlete+"\n")
# TODO create a file in which you'll print the names of the women who won medals in 1988 (without duplicates)
with open("../athlete_events.csv") as f:
    # Create a variable to store what's inside the file, as a list of lines
    data = f.readlines()
women_1988 = []
for line in data[1:]: # do not take into account the first line (= column names)
    name = line.split(",")[1]
    clean_name = name.replace("'", "")
    medal = line.split(",")[-1]
    gender = line.split(",")[2] # = sex
    year = line.split(",")[9]
    # if the gender is F, the year is 1988 and the person won a medal
    # Note that instead of cleaning, we use "in" so that it doesn't have to be an exact match
    if "F" in gender and "1988" in year and "NA" not in medal:
        women_1988.append(clean_name)
# Now we can write this list in a file BUT we change the list to a set to remove duplicates
with open("women_1988.txt", "w") as w:
    for athlete in set(women_1988):
        w.write(athlete+"\n")
Dictionaries¶
Dictionaries are data structures where you can store, for each entry, a key and a value. The value can be accessed using the key:
# The values are associated to the keys.
my_dico = {
    "name": "Shelly-Ann Fraser-Pryce",
    "country": "Jamaica",
    "nb_olympic_medals": 8
}
print(my_dico)
print(my_dico.get("country"))
print(my_dico.keys())
print(my_dico.values())
{'name': 'Shelly-Ann Fraser-Pryce', 'country': 'Jamaica', 'nb_olympic_medals': 8}
Jamaica
dict_keys(['name', 'country', 'nb_olympic_medals'])
dict_values(['Shelly-Ann Fraser-Pryce', 'Jamaica', 8])
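A value can be reached either with get() or with square brackets. The sketch below compares the two on the same dictionary. The key "sport" is a hypothetical missing key, used only to show what happens when a key does not exist.

```python
my_dico = {
    "name": "Shelly-Ann Fraser-Pryce",
    "country": "Jamaica",
    "nb_olympic_medals": 8
}

with_brackets = my_dico["country"]          # 'Jamaica'
with_get = my_dico.get("country")           # 'Jamaica' as well

has_sport = "sport" in my_dico              # False: "in" tests the keys
missing = my_dico.get("sport")              # None: get() does not fail on a missing key
fallback = my_dico.get("sport", "unknown")  # 'unknown': value to use when the key is missing

print(with_brackets, with_get, has_sport, missing, fallback)
# my_dico["sport"] would stop the program with a KeyError
```

Square brackets are the usual choice when the key is known to exist; get() is handy when it may be missing.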
🥳 print the size of the dictionary
#TODO Code me!
Modifying a dictionary¶
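Dictionaries are mutable: it is possible to add items to and remove items from a dictionary. Note that the object returned by keys() is a view: it reflects the later changes made to the dictionary.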
k = my_dico.keys()
print(k) #before the change
# Creation of a new key (gender) and its associated value (female).
my_dico["gender"] = "female"
print(k) #after the change
print(my_dico)
# Removal of the key "gender" (and its associated value...)
my_dico.pop("gender")
print(k) #after the removal
dict_keys(['name', 'country', 'nb_olympic_medals'])
dict_keys(['name', 'country', 'nb_olympic_medals', 'gender'])
{'name': 'Shelly-Ann Fraser-Pryce', 'country': 'Jamaica', 'nb_olympic_medals': 8, 'gender': 'female'}
dict_keys(['name', 'country', 'nb_olympic_medals'])
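A dictionary can be completely removed using del (note the unusual syntax: del is a statement, not a method). Afterwards the name no longer exists, so using it raises a NameError: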
del my_dico
print(my_dico)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-1f57f8b2e982> in <module>
----> 1 del my_dico
      2 print(my_dico)

NameError: name 'my_dico' is not defined
A dictionary can be emptied (but it still exists):
my_dico2 = {
    "name": "Shelly-Ann Fraser-Pryce",
    "country": "Jamaica",
    "country": "USA",
    "nb_olympic_medals": 8
}
my_dico2.clear()
print(my_dico2)
{}
Dictionaries do not allow duplicate keys: if the same key appears twice, the last value overwrites the previous one. (Values, on the other hand, can be repeated.)
my_dico2 = {
    "name": "Shelly-Ann Fraser-Pryce",
    "country": "Jamaica",
    "country": "USA",
    "nb_olympic_medals": 8
}
print(my_dico2)
{'name': 'Shelly-Ann Fraser-Pryce', 'country': 'USA', 'nb_olympic_medals': 8}
A dictionary can store any values, including collections (= lists, sets, tuples)!
my_dico2 = {
    "name": "Shelly-Ann Fraser-Pryce",
    "country": "Jamaica",
    "country": "USA",
    "nb_olympic_medals": [3,4,1] # nb of gold, silver and bronze medals resp.
}
print(my_dico2)
{'name': 'Shelly-Ann Fraser-Pryce', 'country': 'USA', 'nb_olympic_medals': [3, 4, 1]}
🥳 Create a dictionary with fruits and their prices: bananas cost 0.3, apples 0.22, oranges 0.4
- print the price of oranges
- add pineapple (0.71)
- change the price of apples to 0.25
- we don't sell bananas anymore, remove them
- print the new dictionary
# TODO: code me!
Ordered or not?
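Since Python 3.7, dictionaries are guaranteed to keep their items in insertion order. In Python 3.5 and earlier, dictionaries are unordered; CPython 3.6 already preserved insertion order, but only as an implementation detail.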
Which version of Python do you have?
import sys
print(sys.version)
3.11.9 (main, Apr 19 2024, 16:48:06) [GCC 11.2.0]
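"Ordered" here means insertion order, not alphabetical order. A small sketch, assuming Python 3.7 or later (the variable names are illustrative):

```python
athlete = {}
athlete["name"] = "Shelly-Ann Fraser-Pryce"
athlete["country"] = "Jamaica"
athlete["nb_olympic_medals"] = 8

order_before = list(athlete)   # list() on a dictionary gives its keys, in order
print(order_before)            # ['name', 'country', 'nb_olympic_medals']

athlete["nb_olympic_medals"] = 9   # updating a value does not move the key
athlete.pop("country")
athlete["country"] = "Jamaica"     # a key that is removed then added again goes to the end

order_after = list(athlete)
print(order_after)             # ['name', 'nb_olympic_medals', 'country']
```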
Looping through Dictionaries¶
Note that, by default, a for loop over a dictionary iterates over its keys!
dict1 = {
    "name": "Shelly-Ann Fraser-Pryce",
    "country": "Jamaica",
    "country": "USA",
    "nb_olympic_medals": 8
}
print("These are the keys of dict1: ")
for k in dict1: # k is a key in the dico
    print(k)
These are the keys of dict1:
name
country
nb_olympic_medals
print("These are the values of dict1: ")
for k in dict1: # k is a key in the dico
    print(dict1[k]) # using [] gives access to the value associated with a specific key
These are the values of dict1:
Shelly-Ann Fraser-Pryce
USA
8
print("These are the items of dict1: ")
for x, y in dict1.items(): # x, y are key, value in the dico
    print(x, y)
These are the items of dict1:
name Shelly-Ann Fraser-Pryce
country USA
nb_olympic_medals 8
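A common use of these loops is counting: the dictionary is filled inside one loop, then read with items() in another. The sketch below reuses the small csv_data string from the recap and counts the students per Masters. The names counts, masters and nb are only illustrative.

```python
csv_data = """Student nb,Name,Masters,Group,Preferred lge
123456,Emma,NLP,Beginners,French
567890,Fantine,SC,Beginners,French
234567,Binesh,NLP,Beginners,English"""

counts = {}
for line in csv_data.split("\n")[1:]:   # skip the first line (= column names)
    masters = line.split(",")[2]
    if masters in counts:               # key already there: add 1 to its value
        counts[masters] += 1
    else:                               # new key: create it with the value 1
        counts[masters] = 1

for masters, nb in counts.items():
    print(masters, nb)
# NLP 2
# SC 1
```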
Functions¶
Let's discover Turtle, a Python implementation of the Logo turtle, which lets you draw using simple methods (doc).
What does the following code do?
from turtle import *
t = Turtle()
t.shape("turtle")
t.forward(100)
t.left(90)
t.forward(100)
t.left(90)
t.forward(100)
t.left(90)
t.forward(100)
t.left(90)
mainloop()
Nice! But boring to write... the same goes for your code concerning the athletes...
We need a way to store certain actions, so that we can reuse them without having to copy/paste them all the time...
from turtle import *
t = Turtle()
t.shape("turtle")
def forwardLeft(tutu):
    tutu.forward(100)
    tutu.left(90)
forwardLeft(t)
forwardLeft(t)
forwardLeft(t)
forwardLeft(t)
mainloop()
Back to basics (Turtle does not work so well on Jupyter):
def HelloWorld(): # function definition (note the :)
    print("Hello, World!")
HelloWorld() # function call
Hello, World!
def compute(): # you can add any number of instructions in your function
    a_number = 2713
    print(2*a_number)
compute()
5426
We can add parameters to a function:
def hello(lang): # here lang is a parameter
    if lang == "fr":
        print("Bonjour")
    elif lang == "bzg":
        print("Demat")
    else:
        print("Unknown language")
    print("I'm done")
hello("fr") # here "fr" is an argument (the value of the parameter)
hello("en")
Bonjour
I'm done
Unknown language
I'm done
Note the difference between print() and return:
def hello(lang): # here lang is a parameter
    if lang == "fr":
        return("Bonjour")
    elif lang == "bzg":
        return("Demat")
    else:
        return("Unknown language")
    print("I'm done") # never executed: the function stops at the first return it reaches
hello("fr") # here "fr" is an argument (the value of the parameter)
hello("en")
'Unknown language'
When you use a return, you define the output of the function.
You can then re-use this output and store it in a variable!
your_lang = hello("bzg")
print(your_lang*2)
print("In your language, hello is", your_lang)
DematDemat
In your language, hello is Demat
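To see why this matters, compare what ends up in a variable in each case. The two functions below, hello_print and hello_return, are simplified variants of hello written only for this comparison.

```python
def hello_print(lang):
    # displays the greeting, but gives nothing back to the caller
    if lang == "fr":
        print("Bonjour")
    else:
        print("Unknown language")

def hello_return(lang):
    # displays nothing, but gives the greeting back to the caller
    if lang == "fr":
        return "Bonjour"
    return "Unknown language"

printed = hello_print("fr")     # "Bonjour" is displayed here
returned = hello_return("fr")   # nothing is displayed here

print(printed)        # None: a function without return outputs None
print(returned * 2)   # BonjourBonjour
```

A value that was only printed is lost for the rest of the program; a returned value can be stored, combined or printed later.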
A function can take several parameters:
def hello_you(your_name,lang): # here your_name and lang are parameters
    if lang == "fr":
        return("Bonjour "+your_name)
    elif lang == "bzg":
        return("Demat "+your_name)
    else:
        return("Unknown language")
    print("I'm done "+your_name) # never executed (it comes after the returns)
hello_you("Karen","fr") # here "Karen" and "fr" are arguments (the values of the parameters)
'Bonjour Karen'
Arguments can be of more complex types, like lists:
def my_function(food):
    for x in food:
        print(x)
fruits = ["apple", "banana", "cherry"]
my_function(fruits)
apple
banana
cherry
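Putting parameters, lists and return together, the kind of filtering done in last week's lab can be stored in a function and reused without copy/paste. The sketch below is only one possible way to do it, on the small csv_data string from the recap. The function name names_in_masters and its parameters are made up for this example.

```python
def names_in_masters(lines, masters):
    """Return the names (without duplicates) of the students in the given Masters."""
    names = []
    for line in lines[1:]:               # skip the first line (= column names)
        values = line.split(",")
        if values[2] == masters and values[1] not in names:
            names.append(values[1])
    return names

csv_data = """Student nb,Name,Masters,Group,Preferred lge
123456,Emma,NLP,Beginners,French
567890,Fantine,SC,Beginners,French
234567,Binesh,NLP,Beginners,English"""

lines = csv_data.split("\n")
nlp_students = names_in_masters(lines, "NLP")   # same function...
sc_students = names_in_masters(lines, "SC")     # ...called with another argument
print(nlp_students)   # ['Emma', 'Binesh']
print(sc_students)    # ['Fantine']
```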
🥳 Write a function "square" that returns the square of the value passed as argument (supposedly a number).
Test it with the following arguments:
- 3
- 0
- titi
#TODO Code me!
Good practice¶
Document your functions!
Use a docstring, i.e. a string (conventionally in triple quotes) at the beginning of your function that explains what the function does/returns and, even better, what it takes as input (parameters).
See more: https://peps.python.org/pep-0257/
def hello_you(your_name,lang):
    """Input: the user's name (str), the user's language code (str)
    Output: returns 'hello' in the user's language and the user's name."""
    if lang == "fr":
        return("Bonjour "+your_name)
    elif lang == "bzg":
        return("Demat "+your_name)
    else:
        return("Unknown language")
    print("I'm done "+your_name)
Your docstring is then automatically attached to the function and accessible with .__doc__:
print(hello_you.__doc__)
Input: the user's name (str), the user's language code (str)
Output: returns 'hello' in the user's language and the user's name.
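PEP 257 suggests starting a multi-line docstring with a one-line summary, followed by a blank line and the details. The sketch below rewrites hello_you that way (the wording of the docstring is only a suggestion) and reads it back with inspect.getdoc(), which returns the docstring with its indentation cleaned up.

```python
import inspect

def hello_you(your_name, lang):
    """Return a greeting for your_name in the language lang.

    Input: the user's name (str), the user's language code (str)
    Output: the greeting (str), or "Unknown language" if lang is not known
    """
    if lang == "fr":
        return "Bonjour " + your_name
    elif lang == "bzg":
        return "Demat " + your_name
    else:
        return "Unknown language"

doc = inspect.getdoc(hello_you)   # the docstring, without the extra indentation
first_line = doc.split("\n")[0]
print(first_line)   # Return a greeting for your_name in the language lang.
```

In a notebook or in the Python console, help(hello_you) displays the same text together with the function's signature.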
Going further¶
- Recursion: check the W3Schools doc (Recursion)