How to Write List Comprehensions with Python

September 29, 2020

One of the most common blocks you will write in Python scripts is a for loop. With for loops, you can repeat the same set of instructions in a block over and over. Python’s for loops are really foreach loops, where you repeat the instructions for every item in a collection. These collections are called iterators, which is something that a Python loop is able to iterate over, and the most common iterator is list.

For Loops

Let’s look at an example of a for loop. Write a function that prints the square of each number from one to n:

def write_squares(n):
    for number in range(n):
        square = number ** 2
        print(square)

write_squares(6)

This is the output you will get from the above example:

In write_squares, we calculate the variable square and print it for each value in range(n), which is all of the numbers from 0 to n inclusive.

Boxes and Arrows

Small for loops like this are very common in Python scripts. For example, we can read in lines from a file and strip any spaces from each line:

def lines_from(filename):
	with open(filename, "r") as file_obj:
		original_lines = file_obj.readlines()

	stripped_lines = []
	for line in original_lines:
		new_line = line.strip()
		stripped_lines.append(new_line)

Or, we might have a list of IDs and want to grab a piece of data for each ID:

def email_addresses(ids):
	addresses = []
	for identifier in ids:
		data = api_request(identifier)
		email_address = data["email"]
  		addresses.append(email_addresses)

With each of these for loops, we’re running the same piece of code on each item in the list. While the repeated piece of code isn’t too complicated, you still need to write (or read!) multiple lines to understand what the code block is doing. List comprehensions are an elegant way to create lists from existing lists.

List Comprehension

First, let’s look at a quick example:

chars = [letter for letter in "Hello, world!"]
print(chars)

When you run this Python code, the output will be:

['H', 'e', 'l', 'l', 'o', ',', ' ', 'w', 'o', 'r', 'l', 'd', '!']

In this example, a new list is created, named chars, that contains each of the items in the string "Hello, world!". Because items in a string are characters, chars is a list of every character in the string.

List Comprehension Syntax

List comprehensions are written as:

new_list = [expression for item in old_list]

This is what a list comprehension “unwrapped” into a traditional list would look like:

new_list = []
for item in old_list:
	new_item = expression
	new_list.append(new_item)

Used correctly, list comprehensions can reduce for loops into a more readable, one line expression.

Syntax for List Comprehension

Examples

Let’s re-write our earlier examples using list comprehensions. Instead of writing an whole new function, our write_squares example can be reduced to one line:

squares = [number ** 2 for number in range(6)]
print(squares)

We get the same results when we run this code:

Now let’s look at the line stripping function. We need to keep the first few lines to read the file contents, but the for loop that appended each stripped line to a new variable has been updated to use a list comprehension instead:

def lines_from(filename):
	with open(filename, "r") as file_obj:
		original_lines = file_obj.readlines()

	return [line.strip() for line in original_lines]

The email address fetcher can be reduced to one line, and it can be directly placed in another block of code:

def email_addresses(ids):
	return [api_request(id)["email"] for id in ids]

Conditionals

List comprehensions can also use conditional statements to filter or modify the values that are added to a new list. For example, if we wanted a list of only even integers from 1 to 20, we could add a conditional to the end of a list comprehension:

>>> evens = [num for num in range(0, 20) if num % 2 == 0]
>>> print(evens)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

List comprehensions can use any number of and and or operators in conditionals. For example, we can use the conditional (num % 2 == 0) and (num % 3 == 0) to keep only numbers that are divisible by both 2 and 3:

>>> my_nums = [num for num in range(0, 20) if (num % 2 == 0) and (num % 3 == 0)]
>>> print(my_nums)

[0, 6, 12, 18]

Key Points

List comprehension is an elegant way to create new lists from existing lists. List comprehensions can reduce multiple-line code blocks to just one line! However, avoid writing large list comprehensions, as that may reduce legibility for your code readers.

Once you’re comfortable with list comprehensions, I recommend learning dictionary comprehensions, which are very similar to list comprehensions except that they operate on dictionaries.

Python's Walrus Operator

September 23, 2020

Beautiful is better than ugly.

— The Zen of Python

Python introduced a brand new way to assign values to variables in version 3.8.0. The new syntax is :=, and it’s called a “walrus operator” because it looks like a pair of eyes and a set of tusks. The walrus operator assigns values as part of a larger expression, and it can significantly increase legibility in many areas.

Named Expressions

You can create named expressions with the walrus operator. Named expressions have the format NAME := expression, such as x := 34 or numbers := list(range(10)). Python code can use the expression to evaluate a larger expression (such as an if statement), and the variable NAME is assigned the value of the expression.

If you’ve written Swift code before, Python’s walrus operator is similar to Swift’s Optional Chaining. With optional chaining, you assign a value to a variable inside the conditional of an if statement. If the new variable’s value is not nil (like Python’s None), the if block is executed. If the variable’s value is nil, then the block is ignored:

let responseMessages = [
    200: "OK",
    403: "Access forbidden",
    404: "File not found",
    500: "Internal server error"
]

let response = 444
if let message = responseMessages[response] {
    // This statement won't be run because message is nil
    print("Message: " + message)
}

Benefits

There are a lot of benefits to using the walrus operator in your code. But don’t take my word for it! Here’s what the authors of the idea said in their proposal:

Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse.

— PEP 572 – Assignment Expressions

Let’s take a look at some examples.

Don’t Repeat Yourself

With the walrus operator, you can more easily stick to the DRY principle and reduce how often you repeat yourself in code. For example, if you want to print an error message if a list is too long, you might accidentally get the length of the list twice:

my_long_list = list(range(1000))

# You get the length twice!
if len(my_long_list) > 10:
    print(f"List is too long to consume (length={len(my_long_list)}, max=10)")

Let’s use the walrus operator to only find the length of the list once and keep that length inside the scope of the if statement:

my_long_list = list(range(1000))

# Much better :)
if (count := len(my_long_list)) > 10:
    print(f"List is too long to consume (length={count}, max=10)")

In the code block above, count := len(my_long_list) assigns the value 1000 to count. Then, the if statement is evaluated as if len(my_long_list) > 10. The walrus operator has two benefits here:

We don’t calculate the length of a (possibly large) list more than once
We clearly show a reader of our program that we’re going to use the count variable inside the scope of the if statement.

Reuse Variables

Another common example is using Python’s regular expression library, re. We want to look at a list of phone numbers and print their area codes if they have one. With a walrus operator, we can check whether the area code exists and assign it to a variable with one line:

import re

phone_numbers = [
    "(317) 555-5555",
    "431-2973",
    "(111) 222-3344",
    "(710) 982-3811",
    "290-2918",
    "711-7712",
]

for number in phone_numbers:
    # The regular expression "\(([0-9]{3})\)" checks for a substring
    # with the pattern "(###)", where # is a 0-9 digit
    if match := re.match("\(([0-9]{3})\)", number):
        print(f"Area code: {match.group(1)}")
    else:
        print("No area code")

Legible Code Blocks

A common programming pattern is performing an action, assigning the result to a variable, and then checking the result:

result = parse_field_from(my_data)
if result:
    print("Success")

In many cases, these types of blocks can be cleaned up with a walrus operator to become one indented code block:

if result := parse_field_from(my_data):
    print("Success")

These blocks can be chained together to convert a nested check statements into one line of if/elif/else statements. For example, let’s look at some students in a dictionary. We need to print each student’s graduation date if it exists, or their student id if available:

sample_data = [
    {"student_id": 200, "name": "Sally West", "graduation_date": "2019-05-01"},
    {"student_id": 404, "name": "Zahara Durham", "graduation_date": None},
    {"student_id": 555, "name": "Connie Coles", "graduation_date": "2020-01-15"},
    {"student_id": None, "name": "Jared Hampton", "graduation_date": None},
]

for student in sample_data:
    graduation_date = student["graduation_date"]
    if graduation_date:
        print(f'{student["name"]} graduated on {graduation_date}')
    else:
        # This nesting can be confusing!
        student_id = student["student_id"]
        if student_id:
            print(f'{student["name"]} is currently enrolled with ID {student_id}')
        else:
            print(f'{student["name"]} has no data")

With walrus operators, we can put the graduation date and student id checks next to each other, and better show that we’re checking for one or the other for each student:

sample_data = [
    {"student_id": 200, "name": "Sally West", "graduation_date": "2019-05-01"},
    {"student_id": 404, "name": "Zahara Durham", "graduation_date": None},
    {"student_id": 555, "name": "Connie Coles", "graduation_date": "2020-01-15"},
    {"student_id": None, "name": "Jared Hampton", "graduation_date": None},
]

for student in sample_data:
    # Much cleaner
    if graduation_date := student["graduation_date"]:
        print(f'{student["name"]} graduated on {graduation_date}')
    elif student_id := student["student_id"]:
        print(f'{student["name"]} is currently enrolled with ID {student_id}')
    else:
        print(f'{student["name"]} has no data")

Wrap Up

With walrus operators and named expressions, we can dramatically increase the legibility of our code by simplifying statements, reusing variables, and reducing indentation. For more great examples, check out the original proposal and the Python 3.8 release notes.

AWS CLI: Multiple Named Profiles

September 6, 2020

The AWS CLI supports named profiles so that you can quickly switch between different AWS instances, accounts, and credential sets. Let’s assume you have two AWS accounts, each with an access key id and a secret access key. The first account is your default profile, and the second account is used less often.

Adding a Named Profile

First, open ~/.aws/credentials (on Linux & Mac) or %USERPROFILE%\.aws\credentials (on Windows) and add your credentials:

[default]
aws_access_key_id=AKIAIOSFODNN7EXAMPLE1
aws_secret_access_key=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY1

[user2]
aws_access_key_id=AKIAI44QH8DHBEXAMPLE2
aws_secret_access_key=je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY2

If your two profiles use different regions, or output formats, you can specify them in ~/.aws/config (on Linux & Mac) or %USERPROFILE%\.aws\config (on Windows):

[default]
region=us-west-2
output=json

[profile user2]
region=us-east-1
output=text

Note: do not add profile in front of the profile names in the credentials file, like we do above in the config file.

Most AWS CLI commands support the named profile option --profile. For example, verify that both of your accounts are set up properly with sts get-caller-identify:

# Verify your default identity
$ aws sts get-caller-identity

# Verify your second identity
$ aws sts get-caller-identity --profile user2

EKS and EC2 commands also support the --profile option. For example, let’s list our EC2 instances for the user2 account:

$ aws ec2 describe-instances --profile user2

Setting a Profile for Kubeconfig

The AWS CLI --profile option can be used to add new clusters to your ~/.kubeconfig. By adding named profiles, you can switch between Kubernetes contexts without needing to export new AWS environment variables.

If your EKS instance is authenticated with only your AWS access key id and access key secret, add your cluster with eks update-kubeconfig:

$ aws eks update-kubeconfig --name EKS_CLUSTER_NAME --profile PROFILE

If your EKS instance uses an IAM Role ARN for authentication, first copy the role ARN from the AWS Console: Go to the EKS service page, then Clusters, then select your cluster name, and find the IAM Role ARN at the bottom of the page. The format of the role ARN is typically arn:aws:iam::XXXXXXXXXXXX:role/role_name. Then, use eks update-kubeconfig:

aws eks update-kubeconfig --name EKS_CLUSTER_NAME --role-arn ROLE_ARN --profile PROFILE

To verify that your kubeconfig is set properly, use kubectx to switch to one of your new clusters and try to list out its services:

$ kubectx EKS_CLUSTER_NAME
Switched to context "EKS_CLUSTER_NAME".

$ kubectl get services
...