Scopes and Globals Lesson


Imagine that, while driving on a road trip, you hear some amazingly positive news: your friend Lauren has won the lottery. You’re so blown away you ask a gas station attendant, “Wow, can you believe the news about Lauren?!” They look at you puzzled, “Sorry, I don’t know a Lauren.” Slightly annoyed, you walk back out to the gas pumps and ask a stranger pumping gas. With a look of suspicion the stranger responds, “You know my horse?” Now you look puzzled and say, “Neigh, sorry, different Laurens.”

When there can be many people, or more broadly things, that share the same name, the meaning of the name depends on the context in which it is referenced. In your headspace, the name Lauren is your close friend. In the gas station attendant’s, the name Lauren is not associated with anyone. In the stranger’s head, Lauren is most closely associated with their horse friend. While this ambiguity can lead to confusion, it also makes sense! What an overwhelming world it would be if everything required a unique name.

In the evaluation of a single program, the same name can be bound to many different values at the same time, as long as those bindings are held in different contexts. Each function, for example, has its own context for names. This is why, when diagramming with an environment diagram, we set up a new frame for a function call. The rules of a language’s name resolution process decide which definition a name refers to in a given context.

The scope of an identifier refers to the region of a program from which a specific definition of that identifier can be accessed. Identifiers include variable names, function names, and names of a few other concepts you’ll learn in time. Unpleasant surprises occur when you attempt to access an identifier outside its scope. For example, if an identifier has not yet been defined, or its definition is in a context that name resolution will not check, then you will be faced with a NameError.

Try the following example in your Python REPL:

>>> print(lauren)
NameError: name 'lauren' is not defined
>>> lauren: str = "a friend"
>>> def stranger() -> None:
...   lauren: str = "a horse"
...   print(lauren)
...
>>> print(lauren)
a friend
>>> stranger()
a horse
>>> print(lauren)
a friend

The name lauren was first defined outside of any function and bound to the string "a friend". Then, inside the stranger function, the name lauren was defined to mean "a horse". When the same statement print(lauren) was evaluated from these different contexts, the name lauren resolved to different definitions.

Accessing Global Names

Global names are those defined outside of any special context, such as a function definition.

In the Python programming language, specifically, global variables are more precisely called module variables. The concept of global variables and global scope is more widely applicable in other programming languages than module variables, though, so we will choose the more globally useful definition. Notice this description is meta-commentary. Sometimes you intentionally want to refer to names whose definitions are beyond your local concerns.

In the preceding section’s REPL example, the name lauren was initially defined as a global variable. In the following example, we will define a function that accesses this global variable.

>>> lauren: str = "a friend"
>>> def global_access() -> None:
...   print(lauren)
...
>>> global_access()
a friend

Notice a key distinction between the global_access function and the stranger function: lauren was redefined and bound to have a different, local meaning in stranger, whereas global_access only reads the name and never assigns to it.

Name resolution rules explain why the global_access function was able to read from lauren rather than result in a NameError. When a name is not found in the current scope, which in a function call is the current frame of execution, then the Globals frame will be checked. You have already made use of this! Whenever you have written a function that calls another function, both defined in Globals, the other function’s name could only be resolved thanks to this lookup process.
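The same lookup can be seen with function names. The following is a minimal sketch; the function names greet and main are hypothetical, chosen only for illustration:

```python
def greet(name: str) -> str:
    """Build a greeting for the given name."""
    return "Hello, " + name + "!"


def main() -> str:
    """Call greet, a name that is defined only in Globals."""
    # The name greet is not bound in main's frame, so the lookup
    # continues and finds the function bound to greet in Globals.
    return greet("Lauren")


print(main())  # prints: Hello, Lauren!
```

Neither function declares anything special to make this work. Reading a global name, whether it is bound to a string or to a function, requires no extra syntax.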

Named Constants and Magic Numbers

Perhaps the most valuable use of globally scoped variables is to put names on constant values used throughout your program. Consider the following code listing:

Notice the float literal 0.009 is a constant value in the program. It will not change. It is also kind of “magical”, isn’t it? Where did this number come from and what does it represent? Often a constant like this is copied throughout a program. In programming circles, this is an example of a “magic number”.

Magic numbers are a bad practice for two reasons:

  1. Magic numbers make your code harder to read and understand. Someone else reading your code likely will not immediately know why that number is chosen. Further, you yourself will become a stranger to your own code given enough time (typically weeks or months) and will often forget why you chose some specific number.
  2. Magic numbers used throughout a program are more work to change and maintain. At best, you remember you have multiple places you need to update the magic number or search/replace all. At worst, you forget to update a few places and are accidentally relying on different values for what should have been consistent!
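As an illustrative sketch of both problems, suppose the value 0.009 stood for a fee rate. That meaning is an assumption made purely for this example, and the function names fee and total_with_fee are hypothetical as well. The same literal ends up copied into two places:

```python
def fee(amount: float) -> float:
    """Compute a fee on an amount."""
    return amount * 0.009  # What does 0.009 mean? The code does not say.


def total_with_fee(amount: float) -> float:
    """Compute an amount plus its fee."""
    return amount + amount * 0.009  # The same magic number, copied again.


print(fee(1000.0))
print(total_with_fee(1000.0))
```

If the rate ever changed, both literals would need updating. Missing one would leave the two functions quietly disagreeing about what the fee is.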

Named constants are the preferred technique for avoiding magic numbers. A named constant is a global variable whose value is initialized and does not change at runtime.

Named constants are conventionally named using all capital letters with underscores separating words. This convention makes it easy to distinguish a variable name from a globally named constant. Let’s rewrite the previous example to use a named constant:

Convince yourself this version is easier to understand for someone reading the code for the first time. If the same named constant were used in multiple places, notice how much easier this code is to tweak as well: you only need to update the value of the named constant and all references to it will use the new value.
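A minimal sketch of the pattern follows, assuming for illustration that 0.009 represents a fee rate. The names FEE_RATE, fee, and total_with_fee are hypothetical and not part of the lesson’s own listing:

```python
# Hypothetical named constant: assumed here to mean a 0.9% fee rate.
FEE_RATE: float = 0.009


def fee(amount: float) -> float:
    """Compute a fee on an amount using the named constant."""
    return amount * FEE_RATE


def total_with_fee(amount: float) -> float:
    """Compute an amount plus its fee."""
    return amount + fee(amount)


print(fee(1000.0))
print(total_with_fee(1000.0))
```

The value now appears exactly once, next to a name that says what it means. Changing the rate is a one-line edit, and every function that reads FEE_RATE picks up the change.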

Moving forward, any time you use a “magic number” in your program, you should catch yourself as quickly as possible and refactor your program to use a named constant instead. To refactor means to rewrite a program so that it has the same meaning but is better structured.

You should never attempt to reassign a named constant. The purpose of a named constant is to remain the same throughout the execution of a program, and reassigning one can be a source of great confusion. If you need a globally accessible value that changes as the program is running, the next section discusses how to go about it. Many programming languages will actively prevent you from accidentally reassigning a new value to a named constant. Python does not enforce this rule, though, so be careful to avoid being bitten by this snake.
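The lack of enforcement is easy to confirm. In this minimal sketch, MAX_LIVES is a hypothetical named constant invented for illustration:

```python
MAX_LIVES: int = 3  # Hypothetical named constant.

# Python raises no error here. The all-caps name is only a convention,
# so the "constant" silently takes on a new value.
MAX_LIVES = 5

print(MAX_LIVES)  # prints: 5
```

Nothing warns you at runtime, which is why the discipline of never reassigning an all-caps name has to come from you.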

Reassigning a Global Variable

You now know two primary use cases for globally defined names:

  1. Defining structural components of your programs such as functions and, you will soon learn, classes.
  2. Defining named constants.

There’s another use of globally defined variables that is worth knowing, but with a healthy disclaimer. Global variables can be handy in small programs, not too different from those you have written, but should generally be avoided as your programs grow in size and you learn techniques for avoiding their use. The reason for this is they can make programs more difficult to reason about and more difficult to write without subtle bugs.

When you initialize a local variable in a function in Python, by default it binds the variable’s name locally within the context of the function. Subsequent references to the name from inside the context of the function will refer to its local value. You saw this in the stranger example above.

So, what if you want to reassign to a global variable in Python? You must specifically declare your intent to do so. Let’s explore with an example:

>>> lauren: str = "a friend"
>>> def a_forceful_stranger() -> None:
...   global lauren
...   lauren = "MY HORSE"
...   print(lauren)
...
>>> print(lauren)
a friend
>>> a_forceful_stranger()
MY HORSE
>>> print(lauren)
MY HORSE

There are a few important points to notice. First, lauren is defined as a global variable, outside of any function.

Second, notice the global keyword followed by the name lauren on the first line of the body of a_forceful_stranger. The global declaration states that, within the function’s context, all references to lauren will resolve to the global variable lauren. Most importantly, all assignment statements that assign to lauren will assign to the global variable lauren. This means that after the function call evaluates, the global variable’s value has changed!

When it comes to diagramming the call stack, when you evaluate the function a_forceful_stranger, you would not introduce the name lauren in its frame because of the global keyword. The global declaration prevents a new entry from being added locally.
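One way to check this against a running program is Python’s built-in locals function, which, when called inside a function, returns the names bound in that call’s local frame. The following is a minimal sketch that reuses the two functions from this lesson, modified here to return their local names instead of printing:

```python
lauren: str = "a friend"


def stranger() -> list[str]:
    """Bind a local lauren and report the names in this frame."""
    lauren: str = "a horse"  # Binds a new, local lauren.
    return sorted(locals())


def a_forceful_stranger() -> list[str]:
    """Reassign the global lauren and report the names in this frame."""
    global lauren
    lauren = "MY HORSE"  # Assigns to the global lauren.
    return sorted(locals())


print(stranger())             # prints: ['lauren']
print(lauren)                 # prints: a friend
print(a_forceful_stranger())  # prints: []
print(lauren)                 # prints: MY HORSE
```

The empty list from a_forceful_stranger matches the diagramming rule: with the global declaration in place, no entry for lauren exists in the function’s frame.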

It is worth noting that these specific kinds of rules around name resolution, scope, variable access, and variable assignment are where different programming languages make slightly different choices. Python’s default rules make it easy for you to access global variables but make you go out of your way to reassign them. This is helpful! When you find yourself learning or writing other programming languages, comparing and contrasting these rules in Python to the other languages’ rules will help you get up to speed quickly. The same fundamental concepts apply; there will just be slightly different ways of achieving your goals.

In Python, you should name global variables with standard, lowercase snake_case_conventions.

Where is it appropriate to use a global variable? Suppose you’re writing a program in a single file and there’s a variable many different functions in your program need to make use of. For example, maybe you’re writing a little interactive game and ask the user for their name as the program begins. Rather than having to pass around their name as an argument to every function call, storing it in a global variable makes sense. Another example might be a single player game that keeps track of a score. Being able to access and modify the score from various functions is handy.

When in doubt, though, prefer local variables and named constants!

Why are global variables generally discouraged?

These concerns are a bit outside yours at this point, but foreshadowing can help prepare you for later ideas.

Reading from and writing to global variables are examples of side-effects in our fundamental pattern. The formal inputs to a function are its parameters and its formal result is its return value. Yet, a function can also read from and write to global variables it has access to through its environment.

Using global variables makes your programs more difficult to debug. Since any function is able to modify a global variable, if a bug involves a global variable then you must check every place it is reassigned. Not only that, but you must also consider how each of those functions changes the global variable, reason through the order in which those functions might be called, and so on. This is much, much more challenging than working with local variables, whose scopes are confined to the context of a single function call’s frame.

Functions which do not have any side-effects, including not accessing or reassigning global variables, are called pure functions and are the easiest kind of functions to work with because they come with no surprises. When in doubt, write pure functions!
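The contrast can be sketched in a few lines. The names total, next_total, and add are hypothetical, used only for illustration:

```python
total: int = 0  # Global state shared by any function that touches it.


def next_total(amount: int) -> int:
    """Not pure: reads and reassigns the global total."""
    global total
    total = total + amount
    return total


def add(a: int, b: int) -> int:
    """Pure: the result depends only on the parameters."""
    return a + b


first: int = next_total(5)
second: int = next_total(5)
print(first, second)         # prints: 5 10
print(add(2, 3), add(2, 3))  # prints: 5 5
```

The two calls to next_total receive the same argument yet return different results, because the outcome depends on hidden state. The calls to add always agree, which is what makes pure functions easy to test and reason about.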