Purpose: Use Unix shell rules to find filenames matching a pattern.
Python Version: 1.4
Even though the glob API is very simple, the module packs a lot of power. It is useful in any situation where your program needs to look for a list of files on the filesystem with names matching a pattern. If you need a list of filenames that all have a certain extension, prefix, or any common string in the middle, use glob instead of writing code to scan the directory contents yourself.
The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. Shell variable names and tilde (~) are not expanded. There are only a few special characters: two different wildcards and character ranges. The pattern rules are applied to segments of the filename (stopping at the path separator, /), but paths in the pattern can be relative or absolute.
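Because glob leaves tilde and shell variables alone, a pattern that contains them has to be expanded before it is handed to glob.glob(). The following is a minimal sketch of one way to do that with os.path.expanduser() and os.path.expandvars(). The helper name expanding_glob, the GLOB_DEMO_ROOT variable, and the scratch file notes.txt are invented for this illustration and are not part of the sample code.

```python
import glob
import os
import tempfile


def expanding_glob(pattern):
    """Hypothetical helper: expand ~ and $VARS, then apply glob."""
    expanded = os.path.expandvars(os.path.expanduser(pattern))
    return sorted(glob.glob(expanded))


# Scratch directory and variable name are assumptions made for the demo.
demo_root = tempfile.mkdtemp()
open(os.path.join(demo_root, 'notes.txt'), 'w').close()
os.environ['GLOB_DEMO_ROOT'] = demo_root

# glob treats "$GLOB_DEMO_ROOT" as a literal directory name.
unexpanded = glob.glob('$GLOB_DEMO_ROOT/*.txt')

# Expanding first turns the variable into the real directory path.
expanded = expanding_glob('$GLOB_DEMO_ROOT/*.txt')

print(unexpanded)
print(expanded)
```

The first call finds nothing unless a directory literally named $GLOB_DEMO_ROOT happens to exist, while the second returns the path to notes.txt.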
The examples below assume the following test files are present in the current working directory:
Use the glob_maketestdata.py script in the sample code to create these files if you want to run the examples.
An asterisk (*) matches zero or more characters in a segment of a name. For example, dir/*.
The pattern matches every pathname (file or directory) in the directory dir, without recursing further into subdirectories.
$ python glob_asterisk.py
['dir/file.txt', 'dir/file1.txt', 'dir/file2.txt',
'dir/filea.txt', 'dir/fileb.txt', 'dir/subdir']
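A listing like the one above can be reproduced with a short, self-contained sketch. It builds the test files in a temporary directory rather than the current working directory (an assumption made so the example cleans up easily), and it sorts the results because glob does not promise any particular ordering.

```python
import glob
import os
import tempfile

# Build a scratch copy of the test layout, assumed from the listing above.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'dir', 'subdir'))
for filename in ['file.txt', 'file1.txt', 'file2.txt', 'filea.txt', 'fileb.txt']:
    open(os.path.join(root, 'dir', filename), 'w').close()

# The asterisk matches every name directly inside dir, including subdir.
asterisk_matches = sorted(glob.glob(os.path.join(root, 'dir', '*')))

# Show the matches relative to the scratch root for readability.
asterisk_names = [os.path.relpath(path, root) for path in asterisk_matches]
print(asterisk_names)
```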
To list files in a subdirectory, you must include the subdirectory in the pattern:
import glob
print 'Named explicitly:', glob.glob('dir/subdir/*')
print 'Named with wildcard:', glob.glob('dir/*/*')
The first case above lists the subdirectory name explicitly, while the second case depends on a wildcard to find the directory.
$ python glob_subdir.py
Named with wildcard:
The results, in this case, are the same. If there were another subdirectory, the wildcard would match both subdirectories and include the filenames from both.
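The difference between the two forms shows up once a second subdirectory exists. The sketch below sets up that situation in a temporary directory. The name subdir comes from the listing earlier in this article, while the directory other, both filenames inside the subdirectories, and the helper relative_matches are hypothetical and exist only for this illustration.

```python
import glob
import os
import tempfile

root = tempfile.mkdtemp()

# 'subdir' comes from the article; 'other' and both filenames are hypothetical.
for subdir, filename in [('subdir', 'subfile.txt'), ('other', 'otherfile.txt')]:
    os.makedirs(os.path.join(root, 'dir', subdir))
    open(os.path.join(root, 'dir', subdir, filename), 'w').close()

# A plain file in dir, to show that dir/*/* skips names that are not directories.
open(os.path.join(root, 'dir', 'file.txt'), 'w').close()


def relative_matches(*pattern_parts):
    """Glob under the scratch root and return sorted paths relative to it."""
    pattern = os.path.join(root, *pattern_parts)
    return sorted(os.path.relpath(path, root) for path in glob.glob(pattern))


explicit = relative_matches('dir', 'subdir', '*')
wildcard = relative_matches('dir', '*', '*')

print(explicit)
print(wildcard)
```

Here the explicit pattern returns only the file under subdir, and the wildcard pattern returns the files from both subdirectories.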
Single Character Wildcard
The other wildcard character supported is the question mark (?). It matches any single character in that position in the name. For example, a pattern such as dir/file?.txt matches all of the filenames that begin with “file”, have one more character of any type, then end with “.txt”.
$ python glob_question.py
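As a rough illustration of the single-character wildcard, the sketch below applies the pattern file?.txt to a scratch copy of the test files. The temporary directory is an assumption made to keep the example self-contained.

```python
import glob
import os
import tempfile

# Scratch copy of the test files named earlier in the article.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'dir'))
for filename in ['file.txt', 'file1.txt', 'file2.txt', 'filea.txt', 'fileb.txt']:
    open(os.path.join(root, 'dir', filename), 'w').close()

# "?" stands for exactly one character between "file" and ".txt".
question_matches = sorted(glob.glob(os.path.join(root, 'dir', 'file?.txt')))
question_names = [os.path.basename(path) for path in question_matches]
print(question_names)
```

file.txt is not matched, because the question mark requires exactly one character in that position and cannot match an empty string.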
When you need to match a specific character, use a character range instead of a question mark. For example, to find all of the files which have a digit in the name before the extension:
The character range [0-9] matches any single digit. The range is ordered based on the character code for each letter/digit, and the dash indicates an unbroken range of sequential characters. The same range value could be written [0123456789], in this case.
$ python glob_charrange.py
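A character range can be tried out the same way. This sketch, again run against a temporary copy of the test files (an assumption for the sake of a self-contained example), compares the dashed range with the same set of digits spelled out in full.

```python
import glob
import os
import tempfile

# Scratch copy of the test files named earlier in the article.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'dir'))
for filename in ['file.txt', 'file1.txt', 'file2.txt', 'filea.txt', 'fileb.txt']:
    open(os.path.join(root, 'dir', filename), 'w').close()


def names_matching(pattern):
    """Return the sorted base names in dir that match the pattern."""
    paths = glob.glob(os.path.join(root, 'dir', pattern))
    return sorted(os.path.basename(path) for path in paths)


# A dashed range and the equivalent explicit set of digits.
range_names = names_matching('file[0-9].txt')
explicit_set_names = names_matching('file[0123456789].txt')

print(range_names)
print(explicit_set_names)
```

Both patterns select only the names with a digit before the extension, so filea.txt and fileb.txt are left out.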
Pattern Matching Notation, The Open Group