Wildcards - Computerphile
### Understanding Wildcards: A Deep Dive into How They Work
Wildcards are a fundamental concept in computing, particularly when dealing with command-line interfaces, file management, and pattern matching. While they are often associated with files and directories, their principles can be applied in various other contexts. This article delves into the intricacies of wildcards, focusing on how they operate, their different types, and practical examples of their usage.
---
#### What Are Wildcards?
A wildcard is a character that stands in for other characters in a string: sometimes exactly one, sometimes any number, including none. The most common wildcards used in computing are:
1. **Star (*):** This matches any sequence of characters (including zero characters). For example, `*.docx` would match all files ending with the `.docx` extension.
2. **Question Mark (?):** This matches exactly one character. For instance, `m?m.dot` would match filenames like `mom.dot`, `mem.dot`, or `mmm.dot`.
These wildcards are not limited to file systems; they can also be used in programming, databases, and other areas where pattern matching is required.
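As a quick illustration of the two wildcards outside a shell, here is a minimal sketch using Python's standard `fnmatch` module, which implements shell-style patterns. The filenames are made up for the example.

```python
from fnmatch import fnmatchcase

# `*` matches any run of characters, including an empty one,
# so a file named just ".docx" also matches "*.docx".
star_hits = [name for name in ["report.docx", ".docx", "report.doc"]
             if fnmatchcase(name, "*.docx")]

# `?` matches exactly one character: "mm.dot" has too few characters
# and "moom.dot" has too many.
question_hits = [name for name in ["mom.dot", "mem.dot", "mm.dot", "moom.dot"]
                 if fnmatchcase(name, "m?m.dot")]

print(star_hits)      # ['report.docx', '.docx']
print(question_hits)  # ['mom.dot', 'mem.dot']
```

Note that `fnmatchcase` compares plain strings and does not look at the file system, so it shows the matching rules only, not any particular shell's behaviour.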
---
#### How Wildcards Work
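When you use a wildcard in a command-line interface (CLI), what interprets it depends on the system: on Unix-based systems the shell expands the pattern before the command runs, while on DOS/Windows the pattern is handed to the command, which does the matching itself. In both cases the matching follows the same basic steps: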
1. **File Matching Process:**
- The shell or command iterates through each filename in the directory.
- It compares each filename against the pattern provided by the user, character by character.
- A filename is included in the results if every literal character in the pattern matches and each wildcard can account for the characters in between.
2. **Example of Filename Matching:**
- Suppose you have files named `lecture01`, `lecture02`, `lab01`, and `notes` in a directory.
- If you use the pattern `lecture*`, the system will match all filenames starting with `lecture`.
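- For each file, it compares characters until it finds a mismatch or reaches a wildcard. Because the `*` here is the last character of the pattern, a filename is included as soon as its first seven characters match `lecture`; whatever follows is absorbed by the `*`.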
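The directory from this example can be reproduced as a plain list of names. The sketch below uses `fnmatchcase` from Python's standard library as the matcher; the helper name `select` is an assumption made for illustration.

```python
from fnmatch import fnmatchcase

# The four filenames from the example above.
names = ["lecture01", "lecture02", "lab01", "notes"]

def select(names, pattern):
    """Return the names that match the pattern, keeping their original order."""
    return [name for name in names if fnmatchcase(name, pattern)]

print(select(names, "lecture*"))   # ['lecture01', 'lecture02']
print(select(names, "l*01"))       # ['lecture01', 'lab01']
print(select(names, "lecture0?"))  # ['lecture01', 'lecture02']
```

The second pattern shows that a `*` in the middle of a pattern still has to be followed by a match for the rest of the pattern, which is why `lecture02` is rejected by `l*01`.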
---
#### Differences Between DOS/Windows and Unix Wildcards
While both systems use wildcards like `*` and `?`, there are subtle differences in their implementation:
1. **DOS/Windows:**
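- The `*` matches any sequence of characters, including none.
- The `?` matches one character. In `cmd.exe`, a `?` at the end of a name or just before the dot can also match no character at all, a holdover from DOS.
- The command processor (e.g., `cmd.exe`) does not expand wildcards. The pattern is passed as typed, and each command, such as `dir` or `copy`, performs the matching itself.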
2. **Unix/Linux:**
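- The shell expands wildcards before the command runs, a step known as globbing. For example, Bash replaces `*.docx` with the list of matching filenames and passes those names to `ls`, which never sees the pattern itself. By default, `*` and `?` do not match a leading dot, so hidden files are skipped.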
- Advanced patterns can be created using square brackets (e.g., `[a-z]` matches any lowercase letter from a to z).
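Shell-style expansion can be imitated with Python's standard `glob` module, which turns a pattern into a list of existing filenames in much the same way. This is a sketch run against a throwaway temporary directory; the filenames and the helper name `expand` are assumptions made for illustration.

```python
import glob
import os
import tempfile

def expand(pattern, directory):
    """Expand a pattern against a directory and return the sorted base names."""
    # glob.escape protects any special characters in the directory path itself.
    matches = glob.glob(os.path.join(glob.escape(directory), pattern))
    return sorted(os.path.basename(path) for path in matches)

with tempfile.TemporaryDirectory() as workdir:
    # Create a few empty files to match against.
    for name in ["a.docx", "b.docx", "notes.txt", ".hidden.docx"]:
        open(os.path.join(workdir, name), "w").close()

    expanded = expand("*.docx", workdir)    # leading-dot files are skipped
    bracketed = expand("[b-n]*", workdir)   # first character between b and n

print(expanded)   # ['a.docx', 'b.docx']
print(bracketed)  # ['b.docx', 'notes.txt']
```

A command that receives `expanded` as its arguments has no way of knowing that a pattern was ever involved, which is the key difference from the DOS/Windows model.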
---
#### Practical Examples of Wildcard Usage
1. **Finding Word Documents:**
- In DOS/Windows: Use `dir *.docx` to list all `.docx` files in the current directory.
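- In Unix/Linux: Use `ls *.docx` for the current directory, or `find . -name "*.docx"` to search subdirectories as well. The quotes stop the shell from expanding the pattern, so `find` receives it intact and does the matching itself.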
2. **Matching Specific Patterns:**
- To find files starting with "pre" and ending with ".txt": Use `pre*.txt`.
- To find files with exactly three characters between "lab" and ".txt": Use `lab???.txt` (one `?` per character).
3. **Advanced Matching in Unix:**
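- To match single-letter files from "a.txt" to "z.txt": Use `[a-z].txt` (the range goes inside square brackets).
- This would include files like `b.txt`, `c.txt`, etc., but not `1.txt` or `ab.txt`. Whether `A.txt` is excluded depends on the locale: in the `C` locale it is, but some locales sort uppercase letters inside the `a-z` range.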
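The three kinds of pattern above can be checked against sample names. This is a sketch using `fnmatchcase` from Python's standard library, which is case-sensitive and compares bracket ranges by character code, so it behaves like a shell in the `C` locale; the sample filenames are invented.

```python
from fnmatch import fnmatchcase

# Each pattern is paired with candidate names, some of which should fail.
candidates = {
    "pre*.txt": ["preface.txt", "pre.txt", "prefix.doc"],
    "lab???.txt": ["lab001.txt", "lab01.txt", "lab0001.txt"],
    "[a-z].txt": ["b.txt", "A.txt", "1.txt", "ab.txt"],
}

# Keep only the names that match their pattern.
results = {
    pattern: [name for name in names if fnmatchcase(name, pattern)]
    for pattern, names in candidates.items()
}

for pattern, matched in results.items():
    print(pattern, "->", matched)
```

Here `lab???.txt` accepts only the name with exactly three characters between `lab` and `.txt`, and `[a-z].txt` accepts only a single lowercase letter before the extension.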
---
#### Under the Hood: How Wildcards Are Processed
When you execute a command with wildcards, the shell (on Unix/Linux) or the command itself (on DOS/Windows) checks each file in the directory:
- **Step 1:** The system reads the list of files in the current directory.
- **Step 2:** For each filename, it compares it against the pattern provided by the user.
- **Step 3:** If the filename matches the pattern (considering wildcards), the file is added to the result set.
- **Step 4:** The results are then displayed or used as needed.
This process can be optimized in modern operating systems, but the underlying mechanism remains the same.
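The comparison in Step 2 can be written out by hand. The following is a minimal sketch of one common technique, a left-to-right scan that backtracks to the most recent `*` on a mismatch. It handles only `*` and `?`, and it is not the actual implementation used by any particular shell or operating system; the function name `wildcard_match` is an assumption.

```python
def wildcard_match(name, pattern):
    """Return True if name matches pattern, where `*` matches any run of
    characters (including none) and `?` matches exactly one character."""
    n = p = 0      # current positions in name and pattern
    star = -1      # index of the most recent `*` in pattern, or -1
    resume = 0     # position in name to retry from after that `*`
    while n < len(name):
        if p < len(pattern) and pattern[p] == "*":
            # Remember the star; first try letting it match nothing.
            star, resume = p, n
            p += 1
        elif p < len(pattern) and pattern[p] in ("?", name[n]):
            # `?` or an identical literal character: advance both.
            n += 1
            p += 1
        elif star != -1:
            # Mismatch: let the last `*` absorb one more character and retry.
            resume += 1
            n = resume
            p = star + 1
        else:
            return False
    # The name is used up; any leftover pattern must consist only of `*`.
    while p < len(pattern) and pattern[p] == "*":
        p += 1
    return p == len(pattern)

# Steps 1 to 4 on an in-memory "directory listing".
listing = ["lecture01", "lecture02", "lab01", "notes"]
matched = [name for name in listing if wildcard_match(name, "lecture*")]
print(matched)  # ['lecture01', 'lecture02']
```

Backtracking only to the latest `*` is enough for these two wildcards, because an earlier `*` never needs to be revisited once a later one has been reached.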
---
#### Conclusion
Wildcards are a powerful tool for efficiently managing files and performing searches. Understanding how they work allows users to save time by matching patterns rather than individual filenames. Whether you're working in DOS/Windows or Unix/Linux, mastering wildcards can significantly enhance your productivity at the command line.