# Paths to files and directories in Windows and Linux

This text complements the lecture on working with files; please read the lecture first: lec08-files.pdf

Paths are sequences of directory names (possibly ending with a file name) which unambiguously resolve to a directory or a file in the filesystem.

On Linux, the forward slash character / is used as a separator in the paths, e.g.

• /home/janedoe/prg/spam is an absolute path to the spam directory (or file), which is inside the prg directory, which is inside the janedoe directory, which is inside the home directory, which is inside the root directory /;
• data/1 is a relative path that points to a directory (or file) named 1, which is inside the data directory, which is inside the current working directory (relative paths do not start with a forward slash and are interpreted relative to the current working directory).

On Windows, the backward slash, or backslash, character \ is used as a separator in paths, e.g.

• \users\janedoe\prg\spam or c:\users\janedoe\prg\spam are valid absolute paths (the first one is resolved against the current drive);
• data\1 is a relative path.

However, on modern Windows systems you can in most cases also use forward slashes, as on Linux, i.e. the paths

• /users/janedoe/prg/spam, c:/users/janedoe/prg/spam, and
• data/1

are valid on Windows as well.
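The distinction between absolute and relative paths, and the fact that Windows accepts both separators, can be checked from Python on any operating system. The following is a minimal sketch that assumes the standard-library modules posixpath and ntpath (the Linux and Windows variants behind os.path), which this text does not otherwise use; the paths are the examples from above.

```python
import ntpath      # Windows path rules, importable on any OS
import posixpath   # Linux path rules, importable on any OS

# Linux rules: a leading forward slash marks an absolute path.
linux_abs = posixpath.isabs('/home/janedoe/prg/spam')
linux_rel = posixpath.isabs('data/1')

# Windows rules: a drive letter followed by a separator marks an absolute
# path, and forward slashes are accepted as separators too.
win_abs = ntpath.isabs('c:/users/janedoe/prg/spam')
win_rel = ntpath.isabs('data/1')

# normpath() rewrites forward slashes as backslashes under Windows rules.
normalized = ntpath.normpath('c:/users/janedoe/prg/spam')

print(linux_abs, linux_rel, win_abs, win_rel)  # True False True False
print(normalized)                              # c:\users\janedoe\prg\spam
```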

## The backslash

In many programming languages, including Python, the backslash in string literals is interpreted as a so-called escape character, i.e. it starts a special character sequence. E.g. \n is a new-line character, \t is a tab character, etc. But how do you type the backslash itself, if the backslash has this special meaning? You use a double backslash, \\, which represents a single backslash character.
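A short sketch of these escape sequences follows; the variable names are chosen only for illustration. Each escape sequence is written with two characters in the source code but produces a single character in the resulting string.

```python
newline = '\n'      # one character: new line
tab = '\t'          # one character: tab
backslash = '\\'    # one character: a single backslash

# Each of the three strings has length 1.
print(len(newline), len(tab), len(backslash))  # 1 1 1

# print() shows the actual characters of the string,
# repr() shows the string the way it is written as a literal.
path = 'data\\1'
print(path)         # data\1
print(repr(path))   # 'data\\1'
```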

So, if you write a Windows path with single backslash separators in an ordinary string literal, it will not work directly:

>>> '\users\janedoe\prg\spam'
File "<stdin>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 0-1: truncated \uXXXX escape

You have several options for specifying paths on Windows:

• Use double backslashes instead of single backslashes:
>>> '\\users\\janedoe\\prg\\spam'
'\\users\\janedoe\\prg\\spam'
• Use the “raw” string literal format (with r before the first apostrophe/quotation mark; this essentially tells Python not to interpret the backslashes as special characters):
>>> r'\users\janedoe\prg\spam'
'\\users\\janedoe\\prg\\spam'
As you can see, the resulting string is exactly the same as in the previous case.
• Or use forward slashes; in most cases they work as well. (The resulting string will not be the same, though.)
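The three options can be compared directly. This is only an illustrative sketch with made-up variable names: it shows that the doubled-backslash and raw literals produce the same string, while the forward-slash variant is a different string that names the same location on Windows.

```python
doubled = '\\users\\janedoe\\prg\\spam'   # option 1: double backslashes
raw = r'\users\janedoe\prg\spam'          # option 2: raw string literal
forward = '/users/janedoe/prg/spam'       # option 3: forward slashes

same_string = (doubled == raw)            # identical strings
differs = (forward != raw)                # a different sequence of characters

# Replacing the separators turns the third variant into the first two.
converted = forward.replace('/', '\\')

print(same_string, differs, converted == raw)  # True True True
```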

## Joining paths

Very often you have to create a path to a file, given the path to the directory in which the file resides and the file name. The best way is to join these parts using the os.path.join() function:

>>> import os
>>> corpus_dir = r'\users\janedoe\prg\spam\data'
>>> file_name = '!truth.txt'
>>> os.path.join(corpus_dir, file_name)
'\\users\\janedoe\\prg\\spam\\data\\!truth.txt'
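The output above is what os.path.join() returns on Windows; on Linux the function inserts a forward slash instead. The following sketch makes this visible on any operating system by calling the Windows and Linux variants explicitly through the standard-library modules ntpath and posixpath, which is an assumption of this example and not something the text above relies on.

```python
import ntpath      # Windows flavour of os.path
import posixpath   # Linux flavour of os.path

file_name = '!truth.txt'

# Each flavour inserts its own separator between the parts.
win_path = ntpath.join(r'\users\janedoe\prg\spam\data', file_name)
linux_path = posixpath.join('/home/janedoe/prg/spam/data', file_name)

# A trailing separator on the directory part is not doubled.
win_trailing = ntpath.join('data\\', file_name)

print(win_path)      # \users\janedoe\prg\spam\data\!truth.txt
print(linux_path)    # /home/janedoe/prg/spam/data/!truth.txt
print(win_trailing)  # data\!truth.txt
```

In ordinary programs you would simply call os.path.join() and let Python pick the separator for the system the program runs on.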