Listing directory contents

In your solution, you will have to go through all files stored in directories:

  • images in the training data directory (to learn something from them, i.e., to estimate the parameters of a model) and
  • images in the testing data directory (to classify these images with the chosen classifier).

There are several ways to achieve this; we list some of them here.

os.listdir()

The listdir() function (docs) from the os module returns a list of the file names in the specified directory. If we have the following directory with files

+- train_data
   +- truth.dsv
   +- img_1112.png
   +- img_1113.png
   +- img_1114.png
then the following script
import os
 
for fname in os.listdir("train_data"):
    print(fname)
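produces output such as the following (os.listdir() returns the names in arbitrary order, so the order on your machine may differ):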
img_1112.png
img_1113.png
img_1114.png
truth.dsv
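
Note that os.listdir() returns bare file names, not paths, and that it lists every entry in the directory (here including truth.dsv). A minimal sketch of how the image files could be collected together with their full paths follows; the helper name list_image_paths and the assumption that all images end in .png are ours, for illustration only:

```python
import os


def list_image_paths(directory, extension=".png"):
    """Return sorted full paths of files in `directory` ending with `extension`.

    Hypothetical helper: the name and the default extension are assumptions.
    """
    paths = []
    # sorted() gives a reproducible order; os.listdir() itself guarantees none.
    for fname in sorted(os.listdir(directory)):
        if fname.lower().endswith(extension):
            # listdir() yields names only, so prepend the directory.
            paths.append(os.path.join(directory, fname))
    return paths


# Usage: runs only if a "train_data" directory exists next to the script.
if os.path.isdir("train_data"):
    for path in list_image_paths("train_data"):
        print(path)
```

Joining the directory and the file name with os.path.join() keeps the script independent of the path separator used by the operating system.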

You can also use the similar function os.scandir(), which returns an iterator of os.DirEntry objects instead of a list of names.
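
As a rough sketch of how os.scandir() might be used, the function below collects the names of regular files only and skips subdirectories. The function name list_regular_files is hypothetical and not part of the assignment:

```python
import os


def list_regular_files(directory):
    """Return sorted names of regular files (not subdirectories) in `directory`.

    Hypothetical helper, shown only to illustrate os.scandir().
    """
    # The with-statement closes the scandir iterator when we are done.
    with os.scandir(directory) as entries:
        # Each entry is an os.DirEntry with .name, .path and .is_file().
        return sorted(entry.name for entry in entries if entry.is_file())
```

Each os.DirEntry also carries a path attribute with the directory already prepended, so entry.path can be used wherever the full path is needed.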

Object-oriented interface of the pathlib module

If you prefer an object-oriented interface, you can use the Path class and its iterdir() method (docs):

from pathlib import Path
 
path = Path("train_data")
for fpath in path.iterdir():
    print(fpath)

Running the above script prints the following (shown here on Windows; other systems use / as the path separator):

train_data\img_1112.png
train_data\img_1113.png
train_data\img_1114.png
train_data\truth.dsv

The iterdir() generator yields instances of the Path class, which encapsulate the paths to the individual files. You can pass these objects directly to the open() function when opening files.
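
To illustrate passing Path objects to open(), here is a small sketch that reads every .dsv file found in a directory. The function name read_dsv_files and the choice of UTF-8 encoding are assumptions made for this example:

```python
from pathlib import Path


def read_dsv_files(directory):
    """Map the name of each .dsv file in `directory` to its text contents.

    Hypothetical helper; UTF-8 encoding is an assumption.
    """
    contents = {}
    for fpath in sorted(Path(directory).iterdir()):
        # Path.suffix is the last extension, including the leading dot.
        if fpath.is_file() and fpath.suffix == ".dsv":
            # open() accepts a Path object directly; no str() conversion needed.
            with open(fpath, "r", encoding="utf-8") as f:
                contents[fpath.name] = f.read()
    return contents
```

The name attribute gives the file name without the directory part, which is convenient when matching files against entries in truth.dsv.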

More examples of how to use pathlib can be found in this nice pathlib tutorial.

courses/be5b33kui/semtasks/05_ml1/listdir.txt · Last modified: 2024/02/18 20:09 by xposik