Bash script Implementation: Task B

WARNING: This information might be out of date. Last updated for the winter term of 2016/2017.

The task is to be solved individually. Available variants follow. When you choose your variant(s), perform a reservation in the Upload System.

If anything is unclear or ambiguous, ask in time. If you encounter problems during the implementation, solve them right away. Read the common specifications and the specification of your task thoroughly.

The task is checked by an automated script in the Upload System. If the task is not accepted, read the test output thoroughly.

Common rules for BASH projects

When implementing, you should fulfill the following:

The script must contain a (comment) header with the purpose of the script, author name, contact e-mail and current term.
All script inputs are specified as command line arguments (unless explicitly stated) – the script never asks for file/directory name in an interactive way.
Validate the correctness of the script parameter(s). When the parameters are incorrect, print an error message or help.
Unless needed, do not create temporary files (an exception is working with archives that must be extracted). Each script must clean its temporary data even in case of failure.
Temporary files must be created in a location with sufficient write permissions (the current directory may not be writable). Never use fixed filenames for temporary files; use the mktemp command instead. Remember that multiple instances of your script might be executed simultaneously.
When the script creates, overwrites or deletes a file, it must not harm any other files (test for the existence of the files in advance and act accordingly).
The type of a file cannot be determined from its suffix. A .sh file can be anything (a binary, a pipe or whatever). Users can assign an arbitrary suffix to any file.
Use standard output (stdout) and standard error output (stderr). The stderr should be used not only for error messages, but also for any output that should not mix with the data (such as help and debugging info). The stdout is reserved for output data only.
It must be easy to determine the purpose of the script. Implement a -h or --help argument that prints a short help with usage.
Mind that directories/paths might be specified relatively (., .., ../../, ./.././path, ../path) or absolutely (/var/lib/).
Test your script with strings (and/or filenames) containing spaces (and tabulators). Use correct quoting when appropriate.
Scripts are supposed to be called in a pipe (from other scripts). Do not expect an interactive run:
- Always return an exit code (a non-zero value means an error).
- Do not display a prompt when expecting data from stdin.
- Parameters are passed only as command line arguments.
Do not copy&paste blocks of similar code. Modifying the code should take as little time as possible. Use functions and think before you code.
Name the script appropriately. Inappropriate names are: “bash”, “bash1”, “john”, “example”, “script.sh”, “task.sh”, “a.sh”, … Warning: The Upload System deducts points for such (…) filenames.
Do not reinvent (reimplement) the wheel (sorting algorithms, …). Use the standard tools effectively.
Respect the permissions already set on files and directories. Even if you are the owner and can change the write permission, do not change it. (Or use an explicit parameter for your script, such as -f or --force, as the standard tools do.)
Do not use DOS-style ends of lines (EOLs); this happens when editing the file under Windows. The script might misbehave if it contains DOS ends of lines.
Stick to the output format specified (see examples).
The whole task must be implemented in one file only. Otherwise the automated evaluator cannot guess which file to execute.
It must be easy to determine the script function. Implement an -h option that prints script information and invocation and exits (exit code 0).

Failure to follow these principles will result in a penalty!

IMPORTANT: As the script is automatically evaluated, implement the -v commandline parameter that prints out the number of the variant solved onto the standard output and exits. You need to perform a reservation of the task first.
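
Several of the common rules above (comment header, help on -h, variant number on -v, temporary data from mktemp that is cleaned up on exit) could be combined into a starting skeleton along the following lines. This is only a sketch under stated assumptions: the variant number 1, the function names and the WORKDIR variable are placeholders that do not come from the assignment, and the header fields must be filled in with your own data.

```shell
#!/bin/bash
# Purpose: skeleton illustrating option handling and temporary-data cleanup
# Author:  (your name), (your contact e-mail)
# Term:    (current term)

VARIANT=1     # placeholder: the number of the variant you reserved
WORKDIR=""    # temporary directory; set by make_workdir, removed by cleanup

usage() {
    # Help is not data, so it goes to stderr.
    echo "Usage: ${0##*/} [-h] [-v] [directory]" >&2
}

cleanup() {
    # Called through the EXIT trap; removes only what mktemp created.
    if [ -n "$WORKDIR" ]; then rm -rf -- "$WORKDIR"; fi
}

make_workdir() {
    # mktemp chooses a unique name, so parallel runs do not collide.
    WORKDIR=$(mktemp -d) || return 1
    trap cleanup EXIT
}

main() {
    local opt dir
    OPTIND=1
    while getopts ":hv" opt; do
        case "$opt" in
            h) usage; return 0 ;;
            v) echo "$VARIANT"; return 0 ;;
            *) echo "Error: unknown option -$OPTARG" >&2; usage; return 1 ;;
        esac
    done
    shift $((OPTIND - 1))

    dir="${1:-.}"
    if [ ! -d "$dir" ]; then
        echo "Error: '$dir' is not a directory" >&2
        return 1
    fi
    make_workdir || { echo "Error: cannot create a temporary directory" >&2; return 1; }
    # ... the task-specific work would go here, using "$WORKDIR" for scratch data ...
}

# A real script would end with:  main "$@"; exit $?
```

Keeping everything in functions makes it easy to return a non-zero status from any point while the EXIT trap still removes the temporary directory.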

Variant Specification

Task 1

Create a shell script that finds all bad symbolic links in the specified directory and its subdirectories (i.e. all links that do not point to an existing file) and prints their absolute paths to stdout. The script must have two optional parameters: -r and -a. The position of the parameters is not specified; the path should be specified only once.

When -r is specified, the script deletes the bad links. Parameter -a lists all links (even the correct ones). Combining both parameters is not permitted.

If there is a (correct) link to another directory in the directory structure, do not follow the link, i.e. do not search the referenced directory.

When the specified directory does not exist or no bad link is found, print a message to stderr. If no directory is specified, use the current directory.

Example: Output of a script that found two links nonex and wrong pointing to non-existent file/directory:

/home/student/example/nonex

/home/student/example/wrong

Hint: See commands find, dirname and readlink.
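
One possible way to detect the bad links described above is to let find select symbolic links and keep only those for which test -e fails. The function name list_links and its "all" mode argument are illustrative assumptions, not part of the assignment; parameter checking, the stderr messages and the -r deletion are left out.

```shell
# List symbolic links below DIR with absolute paths.
# By default only bad links are printed; with "all" every link is printed.
list_links() {   # list_links DIR [all]
    local dir="$1" mode="${2:-bad}" abs
    # Resolve DIR to an absolute path in a subshell (the caller's cwd is untouched).
    abs=$(CDPATH= cd -- "$dir" && pwd) || return 1
    if [ "$mode" = "all" ]; then
        find "$abs" -type l
    else
        # "test -e" follows the link, so it fails exactly when the target is missing.
        # find itself does not follow links, so linked directories are not searched.
        find "$abs" -type l ! -exec test -e {} \; -print
    fi
}
```

For example, list_links . prints the bad links below the current directory, one absolute path per line.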

Task 2

Create a shell script for comparing two archives (of type tar.gz). For each file/directory (recursively) in either archive, the script verifies whether a file/directory of the same name exists in the other archive. When a whole subdirectory is missing, do not list its files or subdirectories. The script outputs a list of names that do not have an equivalent in the other archive. When a specified file does not exist (or is not a tar.gz file), the script outputs an error message. Do not determine the file type from the suffix.

Example: Output of a script comparing the archives arch1.tar.gz and arch2.tar.gz, where the files aa/a.txt and bb/b.txt are present only in arch1.tar.gz, and the file nic.txt and the directory dir1 are present only in arch2.tar.gz:

> # An example of archive modification (pseudocode)
> $ tar tf archive1.tgz
> ===============
> aa/a.txt
> aa/aa.txt
> bb/b.txt
> bb/c.txt
> hello world/hello world.c
> ===============
> 
> CREATE: nic.txt
> DELETE: aa/a.txt
> DELETE: bb/b.txt
> CREATE: dir1/file.txt
> CREATE: dir1/file2.txt
>
> TAR: Create archive2.tgz
>
> $ tar tf archive2.tgz
> ===============	
> aa/aa.txt
> bb/c.txt
> nic.txt
> dir1/file.txt
> dir1/file2.txt
> hello world/hello world.c
> ===============

Expected output of your script:

 arch1.tar.gz:aa/a.txt
 arch1.tar.gz:bb/b.txt
 arch2.tar.gz:nic.txt
 arch2.tar.gz:dir1

Hint: See command tar. Keep in mind that you do not actually have to extract the archive.
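
As the hint suggests, listing the archives is enough for the comparison. The sketch below shows one way to do it; the function names list_archive and only_in_first are assumptions made for this illustration. It validates an archive by letting tar read it (so the suffix is irrelevant) and prints the names found only in the first archive. Collapsing a missing subdirectory into a single line, as the task requires, is deliberately not handled here.

```shell
# Print the sorted, de-duplicated member names of a tar.gz archive.
# Returns non-zero when tar cannot read FILE as a gzip-compressed tar.
list_archive() {   # list_archive FILE
    local listing
    listing=$(tar -tzf "$1" 2>/dev/null) || return 1
    [ -n "$listing" ] || return 0
    # Directory entries are listed with a trailing slash; strip it.
    printf '%s\n' "$listing" | sed 's|/$||' | LC_ALL=C sort -u
}

# Print "ARCHIVE_A:name" for every name found only in ARCHIVE_A.
only_in_first() {   # only_in_first ARCHIVE_A ARCHIVE_B
    local a="$1" b="$2" name
    # Names from B are fed in twice, so only names unique to A occur exactly once.
    { list_archive "$a"; list_archive "$b"; list_archive "$b"; } |
        LC_ALL=C sort | uniq -u |
        while IFS= read -r name; do printf '%s:%s\n' "$a" "$name"; done
}
```

Calling only_in_first twice with the arguments swapped covers both directions without any temporary file.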

Task 3

Create a script that finds all files in a given directory (and its subdirectories) having the suffix provided as an argument and performs one of the following actions:

Copies them and changes their suffix, or
renames them to use another suffix, or
deletes them.

The script has 4 mandatory parameters (3 in case of file deletion): suffix_handle.sh [ -c | -m | -r ] ext1 ext2, where the extensions ext1 and ext2 are strings that might contain a dot. Parameter -c stands for the copy operation, -m for rename (move) and -r for file deletion.

In case of copy/rename, the script outputs a line for each file in the following format: old_filename ⇒ new_filename. In case of file deletion, the script lists the files being deleted.

When the provided directory does not exist, or in case of insufficient permissions for copying (read permission) or moving (read and write permissions), the script outputs an error message.

Example: Output for the command suffix_find.sh -c a.txt ngz

a/aa.txt ⇒ a/angz

a/ba.txt ⇒ a/bngz

ta.txt ⇒ tngz

Requirements: Program block for filesystem traversal/file lookup must not repeat in the code.

Hint: See commands find, cp, mv and rm.

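Warning: The owner of a file may be able to delete or rename it even when the read/write permissions of the file are not set, because deleting and renaming depend on the permissions of the containing directory. Therefore, checking the r/w permissions of the file might not be sufficient to find out whether the operation will succeed.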
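
The requirement that the traversal code must not repeat can be met by writing the traversal once and passing the action to it. The following is a minimal sketch; the names new_name, copy_one and for_each_match are hypothetical, and file names containing newlines or suffixes containing glob characters are not handled.

```shell
# Replace the literal suffix OLD at the end of NAME with NEW.
new_name() {   # new_name NAME OLD NEW
    printf '%s\n' "${1%"$2"}$3"
}

# Action: copy FILE to its new name and report the pair on stdout.
copy_one() {   # copy_one OLD NEW FILE
    local target
    target=$(new_name "$3" "$1" "$2")
    cp -- "$3" "$target" && printf '%s ⇒ %s\n' "$3" "$target"
}

# The single traversal: run ACTION (with its arguments) on every regular
# file below DIR whose name ends in SUFFIX.
for_each_match() {   # for_each_match DIR SUFFIX ACTION...
    local dir="$1" suffix="$2" file
    shift 2
    find "$dir" -type f -name "*$suffix" | while IFS= read -r file; do
        "$@" "$file"
    done
}
```

A copy run would then look like for_each_match . a.txt copy_one a.txt ngz; a rename or delete action would be another small function passed to the same traversal.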

Task 4

Create a script that creates a tar.gz backup of a directory provided (as a parameter) and names it directoryName_NUMBER.tar.gz, where NUMBER is the smallest positive integer for which no file of that name exists yet (you do not have to find the maximal number, just fill the first missing one).

The backup is created in the current directory or in a directory specified via the second (optional) command line argument. Paths in the archive must be saved relative to the directory specified (the archive must not contain the higher-level directory). Mind that the directory can be specified relatively or absolutely. Watch for proper naming when the path leads to a higher-level directory (..).

Use only stderr for output: print a message when the directory provided does not exist or inform about success by printing the archive name (with its path).

In case a file in the backed-up directory cannot be read, print an error message and use a return value of 2. The backup must continue and back up the remaining files.

Hint: See command tar.
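
Finding the first free number and deriving a proper name even for paths such as . or .. can be sketched as follows. The function name next_backup_name is an assumption for this example, and the special case of backing up the root directory is ignored.

```shell
# Print the first free archive name DEST/DIRNAME_N.tar.gz (N = 1, 2, 3, ...).
next_backup_name() {   # next_backup_name SOURCE_DIR [DEST_DIR]
    local src="$1" dest="${2:-.}" base n=1
    # Resolve the directory first, so that ".", ".." or "a/.." yield a real name.
    base=$(CDPATH= cd -- "$src" && pwd) || return 1
    base="${base##*/}"
    while [ -e "$dest/${base}_$n.tar.gz" ]; do
        n=$((n + 1))
    done
    printf '%s\n' "$dest/${base}_$n.tar.gz"
}
```

The archive itself could then be created with tar -czf "$archive" -C "$src" . so that the stored paths are relative to the backed-up directory.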

Task 5

Create a script that counts frequency of words in text documents (arbitrary number of them) specified as script parameter(s). The ordering is controlled by script parameters: Script has two parameters -a (alphabetical) and -r (descending). The possible ordering is:

by frequency ascending (default)
by frequency descending (parameter -r)
alphabetically ascending (parameter -a)
alphabetically descending (both parameters -a -r)

The order of parameters can change, however the parameters must precede the file name(s).

Definition: A word is a sequence of alphabetical characters (without punctuation) a-z and A-Z. The frequency counting is not case sensitive.

Each line should contain a word and its frequency (space separated).

If any of the files provided is unreadable or does not exist, output an error message and return an error code (not 0). Continue with processing the remaining file(s).

Example: Output of a script with -a parameter:

alpha 3

beta 1

delta 2

or with -r parameter:

alpha 3

delta 2

beta 1

Requirements: Do not create any files (not even temporary). Algorithms for file search and count should be contained only once (same code for all sorting).

Hint: See commands sort, uniq and possibly tr. It is recommended to transform all words to lowercase. Beware of punctuation characters in the input text (commas, periods, quotation marks, etc.).

Note: The ordering produced by the sort command can depend on the LANG shell variable (or LC_COLLATE, LC_CTYPE).
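
The hinted tools can be chained into a single pipeline, so that the counting code exists only once and only the final sort differs between the four orderings. A minimal sketch follows; the function names and the 0/1 flag convention are assumptions for illustration. LC_ALL=C is set to keep the letter ranges and the ordering independent of the locale.

```shell
# Read text on stdin and print one "word count" line per distinct word.
count_words() {
    LC_ALL=C tr -cs 'A-Za-z' '\n' |    # every run of non-letters becomes a newline
        LC_ALL=C tr 'A-Z' 'a-z' |      # counting is case-insensitive
        LC_ALL=C sort | uniq -c |      # uniq -c prints "count word"
        awk 'NF == 2 { print $2, $1 }' # swap the columns, drop the empty word
}

# Order "word count" lines; ALPHA and REVERSE are 0 or 1.
order_words() {   # order_words ALPHA REVERSE
    local key
    if [ "$1" -eq 1 ]; then key="1,1"; else key="2,2n"; fi
    if [ "$2" -eq 1 ]; then key="${key}r"; fi
    LC_ALL=C sort -k "$key"
}
```

For example, cat -- "$@" | count_words | order_words 0 1 would correspond to the -r parameter.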

Task 6

Create a script that finds all scripts in the specified directory (and subdirectories) and prints their full pathname along with the script type (shell, perl, awk, …). If no directory is specified, use the current directory.

Script has two parameters:

optional parameter -f name that allows script type filtering (Mind that when specified “-f sh”, only “sh” scripts should be printed. Not the “bash” scripts. It is NOT a suffix filtering.)
and an optional parameter -n (nonrecursive) that turns off directory traversal.

Arguments can be specified in any order.

Note: Scripts have a shebang (#!) on the first line followed by program name with a path.

Mind that the interpreter name can be followed by parameters (that must be cut off). Also cut off the interpreter path. As a script type print an interpreter name without path and without any parameters.

For each script found, the output is a line with the full path of the script followed by a space and the interpreter name. If the directory provided does not exist, print an error message.

Example: An example of the script output:

/home/student/skripty/find.sh bash

/home/student/skripty/bin/server.pl perl

/home/student/skripty/bin/client.pl perl

Hint: See commands find and head, possibly sed.
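
Extracting the script type from the first line could look like the sketch below. It uses awk instead of the hinted head and sed, which is a choice made for this example only, and the function name script_type is hypothetical. Following the task text literally, a line such as #!/usr/bin/env python would yield env.

```shell
# Print the interpreter name (no path, no parameters) from the shebang of FILE.
# Returns non-zero when the first line is not a shebang.
script_type() {   # script_type FILE
    local name
    name=$(LC_ALL=C awk 'NR == 1 {
               # Remove "#!" and optional blanks; $1 is then the interpreter path.
               if (sub(/^#![ \t]*/, "")) { n = split($1, part, "/"); print part[n] }
               exit
           }' "$1" 2>/dev/null)
    [ -n "$name" ] && printf '%s\n' "$name"
}
```

The -f filter then becomes an exact string comparison, for instance [ "$(script_type "$file")" = "$wanted" ], so that sh does not match bash.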

Task 7

Create a script that traverses a tar.gz archive and, in each file, replaces a specified string with another string. The resulting files are saved in an archive with the same name.

Script parameters:

input archive,
searched word and
target word.

The archive contains only text files and does not have a directory structure. The resulting archive must not contain a directory structure either.

The only output is to the stderr in case the archive does not exist or is not a valid tar.gz file (do not determine a filetype by the suffix).

Hint: See commands tar and sed.
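
When sed performs the replacement, characters such as ., *, / or & in the two words would otherwise be interpreted by sed. The sketch below escapes them first. It assumes that the searched word is meant as a literal substring (the task does not say whether whole words are required) and that neither argument contains a newline; the function name replace_literal is hypothetical.

```shell
# Copy stdin to stdout, replacing every literal SEARCH with REPLACEMENT.
replace_literal() {   # replace_literal SEARCH REPLACEMENT
    local pat rep
    # Escape characters that are special in a basic regular expression.
    pat=$(printf '%s\n' "$1" | sed 's|[][\\/.^$*]|\\&|g')
    # Escape characters that are special in the replacement part.
    rep=$(printf '%s\n' "$2" | sed 's|[\\/&]|\\&|g')
    sed "s/$pat/$rep/g"
}
```

Each file extracted into a directory created by mktemp -d could be passed through this filter before the archive is rebuilt.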

Task 8

Implement a script that creates a dictionary. The script input is text from stdin (in case no input files are specified) or a list of text file(s) specified as script argument(s). The output of the script is a file whose name is given as the first (mandatory) argument of the script.

The script outputs an alphabetically ordered list (one word per line) of the words from the text or from all files (each word appears only once). A word is a sequence of alphabetical characters (without punctuation) a-z and A-Z; other characters should be treated as separators. The case is not important.

If any of the files provided does not exist, output an error message and return an error code 2, but continue with processing the other file(s). If no file can be read, return an error code of 3.

Syntax: ./dictionary_B8_johnDoe.sh output_filename.txt input_file1 input_file2

./dictionary_B8_johnDoe.sh output_filename.txt

Example: An example output file can be:

alpha

charlie

november

Requirements: The code for word finding must be present only once in the code.

Hint: See commands sort, possibly sed or tr. It is recommended to transform all words to lowercase. Beware of punctuation characters in the input text (commas, periods, quotation marks, etc.).

Note: The ordering produced by the sort command can depend on the LANG shell variable (or LC_COLLATE, LC_CTYPE).
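
Because cat reads stdin when it is given no file operands, the word-finding code can be written once for both input modes. A short sketch with the assumed function name unique_words follows; the checks for missing files and the exit codes 2 and 3 are left out.

```shell
# Print the distinct lowercase words of the given files (or of stdin), sorted.
unique_words() {   # unique_words [FILE...]
    cat -- "$@" |
        LC_ALL=C tr -cs 'A-Za-z' '\n' |   # non-letters act as separators
        LC_ALL=C tr 'A-Z' 'a-z' |         # the case is not important
        LC_ALL=C sort -u |                # each word only once
        sed '/^$/d'                       # drop the empty word, if any
}
```

In the script, the first argument would be removed with shift and the remaining ones passed on, for instance unique_words "$@" > "$output".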

Task 9

Create a script for comparing two directories (recursively) and listing all differences (file names and directories present in only one directory or having different modification time or size). Stick to the following format:

<directory1>:<local_path> <size> <modification_time> <directory2>:<local_path> <size> <modification_time>

Size is the filesize in bytes and modification_time is the time of last modification in the following format:

YYYY-MM-DD HH:MM.

If the file does not exist, list only the directory and a colon (:).

Directory names are the two mandatory script arguments. If any of the directories does not exist, print an error message.

Example: Script output for comparing two directories work and archive:

work:a.txt 1024 2010-09-10 10:20 archive:

work:b.txt 20 2010-09-10 10:20 archive:b.txt 25 2010-09-10 10:20

work:nic.txt 100 2010-09-10 10:20 archive:nic.txt 100 2010-09-22 15:22

work: archive:neco.data 256 2011-01-02 10:17

Hint: See commands find, stat, possibly diff. Watch for excessive spaces in the output.
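
One half of an output line could be produced as sketched below. The sketch assumes GNU coreutils: stat -c %s for the size and date -r FILE for the modification time are GNU options and differ on other systems. The function name describe is an assumption for this example.

```shell
# Print "DIR:PATH SIZE YYYY-MM-DD HH:MM" for DIR/PATH, or "DIR:" if it is missing.
describe() {   # describe DIR LOCAL_PATH
    local dir="$1" path="$2" file="$1/$2"
    if [ -e "$file" ]; then
        # Assumes GNU stat and GNU date (see the note above).
        printf '%s:%s %s %s\n' "$dir" "$path" \
            "$(stat -c %s "$file")" "$(date -r "$file" '+%Y-%m-%d %H:%M')"
    else
        printf '%s:\n' "$dir"
    fi
}
```

A full line is then the two halves joined by a single space, for instance printf '%s %s\n' "$(describe work a.txt)" "$(describe archive a.txt)".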

Task 10

Create a script that traverses the directory provided (recursively) and creates a tar.gz archive of all files that match the mask provided. The script preserves the directory structure (relative to the directory provided; the archive must not contain higher-level directories). The script has 4 parameters:

An optional parameter -r that allows overwriting of an existing archive (if not specified, you must not overwrite the archive),
the directory to be traversed,
filename mask(s) (i.e.: *.java and/or A*txt*),
and the last argument is a path to the resulting archive.

If the resulting archive exists and the parameter -r is not specified, print an error message. If the parameter -r is specified and the archive is overwritten, print a warning (stderr). If the directory provided does not exist, print an error message.

Hint: See commands find, tar.
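
Combining several masks into one find expression and storing paths relative to the traversed directory might be done as in this sketch. The function name archive_matching and its argument order are assumptions; the -r logic and the messages are omitted, and file names containing newlines are not supported because the list is passed line by line.

```shell
# Archive all regular files below DIR that match any MASK into ARCHIVE.
archive_matching() {   # archive_matching DIR ARCHIVE MASK...
    local dir="$1" archive="$2" mask
    [ $# -ge 3 ] && [ -d "$dir" ] || return 1
    shift 2
    # Turn the masks into:  -o -name M1 -o -name M2 ...
    for mask do
        shift
        set -- "$@" -o -name "$mask"
    done
    shift   # drop the leading -o
    # find runs inside DIR, so the listed (and stored) paths are relative to it.
    ( CDPATH= cd -- "$dir" && find . -type f \( "$@" \) ) |
        tar -czf "$archive" -C "$dir" -T -
}
```

The masks must be quoted on the command line, for instance archive_matching src out.tar.gz '*.java' 'A*txt*', so that the shell does not expand them before find sees them.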

Task 11

Create a script for finding and checking the existence of C/C++ include files. Script parameter(s) is a list of filenames (source C/C++ files) to be searched (any number). The filename(s) can be preceded by any number of parameters in format -Ipath defining an include file search path. Look for include files also in the same directory as the source file.

In each source file, find the lines containing #include <file> or #include "file". For each file found, check its existence in any of the paths provided and, if found, print its full path to stdout. Files not found must be reported to stderr (together with the name of the source file).

Hint: See commands grep, possibly sed.
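
The two steps described above (extracting the included names, then probing the search paths) could be sketched like this. The function names list_includes and find_include are hypothetical, and the path is printed as the given directory joined with the name rather than converted to an absolute path.

```shell
# Print the file names used in #include <...> and #include "..." lines of FILE.
list_includes() {   # list_includes SOURCE_FILE
    sed -n 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*[<"]\([^>"]*\)[>"].*/\1/p' "$1"
}

# Print the first DIR/NAME that exists; return non-zero if none does.
find_include() {   # find_include NAME DIR...
    local name="$1" dir
    shift
    for dir do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}
```

The directory of the source file itself would simply be one more DIR argument, next to the paths collected from the -Ipath parameters.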

Task 12

Create a script that mimics the grep command behavior for a group of files in the directory provided (recursively) that match the mask provided. The mask can be specified in the following format: '*.c*', image*, etc. The script has 3 parameters:

Searched string (a regular expression),
file mask
a directory (for recursive search).

Mind that each parameter should be quoted ("").

The last parameter is optional: when not provided, use the current directory.

These three parameters can be preceded by the following switches (in any order):

-n: file name will be followed by a line number: <file>:<line_no>:<text_found>
-v: for inverse search: list files NOT containing the string

Report non-existent directories, insufficient permissions to stderr.

Example: Output example: (searching for the string 'for', mask: '*.c*')

src/test.cc:for (int i=0; i<count; i++) {

src/common/functions.c:process_data(--number_of_samples); decrease number before calling

Hint: See commands find and grep. An effective use of grep can make your work easier.

Note: We are aware that -v conflicts with the -v command line parameter for printing the variant number. If -v is the only parameter present, print the task number and exit. If there are more parameters (or the first one is not -v), implement the normal script functionality.
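
The core of this task can be delegated to find and grep, as the hint suggests. In the sketch below, the function name grep_tree is an assumption and only the -n switch is handled. Passing /dev/null as an extra file is a portable way to make grep always print the file name.

```shell
# Search files below DIR whose names match MASK for the regular expression PATTERN.
grep_tree() {   # grep_tree [-n] PATTERN MASK [DIR]
    local numbered=""
    if [ "$1" = "-n" ]; then numbered="-n"; shift; fi
    # $numbered is intentionally unquoted: when empty it must disappear entirely.
    find "${3:-.}" -type f -name "$2" -exec grep $numbered -e "$1" /dev/null {} +
}
```

For example, grep_tree -n 'for' '*.c*' src prints lines in the form file:line_no:text for the matching files below src.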

Task 13

Create a script for text highlighting. The script reads text from stdin and outputs the text to stdout. Odd parameters specify the highlight color ('r' = red, 'g' = green and 'b' = blue) and even parameters specify the searched strings (regular expressions). The number of strings searched is not limited. Do not set the background color. The priority of coloring is specified by the parameter order. Execution without any parameter is allowed (no coloring is made). The text must be processed by lines. Do not wait for the end of input.

Example: Script output, parameters used: r 'col[a-z]*\>' g 'text' b 'reg':

Create a script for text highlighting.

The script reads text from stdin... Odd parameters specify coloring ... color) and even parameters (regular expressions).

Hint: See commands sed, echo and grep. Use the following sequence for coloring: \033[0;31;40m (see bash coloring). Look at the -e argument of the echo command.
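
Coloring a single color/regex pair with sed could look like the following sketch. The function name highlight is hypothetical. Only the foreground part of the escape sequence is emitted, since the task asks not to set a background color. The sketch ignores two known limitations: a regular expression containing / would break the sed command, and a later pair may match inside an escape sequence inserted by an earlier one.

```shell
# Copy stdin to stdout, wrapping every match of REGEX in the given color.
highlight() {   # highlight COLOR REGEX    (COLOR is r, g or b)
    local code esc
    case "$1" in
        r) code=31 ;;
        g) code=32 ;;
        b) code=34 ;;
        *) return 1 ;;
    esac
    esc=$(printf '\033')   # the ESC character that starts an ANSI sequence
    # "&" stands for the matched text; "ESC[0m" resets the color afterwards.
    sed "s/$2/${esc}[0;${code}m&${esc}[0m/g"
}
```

Several pairs could be applied by chaining calls in a pipe, for instance highlight r 'col[a-z]*' | highlight g 'text'. Whether sed flushes its output after each line when writing to a pipe depends on the implementation, so this needs to be verified for the "do not wait for end of input" requirement.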

2015/11/03 17:40
