- Shell
- Prompt and commands
- Keyboard shortcuts and command-line editing
- Command history
- Commands: builtins, functions and executables
- Pipes, streams and redirection
- Environment variables
- Wildcards
- Aliases
- Arguments and escaping a string
- Processes
- Manual pages
- Useful commands
- File system
- Users, groups and permissions
The traditional way of using a Unix or Linux system is the command line. That is what you will be presented with when you log in to a computer via SSH (Secure Shell), for example. For skilled users, using the command line is generally very efficient compared to most GUIs. A basic level of proficiency with the command line can be considered a necessary part of general IT competency. This guide is meant to be a basic shell tutorial and explains commands and key concepts. It does not cover advanced topics such as shell scripting or administration.
Shell
The program that handles user input and interacts with programs and applications is called the shell. It is a command line interpreter that executes programs, expands wildcards and provides variables, conditional and loop statements and process management. There are many different shells that, even though they may differ in syntax or appearance, provide the same core functionality.
There are two pedigrees of shells, Bourne-type shells and C-shell derivatives. They differ in syntax and some features, but the functionality provided is very similar. This guide assumes for the most part a Bourne-type shell, but the C-shell syntax is also presented where necessary.
At Aalto, the default shell for new users is bash (Bourne-Again SHell). Before 2018 the default shell was zsh (Z-shell), and older users will still have that if they have not changed their shell. The two are mostly compatible. An even older, historical default shell was tcsh (TENEX C-shell). Users can change their shell using the chsh (CHange SHell) command. Of these shells, bash and zsh are Bourne shell variants, and tcsh is a C-shell variant.
Prompt and commands
bash:
tteekkari@kosh:~$
zsh:
tteekkari@kosh ~ %
tcsh:
kosh:~>
The shell presents the user with a command prompt and then waits for user input. The prompt is customisable in most shells, but the default prompts in the Aalto environment are these three.
The prompt relays information: by default, it shows the username, the computer hostname, the current directory and the end sign. In the Bourne prompts above, the username is "tteekkari", the hostname is "kosh", the current directory is "~" and the end sign is "$" or "%". The C-shell prompt does not display the username and uses ">" as the end sign. The end sign can also in some circumstances be "#", which is usually reserved for root shells.
Commands are typed after the prompt and executed using the Enter key.
A command typically consists of the program to be executed and its options and arguments.
The program part is the name of the program file to be executed. The options are special arguments that alter what the program does. The options are usually prepended with a dash ("-") or, in their longer form, with two dashes ("--"). The rest of the arguments are typically some kind of target, instruction or filename for the program to work on.
Consider this command:
tteekkari@kosh:~$ ls -l foo.txt
Here, after the prompt, ls is the program part, -l is an option to toggle long-style output, and foo.txt is a filename argument.
Keyboard shortcuts and command-line editing
The commands typed after the prompt form the command line.
Most shells provide keybindings and shortcuts and command line editing features. The keybindings are usually customisable by the user, so they are not set in stone, but the ones presented here work in the default Aalto environment.
The command line can, in most shells, be edited much like one would do in a text editor. The cursor can be moved using the left and right arrow keys and characters can be added by typing them in, or deleted using the Delete or Backspace keys. The Insert key toggles whether characters are overwritten or inserted at the cursor. There are also key combinations for different functions. Some of these key combinations are presented in the table below:
Key combination | Action |
---|---|
Ctrl-C | Terminate program |
Ctrl-Z | Stop program |
Ctrl-L | Clear screen |
Ctrl-S | Freeze screen |
Ctrl-Q | Unfreeze screen |
Ctrl-A | Go to the start of the line (Home) |
Ctrl-E | Go to the end of the line (End) |
Ctrl-H | Backspace |
Ctrl-M | Enter |
Ctrl-P | Previous command in history |
Ctrl-N | Next command in history |
Ctrl-W | Cut previous word |
Ctrl-Y | Paste, yank |
Ctrl-D | End-of-file, logout |
The key combinations are entered by holding the modifier key (in the cases above, Ctrl) and then pressing the other key.
For information on how to modify the keybindings, please see your shell's man page.
Command history
Shells usually store the command history. The history is also usually written into a file in the user's home directory. The filename depends on the shell (and its configuration), but the default filenames are ~/.bash_history for bash, ~/.zsh_history for zsh and ~/.history for tcsh.
The command history can be reviewed with the command history. The previous commands can also be recalled by using the up and down arrow keys or Ctrl-P and Ctrl-N. In bash and zsh, the history can also be searched by using Ctrl-R. This presents the user with a separate prompt that searches the command history.
The command history can also be recalled using the exclamation mark ("!"). In history substitution, the commands can be referenced either by their numbers, relative numbers ("the command I did X commands ago") or the beginning of their command line.
Consider the following command history:
tteekkari@kosh:~$ history
1000 history
1001 echo foo
1002 echo bar
1003 history
tteekkari@kosh:~$
To recall a command by its relative number ('The command 3 commands ago'):
tteekkari@kosh:~$ !-3
echo foo
foo
tteekkari@kosh:~$
To recall a command by its command line:
tteekkari@kosh:~$ !ec
echo foo
foo
tteekkari@kosh:~$
To recall a command by its absolute number:
tteekkari@kosh:~$ !1002
echo bar
bar
tteekkari@kosh:~$
More complex substitutions are also possible:
tteekkari@kosh:~$ echo !1002 !1001
echo echo bar echo foo
echo bar echo foo
tteekkari@kosh:~$
The command history can be cleared with the command history -c.
Commands: builtins, functions and executables
A shell can run three types of things: shell builtins, functions and regular executables. Most commands are implemented as separate executables (as per the Unix philosophy), but the shell usually has some of the functionality built in. This includes program flow statements used for scripting, the cd (change directory) command and the kill command.
Shells also support functions that are essentially shell scripts (or grouped commands) that can be executed using the function name. Functions can be made by the user, but writing them is not in the scope of this guide.
Anything not implemented as a builtin or a function is a separate executable, which is located somewhere in the file system.
When a command is executed, the shell first checks whether it is a function or a builtin. If it is, that function or builtin is run. If not, the shell will check the paths given in the environment variable PATH in order, and will run the first appropriately named executable it finds. If no executable is found this way, the shell will print an error message.
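The lookup order can be observed with a small experiment. The following is a minimal sketch, assuming a Bourne-type shell such as bash. The function deliberately reuses the name of the external date program for illustration only, and the command builtin is used to skip function lookup.

```shell
# Define a function with the same name as the external "date" program.
# (Shadowing "date" is only done here for illustration.)
date() {
    echo "this is the function"
}

# The shell finds the function first, so PATH is never searched.
function_result=$(date)

# "command" skips function lookup, so the executable found via PATH runs.
executable_result=$(command date +%Y)

# Remove the function again so that "date" means the executable.
unset -f date

echo "$function_result"
echo "$executable_result"
```

The first line printed comes from the function, the second (the current year) from the executable.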
The type of command (whether it is a builtin or not) can be checked by using type (bash and zsh) or where (tcsh).
To check whether a command is a builtin or not (bash, zsh):
tteekkari@kosh:~$ type -a kill
kill is a shell builtin
kill is /bin/kill
tteekkari@kosh:~$
And the same in tcsh:
kosh:~> where kill
kill is a shell built-in
/bin/kill
kosh:~>
Here we can see that kill is implemented both as a builtin and as a separate executable. If we try to run these, we see that they are different:
tteekkari@kosh:~$ kill
kill: usage: kill [-s sigspec | -n signum | -sigspec] pid | jobspec ... or kill -l [sigspec]
tteekkari@kosh:~$ /bin/kill
Usage:
kill [options] <pid> [...]
Options:
<pid> [...] send signal to every <pid> listed
-<signal>, -s, --signal <signal>
specify the <signal> to be sent
-l, --list=[<signal>] list all signal names, or convert one to a name
-L, --table list all signal names in a nice table
-h, --help display this help and exit
-V, --version output version information and exit
For more details see kill(1).
tteekkari@kosh:~$
A separate command, which, shows the path of the executable that would be run for a given command name. However, as it is a separate executable, it does not see shell builtins.
Using which:
tteekkari@kosh:~$ which kill
/bin/kill
tteekkari@kosh:~$
Pipes, streams and redirection
When the shell runs programs, every process has by default three standard streams it can use to communicate with the user or with other programs. These are called standard input, standard output and standard error and are also referred to by their shorter name and/or their file descriptor numbers stdin (0), stdout (1) and stderr (2). By default, standard output and standard error are printed on the terminal and stdin is read from the terminal. The shell can be used to redirect these streams so that, for example, the standard output of one program is used as the standard input of another program. This is called piping. The streams can also be directed to and/or read from files. This is called redirection.
Redirection into or from a file
A program's output can be redirected into a file using a greater-than sign (>). The file given after the sign is overwritten and the output of the program will be redirected into it.
Redirecting output into a file:
tteekkari@kosh:~$ echo "foo" > foo.txt
tteekkari@kosh:~$ cat foo.txt
foo
tteekkari@kosh:~$ echo "bar" > foo.txt
tteekkari@kosh:~$ cat foo.txt
bar
tteekkari@kosh:~$
The cat (conCATenate) command used above prints the contents of the files given as arguments, one after another. When it is used with only one argument, it just prints the contents of the single file.
By using two greater-than signs (>>), it is possible to append to a file instead of overwriting it.
Appending into a file:
tteekkari@kosh:~$ echo "foo" > foo.txt
tteekkari@kosh:~$ cat foo.txt
foo
tteekkari@kosh:~$ echo "bar" >> foo.txt
tteekkari@kosh:~$ cat foo.txt
foo
bar
tteekkari@kosh:~$
When used on its own, this kind of redirection only redirects the standard output stream. It is also possible to redirect the standard error stream. To do that, the stream number must be placed directly before the greater-than sign (for example 2>). It can often be beneficial to suppress error messages. To accomplish that, the standard error stream can be redirected to /dev/null, which is a special device file that discards anything written into it.
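Suppressing error messages could look like the following minimal sketch. The path /nonexistent-directory is a made-up name that is assumed not to exist.

```shell
# /nonexistent-directory is a hypothetical path that is assumed not to exist.
# The error message from ls goes to stderr, which is discarded here.
listing=$(ls /nonexistent-directory 2>/dev/null)
status=$?

# Nothing was printed on stdout, but the exit status still shows the failure.
echo "output: '$listing', exit status: $status"
```

Note that discarding the message does not hide the failure itself: the exit status of the command is still non-zero.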
Separating the standard output and the standard error:
tteekkari@kosh:~$ curl http://www.aalto.fi > foo2.txt 2> foo3.txt
tteekkari@kosh:~$ cat foo2.txt
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://www.aalto.fi/fi/">here</a>.</p>
<hr>
<address>Apache/2.4.7 (Ubuntu) Server at www.aalto.fi Port 80</address>
</body></html>
tteekkari@kosh:~$ cat foo3.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 308 100 308 0 0 1036 0 --:--:-- --:--:-- --:--:-- 1037
tteekkari@kosh:~$
The curl command used above downloads a URL (Uniform Resource Locator, a 'web address'). It prints the contents of the URL on the standard output and a status display on the standard error.
It is also possible to combine the standard output and standard error streams. The syntax for combining the streams differs a bit between the shell flavours.
Combining stdout and stderr, bash and zsh:
tteekkari@kosh:~$ curl http://www.aalto.fi >foo2.txt 2>&1
tteekkari@kosh:~$ cat foo2.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 308 100 308 0 0 955 0 --:--:-- --:--:-- --:--:-- 953
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://www.aalto.fi/fi/">here</a>.</p>
<hr>
<address>Apache/2.4.7 (Ubuntu) Server at www.aalto.fi Port 80</address>
</body></html>
and tcsh:
kosh:~> curl http://www.aalto.fi >&foo2.txt
kosh:~> cat foo2.txt
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 308 100 308 0 0 1789 0 --:--:-- --:--:-- --:--:-- 1790
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://www.aalto.fi/fi/">here</a>.</p>
<hr>
<address>Apache/2.4.7 (Ubuntu) Server at www.aalto.fi Port 80</address>
</body></html>
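In bash and zsh, redirections are processed from left to right, so the position of 2>&1 matters. The sketch below illustrates this with a hypothetical helper function called both that writes one line to each stream. The file names are likewise made up.

```shell
workdir=$(mktemp -d)

# Hypothetical helper that writes one line to each stream.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout is pointed at the file first, then stderr is made a copy of it:
# both lines end up in the file.
both > "$workdir/combined.txt" 2>&1

# Reversed order: stderr becomes a copy of the *current* stdout (captured
# here by the command substitution) before stdout is moved to the file.
leaked=$(both 2>&1 > "$workdir/stdout_only.txt")

cat "$workdir/combined.txt"
echo "leaked: $leaked"
```

In the second call only the stdout line reaches the file, which is why 2>&1 is normally written last.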
Standard input for a program can also be read from a file. This can be done by using the less-than sign (<) followed by a filename. The input for the program is then read from the file given.
tteekkari@kosh:~$ cat foo.txt
foo
tteekkari@kosh:~$ tr a-z A-Z <foo.txt
FOO
tteekkari@kosh:~$
In this example the tr (TRanslate) command translates the character set given in the first argument into the character set given in the second argument, which in this case means it changes small letters into capital letters.
Here documents
A here document is a special kind of redirection in that the text to be redirected is literally typed 'here'. Here documents can be used for example in scripts to include multiple lines of text in the script without having to put it in a separate file. The syntax for a here document is two less-than signs followed by a delimiter. Any text after this is treated as standard input for the command until a line containing only the delimiter is found.
An example of a here document:
tteekkari@kosh:~$ tr a-z A-Z << EOF
> don't
> panic
> EOF
DON'T
PANIC
tteekkari@kosh:~$
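In Bourne-type shells, variables inside a here document are expanded unless the delimiter is quoted. A minimal sketch of the difference follows. The variable name is made up for the example.

```shell
name="world"

# Unquoted delimiter: variables inside the here document are expanded.
expanded=$(cat << EOF
hello $name
EOF
)

# Quoted delimiter: the text is passed on verbatim.
literal=$(cat << 'EOF'
hello $name
EOF
)

echo "$expanded"
echo "$literal"
```

The quoted form is convenient when the text itself contains dollar signs that should not be touched by the shell.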
Piping commands
As mentioned earlier, the shell can be used to redirect the output of one program to be used as input of another program. This can be done with the pipe character (|). As per the Unix philosophy, programs usually read and print plain text, which means that using programs to modify other programs' output is a pretty common thing to do. Perhaps the most useful tool that depends on this functionality is grep, which searches its input for a pattern given as an argument and prints the matching lines.
Piping output:
tteekkari@kosh:~$ history
1000 history
1001 echo foo
1002 echo bar
1003 history
tteekkari@kosh:~$ history | grep foo
1001 echo foo
1004 history|grep foo
tteekkari@kosh:~$
Here the output of history is piped to the grep command which prints the lines matching the pattern foo.
The tr example in the redirection section could also be made using a pipe:
tteekkari@kosh:~$ cat foo.txt
foo
tteekkari@kosh:~$ cat foo.txt | tr a-z A-Z
FOO
tteekkari@kosh:~$
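Pipes are not limited to two programs. As a small sketch, the following chains three standard tools. The input words are made up for the example.

```shell
# Three programs chained together: sort orders the lines, uniq drops
# adjacent duplicates, and tr converts the result to upper case.
result=$(printf 'pear\napple\npear\n' | sort | uniq | tr a-z A-Z)

echo "$result"
```

Each program only sees the output of the previous one, so the stages can be developed and tested one at a time.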
Environment variables
Every process has an environment which consists of environment variables. The environment of a process is inherited by its subprocesses. The shell can set, modify and unset environment variables, and they can be temporary or permanent. The variables usually affect the shell and/or program(s) and they are one way of configuring the shell or an application.
In bash and zsh, variables can be set using the syntax VARIABLE=value and in tcsh by using setenv VARIABLE value. In bash and zsh, a variable also has to be exported, which is done using the export command. If the variable is not exported, it remains an internal variable of the shell and is not inherited by subprocesses (programs executed by the shell).
Setting an environment variable, bash and zsh:
tteekkari@kosh:~$ VISUAL=nano
tteekkari@kosh:~$ export VISUAL
tteekkari@kosh:~$
or directly
tteekkari@kosh:~$ export VISUAL=nano
tteekkari@kosh:~$
Setting an environment variable, tcsh:
kosh:~> setenv VISUAL nano
The environment variables can then be referenced in the command line as $VARIABLE or ${VARIABLE}. The entire environment can be printed using env or printenv. The environment of a running process can be viewed by looking at the file /proc/<pid>/environ. The value of a single environment variable can be checked using either printenv or just echo.
Printing the value of an environment variable:
tteekkari@kosh:~$ export VISUAL=nano
tteekkari@kosh:~$ printenv VISUAL
nano
tteekkari@kosh:~$ echo ${VISUAL}
nano
tteekkari@kosh:~$
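The difference between an exported variable and an internal shell variable can be checked by asking a child process what it sees. The following is a minimal sketch for bash and zsh. The variable names MY_LOCAL, MY_EXPORTED and MY_TEMP are made up and assumed not to be set beforehand.

```shell
# MY_LOCAL, MY_EXPORTED and MY_TEMP are hypothetical variable names.
MY_LOCAL="shell only"
export MY_EXPORTED="inherited"

# Each "sh -c" starts a child process and prints what it can see.
child_local=$(sh -c 'echo "${MY_LOCAL:-not set}"')
child_exported=$(sh -c 'echo "${MY_EXPORTED:-not set}"')

# A variable can also be set for a single command only.
child_temp=$(MY_TEMP="just once" sh -c 'echo "$MY_TEMP"')

echo "$child_local / $child_exported / $child_temp / ${MY_TEMP:-not set}"
```

The child sees only the exported variable, and the one-off assignment does not remain set in the calling shell.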
Some useful environment variables:
Variable | Meaning |
---|---|
HOME | Path to the user's home directory |
PATH | A list of paths the shell will search for executables |
DISPLAY | Tells any graphical applications the address of the windowing server |
VISUAL, EDITOR | Determines the editor used when a script or a program starts an editor |
LD_LIBRARY_PATH | A list of paths containing shared libraries used by programs |
PS1 | Used to set the prompt string (*bash*, *zsh*) |
LANG, LC_* | Variables used to set the locale |
TERM | The terminal (emulator) type |
Wildcards
The shell can complete partial file names using wildcards. The wildcards can be used as a shorthand for typing multiple file names or in situations where the exact file name is not known or cannot even be determined. The most common wildcard characters are asterisk (*) and question mark (?). The asterisk replaces any string (even an empty one) and the question mark replaces any single character.
Consider the following list of files:
tteekkari@kosh:~$ ls
a ab abcd b c cd d
tteekkari@kosh:~$
Now the wildcards can be used as follows:
tteekkari@kosh:~$ ls a*
a ab abcd
tteekkari@kosh:~$ ls a?
ab
tteekkari@kosh:~$ ls ??
ab cd
tteekkari@kosh:~$ ls *d
abcd cd d
tteekkari@kosh:~$ ls a*?d
abcd
In the first example a* completes as a, ab and abcd. The asterisk replaces an empty string in a, 'b' in ab and 'bcd' in abcd. In the second example a? only hits ab since the question mark replaces exactly one character. In the third example ?? hits ab and cd as they are the only file names with exactly two characters. In the fourth example *d hits abcd, cd and d, much like in the first example. The wildcards can also be used in the middle of a filename, or multiple times, as the final example illustrates. There, a*?d hits abcd: the asterisk matches b and the question mark matches c.
Note that the wildcards are expanded by the shell, so in these examples ls might get one or multiple arguments, depending on how many files the wildcards apply to.
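That the expansion happens in the shell can be made visible by counting the arguments a command receives. The sketch below recreates the file list used above in a temporary directory. The helper function count_args is made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a ab abcd b c cd d

# Hypothetical helper that reports how many arguments it received.
count_args() {
    echo "$#"
}

star_count=$(count_args a*)       # receives a, ab and abcd
two_chars=$(echo ??)              # receives ab and cd
quoted_count=$(count_args 'a*')   # quoting disables the expansion

echo "$star_count $two_chars $quoted_count"
cd "$OLDPWD" || exit 1
```

The helper never sees the asterisk unless it is quoted. It only sees the file names the shell substituted for it.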
Aliases
Shells usually allow aliases, which can be used to replace text in the beginning of commands. An alias could for example be used to always set certain options for certain commands.
The syntax for setting and removing aliases is different in bash and tcsh. After they have been set, they can be used just like normal commands.
Setting, using and removing an alias, bash and zsh:
tteekkari@kosh:~$ ls
a b
tteekkari@kosh:~$ alias rm='rm -i'
tteekkari@kosh:~$ rm a
rm: remove regular empty file 'a'? y
tteekkari@kosh:~$ unalias rm
tteekkari@kosh:~$ rm b
tteekkari@kosh:~$
Setting, using and removing an alias, tcsh:
kosh:~> ls
a b
kosh:~> alias rm 'rm -i'
kosh:~> rm a
rm: remove regular empty file 'a'? y
kosh:~> unalias rm
kosh:~> rm b
kosh:~>
Here the command rm (ReMove) is used to delete files. With the option -i the rm command asks for extra confirmation for each argument — without the -i the file is just removed silently. This is a somewhat common alias to set on an administrator (root) account where an accidental file deletion can cause major damage to the operating system.
When using aliases, any arguments given on the command line are simply appended after the text of the alias. If an argument needs to be placed somewhere in the middle of the command, an alias will not work and a shell function or a script has to be used instead.
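As an illustration of a case an alias cannot handle, here is a minimal sketch of a function for Bourne-type shells in which the argument is used twice. The function name backup and the file names are made up for the example.

```shell
# Hypothetical helper: copy FILE to FILE.bak.
# The argument is needed twice, which an alias cannot express.
backup() {
    cp -- "$1" "$1.bak"
}

# Try it out on a scratch file.
workdir=$(mktemp -d)
echo "data" > "$workdir/notes.txt"
backup "$workdir/notes.txt"

ls "$workdir"
```

Inside the function, $1 refers to the first argument given to it.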
Arguments and escaping a string
As stated earlier, commands are often given one or more arguments or options. The command receives the arguments as a list consisting of all the arguments of the command line split at spaces. This can in certain cases behave counterintuitively.
Specifically, if an argument contains a space, it needs to be escaped. The same applies for wildcard characters if they are to be used verbatim and for some other characters that have a special meaning for the shell interpreter. Strings can be escaped in three ways. One of the ways involves escaping every special character separately by prepending it with a backslash (\) character. A backslash can itself be escaped (\\). The other two ways are to enclose the string in quotation marks, either using double quotes (") or single quotes ('), which behave differently.
When using single quotes, the string can contain any characters except the single quotes themselves, and everything is passed verbatim without expanding any variables, etc.
If the string is enclosed in double quotes, the special characters $ (dollar sign), \ (backslash) and often ! (exclamation mark) have special meanings. The backslash can be used to escape characters, the exclamation mark to reference command history and the dollar sign to reference variables.
Consider the following echo statements:
tteekkari@kosh:~$ echo Hello World
Hello World
tteekkari@kosh:~$ echo Hello      World
Hello World
tteekkari@kosh:~$ echo "Hello World"
Hello World
tteekkari@kosh:~$ echo "Hello      World"
Hello      World
tteekkari@kosh:~$ echo Hello\ World
Hello World
tteekkari@kosh:~$ echo Hello\ \ \ \ \ \ World
Hello      World
Without quotes or any escaping, echo receives two arguments, Hello and World, no matter how many spaces there are in between; it then prints the arguments separated by a single space. When quotes are used, echo receives one argument, which is the string with a single space in the first case and the string with all of its spaces preserved in the second. Similarly, backslashes can be used to escape the spaces, in which case echo only receives one argument.
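The number of arguments a command actually receives can be checked with a small helper. This is a minimal sketch for Bourne-type shells. The function name count_args is made up for the example.

```shell
# Hypothetical helper that reports how many arguments it received.
count_args() {
    echo "$#"
}

unquoted=$(count_args Hello      World)    # split at the spaces
quoted=$(count_args "Hello      World")    # one argument, spaces kept
escaped=$(count_args Hello\ World)         # one argument, space escaped

echo "$unquoted $quoted $escaped"
```

Only the unquoted form is split into two arguments.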
The difference between single and double quotes:
tteekkari@kosh:~$ echo ${LANG}
en_US.UTF-8
tteekkari@kosh:~$ echo *
file1 file2 file3
tteekkari@kosh:~$ echo \*
*
tteekkari@kosh:~$ echo '${LANG}'
${LANG}
tteekkari@kosh:~$ echo '\${LANG}'
\${LANG}
tteekkari@kosh:~$ echo '"${LANG}"'
"${LANG}"
tteekkari@kosh:~$ echo "*"
*
tteekkari@kosh:~$ echo "${LANG}"
en_US.UTF-8
tteekkari@kosh:~$ echo "\${LANG}"
${LANG}
tteekkari@kosh:~$ echo "\\\${LANG}"
\${LANG}
tteekkari@kosh:~$ echo "\"${LANG}\""
"en_US.UTF-8"
tteekkari@kosh:~$ echo "\"\${LANG}\""
"${LANG}"
tteekkari@kosh:~$ echo '*'
*
When a string is enclosed in double quotes, any environment variables are substituted and the backslash can be used to escape characters. Wildcards are not substituted. When using single quotes, all characters are treated literally and no substitutions or escapes are done. This also means that when using single quotes, the single quote character cannot be present in the string.
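A common way around the single quote limitation in Bourne-type shells is to close the string, add an escaped quote and open a new string. A minimal sketch:

```shell
# Double quotes can contain a single quote directly.
with_double="don't panic"

# With single quotes, the string has to be closed, an escaped quote added,
# and a new single-quoted string opened: 'don' + \' + 't panic'.
with_single='don'\''t panic'

echo "$with_double"
echo "$with_single"
```

Adjacent quoted pieces with no space in between are joined into a single argument, so both variables hold the same text.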
Processes
A process is a running program. As a concept, a process consists of the program code, memory areas used by the program, the call stack (which includes the function calls, memory and parameters), operating system resources (file descriptors, environment, etc), process ownership information, process permissions and the processor state, also known as the context. In short, a process contains all the run-time information on a running program.
As Linux is a multitasking operating system, there are always multiple processes running. From the point of view of the operating system, a process can be in one of three states. A process is either running, waiting or blocked. These states are directly related to whether the process is being run on the CPU at any given time. The transitions between states are controlled by the operating system's scheduler.
Processes can be listed with the command ps.
tteekkari@kosh:~$ ps
PID TTY TIME CMD
22521 pts/129 00:00:00 bash
22629 pts/129 00:00:00 ps
tteekkari@kosh:~$ ps ux
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
tteekka+ 22500 0.7 0.0 78268 9300 ? Ss 20:02 0:00 /lib/systemd/sy
tteekka+ 22501 0.0 0.0 300284 8436 ? S 20:02 0:00 (sd-pam)
tteekka+ 22520 0.0 0.0 142880 3044 ? R 20:02 0:00 sshd: tteekkari
tteekka+ 22521 1.0 0.0 28568 5640 pts/129 Ss 20:02 0:00 -bash
tteekka+ 22578 0.0 0.0 24376 2496 ? Ss 20:02 0:00 krenew -b -K 60
tteekka+ 22632 0.0 0.0 43172 3604 pts/129 R+ 20:02 0:00 ps ux
tteekkari@kosh:~$
The columns in the listing are as follows:
Column | Meaning |
---|---|
USER | The username that owns the process |
PID | Process ID number |
%CPU, %MEM | Percentage of CPU time and memory used by the process |
VSZ | Virtual memory size in kB |
RSS | Resident set size, non-swapped physical memory used in kB |
TTY | The terminal device assigned for the process |
STAT | Status of the process (S = sleeping, R = running, D = blocked) |
START, TIME | When the process was started and how much CPU time it has used |
COMMAND | Command line of the process |
There are also a few other statuses for the STAT field and there are a lot more fields one can choose from. The best resource for information on those is the ps(1) man page.
For a more continuous watching of processes there are the process monitors top and htop. Of these, htop is newer and more colourful, but the older top is also perfectly workable.
The output of top:
The output of top includes some general information on the computer load, uptime and memory usage. In addition to that, there is a periodically updated process listing that contains pretty much the same fields seen in ps output. The new fields are the PR and NI fields, which are the PRiority of the process and a NIce value. Processes with a higher nice value yield to other processes and it is encouraged to nice any long-term CPU-intensive processes to take other users into account.
Processes can be started nice'd with the command nice. The nice value of a running process can be altered with the command renice. Users can only renice their own processes.
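As a rough sketch of how these two commands are typically used: the sleep command below is only a stand-in for a long-running job, and the nice values 10 and 5 are arbitrary choices for the example.

```shell
# Run a command with its nice value raised by 10 (lower priority).
result=$(nice -n 10 sh -c 'echo "finished at low priority"')
echo "$result"

# Change the nice value of an already running process of our own.
# "sleep 30" is only a stand-in for a long-running job.
sleep 30 &
pid=$!
renice -n 5 -p "$pid" > /dev/null
renice_status=$?

# Clean up the stand-in job.
kill "$pid"
```

A normal user can usually only make their processes nicer, not less nice.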
Background and foreground processes, job control
A shell can run multiple processes at once. There can be one foreground process, which receives the keyboard input from the terminal. The rest of the processes are suspended or in the background. Background processes print to the terminal, but receive no keyboard input.
A shell groups the processes running in it into jobs. Running jobs can be listed using the command jobs. The job numbers are not visible outside the single shell session and will only work in the same shell session.
A process can be started in the background by following its command line with the ampersand symbol (&).
When a process is started to run in the background, its job number and PID (process id) are printed and the process is started. Jobs can be brought to foreground using the command fg (foreground) and a stopped process can be sent to background using the bg command (background).
A foreground process can be stopped by pressing Ctrl-Z. The execution will be halted and the shell is resumed. The execution can then be resumed by either sending the process to the background using bg or by bringing it back to front using fg. The command fg can also be used to bring a process straight to the foreground from the background.
tteekkari@kosh:~$ firefox &
[1] 31087
tteekkari@kosh:~$ jobs
[1]+ Running firefox &
tteekkari@kosh:~$ fg
firefox
^Z
[1]+ Stopped firefox
tteekkari@kosh:~$ jobs
[1]+ Stopped firefox
tteekkari@kosh:~$ bg
[1]+ firefox &
tteekkari@kosh:~$ fg
firefox
^C
Exiting due to channel error.
Exiting due to channel error.
Exiting due to channel error.
tteekkari@kosh:~$
Here Firefox is first started in background using the ampersand. It gets the job number 1 and PID 31087. Then when jobs are listed using jobs, the running Firefox process is listed. The process is then brought to foreground using fg and then stopped using Ctrl-Z. Stopping the job will print the job number, its status (Stopped) and the process command line. The jobs listing will reflect this also. The process is then resumed in the background using bg and then brought to foreground using fg and finally terminated using Ctrl-C.
The commands bg and fg as well as the shell builtin version of kill can also take the job number as an argument. The job number can be given in the form %n where n is the job number. The Firefox process in the previous example could thus have been killed using kill %1.
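Interactive job control with fg and bg is normally not available inside scripts, but background processes are. As a minimal sketch for Bourne-type shells, the special parameter $! holds the PID of the most recently started background process and the wait builtin waits for it to finish. The sleep command is only a stand-in for real work.

```shell
# Start a stand-in job in the background; the shell continues immediately.
sleep 1 &
bg_pid=$!

echo "started background process $bg_pid"

# wait blocks until the given process exits and returns its exit status.
wait "$bg_pid"
bg_status=$?

echo "background process exited with status $bg_status"
```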
Signals
Processes can be controlled using signals. Signals are operating system messages that programs must handle when received. If the program doesn't have a custom signal handler, a default action is taken. For most signals, the default action is to terminate the process. Signals can also be sent by users with the command kill even though most signals originate from the operating system.
The kill command is usually a shell builtin so that the shell does not need to spawn a new process for sending the signal. This can be very beneficial in a situation where the operating system process table is full and new processes cannot be spawned. It is also available as a separate executable that can be called directly as /bin/kill. The separate executable can handily list the available signals.
tteekkari@kosh:~$ /bin/kill --table
1 HUP 2 INT 3 QUIT 4 ILL 5 TRAP 6 ABRT 7 BUS
8 FPE 9 KILL 10 USR1 11 SEGV 12 USR2 13 PIPE 14 ALRM
15 TERM 16 STKFLT 17 CHLD 18 CONT 19 STOP 20 TSTP 21 TTIN
22 TTOU 23 URG 24 XCPU 25 XFSZ 26 VTALRM 27 PROF 28 WINCH
29 POLL 30 PWR 31 SYS
tteekkari@kosh:~$
There are 31 different signals and they are numbered. The signals are usually referred to with their names prepended with "SIG". Some of the more useful signals to a normal user are SIGHUP, SIGINT, SIGKILL, SIGUSR1, SIGUSR2, SIGTERM, SIGCONT, SIGSTOP and SIGTSTP. All the signals are listed and described on the signal(7) man page.
With this information, we can understand some of the keyboard shortcuts better. For example, when Ctrl-C is pressed, the foreground process receives the signal SIGINT (2, Interrupt from keyboard), which by default ends the program. Ctrl-Z sends the foreground process the signal SIGTSTP (20, Stop typed at terminal), which will by default stop the program. Then when the process is fg'd or bg'd, it receives the signal SIGCONT (Continue if stopped), and continues execution.
A program can process signals using a signal handler. Programs can also block or ignore signals if the programmer so chooses. There are, however, two signals that cannot be handled in a program. They are SIGKILL (9) and SIGSTOP. SIGKILL always kills the program right away (this should usually be used as a last resort because programs do not get a chance to exit in a controlled manner and might skip cleanup steps, for example). SIGSTOP stops the program execution.
Sometimes if the process is waiting for some OS resource or system call (in uninterruptible sleep or blocking), it will not be terminated even by SIGKILL until it gets that resource. Unfortunately, there is no magic way of fixing these situations; sometimes if a process is in a really bad state, the only thing that helps is a reboot.
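In Bourne-type shells, a signal handler of the kind mentioned above can be installed with the trap builtin. The following is a minimal sketch in which a child shell installs a handler for SIGTERM and then sends the signal to itself. The messages are made up for the example.

```shell
# The child shell installs a handler for SIGTERM and then signals itself.
# Inside the single quotes, $$ is the PID of the child shell.
output=$(sh -c '
    trap "echo caught SIGTERM; exit 0" TERM
    kill -TERM $$
    echo "this line is not reached"
')

echo "$output"
```

Without the trap line, the default action would apply and the child shell would simply be terminated.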
The most common use for signals is killing processes. This is usually accomplished by sending a process a SIGTERM signal, which is the default signal kill will send if no signal is specified. The process can be specified either by its PID or its job number if it is running in the same shell.
Killing processes:
tteekkari@kosh:~$ cat &
[1] 18779
tteekkari@kosh:~$ kill %1
tteekkari@kosh:~$
[1]+ Terminated cat
Killing from another terminal:
tteekkari@kosh:~$ kill <pid>
tteekkari@kosh:~$
In the terminal with the process being killed, this will look like:
tteekkari@kosh:~$ cat
Terminated
tteekkari@kosh:~$
To send other signals, the signal can be specified by using either the name or the number, thus
$ kill -9 <pid>
$ kill -KILL <pid>
are equivalent.
If -1 is used for the PID (or one of the PIDs, as kill also works for a list of multiple PIDs), kill will send the signal to all of the processes it is allowed to signal. For a normal user, this means all the processes running as the user.
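In bash and other Bourne-type shells, the signal that ended a process can be seen in its exit status, which is 128 plus the signal number. A minimal sketch, using sleep as a stand-in for a real program:

```shell
# Terminate a stand-in process with the default signal (SIGTERM, 15).
sleep 30 &
pid=$!
kill "$pid"
wait "$pid"
term_status=$?    # 128 + 15

# Do the same with SIGKILL (9), given by name.
sleep 30 &
pid=$!
kill -KILL "$pid"
wait "$pid"
kill_status=$?    # 128 + 9

echo "$term_status $kill_status"
```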
Processes and the /proc file system
In Linux, processes can be examined via the /proc file system. This is also how the tools described earlier (ps, top, etc.) actually work. It is a pseudo file system, generated 'on the fly' by the operating system kernel, and the files are thus not stored anywhere. Every process has a directory under /proc. The name of the directory is the process's PID.
tteekkari@kosh:~$ touch foo.txt
tteekkari@kosh:~$ tail -f foo.txt &
[1] 5866
tteekkari@kosh:~$ cd /proc/5866
tteekkari@kosh:/proc/5866$ ls
attr exe mounts projid_map status
autogroup fd mountstats root syscall
auxv fdinfo net sched task
cgroup gid_map ns schedstat timers
clear_refs io numa_maps sessionid timerslack_ns
cmdline limits oom_adj setgroups uid_map
comm loginuid oom_score smaps wchan
coredump_filter map_files oom_score_adj smaps_rollup
cpuset maps pagemap stack
cwd mem patch_state stat
environ mountinfo personality statm
tteekkari@kosh:/proc/5866$ ls -go fd/
total 0
lrwx------ 1 64 Dec 5 11:16 0 -> /dev/pts/304
lrwx------ 1 64 Dec 5 11:16 1 -> /dev/pts/304
lrwx------ 1 64 Dec 5 11:16 2 -> /dev/pts/304
lr-x------ 1 64 Dec 5 11:16 3 -> /m/home/home6/62/tteekkari/unix/foo.txt
tteekkari@kosh:/proc/5866$ cat cmdline
tail-ffoo.txttteekkari@kosh:/proc/5866$ kill %1
The /proc entries contain a lot of information, but some of the more useful files are: cmdline, which shows the command line for the process; the directory fd, which lists the open file descriptors (just think files) of the process; and the environ file, which lists the environment for the process. The arguments in cmdline are separated by null characters and the file has no trailing newline, which is why the output above appears run together with the next prompt. The cwd symlink points to the process's CWD (current working directory). Even though a user can see most of the information for their own processes, not all of it is available for processes owned by other users. Access to the files and directories is controlled by regular file permissions.
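A minimal sketch of reading these entries in a more readable form, assuming a Linux system. It inspects the shell's own process through the special parameter $$ and uses tr to turn the null separators into spaces or newlines.

```shell
#!/bin/sh
pid=$$   # PID of the shell running this script

# cmdline: arguments are separated by null characters
cmd=$(tr '\0' ' ' < "/proc/$pid/cmdline")
echo "command line: $cmd"

# cwd: symbolic link to the current working directory
cwd=$(readlink "/proc/$pid/cwd")
echo "working directory: $cwd"

# fd: one symbolic link per open file descriptor
ls -l "/proc/$pid/fd"

# environ: NAME=value pairs, also null-separated; show the first three
tr '\0' '\n' < "/proc/$pid/environ" | head -n 3
```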
Manual pages
Most shell commands come with a corresponding manual page. Manual pages can be viewed with the man command. The syntax is man [section] page_name. For example, the manual page of man can be viewed with the command man man. In texts and on the pages themselves, pages are referred to as page_name(#), where page_name is the name of the page and the number is the section the page is in.
There are seven sections of manual pages, which date back to the original Unix Programmer's Manual from 1971. The sections become relevant when there is a page by the same name in multiple sections. One such page is printf, which is an executable in /usr/bin/printf as well as a C function. man printf opens the page printf(1), which describes the shell command, whereas man 3 printf opens the page describing the C function.
Man page sections:
Number | Description | Examples |
---|---|---|
1 | Commands | pwd, ls, tr |
2 | System calls | fork, socket, chdir |
3 | Subroutines | scanf, sin, strtok |
4 | Special files | null, mem, rtc |
5 | File formats | hosts, shadow, fstab |
6 | Games | nethack |
7 | Miscellaneous | signal, capabilities, cgroups |
Man pages can be searched using the command apropos or man -k. They both do the same thing: they print the names and short descriptions of all the man pages whose name or description matches the search term. This is a good way of finding a command whose name you cannot remember, by searching for what the command does.
For example, to find all the man pages that deal with permissions:
tteekkari@kosh:~$ apropos permissions
access (2) - check user's permissions for a file
chmod (2) - change permissions of a file
dh_fixperms (1) - fix permissions of files in package build directories
dh_testroot (1) - ensure that a package is built with necessary level of...
eaccess (3) - check effective user's permissions for a file
euidaccess (3) - check effective user's permissions for a file
faccessat (2) - check user's permissions for a file
faked (1) - daemon that remembers fake ownership/permissions of fi...
faked-sysv (1) - daemon that remembers fake ownership/permissions of fi...
faked-tcp (1) - daemon that remembers fake ownership/permissions of fi...
fchmod (2) - change permissions of a file
fchmodat (2) - change permissions of a file
ioperm (2) - set port input/output permissions
WWW::RobotRules (3pm) - database of robots.txt-derived permissions
XF86VidModeGetPermissions (3) - Extension library for the XFree86-VidMode X e...
tteekkari@kosh:~$
As many pieces of software introduce changes between versions, or there might be several (mostly) interchangeable flavours of any given program, it is usually a good idea to consult the local manual page instead of searching for one on the WWW. The locally installed manual pages come with the software and correspond to the version and flavour that is actually present on the computer.
Useful commands
This is a partial (but long) list of useful commands grouped by categories.
HELP! | |
---|---|
man | (manual) Command manual pages, 'man <command>' |
info | Slightly longer manual pages for some programs, 'info <command>' |
apropos | Search for a string in manual pages |
Handling files | |
---|---|
ls | (list) Lists directory contents or individual files |
cd | (change directory) Changes the current working directory |
pwd | (print working directory) Prints the current working directory |
ln | (link) Creates a link |
cat | (concatenate) Prints contents of files |
more | Prints file contents paginated |
less | Less is more. Improved version of more |
mkdir | (make directory) Creates a directory |
cp | (copy) Copies files |
mv | (move) Moves and renames files |
rm | (remove) Deletes files |
rmdir | (remove directory) Deletes directories |
chmod | (change mode) Changes files' permission bits |
chown | (change owner) Changes files' owner and group |
chgrp | (change group) Changes files' group |
dd | Copies and converts data block by block; a Swiss army knife for raw data |
scp | (secure copy) Copies files over an SSH connection |
rsync | Copies files and directory trees locally and/or between computers |
Tools | |
---|---|
wc | (word count) Counts characters, words and/or lines |
cut | Selects parts of lines |
paste | Combines lines from multiple files |
head | Prints the beginning of input/files |
tail | Prints the end of input/files |
fmt | (format) Formats input for different text widths |
grep | Prints matching lines |
tr | (translate) Translates characters or deletes them |
sort | Sorts lines |
uniq | (unique lines) Reports or omits repeated lines |
diff | (differences) Finds differences between files |
sed | (stream editor) Filters and transforms text; a Swiss army knife for text streams |
tee | Writes input to both files and standard output |
od | (octal dump) Dumps files in octal and other formats |
xxd | Dumps input in hexadecimal or reverses dumps |
bc | (basic calculator) Calculator/mathematical programming language |
Identity, information | |
---|---|
logname, whoami | (login name) Prints username |
id | Prints user and group IDs |
groups | Prints the groups user is in |
hostname | Prints the computer hostname |
uname | (unix name) Displays information about the computer, OS and OS version |
getent | (get entries) Queries local databases |
lsb_release | (Linux Standard Base release) Displays information about the Linux distribution |
dmesg | (diagnostic messages) Prints the kernel ring buffer |
sysctl | Displays or changes settings on the running kernel |
Archiving files, file compression | |
---|---|
ar | (archive) Handles ar archives used for example in Debian packages |
tar | (tape archiver) Archives multiple files into one |
gzip, gunzip | Compresses or expands gz archives |
bzip2, bunzip2 | Compresses or expands bz2 archives |
xz, unxz | Compresses or expands xz archives |
zip, unzip | Compresses or expands zip archives |
Network tools | |
---|---|
ip | Displays/changes routing and network interface settings |
ifconfig | Old tool for displaying/changing network interface settings |
route | Old tool for displaying/changing routes |
arp | Displays or manipulates the kernel ARP cache |
ping | An ICMP diagnostic tool |
traceroute | Traces the route that packets take to a host |
nc | (netcat) Creates TCP and UDP connections and listens |
File systems and disk usage | |
---|---|
df | (disk free) Prints the amount of free space on file systems |
du | (disk usage) Estimates file space usage |
quota | Displays quotas |
Scripting and programming languages | |
---|---|
awk | (Aho, Weinberger and Kernighan) The AWK programming language |
perl | (Practical Extraction and Report Language) The Perl programming language |
python | The Python programming language |
File system
Everything is a file.
or, as Linus Torvalds put it,
Everything is a file descriptor or process.
The 'Everything is a file' paradigm is one of the key concepts of Unix (and Linux). Almost everything is presented as files in directories and the file system offers a common namespace for all system resources. This means that everything can be manipulated using the same set of tools created for manipulating files.
A file is a discretely stored collection of data or records. Files are arranged into directories, which are special files that contain a list of other files. These listed files are located in the directory. All directories except for the root directory are also contained in another directory.
In Unix and Unix-based operating systems (or, more accurately, on most Unix/Linux file systems) filenames are usually case sensitive. The only forbidden characters in filenames are the forward slash (/), which is used as the directory separator, and the null character (\0), which is used as the string terminator. The maximum length of a filename is typically 255 bytes, which can be fewer than 255 characters if the name contains multi-byte characters.
In addition to these limitations, the special filenames . and .. (a single dot and two dots) are reserved and refer to the current and upper-level directory, respectively.
As a special case, files with a name beginning with a dot (.) are left out of file listings by default and are considered hidden files.
There are several types of files in addition to regular files and directories, for example, device files and symbolic links. There are several other types as well, but they are encountered a bit more rarely in normal interactive use.
Structure
All files are arranged in a single tree-like structure which is called the root file system. The tree can contain many separate file systems, but they are all presented as part of this tree. Adding a file system to the tree is called mounting, and the directory that hosts the root directory of the mounted file system is called the mount point. By default, only the root user can mount file systems but normal users can still use the mount command to list the mounted file systems.
At the base of the file system tree is the root directory /. All other files and file systems are located in subdirectories of the root directory. The structure of the file system is pretty well-established, and much of it is standardised as FHS (Filesystem Hierarchy Standard), which can be found at https://refspecs.linuxfoundation.org/fhs.shtml.
Here are some of the top-level directories and a brief description:
Directory | Description |
---|---|
/bin | The essential user command binaries |
/boot | Boot loader files |
/dev | The device tree |
/etc | System configuration files |
/home | User home directories (Note: Not at Aalto) |
/lib | Essential shared libraries and kernel modules |
/media | Mount point for removable media |
/mnt | Mount point for temporary file systems |
/opt | Additional application software |
/proc | Processes exposed as files |
/root | Home directory of the root user |
/run | Run-time data |
/sbin | Essential system binaries |
/tmp | Temporary files |
/usr | Non-essential user and system binaries and applications |
/var | Variable data files |
Absolute and relative path
Files are referred to using their path. Every process has a working directory, which is usually the directory the process was started in. Files can be referred to either in relation to this (relative path) or using the absolute path beginning from the root directory.
Consider the following directory structure:
/
├── a
│ └── x.txt
└── b
├── c
│ └── y.txt
└── z.txt
If the current working directory is b, then b can be referred to using . and the upper-level directory / using .. (two dots). To refer to y.txt, one can use either c/y.txt (the relative path) or /b/c/y.txt (the absolute path). To refer to x.txt, one can use either ../a/x.txt (relative path) or /a/x.txt (absolute path). A good thing to know is that the upper-level directory of the root directory is the root directory itself, so ../../../../a/x.txt would work just as well here. The file z.txt can be referred to as z.txt or ./z.txt (relative path) or /b/z.txt (absolute path).
For file names containing odd characters and how to escape them, see Arguments and escaping a string.
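The path examples above can be tried out with a small sketch that recreates the directory structure under a temporary directory. The temporary directory created by mktemp stands in for the root directory / of the example, and the variable names are made up for this illustration.

```shell
#!/bin/sh
root=$(mktemp -d)                      # stand-in for / in the example
mkdir -p "$root/a" "$root/b/c"
touch "$root/a/x.txt" "$root/b/c/y.txt" "$root/b/z.txt"

cd "$root/b" || exit 1                 # make b the working directory

ls c/y.txt                             # relative path
ls "$root/b/c/y.txt"                   # absolute path to the same file

# The relative path ../a and the absolute path name the same directory
rel=$(cd ../a && pwd -P)
abs=$(cd "$root/a" && pwd -P)
echo "$rel"
echo "$abs"
```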
Home directory and working directory
Every process, including the shell, has a current working directory (CWD). For processes this is usually the directory where the process was started.
As the example in Absolute and relative path shows, when referring to files using a relative path, the path always starts at the current working directory.
The current working directory can be printed using the command pwd.
tteekkari@kosh:~$ pwd
/u/62/tteekkari/unix
tteekkari@kosh:~$
Another notable directory is the user's home directory. The home directory can be referred to using the tilde sign (~). The home directory is a directory where the user has write permission and in which private data can be stored.
Moving around in the file system, interacting with files
The current working directory can be changed using the cd command. The directory to be changed to is given as an argument. When used without arguments, cd will change to the home directory.
Utilities such as ls (list files) default to the current working directory when run without arguments. It is also useful to be able to refer to files using just their filename instead of the full (or partial) path.
Consider the directory structure presented earlier:
/
├── a
│ └── x.txt
└── b
├── c
│ └── y.txt
└── z.txt
The current working directory is b.
$ pwd
/b
$ ls
c/ z.txt
$ cd /a
$ ls
x.txt
$ cd ..
$ pwd
/
$ cd b/c
$ pwd
/b/c
$ ls
y.txt
$ rm y.txt
First, the current directory is /b. ls without arguments lists the files in the current directory and now shows that it contains the directory c/ and z.txt.
Then we change to the directory /a using the absolute path. Here, ls shows the file x.txt. Then we change to the root directory using the relative path .. (two dots). Then we change to the directory /b/c using the relative path b/c. Here, ls shows the file y.txt, which is then removed using the relative path.
We end up with the following file tree, where the file y.txt has been removed:
/
├── a
│ └── x.txt
└── b
├── c
└── z.txt
Listing files and properties of a file
The contents of a directory can be listed using the command ls like in the previous example. Without arguments, ls gives a listing of files in the current directory. When one or more file or directory names are given as arguments, only they will be listed. There are multiple options that modify the output, one of the more common ones being -l (long list).
The output of ls -l and its fields:
The first character on each row denotes the type of the file: - for a regular file, d for a directory and l for a symbolic link. There are also other types, such as p for a named pipe, c for a character device, b for a block device, etc., but these more specialised types will not be covered here. The full list of types can be found on the 'ls' info page.
The next nine characters are the file permissions. They are listed in three groups of three different permissions. The permissions are covered in more detail in File permissions.
The second field shows the number of links referring to the file. For a directory on most traditional file systems, this is in practice two plus the number of subdirectories it contains: one link for the directory's name in its upper-level directory, one for its own . entry, and one for the .. entry of each subdirectory. For any directory the number is therefore at least two. For a normal file, the number is the number of hard links to the file, which is in practice how many times the contents of the file are referred to in the file system. The internal structure of the file system and hard links will not be covered here though.
The third field shows the username of the file's owner and the fourth the group of the file. Every file is owned by a user and a group, each of which can be given different permissions to the file. The permissions are, again, covered in more detail elsewhere (see File permissions).
The fifth field shows the file size in bytes. It is also possible to use the option -h to display the size in human-readable format.
The sixth field shows the modification time of the file. This time is updated whenever the contents of the file are changed.
The last field shows the file name. For symbolic links, ls -l also shows where the link is pointing.
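As a small sketch of how the fields line up, the following script creates a six-byte file and splits one line of ls output into the fields described above. It uses ls -ln, which prints numerical user and group IDs instead of names, so that a group name containing a space cannot shift the fields. The variable names are chosen for this example only.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'hello\n' > "$dir/file"         # 5 letters + newline = 6 bytes

ls -ln "$dir/file"

# Split the line on whitespace into the positional parameters $1, $2, ...
set -- $(ls -ln "$dir/file")
mode=$1      # file type and permissions
links=$2     # number of links
uid=$3       # numerical ID of the owner
gid=$4       # numerical ID of the group
size=$5      # size in bytes
echo "mode=$mode links=$links uid=$uid gid=$gid size=$size"
```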
The command ls also has many other options, which can be used, for example, to sort the files in various ways. There is one more important option, -a (all files), which lists all files, including the ones that have a file name beginning with a dot.
Listing 'hidden' files:
tteekkari@kosh:~/files$ ls
file1 file2
tteekkari@kosh:~/files$ ls -a
. .. .dotfile1 file1 file2
tteekkari@kosh:~/files$
All the options for ls can be found on its man page.
Creating and editing files
Files can be created with many programs and applications, or via the shell by piping output into a file. An empty file can be created using the touch command, but it is often more meaningful to use a text editor, for example.
An editor is a program that (interactively) modifies text files. The most popular editors are likely still emacs and vi or the vi-based vim, even though there are some rather popular graphical ones as well. Both emacs and vim are very versatile, extensible and extremely efficient in the hands of a skilled user. Both also have their strong advocates, and argumentation for one's preferred editor has escalated to almost religious proportions in the past. Here it probably makes more sense to introduce the lightweight editor GNU nano, though. It is both lighter and more intuitive to use than the two giants, emacs and vim. Regarding emacs and vim, it is probably useful to mention that to exit emacs, the key combination is Ctrl-X Ctrl-C and for vim, <ESC>:q!<ENTER>.
A file can be opened by giving it to nano as an argument.
The cursor can be moved using the arrow keys, and the editor works in rather intuitive fashion. The keyboard shortcuts are listed at the bottom of the screen and the shortcut Ctrl-G brings up a help page. In the shortcuts, the caret symbol ("^") means the Control key.
A file can be saved using the key combination Ctrl-O, after which the editor will prompt for a filename. To exit the editor, press Ctrl-X.
Removing files
Files can be removed using the command rm (remove files). The files to be removed are given as arguments. By default, rm only removes files, but it can also remove directories and directory trees with the option -r (recursive).
Creating an empty file using touch and then removing it with rm:
tteekkari@kosh:~$ touch testfile1
tteekkari@kosh:~$ ls testfile1
testfile1
tteekkari@kosh:~$ rm testfile1
tteekkari@kosh:~$ ls testfile1
ls: cannot access 'testfile1': No such file or directory
tteekkari@kosh:~$
Please note that many commands — rm included — accept options and arguments in any order. This means that especially with rm, one needs to be careful when handling files with names beginning with a dash (-) as they can be interpreted as options. This can be prevented by giving rm two dashes (--) as the last option before the arguments. It indicates that anything that comes afterward is not an option but an argument and works with quite a few other commands as well.
An example of a problematic filename:
tteekkari@kosh:~/testdir$ ls
directory1/ -r testfile1
tteekkari@kosh:~/testdir$ rm *
tteekkari@kosh:~/testdir$ ls
-r
tteekkari@kosh:~/testdir$
As seen here, both the directory directory1 and the file testfile1 get deleted with the asterisk wildcard, but the file -r remains, because it was treated as an option (which also caused the directory to get deleted).
The file -r can be deleted like this:
tteekkari@kosh:~/testdir$ ls
-r
tteekkari@kosh:~/testdir$ rm -- -r
tteekkari@kosh:~/testdir$ ls
tteekkari@kosh:~/testdir$
The example as it should have been done:
tteekkari@kosh:~/testdir$ ls
directory1/ -r testfile1
tteekkari@kosh:~/testdir$ rm -- *
rm: cannot remove 'directory1': Is a directory
tteekkari@kosh:~/testdir$ ls
directory1/
tteekkari@kosh:~/testdir$
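Another common way to sidestep the problem, not shown in the examples above, is to prefix the file name with ./ so that the argument no longer begins with a dash. The sketch below runs in a temporary directory so that nothing else is at risk.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1

touch ./-r testfile1     # create a file literally named -r
ls -A

rm ./*                   # expands to ./-r ./testfile1, neither is an option
remaining=$(ls -A)
echo "remaining: '$remaining'"
```

The same prefix works with most commands that take file names as arguments, including those that do not understand the two dashes.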
Creating and removing directories
Directories can be created using the command mkdir (make directory). Empty directories can be removed using rmdir (remove directory). Both take the directory name(s) as argument(s). As seen in the previous example, the command rm with the option -r can also be used to remove entire directory trees, empty or not, but it should be used with care: without the option -i (interactive) it does not ask for confirmation before deleting an entire directory tree.
Creating and deleting directories:
tteekkari@kosh:~$ mkdir directory
tteekkari@kosh:~$ cd directory/
tteekkari@kosh:~/directory$ ls
tteekkari@kosh:~/directory$ cd ..
tteekkari@kosh:~$ rmdir directory/
tteekkari@kosh:~$ mkdir directory
tteekkari@kosh:~$ cd directory/
tteekkari@kosh:~/directory$ touch testfile1
tteekkari@kosh:~/directory$ ls
testfile1
tteekkari@kosh:~/directory$ cd ..
tteekkari@kosh:~$ rmdir directory/
rmdir: failed to remove 'directory/': Directory not empty
tteekkari@kosh:~$ rm -r directory/
First, a directory called directory was created. Then cd was used to go into that directory and ls to list its contents. We changed back out of the directory and deleted it using rmdir, which succeeded. We then created the directory again and this time created a file called testfile1 inside the directory. Then removing the directory using rmdir failed, but it could still be removed using rm -r.
Renaming, moving and copying files
Files can be renamed and moved using the command mv (move). This command accepts two or more arguments. When using two arguments, the file given as the first argument will be renamed as indicated in the second argument or, if the second argument is a directory, it will be moved into that directory. If a file with the same name already exists, it will be overwritten. If more than two arguments are used, the last argument has to be a directory, and all the files given as previous arguments will be moved into that directory.
An example of moving and renaming files:
tteekkari@kosh:~/testdir$ ls
directory1 file1 file2
tteekkari@kosh:~/testdir$ mv file1 file3
tteekkari@kosh:~/testdir$ ls
directory1 file2 file3
tteekkari@kosh:~/testdir$ mv file2 file3 directory1/
tteekkari@kosh:~/testdir$ ls
directory1
tteekkari@kosh:~/testdir$ ls directory1/
file2 file3
tteekkari@kosh:~/testdir$
First, the current directory contains an empty directory directory1 and two files, file1 and file2. The file file1 is first renamed as file3 and then both files file2 and file3 are moved into directory1.
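Copying works along the same lines with the command cp (copy), which leaves the original in place. The following is a minimal sketch in a temporary directory; the file and directory names are made up for the example, and the option -r (recursive) is needed for copying a directory tree.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1
mkdir directory1
echo "some text" > file1

cp file1 file2                 # two arguments: copy file1 to a new name
cp file1 file2 directory1/     # last argument is a directory: copy into it
cp -r directory1 directory2    # copy a whole directory tree

ls . directory1 directory2
```

As with mv, an existing file with the same name as the destination is overwritten.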
Reading the contents of a file
The contents of a file can be printed on the terminal using the command cat (concatenate). As the name implies, the command was originally meant for printing out multiple files in succession to combine them, but can just as well be used to print the contents of a single file. The files to be printed are given as arguments.
Viewing a file using cat:
tteekkari@kosh:~$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
tteekkari@kosh:~$
Files are often long and do not fit on the screen as is. With long files, it is often better to use the command more, which prompts the user for input after every full screen before printing out the next line or screenful. The name of the command comes from its prompt, which is --More--.
An alternative to more is less (less is more), which offers more functionality, including scrolling in both directions and performing searches in the open file. Whereas more is a paginator, less is actually more like a file viewer. Less uses a vi-like interface which will not be covered in depth here, but the manual page lists all the commands and shortcuts. To exit less, press q.
Symbolic links
A symbolic link is a file that is either a relative or absolute reference to another place in the file system. Unlike regular files and directories, it does not have its own contents. All operations touching the contents are performed on the target file.
The file or directory that a symbolic link points to is called the target. In most cases, symbolic links work fully transparently — that is to say a program opening the link file sees the file or directory the link is pointing to. A symbolic link is a separate file from the target file in that if the target file is, for example, removed, the symbolic link is not. A symbolic link can thus also point to a non-existent target.
Symbolic links can be created using the command ln -s. Note also that a symbolic link can point to one of its parent directories in the file system tree. This will turn the tree structure into a directed graph and manual traversing of the tree might not always work intuitively. The purpose of symbolic links is usually to make one file or path visible in multiple places in the file system.
Creating symbolic links using relative and absolute paths:
tteekkari@kosh:~$ ls foo.txt
foo.txt
tteekkari@kosh:~$ ln -s foo.txt foo2.txt
tteekkari@kosh:~$ ls -l foo*.txt
lrwxrwxrwx 1 tteekkari domain users 7 Dec 12 17:41 foo2.txt -> foo.txt
-rw-r--r-- 1 tteekkari domain users 0 Dec 12 17:40 foo.txt
tteekkari@kosh:~$ ln -s /usr/share/dict/words words
tteekkari@kosh:~$ ls -l words
lrwxrwxrwx 1 tteekkari domain users 21 Dec 12 17:41 words -> /usr/share/dict/words
tteekkari@kosh:~$
Without the -s option, ln creates hard links. Hard links will not be covered here and generally should be avoided in normal use.
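The independence of a symbolic link from its target can be sketched as follows. The file names are arbitrary; the test operators -L (is a symbolic link) and -e (exists, following links) are used to show that the link survives the removal of its target.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "data" > target.txt
ln -s target.txt link.txt
cat link.txt               # reads the contents of target.txt through the link

rm target.txt              # remove the target, not the link

if [ -L link.txt ] && [ ! -e link.txt ]; then
    dangling=yes           # the link exists but points to nothing
else
    dangling=no
fi
echo "dangling: $dangling"
readlink link.txt          # still prints the stored path, target.txt
```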
File permissions
The traditional Unix-style file permissions consist of three categories of users who can be given up to three types of permissions. These permissions can also be referred to as the mode of the file. The three categories are user, group and others. Every file has two owners: a user and a group.
The three different permissions are read permission, write permission and execute permission. Read permission allows reading the file contents, write permission allows modifying the file contents and execute permission allows executing the file.
When attempting to access a file, the permissions are checked and access may be denied, when necessary. First, the operating system checks if the user is the owner of the file, and if so, the 'user' category permissions are applied. If the user is not the owner, the operating system checks if the user belongs to the group that has ownership of the file. If so, the 'group' category permissions are applied. Otherwise, the permissions in the 'others' category are applied.
Permissions can be expressed in two ways. One is the symbolic notation seen in the output of ls -l. The other is in the form of a three or four-digit octal number.
The symbolic notation as seen in ls -l:
In symbolic notation the permissions are expressed in three groups of three permissions. The first group is the user permissions, the second group is the group permissions and the third is the others permissions. The read permission is shown as r, write as w and execute as x.
The other way to express file permissions is as an octal number. The entire set of file permissions, i.e. the mode of the file, can be expressed as a four-digit octal number. The first digit is a set of special permissions and is usually 0. The next three digits are the user, group and others permissions. Each digit is the sum of permissions for each category. The read permission corresponds to 4, write to 2 and execute to 1. Thus the full rwxrwxrwx in octal would be 0777: the first digit is 0 and the rest are 7 (r+w+x = 4+2+1). The octal number 0644, for example, would correspond to rw-r--r--, and 0600 to rw------- in symbolic notation.
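The arithmetic can be sketched as a small shell function. The function name sym_to_octal is made up for this illustration, and the sketch only handles a plain nine-character permission string without any special permissions.

```shell
#!/bin/sh
# Convert a nine-character string such as rw-r--r-- to octal notation.
sym_to_octal() {
    sym=$1
    out=
    while [ -n "$sym" ]; do
        triplet=${sym%"${sym#???}"}   # first three characters
        sym=${sym#???}                # the rest of the string
        digit=0
        case $triplet in r??) digit=$((digit + 4)) ;; esac   # read
        case $triplet in ?w?) digit=$((digit + 2)) ;; esac   # write
        case $triplet in ??x) digit=$((digit + 1)) ;; esac   # execute
        out=$out$digit
    done
    echo "0$out"                      # leading 0: no special permissions
}

sym_to_octal rwxrwxrwx   # prints 0777
sym_to_octal rw-r--r--   # prints 0644
sym_to_octal rw-------   # prints 0600
```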
The effect of each mode bit (permission) depends somewhat on the type of the file.
Regular file (-) | Directory (d) | Symbolic link (l) | |
---|---|---|---|
Read (r) | The file contents can be read. | The directory contents can be listed. | Ignored. Target file permissions apply. |
Write (w) | The file can be written into. | Files can be removed or added to the directory. | Ignored. Target file permissions apply. |
Execute (x) | The file can be executed. | The directory can be traversed and files can be used. | Ignored. Target file permissions apply. |
As the table shows, permissions on directories and symbolic links behave a bit differently from regular files. The read permission on a directory allows reading the directory contents, which in practice means the names of the files contained in the directory. To be able to read the contents of the files, the execute permission is also needed (and naturally the file itself needs to be readable). The execute bit on a directory allows entering that directory and accessing its contents. The write permission on a directory allows creating and removing files in it. Even users with write permission to a file cannot remove the file unless they also have write permission to the directory the file resides in.
For symbolic links, all the permissions on the symbolic link itself are ignored and the permissions of the target file are used instead.
In addition to these three permission bits, there are also three special modes, which are more like file attributes than access permissions. In octal presentation they form the first digit. The special modes are set user ID (setuid, SUID), set group ID (setgid, SGID) and the sticky bit.
The setuid bit, when set, allows a user to execute a file as the owner of the file instead of as themselves. Thus, a user may run a setuid binary owned by root as the root user and bypass all permission checks. Allowing users to run binaries as other users is inherently dangerous and therefore the setuid bit should normally not be used. External file systems are also usually mounted with the setuid functionality disabled.
The setgid bit works much like the setuid bit, except that instead of changing the user ID when executed, it changes the effective group ID and the file is executed as though the user was in the group owning the file. This is also dangerous and should not be used on regular files.
On directories, the setgid bit causes all new files and subdirectories created in the directory to inherit the group of the parent directory. This can be very useful on shared directories to make sure that files can be read by other users (in a group).
The sticky bit on a directory changes the behaviour of the write permission so that users cannot move, rename or delete files that are not owned by themselves. This permission is in use, for example, in /tmp/, a shared directory that potentially contains files owned by many different users, where it prevents the users from removing each other's files.
In symbolic notation, the setuid/setgid special modes are shown, should a file or directory have them set, in the place of the relevant execute permission (user for setuid, group for setgid). An execute permission of 'x' is replaced by 's' and a '-' is replaced by 'S'. The sticky bit is shown in the place of the execute permission of the others category. An 'x' is transformed into a 't' and a '-' into a capital 'T'.
In octal notation, the special permissions form the first digit of the four-digit octal representation. The setuid bit corresponds to 4, the setgid bit to 2 and the sticky bit to 1.
For example, the /tmp/ directory usually has the mode drwxrwxrwt, which would correspond to 1777 octal. A directory with setgid and sticky bits set could have a mode of drwxrws--T, which would correspond to 3770 octal. The first digit of 3 corresponds to 2 (setgid) + 1 (sticky) and the user and group permissions to 7 (r (4) + w (2) + x (1) = rwx (7)). In the /tmp/ example the first digit corresponds to the sticky bit (1) and the rest of the permissions are rwx as above.
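The arithmetic above can be checked in the shell itself. The sketch below adds up the octal values for the 3770 example and applies the result to a scratch directory; it assumes GNU coreutils for stat -c and a temporary directory whose group the user belongs to.

```shell
# Sketch: composing a four-digit octal mode from its parts (GNU stat assumed).
# Numbers with a leading 0 are read as octal in shell arithmetic.
# setgid = 2000, sticky = 1000, rwx for user and group = 0770
mode=$(printf '%o' $(( 02000 + 01000 + 0770 )))
echo "$mode"                          # 3770

d=$(mktemp -d)                        # scratch directory
chmod "$mode" "$d"
applied=$(stat -c '%a %A' "$d")       # octal and symbolic mode side by side
echo "$applied"                       # 3770 drwxrws--T

rmdir "$d"
```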
Setting file mode/permissions
File permissions can be set using the command chmod (change mode). The command accepts both octal and symbolic permissions. The octal notation sets the whole file mode at once, whereas the symbolic notation is used to add or remove individual permission bits while leaving the others unchanged.
When using the symbolic notation, the user permissions are referred to using the letter 'u', group permissions using 'g' and others using 'o'. There is also a shorthand for all permissions: 'a' (this is the same as 'ugo'). The permission bits themselves are represented by 'r', 'w' and 'x' (and 's' for setuid/setgid and 't' for the sticky bit).
Different invocations of chmod:
tteekkari@kosh:~/testdir$ ls -l
total 0
-rw-r--r-- 1 tteekkari domain users 0 Dec 18 15:53 file1
tteekkari@kosh:~/testdir$ chmod g+w file1
tteekkari@kosh:~/testdir$ ls -l
total 0
-rw-rw-r-- 1 tteekkari domain users 0 Dec 18 15:53 file1
tteekkari@kosh:~/testdir$ chmod 0644 file1
tteekkari@kosh:~/testdir$ ls -l
total 0
-rw-r--r-- 1 tteekkari domain users 0 Dec 18 15:53 file1
tteekkari@kosh:~/testdir$ chmod go-r file1
tteekkari@kosh:~/testdir$ ls -l
total 0
-rw------- 1 tteekkari domain users 0 Dec 18 15:53 file1
tteekkari@kosh:~/testdir$ chmod a+x file1
tteekkari@kosh:~/testdir$ ls -l
total 0
-rwx--x--x 1 tteekkari domain users 0 Dec 18 15:53 file1
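Several symbolic changes can also be combined into one chmod invocation by separating them with commas. The sketch below additionally uses the '=' operator, which is not shown in the session above and sets a category to exactly the listed bits. It assumes GNU coreutils for stat -c and works on a temporary file.

```shell
# Sketch: comma-separated symbolic modes and the '=' operator (GNU stat assumed).
f=$(mktemp)                     # scratch file owned by the current user

chmod u=rwx,g=rx,o= "$f"        # set each category exactly; others get nothing
m1=$(stat -c '%a %A' "$f")
echo "$m1"                      # 750 -rwxr-x---

chmod g-x,o+r "$f"              # remove group execute, add read for others
m2=$(stat -c '%a %A' "$f")
echo "$m2"                      # 744 -rwxr--r--

rm -f "$f"
```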
Users, groups and permissions
As Unixes (and Linux) are multi-user operating systems, there can be multiple users logged on to a single system. For every user, the operating system stores seven different data fields: username, password, numerical user ID (UID), numerical group ID (GID), the full name of the user, the home directory path of the user and the shell of the user.
The username is the name that is used to log in to the system and shown in various places where a human-readable name is required. The password is (a hash of) the password used to authenticate the user. The numerical user and group IDs are the internal representation of the user and the primary group of the user. The full name and shell are pretty self-explanatory. The home directory path is the path to the user's home directory, which is where any shell initialisation files are read from and where the session begins.
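On many systems these seven fields are stored as one colon-separated line per user, and the entry of the current user can be printed with getent passwd "$(id -un)". The sketch below splits such a line into its fields. The entry is a made-up sample: the user and group IDs are taken from the id example in this guide, while the full name, home directory and shell are hypothetical. An 'x' in the password field conventionally means that the hash is stored elsewhere.

```shell
# Sketch: splitting a passwd-style entry into its seven fields.
# Hypothetical sample line; a real one comes from: getent passwd "$(id -un)"
entry='tteekkari:x:6666666:70000:Teemu Teekkari:/home/tteekkari:/bin/bash'

# IFS=: applies only to this read and splits the line at the colons.
IFS=: read -r name passwd uid gid fullname home shell <<EOF
$entry
EOF

printf 'username:  %s\n' "$name"
printf 'password:  %s\n' "$passwd"
printf 'UID / GID: %s / %s\n' "$uid" "$gid"
printf 'full name: %s\n' "$fullname"
printf 'home:      %s\n' "$home"
printf 'shell:     %s\n' "$shell"
```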
Users
Users' information can be examined by using the finger command. The users who are logged on to a computer can be listed using w, who or users.
tteekkari@kuikka:~$ w
11:14:45 up 33 days, 16:00, 1 user, load average: 0.10, 0.04, 0.01
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
tteekkar pts/0 130.233.224.196 11:14 0.00s 0.04s 0.00s w
tteekkari@kuikka:~$ who
tteekkari pts/0 2019-12-18 11:14 (130.233.224.196)
tteekkari@kuikka:~$ users
tteekkari
tteekkari@kuikka:~$
Changing user info
Users can change their login shell using the command chsh. The full name (and a few other miscellaneous fields) can be changed using the command chfn. Of these, only chsh is available on Aalto computers.
Changing password
Users can change their password using the command passwd. The command prompts once for the old password and, if it is entered correctly, twice for the new password. If the new passwords match and fulfil all security requirements, the password is changed.
For changing passwords in the Aalto environment, please see Changing passwords.
Groups
Users can be bundled into user groups. A group has three properties: a name, a numerical group ID and a list of members. A user can be a member of multiple groups, and groups are usually used for access control. If a group has permission to do something, then any group member has that same permission.
The groups that a user is a member of can be listed using the commands id (which also prints the numeric group/user IDs) and groups.
tteekkari@kosh:~$ id
uid=6666666(tteekkari) gid=70000(domain users) groups=70000(domain users)
tteekkari@kosh:~$ groups
domain users
A user has a primary group, which is used as the default group owner for any new files created by the user. In addition to the primary group, a user can have up to 65535 auxiliary groups, which are just additional groups the user is a member of.
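The primary group and the full group list can also be queried separately with the options of id. The following sketch uses the numeric forms, which work even when a group has no name on the system; the variable names are only for illustration.

```shell
# Sketch: primary group versus the full list of groups, as numeric IDs.
primary=$(id -g)                 # current (effective) primary group ID
all=$(id -G)                     # all group IDs of the session, space-separated

echo "primary group ID: $primary"
echo "all group IDs:    $all"

# The primary group is always part of the full list.
case " $all " in
  *" $primary "*) in_list=yes ;;
  *)              in_list=no ;;
esac
echo "primary group in list: $in_list"
```

Adding the -n option (id -gn, id -Gn) prints group names instead of numbers.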
A user's groups are usually only updated at the beginning of the session. If a user is added to a new group during the session, the new group membership will not come into effect unless the user logs out and then back in.
The primary group of the user can be changed for the current session using the command newgrp, which starts a new shell with the given group as the current group. This is very rarely needed, but can, for example, be used to introduce a new group into an old session.
File ownership
All files are always owned by both a user and a group. The user and the group can each be assigned different permissions for the file. The owner of a file can be changed using the command chown (change owner, can change both user and group) or chgrp (change group, only changes the group ownership).
Both of these commands will accept either the user or the group names or their numeric IDs.
Users can normally change only the group ownership of their files, and they can only choose groups they are themselves members of.
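As a small sketch of the commands above, the following changes the group of a scratch file using a numeric group ID. It uses the user's own current group, so the change is always permitted but has no visible effect; in practice another group the user belongs to would be given. GNU coreutils is assumed for stat -c.

```shell
# Sketch: changing group ownership with chgrp and chown (GNU stat assumed).
f=$(mktemp)                        # scratch file owned by the current user

chgrp "$(id -g)" "$f"              # group given as a numeric ID
chown ":$(id -g)" "$f"             # chown with only ':group' changes just the group

owner=$(stat -c '%u:%g' "$f")      # numeric user ID and group ID of the file
echo "$owner"

rm -f "$f"
```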