Linux is an open-source operating system. Its core components are a boot loader, a kernel, basic services including a shell, and a package manager. Linux can also come with a display manager and some applications (like LibreOffice or GIMP).

There are tons of different Linux distributions, but most of them belong to a specific family. The three main families of distributions are:

The three main families of Linux distributions

Since I use Ubuntu on my WSL and Amazon EC2, and Ubuntu belongs to the Debian family, I will only write down useful commands and tools from the Debian family.

The file system

The file system works as a tree. Unlike Windows, where there can be multiple roots like C:\ and D:\, Linux has a single root, accessible via /. All other files, directories and partitions are located under the root. Additional drives are attached to the tree at mount points, conventionally under /mnt or /media (on WSL, for instance, the Windows drives appear under /mnt).

The Filesystem Hierarchy Standard (FHS) describes the main top-level directories located just under the root:

name    description
/bin    essential user binaries
/boot   boot loader files
/dev    device files
/etc    configuration files
/home   home directories
/lib    system libraries
/media  mount points for removable devices
/mnt    temporary mount point
/opt    optional applications
/proc   process information
/root   home directory for the root user
/sbin   system binaries
/srv    service data
/tmp    temporary files
/usr    shared, read-only programs and data for all users
/var    variable files

Notes:

  • The file system is case-sensitive
  • Each user has a directory under /home named after their username
  • /proc contains virtual files that only exist in memory
  • /etc contains system configuration files (most can only be edited by the root user)
  • /lib contains dynamically loaded libraries

The command line

The command line is a text-based interface where users can interact with the operating system by typing commands. The basic structure of a command is:

command [flags] [arguments]

The command is the program or action to run. The flags are optional and modify the behavior of the command. The arguments are the inputs the command operates on. For example, typing df -Th / displays the available disk space on the root filesystem, along with the filesystem type, using human-readable sizes. We can break down this instruction as follows:

part  description
df    disk free: report disk space usage
-T    display the filesystem type (e.g. ext4)
-h    show sizes in human-readable format (K, M, G)
/     report only on the filesystem that contains /

Navigating the directory

Below are some basic commands to navigate the directory and manipulate files.

command  description
pwd      print working directory
cd       change directory
cd ~     change to your home directory
cd ..    change to the parent directory
cd -     change to the previous directory
cd /     change to the root directory
ls       list the contents of a directory
tree     display a tree view of a directory and its subdirectories
pushd    push the current directory onto a stack for later use and move to the specified directory
popd     pop the top directory off the stack and change to it
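The navigation commands can be tried safely in a scratch directory. This is a minimal sketch: the directory names project and src are made up for illustration, and mktemp -d chooses the actual location.

```shell
# Hypothetical scratch tree; mktemp -d picks the real path
base=$(mktemp -d)
mkdir -p "$base/project/src"

cd "$base/project/src"
pwd                     # prints the path ending in /project/src
cd ..                   # up to .../project
parent=$(pwd)
cd - > /dev/null        # back to .../project/src (cd - also prints the path)
previous=$(pwd)
echo "parent=$parent previous=$previous"

cd /                    # leave the scratch tree before removing it
rm -r "$base"
```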

Some flags often used with the ls command:

  • ls -a lists all contents, including hidden files and directories (those whose name starts with a dot)
  • ls -i includes the inode number of each file
  • ls -l uses the long format (permissions, owner, group, size, modification time)
  • ls -h displays file sizes in a human-readable format (used together with -l)

Flags can be combined: ls -aihl does all of the above at once. ls also works with globbing patterns, such as ls *.txt to list all files with a txt extension; the shell expands the pattern into the matching file names before ls runs.
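Here is a small sketch of these flags and of globbing in action. The file names are invented for the example and live in a throwaway directory.

```shell
dir=$(mktemp -d)
touch "$dir/notes.txt" "$dir/todo.txt" "$dir/image.png" "$dir/.hidden"

ls "$dir"         # image.png, notes.txt and todo.txt (the hidden file is omitted)
ls -a "$dir"      # also lists ., .. and .hidden
ls -lh "$dir"     # long format with human-readable sizes

# The shell expands *.txt first, so ls receives two file names
txt_files=$(cd "$dir" && ls *.txt)
echo "$txt_files"

rm -r "$dir"
```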

In order to get help on any command, type man [command] (man stands for “manual”). For instance, to get help on find, type man find. This proves useful when looking for specific flags. To interrupt a running command, press Ctrl + C. To clear the screen, press Ctrl + L.

Manipulating files and directories

command  description
cat      print the content of a file to the standard output stream
head     print the first 10 lines of a file
tail     print the last 10 lines of a file
less     display a file one page at a time
touch    create an empty file (or update the timestamp of an existing one)
mv       move or rename a file
rm       remove a file
rm -rf   remove a directory and all of its contents, recursively and without confirmation
rmdir    remove a directory (only works when the directory is empty)
mkdir    create a directory
ln       create a link to a file (hard link by default, symbolic link with ln -s)
diff     compare files and directories
file     determine the type of a file from its content
cp       copy files on the local machine
patch    apply changes to a file (changes are contained in patch files)
which    locate the executable of a program in the PATH
whereis  locate the binary, source and manual page files of a program

Note: most applications inspect the content of a file rather than rely on its extension. Extensions are more useful for the user than for the system, which is why the file command exists.
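As a rough walkthrough, the commands from the table can be chained in a temporary directory. All file and directory names below (docs, draft.txt, final.txt, latest) are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

mkdir docs                          # create a directory
touch docs/draft.txt                # create an empty file
echo "first line" > docs/draft.txt
cp docs/draft.txt docs/copy.txt     # copy
mv docs/copy.txt docs/final.txt     # rename
ln -s final.txt docs/latest         # symbolic link pointing at final.txt

# diff prints nothing and exits with status 0 when the files are identical
diff docs/draft.txt docs/final.txt && same=yes
link_target=$(readlink docs/latest)
echo "identical: $same, link points to: $link_target"

rm docs/draft.txt                   # remove a single file
rm -r docs                          # remove the directory and everything in it
cd /
rmdir "$work"                       # works because the directory is now empty
```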

The package manager

APT (Advanced Package Tool) is the default package manager on Debian. Two tools are involved: apt-get, the high-level tool that resolves dependencies and downloads packages, and dpkg, the low-level tool that installs and removes individual packages under the hood. Most of the time we use apt-get.

Example commands:

command                 description
sudo apt-get update     refresh the list of available packages
sudo apt-get upgrade    upgrade all installed packages to their latest version
dpkg --list | grep zip  show all installed packages whose name or description contains “zip”

Editing files

On the command line, we can use Nano or Vi.

From the shell, let’s write “Hello World” to file1 (> creates the file, or overwrites it if it already exists), then print the contents of file1 to the command line:

$ echo "Hello World" > file1
$ cat file1
Hello World
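The difference between overwriting (>) and appending (>>) is easy to check. This sketch uses a temporary file instead of file1 so that nothing in the current directory is touched.

```shell
file=$(mktemp)
echo "Hello World" > "$file"      # > creates the file or overwrites it
echo "Second line" >> "$file"     # >> appends to the end
line_count=$(wc -l < "$file")     # 2 lines at this point

echo "Again" > "$file"            # overwrites both earlier lines
content=$(cat "$file")
echo "lines before overwrite: $line_count, content after: $content"
rm "$file"
```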

To open the file in Nano, type nano file1.

User management

command                     description
id                          information about the current user
id [username]               information about [username]
sudo useradd [username]     add a new user named [username]
sudo userdel [username]     delete user [username]
sudo userdel -r [username]  delete [username] along with their home directory

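Each user has a unique ID (UID). Groups gather users with common access rights and privileges. On Debian and Ubuntu, the UIDs of normal users start at 1000; lower values are reserved for root (UID 0) and system accounts. Every user is created with at least one group, the primary group, which by default is named after the user and usually has a GID equal to the UID.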
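The id command can print each of these values separately. The sketch below only reads information about whoever runs it, so the actual output depends on the account.

```shell
uid=$(id -u)      # numeric user ID
user=$(id -un)    # user name
gid=$(id -g)      # numeric ID of the primary group
group=$(id -gn)   # name of the primary group
echo "user $user has UID $uid; primary group $group has GID $gid"
id -Gn            # names of all the groups the user belongs to
```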

file         description
/etc/group   list of groups
/etc/passwd  list of users

The /etc/passwd file is organized as follows:

username:x:UID:GID:comment:home_directory:shell
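To make the seven fields concrete, the sketch below splits a sample entry on the colon separator. The user alice and all of her values are invented for illustration; the last command reads the real file.

```shell
# Made-up entry in /etc/passwd format
entry='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

# Split the line on ":" into the seven fields
IFS=: read -r username password uid gid comment home shell <<EOF
$entry
EOF

echo "user=$username uid=$uid gid=$gid home=$home shell=$shell"

# cut extracts the same columns (name, UID, shell) straight from the real file
cut -d: -f1,3,7 /etc/passwd | head -n 5
```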

How to search /etc/passwd for a specific user (here for user “root”):

$ grep '^root:' /etc/passwd
root:x:0:0:root:/root:/bin/bash

command   description
useradd   add a line to /etc/passwd for the new user (with -m, also create the home directory under /home)
userdel   delete a user but leave the home directory unless userdel -r
groupadd  create a new group on the system
groupdel  delete an existing group from the system
whoami    display the name of the current user
su        switch user: launch a new shell running as another user

It is bad practice to use su to switch to the root account. It is better to grant sudo rights to a regular user: elevated privileges then apply only to the commands explicitly prefixed with sudo.

Every file is associated with a user and a group

command  description
chown    change user ownership of a file or directory
chgrp    change group ownership
chmod    change the permissions on the file for owner, group and others

Illustration of chmod
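chmod accepts permissions either in octal form (read = 4, write = 2, execute = 1, summed for owner, group and others) or in symbolic form. Here is a minimal sketch on a temporary file, reading the result back from the first column of ls -l:

```shell
f=$(mktemp)

chmod 640 "$f"                          # octal: owner rw- (6), group r-- (4), others --- (0)
perms_octal=$(ls -l "$f" | cut -c1-10)

chmod u+x,g-r "$f"                      # symbolic: add execute for the owner, drop read for the group
perms_symbolic=$(ls -l "$f" | cut -c1-10)

echo "$perms_octal -> $perms_symbolic"  # -rw-r----- -> -rwx------
rm "$f"
```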

Environment variables

Environment variables are named values that the shell keeps and passes on to the programs it launches.

command      description
env          print all current environment variables
env | head   print only the first 10 variables
echo $SHELL  print the value of the environment variable named SHELL
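A variable only becomes part of the environment once it is exported. The sketch below uses a made-up variable named GREETING and asks a child shell what it can see.

```shell
GREETING="hello"                         # plain shell variable: not passed to child processes
in_child_before=$(sh -c 'echo "${GREETING:-unset}"')

export GREETING                          # now part of the environment
in_child_after=$(sh -c 'echo "${GREETING:-unset}"')

echo "before export: $in_child_before, after export: $in_child_after"

# A variable can also be set for a single command only
one_off=$(GREETING="bonjour" sh -c 'echo "$GREETING"')
echo "one-off value: $one_off"
```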

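Two variables worth knowing:

  • PS1 defines the prompt and can be customized
  • SHELL holds the default shell of the current user, usually bash

Bash also keeps a history of the commands entered:

  • history prints the command history
  • history | tail shows the last 10 commands
  • the up and down arrow keys step through previous commands
  • Ctrl+R searches the history of past commands
  • Tab autocompletes commands and file names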

Manipulating files

cat is short for “concatenate”. Its main purpose is to join several files together, but it is also commonly used to print a single file:

$ cat readme.txt

Some files are very large, and opening them in an editor may consume too much memory. less loads a file page by page, while head and tail only print its beginning or end, so none of them needs to hold the whole file in memory.

$ tail -15 somefile.log
$ tail -f somefile.log

tail -f continuously monitors new lines in the log file and displays them in stdout
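head and tail both take -n to choose how many lines to print. In this sketch a file holding the numbers 1 to 100 stands in for a large log.

```shell
log=$(mktemp)
seq 1 100 > "$log"                    # stand-in for a large log file

first=$(head -n 3 "$log")             # first 3 lines: 1, 2, 3
last=$(tail -n 2 "$log")              # last 2 lines: 99, 100
from_line_98=$(tail -n +98 "$log")    # everything from line 98 onwards

echo "$first"
echo "$last"
rm "$log"
```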

Text utilities

command  description
cat      Concatenate files (can also read and print files)
less     Display the contents of a file page by page
head     Show the first 10 lines of a file
tail     Show the last 10 lines of a file
sed      Edit data streams (filter and perform substitutions)
awk      Pattern scanning and processing
sort     Sort text files and output streams
uniq     Remove duplicate consecutive lines
paste    Merge corresponding lines of several files side by side
join     Join the lines of two files on a common field
split    Break up a large file into fixed-size pieces
wc       Count the lines, words and bytes in a file
cut      Extract sections (fields or characters) from each line of a file
grep     Search text files and input streams (works with regex)
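These utilities are most useful in combination. The sketch below runs a few of them on a made-up activity log with one "user action" pair per line.

```shell
log=$(mktemp)
printf '%s\n' \
  'alice login' 'bob login' 'alice upload' \
  'carol login' 'alice logout' 'bob logout' > "$log"

# Count actions per user, most active first
cut -d' ' -f1 "$log" | sort | uniq -c | sort -rn

logins=$(grep -c 'login' "$log")                   # number of lines containing "login"
users=$(cut -d' ' -f1 "$log" | sort -u | wc -l)    # number of distinct users
# sed rewrites the stream, awk then prints the second field of the matching lines
renamed=$(sed 's/^bob /robert /' "$log" | awk '$1 == "robert" { print $2 }')

echo "logins=$logins users=$users"
rm "$log"
```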

Streams and pipes

Standard streams are communication channels between a computer program and its environment. When we launch a Linux command, three data streams are created:

  • stdin: standard input
  • stdout: standard output
  • stderr: standard error

When a command is executed via a shell, the streams are connected to the terminal on which the shell is running.

Streams can be redirected using < (stdin), > (stdout) and 2> (stderr):

$ do_something < input_file
$ do_something > output_file
$ do_something 2> error_file
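To see each redirection at work, the sketch below defines a hypothetical helper named report that writes one line to stdout and one to stderr, then captures them separately.

```shell
# Hypothetical helper that writes one line to each output stream
report() {
  echo "result"         # goes to stdout
  echo "warning" >&2    # goes to stderr
}

out=$(mktemp)
err=$(mktemp)

report > "$out" 2> "$err"           # each stream goes to its own file
stdout_text=$(cat "$out")
stderr_text=$(cat "$err")

both=$(report 2>&1)                 # 2>&1 merges stderr into stdout
upper=$(tr 'a-z' 'A-Z' < "$out")    # < feeds the file to tr through stdin

echo "stdout: $stdout_text | stderr: $stderr_text | via stdin: $upper"
rm "$out" "$err"
```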

Programs can be chained together, where the stdout of the previous program is the stdin of the next:

$ command_1 | command_2 | command_3

When several commands are chained together, they all run concurrently: each command starts right away and processes data as soon as the previous one produces it, without waiting for it to finish.
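A quick way to observe this is to pipe a producer that never finishes into a consumer that stops early. In this sketch, yes prints "y" forever, yet the pipeline ends as soon as head has read three lines.

```shell
# If head had to wait for yes to finish, this line would never return
first_three=$(yes | head -n 3)
echo "$first_three"
```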

Searching for files:

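command                description
locate                 find files by name using a prebuilt index of the file system
grep                   filter lines that match a pattern
locate zip | grep bin  print all file paths containing both “zip” and “bin”
find                   search a directory tree by name, type, size and other criteria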
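find walks a directory tree at the time it is called, so it can be tried on a small tree built for the occasion. The layout below (src, lib, docs and their files) is invented, and -mindepth is an extension available in GNU find.

```shell
root=$(mktemp -d)
mkdir -p "$root/src/lib" "$root/docs"
touch "$root/src/main.c" "$root/src/lib/util.c" "$root/docs/readme.md"

c_files=$(find "$root" -type f -name '*.c' | sort)   # regular files ending in .c
dirs=$(find "$root" -mindepth 1 -type d | wc -l)     # directories below the root
# grep narrows any list of paths, just as in locate zip | grep bin
lib_files=$(find "$root" -type f | grep '/lib/')

echo "$c_files"
rm -r "$root"
```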

Processes

Some clarification around the terminology:

  • a program is an executable sequence of instructions
  • a process is a running instance of a program
  • a thread is a lightweight unit of execution within a process
  • a service is a process running in the background

A process can run one or several threads, and can also start other processes (its children).

There are two main commands to monitor processes:

command description
ps reports a snapshot of current processes
top task manager that displays information about CPU and memory utilization (auto-refresh)

A process can be defined by:

ID type                   Description
Process ID (PID)          Unique process ID number
Parent Process ID (PPID)  PID of the parent process
Thread ID (TID)           Thread ID number

For single-threaded processes, TID=PID. For multi-threaded processes, all threads share the same PID but each has a unique TID.

To terminate a process, type kill <pid>, which sends SIGTERM and lets the process shut down cleanly; kill -SIGKILL <pid> forces termination when that is not enough. You can only terminate your own processes (unless you are root).

Processes have priorities, expressed as a nice value (“niceness”) ranging from -20 to 19: the lower the value, the higher the priority. You can change the niceness of a running process from the CLI via the renice command.
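The life cycle of a process can be followed from the shell. This is a sketch assuming the procps version of ps and a renice that accepts -n and -p; the sleep 300 process is only a harmless stand-in for a real job.

```shell
sleep 300 &                            # long-running process started in the background
pid=$!                                 # $! holds the PID of the last background process

ps -o pid,ppid,ni,comm -p "$pid"       # PID, parent PID (this shell), nice value, command name
renice -n 5 -p "$pid" > /dev/null      # raise the nice value, i.e. lower the priority

kill "$pid"                            # sends SIGTERM by default
status=0
wait "$pid" 2>/dev/null || status=$?   # collect the exit status of the terminated process

# kill -0 sends no signal; it only checks whether the process still exists
if kill -0 "$pid" 2>/dev/null; then state=running; else state=terminated; fi
echo "process $pid is $state (exit status $status)"
```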

Cron

Cron makes it possible to launch background jobs at specific times.

We can schedule tasks by editing the crontab file (for cron table). We can access the crontab file by typing `crontab -e` (“crontab edit”). There are both system-wide and user-specific crontab files.

Cron instructions follow a specific format made of 6 fields (5 time fields followed by the command):

Field  Description   Values
MIN    Minutes       0 to 59
HOUR   Hours         0 to 23
DOM    Day of month  1 to 31
MON    Month         1 to 12
DOW    Day of week   0 to 6 (0 = Sunday)
CMD    Command       Any command to be executed

Example of a cron schedule expression to run script.sh every day at 1 AM:

0 1 * * * /path/to/my/script.sh
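To see how this line maps onto the six fields, the sketch below splits it with the shell's read builtin. It only illustrates the layout of the fields and is not how cron itself parses entries.

```shell
# The entry from the example; single quotes keep the shell from expanding *
entry='0 1 * * * /path/to/my/script.sh'

# read assigns one field per variable and puts the rest of the line in the last one
read -r min hour dom mon dow cmd <<EOF
$entry
EOF

echo "minute=$min hour=$hour day-of-month=$dom month=$mon day-of-week=$dow"
echo "command=$cmd"
```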

Networks

A network is a group of computers connected together via communication channels. Each device has an IP address (unique logical network address). DNS (Domain Name System) translates a hostname into an IP address.

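An IPv4 address is divided into 4 bytes (octets). Historically, addresses were grouped into classes according to how many octets identify the network (Net ID) and how many identify the host (Host ID):

  • class A: the first octet is the Net ID and the other 3 are the Host ID. There are at most 126 class A networks, from 1.0.0.0 to 126.255.255.255, each with about 16.7 million unique hosts (127.0.0.0 to 127.255.255.255 is reserved for loopback)
  • class B: the first 2 octets are the Net ID and the last 2 are the Host ID, i.e. about 65k unique hosts per network. Class B goes from 128.0.0.0 to 191.255.255.255
  • class C: the first 3 octets are the Net ID and the last one is the Host ID. About 2.1 million class C networks are available, each with up to 254 hosts. Class C goes from 192.0.0.0 to 223.255.255.255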
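The class of an address follows from its first octet alone, and the number of hosts per network from the number of host bits. The sketch below uses a hypothetical helper named ip_class with arbitrary example addresses.

```shell
# Classify an IPv4 address by its first octet (classful scheme)
ip_class() {
  first=${1%%.*}    # keep only what precedes the first dot
  if   [ "$first" -ge 1 ]   && [ "$first" -le 126 ]; then echo A
  elif [ "$first" -ge 128 ] && [ "$first" -le 191 ]; then echo B
  elif [ "$first" -ge 192 ] && [ "$first" -le 223 ]; then echo C
  else echo other   # 0 and 127 are reserved; 224 and above are not classes A to C
  fi
}

ip_class 10.0.0.1        # A
ip_class 172.16.5.4      # B
ip_class 192.168.1.10    # C

# Hosts per network: 2^(host bits) minus the network and broadcast addresses
hosts_a=$(( (1 << 24) - 2 ))
hosts_b=$(( (1 << 16) - 2 ))
hosts_c=$(( (1 << 8) - 2 ))
echo "class A: $hosts_a hosts, class B: $hosts_b, class C: $hosts_c"
```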

A home network typically has a single public IP address. Within the network, you can assign private IP addresses to connected devices either statically or dynamically (DHCP).

NAT (Network Address Translation) makes it possible to share 1 public IP address among many locally connected computers, each of which has a unique address only visible on the local network. In a home network, the router performs NAT.

Important files:

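File              Description
/etc/resolv.conf  DNS servers used to resolve hostnames
/etc/hosts        Static mappings between IP addresses and hostnames
/etc/network      Directory holding the network interface configuration (e.g. /etc/network/interfaces)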

Network interfaces are points of connection between a computer and a network.

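Command                   Description
ip                        Show and configure network interfaces, addresses and routes
ifconfig                  Display active network interfaces (older tool, superseded by ip)
hostname                  Display current host name
host                      Show IP address for a domain name
nslookup                  Query DNS for a domain name (similar to host)
ping                      Check whether remote host is alive and responding
route                     Show or modify the IP routing table
netstat                   Show network connections, routing tables and interface statistics
nmap                      Scan hosts and networks for open ports
wget                      Download files from the web
curl                      Transfer data to or from a URL
ssh                       Log in to a remote host over the cryptographic SSH protocol
scp                       Secure copy between 2 network hosts, uses SSH protocol
ssh <hostname> <command>  Run <command> on the remote host over SSH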
