Linux 101
A smorgasbord of linux commands best served as a cheat sheet.Linux is an open-source operating system. Its core components are a boot loader, a kernel, basic services including a shell, and a package manager. Linux can also come with a display manager and some applications (like LibreOffice or GIMP).
There are tons of different Linux distributions, but most of them belong to a specific family. The three main families of distributions are:
Since I use Ubuntu on my WSL and Amazon EC2, and Ubuntu belongs to the Debian family, I will only write down useful commands and tools from the Debian family.
Table of contents:
- The file system
- The command line
- Navigating the directory
- Manipulating files and directories
- The package manager
- Editing files
- User management
- Environment variables
- Manipulating files
- Streams and pipes
- Processes
- Cron
- Networks
The file system
The file system works as a tree. Unlike Windows where there can be multiple roots like C:\
and D:\
, Linux only has one root and it is accessible via /
. All other files, directories and partitions are located under the root. In particular, multiple drives are mounted under /mnt
.
The Filesystem Hierarchy Standard (FHS) describes the main top-level directories located just under the root:
name | description |
---|---|
/bin | user binaries |
/boot | boot loader files |
/dev | device files |
/etc | configuration files |
/home | home directories |
/lib | system libraries |
/media | removable devices |
/mnt | mount directory |
/opt | optional applications |
/proc | process information |
/root | home directory for the root user |
/sbin | system binaries |
/srv | service data |
/tmp | temporary files |
/usr | multi-user programs |
/var | variable files |
Notes:
- The file system is case-sensitive
- Each user has a directory under /home named after him
- /proc contains virtual files that only exist in memory
- /etc contains system configuration files (only the root user can edit)
- /lib contains dynamically loaded libraries
The command line
The command line is a text-based interface where users can interact with the operating system by typing commands. The basic structure of a command is:
command [flags] [arguments]
The command is the program or action to run. The flags are optional and modify the behavior of the command if needed. The arguments are the input of the command. For example, typing df -Th /
displays the amount of available disk space for the root filesystem along with their type and human-readable numbers. We can break down this instruction as follows:
part | description |
---|---|
df |
disk free |
-T |
displays the filesystem type (e.g. ext4) |
-h |
show the sizes in human-readable format (KB, MB, GB) |
/ |
applies the command and flags to the root |
Navigating the directory
Below are some basic commands to navigate the directory and manipulate files.
command | description |
---|---|
pwd |
print working directory |
cd |
change directory |
cd ~ |
change to your home directory |
cd .. |
change to parent directory |
cd - |
change to previous directory |
cd / |
change to the root directory |
ls |
list the contents of the directory |
tree |
display a tree view of the filesystem |
pushd |
push the current directory onto a stack for later use and move to the specified directory |
popd |
change to the top directory off the stack |
Some flags often used with the ls
command:
ls -a
lists all contents including hidden files and directoriesls -i
includes the inode of each filels -h
displays files sizes in a human-readable formatls -l
shows more information
Flags can be combined: ls -aihl
does all of the above at once. ls
also works with Globbing patterns, such as ls *.txt
to list all files with a txt extension.
In order to get help on any command, type man [command]
(which stands for “manual”). For instance, to get help on find
, type man find
. This proves useful when looking for specific flags. To interrupt a command, type Ctrl + C
. To clear the command line, type Ctrl + L
.
Manipulating files and directories
command | description |
---|---|
cat |
print content onto the standard output stream |
head |
print the first 10 lines of a file |
tail |
print the last 10 lines of a file |
less |
print a file one page at a time |
touch |
create a file |
mv |
rename a file |
rm |
remove a file |
rm -rf |
empty a directory recursively |
rmdir |
remove a directory (only works when the directory is empty) |
mkdir |
create a directory |
ln |
create a symbolic link |
diff |
compare files and directories |
file |
give the real nature of a file |
cp |
copy files on the local machine |
patch |
apply changes to a file (changes are contained in patchfiles) |
which |
locate a program |
whereis |
locate searching in a broader range |
Note: most applications open the file rather than rely on extensions. Extensions are more useful for the user than for the system. Hence the file
command.
The package manager
APT (Advanced Package Tool) is the default package manager on Debian. It is made of two components which are apt-get
(high-level package manager used for dependency resolution) and dpkg
(low-level package manager working under the hood). Most of the time we use apt-get
.
Example commands:
command | description |
---|---|
apt-get upgrade |
upgrade all installed packages to their latest version |
dpkg --list | grep zip |
show all installed packages which name contains “zip” |
Editing files
On the command-line, we can use Nano or Vi.
From the shell, let’s add “Hello World” to file1, then print the contents of file1 to the command line:
$ echo "Hello World" > file1
$ cat file1
Hello World
To open the file in Nano, type nano my_file
.
User management
command | description |
---|---|
id |
information about the current user |
id [username] |
information about [username] |
sudo useradd [username] |
add a new user named [username] |
sudo userdel [username] |
delete user [username] |
sudo userdel [username] -r |
fully delete [username] including his home directory |
Each user has a unique id. Groups gather users with common access rights and privileges. Normal users start at 1000. All users are created with at least one group which id is equal to their uid.
file | description |
---|---|
/etc/group | list of groups |
/etc/passwd | list of users |
The /etc/passwd file is organized as follows:
username:x:UID:GID:comment:home_directory:shell
How to search /etc/passwd for a specific user (here for user “root”):
$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
command | description |
---|---|
useradd |
add a user directory under /home and add a line to /etc/passwd |
userdel |
delete a user but leave the home directory unless userdel -r |
groupadd |
create a new group on the system |
groupdel |
delete an existing group from the system |
whoami |
display the current logged-in user name |
su |
switch user : launch a new shell running as another user |
It is bad practice to use su
to switch to root account. Instead, better use sudo
on a user. This way, sudo privileges are granted to a specific user but only last for a specific command.
Every file is associated with a user and a group
command | description |
---|---|
chown |
change user ownership of a file or directory |
chgrp |
hange group ownership |
chmod |
change the permissions on the file for owner, group and other |
Environment variables
Environment variables are specific variables used by the shell.
command | description |
---|---|
env |
Print current environment variable to the shell |
env | head |
Only print the first 10 variables |
echo $SHELL |
Print the environment variable named SHELL |
PS1 customize your prompt
SHELL default shell for the current user, usually bash
bash:
history
history of commands entered
history | tail
show last 10
otherwise navigate using up/down arrow keys
Ctrl+R: search in history of past commands
Tab: autocomplete
Manipulating files
cat
short for concatenate
$ cat readme.txt
main purpose is to concatenate multiple files together
Some files are very large. Opening the file in an editor may consume too much memory. Using less
allows to load page by page, not saturating the memory along with head
and tail
.
$ tail -15 somefile.log
$ tail -f somefile.log
tail -f
continuously monitors new lines in the log file and displays them in stdout
Text utilities
command | description |
---|---|
cat |
Concatenate files (can also read and print files) |
less |
Print the contents of a file in the shell page by page |
head |
Show the first 10 lines of a file |
tail |
Show the last 10 lines of a file |
sed |
Edit data streams (filter and perform substitutions) |
awk |
Pattern scanning and processing |
sort |
Sort text files and output streams |
uniq |
Remove duplicate consecutive lines |
paste |
Combine fields from different files |
join |
Enhanced version of paste |
split |
Break up large files into equal segments |
wc |
Display the number of lines and words in a file |
cut |
Cut sections in each line of a file |
grep |
Search text files and input streams (works with regex) |
Streams and pipes
Standard streams are communication channels between a computer program and its environment. When we launch a Linux command, three data streams are created:
- stdin: standard input
- stdout: standard output
- stderr: standard error
When a command is executed via a shell, the streams are connected to the terminal on which the shell is running.
Streams can be directed and changed using >
and <
:
$ do_something < input_file
$ do_something > output_file
$ do_something 2> output_file
Programs can be chained together, where the stdout of the previous program is the stdin of the next:
$ command_1 | command_2 | command_3
When several commands are chained together, each command does not have to wait for the previous command to finish before starting.
Searching for files:
command | description |
---|---|
locate |
|
grep |
filter |
locate zip | grep bin |
print all files containing both “zip” and “bin” |
find |
Processes
Some clarification around the terminology:
- a program is an executable sequence of instructions
- a process is the running instance of a program
- a thread is a lightweight process
- a service is process running in the background
A process can run one or several threads, and can also run other processes.
There are two main commands to monitor processes:
command | description |
---|---|
ps |
reports a snapshot of current processes |
top |
task manager that displays information about CPU and memory utilization (auto-refresh) |
A process can be defined by:
- ID type
- Description
- Process ID (PID)
- Unique Process ID number
- Parent Process ID (PPID)
- PID of the parent process
- Thread ID (TID)
- Thread ID number
For single-threaded processes, TID=PID. For multi-threaded processes, each thread shares the same PID but has a unique TID. To terminate a process: kill -SIGKILL <pid>
. You can only terminate your own processes (unless you are root). Processes have priorities (their nice
value, the lower the higher the priority). The “niceness” of a process is its priority. You can can change the niceness of a process from the CLI via renice
command.
Cron
Cron makes it possible to launch background jobs at specific times.
We can schedule tasks by editing the crontab file (for cron table). We can access the crontab file by typing `crontab -e` (“crontab edit”). There are both system-wide and user-specific crontab files.
Cron instructions follow a specific format made of 6 fields:
Field | Description | values |
---|---|---|
MIN | Minutes | 0 to 59 |
HOUR | Hours | 0 to 23 |
DOM | Day Of Month | 1 to 31 |
MON | Month | 1 to 12 |
DOW | Day of Week | 0 to 6 (0 = Sunday) |
CMD | Command | Any command to be executed |
Example of cron schedule expression to run script.sh
everyday at 1AM:
0 1 * * * /path/to/my/script.sh
Networks
A network is a group of computers connected together via communication channels. Each device has an IP address (unique logical network address). A DNS converts a hostname into an IP address.
An IPv4 address is dividend into 4 bytes:
- class A network address: first octet as Net ID, other 3 as Host ID max 126 class A. Classes A go from
1.0.0.0
to127.255.255.255
(i.e. 16.7m unique hosts) - classes B and C were added later:
class b: 2 bits 65k unique hosts
class c network addresses: first 3 octets as NET ID, last as host ID 2.1m available. Classes C go from
192.0.0.0
to223.255.255.255
Home network only has 1 IP address. Within the network, you can assign IPs to connected devices either statically or dynamically (DHCP).
NAT = Network Address Translation, allows to share 1 IP address among many locally connected computers, each of which has a unique address only seen on the local network. In a home network, the router acts as a NAT.
Important files:
File | Description |
---|---|
/etc/resolv.conf | |
/etc/hosts | |
/etc/network |
Network interfaces are points of connection between a computer and a network.
Command | Description |
---|---|
ip |
|
ifconfig |
Display active network interfaces |
hostname |
Display current host name |
host |
Show IP address for a domain name |
nslookup |
Same |
ping |
Check whether remote host is alive and responding |
route |
|
netstat |
|
nmap |
|
wget |
Download web pages |
curl |
|
ssh |
Crytographic network protocol |
scp |
Secure copy between 2 network hosts, uses SSH protocol |
ssh <hostname> <command> |
SSH Tunnel |
References: