Linux 101
What is Linux?
Linux is an open-source operating system. Its core components are a boot loader, a kernel, basic services including a shell, and a package manager. It can also come with a display manager and some applications (like LibreOffice or GIMP).
There are tons of different Linux distributions, but most of them belong to a specific family. The three main families of distributions are:
Since I use Ubuntu on my WSL and Amazon EC2, and Ubuntu comes from Debian, I will only write down useful commands and tools from the Debian family.
"Everything is a file":
root directory != root user
Linux is packed with choices.
From a command line perspective.
The filesystem
The filesystem works as a tree.
The root is accessible via `/`. All other files, directories and partitions are located under the root. Unlike Windows, which assigns a letter to each drive, Linux mounts additional drives as directories in the tree, commonly under `/mnt` or `/media`.
The Filesystem Hierarchy Standard (FHS) describes the main top-level directories located just under root:
name | description |
---|---|
/bin | User binaries |
/boot | Boot loader files |
/dev | Device files |
/etc | Configuration files |
/home | Home directories |
/lib | System libraries |
/media | Removable devices |
/mnt | Mount directory |
/opt | Optional applications |
/proc | Process information |
/root | Home directory for the root user |
/sbin | System binaries |
/srv | Service data |
/tmp | Temporary files |
/usr | Multi-user programs |
/var | Variable Files |
* The filesystem is case-sensitive
* Each user has a directory under /home named after their username
* /proc contains virtual files that only exist in memory
* /etc contains system configuration files (only the root user can edit)
* /lib contains dynamically loaded libraries
What does it mean to mount a filesystem?
`diff`: compare files and directories
`diff [options] <filename1> <filename2>`
To compare binary files, use `cmp`.
`patch`: applies changes to a file (the changes are contained in patch files)
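As a quick illustration of `diff` and `cmp`, here is a minimal sketch that compares two small files. The file names and contents are made up for the example, and everything happens in a temporary directory.

```shell
# Work in a throwaway directory so nothing on the system is touched.
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/old.txt"
printf 'apple\nblueberry\ncherry\n' > "$workdir/new.txt"

# diff prints the differing lines; it exits with 0 when the files are
# identical and 1 when they differ.
diff_status=0
diff "$workdir/old.txt" "$workdir/new.txt" || diff_status=$?

# cmp -s compares byte by byte and stays silent; only the exit status
# tells whether the files differ.
cmp_status=0
cmp -s "$workdir/old.txt" "$workdir/new.txt" || cmp_status=$?

echo "diff exit status: $diff_status, cmp exit status: $cmp_status"
rm -rf "$workdir"
```

Checking the exit status rather than the printed output is handy in scripts, where only "same or different" matters.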
Most applications open the file rather than rely on extensions. Extensions are more useful for the user than for the system.
`file`: gives the real nature of a file
`cp`: copy files on the local machine
`rsync`: powerful utility that synchronizes contents. It can copy to a destination over the network, only transmits the differences, and supports recursion inside a directory.
Filesystems can be mounted anywhere on the main filesystem. For example, a removable device like a USB key is attached to a mount point so that its files can be accessed from the main tree.
The command line
To get help on a command, use `man <command>` (which stands for "manual"). For instance, to get help on `find`, type `man find`. This proves useful when looking for specific flags.
To clear the command line: Ctrl + L
To interrupt a command: Ctrl + C
A first example
Typing `df -Th /` displays the amount of available disk space for the filesystem mounted at `/`, along with its type (`-T`) and with sizes in human-readable units (`-h`).
Basic commands for directory navigation
command | description |
---|---|
pwd | Print working directory |
cd | Change directory |
cd ~ | Change to your home directory |
cd .. | Change to parent directory |
cd - | Change to previous directory |
cd / | Change to the root directory |
ls | List the contents of the directory |
tree | Display a tree view of the filesystem |
Flags often used with `ls`:
* `ls -a` lists all contents including hidden files and directories
* `ls -i` includes the inode of each file
* `ls -h` displays file sizes in a human-readable format (when combined with `-l`)
* `ls -l` shows more information (permissions, owner, size, modification time)

Flags can be combined: `ls -aihl` does all of the above at once. `ls` also works with globbing patterns, which the shell expands before running the command: `ls *.txt` lists all files with a "txt" extension.
In order to navigate while keeping track of the history, use `pushd` and `popd` instead of `cd`.
Manipulating files and directories:
command | description |
---|---|
cat | Print content onto the standard output stream |
head | Print the first 10 lines of a file |
tail | Print the last 10 lines of a file |
less | Display a file one page at a time |
touch | Create an empty file (or update the timestamp of an existing one) |
mv | Move or rename a file |
rm | Remove a file |
rm -rf | Remove a directory and all of its contents recursively, without confirmation |
rmdir | Remove a directory (only works when the directory is empty) |
mkdir | Create a directory |
ln | Create a hard link (use ln -s for a symbolic link) |
command | description | example |
---|---|---|
which | Locates a program | which diff |
whereis | Locates a program, searching in a broader range (binary, source and manual page files) | whereis diff |
echo | | |
exit | | |
login | | |
shutdown -r | Reboots the system | |
Difference soft link vs hard link
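One way to see the difference is to create both kinds of link to the same file and then delete the original. This is a minimal sketch with made-up file names, run in a temporary directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello World" > original.txt
ln original.txt hard.txt        # hard link: a second name for the same inode
ln -s original.txt soft.txt     # soft (symbolic) link: a small file that stores a path

# The two hard-linked names share one inode number; the symlink has its own.
ls -li

rm original.txt

# The data is still reachable through the hard link...
hard_content=$(cat hard.txt)
# ...but the symbolic link now points at a name that no longer exists.
if [ -e soft.txt ]; then soft_state="valid"; else soft_state="dangling"; fi

echo "hard link content: $hard_content"
echo "soft link is: $soft_state"
```

In short, a hard link keeps the data alive as long as at least one name refers to it, whereas a soft link is only a pointer to a path and breaks when that path disappears.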
The package manager
APT (Advanced Package Tool) is the default package manager on Debian.
It is made of two components:
* `apt-get` is the high-level package manager, used for dependency resolution
* `dpkg` is the low-level package manager working under the hood

Most of the time we use `apt-get`.
Example commands:
command | description |
---|---|
apt-get upgrade | Upgrade all installed packages to their latest version (run apt-get update first to refresh the package lists) |
dpkg --list \| grep zip | Show all installed packages whose name or description contains "zip" |
Editing files
On the command line, we can use Nano or Vi.
The easiest way to write a line to a file is to use `echo` with the `>` redirection (which overwrites the file if it already exists).
From the shell, let's write "Hello World" to file1, then print the contents of file1 to the command line:
$ echo "Hello World" > file1
$ cat file1
Hello World
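To make the difference between overwriting (`>`) and appending (`>>`) concrete, here is a small sketch. It uses a temporary directory, and the extra lines of text are made up for the example.

```shell
workdir=$(mktemp -d)
file="$workdir/file1"

echo "Hello World" > "$file"     # > creates the file, or overwrites it
echo "second line" >> "$file"    # >> appends to the end of the file
lines_after_append=$(wc -l < "$file" | tr -d ' ')

echo "fresh start" > "$file"     # > again: the previous contents are gone
final_content=$(cat "$file")

echo "lines after append: $lines_after_append"
echo "final content: $final_content"
rm -rf "$workdir"
```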
To open a file in Nano:
$ nano my_file
To open the file in Vim:
$ vim my_file
To follow the interactive Vim tutorial, type `vimtutor` from the shell.
User management:
aliases
Command | Description |
---|---|
alias | Show the defined command aliases |
id | Information about the current user |
id <username> | Information about <username> |
sudo useradd <username> | Add a new user named <username> |
sudo userdel <username> | Delete user <username> |
sudo userdel -r <username> | Fully delete <username>, including their home directory |
Each user has a unique user ID (UID). What are group IDs (GID)? [TBC] Groups gather users with common access rights and privileges. UIDs for normal users start at 1000. By default, each user is created with at least one group, whose GID is usually equal to their UID.
File | Description |
---|---|
/etc/group | List of groups |
/etc/passwd | List of users |
`useradd` and `userdel`: `sudo useradd my_user` adds a line for my_user to /etc/passwd. The my_user directory under /home is created when the `-m` flag is passed or when the system defaults enable it.
`userdel` leaves the home directory in place unless `userdel -r` is used.
/etc/group : whoe
How to search /etc/passwd for a specific user (here for user "root"):
$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash
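Each line of /etc/passwd has seven colon-separated fields: login name, password placeholder, UID, GID, comment, home directory and login shell. The sketch below splits the sample line shown above into those fields. The variable names are my own, and the line is hard-coded so that the example does not depend on the local system.

```shell
# Sample entry copied from the grep output above.
entry='root:x:0:0:root:/root:/bin/bash'

# Split the line on ":" into its seven fields. Setting IFS on the same line
# as read only changes the separator for that one command.
IFS=: read -r name password uid gid comment home shell <<EOF
$entry
EOF

echo "login name     : $name"
echo "password field : $password"   # "x" means the hash is stored in /etc/shadow
echo "UID / GID      : $uid / $gid"
echo "comment        : $comment"
echo "home directory : $home"
echo "login shell    : $shell"
```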
groupadd
groupdel
whoami
columns meaning in /etc/passwd [TBC]
root is the superuser account and is very powerful.
`su` (switch user) launches a new shell running as another user.
It's bad practice to use `su` to switch to root. Rather, use `sudo`: sudo privileges need to be granted to specific users, who can then run commands as root by prefixing them with `sudo`. The elevated privileges only last for that specific command.
Every file is associated with a user and a group
Command | Description |
---|---|
chown | Change user ownership of a file or directory |
chgrp | Change group ownership |
chmod | Change the permissions on the file for owner, group and other |
Usage of chmod:
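`chmod` accepts permissions either in octal form (one digit each for owner, group and other, where read = 4, write = 2, execute = 1) or in symbolic form (`u`, `g`, `o` combined with `+`, `-`, `=` and `r`, `w`, `x`). Here is a minimal sketch on a made-up script file in a temporary directory:

```shell
workdir=$(mktemp -d)
script="$workdir/hello.sh"
printf '#!/bin/sh\necho "hello"\n' > "$script"

# Octal form: 6 = rw- for the owner, 4 = r-- for the group, 0 = --- for other.
chmod 640 "$script"
mode_octal=$(ls -l "$script" | cut -c1-10)

# Symbolic form: add (+) the execute bit (x) for the owner (u).
chmod u+x "$script"
mode_symbolic=$(ls -l "$script" | cut -c1-10)

echo "after chmod 640: $mode_octal"      # -rw-r-----
echo "after chmod u+x: $mode_symbolic"   # -rwxr-----
rm -rf "$workdir"
```

The first character of the mode string is the file type (`-` for a regular file), followed by three characters each for owner, group and other.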
Environment variables
Specific values used by the shell.
Command | Description |
---|---|
env | Print the current environment variables |
env \| head | Only print the first 10 variables |
echo $SHELL | Print the value of the environment variable named SHELL |
* `PS1`: customizes your prompt
* `SHELL`: default shell for the current user, usually bash
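A plain shell variable becomes an environment variable only once it is exported. The sketch below shows the difference by asking a child shell what it sees. `GREETING` is a made-up variable name used for illustration.

```shell
# Make sure the example starts from a clean state.
unset GREETING

# A plain shell variable is only visible in the current shell.
GREETING="hello"
seen_before_export=$(sh -c 'echo "${GREETING:-unset}"')

# export turns it into an environment variable inherited by child processes.
export GREETING
seen_after_export=$(sh -c 'echo "${GREETING:-unset}"')

echo "child shell before export: $seen_before_export"
echo "child shell after export:  $seen_after_export"
```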
bash:
* `history`: history of commands entered
* `history | tail`: show the last 10 commands
* Up/down arrow keys: navigate through the history
* Ctrl+R: search in the history of past commands
* Tab: autocomplete
Manipulating
`cat` is short for "concatenate":
$ cat readme.txt
Its main purpose is to concatenate multiple files together, but it is often used to print a single file.
Some files are very large, and opening them in an editor may consume too much memory. `less` reads the file page by page instead of loading it all at once, while `head` and `tail` only print the beginning or the end of the file.
$ tail -n 15 somefile.log
$ tail -f somefile.log
`tail -n 15` prints the last 15 lines; `tail -f` continuously monitors new lines in the log file and displays them on stdout.
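To try `head` and `tail` without hunting for a real log, the sketch below generates a 100-line sample file in a temporary directory. The line contents are made up.

```shell
workdir=$(mktemp -d)
log="$workdir/somefile.log"

# Build a 100-line sample file ("line 1" ... "line 100").
i=1
while [ "$i" -le 100 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$log"

first_two=$(head -n 2 "$log")   # the first 2 lines
last_one=$(tail -n 1 "$log")    # the last line

echo "$first_two"
echo "$last_one"
rm -rf "$workdir"
```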
Text utilities
Command | Description |
---|---|
cat | Concatenate files (can also read and print files) |
less | Display the contents of a file in the shell page by page |
head | Show the first 10 lines of a file |
tail | Show the last 10 lines of a file |
sed | Edit data streams (filter and perform substitutions) |
awk | Pattern scanning and processing |
sort | Sort text files and output streams |
uniq | Remove duplicate consecutive lines |
paste | Combine fields from different files |
join | Join the lines of two files on a common field |
split | Break up large files into equal segments |
wc | Display the number of lines, words and bytes in a file |
cut | Cut sections in each line of a file |
grep | Search text files and input streams (works with regex) |
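These utilities are most useful when chained together. The sketch below finds the most frequent user in a small made-up CSV file (the file name, columns and values are purely illustrative). The `|` operator used to chain the commands is covered in the next section.

```shell
workdir=$(mktemp -d)
data="$workdir/access.csv"   # hypothetical sample file with two columns: user,action
printf 'alice,login\nbob,login\nalice,upload\ncarol,login\nalice,logout\n' > "$data"

# cut keeps field 1, sort groups identical names together, uniq -c counts
# each group, sort -rn puts the biggest count first, head keeps the top line
# and awk prints its second column (the name).
top_user=$(cut -d, -f1 "$data" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

# grep -c counts the lines matching the regex; wc -l counts all lines.
login_count=$(grep -c ',login$' "$data")
total_lines=$(wc -l < "$data" | tr -d ' ')

echo "most active user: $top_user"
echo "logins: $login_count out of $total_lines events"
rm -rf "$workdir"
```

`sort` comes before `uniq` because `uniq` only collapses duplicate lines that are adjacent.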
Streams and pipes
Standard streams are communication channels between a computer program and its environment. When we launch a Linux command, three data streams are created:
- stdin: standard input
- stdout: standard output
- stderr: standard error
When a command is executed via a shell, the streams are connected to the terminal on which the shell is running.
Streams can be redirected using `<`, `>` and `2>`:
$ do_something < input_file      # read stdin from input_file
$ do_something > output_file     # write stdout to output_file
$ do_something 2> output_file    # write stderr to output_file
Programs can be chained together, where the stdout of the previous program is the stdin of the next:
$ command_1 | command_2 | command_3
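The sketch below puts redirections and a pipe to work on real commands. It sends the stdout and stderr of `ls` to two separate files, then counts words through a pipe. The file names are made up and everything lives in a temporary directory.

```shell
workdir=$(mktemp -d)
touch "$workdir/exists.txt"

# ls writes the existing file's name to stdout and an error about the missing
# file to stderr; each stream goes to its own file. ls exits with a non-zero
# status here, hence the "|| true".
ls "$workdir/exists.txt" "$workdir/missing.txt" > "$workdir/out.log" 2> "$workdir/err.log" || true

stdout_lines=$(wc -l < "$workdir/out.log" | tr -d ' ')
stderr_lines=$(wc -l < "$workdir/err.log" | tr -d ' ')

# A pipe connects the stdout of one command to the stdin of the next:
# tr puts each word on its own line, then wc -l counts the lines.
word_count=$(printf 'one two three\n' | tr ' ' '\n' | wc -l | tr -d ' ')

echo "stdout lines: $stdout_lines, stderr lines: $stderr_lines, words: $word_count"
rm -rf "$workdir"
```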
Searching for files:
command | description |
---|---|
locate | Find files by name |
grep | Filter lines matching a pattern |

`locate zip | grep bin`: prints the paths of all files whose path contains both "zip" and "bin"
`find` command
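As a starting point for `find`, here is a minimal sketch that searches a small made-up directory tree by file type and name. Note that `-maxdepth` is not part of POSIX, although GNU, BSD and BusyBox `find` all support it.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/docs/archive"
touch "$workdir/notes.txt" "$workdir/docs/readme.txt" \
      "$workdir/docs/archive/old.txt" "$workdir/docs/image.png"

# -type f restricts the search to regular files, -name matches the file name.
# The pattern is quoted so that find, not the shell, expands it.
txt_count=$(find "$workdir" -type f -name '*.txt' | wc -l | tr -d ' ')

# -maxdepth 1 stops find from descending into subdirectories.
top_level=$(find "$workdir" -maxdepth 1 -type f -name '*.txt')

echo "txt files in the whole tree: $txt_count"
echo "txt files at the top level: $top_level"
```

Unlike `locate`, `find` walks the directory tree at the time it runs, so it is slower on large trees but always up to date.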
Processes
Some clarification around the terminology:
- a program is an executable sequence of instructions
- a process is the running instance of a program
- a thread is a lightweight process
- a service is a process running in the background
A process can run one or several threads, and can also run other processes.
There are two main commands to monitor processes:
- `ps` (process status) reports a snapshot of the current processes
- `top` (table of processes) is a task manager program that displays information about CPU and memory utilization. Also, it auto-refreshes (unlike `ps`).
A process can be defined by:
ID type | Description |
---|---|
Process ID (PID) | Unique Process ID number |
Parent Process ID (PPID) | PID of the parent process |
Thread ID (TID) | Thread ID number |
To terminate a process: `kill -SIGKILL <pid>`. SIGKILL cannot be caught or ignored by the process; a plain `kill <pid>` sends SIGTERM instead, which lets the process clean up before exiting.
You can only terminate your own processes (unless you are root). Processes have priorities, expressed by their `nice` value ("niceness"): the lower the value, the higher the priority. You can change the niceness of a running process from the CLI via the `renice` command.
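The life cycle of a process can be sketched with a harmless background `sleep`: start it, check that it exists, terminate it and collect its exit status. This is only an illustration; `sleep 60` stands in for any long-running program.

```shell
# Start a long-running process in the background; $! holds its PID.
sleep 60 &
pid=$!

# kill -0 sends no signal at all; it only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then running_before="yes"; else running_before="no"; fi

# Plain kill sends SIGTERM, the polite way to ask a process to stop.
kill "$pid"

# wait collects the child's exit status, which is non-zero because the
# process was ended by a signal.
exit_status=0
wait "$pid" || exit_status=$?

if kill -0 "$pid" 2>/dev/null; then running_after="yes"; else running_after="no"; fi

echo "running before kill: $running_before"
echo "running after kill:  $running_after (exit status $exit_status)"
```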
Cron
Cron makes it possible to launch background jobs at specific times.
We can schedule tasks by editing the crontab file (for cron table). We can access the crontab file by typing `crontab -e` ("crontab edit"). There are both system-wide and user-specific crontab files.
Cron instructions follow a specific format made of 6 fields:
Field | Description | Values |
---|---|---|
MIN | Minutes | 0 to 59 |
HOUR | Hours | 0 to 23 |
DOM | Day Of Month | 1 to 31 |
MON | Month | 1 to 12 |
DOW | Day of Week | 0 to 6 (0 = Sunday) |
CMD | Command | Any command to be executed |
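To see how the six fields line up, the sketch below splits a sample cron entry with the shell's `read`. The entry and the script path `/usr/local/bin/backup.sh` are hypothetical; as written, it would run at 02:30 every Monday.

```shell
# Hypothetical entry: MIN=30 HOUR=2 DOM=* MON=* DOW=1, then the command.
entry='30 2 * * 1 /usr/local/bin/backup.sh'

# read splits on whitespace; the last variable receives the rest of the line,
# so a command containing spaces would stay in one piece.
read -r min hour dom mon dow cmd <<EOF
$entry
EOF

echo "minute=$min hour=$hour day-of-month=$dom month=$mon day-of-week=$dow"
echo "command=$cmd"
```

A `*` in a field means "every value", which is why the day-of-month and month fields do not restrict the schedule here.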
Networks
A network is a group of computers connected together via communication channels. Each device has an IP address (unique logical network address).
DNS (Domain Name System) resolves a hostname to an IP address.
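Structure of an IPv4 address:
An IPv4 address is divided into 4 bytes (octets). In the original classful scheme:
- Class A network addresses use the first octet as Net ID and the other 3 as Host ID. They go from 1.0.0.0 to 127.255.255.255, with at most 126 usable class A networks (127.x.x.x is reserved for loopback) of about 16.7 million unique hosts each.
- Classes B and C were added later.
- Class B network addresses use the first 2 octets as Net ID and the last 2 as Host ID (about 65k unique hosts per network). They go from 128.0.0.0 to 191.255.255.255.
- Class C network addresses use the first 3 octets as Net ID and the last as Host ID. About 2.1 million class C networks are available, and they go from 192.0.0.0 to 223.255.255.255.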
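Since the class only depends on the first octet, it can be worked out with a few comparisons. Below is a minimal sketch: `ip_class` is a helper name of my own, the class B range (128 to 191) comes from the classful addressing scheme, and the input is not validated.

```shell
# Print the historical class (A, B or C) of a dotted-quad IPv4 address,
# based only on its first octet. Anything else is reported as "other".
ip_class() {
    first_octet=${1%%.*}    # strip everything from the first dot onwards
    if [ "$first_octet" -ge 1 ] && [ "$first_octet" -le 127 ]; then
        echo "A"
    elif [ "$first_octet" -ge 128 ] && [ "$first_octet" -le 191 ]; then
        echo "B"
    elif [ "$first_octet" -ge 192 ] && [ "$first_octet" -le 223 ]; then
        echo "C"
    else
        echo "other"
    fi
}

ip_class 10.0.0.1       # A
ip_class 172.16.0.1     # B
ip_class 192.168.1.1    # C
```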
A home network typically has only 1 public IP address. Within the network, you can assign IPs to connected devices either statically or dynamically (DHCP).
NAT (Network Address Translation) allows many locally connected computers to share 1 IP address; each of them has a unique address that is only seen on the local network. In a home network, the router performs NAT.
Important files:
File | Description |
---|---|
/etc/resolv.conf | |
/etc/hosts | |
/etc/network | |
Network interfaces are points of connection between a computer and a network.
Command | Description |
---|---|
ip | |
ifconfig | Display active network interfaces |
hostname | Display current host name |
host | Show IP address for a domain name |
nslookup | Same as host |
ping | Check whether remote host is alive and responding |
route | |
netstat | |
nmap | |
wget | Download web pages |
curl | |
ssh | Cryptographic network protocol |
scp | Secure copy between 2 network hosts, uses SSH protocol |
SSH Tunnel
ssh <hostname> <command>