tlouarn

Linux 101


What is Linux?

Linux is an open-source operating system. Its core components are a boot loader, a kernel, basic services including a shell, and a package manager. It can also come with a display manager and some applications (like LibreOffice or GIMP).

There are tons of different Linux distributions, but most of them belong to a specific family. The three main families of distributions are:

[Figure: the three main families of Linux distributions.]

Since I use Ubuntu on WSL and on Amazon EC2, and Ubuntu is derived from Debian, I will only write down useful commands and tools from the Debian family.

"Everything is a file":

The root directory (/) is not the same thing as the root user (the superuser account).

Linux is packed with choices.

From a command line perspective.

The filesystem

The filesystem works as a tree.

The root is accessible via /. All other files, directories and partitions are located under the root. Unlike Windows, which assigns a letter to each drive, Linux mounts additional drives as directories in the tree (conventionally under /mnt or /media).

The Filesystem Hierarchy Standard (FHS) describes the main top-level directories located just under root:

name description
/bin User binaries
/boot Boot loader files
/dev Device files
/etc Configuration files
/home Home directories
/lib System libraries
/media Removable devices
/mnt Mount directory
/opt Optional applications
/proc Process information
/root Home directory for the root user
/sbin System binaries
/srv Service data
/tmp Temporary files
/usr User programs, libraries and documentation
/var Variable files (logs, caches, spools)

Notes:
* The filesystem is case-sensitive
* Each user has a directory under /home named after them
* /proc contains virtual files that only exist in memory
* /etc contains system configuration files (usually editable only by the root user)
* /lib contains dynamically loaded libraries

What does it mean to mount a filesystem?

diff: compare files and directories

diff [options] <filename1> <filename2>

To compare binary files, use cmp.
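As a minimal sketch of how diff and cmp report differences, the snippet below works in a throwaway directory. The file names old.txt and new.txt are made up for the illustration.

```shell
# Work in a throwaway directory so nothing on the system is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'alpha\nbeta\ngamma\n' > old.txt
printf 'alpha\nBETA\ngamma\n' > new.txt

# diff prints the changed lines; it exits with 0 when the files are
# identical and with 1 when they differ.
diff_status=0
diff old.txt new.txt || diff_status=$?

# cmp compares byte by byte; -s keeps it silent and only sets the exit status.
cmp_same=0
cmp -s old.txt old.txt || cmp_same=$?
cmp_diff=0
cmp -s old.txt new.txt || cmp_diff=$?

echo "diff: $diff_status, cmp (same file): $cmp_same, cmp (different files): $cmp_diff"
```

The exit status is what makes both commands convenient inside scripts: a condition can test it without parsing any output.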

patch: applies changes to a file (the changes are described in a patch file, typically produced by diff)

Most applications inspect the contents of a file rather than rely on its extension. Extensions are more useful to the user than to the system.

file: gives the real nature of a file

cp: copy files on the local machine

rsync: a powerful utility that synchronizes the contents of two locations. It can copy to a destination over the network, only transmits the differences, and supports recursion inside a directory.

Filesystems can be mounted anywhere on the main filesystem. For example, a removable device like a USB key is attached to a mount point (a directory) so that its files can be accessed from the main tree.

The command line

In order to get help on any command, type man <command> (which stands for "manual"). For instance, to get help on find, type man find. This proves useful when looking for specific flags.

To clear the command line: Ctrl + L

To interrupt a command: Ctrl + C

A first example

Typing df -Th / displays the amount of available disk space on the filesystem that contains /, along with its type (-T) and with sizes in human-readable units (-h).

Basic commands for directory navigation

command description
pwd Print working directory
cd Change directory
cd ~ Change to your home directory
cd .. Change to parent directory
cd - Change to previous directory
cd / Change to the root directory
ls List the contents of the directory
tree Display a tree view of the filesystem

Flags often used with ls:

  • ls -a lists all contents including hidden files and directories
  • ls -i includes the inode of each file
  • ls -h displays file sizes in a human-readable format (used together with -l)
  • ls -l uses the long format (permissions, owner, size, modification date)

Flags can be combined: ls -aihl does all of the above at once. ls also works with globbing patterns, such as ls *.txt to list all files with a "txt" extension.
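Here is a small sketch of the flags and of globbing, run in a throwaway directory. The file names are made up for the example.

```shell
# Throwaway directory with a few sample files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch notes.txt todo.txt script.sh .hidden

# The shell expands *.txt before ls runs, so ls only receives the matching names.
txt_files=$(ls *.txt)
echo "$txt_files"

# Without -a, names starting with a dot are skipped.
# With -a, they are listed, along with . and .. entries.
count_default=$(ls | wc -l)
count_all=$(ls -a | wc -l)
echo "entries without -a: $count_default, with -a: $count_all"

# Combined flags: all entries, inode numbers, human-readable sizes, long format.
ls -aihl
```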

To navigate while keeping a stack of the directories visited, use pushd and popd instead of cd.

Manipulating files and directories:

command description
cat Print content onto the standard output stream
head Print the first 10 lines of a file
tail Print the last 10 lines of a file
less Print a file one page at a time
touch Create a file
mv Move or rename a file
rm Remove a file
rm -rf Remove a directory and all its contents recursively, without prompting
rmdir Remove a directory (only works when the directory is empty)
mkdir Create a directory
ln Create a hard link (ln -s creates a symbolic link)

It is common to pipe the output of a command to less in order to read the result page by page.

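| command | description | example |
| --- | --- | --- |
| which | Locates a program in the PATH | which diff |
| whereis | Locates the binary, source and manual page of a command | whereis diff |
| echo | Prints its arguments to the standard output | |
| exit | Exits the current shell | |
| login | Starts a session as a user | |
| shutdown -r | Reboots the system | |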

Difference soft link vs hard link
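The difference is easiest to see by deleting the original file. This is a minimal sketch in a throwaway directory, with made-up file names: a hard link is a second name for the same inode, while a symbolic (soft) link is a small file that stores a path.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "original content" > target.txt

ln target.txt hard.txt      # hard link: another name for the same inode
ln -s target.txt soft.txt   # symbolic link: stores the path "target.txt"

# The first column is the inode number: hard.txt and target.txt share it.
ls -li

# Remove the original name.
rm target.txt

# The data is still reachable through the hard link...
hard_content=$(cat hard.txt)

# ...but the symbolic link now points to a path that no longer exists.
# [ -e ] follows the link, so it fails for a dangling link.
if [ -e soft.txt ]; then soft_state="valid"; else soft_state="dangling"; fi

echo "hard link content: $hard_content"
echo "symbolic link state: $soft_state"
```

The data of a file is only freed once its last hard link is removed, which is why hard.txt survives the rm.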

The package manager

APT (Advanced Package Tool) is the default package manager on Debian.

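Package management relies on two layers:

  • apt-get is the high-level package manager: it downloads packages from the repositories and resolves dependencies
  • dpkg is the low-level package manager working under the hood: it installs and removes individual .deb packages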

Most of the time we use apt-get.

Example commands:

command description
apt-get upgrade Upgrade all installed packages to their latest version (run apt-get update first to refresh the package index)
dpkg --list | grep zip Show all installed packages whose name or description contains "zip"

Editing files

On the command line, the main text editors are Nano and Vi (or its successor, Vim).

The easiest way to write a line to a file is to use echo with the > redirection, which creates the file or overwrites its contents. From the shell, let's write "Hello World" to file1, then print the contents of file1 to the command line:

$ echo "Hello World" > file1
$ cat file1
Hello World
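One detail worth checking by hand is the difference between > and >>. The sketch below runs in a throwaway directory: > replaces the contents of the file, while >> appends to it.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "Hello World" > file1     # creates file1 (or overwrites it)
echo "Second line" >> file1    # appends to the end of file1
after_append=$(cat file1)

echo "Replaced" > file1        # overwrites: the two previous lines are gone
after_overwrite=$(cat file1)

echo "after >>:"
echo "$after_append"
echo "after >:"
echo "$after_overwrite"
```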

To open a file in Nano:

$ nano my_file

To open the file in Vim:

$ vim my_file

To learn Vim interactively, type vimtutor from the shell.

User management:

aliases

Command Description
alias Show the command aliases defined in the current shell
id Information about the current user
id <username> Information about <username>
sudo useradd <username> Add a new user named <username>
sudo userdel <username> Delete user <username>
sudo userdel -r <username> Fully delete <username> including their home directory

Each user has a unique ID (UID) and belongs to one or more groups, each identified by a group ID (GID). Groups gather users with common access rights and privileges. The UIDs of normal users start at 1000. Every user is created with at least one group, whose GID is usually equal to their UID.

File Description
/etc/group List of groups
/etc/passwd List of users

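useradd and userdel: sudo useradd my_user adds a line for my_user to /etc/passwd. On Debian and Ubuntu, the /home/my_user directory is only created when the -m flag is passed (sudo useradd -m my_user).

userdel leaves the home directory in place unless it is called with -r.

/etc/group lists the groups and, for each group, the users who belong to it.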

How to search /etc/passwd for a specific user (here for user "root"):

$ grep root /etc/passwd
root:x:0:0:root:/root:/bin/bash

groupadd: create a new group

groupdel: delete a group

whoami: print the name of the current user

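Each line of /etc/passwd is made of seven colon-separated fields: username, password placeholder, UID, GID, comment, home directory and login shell.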
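As a sketch, the root entry shown earlier can be split into its seven colon-separated fields directly in the shell. The variable names below are my own labels and are not part of the file format.

```shell
# Sample line copied from the grep output above.
line='root:x:0:0:root:/root:/bin/bash'

# IFS=: makes read split the line on colons, for this one command only.
IFS=: read -r name password uid gid comment home shell <<EOF
$line
EOF

echo "username:       $name"
echo "password:       $password (x means the hash is stored in /etc/shadow)"
echo "user ID (UID):  $uid"
echo "group ID (GID): $gid"
echo "comment:        $comment"
echo "home directory: $home"
echo "login shell:    $shell"
```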

root is the superuser account, with unrestricted access to the whole system.

su (switch user): launch a new shell running as another user.

It is bad practice to use su to switch to root. Rather, grant sudo privileges to specific users (on Debian and Ubuntu, usually by adding them to the sudo group). Such a user can then run individual commands as root by prefixing them with sudo. The elevated privileges only last for that specific command.

Every file is associated with a user and a group

Command Description
chown Change user ownership of a file or directory
chgrp Change group ownership
chmod Change the permissions of a file for its owner, its group and others

Usage of chmod:
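chmod accepts either an octal mode or a symbolic one. The sketch below applies both to a made-up file in a throwaway directory and reads the result back from ls -l.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch script.sh

# Octal mode: one digit each for owner, group and others (r=4, w=2, x=1).
# 640 = rw- for the owner, r-- for the group, --- for others.
chmod 640 script.sh
perm_octal=$(ls -l script.sh | cut -c1-10)

# Symbolic mode: add (+) the execute bit (x) for the owner (u).
chmod u+x script.sh
perm_symbolic=$(ls -l script.sh | cut -c1-10)

echo "after chmod 640: $perm_octal"
echo "after chmod u+x: $perm_symbolic"
```

The first character of the permission string is the file type (- for a regular file), followed by three rwx triplets for the owner, the group and others.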

Environment variables

Environment variables are named values used by the shell and inherited by the programs it launches.

Command Description
env Print the current environment variables to the shell
env | head Only print the first 10 variables
echo $SHELL Print the value of the environment variable named SHELL
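A variable only becomes part of the environment once it is exported. This sketch uses a made-up variable named GREETING and checks what a child process can see before and after the export.

```shell
# Make sure the hypothetical variable does not already exist.
unset GREETING

# A plain shell variable is visible in the current shell only.
GREETING="hello"
child_before=$(sh -c 'echo "${GREETING:-unset}"')

# export adds it to the environment passed on to child processes.
export GREETING
child_after=$(sh -c 'echo "${GREETING:-unset}"')

echo "child sees before export: $child_before"
echo "child sees after export:  $child_after"
```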

PS1: customizes your prompt.

SHELL: default shell for the current user, usually bash.

Useful bash features:

history: shows the history of commands entered

history | tail: shows the last 10 commands

Up/down arrow keys: navigate through past commands

Ctrl+R: search the history of past commands

Tab: autocomplete

Manipulating

cat is short for "concatenate":

$ cat readme.txt

Its main purpose is to concatenate multiple files together, but it is often used to print a single file.

Some files are very large, and opening them in an editor may consume too much memory. less loads a file page by page, and head and tail only read the beginning or the end of it, so none of them saturates the memory.

$ tail -15 somefile.log
$ tail -f somefile.log

tail -f keeps watching the log file and prints new lines to stdout as they are appended.
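To try head and tail without a real log, the sketch below generates a 100-line file in a throwaway directory. The file name and its contents are made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate a sample log of 100 numbered lines.
i=1
while [ "$i" -le 100 ]; do
  echo "line $i"
  i=$((i + 1))
done > somefile.log

# -n sets how many lines to print (10 when omitted).
first_lines=$(head -n 2 somefile.log)
last_line=$(tail -n 1 somefile.log)

echo "$first_lines"
echo "$last_line"
```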

Text utilities

Command Description
cat Concatenate files (can also read and print files)
less Print the contents of a file in the shell page by page
head Show the first 10 lines of a file
tail Show the last 10 lines of a file
sed Edit data streams (filter and perform substitutions)
awk Pattern scanning and processing
sort Sort text files and output streams
uniq Remove duplicate consecutive lines
paste Merge the corresponding lines of several files side by side
join Join the lines of two files on a common field
split Break up a large file into fixed-size pieces
wc Display the number of lines, words and bytes in a file
cut Cut sections in each line of a file
grep Search text files and input streams (works with regex)
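A few of these utilities in action, as a sketch on made-up data (one "user,role" record per line):

```shell
# Sample records, including one duplicate.
data='alice,admin
bob,dev
carol,dev
alice,admin'

# cut extracts the second comma-separated field; sort groups equal lines
# together so that uniq can drop the consecutive duplicates.
roles=$(printf '%s\n' "$data" | cut -d, -f2 | sort | uniq)

# grep -c counts the lines matching a regular expression.
dev_count=$(printf '%s\n' "$data" | grep -c ',dev$')

# sed applies a substitution to each line of the stream.
renamed=$(printf 'bob,dev\n' | sed 's/dev$/developer/')

# awk splits each line into fields ($1, $2, ...) using the -F separator.
names=$(printf '%s\n' "$data" | awk -F, '{ print $1 }' | sort -u)

echo "roles: $roles"
echo "number of dev records: $dev_count"
echo "renamed: $renamed"
echo "names: $names"
```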

Streams and pipes

Standard streams are communication channels between a computer program and its environment. When we launch a Linux command, three data streams are created:

  • stdin: standard input
  • stdout: standard output
  • stderr: standard error

When a command is executed via a shell, the streams are connected to the terminal on which the shell is running.

Streams can be redirected: < feeds a file to stdin, > writes stdout to a file, and 2> writes stderr to a file:

$ do_something < input_file
$ do_something > output_file
$ do_something 2> output_file
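The same three redirections with real commands, as a sketch in a throwaway directory. The file names are made up, and missing_file is deliberately absent so that ls writes an error message.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'banana\napple\n' > input_file

# < connects input_file to stdin, > connects stdout to output_file.
sort < input_file > output_file

# ls reports the missing file on stderr: 2> captures that message,
# while > captures the regular output (empty here).
ls missing_file > stdout_file 2> stderr_file || true

echo "sorted output:"
cat output_file
echo "captured error message:"
cat stderr_file
```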

Programs can be chained together, where the stdout of the previous program is the stdin of the next:

$ command_1 | command_2 | command_3

The commands of a pipeline run concurrently: each command starts right away and processes its input as the previous command produces it, without waiting for that command to finish.
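A quick way to see that the commands of a pipeline run at the same time is to pipe a command that never finishes on its own. In this sketch, yes prints the same line forever, yet the pipeline completes because head exits after three lines, which in turn stops yes.

```shell
# yes would run forever if head had to wait for it to finish.
lines=$(yes hello | head -n 3)

echo "$lines"
```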

Searching for files:

command description
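locate Find files by name using a prebuilt index
grep Filter the lines matching a pattern

locate zip | grep bin: prints the paths of all files that contain both "zip" and "bin"

find: searches a directory tree for files matching criteria such as name or type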
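A minimal find sketch on a throwaway directory tree (the directory and file names are made up). Unlike locate, which queries an index built in advance, find walks the tree each time it runs.

```shell
# Build a small sample tree.
workdir=$(mktemp -d)
mkdir -p "$workdir/logs/old"
touch "$workdir/logs/app.log" "$workdir/logs/old/app.log.1" "$workdir/notes.txt"

# Regular files (-type f) whose name matches the pattern.
# The pattern is quoted so that find expands it, not the shell.
log_files=$(find "$workdir" -type f -name '*.log')

# Directories (-type d) with an exact name.
old_dirs=$(find "$workdir" -type d -name 'old')

echo "$log_files"
echo "$old_dirs"
```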

Processes

Some clarification around the terminology:

  • a program is an executable sequence of instructions
  • a process is the running instance of a program
  • a thread is a lightweight process
  • a service is a process running in the background

A process can run one or several threads, and can also spawn other (child) processes.

There are two main commands to monitor processes:

  • ps (process status) reports a snapshot of current processes
  • top (table of processes) is a task manager program that displays information about CPU and memory utilization. Unlike ps, it refreshes automatically.

A process can be defined by:

ID type Description
Process ID (PID) Unique Process ID number
Parent Process ID (PPID) PID of the parent process
Thread ID (TID) Thread ID number

For single-threaded processes, TID = PID. In a multi-threaded process, all threads share the same PID but each one has a unique TID.

To terminate a process: kill <pid> sends SIGTERM, which lets the process clean up before exiting. kill -SIGKILL <pid> forces the termination when the process does not respond.

You can only terminate your own processes (unless you are root). Each process has a priority, expressed by its nice value (its "niceness"): the lower the value, the higher the priority. You can change the niceness of a running process from the CLI with the renice command.
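The sketch below starts a harmless background process (sleep), checks that it exists, terminates it and checks again. It only touches a process created by the script itself.

```shell
# $$ is the PID of the current shell, $PPID the PID of its parent.
echo "this shell: PID $$, parent PID $PPID"

# Start a long-running process in the background; $! holds its PID.
sleep 60 &
bg_pid=$!

# kill -0 sends no signal: it only checks that the process exists.
if kill -0 "$bg_pid" 2>/dev/null; then alive_before="yes"; else alive_before="no"; fi

# Ask the process to terminate (SIGTERM), then wait for it to be reaped.
kill -TERM "$bg_pid"
wait_status=0
wait "$bg_pid" || wait_status=$?

if kill -0 "$bg_pid" 2>/dev/null; then alive_after="yes"; else alive_after="no"; fi

echo "alive before kill: $alive_before"
echo "alive after kill:  $alive_after"
echo "status reported by wait: $wait_status"
```

A non-zero status from wait indicates that the process did not exit normally, here because it was stopped by a signal.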

Cron

Cron makes it possible to launch background jobs at specific times.

We can schedule tasks by editing the crontab file (for cron table). We can access the crontab file by typing `crontab -e` ("crontab edit"). There are both system-wide and user-specific crontab files.

Cron instructions follow a specific format made of 6 fields:

Field Description Values
MIN Minutes 0 to 59
HOUR Hours 0 to 23
DOM Day Of Month 1 to 31
MON Month 1 to 12
DOW Day of Week 0 to 6 (0 = Sunday)
CMD Command Any command to be executed

0 1 * * * /path/to/my/script.sh

Example of a cron schedule expression: run the script every day at 1 AM.
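To make the field order concrete, this sketch splits the example entry into its six fields with the shell's read builtin. It only parses the line and does not install anything in a crontab.

```shell
entry='0 1 * * * /path/to/my/script.sh'

# read splits on whitespace: the first five words are the schedule,
# and the last variable receives the rest of the line (the command).
read -r min hour dom mon dow cmd <<EOF
$entry
EOF

echo "minute:       $min"
echo "hour:         $hour"
echo "day of month: $dom"
echo "month:        $mon"
echo "day of week:  $dow"
echo "command:      $cmd"
```

A * in a field means "every value", so this entry matches minute 0 of hour 1, on every day of every month.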

Networks

A network is a group of computers connected together via communication channels. Each device has an IP address (unique logical network address).

DNS (Domain Name System) translates a hostname into an IP address.

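Structure of an IPv4 address:

An IPv4 address is 32 bits long, divided into 4 bytes (octets). Historically, addresses were grouped into classes:

Class A: the first octet is the Net ID and the other 3 are the Host ID. There are at most 126 usable class A networks, ranging from 1.0.0.0 to 127.255.255.255 (the 127.x.x.x block is reserved for loopback), each with about 16.7m unique hosts.

Classes B and C were added later.

Class B: the first 2 octets are the Net ID and the last 2 are the Host ID, which gives about 65k unique hosts per network. Class B addresses go from 128.0.0.0 to 191.255.255.255.

Class C: the first 3 octets are the Net ID and the last one is the Host ID. About 2.1m networks are available, each with 254 hosts. Class C addresses go from 192.0.0.0 to 223.255.255.255.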
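The class boundaries can be checked with a small shell function that only looks at the first octet. This is a sketch of the historical classful scheme: ip_class is a made-up helper name, and the sample addresses are arbitrary.

```shell
# Print the historical class (A, B or C) of an IPv4 address.
ip_class() {
  first_octet=${1%%.*}   # keep what comes before the first dot
  if [ "$first_octet" -le 127 ]; then
    echo "A"       # leading bit 0 (0.x.x.x and 127.x.x.x are reserved)
  elif [ "$first_octet" -le 191 ]; then
    echo "B"       # leading bits 10
  elif [ "$first_octet" -le 223 ]; then
    echo "C"       # leading bits 110
  else
    echo "other"   # classes D (multicast) and E (reserved)
  fi
}

# Usable hosts per network: 2^(host bits), minus the network
# and broadcast addresses.
hosts_a=$(( (1 << 24) - 2 ))
hosts_b=$(( (1 << 16) - 2 ))
hosts_c=$(( (1 << 8) - 2 ))

echo "10.0.0.1 is class $(ip_class 10.0.0.1), $hosts_a hosts per network"
echo "172.16.0.1 is class $(ip_class 172.16.0.1), $hosts_b hosts per network"
echo "192.168.1.1 is class $(ip_class 192.168.1.1), $hosts_c hosts per network"
```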

A home network typically has a single public IP address. Within the network, you can assign IPs to connected devices either statically or dynamically (DHCP).

NAT (Network Address Translation) makes it possible to share one IP address among many locally connected computers, each of which has its own address that is only visible on the local network. In a home network, the router performs NAT.

Important files:

File Description
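/etc/resolv.conf DNS resolver configuration (name servers)
/etc/hosts Static mappings between hostnames and IP addresses
/etc/network Network interface configuration (Debian family)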

Network interfaces are points of connection between a computer and a network.

Command Description
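ip Show and configure network interfaces, addresses and routes
ifconfig Display active network interfaces (legacy tool, superseded by ip)
hostname Display current host name
host Show IP address for a domain name
nslookup Query DNS servers for a domain name (similar to host)
ping Check whether remote host is alive and responding
route Show and edit the IP routing table (legacy tool)
netstat Display network connections, routing tables and interface statistics
nmap Scan a network for hosts and open ports
wget Download files over HTTP, HTTPS or FTP
curl Transfer data to or from a URL
ssh Open a secure shell on a remote host (SSH is a cryptographic network protocol)
scp Secure copy between 2 network hosts, uses the SSH protocol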

SSH Tunnel

To run a single command on a remote host over SSH: ssh <hostname> <command>