Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

  • can/unix-command-line
  • gdurif/unix-command-line_dev
Commits on Source (45)
Showing with 457 additions and 695 deletions
*.html
.DS_Store
.Rproj.user
/.quarto/
/_book/
*_cache/
# This file is a template, and might need editing before it works on your project.
# Full project: https://gitlab.com/pages/plain-html
pages:
  stage: deploy
  image: lbmc/pandoc:2.11
  image: rocker/tidyverse
  script:
    - mkdir -p public/img
    - make
    - cp img/* public/img/
    - apt update && apt install -y libxt6
    - quarto -v
    - |
      quarto render
      mkdir public
      cp -r _book/* public/
  interruptible: true
  artifacts:
    paths:
      - public
......
# Ssh
---
title: SSH
author: "Laurent Modolo"
---
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Learn basics of ssh connection in GNU/Linux
......@@ -63,7 +77,12 @@ An **IPv4** is composed of 4 digits ranging from 0 to 255 separated by `.` , whi
### IPv6
An **IPv6** is composed of 8 groups of 4 digits long number separated by `:`. The numbers are in hexadecimal format (number of base 16, randing from 0 to 9 and A to F). Compared to **IPv4**, **IPv6** allows for 2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses (approximately 3.4×10^38). For example, an IP address is: *2001:0db8:0000:0000:0000:ff00:0042:8329*
An **IPv6** address is composed of 8 groups of 4 digits separated by `:`.
The digits are in hexadecimal format (base 16, ranging from 0 to 9 and A to F).
Compared to **IPv4**, **IPv6** allows for 2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses (approximately 3.4×10^38).
For example, an IP address is: **2001:0db8:0000:0000:0000:ff00:0042:8329**
To display your VM IP addresses you can use the following command: `ip address show`
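The IPv6 format described above can be explored with a small shell sketch. It splits the example address into its 8 groups and prints the decimal value of each hexadecimal group; the variable names (`addr`, `n_groups`) are chosen here for illustration only.

```sh
# Example address taken from the text above.
addr="2001:0db8:0000:0000:0000:ff00:0042:8329"

n_groups=0
old_ifs=$IFS
IFS=:                      # split the address on ":" instead of whitespace
for group in $addr; do
  n_groups=$((n_groups + 1))
  # printf interprets "0x...." as a hexadecimal number and prints it in base 10
  printf '%s = %d\n' "$group" "0x$group"
done
IFS=$old_ifs               # restore the default field separator

echo "number of groups: $n_groups"
```

Each group of 4 hexadecimal digits encodes 16 bits, so 8 groups give the 128 bits behind the 2^128 figure.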
......@@ -77,14 +96,14 @@ Instead of using IP addresses in your everyday life, you often use the domain na
### Transmission Control Protocol (**TCP**)
The **Transmission Control Protocol** (**TCP**) is one of the main [protocols](https://en.wikipedia.org/wiki/Communications_protocol) of the [Internet protocol suite](https://en.wikipedia.org/wiki/Internet_protocol_suite). TCP provide, reliable, ordered, and error-checked delivery of a stream of data between applications running on hosts communincating over an IP network.
The **Transmission Control Protocol** (**TCP**) is one of the main [protocols](https://en.wikipedia.org/wiki/Communications_protocol) of the [Internet protocol suite](https://en.wikipedia.org/wiki/Internet_protocol_suite). TCP provides reliable, ordered, and error-checked delivery of a stream of data between applications running on hosts communicating over an IP network.
- data arrives in-order
- data has minimal error (i.e., correctness)
- duplicate data is discarded
- lost or discarded packets are resent
- includes traffic congestion control
- Heavtweight (no ordering of messages, no tracking connections, etc. It is a very simple transport layer designed on top of IP)
- Heavyweight (lots of checks)
### **User Datagram Protocol** (**UDP**)
......@@ -98,9 +117,9 @@ UDP uses a simple [connectionless communication](https://en.wikipedia.org/wiki/C
### Port
Higher, communication protocols like TCP and UDP, also define **port**. A **port** is a communication endpoint. When software wants to communicate overt TCP or UDP it will do so using a specific **port**. Each system has **port** numbers ranging from 0 to 65535. **Port** numbered from 0 through 1023 are system **ports** used by well-known processes (you need specific rights to use them).
Higher-level communication protocols like TCP and UDP also define **ports**. A **port** is a communication endpoint. When software wants to communicate over TCP or UDP, it does so using a specific **port**. Each system has **port** numbers ranging from **0** to **65535**. **Ports** numbered from **0** through **1023** are system **ports** used by well-known processes (you need specific rights to use them).
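As a minimal sketch of the ranges given above, the following shell function tells whether a port number is a system port. The function name `port_kind` and the labels it prints are hypothetical, chosen only for this example.

```sh
# Classify a port number according to the ranges described in the text.
port_kind() {
  port=$1
  if [ "$port" -lt 0 ] || [ "$port" -gt 65535 ]; then
    echo "invalid"        # outside the 0-65535 range
  elif [ "$port" -le 1023 ]; then
    echo "system"         # 0-1023: specific rights are needed to use them
  else
    echo "non-system"     # 1024-65535
  fi
}

for p in 22 80 8080; do
  echo "$p: $(port_kind "$p")"
done
```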
Here are a list of notable port numbers:
Here is a list of notable port numbers:
| Number | Assignment |
| ------ | ------------------------------------------------------------ |
......@@ -108,7 +127,7 @@ Here are a list of notable port numbers:
| 21 | [File Transfer Protocol](https://en.wikipedia.org/wiki/File_Transfer_Protocol) (FTP) Command Control |
| 22 | [Secure Shell](https://en.wikipedia.org/wiki/Secure_Shell) (SSH) Secure Login |
| 23 | [Telnet](https://en.wikipedia.org/wiki/Telnet) remote login service, unencrypted text messages |
| 25 | [Simple Mail Transfer Protocol](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol) (SMTP) E-mail routing |
| 25 | [Simple Mail Transfer Protocol](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol) (SMTP) e-mail routing |
| 53 | [Domain Name System](https://en.wikipedia.org/wiki/Domain_Name_System) (DNS) service |
| 67, 68 | [Dynamic Host Configuration Protocol](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol) (DHCP) |
| 80 | [Hypertext Transfer Protocol](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol) (HTTP) used in the [World Wide Web](https://en.wikipedia.org/wiki/World_Wide_Web) |
......@@ -124,11 +143,11 @@ Nowadays, **ports** provide multiplexing, which means that multiple service or c
## SSH
There are numerous other protocols ([RTP](https://en.wikipedia.org/wiki/Real-time_Transport_Protocol) for example). But most of them run over the TCP and UDP protocols. **SSH** or **Secure Shell** is one of them. SSH is a [cryptographic](https://en.wikipedia.org/wiki/Cryptography) [network protocol](https://en.wikipedia.org/wiki/Network_protocol) for operating network services securely over an unsecured network.
There are numerous other protocols ([RTP](https://en.wikipedia.org/wiki/Real-time_Transport_Protocol) for example). But most of them run over the **TCP** and **UDP** protocols. **SSH** or **Secure Shell** is one of them. **SSH** is a [cryptographic](https://en.wikipedia.org/wiki/Cryptography) [network protocol](https://en.wikipedia.org/wiki/Network_protocol) for operating network services securely over an unsecured network.
SSH use a client-server architecture, you use an SSH client to connect to an SSH server. By default most Linux distribution don’t comes with an SSH server installed. For the IFB, SSH connection is the default way to connect to your VMs, so you should have an SSH sever up and running.
**SSH** uses a client-server architecture: you use an **SSH client** to connect to an **SSH server**. By default, most Linux distributions don’t come with an **SSH server** installed. For the IFB, **SSH** connection is the default way to connect to your VMs, so you should have an **SSH** server running.
Find the name of the SSH server process
Find the name of the **SSH** server process
<details><summary>Solution</summary>
<p>
......@@ -138,7 +157,7 @@ ps -el | grep "ssh"
</p>
</details>
SSH uses [Public-key cryptography (or asymmetric cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography)), to secure its communications.
SSH uses [public-key cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography) (or asymmetric cryptography) to secure its communications.
### Public-key cryptography
......@@ -190,9 +209,10 @@ docker run -it --volume /:/root/chroot alpine sh -c "chroot /root/chroot /bin/ba
Using the `sudo` command edit the configuration file to set **PasswordAuthentication** to **yes** and add the following lines:
**AllowUsers etudiant student**
**PermitRootLogin no**
```
AllowUsers etudiant student
PermitRootLogin no
```
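One way to double-check such an edit is to list only the directives you care about. The sketch below runs on a temporary sample file so that it is self-contained; the sample content (including the `Port 22` line) is an assumption made for the example, not the real configuration of your VM.

```sh
# Build a small sample configuration in a temporary file.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Port 22
PasswordAuthentication yes
AllowUsers etudiant student
PermitRootLogin no
EOF

# Show only the three directives set in this exercise.
pattern='^(PasswordAuthentication|AllowUsers|PermitRootLogin)[[:space:]]'
grep -E "$pattern" "$conf"

# Count them: we expect one line per directive.
n_matches=$(grep -Ec "$pattern" "$conf")
echo "directives found: $n_matches"
rm -f "$conf"
```

On the VM, the same `grep` can be pointed at the real configuration file (usually `/etc/ssh/sshd_config`), and `sudo sshd -t` can be used to check the file for syntax errors.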
The `sshd` (SSH Daemon) process is launched and managed by `systemd`. You can manage `systemd` services with the `systemctl` command. Try this command without any arguments. You can search for `sshd` by typing `/sshd` and pressing `enter`. You can leave the `systemctl` view by pressing `q`.
......@@ -221,7 +241,7 @@ sudo useradd -m -s /bin/bash -g users student
sudo passwd student
```
Give the password and your IP on the chat.
Give the password and your IP to another member of your choice (`ip address show`).
## SSH client
......@@ -233,7 +253,7 @@ ssh login@IP_adress
Use this command to connect to another student VM.
On the first connection, `ssh` ask you to accept the public key of the server (key fingerprint). With that in the future if someone try to fool you by impersonating the ssh server, he won’t be able to do it without the corresponding private key.
On the first connection, `ssh` asks you to accept the public key of the server (key fingerprint). This way, if someone later tries to fool you by impersonating the SSH server, they won’t be able to do so without the corresponding private key.
You can close the connection by pressing `ctrl` + `d` or with the command `exit`.
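To see what a key fingerprint looks like, here is a small sketch that generates a throwaway key pair in a temporary folder and prints the fingerprint of its public key with `ssh-keygen -l`. The file name `demo_key` is hypothetical; on a real server, the host public keys are usually stored in `/etc/ssh/`.

```sh
# Work in a temporary folder so that no real key is touched.
tmpdir=$(mktemp -d)

# Generate a throwaway ed25519 key pair without passphrase (-N "") and quietly (-q).
ssh-keygen -q -t ed25519 -N "" -f "$tmpdir/demo_key"

# Print the fingerprint of the public key: this is the kind of string
# that ssh shows you on the first connection to a server.
fingerprint=$(ssh-keygen -l -f "$tmpdir/demo_key.pub")
echo "$fingerprint"

rm -rf "$tmpdir"
```

The owner of the server can run the same `ssh-keygen -l -f` command on the server's public key and give you the result, so that you can compare it with what your client displays.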
......@@ -285,7 +305,7 @@ Sometime, you want to do other things than executing commands on a remove comput
### scp
The `scp` command comes with the `ssh` client installation you can use it to transfer file from your computer to the ssh sever:
The `scp` command comes with the `ssh` client installation. You can use it to transfer files from your computer to the SSH server:
```sh
scp local/path login@IP_adress:remote/path
......@@ -293,7 +313,7 @@ scp local/path login@IP_adress:remote/path
> You can use a relative remote path, where the ":" corresponds to your home folder on the remote server.
You can also retrieve file from the server:
You can also retrieve files from the server:
```sh
scp login@IP_adress:remote/path local/path
......@@ -303,7 +323,7 @@ To transfer directory you can use the `-r` witch
### rsync
`scp` Is a basic command for file transfer. If you want advanced process bar and file integrity checking, you can use the `rsync` command instead.
`scp` is a basic command for file transfer. If you want an advanced progress bar and file integrity checking, you can use the `rsync` command instead.
For example
......@@ -388,7 +408,6 @@ Host work-ens
With the command `ssh work-ens`, the `ssh` client is going to first connect to `ens` and then from `ens` to the `ip.ip.ip.ip` server.
In the next session, we are going to learn how to [install system-wide programs](http://perso.ens-lyon.fr/laurent.modolo/unix/11_install_system_programs.html) like the one managed by `systemd`
> We have used the following commands:
>
......@@ -397,3 +416,4 @@ In the next session, we are going to learn how to [install system-wide programs]
> - scp to copy files
> - rsync to copy files
In the next session, we are going to learn how to [install system-wide programs](./11_install_system_programs.html) like the ones managed by `systemd`.
# Install system programs
---
title: Install system programs
author: "Laurent Modolo"
---
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Learn how to install programs in GNU/Linux
As we have seen in the [4 unix file system](http://perso.ens-lyon.fr/laurent.modolo/unix/4_unix_file_system.html#lib-and-usrlib) session, programs are files that contain instruction for the computer to do things. Those files can be in binary or text format (with a [shebang](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html#shebang)). Any of those files, present in a folder of the [**PATH**](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html#path) variable are executable anywhere by the user. For system wide installation, the program files are copied within shared folder path containained in the [**PATH**](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html#path) variable.
As we have seen in the [4 Unix file system](http://perso.ens-lyon.fr/laurent.modolo/unix/4_unix_file_system.html#lib-and-usrlib) session, programs are files that contain instructions for the computer to do things. Those files can be in binary or text format (with a [shebang](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html#shebang)). Any of those files present in a folder listed in the [**PATH**](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html#path) variable is executable from anywhere by the user. For a system-wide installation, the program files are copied into shared folders contained in the [**PATH**](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html#path) variable.
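As a quick illustration of the role of **PATH**, the sketch below prints each folder of the variable on its own line and asks the shell where it finds the `ls` program. The variable name `ls_path` is only used for this example.

```sh
# Print the folders listed in PATH, one per line.
echo "$PATH" | tr ':' '\n'

# Ask the shell which file it would execute for the "ls" command.
ls_path=$(command -v ls)
echo "ls is found at: $ls_path"
```

The folder that contains `ls` should be one of the folders printed by the first command.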
Developers don’t want to reinvent the wheel each time they want to write complex instructions in their programs; this is why they use shared libraries of pre-written complex instructions. This allows for quicker development, fewer bugs (we only have to debug the library once and use it many times), and also [better memory management](http://perso.ens-lyon.fr/laurent.modolo/unix/6_unix_processes.html#processes-tree) (we only load the library once and it can be used by different programs).
## Package Manager
However, interdependencies between programs and libraries can be a nightmare to handle manually this is why most of the time when you install a program you will use a [package manager](https://en.wikipedia.org/wiki/Package_manager). [Package manager](https://en.wikipedia.org/wiki/Package_manager) are system tools that will handle automatically all the dependencies of a program. They rely on **repositories** of programs and library which contains all the information about the trees of dependence and the corresponding files (**packages**).
However, interdependencies between programs and libraries can be a nightmare to handle manually. This is why, most of the time, you will use a [package manager](https://en.wikipedia.org/wiki/Package_manager) to install a program. [Package managers](https://en.wikipedia.org/wiki/Package_manager) are system tools that automatically handle all the dependencies of a program. They rely on **repositories** of programs and libraries, which contain all the information about the dependency trees and the corresponding files (**packages**).
System-wide installation steps:
Systemwide installation steps:
- The user asks the package manager to install a program
- The **package manager** queries its repository lists to search for the most recent **package** version of the program (or a specific version)
- The **package manager** constructs the dependency tree of the program
- The **package manager** checks that the new dependency tree is compatible with every other installed program
- The **package manager** install the program **package** and all its dependencies **packages** in their correct version
- The **package manager** installs the program **package** and all its dependency **packages** in their correct versions
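The dependency-ordering part of the steps above can be sketched with the standard `tsort` command, which sorts items so that every dependency comes before what depends on it. This is only a toy model, not how `apt` or any real package manager is implemented, and the three package names are a simplified, illustrative subset of a real dependency tree.

```sh
# Each input line "A B" means: A must be installed before B (B depends on A).
install_order=$(tsort <<'EOF'
libc6 r-base-core
r-base-core r-base
EOF
)

# Packages are listed in an order that satisfies every dependency.
echo "$install_order"
```

A real package manager also has to pick versions and check compatibility with the programs already installed, which this sketch ignores.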
The main difference between GNU/Linux distributions is the package manager they use.
......@@ -41,7 +55,7 @@ docker run -it --volume /:/root/chroot alpine sh -c "chroot /root/chroot /bin/ba
### Installing R
**R** is a complex program that relies on loots of dependencies. Your current VM run on Ubuntu, so we are going to use the `apt` tool (`apt-get` is the older version of the `apt` command, `synaptic` is a graphical interface for `apt-get`)
**R** is a complex program that relies on lots of dependencies. Your current VM runs on Ubuntu, so we are going to use the `apt` tool (`apt-get` is the older version of the `apt` command, `synaptic` is a graphical interface for `apt-get`).
You can check the **r-base** package dependencies on the website [packages.ubuntu.com](https://packages.ubuntu.com/focal/r-base). Not too many dependencies? Check the sub-package **r-base-core**.
......@@ -122,7 +136,7 @@ If it’s not a good idea to have different **package manager** on the same syst
- `install.packages` for R
- ...
These **package managers** allows your to make installation local to the user, which is advisable to avoid any conflict with the **packages manager** of the system.
These **package managers** allow you to install packages locally for the user, which is advisable to avoid any conflict with the **package manager** of the system.
For example, you can use the following command to install `glances` system wide with `pip`
......@@ -168,7 +182,7 @@ wget https://github.com/Automattic/simplenote-electron/archive/v2.7.0.tar.gz
You can use the command `tar -xvf` to extract this archive
When you go into the `simplenote-electron-2.7.0` folder, you can see a `Makefile` this means that you can use the `make` command to build Simplenote from those files. `make` Is a tool that read recipes (`Makefiles`) to build programs.
When you go into the `simplenote-electron-2.7.0` folder, you can see a `Makefile`. This means that you can use the `make` command to build Simplenote from those files. `make` is a tool that reads recipes (`Makefiles`) to build programs.
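To get a feel for what a recipe is, here is a minimal sketch, unrelated to the Simplenote `Makefile`, that writes a one-rule `Makefile` in a temporary folder and runs `make` on it. The target name `hello.txt` and its content are hypothetical.

```sh
# Work in a temporary folder.
workdir=$(mktemp -d)

# Write a Makefile with a single rule: the target "hello.txt" is built
# by the command on the next line (which must start with a tab, here "\t").
printf 'hello.txt:\n\techo "built by make" > hello.txt\n' > "$workdir/Makefile"

# Run make in that folder: it reads the Makefile and builds the first target.
(cd "$workdir" && make)

# Check the result, then clean up.
built=$(cat "$workdir/hello.txt")
echo "content of hello.txt: $built"
rm -rf "$workdir"
```

Real `Makefiles` chain many such rules together, each target listing the files it depends on.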
You can try to install `node` and `npx` with `apt`. What happened ?
......@@ -219,7 +233,6 @@ You can finalize the installation with the command `make install`. Usually the c
Read the `README` file of the [fastp](https://github.com/OpenGene/fastp) program to see which methods of installation are available.
Installing programs and maintain different versions of a program on the same system, is a difficult task. In the next session, we will learn how to use [virtualization](http://perso.ens-lyon.fr/laurent.modolo/unix/12_virtualization.html) to facilitate our job.
> We have used the following commands:
>
......@@ -229,4 +242,5 @@ Installing programs and maintain different versions of a program on the same sys
> - make to build programs from sources
Installing programs and maintaining different versions of a program on the same system is a difficult task. In the next session, we will learn how to use [virtualization](./12_virtualization.html) to facilitate our job.
# Install system programs
---
title: Virtualization
author: "Laurent Modolo"
---
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Learn how to build virtual images or containers of a system
......@@ -49,13 +63,13 @@ sudo apt install virtualbox
sudo usermod -G vboxusers -a $USER
```
The first things that we need to do with virtual box is to create a new virtual machine. We want to install Ubuntu 20.04 on it.
The first thing that we need to do with VirtualBox is to create a new virtual machine. We want to install Ubuntu 20.04 on it.
```sh
VBoxManage createvm --name Ubuntu20.04 --register
```
We the create a virtual hard disk for this VM:
We create a virtual hard disk for this VM:
```sh
VBoxManage createhd --filename Ubuntu20.04 --size 14242
......@@ -73,7 +87,7 @@ We set the virtual RAM
VBoxManage modifyvm Ubuntu20.04 --memory 1024
```
We add an vitual IDE periferic storage on which we can boot on
We add a virtual IDE storage peripheral that we can boot from.
```sh
VBoxManage storagectl Ubuntu20.04 --name IDE --add ide --controller PIIX4 --bootable on
......@@ -108,7 +122,7 @@ You can use the `systemctl` command and the `/` key to search for this daemon.
Like VirtualBox, you can install system programs within a container.
Prebuilt container can be found on different sources like [the docker hub](https://hub.docker.com/) or [the biocontainers registry](https://biocontainers.pro/registry).
Prebuilt containers can be found on different sources like [the docker hub](https://hub.docker.com/) or [the biocontainers registry](https://biocontainers.pro/registry).
Launching a container
......
---
title: First step in a terminal
title: Understanding a computer
author: "Laurent Modolo"
---
# Understanding a computer
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: understand the relations between the different components of a computer
---
## Which parts are necessary to define a computer ?
# Which parts are necessary to define a computer ?
## Computer components
### CPU (Central Processing Unit)
---
![CPU](./img/amd-ryzen-7-1700-cpu-inhand1-2-1500x1000.jpg){width=100%}
# Computer components
### Memory
## CPU (Central Processing Unit)
![CPU](./img/amd-ryzen-7-1700-cpu-inhand1-2-1500x1000.jpg)
#### RAM (Random Access Memory)
## Memory
![RAM](./img/ram.png){width=100%}
### RAM (Random Access Memory)
<img src="./img/220px-Swissbit_2GB_PC2-5300U-555.jpg" alt="RAM" style="zoom:150%;" />
#### HDD (Hard Disk Drive) / SSD (Solid-State Drive)
### HDD (Hard Disk Drive) / SSD (Solid-State Drive)
<img src="./img/220px-Laptop-hard-drive-exposed.jpg" alt="HDD" style="zoom:150%;" />
<img src="./img/220px-SSD_Samsung_960_PRO_512GB_-_front_and_back_-_2018-05-27.jpg" alt="SSD" style="zoom:150%;" />
![HDD](./img/SSD.jpeg){width=100%}
![SSD](./img/hdd.png){width=100%}
## Motherboard
![motherboard](./img/motherboard.jpg)
### Motherboard
## GPU (Graphical Processing Unit)
![GPU](./img/foundation-100046736-orig.jpg)
![motherboard](./img/motherboard.jpg){width=100%}
## Alimentation
![Alim](./img/LD0003357907_2.jpg)
### GPU (Graphical Processing Unit)
---
![GPU](./img/foundation-100046736-orig.jpg){width=100%}
# Computer model: universal Turing machine
### Alimentation
![Alim](./img/LD0003357907_2.jpg){width=100%}
---
## Computer model: universal Turing machine
![width:20% height:20%](./img/lego_turing_machine.jpg)
![width:20% height:20%](./img/lego_turing_machine.jpg){width=100%}
---
# As simple as a Turing machine ?
## As simple as a Turing machine ?
![universal_truing_machine](./img/universal_truing_machine.png)
![Universal Turing Machine](./img/universal_truing_machine.png){width=100%}
- A tape divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet.
- A head that can read and write symbols on the tape and move the tape left and right one (and only one) cell at a time.
......@@ -59,7 +68,7 @@ Objective: understand the relations between the different components of a comput
---
# Basic Input Output System (BIOS)
## Basic Input Output System (BIOS)
> Used to perform hardware initialization during the booting process (power-on startup), and to provide runtime services for operating systems and programs.
......@@ -69,7 +78,7 @@ Objective: understand the relations between the different components of a comput
---
# Unified Extensible Firmware Interface (UEFI)
## Unified Extensible Firmware Interface (UEFI)
Advantages:
......@@ -86,50 +95,54 @@ Disadvantages:
---
# Operating System (OS)
## Operating System (OS)
> A system software that manages computer hardware, software resources, and provides common services for computer programs.
- The first thing loaded by the BIOS/UEFI
- The first thing on the tape of a Turing machine
## Kernel
### Kernel
> The kernel provides the most basic level of control over all of the computer's hardware devices. It manages memory access for programs in the RAM, it determines which programs get access to which hardware resources, it sets up or resets the CPU's operating states for optimal operation at all times, and it organizes the data for long-term non-volatile storage with file systems on such media as disks, tapes, flash memory, etc.
![Kernel](./img/220px-Kernel_Layout.svg.png)
![Kernel](./img/220px-Kernel_Layout.svg.png){width=100%}
---
# UNIX
## UNIX
> Unix is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix,
[![Unix history](./img/1920px-Unix_timeline.en.svg.png)](https://upload.wikimedia.org/wikipedia/commons/7/77/Unix_history-simple.svg)
[![Unix history](./img/1920px-Unix_timeline.en.svg.png){width=100%}](https://upload.wikimedia.org/wikipedia/commons/b/b5/Linux_Distribution_Timeline_21_10_2021.svg)
The ones you are likely to encounter:
- [macOS](https://en.wikipedia.org/wiki/MacOS)
- [BSD (Berkeley Software Distribution) variant](https://www.freebsd.org/)
- [GNU/Linux](https://www.kernel.org/)
The philosophy of UNIX is to have a large number of small programs which do few things but do them well.
# GNU/Linux
## GNU/Linux
Linux is the name of the kernel. To get a full OS, Linux is combined with software from the [GNU Project](https://www.gnu.org/).
The GNU Project, with Richard Stallman, introduced the notion of Free Software:
1. The freedom to run the program as you wish, for any purpose.
2. The freedom to study how the program works, and change it so it does your computing as you wish. Access to the source code is a precondition for this.
3. The freedom to redistribute copies so you can help others.
4. The freedom to distribute copies of your modified versions to others. By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.
![Richard Stallman](https://audio-video.gnu.org/video/TEDxGE2014_Stallman05_LQ.webm)
You can find a [list of software licenses](https://www.gnu.org/licenses/license-list.html)
<video width="100%" controls>
<source src="https://audio-video.gnu.org/video/TEDxGE2014_Stallman05_LQ.webm" type="video/webm">
Your browser does not support the video tag.
</video>
[Instead of installing GNU/Linux on your computer, you are going to learn to use the IFB Cloud.](http://perso.ens-lyon.fr/laurent.modolo/unix/2_using_the_ifb_cloud.html)
See this [presentation](https://plmlab.math.cnrs.fr/gdurif/presentation_foss/-/blob/main/presentation/presentation_DURIF_foss.pdf) (in French) for a quick introduction to **software licenses** and **free/open source software**.
[Instead of installing GNU/Linux on your computer, you are going to learn to use the IFB Cloud.](./2_using_the_ifb_cloud.html)
---
title: IFB (Institu Français de bio-informatique) Cloud
title: IFB (Institut Français de bio-informatique) Cloud
author: "Laurent Modolo"
---
# IFB (Institu Français de bio-informatique) Cloud
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Start and connect to an appliance on the IFB cloud
Instead of working on your computer where you don't have an Unix-like OS or have limited right, we are going to use the [IFB (Institu Français de bio-informatique) Cloud]( https://biosphere.france-bioinformatique.fr/).
Instead of working on your computer, where you may not have a Unix-like OS or may have limited rights, we are going to use the [IFB (Institut Français de bio-informatique) Cloud](https://biosphere.france-bioinformatique.fr/).
## Creating an IFB account
1. Access the [**https://biosphere.france-bioinformatique.fr/**](https://biosphere.france-bioinformatique.fr/) website
2. On the top right (First) steps with GNU/Linux
Instead of working on your computer where you don't have an Unix-like OS or have limited right, we are going to use the [IFB (Institu Français de bio-informatique) Cloud]( https://biosphere.france-bioinformatique.fr/). For this you will need:
1. Access the [**https://biosphere.france-bioinformatique.fr/**](https://biosphere.france-bioinformatique.fr/) website
2. On the top right of the screen click on <img src="./img/signin_ifb.png" alt="sign in" style="zoom:150%;" />
3. Then click on ![login](./img/login_ifb.png)
4. Use the **Incremental search field** to select your identity provider (CNRS / ENS de Lyon / etc.)
5. Login
6. Complete the form with your **Name**, **First Name**, **Town** and **Zip Code**. You can ignore the other fields and click on **accept**.
7. Go to your **Groups** parameters on the top right ![group_selection_ifb](./img/group_selection_ifb.png)
8. Click on ![join_a_group](./img/join_a_group.png) and type **LBMC Unix 2020**
8. Click on ![join_a_group](./img/join_a_group.png) and type **CAN UNIX 2023**
9. You can click on the **+** sign to register and wait to be accepted in the group
## Starting the LBMC Unix 2020 appliance
## Starting the LBMC Unix 2022 appliance
To follow this practical you will need to start the **[LBMC Unix 2020](https://biosphere.france-bioinformatique.fr/catalogue/appliance/177/) ** appliance from the [IFB Cloud](https://biosphere.france-bioinformatique.fr/) and click on the ![start](./img/start_VM.png) button after login with your account.
To follow this practical you will need to start the **[LBMC Unix 2022](https://biosphere.france-bioinformatique.fr/catalogue/appliance/177/)** appliance from the [IFB Cloud](https://biosphere.france-bioinformatique.fr/) and click on the ![start](./img/start_VM.png) button after logging in with your account.
In the IFB jargon, appliance means **virtual machine** (VM). Remember how a universal Turing machine can run any program? A virtual machine is a simulation program, simulating a physical computer. VMs have the following advantages:
- Copies of the VM will be identical (there will be no differences between your running *LBMC Unix 2020 appliance* and mine )
- Upon starting the VM is reset to the *LBMC Unix 2020 appliance* state
- Copies of the VM will be identical (there will be no differences between your running *LBMC Unix 2022 appliance* and mine)
- Upon starting, the VM is reset to the *LBMC Unix 2022 appliance* state
- You can break everything in your VM, terminate it and start a new one.
To access your appliance you can go to the [**myVM** tab](https://biosphere.france-bioinformatique.fr/cloud/)
......@@ -53,7 +56,7 @@ You will need to start this appliance at the start of each session of this cours
The ![hourglass](./img/wait_my_appliances_ifb.png) symbol indicates that your appliance is starting.
## Accessing the LBMC Unix 2020
## Accessing the LBMC Unix 2022
You can open the **https** link next to the termination button of your appliance in a new tab. You will have the following message
......@@ -89,4 +92,4 @@ Then paste your password in the dialog box.
Don't worry, the password will not be displayed (not even in the form of `*****`, so someone looking at your screen will not be able to guess its length). You can press **enter** to log on to your VM.
[First steps in a terminal.](http://perso.ens-lyon.fr/laurent.modolo/unix/3_first_steps_in_a_terminal.html)
\ No newline at end of file
[First steps in a terminal.](./3_first_steps_in_a_terminal.html)
---
title: First step in a terminal
author: "Laurent Modolo"
---
# First step in a terminal
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: learn to use basic terminal commands
......@@ -23,25 +33,26 @@ You can go to this distribution website and have a look at the list of firms usi
A command-line interpreter (or shell) is a piece of software designed to read lines of text entered by a user to interact with an OS.
To simplify, the shell executes the following infinite loop:
1. read a line
2. translate this line as a program execution with its parameters
3. launch the corresponding program with the parameters
4. wait for the program to finish
5. Go back to 1.
When you open a terminal on an Unix-like OS, you will have a **prompt** displayed: it can end with a **$** or a **%** character depending on your configuration. As long as you see your prompt, it means that you are in step **1.**, if no prompt is visible, you are in step **4.** or you have set up a very minimalist configuration for your shell.
When you open a terminal on a Unix-like OS, you will have a **prompt** displayed: it can end with a `$` or a `%` character depending on your configuration. As long as you see your prompt, you are in step **1.**; if no prompt is visible, you are in step **4.** or you have set up a very minimalist configuration for your shell.
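The loop described above can be sketched in a few lines of shell. This is only a toy illustration, not how Bash is actually implemented: the function name `toy_shell` is hypothetical, and a real shell also handles quoting, variables, pipes and much more.

```sh
# A toy "shell": read a line, run it as a program with its parameters, repeat.
toy_shell() {
    while IFS= read -r line; do      # read a line (the loop stops at end of input)
        [ -z "$line" ] && continue   # ignore empty lines
        set -- $line                 # split the line into a program name and its parameters
        "$@"                         # launch the program and wait for it to finish
    done                             # go back to reading
}

# Feed two commands to the toy shell through its standard input
toy_output=$(printf 'echo hello\necho good bye\n' | toy_shell)
echo "$toy_output"
```

Each line sent to `toy_shell` is executed in turn, exactly as if you had typed it after a prompt.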
<img src="./img/prompt.png" alt="prompt" style="zoom:150%;" />
The blinking square or vertical bar represents your **cursor**. Shell predates graphical interfaces, so most of the time you won’t be able to move this cursor with your mouse, but with the directional arrows (left and right).
The blinking square or vertical bar represents your **cursor**. Shell predates graphical interfaces, so most of the time you won’t be able to move this cursor with your mouse, but with the **directional arrows** (left and right).
On the IFB, your prompt is a **$**
On the IFB, your prompt is a `$`:
```sh
etudiant@VM:~$
```
You can identify the following information from your prompt: **etudiant** is your login and **VM** is the name of your VM (**~** is where you are on the computer, but we will come back to that later).
You can identify the following information from your prompt: **etudiant** is your login and **VM** is the name of your VM (`~` is where you are on the computer, i.e. the current working directory, but we will come back to that later).
On Ubuntu 20.04, the default shell is [Bash](https://en.wikipedia.org/wiki/Bash_(Unix_shell)) while on recent versions of macOS it’s [zsh](https://en.wikipedia.org/wiki/Z_shell). There are [many different shells](https://en.wikipedia.org/wiki/List_of_command-line_interpreters), for example, Ubuntu 20.04 also has [sh](https://en.wikipedia.org/wiki/Bourne_shell) installed.
......@@ -127,7 +138,7 @@ What happens when you type `cd` without any argument ?
What is the location shown in your prompt ? Is it coherent with the `pwd` information ? Can you `cd` to the `pwd` path ?
When we move around a file system, we often want to see what is in a given folde. We want to **l**i**s**t the directory content. Go back to the `/home` directory and use to the `ls` command see how many have a home directory there.
When we move around a file system, we often want to see what is in a given folder. We want to **l**i**s**t the directory content. Go back to the `/home` directory and use the `ls` command to see how many people have a home directory there.
We will see various options for the `ls` command, throughout this course. Try the `-a` option.
......@@ -137,9 +148,9 @@ ls -a
What changed compared to the `ls` command without this option ?
Go to your home folder with the bare `cd` command and run the `ls -a` command again. The `-a` option makes the `ls` command list hidden files and folders. On Unix systems, hidden files and folders are all files and folders whose name starts with a "**.**".
Go to your home folder with the bare `cd` command and run the `ls -a` command again. The `-a` option makes the `ls` command list hidden files and folders. On Unix systems, hidden files and folders are all files and folders whose name starts with a `.`.
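As a small sketch of this behaviour, the following commands build a throwaway folder with one regular and one hidden file and compare the two listings. The file names and the variables `demo_dir`, `plain_listing` and `full_listing` are made up for the illustration.

```sh
# Work in a temporary folder so that nothing in your home is modified
demo_dir=$(mktemp -d)
touch "$demo_dir/visible.txt" "$demo_dir/.hidden.txt"

plain_listing=$(ls "$demo_dir")      # hidden files are skipped
full_listing=$(ls -a "$demo_dir")    # hidden files, "." and ".." are listed too

echo "ls    : $plain_listing"
echo "ls -a :" $full_listing

rm -r "$demo_dir"                    # clean up
```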
Can you `cd` to "**.**" ?
Can you `cd` to `.` ?
```sh
cd .
......@@ -147,7 +158,7 @@ cd .
What happened ?
Can you cd to "**..**" ?
Can you cd to `..` ?
```sh
cd ..
......@@ -168,4 +179,4 @@ You can use the `-l` option in combination with the `-a` option to know more abo
> - `ls` for list directory
> - `pwd` for print working directory
[You can now go to the Unix file system.](http://perso.ens-lyon.fr/laurent.modolo/unix/4_unix_file_system.html)
\ No newline at end of file
[You can now go to the Unix file system.](./4_unix_file_system.html)
---
title: GNU/Linux file system
author: "Laurent Modolo"
---
# GNU/Linux file system
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Understand how files are organized in Unix
......@@ -28,19 +38,19 @@ This file system is organized as a tree. As you have seen, every folder has a pa
Every file can be accessed by an **absolute path** starting at the root. Your user home folder can be accessed with the path `/home/etudiant/`. Go to your user home folder.
We can also access file with a **relative path**, using the special folder "**..**". From your home folder, go to the *ubuntu* user home folder without passing by the root (we will see use of the "**.**" folder later).
We can also access files with a **relative path**, using the special folder `..`. From your home folder, go to the *ubuntu* user home folder without passing by the root (we will see use of the `.` folder later).
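To make the difference concrete, here is a minimal sketch using a throwaway folder tree (the names `alice` and `bob` are hypothetical and unrelated to the users of your VM). Both `cd` commands end up in the same folder, one with a relative path, the other with an absolute path.

```sh
start_dir=$(pwd)                              # remember where we started
tree_root=$(mktemp -d)                        # throwaway tree for the example
mkdir -p "$tree_root/home/alice" "$tree_root/home/bob"

cd "$tree_root/home/alice"                    # absolute path: starts from the root
cd ../bob                                     # relative path: up one level, then down into bob
relative_result=$(pwd)

cd "$tree_root/home/bob"                      # same destination with an absolute path
absolute_result=$(pwd)

echo "relative: $relative_result"
echo "absolute: $absolute_result"

cd "$start_dir"                               # go back and clean up
rm -r "$tree_root"
```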
## File Types
As you may have guessed, every file type is not the same. We have already seen that common file and folder are different. Here are the list of file types:
As you may have guessed, not all files are of the same type. We have already seen that common files and folders are different. Here is the list of file types:
- **-** common files
- **d** folders
- **l** links
- **b** disk
- **c** special files
- **s** socket
- **p** named pipes
- `-` common files
- `d` folders
- `l` links
- `b` disk
- `c` special files
- `s` socket
- `p` named pipes
To see the file type you can type the command
......@@ -48,11 +58,11 @@ To see the file type you can type the command
ls -la
```
The first column will tell you the type of the file (here we have only the type "**-**" and "**d**" ). We will come back on the other information later. An other less used command to get fine technical information on a file is the command `stat [file_name]`. Can you get the same information as `ls -la` with `stat` ?
The first column will tell you the type of the file (here we have only the type `-` and `d`). We will come back to the other information later. Another less used command to get detailed technical information on a file is the command `stat [file_name]`. Can you get the same information as `ls -la` with `stat` ?
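If you want to see other types than `-` and `d` without searching the system for them, you can create some yourself. The following sketch works in a temporary folder; the helper `type_of` and the file names are assumptions made for the example.

```sh
types_dir=$(mktemp -d)
touch "$types_dir/file"                 # common file
mkdir "$types_dir/folder"               # folder
ln -s file "$types_dir/link"            # symbolic link
mkfifo "$types_dir/pipe"                # named pipe

# Keep only the first character of the ls -l line: the file type
type_of() { ls -ld "$1" | cut -c1; }

file_type=$(type_of "$types_dir/file")
folder_type=$(type_of "$types_dir/folder")
link_type=$(type_of "$types_dir/link")
pipe_type=$(type_of "$types_dir/pipe")
echo "file: $file_type folder: $folder_type link: $link_type pipe: $pipe_type"

rm -r "$types_dir"
```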
## Common Structure
From the root of the system (**/**), most of the Unix-like distribution will share the same folder arborescence. On macOS, the names will be different because when you sell the most advanced system in the world you need to rename things, with more advanced names.
From the root of the system (`/`), most Unix-like distributions share the same folder tree structure. On macOS, the names will be different because when you sell the most advanced system in the world you need to rename things, with more advanced names.
### `/home`
......@@ -62,7 +72,7 @@ You already know this one. You will find all your file and your configuration fi
You can find the Linux kernel and the boot manager there. What is the name of your boot manager (process by elimination) ?
You can see a new type of file here, the type "**l**". What it the version of the **vmlinuz** kernel ?
You can see a new type of file here, the type `l`. What is the version of the **vmlinuz** kernel ?
### `/root`
......@@ -72,17 +82,17 @@ The home directory of the super user, also called root (we will go back on him l
This folder contains the programs used by the system and its users. Programs are simply files readable by a computer; these files are often in **bin**ary format, which means that it’s extremely difficult for a human to read them.
What is the difference between **/bin** and **/usr/bin** ?
What is the difference between `/bin` and `/usr/bin` ?
**/sbin** stand for system binary. What are the names of the programs to power off and restart your system ?
`/sbin` stands for system binaries. What are the names of the programs to power off and restart your system ?
**/opt** is where you will find the installation of non-conventional programs (if you don’t follow [the guide of good practice of the LBMC](http://www.ens-lyon.fr/LBMC/intranet/services-communs/pole-bioinformatique/ressources/good_practice_LBMC), you can put your bioinformatics tools with crapy installation procedure there).
`/opt` is where you will find the installation of non-conventional programs (if you don’t follow [the guide of good practice of the LBMC](http://www.ens-lyon.fr/LBMC/intranet/services-communs/pole-bioinformatique/ressources/good_practice_LBMC), you can put your bioinformatics tools with crappy installation procedures there).
### `/lib` and `/usr/lib`
These folders contain system libraries. A library is a collection of pieces of code usable by programs.
What is the difference between **/lib** and **/usr/lib**.
What is the difference between `/lib` and `/usr/lib` ?
Search information on the `/lib/gnupg` library on the net.
......@@ -96,7 +106,7 @@ Contains every peripheric
What is the type of the file `stdout` (you will have to follow the links)?
With the command `ls -l` can you identify files of type "**b**" ?
With the command `ls -l` can you identify files of type `b` ?
Using `less` can you visualize the content of the file `urandom` ? What about the file `random` ?
......@@ -116,7 +126,7 @@ You can navigate the file with the navigation arrows. Which group the user `ubun
To close `less` you can press `Q`. Try the opposite of `less`, what are the differences ?
What is the type of the file `autofs.fifo-var-autofs-ifb` in the `run` folder ? From **fifo** in the name, can you guess the function of the "**p**" file ?
What is the type of the file `autofs.fifo-var-autofs-ifb` in the `run` folder ? From **fifo** in the name, can you guess the function of the `p` file ?
There are a few examples of the last type of file in the `run` folder. In which color does the command `ls -l` display them ?
......@@ -146,7 +156,7 @@ stat /var/run
What is the kind of link for `/var/run` ?
Most of the time, when you are going to work with links, you will work with this kind of link. You can create a **l**i**n**k with the command `ln` and the option `-s` for **s**ymbolic.
Most of the time, when you are going to work with links, you will work with this kind of link. You can create a **l**i**n**k with the command `ln` and the option `-s` for a **s**ymbolic link.
The first argument after the option of the `ln` command is the target of the link, the second argument is the link itself:
......@@ -182,7 +192,7 @@ ln .bashrc bashrc_linkb
Use `stat` to also study `bashrc_linka` and `bashrc_linkb`.
What happen when you delete `bashrc_linka` ?
What happens when you delete `bashrc_linka` ?
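The following sketch compares the two kinds of link in a temporary folder, away from your real `.bashrc`. The names `original`, `soft_link`, `hard_link` and the helper `inode_of` are assumptions made for the illustration. It shows that a hard link shares the Inode of its target, while a symbolic link is a separate file that only stores a path.

```sh
links_dir=$(mktemp -d)
echo "some content" > "$links_dir/original"
ln -s original "$links_dir/soft_link"               # symbolic link: stores the path "original"
ln "$links_dir/original" "$links_dir/hard_link"     # hard link: a second name for the same Inode

# ls -i prints the Inode number in the first column
inode_of() { ls -di "$1" | awk '{ print $1 }'; }
original_inode=$(inode_of "$links_dir/original")
hard_inode=$(inode_of "$links_dir/hard_link")
soft_inode=$(inode_of "$links_dir/soft_link")
echo "original: $original_inode hard: $hard_inode soft: $soft_inode"

# Remove the original name: the data survives through the hard link only
rm "$links_dir/original"
hard_content=$(cat "$links_dir/hard_link")
if cat "$links_dir/soft_link" > /dev/null 2>&1; then
    soft_status="readable"
else
    soft_status="broken"
fi
echo "hard link content: $hard_content, symbolic link is $soft_status"

rm -r "$links_dir"
```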
To understand the notion of **Inode** we need to know more about storage systems.
......@@ -200,11 +210,11 @@ You cannot dump data directly into the disk, you need to organize things to be a
![disk](./img/disk.png)
Each media is divided into partition:
Each media is divided into partitions:
![partitions](./img/partition.png)
The media is divided into one or many partition, each of which have a file system type. Examples of file system type are:
The media is divided into one or more partitions, each of which has a file system type. Examples of file system types are:
- fat32, exFAT
- ext3, ext4
......@@ -225,8 +235,8 @@ Find which disk is mounted at the root of the file tree.
> We have seen the commands:
>
> - `stat` to display information on a file
> - `less` to visualise the content of a file
> - `ln` to create link
> - `less` to visualize the content of a file
> - `ln` to create links
> - `mount` to list mount points
[That’s all for the Unix file system, we will come back to it from time to time but for now you can head to the next section.](http://perso.ens-lyon.fr/laurent.modolo/unix/5_users_and_rights.html)
\ No newline at end of file
[That’s all for the Unix file system, we will come back to it from time to time but for now you can head to the next section.](./5_users_and_rights.html)
---
title: Users and rights
author: "Laurent Modolo"
---
# Users and rights
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Understand how rights work in GNU/Linux
......@@ -38,7 +47,7 @@ Check your set of rights on your `.bashrc` file
ls -l ~/.bashrc
```
The first column of the `ls -l` output show the status of the rights on the file
The first column of the `ls -l` output shows the status of the rights on the file
![user_rights](./img/user_right.png)
......@@ -96,7 +105,7 @@ What can you conclude on the symbols `+` , `=`, `-` and `,` with the `chmod` com
> | `-r--r--r--` | 0444 | read |
> | `-r-xr-xr-x` | 0555 | read & execute |
> | `-rw-rw-rw-` | 0666 | read & write |
> | `-rwxr-----` | 0740 | owner can read, write, & execute; group can only read; others have no permissions |
> | `-rwxr-----` | 0740 | owner can read, write, & execute; group can only read; others have no permission |
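To check a line of this table yourself, you can apply an octal mode to a scratch file and read the result back with `ls -l`. This is a minimal sketch: the variable names are hypothetical, and the second `chmod` uses the symbolic notation to set the equivalent of `0644`.

```sh
perm_file=$(mktemp)                                   # scratch file for the example

chmod 0740 "$perm_file"                               # octal notation
octal_view=$(ls -l "$perm_file" | cut -c1-10)         # keep the type and the 9 permission characters
echo "0740 gives $octal_view"

chmod u=rw,g=r,o=r "$perm_file"                       # symbolic notation, same as 0644
symbolic_view=$(ls -l "$perm_file" | cut -c1-10)
echo "u=rw,g=r,o=r gives $symbolic_view"

rm -f "$perm_file"
```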
The default group of your user is the first in the list of the groups you belong to. You can use the command `groups` to display this list. What is your default group ?
......@@ -208,14 +217,14 @@ You can add a new user to your system with the command `useradd`
useradd -m -s /bin/bash -g users -G adm,docker student
```
- `-m` create a hone directory
- `-m` create a home directory
- `-s` specify the shell to use
- `-g` the default group
- `-G` the additional groups
To log into another account you can use the command `su`
What is the difference between the two following command ?
What is the difference between the two following commands ?
```sh
su student
......@@ -235,7 +244,7 @@ You can add new groups to your system with the command `groupadd`
sudo groupadd dummy
```
Then you can add users to these group with the command `usermod`
Then you can add users to this group with the command `usermod`
```sh
sudo usermod -a -G dummy student
......@@ -247,7 +256,7 @@ And check the result:
groups student
```
To remove an user from a group you can rewrite it's list of group with the command `usermod`
To remove a user from a group you can rewrite its list of groups with the command `usermod`
```sh
sudo usermod -G student student
......@@ -257,9 +266,9 @@ Check the results.
## Security-Enhanced Linux
While what you have seen in this section hold true for every Unix system, additionnal rules can be applied to control the rights in Linux. This is what is called [SE Linux](https://en.wikipedia.org/wiki/Security-Enhanced_Linux) (**s**ecurity-**e**nhanced **Linux**)
While what you have seen in this section holds true for every Unix system, additional rules can be applied to control the rights in Linux. This is what is called [SE Linux](https://en.wikipedia.org/wiki/Security-Enhanced_Linux) (**s**ecurity-**e**nhanced **Linux**).
When SE Linux is enabled on a system, every **processes** can be assigned a set of right. This is how, on Android for example, some programs can access your GPS while other cannot etc. In this case it's not the user rights that prevail, but the **process** launched by the user.
When SE Linux is enabled on a system, every **process** can be assigned a set of rights. This is how, on Android for example, some programs can access your GPS while others cannot, etc. In this case it's not the user rights that prevail, but the **process** launched by the user.
> We have seen the commands:
>
......@@ -273,6 +282,6 @@ When SE Linux is enabled on a system, every **processes** can be assigned a set
> - `sudo` to borrow **root** rights
> - `groupadd` to create groups
> - `groups` to list groups
> - `usermod`to manipulate user's to groups
> - `usermod` to manipulate users' groups
[To understand more about processes you can head to the next section.](http://perso.ens-lyon.fr/laurent.modolo/unix/6_unix_processes.html)
\ No newline at end of file
[To understand more about processes you can head to the next section.](./6_unix_processes.html)
---
title: Unix Streams and pipes
author: "Laurent Modolo"
---
# Streams and pipes
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Understand the function of streams and pipes in Unix systems
......@@ -39,7 +46,7 @@ cat .bashrc
## Streams manipulation
You can use the `>` character to redirect a flux toward a file. The following command make a copy of your `.bashrc` files.
You can use the `>` character to redirect a stream toward a file. The following command makes a copy of your `.bashrc` file.
```sh
cat .bashrc > my_bashrc
......@@ -59,11 +66,11 @@ cal -N 2 > my_cal
What is the content of `my_cal` ? What happened ?
The `>` command can have an argument, the syntax to redirect **stdout** to a file is `1>` it's also the default option (equivalent to `>`). Here the `-N` option doesn't exists, `cal` throws an error. Errors are sent to **stderr** which have the number 2.
The `>` operator can take a stream number as a prefix: the syntax to redirect **stdout** to a file is `1>`, which is also the default (equivalent to `>`). Here the `-N` option doesn't exist, so `cal` throws an error. Errors are sent to **stderr**, which has the number 2.
Save the error message in `my_cal` and check the results with `less`.
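The same mechanism can be observed with any program that writes to both streams. Here is a minimal sketch with a made-up function `talk` (an assumption for the example, not a standard command) that sends one message to **stdout** and one to **stderr**, each redirected to its own file.

```sh
# A small function that writes to both streams
talk() {
    echo "normal message"          # goes to stdout (stream 1)
    echo "error message" >&2       # goes to stderr (stream 2)
}

out_file=$(mktemp)
err_file=$(mktemp)
talk 1> "$out_file" 2> "$err_file"   # one file per stream

stdout_content=$(cat "$out_file")
stderr_content=$(cat "$err_file")
echo "stdout file contains: $stdout_content"
echo "stderr file contains: $stderr_content"

rm -f "$out_file" "$err_file"
```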
We have seen tha `>` overwrite the content of the file. Try the following commands:
We have seen that `>` overwrites the content of the file. Try the following commands:
```sh
cal 2020 > my_cal
......@@ -73,7 +80,7 @@ cal -N 2 2>> my_cal
Check the results with the command `less`.
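The difference between `>` and `>>` can also be seen on a tiny example. The sketch below uses a temporary file and hypothetical variable names.

```sh
log_file=$(mktemp)

echo "first" > "$log_file"
echo "second" > "$log_file"          # > overwrites: "first" is lost
after_overwrite=$(cat "$log_file")

echo "third" >> "$log_file"          # >> appends at the end of the file
after_append=$(cat "$log_file")

echo "after >  : $after_overwrite"
echo "after >> :" $after_append

rm -f "$log_file"
```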
The command `>` send the stream from the left to the file on the right. Try the following:
The command `>` sends the stream from the left to the file on the right. Try the following:
```sh
cat < my_cal
......@@ -91,7 +98,7 @@ Type some text and type `EOF` on a new line. `EOF` stand for **e**nd **o**f **f*
What happened ? Can you check the content of `my_notes` ? How would you modify this command to add new notes?
Finaly you can redirect a stream toward another stream with the following syntax:
Finally, you can redirect a stream toward another stream with the following syntax:
```sh
cal -N2 2&> my_redirection
......@@ -132,7 +139,7 @@ Analyze the following command, what would it do ?
wget -q -O - http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz | gzip -dc | less
```
Remember that most Unix command process input and output line by line. Which means that you can process huge dataset without intermediate files or huge RAM capacity.
Remember that most Unix commands process input and output line by line, which means that you can process huge datasets without intermediate files or huge RAM capacity.
> We have used the following commands:
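As a small-scale sketch of this idea, the pipeline below generates one million lines, compresses and decompresses them on the fly, and counts the lines ending with a `7`, without writing anything to disk. The variable `match_count` is only there for the illustration.

```sh
# seq produces the numbers 1 to 1000000, one per line;
# each command starts working as soon as its neighbour produces output.
match_count=$(seq 1 1000000 | gzip -c | gzip -dc | grep -c '7$')
echo "lines ending with 7: $match_count"
```

One number out of ten ends with a `7`, so the count is 100000.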
>
......@@ -141,4 +148,4 @@ Remember that most Unix command process input and output line by line. Which mea
> - `|` the pipe operator to connect processes
> - `wget` to download files
[You can head to the next session to apply pipe and stream manipulation.](http://perso.ens-lyon.fr/laurent.modolo/unix/8_text_manipulation.html)
\ No newline at end of file
[You can head to the next session to apply pipe and stream manipulation.](./8_text_manipulation.html)
---
title: Text manipulation
author: "Laurent Modolo"
---
# Text manipulation
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Learn basics way to work with text file in Unix
Objective: Learn simple ways to work with text files in Unix
One of the great thing with command line tools is that they are simple and fast. Which means that they are great for handle large files. And as bioinformaticians you have to handle large file, so you need to use command line tools for that.
One of the great things with command line tools is that they are simple and fast, which means that they are great for handling large files. As bioinformaticians you have to handle large files, so you need command line tools for that.
## Text search
The file [hg38.ncbiRefSeq.gtf.gz](http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/genes/hg38.ncbiRefSeq.gtf.gz) contains the RefSeq annotation for hg38 in [GTF format](http://www.genome.ucsc.edu/FAQ/FAQformat.html#format4)
We can download files with the `wget` command. Here the annotation is in **gz** format which is a compressed format, you can use the `gzip` tool to hande **gz** files.
We can download files with the `wget` command. Here the annotation is in **gz** format which is a compressed format, you can use the `gzip` tool to handle **gz** files.
One useful command to check large text files is the `head` command.
......@@ -40,7 +46,7 @@ gzip -dc hg38.ncbiRefSeq.gtf.gz | grep "chr2" | head
What is the last annotation on the chromosome 1 (to write a tabulation character you can type `\t`) ?
You can count things in text file with the command `wc` read the `wc` **man**ual to see how you can count line in a file.
You can count things in a text file with the command `wc`. Read the `wc` **man**ual to see how you can count lines in a file.
Does the number of *3UTR* match the number of *5UTR* ?
......@@ -48,7 +54,7 @@ How many transcripts does the gene *CCR7* have ?
## Regular expression
When you do a loot text search, you will encounter regular expression (regexp), which allow you to perform fuzzy search. To run `grep` in regexp mode you can use the switch `-E`
When you do a lot of text searches, you will encounter regular expressions (regexp), which allow you to perform fuzzy searches. To run `grep` in regexp mode you can use the switch `-E`.
The most basic form of regexp is the exact match:
......@@ -56,7 +62,7 @@ The most basic form fo regexp si the exact match:
gzip -dc hg38.ncbiRefSeq.gtf.gz | head | grep -E "gene_id"
```
You can use the `.` wildcard character to match any thing
You can use the `.` wildcard character to match anything
```sh
gzip -dc hg38.ncbiRefSeq.gtf.gz | head | grep -E "...._id"
......@@ -93,7 +99,7 @@ gzip -dc hg38.ncbiRefSeq.gtf.gz | head | perl -E "\d\d[A-Z]\d"
By default, regular expressions will match any part of a string. It’s often useful to *anchor* the regular expression so that it matches from the start or end of the string. You can use
- ^` to match the start of the string.
- `^` to match the start of the string.
- `$` to match the end of the string.
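Here is a minimal sketch of the effect of both anchors on three made-up lines (the `sample` text is an assumption for the example, not an extract of the annotation file). `grep -c` counts the matching lines.

```sh
sample='chr1 gene
chr10 gene
mychr1 exon'

anywhere=$(printf '%s\n' "$sample" | grep -c -E 'chr1')     # matches anywhere in the line
at_start=$(printf '%s\n' "$sample" | grep -c -E '^chr1')    # only lines starting with chr1
at_end=$(printf '%s\n' "$sample" | grep -c -E 'gene$')      # only lines ending with gene

echo "anywhere: $anywhere, at start: $at_start, at end: $at_end"
```

Without an anchor the three lines match, while `^chr1` rejects `mychr1 exon` and `gene$` keeps only the two lines ending with `gene`.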
```sh
......@@ -168,7 +174,7 @@ gzip -dc hg38.ncbiRefSeq.gtf.gz | head | sed -E 's|ncbiRefSeq(.*)(transcript_id
```
</p>
</details>
Regexp can be very complexe see for example [a regex to validate an email on starckoverflow](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/201378#201378). When you start you can always use for a given regexp to a more experienced used (just give him the kind of text you want to match and not match). You can test your regex easily with the [regex101 website](https://regex101.com/).
Regexp can be very complex, see for example [a regex to validate an email on Stack Overflow](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/201378#201378). When you start, you can always ask a more experienced user for a given regexp (just give them the kind of text you want to match and not match). You can test your regex easily with the [regex101 website](https://regex101.com/).
## Sorting
......@@ -269,9 +275,7 @@ You have 3 modes in `vim`:
- The **insert** mode, where you can write things. You enter this mode with the `i` key or any other insertion key (for example `a` to insert after the cursor or `A` to insert at the end of the line)
- The **visual** mode where you can select text for copy/paste action. You can enter this mode with the `v` key
If you want to learn more about `vim` you can start with the https://vim-adventures.com/ website. Once you master `vim` everything is faster but you will have to practice a loot.
In the next session, we are going to apply the logic of pipes and text manipulation to [batch processing.](http://perso.ens-lyon.fr/laurent.modolo/unix/9_batch_processing.html)
If you want to learn more about `vim`, you can start with the <https://vim-adventures.com/> website. Once you master `vim` everything is faster but you will have to practice a lot.
> We have used the following commands:
>
......@@ -286,3 +290,5 @@ In the next session, we are going to apply the logic of pipes and text manipulat
> - `cat` / `paste` for concatenation
> - `nano` / `vim` for text editing
In the next session, we are going to apply the logic of pipes and text manipulation to [batch processing.](./9_batch_processing.html)
# Batch processing
---
title: Batch processing
author: "Laurent Modolo"
---
[![cc_by_sa](./img/cc_by_sa.png)](http://creativecommons.org/licenses/by-sa/4.0/)
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Learn basics of batch processing in GNU/Linux
......@@ -28,7 +42,7 @@ CMD1 || CMD2
## Executing list of commands
The easiest option to execute list of command is to use `xargs`. `xargs` reads arguments from **stdin** and use them as argument for a command. In UNIX systems the command `echo` send string of character into **stdout**. We are going to use this command to learn more about `xargs`.
The easiest option to execute a list of commands is to use `xargs`. `xargs` reads arguments from **stdin** and uses them as arguments for a command. In UNIX systems the command `echo` sends a string of characters to **stdout**. We are going to use this command to learn more about `xargs`.
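As a first sketch of what this means, the commands below send three lines to `xargs`, which turns them into arguments of `echo`. The variable names are only there for the example. The `-n 1` option asks `xargs` to launch one command per argument.

```sh
# All the lines read on stdin become arguments of a single echo
all_at_once=$(printf 'a\nb\nc\n' | xargs echo "letters:")
echo "$all_at_once"

# With -n 1, echo is launched once per argument
one_by_one=$(printf 'a\nb\nc\n' | xargs -n 1 echo "letter:")
echo "$one_by_one"
```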
```sh
echo "hello world"
......@@ -118,7 +132,7 @@ There are also some predefined variables that you can use like.
- `$0` corresponds to all the columns.
- `FS` the field separator used
- `NF` the number of fields separated by `FS`
- `NR` the number for records already read
- `NR` the number of records already read
An `awk` program is a chain of commands with the form `pattern { action }`
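The following sketch puts these pieces together on two made-up, tab-separated lines (the data is an assumption for the example). The condition `NR > 1` skips the first record, and the action prints the record number, the number of fields, the first column and the difference between the third and second columns.

```sh
awk_output=$(printf 'chr1\t100\t200\nchr2\t300\t450\n' |
    awk -F '\t' 'NR > 1 { print NR, NF, $1, $3 - $2 }')
echo "$awk_output"
```

Only the second line passes the condition, so the output is `2 3 chr2 150`.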
......@@ -247,13 +261,13 @@ To tell the system that your text file is a bash script you need to add a **sheb
For example, for a bash script in a system where `bash` is installed in `/bin/bash` the **shebang** is:
```bash
#!/bin/bash
##!/bin/bash
```
When you are not sure `which` path leads to the tool that interprets your script, you can use the following shebang:
```bash
#!/usr/bin/env bash
##!/usr/bin/env bash
```
You can add a **shebang** to your script and give it the e**x**ecutable right.
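The whole procedure can be sketched as follows, in a temporary folder. The script name `hello.sh` and the variable names are hypothetical, and the sketch assumes that `bash` is installed and that the temporary folder allows program execution. Note that the shebang must be the very first line of the file and must start with exactly `#!`.

```sh
script_dir=$(mktemp -d)

# Write a two-line script: the shebang, then one command
cat > "$script_dir/hello.sh" << 'EOF'
#!/usr/bin/env bash
echo "hello from a script"
EOF

chmod u+x "$script_dir/hello.sh"          # give the owner the executable right
script_output=$("$script_dir/hello.sh")   # the system reads the shebang to pick the interpreter
echo "$script_output"

rm -r "$script_dir"
```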
......@@ -368,7 +382,7 @@ Means that from within the script:
You can write the following `variables.sh` script in your `scripts` folder:
```sh
#!/bin/bash
##!/bin/bash
echo "Name of the script: $0"
echo "Total number of arguments: $#"
......@@ -377,7 +391,6 @@ echo "Values of all the arguments: $@"
And you can try to call it with some arguments !
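For example, here is a sketch of such a call with a shortened copy of the script stored in a temporary file (the file and the arguments `alpha` and `"beta gamma"` are made up for the illustration). The quotes make `beta gamma` a single argument.

```sh
args_script=$(mktemp)
cat > "$args_script" << 'EOF'
echo "Total number of arguments: $#"
echo "Values of all the arguments: $@"
EOF

# Two arguments are passed: alpha and "beta gamma"
args_output=$(sh "$args_script" alpha "beta gamma")
echo "$args_output"

rm -f "$args_script"
```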
In the next session, we are going to learn how to execute command on other computers with [ssh.](http://perso.ens-lyon.fr/laurent.modolo/unix/10_network_and_ssh.html)
> We have used the following commands:
>
......@@ -389,3 +402,5 @@ In the next session, we are going to learn how to execute command on other compu
> - `shebang` to specify the language of a script
> - `PATH` to install script
In the next session, we are going to learn how to execute commands on other computers with [ssh.](./10_network_and_ssh.html)
all: public/index.html \
public/github-pandoc.css \
public/1_understanding_a_computer.html \
public/2_using_the_ifb_cloud.html \
public/3_first_steps_in_a_terminal.html \
......@@ -12,42 +13,47 @@ all: public/index.html \
public/11_install_system_programs.html \
public/12_virtualization.html
public/github-pandoc.css: github-pandoc.css
	cp github-pandoc.css public/github-pandoc.css
	cp -R img public/
	cp *.Rmd public/
	cp -R www public/
public/index.html: index.md github-pandoc.css
	pandoc -s -c github-pandoc.css index.md -o public/index.html
public/1_understanding_a_computer.html: 1_understanding_a_computer.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 1_understanding_a_computer.md -o public/1_understanding_a_computer.html
public/1_understanding_a_computer.html: 1_understanding_a_computer.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("1_understanding_a_computer.Rmd")'
public/2_using_the_ifb_cloud.html: 2_using_the_ifb_cloud.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 2_using_the_ifb_cloud.md -o public/2_using_the_ifb_cloud.html
public/2_using_the_ifb_cloud.html: 2_using_the_ifb_cloud.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("2_using_the_ifb_cloud.Rmd")'
public/3_first_steps_in_a_terminal.html: 3_first_steps_in_a_terminal.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 3_first_steps_in_a_terminal.md -o public/3_first_steps_in_a_terminal.html
public/3_first_steps_in_a_terminal.html: 3_first_steps_in_a_terminal.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("3_first_steps_in_a_terminal.Rmd")'
public/4_unix_file_system.html: 4_unix_file_system.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 4_unix_file_system.md -o public/4_unix_file_system.html
public/4_unix_file_system.html: 4_unix_file_system.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("4_unix_file_system.Rmd")'
public/5_users_and_rights.html: 5_users_and_rights.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 5_users_and_rights.md -o public/5_users_and_rights.html
public/5_users_and_rights.html: 5_users_and_rights.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("5_users_and_rights.Rmd")'
public/6_unix_processes.html: 6_unix_processes.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 6_unix_processes.md -o public/6_unix_processes.html
public/6_unix_processes.html: 6_unix_processes.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("6_unix_processes.Rmd")'
public/7_streams_and_pipes.html: 7_streams_and_pipes.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 7_streams_and_pipes.md -o public/7_streams_and_pipes.html
public/7_streams_and_pipes.html: 7_streams_and_pipes.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("7_streams_and_pipes.Rmd")'
public/8_text_manipulation.html: 8_text_manipulation.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 8_text_manipulation.md -o public/8_text_manipulation.html
public/8_text_manipulation.html: 8_text_manipulation.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("8_text_manipulation.Rmd")'
public/9_batch_processing.html: 9_batch_processing.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 9_batch_processing.md -o public/9_batch_processing.html
public/9_batch_processing.html: 9_batch_processing.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("9_batch_processing.Rmd")'
public/10_network_and_ssh.html: 10_network_and_ssh.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 10_network_and_ssh.md -o public/10_network_and_ssh.html
public/10_network_and_ssh.html: 10_network_and_ssh.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("10_network_and_ssh.Rmd")'
public/11_install_system_programs.html: 11_install_system_programs.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 11_install_system_programs.md -o public/11_install_system_programs.html
public/11_install_system_programs.html: 11_install_system_programs.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("11_install_system_programs.Rmd")'
public/12_virtualization.html: 12_virtualization.md github-pandoc.css
	pandoc -s --toc -c github-pandoc.css 12_virtualization.md -o public/12_virtualization.html
public/12_virtualization.html: 12_virtualization.Rmd public/github-pandoc.css
	cd public && Rscript -e 'rmarkdown::render("12_virtualization.Rmd")'
......@@ -4,6 +4,6 @@
All the materials are accessible at the following url:
[https://perso.ens-lyon.fr/laurent.modolo/unix/](https://perso.ens-lyon.fr/laurent.modolo/unix/)
[https://can.gitbiopages.ens-lyon.fr/unix-command-line/](https://can.gitbiopages.ens-lyon.fr/unix-command-line/)
You can join us on the dedicated matrix channel (ask [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr))
\ No newline at end of file
You can join us on the dedicated matrix channel (ask [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr))
project:
  type: book
book:
  title: "UNIX command line"
  author:
    - "Laurent Modolo"
  date: "2023-10-09"
  chapters:
    - index.md
    - 1_understanding_a_computer.Rmd
    - 2_using_the_ifb_cloud.Rmd
    - 3_first_steps_in_a_terminal.Rmd
    - 4_unix_file_system.Rmd
    - 5_users_and_rights.Rmd
    - 6_unix_processes.Rmd
    - 7_streams_and_pipes.Rmd
    - 8_text_manipulation.Rmd
    - 9_batch_processing.Rmd
    - 10_network_and_ssh.Rmd
    - 11_install_system_programs.Rmd
    - 12_virtualization.Rmd
  body-footer: "License: Creative Commons [CC-BY-SA-4.0](http://creativecommons.org/licenses/by-sa/4.0/).<br>Made with [Quarto](https://quarto.org/)."
  navbar:
    search: true
    right:
      - icon: git
        href: https://gitbio.ens-lyon.fr/can/unix-command-line
        text: Sources
# bibliography: references.bib
format:
  html:
    theme:
      light: flatly
      dark: darkly
execute:
  cache: true
\ No newline at end of file
img/SSD.jpeg (156 KiB)
img/hdd.png (1.35 MiB)