Objective: Learn basics of ssh connection in GNU/Linux
In the previous section, we have seen how to run scripts and complex commands on your computer. In this session we are going to learn to do that over the network.
Most of the content of this session comes from [wikipedia.org](https://wikipedia.org).
## Network
Before talking about how to communicate over a network, we need to define what a network is in computer science. We can distinguish between two types of network: **circuit switching** networks and **packet switching** networks.
### circuit switching
Circuit switching is the historical telephone network architecture. When device A wants to communicate with device B, it has to establish a connection over the network. In a circuit switching network, the connections between a chain of nodes (hopefully the shortest chain) are established and fixed. Device A connects to the closest node and asks for a connection to device B. This node does the same with the node closest to device B, and so on, until the connection reaches device B.
If you try to call someone who is already in a phone conversation, the line will be busy.
### packet switching
Packet switching is a method of grouping data sent over the network into packets. Each packet has a header and a payload. The header can be read by each node to direct the packet to its destination. The header also informs the destination host of the order of the packets. The payload contains the data that we want to transmit over the network. In packet switching, the network bandwidth is not pre-allocated as it is in circuit switching. Each packet is called a datagram.
> “A self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network.”
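The role of the header in ordering packets can be illustrated with a toy sketch. Here each "packet" is just a line of text whose first field is a made-up sequence number (the header) and whose remainder is the payload; real IP packets are binary and carry many more header fields, so this is only an analogy.

```sh
# Three toy "packets", listed in the order they happened to arrive.
# Field 1 is a hypothetical sequence number (header), the rest is the payload.
received='3 switching!
1 Hello,
2 packet'

# Sort on the header, strip it, then join the payloads to rebuild the message.
message=$(printf '%s\n' "$received" | sort -n -k1,1 | cut -d' ' -f2- | paste -s -d ' ' -)
echo "$message"
```

The packets can travel through different routes and arrive in any order: the information needed to rebuild the message is carried by the packets themselves.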
> The **Internet Protocol** (**IP**) is the principal [communications protocol](https://en.wikipedia.org/wiki/Communications_protocol) in the [Internet protocol suite](https://en.wikipedia.org/wiki/Internet_protocol_suite) for relaying [datagrams](https://en.wikipedia.org/wiki/Datagram) across network boundaries. Its [routing](https://en.wikipedia.org/wiki/Routing) function enables [internetworking](https://en.wikipedia.org/wiki/Internetworking), and essentially establishes the [Internet](https://en.wikipedia.org/wiki/Internet).
IP has the task of delivering [packets](https://en.wikipedia.org/wiki/Packet_(information_technology)) from the source [host](https://en.wikipedia.org/wiki/Host_(network)) to the destination host solely based on the [IP addresses](https://en.wikipedia.org/wiki/IP_address) in the packet [headers](https://en.wikipedia.org/wiki/Header_(computing)). For this purpose, IP defines packet structures that [encapsulate](https://en.wikipedia.org/wiki/Encapsulation_(networking)) the data to be delivered. It also defines addressing methods.
The first major version of IP, [Internet Protocol Version 4](https://en.wikipedia.org/wiki/IPv4) (IPv4), is the dominant protocol of the Internet. Its successor is [Internet Protocol Version 6](https://en.wikipedia.org/wiki/IPv6) (IPv6), which has been in increasing [deployment](https://en.wikipedia.org/wiki/IPv6_deployment) on the public Internet since c. 2006.
### IPv4
An **IPv4** address is composed of 4 numbers ranging from 0 to 255 separated by `.`, which gives an address space of 4294967296 (2^32) addresses. Some **IPv4** address blocks are reserved:

| Address block | Address range | Number of addresses | Scope | Description |
|---|---|---|---|---|
| 0.0.0.0/8 | 0.0.0.0–0.255.255.255 | 16777216 | Software | Current network[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6)(only valid as source address). |
| 10.0.0.0/8 | 10.0.0.0–10.255.255.255 | 16777216 | Private network | Used for local communications within a [private network](https://en.wikipedia.org/wiki/Private_network).[[7\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc1918-7) |
| 100.64.0.0/10 | 100.64.0.0–100.127.255.255 | 4194304 | Private network | [Shared address space](https://en.wikipedia.org/wiki/IPv4_shared_address_space)[[8\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6598-8) for communications between a service provider and its subscribers when using a [carrier-grade NAT](https://en.wikipedia.org/wiki/Carrier-grade_NAT). |
| 127.0.0.0/8 | 127.0.0.0–127.255.255.255 | 16777216 | Host | Used for [loopback addresses](https://en.wikipedia.org/wiki/Loopback_address) to the local host.[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6) |
| 169.254.0.0/16 | 169.254.0.0–169.254.255.255 | 65536 | Subnet | Used for [link-local addresses](https://en.wikipedia.org/wiki/Link-local_address)[[9\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc3927-9) between two hosts on a single link when no IP address is otherwise specified, such as would have normally been retrieved from a [DHCP](https://en.wikipedia.org/wiki/DHCP) server. |
| 172.16.0.0/12 | 172.16.0.0–172.31.255.255 | 1048576 | Private network | Used for local communications within a private network.[[7\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc1918-7) |
| 192.0.2.0/24 | 192.0.2.0–192.0.2.255 | 256 | Documentation | Assigned as TEST-NET-1, documentation and examples.[[10\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5737-10) |
| 192.88.99.0/24 | 192.88.99.0–192.88.99.255 | 256 | Internet | Reserved.[[11\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc7526-11) Formerly used for [IPv6 to IPv4](https://en.wikipedia.org/wiki/6to4) relay[[12\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc3068-12) (included [IPv6](https://en.wikipedia.org/wiki/IPv6) address block [2002::/16](https://en.wikipedia.org/wiki/IPv6_address#Special_addresses)). |
| 192.168.0.0/16 | 192.168.0.0–192.168.255.255 | 65536 | Private network | Used for local communications within a private network.[[7\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc1918-7) |
| 198.18.0.0/15 | 198.18.0.0–198.19.255.255 | 131072 | Private network | Used for benchmark testing of inter-network communications between two separate subnets.[[13\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc2544-13) |
| 198.51.100.0/24 | 198.51.100.0–198.51.100.255 | 256 | Documentation | Assigned as TEST-NET-2, documentation and examples.[[10\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5737-10) |
| 203.0.113.0/24 | 203.0.113.0–203.0.113.255 | 256 | Documentation | Assigned as TEST-NET-3, documentation and examples.[[10\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5737-10) |
| 224.0.0.0/4 | 224.0.0.0–239.255.255.255 | 268435456 | Internet | In use for [IP multicast](https://en.wikipedia.org/wiki/IP_multicast).[[14\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5771-14)(Former Class D network). |
| 240.0.0.0/4 | 240.0.0.0–255.255.255.254 | 268435455 | Internet | Reserved for future use.[[15\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc3232-15)(Former Class E network). |
| 255.255.255.255/32 | 255.255.255.255 | 1 | Subnet | Reserved for the "limited [broadcast](https://en.wikipedia.org/wiki/Broadcast_address)" destination address.[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6)[[16\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc919-16) |
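
The `/8`, `/12` or `/16` suffix of an address block is the number of leading bits shared by all the addresses of the block. The following is a minimal sketch, using only shell arithmetic, of how the "Number of addresses" column and the membership of an address in a block can be computed. The function names `ip_to_int`, `block_size` and `in_block` are hypothetical helpers written for this illustration, not standard commands.

```sh
# Convert a dotted IPv4 address into its 32-bit integer value.
# The body runs in a subshell, so changing IFS does not leak to the caller.
ip_to_int() (
    IFS=.
    set -- $1    # split on "." into $1 $2 $3 $4
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Number of addresses in a block with prefix length $1.
block_size() {
    echo $(( 1 << (32 - $1) ))
}

# Succeed when address $1 lies inside the block $2/$3.
in_block() {
    mask=$(( (0xFFFFFFFF << (32 - $3)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$1") & mask )) -eq $(( $(ip_to_int "$2") & mask )) ]
}

ip_to_int 192.168.1.10
block_size 12
in_block 192.168.1.10 192.168.0.0 16 && echo "192.168.1.10 is in 192.168.0.0/16"
in_block 8.8.8.8 10.0.0.0 8 || echo "8.8.8.8 is not in 10.0.0.0/8"
```
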
### IPv6
An **IPv6** address is composed of 8 groups of 4 hexadecimal digits separated by `:`. Hexadecimal numbers are in base 16, with digits ranging from 0 to 9 and A to F. Compared to **IPv4**, **IPv6** allows for 2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses (approximately 3.4×10^38). For example, an IPv6 address is: *2001:0db8:0000:0000:0000:ff00:0042:8329*
To display your VM IP addresses you can use the following command: `ip address show`
Link-local **IPv6** addresses start with **fe80::**
### **Domain Name System** (**DNS**)
Instead of using IP addresses in your everyday life, you often use domain names. The DNS is composed of many DNS servers that are hierarchically organized and decentralized. When you query the DNS with a particular domain name, the correct name server returns the corresponding IP address. With most network tools, you can use either domain names or IP addresses.

### Transmission Control Protocol (**TCP**)
The **Transmission Control Protocol** (**TCP**) is one of the main [protocols](https://en.wikipedia.org/wiki/Communications_protocol) of the [Internet protocol suite](https://en.wikipedia.org/wiki/Internet_protocol_suite). TCP provides reliable, ordered, and error-checked delivery of a stream of data between applications running on hosts communicating over an IP network.
- data arrives in-order
- data has minimal error (i.e., correctness)
- duplicate data is discarded
- lost or discarded packets are resent
- includes traffic congestion control
- Heavyweight (TCP requires three packets to set up a connection before any user data can be sent, and it handles reliability and congestion control)
### **User Datagram Protocol** (**UDP**)
UDP uses a simple [connectionless communication](https://en.wikipedia.org/wiki/Connectionless_communication) model with a minimum of protocol mechanisms.
- Unreliable
- Not ordered
- Broadcast (being connectionless, UDP can broadcast - sent packets can be addressed to be receivable by all devices on the subnet)
- Multicast (a single datagram packet can be automatically routed without duplication to a group of subscribers)
- Lightweight (no ordering of messages, no tracking connections, etc. It is a very simple transport layer designed on top of IP)
## SSH
There exist numerous other protocols ([RTP](https://en.wikipedia.org/wiki/Real-time_Transport_Protocol) for example), but most of them run over the TCP and UDP protocols. **SSH** or **Secure Shell** is one of them. SSH is a [cryptographic](https://en.wikipedia.org/wiki/Cryptography) [network protocol](https://en.wikipedia.org/wiki/Network_protocol) for operating network services securely over an unsecured network.
SSH uses a client-server architecture: you use an SSH client to connect to an SSH server. By default, most Linux distributions don’t come with an SSH server installed. On the IFB, SSH is the default way to connect to your VMs, so you should already have an SSH server up and running.
SSH uses [public-key cryptography (or asymmetric cryptography)](https://en.wikipedia.org/wiki/Public-key_cryptography), a cryptographic system which uses pairs of [keys](https://en.wikipedia.org/wiki/Cryptographic_key): *public keys* (which may be known to others) and *private keys* (which must never be known by anyone except the owner).
A cryptographic algorithm is used to generate a pair of *public* and *private* keys from a large random number. Then, the following 3 schemes can be used to secure communication:
### Communicate with the server
The server sends its public key to the client on the first connection.
By default, on the IFB, password authentication is disabled to enforce the use of public key based authentication. To learn the `ssh` command, we are going to enable this option on your VMs. Find the `sshd` configuration file and open it with an editor of your choice.
<details><summary>Solution</summary>
<p>
```sh
vim /etc/ssh/sshd_config
```
</p>
</details>
This file is owned by **root**, so you need to get **root** access for your account.
<details><summary>Solution</summary>
<p>
```sh
docker run -it --volume /:/root/chroot alpine sh -c "chroot /root/chroot /bin/bash -c 'usermod -a -G sudo etudiant'" && su etudiant
```
</p>
</details>
Using the `sudo` command, edit the configuration file to set **PasswordAuthentication** to **yes** and add the following lines:
**AllowUsers etudiant student**
**PermitRootLogin no**
The `sshd` (SSH daemon) process is launched and managed by `systemd`. You can manage `systemd` services with the `systemctl` command. Try this command without any arguments. You can search for `sshd` by typing `/sshd` and pressing `enter`. You can leave the `systemctl` view by pressing `q`.
To apply our modification to the `sshd` server configuration, we need to restart the corresponding *service*. You can use the following command:
```sh
sudo systemctl restart sshd
```
You can use the keywords `start`, `stop` and `status` to manage `systemd` services.
Check the **status** of your `sshd` *service*.
<details><summary>Solution</summary>
<p>
```sh
sudo systemctl status sshd
```
</p>
</details>
You are going to create an account so that another participant of the course can connect to your VM.
```sh
sudo useradd -m -s /bin/bash -g users student
sudo passwd student
```
Share the password and your IP address in the chat.
## SSH client
To connect to an SSH server, you can use the following command:
```sh
ssh login@IP_adress
```
Use this command to connect to another student’s VM.
On the first connection, `ssh` asks you to accept the public key of the server (shown as a key fingerprint). From then on, if someone tries to fool you by impersonating the SSH server, they won’t be able to do it without the corresponding private key.
You can close the connection by pressing `ctrl` + `d` or with the command `exit`.
Check the content of the `~/.ssh/` folder. Where is the server’s public key saved?
Congratulations, you are connected to a VM through another VM!
### Key authentication
Every time you want to connect to the SSH server, you have to type your account password. This password is encrypted and sent over the network. Instead, you can use a pair of private and public keys to authenticate yourself.
First, you have to generate a pair of keys with the command:
```sh
ssh-keygen -t ed25519 -C "your.mail@ens-lyon.fr"
```
The option `-t` specifies the algorithm to use, while `-C` specifies the comment associated with the key (generally the email of the person generating the key). You can check the **man**ual and the internet to compare the different available algorithms.
It is good practice to name a given pair of keys after the server on which you want to use those keys.
You can use the name `/home/etudiant/.ssh/id_ed25519_otherVM`.
Then as an additional security measure, you can restrict the usage of your private key by defining a password. You will need the password and the key file to authenticate yourself.
The generated keys are in the folder `~/.ssh/`
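
As an illustration of the naming advice above, the key file name can also be given directly on the command line with the `-f` option instead of being typed at the prompt. The sketch below writes into a scratch directory (an assumption, so that your real `~/.ssh/` folder is left untouched) and uses an empty password through `-N ""` only so that the example runs unattended; for a real key you would let `ssh-keygen` prompt you for a password.

```sh
# Scratch directory standing in for ~/.ssh (assumption for this demo).
demo_dir=$(mktemp -d)
key_file="$demo_dir/id_ed25519_otherVM"

# -f sets the key file name, -N sets the key password (empty here, demo only),
# -q hides the progress output.
ssh-keygen -q -t ed25519 -C "your.mail@ens-lyon.fr" -f "$key_file" -N ""

# Two files are created: the private key and its .pub counterpart.
ls "$demo_dir"

# Print the fingerprint of the public key.
ssh-keygen -l -f "$key_file.pub"
```
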
Then you need to copy your public key (`.pub`) to the SSH server.
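
With an OpenSSH server, "copying the public key" usually means appending the content of the `.pub` file to the `~/.ssh/authorized_keys` file of the remote account. The sketch below mimics these steps locally: a scratch directory plays the role of the remote home folder, and the key line is a placeholder, not a real key. Both are assumptions made so that the example can run without a server.

```sh
# A scratch directory plays the role of the remote account's home (assumption).
remote_home=$(mktemp -d)
# Placeholder line, not a real key: use the content of your own .pub file instead.
pub_key='ssh-ed25519 AAAAC3...placeholder your.mail@ens-lyon.fr'

mkdir -p "$remote_home/.ssh"
chmod 700 "$remote_home/.ssh"

# One authorized public key per line; append so that existing keys are kept.
printf '%s\n' "$pub_key" >> "$remote_home/.ssh/authorized_keys"
chmod 600 "$remote_home/.ssh/authorized_keys"

ls -la "$remote_home/.ssh"
```

On a real server, the command `ssh-copy-id -i ~/.ssh/id_ed25519_otherVM.pub login@IP_adress` performs the equivalent steps over SSH.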
Note that for security reasons, only you should be able to read and write within your `.ssh` folder (you don’t want someone else to tamper with your keys). You can use the command `chmod 600 .ssh/*`.
You can try to log on the server using the key with the following command:
```sh
ssh login@IP_adress -i ~/.ssh/id_ed25519_otherVM
```
`ssh` should ask for your key password instead of the student account password.
Congratulations, you authenticated yourself on a remote server without sending your password over the network!
## SSH based tools
Sometimes, you want to do more than execute commands on a remote computer. For example, you may want to transfer files over the network.
### scp
The `scp` command comes with the `ssh` client installation. You can use it to transfer files from your computer to the SSH server:
```sh
scp local/path login@IP_adress:remote/path
```
> You can use a relative remote path: the path after the ":" is relative to your home folder on the remote server.
You can also retrieve files from the server:
```sh
scp login@IP_adress:remote/path local/path
```
To transfer a directory, you can use the `-r` switch.
### rsync
`scp` is a basic command for file transfer. If you want an advanced progress bar and file integrity checking, you can use the `rsync` command instead.
For example
```sh
rsync -auv local/path login@IP_adress:remote/path
```
This command will only transfer the files from `local/path` that are not already present in `remote/path`. The `-c` switch computes a checksum of each file locally and remotely to be certain that they are identical.
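
`rsync` also works between two local folders, which makes it easy to try its behavior without a server. The following sketch uses two scratch directories as stand-ins for `local/path` and `remote/path` (an assumption for the demo), and the `--dry-run` option to preview a transfer without writing anything.

```sh
# Two scratch directories stand in for local/path and remote/path (assumption).
src=$(mktemp -d)
dst=$(mktemp -d)
echo "sample data" > "$src/a.txt"

# First run copies a.txt. The trailing slash means "the content of src",
# not the src folder itself.
rsync -auv "$src/" "$dst/"

# Add a second file and preview the next transfer: nothing is written.
echo "more data" > "$src/b.txt"
rsync -auv --dry-run "$src/" "$dst/"

# Real run, with -c to compare files by checksum
# instead of size and modification time.
rsync -auvc "$src/" "$dst/"
```

In the last run, only `b.txt` should be listed as transferred, since `a.txt` is already identical on both sides.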
### sshfs
You can use the `sshfs` command to mount a remote folder over ssh on your computer.
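
A minimal sketch of a mount and unmount cycle is given below, assuming that `sshfs` is installed on a Linux system. The helper names `mount_remote` and `unmount_remote` are hypothetical, and `login`, `IP_adress` and the paths are placeholders as in the commands above. The commands are wrapped in functions so that nothing is executed until you call them.

```sh
# Hypothetical helper: mount the remote folder $1 on the local folder $2.
mount_remote() {
    mkdir -p "$2"                     # the local mount point must exist
    sshfs "login@IP_adress:$1" "$2"   # remote files now appear under $2
}

# Hypothetical helper: detach the local mount point $1 (Linux FUSE).
unmount_remote() {
    fusermount -u "$1"
}
```

You would then call `mount_remote remote/path ~/remote_data`, work on the files in `~/remote_data` as if they were local, and finish with `unmount_remote ~/remote_data`.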
## SSH tips
### IFB authentication
The default authentication method for the IFB uses keys generated with the `rsa` algorithm.
Instead of using the `ssh-copy-id` command, you are going to copy and paste your public key into your [IFB configuration page](https://biosphere.france-bioinformatique.fr/cloudweb_account/settings/edit).
You can now use the [RainBio catalogue](https://biosphere.france-bioinformatique.fr/catalogue/) to launch any available VM and connect to it with SSH from your current VM.
### SSH configuration
Long `ssh` commands can be tedious to use. This is why we are now going to explore the last file in the `.ssh` folder: `.ssh/config`.
This file is divided into different `Host` sections, like the following one, which connects you to [the ssh server of the ens](https://instella.ens-lyon.fr/stella/intra/ent-ssh.html).
```yaml
Host ens
HostName ssh.ens-lyon.fr
User <login>
IdentitiesOnly yes
IdentityFile ~/.ssh/id_ens
PreferredAuthentications publickey,password
```
- `HostName` defines the server domain name or IP address
- `User` defines the login to use
- `Identit*` options define the key authentication mechanism
- `PreferredAuthentications` gives the order in which the authentication mechanisms are tried
With this configuration you can use the command:
```sh
ssh ens
```
To connect to the `ssh.ens-lyon.fr` server.
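
To check how `ssh` interprets a `Host` section without opening a connection, you can ask it to print the resolved options with `-G`. The sketch below writes a copy of the section into a scratch file and points `ssh` to it with `-F`; the login `jdoe` is a hypothetical value to replace with your own, and the `-G` option assumes a reasonably recent OpenSSH client.

```sh
# Scratch config file with a hypothetical login "jdoe" (assumption).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host ens
HostName ssh.ens-lyon.fr
User jdoe
IdentitiesOnly yes
IdentityFile ~/.ssh/id_ens
EOF

# -F selects the config file, -G prints the resolved options
# and exits without connecting.
resolved=$(ssh -F "$cfg" -G ens 2>/dev/null)
printf '%s\n' "$resolved" | grep -E '^(hostname|user|identityfile) '
```

With your real configuration in `~/.ssh/config`, `ssh -G ens` gives the same kind of report.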
You can also apply an SSH configuration to all the `Host` sections with the following:
```yaml
Host *
Compression yes
ServerAliveInterval 36000
ControlMaster auto
ControlPersist yes
ControlPath ~/.ssh/socket-%r@%h:%p
```
Here we say that we want to enable compression for all the connections, and that the client should send a keep-alive message to the server after 36000 seconds without receiving data, so that idle connections stay alive. With the `Control*` options, the connection is maintained through socket files in the `~/.ssh/` folder, with names starting with `socket-`. This also means that if you connect more than once to the same server, the same connection will be reused.
Sometimes you want to connect to an SSH server through an intermediate SSH server (or several). To do that, you can use the **ProxyJump** option. For example, you can connect to a computer running an SSH server within the ENS with the following config.
```yaml
Host work-ens
ProxyJump ens
HostName ip.ip.ip.ip
User login
IdentitiesOnly yes
IdentityFile ~/.ssh/id_work
PreferredAuthentications publickey,password
```
With the command `ssh work-ens`, the `ssh` client is going to first connect to `ens` and then from `ens` to the `ip.ip.ip.ip` server.
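
The jump can be checked without connecting, again with `ssh -G`. In the sketch below, the login `jdoe` is hypothetical and `192.0.2.10` is an address taken from the TEST-NET-1 documentation block, used here as a stand-in for `ip.ip.ip.ip`.

```sh
# Scratch config file; "jdoe" and 192.0.2.10 are placeholders (assumptions).
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
Host ens
HostName ssh.ens-lyon.fr
User jdoe

Host work-ens
ProxyJump ens
HostName 192.0.2.10
User jdoe
EOF

# Print the jump host that ssh would go through, without connecting.
jump=$(ssh -F "$cfg" -G work-ens 2>/dev/null | grep -i '^proxyjump ')
echo "$jump"

# One-off equivalent without a config file, using the -J option:
#   ssh -J jdoe@ssh.ens-lyon.fr jdoe@192.0.2.10
```
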
@@ -168,7 +168,7 @@ gzip -dc hg38.ncbiRefSeq.gtf.gz | head | sed -E 's|ncbiRefSeq(.*)(transcript_id
```
</p>
</details>
Regexp can be very complexe see for example [a regex to validate an email on starckoverflow](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/201378#201378). When you start you can always use for a given regexp to a more experienced used (just give him the kind of text you want to match and not match).
Regexp can be very complexe see for example [a regex to validate an email on starckoverflow](https://stackoverflow.com/questions/201323/how-to-validate-an-email-address-using-a-regular-expression/201378#201378). When you start you can always use for a given regexp to a more experienced used (just give him the kind of text you want to match and not match). You can test your regex easily with the [regex101 website](https://regex101.com/).
@@ -377,7 +377,7 @@ echo "Values of all the arguments: $@"
And you can try to call it with some arguments !
In the next session, we are going to learn how to execute command on other computers with [ssh.](http://perso.ens-lyon.fr/laurent.modolo/unix/10_ssh.html)
In the next session, we are going to learn how to execute command on other computers with [ssh.](http://perso.ens-lyon.fr/laurent.modolo/unix/10_network_and_ssh.html)