---
author: "Laurent Modolo"
---
```{r include = FALSE}
if (!require("fontawesome")) {
install.packages("fontawesome")
}
library(fontawesome)
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
```
<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
<img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" />
</a>
Objective: Learn the basics of SSH connections in GNU/Linux
In the previous section, we saw how to run scripts and complex commands on your computer. In this session, we are going to learn how to do that over the network.
Most of the content of this session is from [wikipedia.org](https://wikipedia.org).
Before talking about how to communicate over a network, we need to define what a network is in computer science. We can distinguish between two types of networks: **circuit switching** networks and **packet switching** networks.
Circuit switching is the historical architecture of the telephone network. When device A wants to communicate with device B, it has to establish a connection over the network. In a circuit switching network, the connections between a chain of nodes (hopefully the shortest chain) are established and fixed for the duration of the communication. Device A connects to the closest node and asks for a connection to device B. This node does the same with the next node toward device B, and so on until the connection reaches device B.
If you try to call someone who is already in a phone conversation, the line will be busy.

Packet switching is a method of grouping data sent over the network into packets. Each packet has a header and a payload. The header can be read by each node to direct the packet to its destination. The header also informs the destination host of the order of the packets. The payload contains the data that we want to transmit over the network. In packet switching, the network bandwidth is not pre-allocated as it is in circuit switching. Each packet is called a datagram.
> “A self-contained, independent entity of data carrying sufficient information to be routed from the source to the destination computer without reliance on earlier exchanges between this source and destination computer and the transporting network.”

In a packet switching network, when you send a stream of data (video, sound, etc.), you have the illusion of continuity, much like the illusion of simultaneous processes created by the scheduler when it switches between them.
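
The header and payload mechanism can be mimicked with a few standard commands. The following is only a toy sketch, not a real network protocol: the message, the 8-character payload size and the tab-separated header format are arbitrary choices made for the illustration.

```sh
# Toy illustration of packet switching (not a real network protocol).
message="In packet switching, data travels in independent packets."

# Sender: cut the data into 8-character payloads and prepend a header
# holding the sequence number (header and payload are separated by a tab).
packets=$(printf '%s' "$message" | fold -w 8 | awk '{ printf "%d\t%s\n", NR, $0 }')

# Network: packets may follow different routes and arrive out of order.
received=$(printf '%s\n' "$packets" | shuf)

# Receiver: use the header to restore the order, then drop the header.
reassembled=$(printf '%s\n' "$received" | sort -n | cut -f2- | tr -d '\n')

printf '%s\n' "$reassembled"
```

Whatever order `shuf` produces, the receiver recovers the original message because each packet carries its own sequence number.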
> The **Internet Protocol** (**IP**) is the principal [communications protocol](https://en.wikipedia.org/wiki/Communications_protocol) in the [Internet protocol suite](https://en.wikipedia.org/wiki/Internet_protocol_suite) for relaying [datagrams](https://en.wikipedia.org/wiki/Datagram) across network boundaries. Its [routing](https://en.wikipedia.org/wiki/Routing) function enables [internetworking](https://en.wikipedia.org/wiki/Internetworking), and essentially establishes the [Internet](https://en.wikipedia.org/wiki/Internet).
IP has the task of delivering [packets](https://en.wikipedia.org/wiki/Packet_(information_technology)) from the source [host](https://en.wikipedia.org/wiki/Host_(network)) to the destination host solely based on the [IP addresses](https://en.wikipedia.org/wiki/IP_address) in the packet [headers](https://en.wikipedia.org/wiki/Header_(computing)). For this purpose, IP defines packet structures that [encapsulate](https://en.wikipedia.org/wiki/Encapsulation_(networking)) the data to be delivered. It also defines addressing methods.
The first major version of IP, [Internet Protocol Version 4](https://en.wikipedia.org/wiki/IPv4) (IPv4), is the dominant protocol of the Internet. Its successor is [Internet Protocol Version 6](https://en.wikipedia.org/wiki/IPv6) (IPv6), which has been in increasing [deployment](https://en.wikipedia.org/wiki/IPv6_deployment) on the public Internet since c. 2006.
An **IPv4** address is composed of 4 numbers ranging from 0 to 255 separated by `.`, which gives an address space of 4294967296 (2^32) addresses. Some address blocks of **IPv4** are reserved:
| Address block | Address range | Number of addresses | Scope | Description |
| ------------------ | --------------------------- | ------------------- | --------------- | ------------------------------------------------------------ |
| 0.0.0.0/8 | 0.0.0.0–0.255.255.255 | 16777216 | Software | Current network[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6) (only valid as source address). |
| 10.0.0.0/8 | 10.0.0.0–10.255.255.255 | 16777216 | Private network | Used for local communications within a [private network](https://en.wikipedia.org/wiki/Private_network).[[7\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc1918-7) |
| 100.64.0.0/10 | 100.64.0.0–100.127.255.255 | 4194304 | Private network | [Shared address space](https://en.wikipedia.org/wiki/IPv4_shared_address_space)[[8\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6598-8) for communications between a service provider and its subscribers when using a [carrier-grade NAT](https://en.wikipedia.org/wiki/Carrier-grade_NAT). |
| 127.0.0.0/8 | 127.0.0.0–127.255.255.255 | 16777216 | Host | Used for [loopback addresses](https://en.wikipedia.org/wiki/Loopback_address) to the local host.[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6) |
| 169.254.0.0/16 | 169.254.0.0–169.254.255.255 | 65536 | Subnet | Used for [link-local addresses](https://en.wikipedia.org/wiki/Link-local_address)[[9\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc3927-9) between two hosts on a single link when no IP address is otherwise specified, such as would have normally been retrieved from a [DHCP](https://en.wikipedia.org/wiki/DHCP) server. |
| 172.16.0.0/12 | 172.16.0.0–172.31.255.255 | 1048576 | Private network | Used for local communications within a private network.[[7\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc1918-7) |
| 192.0.0.0/24 | 192.0.0.0–192.0.0.255 | 256 | Private network | IETF Protocol Assignments.[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6) |
| 192.0.2.0/24 | 192.0.2.0–192.0.2.255 | 256 | Documentation | Assigned as TEST-NET-1, documentation and examples.[[10\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5737-10) |
| 192.88.99.0/24 | 192.88.99.0–192.88.99.255 | 256 | Internet | Reserved.[[11\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc7526-11) Formerly used for [IPv6 to IPv4](https://en.wikipedia.org/wiki/6to4) relay[[12\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc3068-12) (included [IPv6](https://en.wikipedia.org/wiki/IPv6) address block [2002::/16](https://en.wikipedia.org/wiki/IPv6_address#Special_addresses)). |
| 192.168.0.0/16 | 192.168.0.0–192.168.255.255 | 65536 | Private network | Used for local communications within a private network.[[7\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc1918-7) |
| 198.18.0.0/15 | 198.18.0.0–198.19.255.255 | 131072 | Private network | Used for benchmark testing of inter-network communications between two separate subnets.[[13\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc2544-13) |
| 198.51.100.0/24 | 198.51.100.0–198.51.100.255 | 256 | Documentation | Assigned as TEST-NET-2, documentation and examples.[[10\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5737-10) |
| 203.0.113.0/24 | 203.0.113.0–203.0.113.255 | 256 | Documentation | Assigned as TEST-NET-3, documentation and examples.[[10\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5737-10) |
| 224.0.0.0/4 | 224.0.0.0–239.255.255.255 | 268435456 | Internet | In use for [IP multicast](https://en.wikipedia.org/wiki/IP_multicast).[[14\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc5771-14) (Former Class D network). |
| 240.0.0.0/4 | 240.0.0.0–255.255.255.254 | 268435455 | Internet | Reserved for future use.[[15\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc3232-15) (Former Class E network). |
| 255.255.255.255/32 | 255.255.255.255 | 1 | Subnet | Reserved for the "limited [broadcast](https://en.wikipedia.org/wiki/Broadcast_address)" destination address.[[6\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc6890-6)[[16\]](https://en.wikipedia.org/wiki/IPv4#cite_note-rfc919-16) |
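
To see where the 2^32 figure and the block sizes of the table come from, an IPv4 address can be read as a single 32-bit number. The function name `ipv4_to_int` below is made up for this sketch, which assumes a well-formed address without leading zeros.

```sh
# Convert a dotted IPv4 address into the 32-bit integer it represents.
ipv4_to_int() {
    # Split the address on "." into its four bytes.
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

ipv4_to_int 192.168.0.1       # 3232235521
ipv4_to_int 255.255.255.255   # 4294967295, that is 2^32 - 1

# A /8 block fixes the first 8 bits, which leaves 32 - 8 free bits.
echo $(( 1 << (32 - 8) ))     # 16777216 addresses, as in the table
```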
An **IPv6** address is composed of 8 groups of 4 hexadecimal digits separated by `:`.
The digits are in hexadecimal format (base 16, ranging from 0 to 9 and A to F).
Compared to **IPv4**, **IPv6** allows for 2^128 = 340,282,366,920,938,463,463,374,607,431,768,211,456 addresses (approximately 3.4×10^38).
For example, an IPv6 address is: **2001:0db8:0000:0000:0000:ff00:0042:8329**
To display the IP addresses of your VM, you can use the following command: `ip address show`
Link-local **IPv6** addresses start with **fe80::**
Instead of using IP addresses in your everyday life, you often use domain names. The Domain Name System (DNS) is composed of many DNS servers that are hierarchically organized and decentralized. When you query the DNS with a particular domain name, the relevant name server returns the corresponding IP address. With most network tools, you can use either domain names or IP addresses.

The **Transmission Control Protocol** (**TCP**) is one of the main [protocols](https://en.wikipedia.org/wiki/Communications_protocol) of the [Internet protocol suite](https://en.wikipedia.org/wiki/Internet_protocol_suite). TCP provides reliable, ordered, and error-checked delivery of a stream of data between applications running on hosts communicating over an IP network.
- data arrives in-order
- data has minimal error (i.e., correctness)
- duplicate data is discarded
- lost or discarded packets are resent
- includes traffic congestion control
The **User Datagram Protocol** (**UDP**) uses a simple [connectionless communication](https://en.wikipedia.org/wiki/Connectionless_communication) model with a minimum of protocol mechanisms.
- Unreliable
- Not ordered
- Broadcast (being connectionless, UDP can broadcast - sent packets can be addressed to be receivable by all devices on the subnet)
- Multicast (a single datagram packet can be automatically routed without duplication to a group of subscribers)
- Lightweight (no ordering of messages, no tracking connections, etc. It is a very simple transport layer designed on top of IP)
Higher-level communication protocols like TCP and UDP also define **ports**. A **port** is a communication endpoint. When software wants to communicate over TCP or UDP, it does so using a specific **port**. Each system has **port** numbers ranging from **0** to **65535**. **Ports** numbered from **0** through **1023** are system **ports** used by well-known processes (you need specific rights to use them).
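
The ranges given above can be written down as a small check. The function name `port_kind` and its three labels are invented for this sketch; only the numeric bounds come from the text.

```sh
# Classify a port number according to the ranges described above.
port_kind() {
    if [ "$1" -lt 0 ] || [ "$1" -gt 65535 ]; then
        echo "invalid"
    elif [ "$1" -le 1023 ]; then
        echo "system"        # needs specific rights to be used
    else
        echo "non-system"
    fi
}

port_kind 22      # system
port_kind 8080    # non-system
port_kind 70000   # invalid
```

On your VM, the `ss -tln` command should list the TCP ports on which a process is currently listening.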
Here is a list of notable port numbers:
| Number | Assignment |
| ------ | ------------------------------------------------------------ |
| 20 | [File Transfer Protocol](https://en.wikipedia.org/wiki/File_Transfer_Protocol) (FTP) Data Transfer |
| 21 | [File Transfer Protocol](https://en.wikipedia.org/wiki/File_Transfer_Protocol) (FTP) Command Control |
| 22 | [Secure Shell](https://en.wikipedia.org/wiki/Secure_Shell) (SSH) Secure Login |
| 23 | [Telnet](https://en.wikipedia.org/wiki/Telnet) remote login service, unencrypted text messages |
| 25 | [Simple Mail Transfer Protocol](https://en.wikipedia.org/wiki/Simple_Mail_Transfer_Protocol) (SMTP) e-mail routing |
| 53 | [Domain Name System](https://en.wikipedia.org/wiki/Domain_Name_System) (DNS) service |
| 67, 68 | [Dynamic Host Configuration Protocol](https://en.wikipedia.org/wiki/Dynamic_Host_Configuration_Protocol) (DHCP) |
| 80 | [Hypertext Transfer Protocol](https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol) (HTTP) used in the [World Wide Web](https://en.wikipedia.org/wiki/World_Wide_Web) |
| 110 | [Post Office Protocol](https://en.wikipedia.org/wiki/Post_Office_Protocol) (POP3) |
| 119 | [Network News Transfer Protocol](https://en.wikipedia.org/wiki/Network_News_Transfer_Protocol) (NNTP) |
| 123 | [Network Time Protocol](https://en.wikipedia.org/wiki/Network_Time_Protocol) (NTP) |
| 143 | [Internet Message Access Protocol](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol) (IMAP) Management of digital mail |
| 161 | [Simple Network Management Protocol](https://en.wikipedia.org/wiki/Simple_Network_Management_Protocol) (SNMP) |
| 194 | [Internet Relay Chat](https://en.wikipedia.org/wiki/Internet_Relay_Chat) (IRC) |
| 443 | [HTTP Secure](https://en.wikipedia.org/wiki/HTTP_Secure) (HTTPS) HTTP over TLS/SSL |
Nowadays, **ports** provide multiplexing, which means that multiple services or communication sessions can use the same **port** number.
There are numerous other protocols ([RTP](https://en.wikipedia.org/wiki/Real-time_Transport_Protocol) for example). But most of them run over the **TCP** and **UDP** protocols. **SSH** or **Secure Shell** is one of them. **SSH** is a [cryptographic](https://en.wikipedia.org/wiki/Cryptography) [network protocol](https://en.wikipedia.org/wiki/Network_protocol) for operating network services securely over an unsecured network.
**SSH** uses a client-server architecture: you use an **SSH client** to connect to an **SSH server**. By default, most Linux distributions don’t come with an **SSH server** installed. For the IFB, an **SSH** connection is the default way to connect to your VMs, so you should have an **SSH** server running.
Find the name of the **SSH** server process.
<details><summary>Solution</summary>
<p>
```sh
ps -el | grep "ssh"
```
</p>
</details>
SSH uses [public-key cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography) (or asymmetric cryptography) to secure its communications.
[Public-key cryptography](https://en.wikipedia.org/wiki/Public-key_cryptography) (or asymmetric cryptography) is a cryptographic system which uses pairs of [keys](https://en.wikipedia.org/wiki/Cryptographic_key): *public keys* (which may be known to others), and *private keys* (which must never be known by anyone except the owner).
A cryptographic algorithm is used to generate a pair of *public* and *private* keys from a large random number. Then, the three following schemes can be used to secure communication:
The server sends its public key to the client on the first connection.

A key exchange can be used to agree on a shared secret over an unsecured network (see [Diffie-Hellman](https://fr.wikipedia.org/wiki/%C3%89change_de_cl%C3%A9s_Diffie-Hellman)).

- The server sends a random string of characters to the client
- The client signs the random string with its private key and sends the result back to the server
- The server checks, with the client public key, that the signature matches the random string
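
The challenge-response steps listed above can be tried by hand with the `openssl` command line tool. This is only an illustration under assumptions: it uses an RSA key pair with a SHA-256 signature, the file names are made up, and the real SSH protocol signs more than a bare random string.

```sh
workdir=$(mktemp -d)

# Client: generate a private key and derive the matching public key.
openssl genrsa -out "$workdir/client_private.pem" 2048 2>/dev/null
openssl rsa -in "$workdir/client_private.pem" -pubout \
    -out "$workdir/client_public.pem" 2>/dev/null

# Server: draw a random challenge and send it to the client.
openssl rand -hex 32 > "$workdir/challenge.txt"

# Client: sign the challenge with the private key.
openssl dgst -sha256 -sign "$workdir/client_private.pem" \
    -out "$workdir/challenge.sig" "$workdir/challenge.txt"

# Server: verify the signature using only the client public key.
result=$(openssl dgst -sha256 -verify "$workdir/client_public.pem" \
    -signature "$workdir/challenge.sig" "$workdir/challenge.txt")
echo "$result"

rm -rf "$workdir"
```

The private key never leaves the client side: the server only needs the public key and the signature to be convinced.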

By default, on the IFB, password authentication is disabled to enforce the use of public-key-based authentication. To learn the `ssh` command, we are going to enable this option on your VMs. Find the `sshd` configuration file and open it with the editor of your choice.
<details><summary>Solution</summary>
<p>
```sh
vim /etc/ssh/sshd_config
```
</p>
</details>
This file is owned by **root**, so you need to get **root** access for your account.
<details><summary>Solution</summary>
<p>
```sh
docker run -it --volume /:/root/chroot alpine sh -c "chroot /root/chroot /bin/bash -c 'usermod -a -G sudo etudiant'" && su etudiant
```
</p>
</details>
Using the `sudo` command, edit the configuration file to set **PasswordAuthentication** to **yes** and add the following lines:
```
AllowUsers etudiant student
PermitRootLogin no
```
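
If you prefer to script this edit rather than doing it in an editor, a possible approach with GNU `sed` is sketched below. It works on a scratch copy with made-up starting content, so that it cannot break the real server configuration.

```sh
# Scratch file standing in for /etc/ssh/sshd_config (hypothetical content).
conf=$(mktemp)
printf '%s\n' '#PasswordAuthentication no' 'UsePAM yes' > "$conf"

# Enable password authentication, whether the directive is commented out or not.
sed -i -E 's/^#?[[:space:]]*PasswordAuthentication[[:space:]].*/PasswordAuthentication yes/' "$conf"

# Append the two extra directives.
printf '%s\n' 'AllowUsers etudiant student' 'PermitRootLogin no' >> "$conf"

result=$(cat "$conf")
printf '%s\n' "$result"
rm -f "$conf"
```

On the real file, the same commands would need `sudo`, and `sudo sshd -t` can be used afterwards to check that the configuration is still valid.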
The `sshd` (SSH Daemon) process is launched and managed by `systemd`. You can manage `systemd` services with the `systemctl` command. Try this command without any arguments. You can search for `sshd` by typing `/sshd` and pressing `enter`. You can leave the `systemctl` view by pressing `q`.
To apply our modification to the `sshd` server configuration, we need to restart the corresponding *service*. You can use the following command:
```sh
sudo systemctl restart sshd
```
You can use the keywords `start`, `stop` and `status` to manage `systemd` services.
Check the **status** of your `sshd` *service*.
<details><summary>Solution</summary>
<p>
```sh
sudo systemctl status sshd
```
</p>
</details>
You are going to create an account so that another participant of the course can connect to your VM.
```sh
sudo useradd -m -s /bin/bash -g users student
sudo passwd student
```
Give the password and your IP address (`ip address show`) to another participant of your choice.
To connect to an SSH server, you can use the following command:
```sh
ssh login@IP_address
```
Use this command to connect to another student VM.
On the first connection, `ssh` asks you to accept the public key of the server (key fingerprint). Thanks to this, if someone later tries to fool you by impersonating the SSH server, they won’t be able to do it without the corresponding private key.
You can close the connection by pressing `ctrl` + `d` or with the command `exit`.
Check the content of the `~/.ssh/` folder: where is the server public key saved?
Congratulations, you are connected to a VM through another VM!
Every time you want to connect to the SSH server, you have to type your account password, which is encrypted and sent over the network. Instead, you can use a pair of private and public keys to authenticate yourself.
First, you have to generate a pair of keys with the command:
```sh
ssh-keygen -t ed25519 -C "your.mail@ens-lyon.fr"
```
The option `-t` specifies the algorithm to use, while `-C` specifies the comment associated with the key (generally the email of the person generating the key). You can check the **man**ual and the internet to compare the different available algorithms.
It is good practice to name a given pair of keys after the server on which you want to use those keys.
You can use the name `/home/etudiant/.ssh/id_ed25519_otherVM`.
Then, as an additional security measure, you can restrict the usage of your private key by defining a password. You will need both the password and the key file to authenticate yourself.
The generated keys are in the folder `~/.ssh/`.
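
To see what `ssh-keygen` produces without touching your real `~/.ssh/` folder, the key generation can be replayed in a scratch directory. In this sketch, `-f` sets the file name and `-N ""` sets an empty password; the empty password is only there to keep the example non-interactive and is not a recommendation.

```sh
# Scratch directory standing in for ~/.ssh/ in this sketch.
keydir=$(mktemp -d)

# -q: quiet, -f: output file name, -N "": empty key password (demo only).
ssh-keygen -q -t ed25519 -C "your.mail@ens-lyon.fr" \
    -f "$keydir/id_ed25519_otherVM" -N ""

# Two files are created: the private key and the public key (.pub).
files=$(ls "$keydir")
printf '%s\n' "$files"

# -l prints the fingerprint of a key, here the public one.
fingerprint=$(ssh-keygen -l -f "$keydir/id_ed25519_otherVM.pub")
printf '%s\n' "$fingerprint"

rm -rf "$keydir"
```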
Then you need to make a copy of your public key (`.pub`) on the sshd server.
```sh
ssh-copy-id -i ~/.ssh/id_ed25519_otherVM.pub login@IP_address
```
Note that for security reasons, only you should be able to read and write within your `.ssh` folder (you don’t want someone else to tamper with your keys). You can use the commands `chmod 700 ~/.ssh` and `chmod 600 ~/.ssh/*`.
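
The effect of these permissions can be checked with `stat`. The sketch below uses a scratch folder with empty placeholder files instead of your real keys, and relies on the GNU version of `stat`.

```sh
# Scratch folder standing in for ~/.ssh in this sketch.
sshdir=$(mktemp -d)/.ssh
mkdir -p "$sshdir"
touch "$sshdir/id_ed25519_otherVM" "$sshdir/id_ed25519_otherVM.pub"

# 700: only the owner can list and enter the folder.
chmod 700 "$sshdir"
# 600: only the owner can read and write the files.
chmod 600 "$sshdir"/*

# Print the octal permissions followed by the name of each entry.
stat -c '%a %n' "$sshdir" "$sshdir"/*

dir_mode=$(stat -c '%a' "$sshdir")
key_mode=$(stat -c '%a' "$sshdir/id_ed25519_otherVM")
rm -rf "$(dirname "$sshdir")"
```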
You can try to log in to the server using the key with the following command:
```sh
ssh -i ~/.ssh/id_ed25519_otherVM login@IP_address
```
`ssh` should ask for your key password instead of the student account password.
Congratulations, you authenticated yourself on a remote server without sending your password over the network!
Sometimes, you want to do other things than executing commands on a remote computer. For example, you may want to transfer files over the network.
The `scp` command comes with the `ssh` client installation. You can use it to transfer files from your computer to the SSH server:
```sh
scp local/path login@IP_address:remote/path
```
> You can use a relative remote path: a path written after the ":" is relative to your home folder on the remote server.
You can also retrieve files from the server:
```sh
scp login@IP_address:remote/path local/path
```
To transfer a directory, you can use the `-r` switch.
`scp` is a basic command for file transfer. If you want an advanced progress bar and file integrity checking, you can use the `rsync` command instead.
For example:
```sh
rsync -auv local/path login@IP_address:remote/path
```
This command will only transfer the files from `local/path` that are missing or out of date in `remote/path`. The `-c` switch computes a checksum of each file locally and remotely to be certain that they are identical.
You can use the `sshfs` command to mount a remote folder over ssh on your computer.
The default authentication method for the IFB uses keys generated with the `rsa` algorithm:
```sh
ssh-keygen -t rsa -b 4096 -C "your.mail@ens-lyon.fr"
```
The `-b` option sets the size of the key in bits.
Instead of using the `ssh-copy-id` command, you are going to copy and paste your public key into your [IFB configuration page.](https://biosphere.france-bioinformatique.fr/cloudweb_account/settings/edit)
You can now use the [RainBio catalogue](https://biosphere.france-bioinformatique.fr/catalogue/) to launch any available VM and connect to it with SSH from your current VM.
Long ssh commands can be tedious to use. This is why we are now going to explore the last file in the `.ssh` folder: `.ssh/config`.
This file is divided into `Host` sections, like the following one, which lets you connect to [the ssh server of the ens](https://instella.ens-lyon.fr/stella/intra/ent-ssh.html).
```yaml
Host ens
    HostName ssh.ens-lyon.fr
    User <login>
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_ens
    PreferredAuthentications publickey,password
```
- `HostName` defines the server domain name or IP address
- `User` defines the login to use
- `Identit*` define the key authentication mechanism
- `PreferredAuthentications` gives the order in which the authentication mechanisms are tried
With this configuration you can use the command:
```sh
ssh ens
```
To connect to the `ssh.ens-lyon.fr` server.
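
One way to check how `ssh` interprets a `Host` section, without opening any connection, is the `-G` option of the OpenSSH client, which prints the resolved options and exits. The sketch below assumes a reasonably recent OpenSSH client; the scratch configuration file and the login `jdoe` are hypothetical stand-ins.

```sh
# Write a scratch configuration file instead of touching ~/.ssh/config.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Host ens
    HostName ssh.ens-lyon.fr
    User jdoe
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_ens
EOF

# -F selects the configuration file; -G prints the options that ssh would
# use for the alias "ens" and exits without connecting.
resolved=$(ssh -F "$conf" -G ens)
printf '%s\n' "$resolved" | grep -E '^(hostname|user) '

rm -f "$conf"
```

The alias `ens` is resolved to the `HostName` and `User` values of its section, which is what happens when you type `ssh ens`.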
You can also apply an ssh configuration to all hosts with a `Host *` section:
```yaml
Host *
    Compression yes
    ServerAliveInterval 36000
    ControlMaster auto
    ControlPersist yes
    ControlPath ~/.ssh/socket-%r@%h:%p
```
Here, we enable compression for all connections, and `ServerAliveInterval` asks the client to send a keep-alive message to the server after 36000 seconds without receiving data. The `Control*` options share a single connection per server: it is maintained through socket files in the `~/.ssh/` folder with names starting with `socket-`. This means that if you connect more than once to the same server, the same connection will be reused.
Sometimes, you want to connect to an ssh server through an intermediate ssh server (or several of them). To do that, you can use the **ProxyJump** option. For example, you can connect to a computer running an ssh server within the ens with the following config.
```yaml
Host work-ens
    ProxyJump ens
    HostName ip.ip.ip.ip
    User login
    IdentitiesOnly yes
    IdentityFile ~/.ssh/id_work
    PreferredAuthentications publickey,password
```
With the command `ssh work-ens`, the `ssh` client is going to first connect to `ens` and then from `ens` to the `ip.ip.ip.ip` server.
> We have used the following commands:
>
> - ssh to establish ssh connection
> - systemctl to manage system daemons
> - scp to copy files
> - rsync to copy files
In the next session, we are going to learn how to [install systemwide programs](./11_install_system_programs.html) like the ones managed by `systemd`.