title: 'R.1: Introduction to R and RStudio'
author: "Laurent Modolo [laurent.modolo@ens-lyon.fr](mailto:laurent.modolo@ens-lyon.fr);\n Hélène Polvèche [hpolveche@istem.fr](mailto:hpolveche@istem.fr)"
date: "2022"
output:
  rmdformats::downcute:
    self_contained: true
    use_bookdown: true
    default_style: "light"
    lightbox: true
    css: "../www/style_Rmd.css"
library(fontawesome)

`r fa(name = "fas fa-house", fill = "grey", height = "1em")` https://can.gitbiopages.ens-lyon.fr/R_basis/

rm(list=ls())
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = NA)
if (! require("klippy")) {
  install.packages("remotes")
  remotes::install_github("rlesur/klippy")
}
klippy::klippy(
  position = c('top', 'right'),
  color = "white",
  tooltip_message = 'Click to copy',
  tooltip_success = 'Copied !')

Introduction

The goal of this practical is to familiarize yourself with R and the RStudio environment.

The objectives of this session will be to:

  • Understand the purpose of each pane in RStudio
  • Do basic computation with R
  • Define variables and assign data to variables
  • Manage a workspace in R
  • Call functions
  • Manage packages
  • Be ready to produce graphics!
![](./img/intro_img.png){width=400px}

Acknowledgments

Some R background

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing.

  • Created by Ross Ihaka and Robert Gentleman
  • Initial version released in 1995
  • Free and open-source implementation of the S programming language
  • Currently developed by the R Development Core Team

Reasons to use it:

  • It's open source, which means that we have access to every bit of underlying computer code to prove that our results are correct (which is always a good point in science).
  • It’s free, well documented, and runs almost everywhere
  • It has a large (and growing) user base among scientists
  • It has a large library of external packages available for performing diverse tasks.
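# Count the packages currently available on CRAN
cran_packages <- nrow(available.packages(repos = "http://cran.us.r-project.org"))

# Install rvest (web scraping) if it is missing
if (! require("rvest")) {
  install.packages("rvest", quiet = TRUE)
}

library(rvest)
# Count the Bioconductor software packages listed on the release page
url <- 'https://www.bioconductor.org/packages/release/bioc/'
biocPackages <- url %>% read_html() %>% html_table() %>% .[[1]]
bioconductor_packages <- nrow(biocPackages)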

How do I use R ?

Unlike other statistical software programs like Excel, SPSS, or Minitab that provide point-and-click interfaces, R is an interpreted language.

This means that you have to write instructions for R, so you are going to learn to write code in R.

R is usually used in a terminal in which you can type or paste your R code:

But navigating between your terminal, your code and your plots can be tedious. This is why, in `r format(Sys.time(), "%Y")`, there is a better way to use R!

RStudio, the R Integrated development environment (IDE)

An IDE provides comprehensive facilities to computer programmers for software development. RStudio is free and open-source.

To open RStudio, you can install the RStudio application and open the app.

Otherwise, you can use the link and the login details provided to you by email. The web version of RStudio is the same as the application, except that you can open it in any recent browser.

RStudio interface

The same console as before (in the red box)

Errors, warnings, and messages

The R console is a textual interface, which means that you will enter code, but it also means that R is going to write information back to you and that you will have to pay attention to what is written.

There are 3 categories of messages that R can send you: errors, prefaced with Error in…; warnings, prefaced with Warning:; and messages, which don’t start with either Error or Warning.

  • Errors: consider them a red light. You must figure out what is causing them. Usually you can find useful clues in the error message about how to solve the problem.
  • Warnings: warnings are a yellow light. The code runs, but you have to pay attention. It's almost always a good idea to try to fix warnings.
  • Messages: these are just friendly messages from R telling you how things are running.
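
To make the three categories concrete, here is a minimal sketch that triggers one of each. The calls `tryCatch()` and `conditionMessage()` are base R functions used here only as an assumption of convenience, to capture the text of the warning and of the error so that the example can run as a script without stopping. In the console, you can simply type `log(-1)` or `log("a")` and read what R writes back.

```r
# A message: purely informative, the code keeps running.
message("Everything is fine")

# A warning: log(-1) still returns a value (NaN), but R flags it.
warning_text <- tryCatch(log(-1), warning = function(w) conditionMessage(w))
print(warning_text)

# An error: log() cannot work on text, so R stops and returns nothing.
error_text <- tryCatch(log("a"), error = function(e) conditionMessage(e))
print(error_text)
```

The printed texts are the clues mentioned above: they describe what went wrong, and they are usually the first thing to read when something fails.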

R as a Calculator

Now that we know what we should do and what to expect, we are going to try some basic R instructions. A computer can perform all the operations that a calculator can do, so let's start with that:

  • Add: +
  • Divide: /
  • Multiply: *
  • Subtract: -
  • Exponents: ^ or **
  • Parentheses: (, )
Now open RStudio. Type the commands shown in color in a blue box into the console. The expected results will always be printed in white in a blue box.

You can copy and paste, but I advise you to practice typing directly in the console. As with any language, you will become more familiar with R by using it.

To validate the line at the end of your command, press Return.
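
As a quick illustration of the operators listed above, each line below can be typed on its own. The numbers are arbitrary examples, and the expected result is given in the comment.

```r
7 + 2        # addition: 9
7 - 2        # subtraction: 5
7 * 2        # multiplication: 14
7 / 2        # division: 3.5
7 ^ 2        # exponent: 49
7 ** 2       # same as 7 ^ 2: 49
(7 + 2) * 2  # parentheses are evaluated first: 18
```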

First commands

You should see a > character before a blinking cursor. The > is called a prompt. The prompt is shown when you can enter a new line of R code.

1 + 100

For standard output, R prefixes each line of results with [N], where N is the index of the first value printed on that line. Here the result fits on a single line, so it starts with [1].

1 + 100
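
The [N] prefix becomes easier to understand with a result that spans several lines. The sketch below uses the `1:30` syntax, which generates the integers from 1 to 30. It is not introduced in this section and is used here only for illustration.

```r
# Print the integers from 1 to 30; the output wraps over several lines
1:30
```

Each output line starts with the index of its first value. Where the lines break depends on the width of your console, so the numbers in brackets may differ from one screen to another.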

Do the same thing, but press Return after typing +.

1 +

The console displays +.
The > can become a + for multi-line code. As there are two sides to the + operator, R knows that you still need to enter the right side of your formula. It is waiting for the rest of the command. Type just 100 and press Return:

100
1 + 100
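
The same continuation prompt appears whenever a command is incomplete, not only after an operator. As a small sketch, an unclosed parenthesis has the same effect:

```r
# R waits after the first line because the parenthesis is still open
(3 +
   5) * 2   # 16
```

If you see the + prompt without expecting it, you can press Esc in the RStudio console to cancel the incomplete command and get back to the > prompt.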

R keeps to the mathematical order

The order of operation is the natural mathematical order in R:

3 + 5 * 2

You can use parentheses ( ) to change this order.

(3 + 5) * 2

But too many parentheses can be hard to read:

(3 + (5 * (2 ^ 2))) # hard to read
3 + 5 * (2 ^ 2)     # if you forget some rules, this might help

Note : The text following a # is a comment. It will not be interpreted by R. In the future, I advise you to use comments a lot to explain in your own words what the command means.
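
A few cases of operator order are easy to get wrong. The lines below are illustrative examples (not taken from the original practical) showing how R ranks the minus sign, exponents, and operators of the same level.

```r
-2 ^ 2       # -4: the exponent is applied before the minus sign
(-2) ^ 2     # 4
2 ^ 3 ^ 2    # 512: exponents are evaluated right to left, as 2 ^ (3 ^ 2)
(2 ^ 3) ^ 2  # 64
6 / 2 * 3    # 9: division and multiplication run left to right
```

When in doubt, adding one pair of parentheses costs nothing and makes the intent explicit.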

Scientific notation

For very small or very large numbers, R will automatically switch to scientific notation.

2/10000

2e-4 is shorthand for 2 * 10^(-4). You can use e to write your own scientific notation.

5e3
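
Scientific notation only changes how a number is displayed, not its value. As a hedged sketch, the base R function `format()` with its `scientific` argument (an addition for illustration, not part of the original practical) lets you switch between the two displays:

```r
1.5e3                             # 1500
format(5000, scientific = TRUE)   # "5e+03"
format(2e-4, scientific = FALSE)  # "0.0002"
```

Note that `format()` returns text (shown between quotes), which is meant for display and not for further computation.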

Mathematical functions

R is distributed with a large number of existing functions. To call a mathematical function, you must write function_name(<number>).

For example, for the natural logarithm:

log(1)  # natural logarithm
log10(10) # base-10 logarithm
exp(0.5)
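
The same calling pattern applies to the other mathematical functions shipped with R. Here are a few more base R examples, chosen for illustration. Some functions accept a second argument, such as the number of decimals for `round()` or the base for `log()`.

```r
sqrt(16)           # square root: 4
abs(-3)            # absolute value: 3
round(3.14159, 2)  # rounding to 2 decimal places: 3.14
log(8, base = 2)   # logarithm in base 2: 3
exp(log(5))        # exp() undoes log(): 5, up to floating-point rounding
```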

Compute the factorial of 9 (9!)

9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1

or

factorial(9)
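
A third way, sketched here as an assumption since neither `prod()` nor the `1:9` syntax is introduced in this section, is to multiply all the integers from 1 to 9 with the base R function `prod()`:

```r
# 1:9 generates 1, 2, ..., 9 and prod() multiplies them together
prod(1:9)  # 362880
```

This avoids typing the whole product by hand and gives the same value as `factorial(9)`.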

Comparing things

We have seen some examples showing that R can do all the things that a calculator can do. But when we speak of a programming language, we think of writing computer programs. Programs are collections of instructions that perform specific tasks. If we want our future programs to be able to perform automatic choices, we need them to be able to perform comparisons.

Comparisons can be made with R. The result is a TRUE or FALSE value (which is not a number as before but a boolean type, called logical in R).
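
As a minimal sketch, here are R's standard comparison operators applied to arbitrary numbers, with the expected result in the comment. The last line uses the base R function `class()` to show the type of the result.

```r
1 == 1         # equal to: TRUE
1 != 1         # not equal to: FALSE
1 < 2          # less than: TRUE
1 <= 1         # less than or equal to: TRUE
1 > 2          # greater than: FALSE
2 >= 1         # greater than or equal to: TRUE
class(1 == 1)  # "logical"
```

Note the double equals sign `==` for testing equality.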

Try the following operators to get a `TRUE`, then change your command to get a `FALSE`.

You can use the ↑ (up arrow) key to edit the last command and go through your history of commands.