author: Laurent Modolo
title: Unix Streams and pipes
Streams and pipes
Objective: Understand the function of streams and pipes in Unix systems
When you read a file, you start at the top and go from left to right: you read a flux of information which stops at the end of the file.
Unix streams are much the same thing: instead of opening a file as one whole block of data, a process can read it as a flux. There are 3 standard Unix streams:
- stdin the standard input
- stdout the standard output
- stderr the standard error
Historically, stdin has been the card reader or the keyboard, while the two others were the card puncher or the display.
The command cat simply reads from stdin and displays the results on stdout.
cat
I can talk with
myself
It can also read files and display the results on stdout
cat .bashrc
Streams manipulation
You can use the > character to redirect a flux toward a file. The following command makes a copy of your .bashrc file.
cat .bashrc > my_bashrc
Check the results of your command with less.
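The same copy can be tried on a throwaway file first. This is a minimal sketch: the file names demo.txt and demo_copy.txt are made up for the illustration, and the commands run in a scratch directory.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"

# Create a small file to copy (demo.txt is a made-up name).
echo "first line" > demo.txt

# cat writes demo.txt to stdout; > sends that stream into demo_copy.txt.
cat demo.txt > demo_copy.txt

# cmp exits with status 0 when both files are identical.
cmp demo.txt demo_copy.txt && echo "identical"
```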
Following the same principle, create a my_cal file containing the calendar of this month. Check the results with the command less.
Reuse the same command with the unnamed option 1999. Check the results with the command less. What happened?
Try the following command
cal -N 2 > my_cal
What is the content of my_cal? What happened?
The > operator can take an argument: the syntax to redirect stdout to a file is 1>, which is also the default (equivalent to >). Here the -N option doesn't exist, so cal throws an error. Errors are sent to stderr, which has the number 2.
Save the error message in my_cal and check the results with less.
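The numbering of the streams can be seen by sending each one to its own file. The sketch below uses ls instead of cal so that the error is predictable: missing.txt is a made-up name for a file that does not exist, and out.txt and err.txt are arbitrary output names.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"
echo "hello" > demo.txt

# ls lists demo.txt on stdout (1) and complains about missing.txt on stderr (2).
# ls exits with a non-zero status here; "|| true" keeps a script going.
ls demo.txt missing.txt 1> out.txt 2> err.txt || true

cat out.txt   # the normal listing
cat err.txt   # the error message only
```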
We have seen that > overwrites the content of the file. Try the following commands:
cal 2020 > my_cal
cal >> my_cal
cal -N 2 2>> my_cal
Check the results with the command less.
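The difference between overwriting and appending can be sketched with short, predictable commands. The file names log.txt and missing.txt are assumptions made for this illustration.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"

echo "one" > log.txt      # creates log.txt
echo "two" > log.txt      # overwrites: log.txt now holds only "two"
echo "three" >> log.txt   # appends: log.txt holds "two" then "three"

# 2>> appends the error message of ls (missing.txt does not exist).
ls missing.txt 2>> log.txt || true

cat log.txt
```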
The > operator sends the stream from the left to the file on the right. Try the following:
cat < my_cal
What is the function of the < operator?
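One way to explore this is to compare a command that is given a file name with the same command reading its stdin. The sketch below assumes the wc command, whose -l option counts lines, and a made-up file letters.txt.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"
printf 'a\nb\nc\n' > letters.txt

# Given a file name, wc opens the file itself and prints the name after the count.
wc -l letters.txt

# With <, the shell opens the file and connects it to the stdin of wc:
# wc never sees the file name, so it prints only the count.
wc -l < letters.txt
```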
You can use different redirections on the same process. Try the following command:
cat <<EOF > my_notes
Type some text and type EOF on a new line. EOF stands for end of file; it's a conventional sequence used to indicate the start and the end of a file in a stream.
What happened? Can you check the content of my_notes? How would you modify this command to add new notes?
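Since EOF is only a convention, any other word works as a delimiter. A minimal sketch, where END_OF_NOTES and notes_demo are names chosen for this illustration:

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"

# Everything up to the line containing only END_OF_NOTES is sent to the stdin of cat.
cat <<END_OF_NOTES > notes_demo
EOF is only a convention,
any word can mark the end of the text.
END_OF_NOTES

cat notes_demo
```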
Finally, you can redirect a stream toward another stream with the following syntax:
cal -N 2 > my_redirection 2>&1
cal >> my_redirection 2>&1
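The form 2>&1 sends stderr (stream 2) to wherever stdout (stream 1) currently points. The sketch below uses ls so that both streams carry something; missing.txt is a made-up name for a file that does not exist, and both.txt is an arbitrary output name.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"
echo "hello" > demo.txt

# "> both.txt" sends stdout to the file, then "2>&1" points stderr at the
# same place as stdout. The order of the two redirections matters.
ls demo.txt missing.txt > both.txt 2>&1 || true

# both.txt holds the listing and the error message.
cat both.txt
```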
Pipes
The last stream manipulation that we are going to see is the pipe, which transforms the stdout of a process into the stdin of the next. Pipes are useful to chain multiple simple operations. The pipe operator is |
cal 2020 | less
What is the difference with this command?
cal 2020 | cat | cat | less
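Pipes become more useful when each command in the chain transforms the stream. A minimal sketch, assuming the commands sort (orders lines) and uniq (drops adjacent duplicate lines); the file name unique_fruits.txt is made up.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"

# printf writes four lines to stdout; each | feeds them to the next command.
# The final > stores the stdout of the last command of the chain.
printf 'pear\napple\npear\nfig\n' | sort | uniq > unique_fruits.txt

cat unique_fruits.txt
```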
The command zcat has the same function as the command cat, but for compressed files in gzip format.
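A small compressed file is enough to try this. In the sketch below, reads.txt is a made-up file name; gzip -dc is shown next to zcat because it is the equivalent spelling with gzip options, and on some systems zcat may only accept files ending in .Z.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"
printf 'ACGT\nTTGA\n' > reads.txt

gzip reads.txt          # replaces reads.txt with reads.txt.gz
zcat reads.txt.gz       # decompresses to stdout, reads.txt.gz is left untouched
gzip -dc reads.txt.gz   # same effect, spelled with gzip options
```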
The command wget downloads files from a URL to the corresponding file. Don't run the following command, which would download the human genome:
wget http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
We are going to use the -q switch, which silences wget (no download progress bar or such), and the option -O, which allows us to set the name of the output file. In Unix, setting the output file to - allows you to write the output on the stdout stream.
Analyze the following command, what would it do?
wget -q -O - http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz | gzip -dc | less
Remember that most Unix commands process input and output line by line, which means that you can process huge datasets without intermediate files or huge RAM capacity.
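The same idea can be tried without any download. This is a sketch that assumes the commands seq (prints a range of integers, one per line) and head (keeps the first lines of a stream); first_lines.txt is a made-up name that only stores the final result.

```shell
# Work in a scratch directory so that no real file is overwritten.
cd "$(mktemp -d)"

# One million lines are compressed and decompressed on the fly.
# No intermediate file is written between the steps, and head
# stops reading after the first three lines.
seq 1 1000000 | gzip -c | gzip -dc | head -n 3 > first_lines.txt

cat first_lines.txt
```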
We have used the following commands:
- cat / zcat to display information in stdout
- > / >> / < / << to redirect a flux
- | the pipe operator to connect processes
- wget to download files
You can head to the next session to apply pipe and stream manipulation.