Working with Data Streams in Linux
Overview

A neat trick in Linux is that a program's input and output can be redirected from and to ordinary files. This short note shows some of the stream redirection you can do on the command line, which may help you deal with program input and output in general and, more specifically, build test suites for text-based programs.

This note starts with the short version of working with streams on the command line and within programs, then gives a little background for those interested in why things work this way.

Working with Streams: the Command Line

Output from Linux programs can be redirected to a file by invoking the command with extra redirection specifiers, like so:

./prog >& saved.txt

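Whatever output prog would otherwise have printed to the console is written to saved.txt instead, overwriting any previous contents of that file. In bash, the >& form captures both regular output and error messages.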
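As a rough sketch of this behavior, the snippet below uses a hypothetical shell function named prog as a stand-in for ./prog (it is not part of the original note) and works in a scratch directory so no real files are touched. It spells the redirection as > saved.txt 2>&1, which bash treats as equivalent to >& saved.txt and which also works in plain POSIX shells.

```shell
# Work in a scratch directory so nothing important gets overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical stand-in for ./prog: one line on each output stream.
prog() {
  echo "reg"          # regular output
  echo "bad" >&2      # error output
}

# Pre-existing contents, to show that redirection overwrites the file.
echo "old contents" > saved.txt

# Portable spelling of: prog >& saved.txt
prog > saved.txt 2>&1

# The file now holds only what prog wrote; "old contents" is gone.
cat saved.txt
```

Nothing from prog reaches the console here; both lines end up in saved.txt.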

Most programs make a distinction between their error messages and their regular output. Those two streams of data can be redirected separately, using 1> for regular output and 2> for error output. For example:

./prog 1> log.txt 2> err.txt
will overwrite the file log.txt with the regular output of ./prog and will overwrite the file err.txt with any error messages that ./prog shows.
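To see the split in action without a real ./prog, here is a minimal sketch that again assumes a hypothetical shell function prog writing one line to each stream. The file names match the example above; the scratch directory is an addition for safety.

```shell
# Work in a scratch directory so nothing important gets overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical stand-in for ./prog.
prog() {
  echo "reg"          # regular output
  echo "bad" >&2      # error output
}

# Send each stream to its own file.
prog 1> log.txt 2> err.txt

echo "log.txt contains: $(cat log.txt)"
echo "err.txt contains: $(cat err.txt)"

# A plain > means the same as 1>, and /dev/null discards a stream.
prog > log_only.txt 2> /dev/null
```

This kind of separation is handy in test suites: the regular output in log.txt can be compared against an expected file while err.txt is checked on its own.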

Introducing the Output Streams

Linux processes are, by default, started with two output data streams open:

  • The standard output stream: This stream is associated with file descriptor 1, and is writable. When launched from a shell, stdout will correspond (by default) to output to the terminal window.
  • The standard error stream: This stream is associated with file descriptor 2, and is writable. When launched from a shell, stderr will also correspond (by default) to output to the terminal window.
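The descriptor numbers can be used directly from the shell. The following sketch, built around a hypothetical function called emit, writes one line to each descriptor by number and then captures each stream separately to show that they really are distinct.

```shell
# Hypothetical helper: write one line to each descriptor by number.
emit() {
  echo "to fd 1" >&1   # >&1 is redundant here, shown for symmetry
  echo "to fd 2" >&2
}

# Command substitution captures descriptor 1 only; descriptor 2 is discarded.
out=$(emit 2> /dev/null)

# Redirections apply left to right: descriptor 2 is first pointed at the
# capture, then descriptor 1 is sent to /dev/null.
err=$(emit 2>&1 > /dev/null)

echo "captured from fd 1: $out"
echo "captured from fd 2: $err"
```

Run without any redirection, emit would print both lines to the terminal, which is why the two streams can look like one.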
Programming languages may choose different abstractions for writing to these streams. For example, the following C program writes "reg" to standard output and "bad" to standard error:
#include <stdio.h>
int main(){
  fprintf(stdout, "reg");
  fprintf(stderr, "bad");
}
Note that the plain printf function (used like printf("howdy"), for example) implicitly selects stdout as the output stream. The equivalent code in C++ looks like:
#include <iostream>
int main(){
  std::cout << "reg";
  std::cerr << "bad";
}
The presence of two streams may seem superfluous, especially when both are written to the console.