Decoded: tail (coreutils)

[Back to Project Main Page]

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [POSIX requirement] [Linux man] [FreeBSD man]

Logical flow of tail command (coreutils)

Summary

tail - output the last part of files

[Source] [Code Walkthrough]

Lines of code: 2515
Principal syscall: write()
Support syscalls: open(), close()
Options: 25 (8 short, 17 long, does not include legacy digits for line count)

Originated with or shortly before the release of System III (1982)
Added to Textutils in November 1992 [First version]
Number of revisions: 408

tail is necessarily more complex than head because we must buffer input as it's read. Non-seekable streams are the most complex case, since we cannot seek ahead to the end.
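To make the buffering cost concrete, here is a minimal sketch of keeping the last n lines of a stream that cannot be seeked. It is not the coreutils code: read_line() and last_lines() are hypothetical names, and a simple ring of line pointers stands in for whatever buffering tail really uses.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read one line (including its '\n', if any) into a
   fresh NUL-terminated buffer.  Returns NULL at EOF; for brevity this
   sketch also returns NULL on allocation failure. */
static char *read_line(FILE *in)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap);
    int c;

    if (!buf)
        return NULL;
    while ((c = fgetc(in)) != EOF) {
        if (len + 2 > cap) {                 /* room for c and the NUL */
            char *tmp = realloc(buf, cap *= 2);
            if (!tmp) {
                free(buf);
                return NULL;
            }
            buf = tmp;
        }
        buf[len++] = (char) c;
        if (c == '\n')
            break;
    }
    if (len == 0) {                          /* nothing read: EOF */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}

/* Hypothetical helper: copy the last n lines of `in` to `out`.
   Returns 0 on success, -1 if the ring cannot be allocated.
   Lines with embedded NUL bytes are truncated by fputs(). */
int last_lines(FILE *in, size_t n, FILE *out)
{
    char **ring;
    char *line;
    size_t seen = 0;

    if (n == 0)
        return 0;
    ring = calloc(n, sizeof *ring);          /* every slot starts empty */
    if (!ring)
        return -1;

    while ((line = read_line(in)) != NULL) {
        free(ring[seen % n]);                /* drop the line leaving the window */
        ring[seen % n] = line;
        seen++;
    }

    /* Replay the surviving lines, oldest first. */
    for (size_t i = (seen < n ? 0 : seen - n); i < seen; i++)
        fputs(ring[i % n], out);

    for (size_t i = 0; i < n; i++)
        free(ring[i]);
    free(ring);
    return 0;
}
```

Calling last_lines(stdin, 10, stdout) would mimic the default behavior on a pipe. Every line has to be read and held until a newer one displaces it, which is why the non-seekable path costs more than the seekable one.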

Helpers:
  • any_live_files() - Checks if any target files are already open
  • any_non_regular_fifo() - Checks if any targets are non-regular files (devices, etc)
  • any_non_remote_file() - Checks if target files are on the local file system
  • any_remote_file() - Checks if target files are on remote file systems
  • any_symlinks() - Checks if any targets are actually symbolic links
  • check_fspec() - Checks for new data from target (in forever loops)
  • check_output_alive() - Checks if the target is still valid (in forever loops)
  • close_fd() - Closes the file descriptor with the associated name
  • dump_remainder() - Reads and counts input, possibly buffering
  • file_lines() - Outputs the last lines of a buffered file
  • fremote() - Tests if a file descriptor is on a remote system with fstatfs(), if possible
  • ignore_fifo_and_pipe() - Flags fifo and pipe streams to ignore
  • parse_obsolete_option() - Manual parsing of legacy options
  • parse_option() - Performs option parsing with getopt
  • pipe_bytes() - Prints final buffered characters from a pipe source
  • pipe_lines() - Prints final buffered lines from a pipe source
  • pretty_name() - Prints source name, which may be 'standard input'
  • recheck() - Tests File_spec and underlying file for changes
  • record_open_fd() - Fills out File_spec information from stat()
  • start_bytes() - Skips a number of bytes from the start of a pipe
  • start_lines() - Skips a number of lines from the start of a file or pipe
  • tail() - Performs the tail procedure on an open file descriptor
  • tail_bytes() - Tail procedure based on byte count
  • tail_file() - Opens a single file and performs tail procedure
  • tail_forever() - Procedure to tail files forever (-f)
  • tail_forever_inotify() - Procedure to tail files forever using inotify (-f)
  • tail_lines() - Tail procedure based on line count
  • tailable_stdin() - Checks if user requested tail on STDIN
  • valid_file_spec() - Checks that a file descriptor is valid and without errors
  • wd_comparator() - Comparison function for the watch descriptor (inotify)
  • wd_hasher() - Hash function for the watch descriptor (inotify)
  • write_header() - Prints the name of the new file on change
  • xlseek() - Attempts to lseek() on a file
  • xwrite_stdout() - Writes bytes from a buffer to STDOUT
External non-standard helpers:
  • die() - Exit with mandatory non-zero error and message to stderr
  • error() - Outputs error message to standard error with possible process termination

Setup

tail defines a structure, File_spec, that contains important data about the current file being processed.

tail also keeps several flags and variables as globals, including:

  • count_lines - Flag set if we're counting lines (not bytes) (-n)
  • disable_inotify - Flag to disable inotify
  • follow_mode - Determines if we follow names or descriptors
  • forever - Flag set if we're looping forever (-F, -f)
  • from_start - Flag set if output is counted from the start of a file (+NUM)
  • line_end - The end of line character, \n or \0
  • monitor_output - Flag to end processing if pipe closes
  • pid - The user-provided PID; when following, tail terminates after that process dies (--pid)
  • presume_input_pipe - Flag set if the presume pipe undocumented feature is on
  • print_headers - Flag set to print file name headers
  • reopen_inaccessible_files - Flag set if we attempt to reopen closed files (-F)

main() introduces a few local variables:

  • *F - The File_spec struct array for the input files
  • **file - The list of file names as provided by the user
  • header_mode - The current header mode
  • i - Generic iterator used in several ways (usually file number)
  • n_files - The number of files provided
  • n_units - The number of lines/bytes to process with tail
  • obsolete_options - Flag if obsolete options were processed
  • ok - The final return status

Parsing

Parsing answers the following questions to define the execution parameters:

  • Are we counting lines or bytes? And how many?
  • Are we following a file descriptor for more output? Is there a wait time?
  • Should we display headers?
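The three questions above map naturally onto a small options record. The sketch below uses POSIX getopt() with a subset of tail's short options (-c, -n, -f, -s, -q, -v); struct tail_opts and parse_tail_args() are hypothetical names, and long options, size suffixes, leading signs and the legacy digit syntax are deliberately left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical container for the answers parsing must produce. */
struct tail_opts {
    bool count_lines;        /* true: count lines (-n); false: bytes (-c) */
    unsigned long n_units;   /* how many lines or bytes */
    bool forever;            /* keep following the input (-f) */
    double sleep_interval;   /* seconds between checks (-s) */
    int header_mode;         /* 0: automatic, 1: always (-v), -1: never (-q) */
};

/* Returns the index of the first file operand, or -1 on a parsing failure. */
int parse_tail_args(int argc, char **argv, struct tail_opts *o)
{
    char *end;
    int c;

    o->count_lines = true;   /* defaults: last 10 lines, 1 second interval */
    o->n_units = 10;
    o->forever = false;
    o->sleep_interval = 1.0;
    o->header_mode = 0;

    while ((c = getopt(argc, argv, "c:n:fs:qv")) != -1) {
        switch (c) {
        case 'c':
        case 'n':
            o->count_lines = (c == 'n');
            if (!isdigit((unsigned char) optarg[0]))
                return -1;   /* this sketch accepts plain digits only */
            o->n_units = strtoul(optarg, &end, 10);
            if (*end != '\0')
                return -1;   /* invalid number of lines/bytes */
            break;
        case 'f':
            o->forever = true;
            break;
        case 's':
            o->sleep_interval = strtod(optarg, &end);
            if (end == optarg || *end != '\0' || o->sleep_interval < 0)
                return -1;   /* invalid number of seconds */
            break;
        case 'q':
            o->header_mode = -1;
            break;
        case 'v':
            o->header_mode = 1;
            break;
        default:
            return -1;       /* unknown option; getopt already reported it */
        }
    }
    return optind;
}
```

For a command line such as tail -c 20 -f -s 0.5 log.txt, the record ends up describing a byte count of 20 in follow mode with a half-second interval, and the return value points at log.txt.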

Parsing failures

These warning and failure cases are explicitly checked:

  • Invalid lines/bytes number provided
  • Invalid number of seconds provided
  • Trying to retry without following (warning)
  • Tracking a PID without following (warning)
  • Unknown option used

Parsing failures result in a short error message followed by the usage instructions. Warnings may allow processing to continue.
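The first failure case, an invalid lines/bytes number, can be sketched as a small validator. This assumes only the sign convention that a leading '+' counts from the start of the file and a leading '-' or no sign counts from the end. parse_count() is a hypothetical name, and the multiplier suffixes that the real utility accepts are not handled.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: parse a count such as "15", "-15" or "+15".
   On success stores the magnitude in *n_units, sets *from_start when the
   argument began with '+', and returns 0.  Returns -1 for anything else. */
int parse_count(const char *arg, uintmax_t *n_units, bool *from_start)
{
    char *end;

    *from_start = false;
    if (*arg == '+') {
        *from_start = true;
        arg++;
    } else if (*arg == '-') {
        arg++;
    }

    /* Require a digit here so that empty strings, second signs and
       leading whitespace are rejected before strtoumax() sees them. */
    if (*arg < '0' || *arg > '9')
        return -1;

    errno = 0;
    *n_units = strtoumax(arg, &end, 10);
    if (errno == ERANGE || *end != '\0')
        return -1;               /* out of range or trailing junk */
    return 0;
}
```

A caller that receives -1 would print its short error message and the usage text, as described above.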


Execution

tail execution is more complex than the diagram above suggests. There are two general strategies, depending on whether the file source is seekable. The work is simpler when we only skip a fixed amount from the start than when we need to see the end first. Finally, the read/poll forever case also defines a separate execution path.

  • It's possible to immediately exit successfully if there's no work to do (0 lines/bytes from end)
  • If there is work, start by allocating a File_spec for each target file
  • Open the next file
  • Set the File_spec data based on both the file and user options
  • Print headers for the new file (if multiple)
  • If this file source is seekable, seek to the end and scan backwards until enough data is found.
  • If the file source is not seekable, create a linebuffer linked list and continue to read until EOF
  • The 'output-forever case' is distinct from the seekable cases (not pictured above)
  • In all cases, close the file and move to the next as needed
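The seekable strategy in the list above can be sketched as a backward scan. This is a simplified illustration, not the file_lines() implementation: tail_start_offset() is a hypothetical name, the line terminator is fixed at '\n', and a short read is treated as an error to keep the code small.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the byte offset at which the last `n_lines`
   lines of the seekable descriptor `fd` begin, or -1 on error. */
off_t tail_start_offset(int fd, unsigned long n_lines)
{
    char buf[4096];
    char last;
    off_t end = lseek(fd, 0, SEEK_END);
    off_t pos;

    if (end < 0)
        return -1;               /* not seekable, or another lseek failure */
    if (n_lines == 0 || end == 0)
        return end;              /* nothing to print */

    /* A newline at the very end terminates the last line rather than
       separating two lines, so exclude it from the scan. */
    if (pread(fd, &last, 1, end - 1) != 1)
        return -1;
    pos = (last == '\n') ? end - 1 : end;

    while (pos > 0) {
        size_t want = pos < (off_t) sizeof buf ? (size_t) pos : sizeof buf;
        ssize_t got = pread(fd, buf, want, pos - (off_t) want);

        if (got != (ssize_t) want)
            return -1;           /* read error or unexpected short read */

        /* Walk the block from its end toward its start. */
        for (size_t i = want; i-- > 0; ) {
            if (buf[i] == '\n' && --n_lines == 0)
                return pos - (off_t) want + (off_t) i + 1;
        }
        pos -= (off_t) want;
    }
    return 0;                    /* fewer lines than requested: whole file */
}
```

Once the offset is known, the caller can seek to it and copy everything up to EOF. Only the tail end of the file is ever read, which is what makes the seekable path cheaper than buffering a pipe.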

Failure cases:

  • Unable to stat() file streams
  • Unable to read from input stream
  • Unable to write to standard out
  • Failure to fcntl() to nonblocking mode
  • Clock failure when trying to wait/sleep
  • Watch failures with inotify

All failures at this stage output an error message to STDERR and return without displaying usage help.
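As an illustration of one of these paths, the sketch below switches a descriptor to nonblocking mode and reports a failure on STDERR without printing usage help. set_nonblocking() is a hypothetical name and the message wording is illustrative, not the text tail prints.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: put `fd` into nonblocking mode.
   Returns 0 on success.  On failure, writes a one-line diagnostic naming
   the source to stderr and returns -1 so the caller can record the error
   and move on to the next file. */
int set_nonblocking(int fd, const char *name)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        fprintf(stderr, "tail: %s: cannot change nonblocking mode: %s\n",
                name, strerror(errno));
        return -1;
    }
    return 0;
}
```

The caller would fold the -1 into its final return status (the ok variable in main()) instead of exiting on the spot, so the remaining files still get processed.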

