Decoded: realpath (coreutils)

[Back to Project Main Page]

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [No POSIX requirement] [Linux man] [FreeBSD man]

Logical flow of realpath command (coreutils)

Summary

realpath - print the resolved file name

[Source] [Code Walkthrough]

Lines of code: 279
Principal syscall: readlink() via canonicalize_filename_mode() -> areadlink_with_size()
Support syscalls: stat()
Options: 19 (7 short, 12 long)

Added to Coreutils in January 2012 [First version]
Number of revisions: 27 [Code Evolution]

Helpers:
  • isdir() - Checks if a path is a directory
  • path_prefix() - Checks path prefix type (parent or path)
  • process_path() - Outputs canonical name and returns success status
  • realpath_canon() - Handles canonical names as needed based on logical mode setting
External non-standard helpers:
  • canonicalize_filename_mode() - Returns the canonical absolute name of a path according to the requested mode (from gnulib)
  • error() - Outputs error message to standard error with possible process termination

Setup

Many flags and strings are defined globally, including:

  • *can_relative_base - String holding the canonicalized form of the user-provided relative-base string
  • *can_relative_to - String holding the canonicalized form of the user-provided relative-to string
  • logical - Flag for logical mode
  • use_nuls - Flag to end each output line with \0 instead of \n
  • verbose - Flag for output detail

realpath initializes the following local variables in main():

  • can_mode - Bitfield for canonicalization flags
  • need_dir - Flag set if directories need to be verified
  • ok - Flag for success status
  • *relative_base - String holding the relative-base string specified by the user
  • *relative_to - String holding the relative-to string specified by the user

Parsing

While parsing its options, realpath considers:

  • What paths should we reference from?
  • Should we verify path components?
  • Should we follow symlinks?
  • Are we in logical mode? (Resolve '..' components before symlinks, or symlinks first)
  • Should each output name end with \0 or \n?
  • How much feedback should we provide the user?
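The questions above map naturally onto a getopt_long() loop. The following is a minimal sketch, not the coreutils source: struct rp_options and parse_options() are hypothetical names introduced for illustration, only a subset of the long options is listed, and the real program keeps these settings in the globals and main() locals described under Setup.

```c
#include <getopt.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container for the parsed settings (illustration only). */
struct rp_options {
    bool must_exist;            /* -e: every path component must exist */
    bool allow_missing;         /* -m: no path component needs to exist */
    bool logical;               /* -L: resolve ".." before symlinks */
    bool no_symlinks;           /* -s: do not expand symlinks */
    bool use_nuls;              /* -z: end each output name with \0 */
    bool verbose;               /* cleared by -q */
    const char *relative_to;    /* --relative-to=DIR, or NULL */
    const char *relative_base;  /* --relative-base=DIR, or NULL */
};

/* Fills *o from the command line.
   Returns the index of the first operand, or -1 on a usage error. */
int parse_options(int argc, char **argv, struct rp_options *o)
{
    enum { OPT_RELATIVE_TO = 256, OPT_RELATIVE_BASE };
    static const struct option longopts[] = {
        {"canonicalize-existing", no_argument, NULL, 'e'},
        {"canonicalize-missing", no_argument, NULL, 'm'},
        {"logical", no_argument, NULL, 'L'},
        {"physical", no_argument, NULL, 'P'},
        {"quiet", no_argument, NULL, 'q'},
        {"no-symlinks", no_argument, NULL, 's'},
        {"zero", no_argument, NULL, 'z'},
        {"relative-to", required_argument, NULL, OPT_RELATIVE_TO},
        {"relative-base", required_argument, NULL, OPT_RELATIVE_BASE},
        {NULL, 0, NULL, 0}
    };
    int c;

    *o = (struct rp_options){ .verbose = true };  /* defaults */
    optind = 1;                                   /* start a fresh scan */

    while ((c = getopt_long(argc, argv, "eLmPqsz", longopts, NULL)) != -1) {
        switch (c) {
        case 'e': o->must_exist = true;  o->allow_missing = false; break;
        case 'm': o->allow_missing = true; o->must_exist = false;  break;
        case 'L': o->logical = true;     break;
        case 'P': o->logical = false;    break;
        case 'q': o->verbose = false;    break;
        case 's': o->no_symlinks = true; break;
        case 'z': o->use_nuls = true;    break;
        case OPT_RELATIVE_TO:   o->relative_to = optarg;   break;
        case OPT_RELATIVE_BASE: o->relative_base = optarg; break;
        default:  return -1;             /* unknown option */
        }
    }
    if (optind >= argc)
        return -1;                       /* no target specified */
    return optind;
}
```

A caller would treat a return of -1 as the cue to print a short error and the usage text, and otherwise process argv[result] through argv[argc - 1] as targets.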

Parsing failures

These failure cases are explicitly checked:

  • No target specified
  • No newline separator with multiple output files (warning)

These failures result in a short error message followed by the usage instructions.


Execution

The most important options affecting execution are the user-specified reference paths, relative-to and relative-base. Both must be resolved to an absolute (canonical) form before any target can be compared against them. Execution generally goes like this:

  • Find the canonical form of the user-specified reference paths (if any)
  • Verify canonical paths exist (if requested by the user)
  • Check the relationship between relative-base and relative-to (if used)
  • Process each input target
    • Get the canonical name, possibly twice for symlinks.
    • For relative output, compare the canonical names and compute the difference
    • No errors so far? Print the name
    • Print requested separator (\0 or \n)
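The prefix check and the "compare canonical names" step can be sketched as plain string operations on two canonical absolute paths. This is an illustrative sketch under stated assumptions, not the coreutils code: is_path_prefix() and relative_path() are hypothetical names, and both inputs are assumed to be already canonical (absolute, no "." or ".." components, no trailing or doubled slashes).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if canonical directory `prefix` is `path` itself or an ancestor of it. */
bool is_path_prefix(const char *prefix, const char *path)
{
    size_t n = strlen(prefix);
    if (strncmp(prefix, path, n) != 0)
        return false;
    if (n == 1 && prefix[0] == '/')      /* "/" is above every absolute path */
        return true;
    return path[n] == '\0' || path[n] == '/';
}

/* Appends `s` to `out` (current length *len); false if it does not fit. */
static bool append(char *out, size_t outsz, size_t *len, const char *s)
{
    size_t n = strlen(s);
    if (*len + n + 1 > outsz)
        return false;
    memcpy(out + *len, s, n + 1);
    *len += n;
    return true;
}

/* Writes the name of `target` relative to directory `from` into `out`.
   Returns false if `out` is too small. */
bool relative_path(const char *from, const char *target,
                   char *out, size_t outsz)
{
    /* Find where the shared leading components end. */
    size_t i = 0, last_slash = 0;
    while (from[i] != '\0' && from[i] == target[i]) {
        if (from[i] == '/')
            last_slash = i;
        i++;
    }
    bool from_ends = from[i] == '\0' || from[i] == '/';
    bool target_ends = target[i] == '\0' || target[i] == '/';
    size_t common = (from_ends && target_ends) ? i : last_slash;

    size_t len = 0;
    bool wrote_up = false;
    if (outsz == 0)
        return false;
    out[0] = '\0';

    /* One ".." for every component of `from` below the common part. */
    for (const char *p = from + common; *p != '\0'; p++) {
        if (*p == '/' && p[1] != '\0') {
            if (!append(out, outsz, &len, wrote_up ? "/.." : ".."))
                return false;
            wrote_up = true;
        }
    }

    /* Then whatever remains of `target` below the common part. */
    const char *rest = target + common;
    if (*rest == '/')
        rest++;
    if (*rest != '\0') {
        if (wrote_up && !append(out, outsz, &len, "/"))
            return false;
        return append(out, outsz, &len, rest);
    }
    if (!wrote_up)
        return append(out, outsz, &len, ".");   /* same directory */
    return true;
}
```

For example, with from "/usr/bin" and target "/usr/lib/x" the shared part is "/usr", one component of from remains, and the result is "../lib/x". A check like is_path_prefix() is the kind of test that decides whether a target lies below a base directory before a relative name is produced at all.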

The common, trivial case is that the absolute path for each target is printed exactly as returned from canonicalize_filename_mode().
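The trivial case can be approximated with the POSIX realpath() library function in place of gnulib's canonicalize_filename_mode(). This is a swapped-in technique, shown only as a rough sketch: canonical_name() and print_canonical() are hypothetical names, and POSIX realpath() requires every path component to exist, whereas the utility by default tolerates a missing final component.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>

/* Returns a malloc'ed canonical absolute name for `path`,
   or NULL (with errno set) if it cannot be resolved. */
char *canonical_name(const char *path)
{
    /* With a NULL buffer, POSIX.1-2008 realpath() allocates the result. */
    return realpath(path, NULL);
}

/* Prints the canonical name of `path` followed by `terminator`
   ('\n' or '\0'). Returns 0 on success, -1 on failure. */
int print_canonical(FILE *out, const char *path, char terminator)
{
    char *resolved = canonical_name(path);
    if (resolved == NULL) {
        perror(path);               /* report why resolution failed */
        return -1;
    }
    fputs(resolved, out);
    fputc(terminator, out);
    free(resolved);
    return 0;
}
```

Calling print_canonical(stdout, argv[i], use_nuls ? '\0' : '\n') for each target, and tracking whether any call failed, mirrors the per-target loop described above.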

Failure cases:

  • No target argument provided
  • Unable to stat() target
  • Paths don't exist when required
  • No canonical file name for relative reference
