Decoded: df (coreutils)

[Back to Project Main Page]

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [POSIX requirement] [Linux man] [FreeBSD man]

Logical flow of df command (coreutils)

Summary

df - report file system disk space usage

[Source] [Code Walkthrough]

Lines of code: 1819
Principal syscall: stat(), statvfs()
Support syscalls: fstatat()
Options: 30 (14 short, 16 long)

Descended from df introduced in Version 1 UNIX (1971)
Added to Fileutils in October 1992 [First version]
Number of revisions: 298

There are two general strategies for measuring disk usage: ask the filesystem for its accounting metadata, or visit every file and add up an estimate. The df utility uses the former strategy via statfs()/statvfs(). See the du utility for the other strategy.
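As a rough illustration of the "ask the filesystem" strategy, the sketch below calls statvfs() directly and converts the block counts to bytes. It is not code from df.c: the struct fs_summary type and the fs_summary_for() function are hypothetical names, and coreutils reaches the same data through gnulib's get_fs_usage() rather than a direct call like this.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical summary type for this sketch; not a struct from df.c. */
struct fs_summary {
    uintmax_t total_bytes;  /* size of the filesystem */
    uintmax_t free_bytes;   /* free space, including root-reserved blocks */
    uintmax_t avail_bytes;  /* free space available to unprivileged users */
};

/* Ask the filesystem that contains `path` for its accounting data.
   Returns 0 on success, -1 on failure (errno is set by statvfs). */
static int fs_summary_for(const char *path, struct fs_summary *out)
{
    struct statvfs sv;

    assert(path != NULL && out != NULL);
    if (statvfs(path, &sv) != 0)
        return -1;

    /* The f_b* counts are expressed in units of f_frsize;
       fall back to f_bsize if a filesystem reports 0 there. */
    uintmax_t unit = sv.f_frsize ? sv.f_frsize : sv.f_bsize;
    out->total_bytes = (uintmax_t) sv.f_blocks * unit;
    out->free_bytes  = (uintmax_t) sv.f_bfree * unit;
    out->avail_bytes = (uintmax_t) sv.f_bavail * unit;
    return 0;
}
```

A single call costs the same no matter how many files the filesystem holds, which is why this strategy is so much cheaper than walking the tree.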

Helpers:
  • add_excluded_fs_type() - Excludes a filesystem type from processing
  • add_fs_type() - Adds a file system to be processed
  • add_uint_with_neg_flag() - Adds unsigned values whose sign is tracked in a separate flag
  • add_to_grand_total() - Accumulates a filesystem's usage into the grand total
  • alloc_field() - Allocates space for a field
  • alloc_table_row() - Allocates a row for the output table
  • decode_output_arg() - Parse the output arguments for the eventual display
  • devlist_compare() - Comparison function for devices
  • devlist_for_dev() - Locate a device within the device table
  • devlist_free() - Free the device list
  • devlist_hash() - Index function for the device list
  • df_readable() - Human-readable display with distinct negative flag
  • excluded_fstype() - Checks if a filesystem type is excluded
  • filter_mount_list() - Removes duplicates, ensuring distinct mount list entries
  • get_all_entries() - Constructs a list of all distinct mount entries
  • get_dev() - Creates an output listing for a given device
  • get_disk() - Display usage for a single mount point
  • get_entry() - Verifies device request and prepares stat()
  • get_field_list() - Prepares the table columns
  • get_field_values() - Gathers block/inode values
  • get_header() - Builds the appropriate header
  • get_point() - stat() the device at the given mount point
  • has_uuid_suffix() - True if the input ends with a UUID
  • hide_problematic_chars() - Convert control characters to '?'
  • last_device_for_mount() - Returns a string for the last device mounted
  • me_for_dev() - Finds a device in the mount entry list
  • print_table() - Final output for df
  • selected_fstype() - Returns true if the input is a displayed fs type
External non-standard helpers:
  • free_mount_entry() - Removes a mount entry
  • get_fs_usage() - The workhorse function to get fs data for many systems (from gnulib)
  • human_readable() - Formats numeric data with meaningful suffixes
  • read_file_system_list() - Reads the list of mounted filesystems (from /proc/self/mountinfo on Linux)
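
To make one of the smaller helpers concrete, here is a minimal sketch of the idea behind hide_problematic_chars(): control characters in a cell are replaced with '?' so that a mount point containing, say, a newline cannot break the table layout. The function name replace_control_chars() is hypothetical, and the sketch works byte by byte; the coreutils version may treat multibyte text differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Replace every control character in `cell` with '?', in place.
   Byte-oriented simplification for illustration only.
   Returns `cell` so calls can be chained. */
static char *replace_control_chars(char *cell)
{
    assert(cell != NULL);
    for (char *p = cell; *p != '\0'; p++) {
        /* Cast to unsigned char: passing a negative char to
           iscntrl() is undefined behavior. */
        if (iscntrl((unsigned char) *p))
            *p = '?';
    }
    return cell;
}
```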

Setup

df defines a few global structs and variables prior to execution:

  • struct devlist is a singly linked list of devices (number/mount point)
  • struct fs_type_list is a list of filesystem types
  • struct field_values_t holds the possible values of all fields
  • struct field_data_t holds the attributes for a single field
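
The device list can be pictured with a sketch like the one below. The names struct dev_node, dev_list_push(), and dev_list_find() are hypothetical and the fields are an assumption based on the description above (device number plus mount point); the real struct devlist may carry more, and the helper list suggests df also indexes devices through a hash function rather than relying on a linear scan alone.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical node type for this sketch; not the struct from df.c. */
struct dev_node {
    dev_t dev_num;             /* device number reported by stat() */
    const char *mount_point;   /* where the device is mounted */
    struct dev_node *next;     /* next entry, or NULL at the tail */
};

/* Push `node` onto the front of the list and return the new head. */
static struct dev_node *dev_list_push(struct dev_node *head,
                                      struct dev_node *node)
{
    assert(node != NULL);
    node->next = head;
    return node;
}

/* Linear search for a device number; returns NULL when absent. */
static const struct dev_node *dev_list_find(const struct dev_node *head,
                                            dev_t dev_num)
{
    for (; head != NULL; head = head->next) {
        if (head->dev_num == dev_num)
            return head;
    }
    return NULL;
}
```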

The global variables are dedicated to the option flags and the data table to display, including:

  • show_all_fs - Flag for all file systems (-a)
  • show_local_fs - Flag for only checking local filesystems (-l)
  • show_listed_fs - Flag to display only given filesystems (cli arguments)
  • file_systems_processed - Flag set if any file system was processed
  • *fs_select_list - The list of file systems explicitly requested
  • *fs_exclude_list - The list of file systems explicitly ignored
  • print_type - Flag to show the file system type (-T)
  • print_grand_total - Flag to show the usage grand total (--total)
  • field_data[] - The header for each column
  • **columns - The data for each column
  • ***table - The values in each cell (*table -> *column -> *cell)
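
A triple pointer like ***table is easier to follow with a small example. The sketch below assumes a row-major layout (rows[r][c] is one cell string) and shows why the whole table is built before anything is printed: column widths depend on every row. The names struct text_table, table_add_row(), and table_column_width() are hypothetical, cells are borrowed strings rather than owned copies, and widths are counted in bytes for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table type for this sketch. */
struct text_table {
    const char ***rows;  /* rows[r][c] is the cell at row r, column c */
    size_t nrows;
    size_t ncols;
};

/* Append a row of `ncols` empty (NULL) cells.
   Returns the new row, or NULL if allocation fails. */
static const char **table_add_row(struct text_table *t)
{
    assert(t != NULL && t->ncols > 0);

    const char ***grown = realloc(t->rows, (t->nrows + 1) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    t->rows = grown;

    const char **row = calloc(t->ncols, sizeof *row);
    if (row == NULL)
        return NULL;
    t->rows[t->nrows++] = row;
    return row;
}

/* Width in bytes of the widest cell in column `c`. */
static size_t table_column_width(const struct text_table *t, size_t c)
{
    size_t width = 0;

    assert(t != NULL && c < t->ncols);
    for (size_t r = 0; r < t->nrows; r++) {
        const char *cell = t->rows[r][c];
        size_t len = cell ? strlen(cell) : 0;
        if (len > width)
            width = len;
    }
    return width;
}
```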

Also, several enum types are defined to facilitate options, including the display mode, the field basis (block/inode/text), the actual field types, etc.

main() initializes all of the above to default values (or NULL pointers). Two more local variables are defined:

  • *msg_mut_excl - Error message for mutually exclusive options
  • posix_format - Forces the POSIX output format (-P)

Parsing kicks off with the short options passed as a string literal:
"aB:iF:hHklmPTt:vx:"

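To show how that option string is read, the sketch below feeds it to POSIX getopt(): a letter followed by ':' takes an argument (B, F, t, x), and the rest are plain flags. The struct df_flags type and the parse_short_options() function are hypothetical and model only a handful of the options; df itself also accepts long options, which plain getopt() does not handle.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical subset of df's option state, for illustration. */
struct df_flags {
    bool all;             /* -a */
    bool inodes;          /* -i */
    bool local_only;      /* -l */
    bool posix_format;    /* -P */
    bool print_type;      /* -T */
    const char *fs_type;  /* argument of the last -t, or NULL */
};

/* Parse short options using the option string quoted above.
   Returns the index of the first non-option argument,
   or -1 if getopt() reports an unknown option or missing argument. */
static int parse_short_options(int argc, char **argv, struct df_flags *f)
{
    int c;

    assert(argv != NULL && f != NULL);
    optind = 1;  /* start scanning from the first argument */
    while ((c = getopt(argc, argv, "aB:iF:hHklmPTt:vx:")) != -1) {
        switch (c) {
        case 'a': f->all = true; break;
        case 'i': f->inodes = true; break;
        case 'l': f->local_only = true; break;
        case 'P': f->posix_format = true; break;
        case 'T': f->print_type = true; break;
        case 't': f->fs_type = optarg; break;
        case '?': return -1;
        default:  break;  /* valid options this sketch does not model */
        }
    }
    return optind;
}
```

Arguments left over after the options (from the returned index onward) are the file systems the user asked about explicitly.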

Parsing

Parsing the options answers a few questions about what the user wants:

  • What file systems and types are we interested in?
  • What information (fields) of data do we want?
  • What units do we want the answer in? (block/inodes?)
  • Any special considerations for the display? (POSIX style?)

Parsing failures

Parsing may fail in several ways:

  • A filesystem type is both selected and excluded
  • Combining --output with any of -i, -P, or -T (the inode, portability, and print-type options)

User-specified parsing failures result in a short error message followed by the usage instructions. Access-related parsing errors die with an error message.
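
The two kinds of conflict listed above can be checked with very little code. The following is a sketch under assumptions, not the logic from df.c: output_conflict() and selected_and_excluded() are hypothetical names, and the filesystem type lists are modeled as plain string arrays instead of df's linked lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return the option that conflicts with --output,
   or NULL when the combination is acceptable. */
static const char *output_conflict(bool output_given, bool inodes,
                                   bool posix_format, bool print_type)
{
    if (!output_given)
        return NULL;
    if (inodes)
        return "-i";
    if (posix_format)
        return "-P";
    if (print_type)
        return "-T";
    return NULL;
}

/* Return true if `type` is in `list` (an array of `n` strings). */
static bool type_in_list(const char *type, const char *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(type, list[i]) == 0)
            return true;
    }
    return false;
}

/* Return true if `type` was both selected and excluded. */
static bool selected_and_excluded(const char *type,
                                  const char *const *selected, size_t nsel,
                                  const char *const *excluded, size_t nexc)
{
    assert(type != NULL);
    return type_in_list(type, selected, nsel)
        && type_in_list(type, excluded, nexc);
}
```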


Execution

Executing the df utility follows this path:

  • stat() any of the explicitly checked file systems
  • Build the list of currently mounted devices
  • Construct the output table header and column structures
  • Go through each mounted device and get file system data
  • Construct the table entry
  • Print the output table
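
As an example of turning raw file system data into a table entry, the sketch below derives a use percentage from three block counts. It assumes the percentage is taken against the space visible to unprivileged users (used plus available) and is rounded up, so a nearly empty but non-empty filesystem never shows 0%. The use_percent() name is hypothetical, and the sketch simply gives up on inputs that would overflow rather than handling them.

```c
#include <assert.h>
#include <stdint.h>

/* Percentage of non-reserved space in use, rounded up.
   Returns -1 when the value is undefined or would overflow. */
static int use_percent(uintmax_t total, uintmax_t free_blocks,
                       uintmax_t avail)
{
    if (free_blocks > total)
        return -1;                    /* inconsistent counts */
    uintmax_t used = total - free_blocks;

    if (avail > UINTMAX_MAX - used)
        return -1;                    /* used + avail would overflow */
    uintmax_t usable = used + avail;  /* excludes root-reserved blocks */

    if (usable == 0 || used > UINTMAX_MAX / 100)
        return -1;                    /* empty, or used * 100 overflows */

    uintmax_t scaled = used * 100;
    int pct = (int) (scaled / usable + (scaled % usable != 0));
    assert(pct >= 0 && pct <= 100);   /* follows from used <= usable */
    return pct;
}
```

For instance, with 1000 total blocks, 400 free, and 350 available, used is 600 and the usable pool is 950, so the result is 64 rather than 63.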

Failure cases:

  • No file systems are processed (nothing mounted or target not available)
  • Unable to access file systems

All failures at this stage output an error message to STDERR and return without displaying usage help.

