Decoded: numfmt (coreutils)

[Back to Project Main Page]

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [No POSIX requirement] [Linux man] [No FreeBSD entry]

Logical flow of numfmt command (coreutils)

Summary

numfmt - reformat numbers

[Source] [Code Walkthrough]

Lines of code: 1652
Principal syscall: write()
Support syscalls: None
Options: 20 (2 short, 18 long)

This is one of the newest utilities added to coreutils
Added to Coreutils in February 2013 [First version]
Number of revisions: 44

Helpers:
  • absld() - Absolute value for long doubles
  • default_scale_base() - Returns the base (either 1024 or 1000)
  • double_to_human() - Converts an input long double to a string including the magnitude suffix
  • expld() - Computes the power and scalar for an input long double
  • next_field() - Returns the next field (multiple on a line)
  • parse_format_string() - Processes a string with the user provided format (--format)
  • parse_human_number() - Breaks a human-readable number into double and precision components
  • powerld() - The exponentiation for a long double base
  • prepare_padded_number() - Constructs and prints a number with given precision
  • print_padded_number() - Prints a padded number including prefix and suffix
  • process_field() - Reads, handles, and outputs the next field in the line
  • process_line() - The top-level numfmt operation for a single line of input
  • process_suffixed_number() - Reads a number string and adds a suffix if needed
  • setup_padding_buffer() - Initializes the padding buffer
  • simple_round() - Performs a round with selectable operations
  • simple_round_ceiling() - Round up without bounds checking
  • simple_round_floor() - Rounds down without bounds checking
  • simple_round_from_zero() - Rounds away from zero without bounds checking
  • simple_round_nearest() - The default 'round' case (up or down depending on fractional value)
  • simple_round_to_zero() - Rounds towards zero without bounds checking
  • simple_strtod_fatal() - Checks the strtod error code and prints a message
  • simple_strtod_float() - Reads a floating-point string and returns the long double equivalent
  • simple_strtod_human() - Reads a floating-point string with a suffix and returns the long double equivalent
  • simple_strtod_int() - Reads an integer string and returns the long double equivalent
  • suffix_power() - Returns the power magnitude for the input suffix character
  • suffix_power_char() - Returns the single suffix character for the input magnitude
  • unit_to_umax() - Parses custom from/to unit sizes from user arguments
  • valid_suffix() - Tests if a suffix character is valid
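
The simple_round*() helpers above can be pictured with a small Python sketch. This is not the C code from numfmt.c: the function below is a hypothetical stand-in, the style names are the values the GNU manual lists for --round, and treating halfway cases in 'nearest' as rounding away from zero is an assumption of this sketch.

```python
import math


def simple_round(value, style="from-zero"):
    """Round value to an integer using one of the --round style names."""
    if style == "up":            # ceiling
        return math.ceil(value)
    if style == "down":          # floor
        return math.floor(value)
    if style == "from-zero":     # away from zero
        return math.ceil(value) if value >= 0 else math.floor(value)
    if style == "towards-zero":  # truncate
        return math.trunc(value)
    if style == "nearest":
        # Assumption: halfway cases move away from zero.
        if value >= 0:
            return math.floor(value + 0.5)
        return math.ceil(value - 0.5)
    raise ValueError(f"unknown rounding style: {style}")
```

Like the helpers it imitates, the sketch performs no bounds checking; it only selects the rounding direction.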
External non-standard helpers:
  • die() - Exit with mandatory non-zero error and message to stderr
  • error() - Outputs error message to standard error with possible process termination
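
To make the suffix helpers listed above more concrete, here is a minimal Python sketch of the idea behind suffix_power(), suffix_power_char(), parse_human_number() and double_to_human(). It is an illustration under assumptions, not the utility's implementation: the suffix table is limited to K through E, parse_human() and to_human() are hypothetical names, and the output rounds away from zero to one decimal place below 10 and to a whole number otherwise.

```python
from decimal import Decimal, ROUND_UP

# Assumed suffix table: position + 1 is the power of the base (K=1, M=2, ...).
SUFFIXES = "KMGTPE"


def suffix_power(ch):
    """Return the power for a suffix character, or 0 if it is not a suffix."""
    idx = SUFFIXES.find(ch) if len(ch) == 1 else -1
    return idx + 1 if idx >= 0 else 0


def suffix_power_char(power):
    """Return the suffix character for a power, or '' when out of range."""
    return SUFFIXES[power - 1] if 1 <= power <= len(SUFFIXES) else ""


def parse_human(text, base=1000):
    """Turn a string such as '1.5K' into a plain number."""
    power = suffix_power(text[-1]) if text else 0
    digits = text[:-1] if power else text
    return float(digits) * base ** power


def to_human(value, base=1000):
    """Scale an integer down and append the matching suffix."""
    scaled = Decimal(value)
    power = 0
    while abs(scaled) >= base and power < len(SUFFIXES):
        scaled /= base
        power += 1
    if power == 0:
        return str(value)
    # Round away from zero; carry into the next suffix is not handled here.
    step = Decimal("0.1") if abs(scaled) < 10 else Decimal("1")
    return f"{scaled.quantize(step, rounding=ROUND_UP)}{suffix_power_char(power)}"
```

For example, parse_human("2K", base=1024) gives 2048.0 and to_human(1500) gives "1.5K". Decimal is used on the output side so that rounding up is not disturbed by binary floating-point error.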

Setup

numfmt keeps many of the utility parameters as globals, including:

  • auto_padding - Flag to detect padding requirement
  • conv_exit_code - Variable to hold conversion errors
  • debug - Flag for debug mode (stderr printing)
  • *decimal_point - The decimal point character (locale-based)
  • decimal_point_length - The decimal point length
  • delimiter - The delimiter between fields in a line
  • dev_debug - Flag to enable devmsg()
  • *format_str_prefix - The format string prior to any formatting
  • *format_str_suffix - The format string after all formatting
  • from_unit_size - The user-provided source unit (--from-unit)
  • grouping - Flag for locale-based grouping (--grouping)
  • header - Number of STDIN header lines to pass through unconverted (--header)
  • inval_style - The invalid mode type (--invalid)
  • line_delim - The end of line character
  • padding_alignment - The alignment mode, using mbsalign alignment
  • *padding_buffer - The characters in the padding buffer
  • padding_buffer_size - The size of the padding buffer
  • padding_width - The padding value provided by the user (--padding)
  • round_style - The rounding style provided by the user (--round)
  • scale_from - The magnitude of the source number (--from)
  • scale_to - The magnitude of the desired number (--to)
  • *suffix - The user provided suffix string (--suffix)
  • to_unit_size - The user-provided target output unit (--to-unit)
  • user_precision - The user provided precision size (--format)
  • zero_padding_width - The length of a zero format section

main() introduces a few local variables:

  • locale_ok - Tests that the locale is properly set
  • valid_numbers - Tracks that all conversions are successful

Parsing

Parsing assembles the formatting requested by the user. The questions to answer are:

  • Are there fields? If so, how many? What is the delimiter?
  • What is the output format string?
  • Is there a conversion to or from human readable values?
  • Is there a header?
  • Do we assume input or output values are already scaled?

Parsing failures

These failure cases are explicitly checked:

  • Nonsensical padding sizes or header lengths
  • Providing multiple field definitions
  • Providing a multi-character field delimiter
  • Using --grouping and either --format or --to together
  • Not providing any formatting instructions
  • Unknown option used

User-specified parsing failures result in a short error message followed by the usage instructions. Access-related parsing errors die with an error message.
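
The checks listed above can be sketched as a single validation pass. The Python below is only an illustration: check_options() and the dictionary keys are hypothetical, the messages are paraphrased rather than copied from the utility, and treating a padding or header value of zero as "nonsensical" is an assumption.

```python
def check_options(opts):
    """Return a message for the first problem found, or None if all is well.

    opts is a hypothetical dict of option values collected during parsing.
    """
    padding = opts.get("padding")
    if padding is not None and padding == 0:
        return "invalid padding value"
    header = opts.get("header")
    if header is not None and header <= 0:
        return "invalid header value"
    if opts.get("field_count", 0) > 1:
        return "multiple field specifications"
    delimiter = opts.get("delimiter")
    if delimiter is not None and len(delimiter) > 1:
        return "the delimiter must be a single character"
    if opts.get("grouping") and (opts.get("format") or opts.get("to")):
        return "grouping cannot be combined with --format or --to"
    # At least one option must ask for some kind of conversion or formatting.
    keys = ("from", "to", "padding", "format", "grouping")
    if not any(opts.get(key) for key in keys):
        return "no conversion option specified"
    return None
```

Returning the first failure mirrors the fail-fast behavior described above; an unknown option would be rejected earlier, by the option parser itself.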


Execution

numfmt employs a small optimization that skips processing the selected options do not need. For simplicity, here is the most complete path that execution can take:

  • Parse the format string provided by the user to set padding, alignment, and grouping globals
  • Allocate and initialize the padding buffers from format information
  • Print header lines if requested. No formatting is applied and information is passed directly to fputs()
  • Read the next line of input
    • Find the next field (all lines have at least 1 field)
    • Compute the suffix conversion
    • Add padding as necessary for the computed value
    • Print the value
    • Repeat for all fields
  • After the last line, check if all conversions succeeded and return the result
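
The steps above can be sketched as a short loop. This Python version is a simplification under stated assumptions: run() and this process_line() are hypothetical stand-ins, the convert callback takes the place of the suffix conversion, whitespace-delimited input is re-joined with single spaces, and only one field is converted per line.

```python
def process_line(line, convert, field=1, delimiter=None, padding=0):
    """Convert one field of a line and return (new_line, ok)."""
    fields = line.split(delimiter)  # None splits on runs of whitespace
    ok = True
    if 1 <= field <= len(fields):
        original = fields[field - 1]
        try:
            text = convert(original)
            # Positive padding right-aligns, negative padding left-aligns.
            if padding >= 0:
                fields[field - 1] = text.rjust(padding)
            else:
                fields[field - 1] = text.ljust(-padding)
        except ValueError:
            ok = False  # leave the field unchanged on failure
    joiner = " " if delimiter is None else delimiter
    return joiner.join(fields), ok


def run(lines, convert, header=0, **kwargs):
    """Pass header lines through, convert the rest, and track success."""
    out = []
    all_ok = True
    for number, line in enumerate(lines):
        if number < header:
            out.append(line)  # header lines are not formatted
            continue
        new_line, ok = process_line(line, convert, **kwargs)
        out.append(new_line)
        all_ok = all_ok and ok
    return out, all_ok
```

The second return value of run() plays the role of valid_numbers: it stays true only if every conversion succeeded.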

Note that there is a special exit case for conversion failures (EXIT_CONVERSION_WARNINGS).
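
As a rough illustration of that exit case, the Python sketch below picks the final status from the valid_numbers flag and the --invalid mode. The mode names (abort, fail, warn, ignore) and the status value 2 follow the GNU manual's description of --invalid; the function and constant names are hypothetical and are not taken from numfmt.c.

```python
EXIT_SUCCESS = 0
CONVERSION_EXIT_STATUS = 2  # assumed value, per the GNU manual


def exit_status(valid_numbers, inval_style="abort"):
    """Choose the process exit status after the last line is handled."""
    # 'warn' and 'ignore' report success even when a conversion failed.
    if not valid_numbers and inval_style not in ("warn", "ignore"):
        return CONVERSION_EXIT_STATUS
    return EXIT_SUCCESS
```

In abort mode the real utility stops at the first bad number, so this final check matters mostly for the fail mode.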

Failure cases:

  • Setting the locale failed
  • Unable to read from STDIN
  • Number conversion failed for any reason
  • Number too large for the target precision
  • Invalid suffix value found
  • Converted value is too large
  • User-provided format is invalid

All failures at this stage output an error message to STDERR and return without displaying usage help.

