Decoded: nl (coreutils)

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [POSIX requirement] [Linux man] [FreeBSD man]

Logical flow of nl command (coreutils)

Summary

nl - number lines and write files

[Source] [Code Walkthrough]

Lines of code: 601
Principal syscall: write()
Support syscalls: open(), close(), fadvise()
Options: 24 (11 short, 13 long)

Descended from nl introduced in System III (1982)
Added to Textutils in October 1992 [First version]
Number of revisions: 137

Helpers:
  • build_type_arg() - Sets the section type and builds the regex pattern
  • check_section() - Determines which section we are processing (header, body, footer)
  • nl_file() - The top-level nl procedure
  • print_lineno() - Prints the line number as currently formatted
  • proc_body() - The body procedure for nl
  • proc_footer() - The footer procedure for nl
  • proc_header() - The header procedure for nl
  • proc_text() - Processes non-sectioned text
  • process_file() - Processes a single input file
External non-standard helpers:
  • die() - Exit with mandatory non-zero error and message to stderr
  • error() - Outputs error message to standard error with possible process termination

Setup

nl declares a long list of file-scope variables that describe and control execution:

  • *body_del is the body delimiter string
  • body_del_len is the length of the body delimiter
  • body_fastmap is the body's fastmap supporting regex search
  • body_regex is the body's regular expression pattern buffer
  • body_type is the style of the body section (-b [atnp])
  • current_type holds the numbering style in use now; it should match the current section
  • have_read_stdin is set if standard input is ever read
  • *header_del is the header delimiter string
  • header_del_len is the length of the header delimiter
  • header_fastmap is the header's fastmap supporting regex search
  • header_regex is the header's regular expression pattern buffer
  • header_type is the style of the header section (-h [atnp])
  • *footer_del is the footer delimiter string
  • footer_del_len is the length of the footer delimiter
  • footer_fastmap is the footer's fastmap supporting regex search
  • footer_regex is the footer's regular expression pattern buffer
  • footer_type is the style of the footer section (-f [atnp])
  • line_buf is the buffer to hold the next line
  • line_no is the line number to print next
  • page_incr is the amount added to the line counter after each numbered line (-i)
  • reset_numbers is a flag that controls whether line numbers restart on each logical page (cleared by -p)
  • starting_line_number is the initial line number
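
The three delimiter globals above are related: the header, body, and footer delimiters are the base section delimiter (by default `\:`) repeated three times, twice, and once. The following is a minimal sketch of that relationship, not the code from nl itself. The struct and function names are hypothetical, and letting the shorter delimiters share the header's storage is a choice made for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for this sketch; nl keeps these values in
   separate globals (header_del, body_del, footer_del). */
struct section_dels {
    char *header;      /* base delimiter repeated three times */
    char *body;        /* base delimiter repeated twice */
    char *footer;      /* base delimiter once */
    size_t footer_len; /* base length; body is 2x, header is 3x */
};

/* Build all three delimiters from one base string such as "\\:".
   Returns 0 on success, -1 if allocation fails. */
static int build_section_dels(const char *section_del, struct section_dels *out)
{
    assert(section_del != NULL && out != NULL);
    size_t len = strlen(section_del);

    out->header = malloc(len * 3 + 1);
    if (out->header == NULL)
        return -1;
    memcpy(out->header, section_del, len);
    memcpy(out->header + len, section_del, len);
    memcpy(out->header + 2 * len, section_del, len);
    out->header[3 * len] = '\0';

    /* The shorter delimiters are suffixes of the header delimiter,
       so they can point into the same allocation. */
    out->body = out->header + len;
    out->footer = out->body + len;
    out->footer_len = len;
    return 0;
}
```

Only `header` owns memory in this sketch, so a caller would release everything with a single `free(d.header)`.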

main() initializes the following:

  • c - Holds the next option character for parsing
  • len - The base length of the section delimiter (equal to the footer delimiter's length)
  • ok - Flag for execution success

Parsing kicks off with the short options passed as a string literal:
"h:b:f:v:i:pl:s:w:n:d:"

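In the option string, a letter followed by a colon takes an argument, so only `p` is a bare flag. The sketch below feeds the same string to `getopt_long()` to show how that plays out. The `nl_opts` struct and `parse_nl_opts()` are hypothetical names for illustration, and only three of the options are recorded; the rest are accepted and ignored.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical summary of what was seen; not a structure from nl. */
struct nl_opts {
    const char *body_style;  /* argument of -b, or NULL */
    const char *width;       /* argument of -w, or NULL */
    int no_renumber;         /* 1 if -p was given */
    int unknown;             /* number of unrecognized options */
};

static void parse_nl_opts(int argc, char **argv, struct nl_opts *o)
{
    static const struct option longopts[] = {
        {"body-numbering", required_argument, NULL, 'b'},
        {"no-renumber",    no_argument,       NULL, 'p'},
        {"number-width",   required_argument, NULL, 'w'},
        {NULL, 0, NULL, 0}
    };
    int c;

    assert(o != NULL);
    o->body_style = o->width = NULL;
    o->no_renumber = o->unknown = 0;
    optind = 1;  /* start scanning at the first argument */
    opterr = 0;  /* keep getopt quiet; the caller reports errors */

    while ((c = getopt_long(argc, argv, "h:b:f:v:i:pl:s:w:n:d:",
                            longopts, NULL)) != -1) {
        switch (c) {
        case 'b': o->body_style = optarg; break;
        case 'w': o->width = optarg;      break;
        case 'p': o->no_renumber = 1;     break;
        case '?': o->unknown++;           break;
        default:  break;  /* valid option, not tracked in this sketch */
        }
    }
}
```

Called with the arguments `-b a -p --number-width=3`, the sketch records style `a`, width `3`, and the no-renumber flag.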

Parsing

Parsing collects the options that answer the questions nl needs the user to settle:

  • What line number formatting applies to each section?
  • What are the section delimiters?
  • What line number do we start with, and by how much do we increment?
  • Do we reset the line number after each logical page?
  • Should we force the width of the line number field?

A subtle parsing task is that any user-provided regular expression style (the p style) is compiled into a pattern buffer as soon as its option is read. See the gnulib manual for more details.
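
To illustrate compiling at parse time, here is a minimal sketch of a style parser. It swaps in POSIX `regcomp()` and `regexec()` for the gnulib pattern-buffer interface that nl uses, so it shows the idea rather than nl's code. The `section_style` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical per-section state for this sketch. */
struct section_style {
    char type;      /* 'a', 't', 'n', or 'p' */
    regex_t regex;  /* valid only when type == 'p' */
};

/* Parse a style argument such as "a", "t", "n", or "p^#".
   Returns 0 on success, -1 for an unknown style or a bad pattern. */
static int set_section_style(const char *arg, struct section_style *s)
{
    assert(arg != NULL && s != NULL);
    switch (arg[0]) {
    case 'a':
    case 't':
    case 'n':
        s->type = arg[0];
        return 0;
    case 'p':
        /* Everything after 'p' is a basic regular expression,
           compiled now so that bad patterns fail during parsing. */
        if (regcomp(&s->regex, arg + 1, REG_NOSUB) != 0)
            return -1;
        s->type = 'p';
        return 0;
    default:
        return -1;
    }
}

/* Decide whether LINE (newline removed) is numbered under style S. */
static int line_is_numbered(const struct section_style *s, const char *line)
{
    switch (s->type) {
    case 'a': return 1;                /* all lines */
    case 't': return line[0] != '\0';  /* non-empty lines only */
    case 'p': return regexec(&s->regex, line, 0, NULL, 0) == 0;
    default:  return 0;                /* 'n': no lines */
    }
}
```

A caller that set a `p` style would release the compiled pattern with `regfree(&s.regex)` when finished.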

Parsing failures

These failure cases are explicitly checked:

  • Unknown numbering style
  • Invalid starting line numbers
  • Nonsensical line increments or number widths (such as zero or negatives)
  • Unknown formats
  • Unknown options used

Failures result in a short error message followed by the usage instructions.


Execution

The nl execution path is predictable with few gotchas:

  • For each input file:
    • Open the file streams
    • Read a line
    • Check the line type to set the formats and regex pattern buffer
    • Write the current line number (if the line's style calls for one) and advance the line counter
    • Write the actual text line
    • Close the file streams
  • Close standard input if used
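
The per-stream part of the procedure above can be sketched as a single loop. This is a simplified illustration under stated assumptions, not nl's implementation: it hard-codes the default delimiters, numbers only non-empty body lines, uses a six-column number field followed by a tab, and restarts numbering at each header delimiter. The function names and the fixed-size buffer are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum section { HEADER, BODY, FOOTER, TEXT };

/* Classify a line (newline removed) against the default delimiters. */
static enum section check_line(const char *line)
{
    if (strcmp(line, "\\:\\:\\:") == 0) return HEADER;
    if (strcmp(line, "\\:\\:") == 0)    return BODY;
    if (strcmp(line, "\\:") == 0)       return FOOTER;
    return TEXT;
}

/* Copy IN to OUT, numbering non-empty body lines.
   Returns 0 on success, -1 on a write error.
   Lines longer than the buffer are not handled in this sketch. */
static int number_stream(FILE *in, FILE *out)
{
    char buf[4096];
    long line_no = 1;
    enum section current = BODY;  /* input starts in the body section */

    assert(in != NULL && out != NULL);
    while (fgets(buf, sizeof buf, in) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';
        enum section s = check_line(buf);
        if (s != TEXT) {
            current = s;
            if (s == HEADER)
                line_no = 1;   /* assumption: restart on each logical page */
            fputc('\n', out);  /* a delimiter line becomes an empty line */
        } else if (current == BODY && buf[0] != '\0') {
            fprintf(out, "%6ld\t%s\n", line_no++, buf);
        } else {
            fprintf(out, "%7s%s\n", "", buf);  /* blank number field */
        }
    }
    return ferror(out) ? -1 : 0;
}
```

Opening and closing the input, and the regex-based styles, are left out to keep the loop readable.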

Failure cases:

  • Unable to open or close input file
  • Too many lines (overflow)
  • A regular expression search failed
  • Failure to write to STDOUT
  • Failure to close standard input (if used)

All failures at this stage write an error message to STDERR and return without displaying usage help.

