Decoded: cp (coreutils)

[Back to Project Main Page]

Note: This page explores the design of command-line utilities. It is not a user guide.
[GNU Manual] [POSIX requirement] [Linux man] [FreeBSD man]

Logical flow of cp command (coreutils)

Summary

cp - copy files and directories

[Source] [Code Walkthrough]

Lines of code: 1227
Principal syscall: None
Support syscalls: stat(), open(), close()
Options: 50 (21 short, 29 long)

Descended from cp introduced in Version 1 UNIX (1971)
Added to Fileutils in October 1992 [First version]
Number of revisions: 327

Much of the work behind the cp utility passes through support functions in copy.c, which are shared with the mv and install utilities.

Helpers:
  • cp_option_init() - Initializes the cp_options structure
  • decode_preserve_arg() - Processes the argument provided by the user with --preserve
  • do_copy() - The top-level copy procedure for all the requested targets
  • make_dir_parents_private() - Verifies that the parent directory exists
  • re_protect() - Verifies parent directory access properties
  • target_directory_operand() - Verifies that the target is a directory
External non-standard helpers:
  • copy() - The copy interface for copy.c to perform the task using provided options
  • die() - Exit with mandatory non-zero error and message to stderr
  • error() - Outputs error message to standard error with possible process termination

Setup

cp uses several global flags and variables, including:

  • parents_option - Flag to recreate each source file's full path under the target directory (--parents)
  • remove_trailing_slashes - Flag to remove trailing slashes from sources
  • selinux_enabled - Flag set if SELinux services are enabled on the system

main() adds a few locals before starting parsing:

  • backup_suffix - The user-provided backup file suffix (-S)
  • c - The next option character to process
  • copy_contents - Flag to force copy of special file contents (--copy-contents)
  • make_backups - Flag to make backups of files that change (-b)
  • no_target_directory - Treat the final operand as a file (-T)
  • ok - The final return status
  • *scontext - The security context
  • *target_directory - The user-provided target directory (-t)
  • *version_control_string - The user-provided backup method (--backup)
  • x - The copy options (as a struct cp_options)

Parsing

Parsing breaks down the user-provided options to answer these questions about the copy procedure:

  • Which files do we copy and where to?
  • Should we create a backup, if so, is there a special suffix?
  • Should the operation force the copy?
  • Any special treatment for links or other special files?
  • Is this a recursive operation?
  • Is there a security context?
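
The parsing step can be sketched as an option loop that fills in the locals listed under Setup. The sketch below is written in Python rather than the C of cp.c, purely for brevity; the ParsedArgs class and parse_args() function are hypothetical names, and only a handful of cp's options are handled.

```python
import getopt
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ParsedArgs:
    """Hypothetical container for a few of the values set during parsing."""
    make_backups: bool = False               # -b
    backup_suffix: Optional[str] = None      # -S
    target_directory: Optional[str] = None   # -t
    no_target_directory: bool = False        # -T
    parents_option: bool = False             # --parents
    copy_contents: bool = False              # --copy-contents
    operands: List[str] = field(default_factory=list)


# Only a small subset of cp's options, chosen for illustration.
SHORT_OPTS = "bS:t:T"
LONG_OPTS = ["suffix=", "target-directory=", "no-target-directory",
             "parents", "copy-contents"]


def parse_args(argv: List[str]) -> ParsedArgs:
    # gnu_getopt raises getopt.GetoptError when it meets an unknown option.
    opts, operands = getopt.gnu_getopt(argv, SHORT_OPTS, LONG_OPTS)
    state = ParsedArgs(operands=operands)
    for c, optarg in opts:
        if c == "-b":
            state.make_backups = True
        elif c in ("-S", "--suffix"):
            state.backup_suffix = optarg
        elif c in ("-t", "--target-directory"):
            state.target_directory = optarg
        elif c in ("-T", "--no-target-directory"):
            state.no_target_directory = True
        elif c == "--parents":
            state.parents_option = True
        elif c == "--copy-contents":
            state.copy_contents = True
    return state
```

Whatever is left over after the options are consumed becomes the list of operands, which answers the "which files, and where to" question for the execution stage.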

Parsing failures

These failure cases are explicitly checked:

  • Trying to create both a hard link and a symbolic link
  • Using --no-clobber and --backup at the same time
  • Using --reflink=always with a --sparse mode other than auto
  • Trying to both set and preserve the security context
  • Specifying a security context without SELinux
  • Unknown option used

User-specified parsing failures result in a short error message followed by the usage instructions. Access-related parsing errors die with an error message.
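
The conflict checks above amount to a series of tests on the parsed state, run once parsing has finished. A minimal Python sketch follows, assuming a hypothetical Requested record and find_conflict() function; the field names and message strings are illustrative and are not the ones used in cp.c.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requested:
    """Hypothetical summary of the options relevant to the conflict checks."""
    hard_link: bool = False
    symbolic_link: bool = False
    no_clobber: bool = False
    make_backups: bool = False
    reflink_always: bool = False
    sparse_mode: str = "auto"
    set_security_context: bool = False
    preserve_security_context: bool = False
    selinux_enabled: bool = False


def find_conflict(r: Requested) -> Optional[str]:
    """Return an illustrative error message for the first conflict, else None."""
    if r.hard_link and r.symbolic_link:
        return "cannot create both a hard link and a symbolic link"
    if r.no_clobber and r.make_backups:
        return "--no-clobber and --backup cannot be combined"
    if r.reflink_always and r.sparse_mode != "auto":
        return "--reflink requires --sparse=auto"
    if r.set_security_context and r.preserve_security_context:
        return "cannot both set and preserve the security context"
    wants_context = r.set_security_context or r.preserve_security_context
    if wants_context and not r.selinux_enabled:
        return "security context requested but SELinux is not enabled"
    return None
```

A caller would print the returned message and exit with a non-zero status; a None result means the options are mutually consistent and execution can begin.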


Execution

The key organizational idea behind cp is that the source and destination files or directories are handed to the copy() interface, along with control information held in the cp_options structure.

The execution path splits along the two possible cp forms: copying src to dest and copying files to a target directory.

The overall process for cp looks like this:

  • Gather the backup type and set the naming convention
  • Set up the security context as requested
  • Initialize a hash table for file name search (see cp-hash.c)
  • Copy oldfile to newfile:
    • Test for the edge case where force and backup are used on the same source and destination
    • Pass names and options to copy()
  • Copy file(s) to destination directory:
    • Initialize hash lookup tables for source and destination (fast problem detection)
    • Remove the trailing slashes from the source files
    • With --parents, build the destination from the full source path, or else...
    • Append the source's file name to the target directory
  • Pass the source/destination arguments and copy options to copy()
  • Repeat the copy for the next source file provided (for the target directory case)
  • Return the exit status from copy()
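
The split between the two forms can be sketched as a function that turns the operands into (source, destination) pairs and then feeds each pair to a copy routine. This is a simplified Python sketch, not the logic of cp.c: plan_copies(), do_copy() as written here, and the is_dir and copy callables are hypothetical stand-ins, trailing slashes are always stripped, and the backup, hash-table, and security-context steps are left out.

```python
import posixpath
from typing import Callable, List, Optional, Tuple


def plan_copies(operands: List[str],
                target_directory: Optional[str] = None,
                no_target_directory: bool = False,
                parents_option: bool = False,
                is_dir: Callable[[str], bool] = posixpath.isdir
                ) -> List[Tuple[str, str]]:
    """Return the (source, destination) pairs that would be passed to copy()."""
    files = list(operands)
    if not no_target_directory and target_directory is None:
        # No -t given: treat the last operand as the target if it is a directory.
        if len(files) >= 2 and is_dir(files[-1]):
            target_directory = files.pop()
        elif len(files) > 2:
            raise ValueError(f"target {files[-1]!r} is not a directory")

    if target_directory is not None:
        pairs = []
        for src in files:
            src = src.rstrip("/")  # remove trailing slashes from the source
            if parents_option:
                # Keep the full source path under the target directory.
                dst = posixpath.join(target_directory, src.lstrip("/"))
            else:
                # Keep only the final component of the source path.
                dst = posixpath.join(target_directory, posixpath.basename(src))
            pairs.append((src, dst))
        return pairs

    # Plain "copy oldfile to newfile" form.
    if len(files) != 2:
        raise ValueError("expected exactly one source and one destination")
    return [(files[0], files[1])]


def do_copy(operands: List[str],
            copy: Callable[[str, str], bool],
            **kwargs) -> bool:
    """Run copy() on every pair; the result is True only if all copies succeed."""
    ok = True
    for src, dst in plan_copies(operands, **kwargs):
        ok = copy(src, dst) and ok  # keep going after a failure
    return ok
```

Passing is_dir and copy in as parameters keeps the sketch free of real file-system access, so the path-building rules can be exercised on their own.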

Failure cases:

  • Target is not a directory when multiple sources are given
  • Unable to open or close I/O files
  • Unable to read from input source

All failures at this stage output an error message to STDERR and return without displaying usage help.

