Text Manipulation
Change the nature and content of text files
Question | Answer |
---|---|
Removes characters and fields from lines of text in a text stream | cut |
cut Options: cuts characters | cut -c |
cut Options: cuts fields | cut -f |
cut Options: delimiter | cut -d |
cut Options: remove lines that do not have a field delimiter | cut -s |
cut Options: Select only bytes | cut -b |
Replace TABS with Spaces | expand |
Specify the number of spaces for expand (with no option, default is 8 spaces) | expand -t (number) |
Change SPACES into a tab | unexpand |
unexpand Option: Change all occurrences; without -a, only leading spaces are changed | unexpand -a |
unexpand Option: Specify number of spaces to be changed (default is 8) | unexpand -t (number) |
Format lines in file or text stream to a uniform length (default is 75) | fmt |
fmt option: Specifies number of characters for the width | fmt -w (number) |
fmt Option: Split long lines only; do not join short lines to fill the width | fmt -s |
Combines text from 2 files based on IDENTICAL fields; by default fields are delimited by whitespace | join |
join Options: Ignore case when searching for identical text | join -i |
join Options: Specifies the number of the field to join on in both files | join -j |
join Options: Number of the field from the 1st listed file to use when joining | join -1 |
join Options: Number of the field from the 2nd listed file to use when joining | join -2 |
join Options: Character used as delimiter | join -t |
join Example: Using the 2nd field of each file as the join field (output from sort piped to join; the - operand stands for the piped input) | sort -n text1 \| join -1 2 -2 2 - text5 |
Places a line number in front of each line in text file | nl |
nl Options: Specifies the increment to use in numbering | nl -i |
nl Options: Starting number | nl -v |
nl Options: Specifies the text to place between the number and the line (default is a TAB) | nl -s |
Displays the contents of any file in octal, decimal, hexadecimal, or character format | od |
od Option: Radix of the file offset: o = octal, d = decimal, x = hexadecimal, n = no offset | od -A (Example: od -A d -t c text2b) |
od Option: Controls the form of the display | od -t |
od Option: Character dump | od -c |
Adds contents of one file to the contents of another on a LINE BY LINE basis | paste |
paste Option: Specifies the character to be placed between the conjoined lines on each file (single character only) | paste -d |
paste Example: Place an @ symbol between each line of file1 and file2 | paste -d @ file1 file2 |
Formats a text file for printing | pr |
pr Options: Double space lines | pr -d |
pr Options: Specifies text to replace file name (default) in header | pr -h |
pr Options: Specify number of lines per page (default = 66) | pr -l |
pr Options: Have no header or footer | pr -t |
pr Options: Merge files, printing them in parallel, one file per column | pr -m |
pr Options: Create a left-hand margin | pr -o |
pr Example: Create a left-hand margin of 4 spaces | pr -o 4 |
Sorts the lines of a file or text stream alphabetically | sort |
sort Options: Ignore leading blanks | sort -b |
sort Options: Dictionary order; consider only blanks and alphanumeric characters, ignoring special characters | sort -d |
sort Options: Ignore case | sort -f |
sort Options: Sort by month | sort -M |
sort Options: Sort by numeric value | sort -n |
sort Options: sort in reverse order | sort -r |
Split lines of text from a file or text stream into segments of a specified number of lines | split |
split Options: Specify number of lines per file | split -l (number) or -(number) |
split Options: Splits into specified byte size instead of lines | split -b |
split Options: Use numeric suffixes rather than alphabetic for file names | split -d |
split Options: Specifies number of characters in the suffix | split -a |
split Example: Splits the AllNames file into individual files containing 50 lines each = FiftyNames000, FiftyNames001, etc. | split -50 -d -a 3 AllNames FiftyNames |
Filters identical lines from a file; lines must be adjacent (use sort first) | uniq |
uniq Options: Print duplicate lines only | uniq -d |
uniq Options: Specifies number of initial fields to skip (fields delimited by whitespace) | uniq -f |
uniq Options: Specify the number of initial characters to skip | uniq -s |
uniq Options: Specifies number of characters to compare | uniq -w |
uniq Options: Print only lines that are not repeated | uniq -u |
Prints number of lines, words, and bytes from text in file or text stream | wc |
wc Options: Print BYTES | wc -c |
wc Options: Print CHARACTERS (same as bytes only for single-byte encodings) | wc -m |
wc Options: Print LINES | wc -l |
wc Options: Print length of the longest line | wc -L |
wc Options: Print WORD COUNT | wc -w |
Translates characters in a text stream. Reads standard input only, so text is piped or redirected to it. Takes 2 character sets: the 1st set specifies the characters to be changed, the 2nd set specifies what they are changed to | tr |
tr Options: Change all characters except those specified in 1st set | tr -c |
tr Options: Deletes characters found in 1st set | tr -d |
tr Options: Squeezes runs of a repeated character into a single occurrence | tr -s |
tr Options: Truncate (shorten) the 1st set of characters to match size of 2nd set | tr -t |
Create reports based on data retrieved from files, build databases, or perform mathematical operations against numbers in text files | awk |
awk Options: Specifies the file containing the awk program | awk -f |
awk Options: Specify DELIMITER to be used | awk -F |
awk Fields: Used to designate FIELDS by number ($1 is the 1st field, $0 is the whole line) | $(number) |
awk Escape sequence: Used to insert a TAB | \t |
awk Escape sequence: Used to insert a NEW LINE | \n |
awk Escape sequence: Used to insert a form-feed character | \f |
awk Escape sequence: Used to insert a CARRIAGE RETURN | \r |
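
Several of the commands in the table can be tried together in one short script. The following is a minimal sketch, not a definitive reference: the file names (people, roles), their contents, and the variable names are all made up for illustration, and the script assumes a POSIX-style shell with the standard utilities available.

```shell
#!/bin/sh
# Hypothetical sample data; file names and contents are invented for this sketch.
workdir=$(mktemp -d)
printf 'alice:30:paris\nbob:25:rome\ncarol:35:oslo\n' > "$workdir/people"
printf 'alice:admin\nbob:guest\ncarol:staff\n' > "$workdir/roles"

# cut: keep fields 1 and 3, using ':' as the delimiter
cut_out=$(cut -d : -f 1,3 "$workdir/people")

# sort + uniq: uniq only collapses ADJACENT duplicates, so sort first
uniq_out=$(printf 'pear\napple\npear\napple\nfig\n' | sort | uniq)

# uniq -d: print only the lines that occur more than once
dup_out=$(printf 'pear\napple\npear\napple\nfig\n' | sort | uniq -d)

# tr: squeeze repeated spaces, then translate lower case to upper case
tr_out=$(printf 'hello   world\n' | tr -s ' ' | tr '[:lower:]' '[:upper:]')

# join: combine lines whose 1st field matches (both files are sorted on it)
join_out=$(join -t : "$workdir/people" "$workdir/roles")

# paste: glue the files together line by line with '@' in between
paste_out=$(paste -d @ "$workdir/people" "$workdir/roles" | head -n 1)

# wc -l: count lines (tr strips the padding some implementations add)
wc_out=$(wc -l < "$workdir/people" | tr -d ' ')

# awk: use ':' as the delimiter and total the 2nd field
awk_out=$(awk -F : '{ total += $2 } END { print total }' "$workdir/people")

# expand: replace the TAB with spaces, using tab stops every 4 columns
expand_out=$(printf 'a\tb\n' | expand -t 4)

echo "cut:    $cut_out"
echo "uniq:   $uniq_out"
echo "dups:   $dup_out"
echo "tr:     $tr_out"
echo "join:   $join_out"
echo "paste:  $paste_out"
echo "wc:     $wc_out"
echo "awk:    $awk_out"
echo "expand: $expand_out"

rm -rf "$workdir"
```

Each result is captured in a variable before printing so that individual commands can be inspected or swapped out one at a time while studying the table.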