The intersection and set difference (A-B) on text files
Intersection and set difference operations are common in mathematics classes on set theory. Similar operations on strings are useful in some scenarios.
Getting ready
The comm command is a utility that compares two sorted files. It displays the lines that are unique to file 1, the lines that are unique to file 2, and the lines that appear in both files. It has options to suppress one or more of these columns, making it easy to perform intersection and difference operations.
- Intersection: The intersection operation will print the lines the specified files have in common with one another
- Difference: The difference operation will print the lines that appear in some of the specified files but not in all of them
- Set difference: The set difference operation will print the lines in file A that do not appear in any of the other specified files (B plus C, for example)
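As a rough illustration of how these three operations map onto comm's column-suppression options, here is a minimal sketch. The file names left.txt and right.txt and their contents are hypothetical, chosen only for this example; they are not the recipe's sample files. Both files are already sorted, which comm requires.

```shell
# Hypothetical sample data in a scratch directory (not the recipe's A.txt/B.txt).
tmpdir=$(mktemp -d)
printf '%s\n' apple gold orange silver > "$tmpdir/left.txt"
printf '%s\n' carrot gold orange potato > "$tmpdir/right.txt"

# Intersection: suppress columns 1 and 2, leaving only the lines common to both files.
intersection=$(comm -12 "$tmpdir/left.txt" "$tmpdir/right.txt")

# Difference: suppress column 3 (the common lines). Column 2 is indented
# with a tab, so tr removes the tabs to produce a single flat list.
difference=$(comm -3 "$tmpdir/left.txt" "$tmpdir/right.txt" | tr -d '\t')

# Set difference (left minus right): suppress columns 2 and 3.
set_difference=$(comm -23 "$tmpdir/left.txt" "$tmpdir/right.txt")

printf 'Intersection:\n%s\n\n' "$intersection"      # gold, orange
printf 'Difference:\n%s\n\n' "$difference"          # apple, carrot, potato, silver
printf 'Set difference:\n%s\n' "$set_difference"    # apple, silver

rm -rf "$tmpdir"
```

The digits after the dash name the columns to hide, so -12 hides the lines unique to each file and -23 keeps only the lines unique to the first file.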
How to do it...
Note that comm takes two sorted files as input. Here are our sample input files:
$ cat A.txt apple...