Grep command inaccurate or bad command? Windows WSL

Problem :

I am using the grep command in Windows WSL, and its output seems to be inaccurate.

The command is supposed to remove from file2 every line that also appears in file1 and write the remaining lines to file3 (all are text files):

grep -v -f file2.txt file1.txt >> file3.txt

However, the line counts don't add up. For example, file2 may have 100 lines and file1 may have 50, yet the output file3 ends up with 30 lines.

My actual files are larger: file2 has 430,000 lines and file1 has 30,000, but the output has 370,000. Note that every line of file1 also appears somewhere in file2, mixed in at random positions, which is why I need grep to remove them. There are no duplicate lines in either file1 or file2.

Solution :

>> means “append to file” while > means “overwrite file”
As an example, I generated two files of random UUIDs: file2 has 100 unique lines, and file1 has 50 unique lines of its own plus all the lines from file2, shuffled into a random order.

$ wc -l file2.txt file1.txt
 100 file2.txt
 150 file1.txt
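
The answer does not say how the test files were produced, so the following is only a sketch of one way to build comparable ones. It rests on several assumptions: made-up sequential IDs stand in for real UUIDs, the work happens in a throwaway directory from mktemp, extra.txt is a hypothetical helper file, and the shuffle uses an awk random key plus sort rather than any particular shuffle tool.

```shell
#!/bin/sh
# Sketch only: stand-in IDs and a temp directory, not the answer's actual data.
dir=$(mktemp -d)
cd "$dir" || exit 1

# file2.txt: 100 unique lines (the ones to be filtered out later).
seq 1 100 | awk '{ printf "id-%04d-a\n", $1 }' > file2.txt

# extra.txt (hypothetical helper): 50 unique lines that exist only in file1.txt.
seq 1 50 | awk '{ printf "id-%04d-b\n", $1 }' > extra.txt

# file1.txt: everything from file2.txt plus the 50 extras, in random order.
# Shuffle by prefixing a random key, sorting on it, then cutting the key off.
cat file2.txt extra.txt |
  awk 'BEGIN { srand() } { printf "%.9f\t%s\n", rand(), $0 }' |
  sort -n | cut -f2- > file1.txt

# Should report 100 lines for file2.txt and 150 for file1.txt.
wc -l file2.txt file1.txt
```

With files shaped like this, the line counts match the ones shown above, so the grep commands that follow can be tried on them directly.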

Then I run your command:

$ grep -v -f file2.txt file1.txt >> file3.txt
$ wc -l file3.txt
 50 file3.txt

Just as expected. But now watch what happens if I run it again:

$ grep -v -f file2.txt file1.txt >> file3.txt
$ wc -l file3.txt
 100 file3.txt

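Your problem seems to come from a misunderstanding of the Linux shell redirection operators: because >> appends, every rerun adds another copy of the results to file3.txt instead of replacing it. Use > if you want file3.txt to contain only the output of the latest run.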
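
To see the two operators side by side, here is a minimal sketch that runs the same grep command twice with each one. The data is made up (sequential IDs rather than UUIDs), and the names appended.txt and overwritten.txt are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# Sketch only: stand-in data in a throwaway directory.
dir=$(mktemp -d)
cd "$dir" || exit 1

# file2.txt: 100 lines to filter out; file1.txt: those 100 plus 50 others.
seq 1 100 | awk '{ printf "id-%04d-a\n", $1 }' > file2.txt
{ cat file2.txt; seq 1 50 | awk '{ printf "id-%04d-b\n", $1 }'; } > file1.txt

# '>>' appends: each run adds its 50 lines, so two runs leave 100 lines.
grep -v -f file2.txt file1.txt >> appended.txt
grep -v -f file2.txt file1.txt >> appended.txt
appended_count=$(wc -l < appended.txt)

# '>' truncates the file first: two runs still leave only 50 lines.
grep -v -f file2.txt file1.txt > overwritten.txt
grep -v -f file2.txt file1.txt > overwritten.txt
overwritten_count=$(wc -l < overwritten.txt)

echo "appended: $appended_count, overwritten: $overwritten_count"
```

If the output file has to be built with >>, deleting or emptying it before each run should give the same result as using >.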
