grep to the rescue!
Let’s reformat your test sentence like this:
This apple is the most disgusting
fruit I've ever tried to eat.
I would rather have an orange
or a banana.
Save that to test.txt.
Put all of the bad words in a file named badwords.txt (one word per line). Then run this command:
grep -i -f badwords.txt test.txt
which will output this:
This apple is the most disgusting
I would rather have an orange
or a banana.
If you want to get really fancy, you could use cat -n, which prefixes each line with its number so you can easily tell where the offending words are:
cat -n test.txt | grep -i -f badwords.txt
1 This apple is the most disgusting
3 I would rather have an orange
4 or a banana.
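As an alternative sketch, grep's own -n option prints the line number in front of each match, so the cat step is not strictly needed. The contents of badwords.txt below (apple, orange, banana) are an assumption chosen to reproduce the output shown above; substitute your own list.

```shell
# Work in a scratch directory so nothing in the current folder is overwritten.
cd "$(mktemp -d)"

# Sample text from this answer.
cat > test.txt <<'EOF'
This apple is the most disgusting
fruit I've ever tried to eat.
I would rather have an orange
or a banana.
EOF

# Assumed word list: one word per line, no blank lines.
printf '%s\n' apple orange banana > badwords.txt

# -n prefixes each matching line with its line number and a colon.
grep -n -i -f badwords.txt test.txt
```

With these files, the matches are reported on lines 1, 3 and 4, in the form 1:This apple is the most disgusting.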
Be sure to use -i with grep so that it will ignore case.
If you just want to see which files contain the “bad words”…
Then you can use:
grep -l -i -f badwords.txt test.txt
(Note that -l is a lowercase L, not the digit 1.)
That way you could put all of your .srt files in a directory and do:
grep -l -i -f badwords.txt *.srt
and you would get a list of all of the files which matched any of the words in badwords.txt.
If you wanted to put all of the files that do have “bad words” into a separate folder, you could do this:
mkdir -p HasBadWords
grep -l -i -f badwords.txt *.srt | while IFS= read -r line
do
    # Double quotes are required: single quotes would pass the literal text $line to mv.
    mv -vn -- "$line" HasBadWords/
done
Then all of the offending .srt files would be in the folder ‘HasBadWords’ and the ones left would be “clean”.
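If some file names might contain unusual characters (even newlines), a NUL-delimited variant is a possible sketch. It assumes your grep supports --null and your xargs supports -0 (GNU and BSD versions do; neither is required by POSIX). The two .srt files and the one-word badwords.txt are hypothetical examples created only for the demonstration.

```shell
# Scratch directory with two hypothetical subtitle files.
cd "$(mktemp -d)"
printf '%s\n' 'What a rotten apple.' > 'bad file.srt'
printf '%s\n' 'Nothing to see here.' > 'clean.srt'
printf '%s\n' apple > badwords.txt

mkdir -p HasBadWords

# --null ends each file name with a NUL byte instead of a newline;
# xargs -0 splits on NUL, so spaces or newlines in names are harmless.
grep -l --null -i -f badwords.txt -- *.srt |
    xargs -0 -I{} mv -vn -- {} HasBadWords/
```

Afterwards 'bad file.srt' sits in HasBadWords/ and clean.srt stays where it was.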
BEWARE!
Make sure that badwords.txt does not have any BLANK lines in it. A blank line is an empty pattern, and an empty pattern matches every line, so grep would report every file as containing a “bad word”.
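One way to guard against this is to strip blank and whitespace-only lines from the word list before using it. This is a minimal sketch; the output name badwords.clean.txt is a hypothetical choice, and the sample list is made up for the demonstration.

```shell
# Scratch directory with a word list that contains blank lines.
cd "$(mktemp -d)"
printf 'apple\n\norange\n   \nbanana\n' > badwords.txt

# -v inverts the match: keep only lines that are NOT empty or whitespace-only.
grep -v '^[[:space:]]*$' badwords.txt > badwords.clean.txt

cat badwords.clean.txt
```

You would then pass badwords.clean.txt to grep -f instead of the original list.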
Scope
Note that searching for apple will also match crabapple, pineapple, and Applebees. I presume that is the desired result in this situation.
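If substring matches are not wanted, the -w option (available in GNU and BSD grep) restricts matches to whole words. The sketch below uses a hypothetical scope.txt and a one-word list to show the difference.

```shell
# Scratch directory with a small hypothetical sample.
cd "$(mktemp -d)"
printf '%s\n' 'I ate a pineapple at Applebees.' 'I ate an apple.' > scope.txt
printf '%s\n' apple > badwords.txt

# Without -w, both lines match (-c prints the count of matching lines).
grep -i -c -f badwords.txt scope.txt

# With -w, only the line containing "apple" as a whole word matches.
grep -i -w -f badwords.txt scope.txt
```

The first command counts 2 matching lines; the second prints only "I ate an apple."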