Problem :
Is there a better way to do this command to find strings in a file excluding special characters?
Currently I’m doing:
strings file.abc | grep -v '=' | grep -v ']' | grep -v ')' | more
I’d like to add more special characters so I’m only getting a-z and A-Z in the results.
Solution :
If you just want to exclude these special characters, you can use a regular expression (e.g., PCRE) like this:
strings file.abc | grep -Pv "[=\])]"
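The same exclusion can also be written without PCRE. The following is a minimal sketch, assuming a few made-up sample lines in place of real strings output; it relies on the POSIX rule that ] is taken literally when it is listed first in a bracket expression.

```shell
# Hypothetical lines standing in for the output of `strings file.abc`.
lines=$(printf '%s\n' 'key=value' 'list]' 'call)' 'plain text')

# Drop every line containing =, ] or ). The ] comes first in the
# bracket expression so that it is treated as a literal character.
filtered=$(printf '%s\n' "$lines" | grep -v '[]=)]')
printf '%s\n' "$filtered"
```

More characters to exclude can be appended inside the brackets, as long as ] stays first.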
If you want to display only strings containing some specific characters, you could use grep instead of strings.
The command
grep -Poa "[A-Za-z]{4,}" file.abc
shows all words of at least four letters.
Here:
- The -P switch enables Perl-compatible regular expressions (PCRE).
- The -o switch makes grep show only the match (rather than the entire line).
- The -a switch forces treating a binary file as a text file.
- The PCRE [A-Za-z]{4,} matches four or more consecutive letters. Four is the default minimum length that strings uses. Adjust as needed.
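To see the effect of these switches, here is a small sketch run against a throwaway file. The file contents are made up for illustration, and -E is used in place of -P on the assumption that PCRE support may not be compiled into every grep; the {4,} interval behaves the same way in both syntaxes.

```shell
# Hypothetical sample file: letter runs separated by non-printable bytes.
tmp=$(mktemp)
printf 'hello\001\002ab\003World=42\000xyz\n' > "$tmp"

# LC_ALL=C keeps [A-Za-z] restricted to ASCII letters regardless of locale.
# -a treats the file as text, -o prints each match on its own line.
words=$(LC_ALL=C grep -aoE '[A-Za-z]{4,}' "$tmp")
printf '%s\n' "$words"
rm -f "$tmp"
```

This should print hello and World on separate lines; ab and xyz are skipped because they are shorter than four letters.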
How about this?
strings file.abc | grep '^[A-Za-z]*$'
That will give you only lines consisting entirely of letters.
Actually, you probably want lines containing only one or more sequences of letters;
i.e., lines containing letters and spaces. If that’s what you want, do
strings file.abc | grep '^[A-Za-z ]*$'
with a space after the z.
If you decide that you want to include any other characters, put them inside the brackets.
(Warning: some characters are tricky to include, such as the quote character itself, ', and the right bracket, ].)
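One possible way to handle both tricky characters is sketched below, again using made-up sample lines rather than real strings output. It assumes the usual POSIX bracket-expression rule that ] is literal when it appears first, and sidesteps the quoting problem by wrapping the pattern in double quotes instead of single quotes.

```shell
# Hypothetical lines standing in for the output of `strings file.abc`.
sample=$(printf '%s\n' "don't stop" 'array]' 'a=b' 'plain words')

# ] is placed first in the bracket expression so it is taken literally;
# the pattern is double-quoted so that ' needs no escaping.
kept=$(printf '%s\n' "$sample" | LC_ALL=C grep "^[]A-Za-z ']*$")
printf '%s\n' "$kept"
```

Only the line a=b is dropped here, since = is not in the allowed set.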