Imagine you have a file. Perhaps it is a log file. It is long, and has much information in it. And buried somewhere in that file, there is one piece of information that you need to find. It would take far too long to read through the huge file. This situation cries out for some faster method of extracting that piece of information.
For this very task, Linux provides us grep.
grep takes two pieces of input: the string you are looking for, and the file you are searching. Given that input, it prints every line in the file that contains that string.
grep "input_string" filename.txt
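As a minimal sketch of that form, the following creates a small throwaway file and searches it. The file contents and the search string "needle" are made up purely for illustration.

```shell
# Create a throwaway sample file (hypothetical contents, for illustration only).
sample=$(mktemp)
printf '%s\n' 'first line' 'the needle is here' 'last line' > "$sample"

# Print every line of the file that contains the string "needle".
result=$(grep "needle" "$sample")
echo "$result"

# Clean up the temporary file.
rm -f "$sample"
```

Only the middle line contains the string, so it is the only line grep prints.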
Here is a practical situation you may encounter with your server. You notice a PHP error on your site, and you want to find the specific error in the Apache log for the time that you saw it. A good way to find it in the Apache error log is to grep for the IP address of the computer you are working from. Note that this is not your server’s IP. If you do not know this IP, you can find it by visiting whatismyip.com. To grep the log for your IP on a cPanel server, run:
grep "127.0.0.1" /usr/local/apache/logs/error_log
Assuming your IP is “127.0.0.1”, that command will show you every line in the log file in which your IP address appears. grep prints matches in the order they occur in the file, so the most recent one, at the bottom of the output, should be the error you’re looking for.
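If the log is long, the output can be narrowed to just the newest match. The sketch below assumes a made-up three-line sample log rather than a real Apache log (the 10.0.0.5 entry is invented for contrast). It also adds the -F option, which is not part of the original command: it tells grep to treat the pattern as a fixed string, so the dots in the IP address are not read as "any character" wildcards.

```shell
# Build a small sample log (hypothetical contents; a real log lives elsewhere).
log=$(mktemp)
cat > "$log" <<'EOF'
[Sat Jun 28 09:06:26 2008] [error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs
[Sat Jun 28 09:07:00 2008] [error] [client 10.0.0.5] File does not exist: /usr/local/apache/htdocs
[Fri Jan 02 19:06:51 2009] [error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs
EOF

# grep prints matches in file order; tail -n 1 keeps only the last (newest) one.
latest=$(grep -F "127.0.0.1" "$log" | tail -n 1)
echo "$latest"

rm -f "$log"
```

On a real server the same pipeline would be pointed at the actual error log path instead of the temporary file.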
It is not strictly necessary for the string you are looking for to be in quotes, but it is usually a good idea to use them. Sometimes the string you are searching for contains spaces or special characters (such as $, * or |) that the shell would otherwise interpret itself before grep ever sees them. While you can escape those characters by putting a “\” in front of each one, it is usually easier to just put the string in quotes.
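To illustrate why quoting helps, here is a small sketch with an invented sample file. Without the quotes, the first search would hand grep the word "File" as the pattern and treat "does", "not" and "exist" as file names. The -F option in the second search is an addition here, used so that grep also takes the characters literally rather than as a regular expression.

```shell
# Sample file with hypothetical contents, including shell-special characters.
sample=$(mktemp)
printf '%s\n' 'status: ok' 'File does not exist: /tmp/a' 'price is $5 | final' > "$sample"

# Double quotes: the whole phrase, spaces included, is a single search string.
phrase=$(grep "File does not exist" "$sample")

# Single quotes: the shell does not expand $5 or treat | as a pipe.
special=$(grep -F '$5 | final' "$sample")

echo "$phrase"
echo "$special"

rm -f "$sample"
```

Single quotes are the safer choice when the string contains a dollar sign, since the shell still expands variables inside double quotes.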
Nor is it necessary to give just one filename as an argument. If you want to grep for an IP in all of the logs in a directory, cd into that directory and substitute “*” for the filename:
root@host # pwd
root@host # grep "127.0.0.1" *
access_log:127.0.0.1 - - [28/Jul/2010:11:01:35 -0400] "GET / HTTP/1.0" 200 111
access_log:127.0.0.1 - - [28/Jul/2010:11:05:01 -0400] "GET /whm-server-status HTTP/1.0" 200 47928
access_log:127.0.0.1 - - [28/Jul/2010:11:06:36 -0400] "GET / HTTP/1.0" 200 111
access_log:127.0.0.1 - - [28/Jul/2010:11:10:02 -0400] "GET /whm-server-status HTTP/1.0" 200 47926
error_log:[Sat Jun 28 09:06:26 2008] [error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs
error_log:[Sat Jun 28 09:06:26 2008] [error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs
error_log:[Fri Jan 02 19:06:51 2009] [error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs
error_log:[Fri Jan 02 19:06:51 2009] [error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs
(Plenty of output was omitted for brevity's sake.)
Note that the beginning of each line shows the name of the file the line is from, followed by a colon, and then the content of the line.
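The filename prefix can be reproduced on a small scale. This sketch assumes a temporary directory holding two invented log files named access_log and error_log. It also shows the -h option, which is not used in the article's command and which suppresses the filename prefix when you only want the matching lines themselves.

```shell
# Temporary directory with two small sample logs (hypothetical contents).
dir=$(mktemp -d)
printf '%s\n' '127.0.0.1 - - "GET / HTTP/1.0" 200 111' \
              '10.0.0.5 - - "GET / HTTP/1.0" 200 111' > "$dir/access_log"
printf '%s\n' '[error] [client 127.0.0.1] File does not exist: /usr/local/apache/htdocs' > "$dir/error_log"

# With more than one file, each matching line is prefixed with "filename:".
matches=$(cd "$dir" && grep -F "127.0.0.1" *)
echo "$matches"

# -h drops the filename prefix and prints only the matching lines.
plain=$(cd "$dir" && grep -hF "127.0.0.1" *)
echo "$plain"

rm -rf "$dir"
```

The shell expands “*” to the file names in alphabetical order, so matches from access_log are listed before those from error_log.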
Combining grep With Other Commands
While grep is extremely useful on its own, it really shines when combined with other commands. Another example:
Suppose you have seen evidence that a script has somehow made its way onto your server, and is sending spam. Based on the mail logs, you have narrowed it down to a particular user’s directory; but there are many files in that directory, too many to search through manually. Not only are there files in there, but there are plenty of directories as well that you do not want to search through. Using find to list only the regular files in that one directory (find descends into subdirectories unless told otherwise with -maxdepth 1), and then using a pipe and xargs to hand that list of files to grep, is the way to go.
root@host [/home/fnbrilli/public_html]# ll
drwxr-x--- 4 fnbrilli nobody 4096 Jul 28 11:46 ./
drwx--x--x 11 fnbrilli fnbrilli 4096 Jul 28 11:41 ../
-rw-r--r-- 1 fnbrilli fnbrilli 201 Apr 8 2009 .htaccess
-rw-r--r-- 1 fnbrilli fnbrilli 287 Jul 28 11:24 badfile.php
drwxr-xr-x 2 fnbrilli fnbrilli 4096 Aug 24 2008 cgi-bin/
-rw-r--r-- 1 fnbrilli fnbrilli 17 Aug 24 2008 index.html
-rw-r--r-- 1 fnbrilli fnbrilli 1337 Jul 28 11:19 safefile.php
drwxr-xr-x 2 fnbrilli fnbrilli 4096 Jul 28 11:46 unwanted-directory/
-rw-r--r-- 1 fnbrilli fnbrilli 37 Jul 28 11:41 wp-admin.php
root@host [/home/fnbrilli/public_html]# find . -maxdepth 1 -type f | xargs grep -li 'mail('
I searched for ‘mail(’ because that string is the start of a call to the mail function in PHP, and is found in nearly all PHP mailers. The -l option makes grep print only the names of the files that match, and -i makes the match case-insensitive. Note that it did catch a WordPress file that legitimately uses the mail function, but it did weed out other files that do not have the function. It certainly reduced the size of the haystack you need to look through.
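The effect of the pipeline can be tried safely in a scratch directory. The sketch below is built on assumptions: the directory layout and file contents are invented (nested.php in particular is hypothetical), and it adds a few options that the article's command does not use. -print0 and xargs -0 cope with file names that contain spaces, and -F makes grep treat ‘mail(’ as a literal string.

```shell
# Scratch directory with hypothetical PHP files, one of them in a subdirectory.
dir=$(mktemp -d)
mkdir "$dir/unwanted-directory"
printf '%s\n' '<?php mail($to, $subject, $body); ?>' > "$dir/badfile.php"
printf '%s\n' '<?php echo "hello"; ?>' > "$dir/safefile.php"
printf '%s\n' '<?php Mail($to, $subject, $body); ?>' > "$dir/unwanted-directory/nested.php"

# -maxdepth 1 keeps find in the top directory only.
# grep -l prints just the names of matching files; -i ignores case.
found=$(cd "$dir" && find . -maxdepth 1 -type f -print0 | xargs -0 grep -liF 'mail(')
echo "$found"

# Without -maxdepth, find also descends into subdirectories.
all=$(cd "$dir" && find . -type f -print0 | xargs -0 grep -liF 'mail(' | sort)
echo "$all"

rm -rf "$dir"
```

The first search reports only badfile.php, while the second also reports the file inside unwanted-directory, which is the difference -maxdepth 1 makes.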
With a little ingenuity, grep can be combined with many other commands to extract just the information you are looking for. Keeping this small tool in your command-line tool belt can mean the difference between a few minutes of searching and hours of tedious review. Use it wisely, and you will have freed up much-needed time for more pressing tasks.