Sounds like a “lose-lose” situation, doesn’t it? The more you improve your visibility and widen your marketing reach, the more bots take notice of you, so the chance of a bot attack goes up along with your popularity. Luckily, you can control bot visits relatively easily. The idea behind blocking bots is to find out where they come from and cut them off at the source.
Locate your Log Files: Every web server keeps a series of log files that record each visit to the website along with the visitor’s IP address. These logs are usually stored on the server itself. If your hosting company gives you cPanel as your hosting front-end, you can access your logs by clicking the link in the main cPanel window (the first window you visit). If your site runs on Apache, your logs will normally be under the /var/log folder, typically in a subfolder such as /var/log/apache2 or /var/log/httpd, depending on the distribution.
IIS users can configure their logging through the local computer’s control panel. In the control panel, select Administrative Tools, then Internet Services Manager, then select the website, right-click it and choose Properties. Open the Web Site tab, click the Properties button for logging, and finally open the General Properties tab. In typical fashion, the Microsoft logs are the hardest to get your hands on.
Figuring Out the Most Visits by IP and User Agents: Once you have downloaded your log files, it’s a simple matter to consolidate them into a single text file and import that file into Excel (or whatever tool you prefer for viewing log files). Excel makes it easy to sort, filter and summarize the data so that you can make sense of it. When you import the data, select the space delimiter to get the right data into the right columns. Fields that contain spaces themselves, such as the request and the User Agent, may end up split across several columns, but with a little cleaning up you’ll have usable data almost immediately.
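If you would rather script the clean-up than do it by hand, the sketch below shows one way to split a log line into fields without breaking the quoted ones. It assumes the logs use Apache’s “combined” format, which may not match your server’s configuration; the names LOG_PATTERN and parse_line and the sample line are made up for illustration.

```python
import re

# Assumed layout (Apache "combined" log format):
# IP identd user [time] "request" status bytes "referer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Made-up sample line using an IP address reserved for documentation.
sample = (
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] '
    '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (X11; Linux x86_64)"'
)

record = parse_line(sample)
print(record["ip"], record["status"], record["agent"])
```

Lines that do not fit the assumed format come back as None, so they can be set aside and inspected rather than silently landing in the wrong column.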
Using Excel’s Pivot Table Builder, you can create a pivot table that links the number of visits to each Client IP and get a feel for how many visits come from malicious IPs. Client IPs indicate where your visitors are coming from and can give you a rough idea of where most of them are based. Renaming the table headers to Client IP, Hits and User Agent gives you a layout that shows which IPs have visited your site the most.
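The same Client IP, Hits and User Agent summary can be produced outside Excel. Here is a minimal sketch using Python’s collections.Counter; the rows are made-up values standing in for whatever you extracted from your own logs.

```python
from collections import Counter

# Made-up (client IP, user agent) pairs standing in for parsed log rows.
rows = [
    ("203.0.113.7", "Mozilla/5.0"),
    ("198.51.100.23", ""),
    ("198.51.100.23", ""),
    ("198.51.100.23", ""),
    ("203.0.113.7", "Mozilla/5.0"),
    ("192.0.2.44", "Mozilla/5.0"),
]

# Count how many times each (IP, user agent) combination appears.
hits = Counter(rows)

# List the combinations from most to least frequent.
print(f"{'Client IP':<16}{'Hits':>5}  User Agent")
for (ip, agent), count in hits.most_common():
    print(f"{ip:<16}{count:>5}  {agent or '(blank)'}")
```

Sorting by hit count puts the heaviest visitors at the top, which is where an aggressive bot tends to show up.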
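The User Agent identifies the browser version and the operating system version that your visitor was running. Many crude bots send no User Agent at all, so looking for blank entries is a quick first way to pinpoint the presence of bots. Keep in mind that well-behaved crawlers identify themselves by name and that more sophisticated bots copy a real browser’s User Agent, so a blank field is a strong hint rather than a complete test.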
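A minimal sketch of that check follows. Besides blank entries, it also flags a few strings that bots and scripts commonly use to identify themselves; that hint list (BOT_HINTS) and the helper name looks_like_bot are assumptions for illustration, not a complete or authoritative test.

```python
# Substrings that often appear in self-identifying bot or script user agents.
# This list is an illustrative assumption; extend it from your own logs.
BOT_HINTS = ("bot", "crawler", "spider", "curl", "python-requests")


def looks_like_bot(agent):
    """Return True for blank user agents or ones containing a known hint."""
    agent = agent.strip().lower()
    if agent in ("", "-"):  # many servers log a missing agent as "-"
        return True
    return any(hint in agent for hint in BOT_HINTS)


# Made-up examples.
agents = [
    "",
    "-",
    "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]
for agent in agents:
    print(repr(agent), looks_like_bot(agent))
```

A bot that copies a real browser’s User Agent will pass this check, so it is best combined with the hit counts per IP.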
Blocking the IP: Once you’ve figured out which IPs belong to bots, you can move on to excluding the bots’ traffic from your analytics reports.
Additionally, if you’re concerned about security, you can block the bots from accessing the site altogether at the server level. Within your reports, Google Analytics gives you the option to exclude individual IPs. It also comes with a built-in bot checker that you can enable in the Admin panel under View Settings by selecting “Exclude all hits from known bots and spiders.”
This is a handy tool for filtering your analytics to get a more realistic view of your reach. Omniture gives you a bit more control over how your analytics are viewed and tabulated: you can exclude individual IPs, exclude a set of IPs (if you have a large number of bot entries) or create a processing rule that ignores certain IPs and IP ranges.
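To preview what an exclusion list of single IPs and IP ranges would catch before you enter it into an analytics tool, you can test addresses against it yourself. The sketch below uses Python’s standard ipaddress module; the networks listed and the helper name is_excluded are made up for illustration.

```python
import ipaddress

# Made-up exclusion list: one whole range and one single address.
EXCLUDED = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.7/32"),
]


def is_excluded(ip):
    """Return True if the address falls inside any excluded network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in EXCLUDED)


for ip in ("198.51.100.23", "203.0.113.7", "203.0.113.8", "192.0.2.44"):
    print(ip, is_excluded(ip))
```

Writing a single address as a /32 network lets individual IPs and ranges share one list.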
On the server side of the spectrum, you can limit the availability of your site to certain visitors based on their IP. cPanel includes a handy IP Deny Manager which lets you enter the IPs you want to deny access to. In Apache, IP-based access control is handled by the mod_authz_host module, and you can place its directives either in the main server configuration or in a .htaccess file; the main configuration is the preferred place when you have access to it. In IIS, open IIS Manager, switch to Features View, open IPv4 Address and Domain Restrictions, then use the Actions pane to add a deny entry containing the IP address of the bot.
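For Apache, a long list of bot IPs is tedious to type by hand, so a small script can generate the directives for you. This is a minimal sketch that assumes Apache 2.4, where mod_authz_host supports “Require not ip” inside a RequireAll block; the function name apache_deny_block and the addresses are made up, and the output should be reviewed before it goes into a live configuration.

```python
def apache_deny_block(ips):
    """Build an Apache 2.4 block that allows everyone except the given IPs."""
    lines = ["<RequireAll>", "    Require all granted"]
    # One "Require not ip" line per address or CIDR range.
    lines += [f"    Require not ip {ip}" for ip in ips]
    lines.append("</RequireAll>")
    return "\n".join(lines)


# Made-up bot addresses: one single IP and one range.
config = apache_deny_block(["198.51.100.23", "203.0.113.0/24"])
print(config)
```

The generated block belongs inside a Directory or Location section of the server configuration, or in a .htaccess file if overrides are enabled for authorization directives.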