Stopping HTTP Referer Spam with ColdFusion
By Pete Freitag
I get a lot of hits from HTTP referer spammers in my logs these days. If you're not familiar with this type of spam, it's pretty simple. Someone has a URL that they want you to visit, so they write a spider to visit your site, but they put in their URL as the HTTP referer. Then, when I check my web site logs, I see 50 hits from their site and, curious as to why they linked to me, I visit their site. Many blogs and web sites also show the recent HTTP referers for an article; if a spammer shows up there, they get a page rank boost as well.
HTTP referer spam is really hard to stop or prevent. Sure, many of the spammers have keywords in their URLs, and those are easy to block, which is what this entry will show you how to do, but long term this is a big problem.
My simple solution in CFML
Before we get into the code, I should point out that it's probably better to block these guys on your web server or firewall: the CFML solution only helps if they request a CFML page, and blocking is probably a bit more efficient at those layers.
At any rate, here goes my solution. It simply looks for keywords in the referer and returns a 403 Forbidden HTTP status code. This works for the stats package that I use (awstats) because it only logs referers for status code 200. Here's the code; I just stick it in my Application.cfm:
<cfif Len(CGI.HTTP_REFERER)>
  <cfset spam.badwords = "highprofitclub,holdem,poker">
  <cfloop list="#spam.badwords#" index="spam.word">
    <cfif FindNoCase(spam.word, CGI.HTTP_REFERER)>
      <cfheader statuscode="403" statustext="Forbidden http referer">
      <html><head><title>403</title></head><body>
      <h1>403 Forbidden Referer</h1>
      <a href="/">Please Continue to the home page</a>
      </body></html>
      <cfabort>
    </cfif>
  </cfloop>
</cfif>
Note that if you're running on a version prior to CFMX 6, you might want to add a check to see if CGI.HTTP_REFERER is defined as well.
Stopping HTTP Referer Spam with ColdFusion was first published on March 11, 2005.
Comments
There is no good reason why I looped vs using ListFindNoCase. I may change it to use that instead; I'd suspect ListFindNoCase to be a tad more efficient.
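Before making that swap, it may be worth checking how the two functions match. As far as I know, FindNoCase searches for a substring, while ListFindNoCase looks for a list item that equals the whole value, so passing the full referer to it would probably not find a keyword buried inside a URL. A small sketch, using a made-up referer (the URL and the variable names referer and badwords are assumptions for illustration only):

```cfml
<!--- Hypothetical referer, used only for illustration. --->
<cfset referer = "http://texas-holdem.example/page.htm">
<cfset badwords = "highprofitclub,holdem,poker">

<cfoutput>
    <!--- Substring search: non-zero, because "holdem" occurs inside the referer. --->
    #FindNoCase("holdem", referer)#<br>
    <!--- Whole-item search: 0, because no list item equals the full referer string. --->
    #ListFindNoCase(badwords, referer)#
</cfoutput>
```

If that holds on your version, the loop with FindNoCase is still the simpler way to do keyword matching against a URL.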
One of the reasons throwing a 403 header is better is that you can display a message to a human in case one of your keywords shows up in a valid referer. And also to not piss off Disney.
Also, as for keywords, I am adding - in my badwords, so things like -credit and -loan, and it looks like texas- might be a good way to block the texas-holdem variations.
The only recommendation I would make is to compare against just the domain name of the referring URL and not the entire referring URL, or else legitimate links and queries like http://www.somedomain.com/blocking_highprofitclub.htm or http://www.google.com/?q=blocking+highprofitclub will be blocked too.
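A minimal sketch of that suggestion, assuming the referer has the usual scheme://host/path shape. It reuses the keyword list from the entry and pulls out the host with list functions; spam.host is a name introduced here for illustration, not part of the original code:

```cfml
<cfif Len(CGI.HTTP_REFERER)>
    <cfset spam.badwords = "highprofitclub,holdem,poker">

    <!--- Treat "/" and "?" as delimiters: item 1 is the scheme ("http:"),
          item 2 is the host, possibly followed by ":port". --->
    <cfif ListLen(CGI.HTTP_REFERER, "/?") GTE 2>
        <cfset spam.host = ListFirst(ListGetAt(CGI.HTTP_REFERER, 2, "/?"), ":")>
    <cfelse>
        <!--- Unexpected shape: fall back to checking the whole value. --->
        <cfset spam.host = CGI.HTTP_REFERER>
    </cfif>

    <cfloop list="#spam.badwords#" index="spam.word">
        <!--- Only the host is searched, so keywords in paths or query strings are ignored. --->
        <cfif FindNoCase(spam.word, spam.host)>
            <cfheader statuscode="403" statustext="Forbidden http referer">
            <html><head><title>403</title></head><body>
            <h1>403 Forbidden Referer</h1>
            <a href="/">Please Continue to the home page</a>
            </body></html>
            <cfabort>
        </cfif>
    </cfloop>
</cfif>
```

With this version, a referer such as http://www.google.com/?q=blocking+highprofitclub should pass, because the host www.google.com contains none of the keywords. The trade-off is that spammers who put their keywords only in the path would slip through.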
This combination does not work for me; only the first word, here "highprofitclub", is filtered. Other words like holdem and poker are not filtered.
I find the evolution of http referer spam quite interesting, and I always have to keep pruning and modifying the keywords search as they evolve from things like texas-holdem to something like texes-holdm.
On a related note, referer in HTTP_REFERER is apparently a misspelling that stuck, according to http://dictionary.reference.com/search?q=referer