Posts Tagged ‘SEO’

Search Engine Ode (Limerick)

Saturday, April 21st, 2018

When your site’s on the first page of Bing
And Google, you whistle and sing,
Cuz your marketing plan
Just might work. You’re “The Man,”
As your Net takes a large upward swing.

Victory In My Battle Against Feed Scraping Content Thief 4Comedy.com (Updated)

Monday, June 11th, 2007

Last week when I was bitching about 4Comedy.com’s stealing my humor blog’s content, I had no idea I was dealing with a “feed scraper” site. All I knew was that I was being ripped off — every time I posted in this humor blog, my entire post appeared on 4Comedy.com within minutes.  Needless to say, I was a very pissed off blogger.

So what did I do?  First I posted comments at 4Comedy.com demanding that its thievery stop.  At least I tried to.  But not surprisingly, my comments never appeared there.

Next I reported its copyright violations to Google AdSense, via a link at 4comedy.com’s site.  The following day I received an email telling me how to formally report a Google AdSense DMCA (Digital Millennium Copyright Act) Infringement Complaint. 

Unfortunately, the procedure involved tons of time-consuming work, requiring me to assemble all sorts of documentation of 4Comedy.com’s many infringements.  Then (I’m told) this documentation is forwarded to the alleged infringer.  And after that, heaven-only-knows-what happens.

I responded to Google’s email with a request for some accelerated action.  My justification was that at http://4comedy.com/?p=463, 4comedy.com had reproduced my own post, the very one in which I called it a content thief.

I didn’t receive any response to my email, but apparently it got their attention.  How do I know?  Because today the Google AdSense text ads disappeared from the top of 4Comedy.com’s site.  Hallelujah!

But I’m getting ahead of myself. While I was waiting to hear from Google, I did some research on how to deal with RSS feed scraping content thieves.  And I found some great resources, including:

1. AntiLeech, a plugin that “helps prevent content theft by sploggers” and a detailed article explaining the benefits of AntiLeech Splog Stopper: Fighting Back Against Content Thieves;

2. An interesting article with the enticing title How to stop rss scrapers from stealing your content plus revenge; 

3. A tutorial, Blocking bad bots and site rippers (aka offline browsers);

4. An article entitled How you can stop dirty feedscrapers in 3 easy steps; and

5. This article about Attacking scrapers and content thieves legally.

I found the material posted at all of those links very informative, and I’m planning to give that AntiLeech plugin a try.  But I was feeling a bit lazy and I was looking for some instant gratification.  And, happily, I found it: A commenter named Robert posted the following suggestion here:

Alternatively, to curl up even less unproductive work, add this line to .htaccess:
Deny from 74.52.58.162
Which would even allow you to block a whole range of IP addresses in case it proves necessary…

Armed with this simple-sounding solution, I decided to try it.  My first step was to identify 4Comedy.com’s IP address.  So I checked its trackback data, which identified the address as 74.53.110.146.  I then confirmed it by pinging 4comedy.com from my computer’s Run dialog:  ping 4comedy.com.  Next I checked my server logs and verified that the same address was routinely showing up there.
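For anyone who would rather script those checks than do them by hand, here is a minimal Python sketch. It is not what I actually ran. It assumes each access log line begins with the visitor's IP address (the common Apache log layout), and the sample log lines, including the 192.0.2.10 visitor, are invented purely for illustration. The resolve_ip helper is a hypothetical name; it needs a network connection, so it is defined here but not called.

```python
import socket
from collections import Counter


def resolve_ip(hostname):
    """Look up a hostname's IPv4 address (roughly what ping reports first)."""
    return socket.gethostbyname(hostname)


def count_hits_by_ip(log_lines):
    """Count requests per client IP, assuming the IP is the first field."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


# Hypothetical log excerpt, invented for illustration only.
sample_log = [
    '74.53.110.146 - - [11/Jun/2007:10:00:01 -0400] "GET /feed/ HTTP/1.1" 200 5120',
    '192.0.2.10 - - [11/Jun/2007:10:00:05 -0400] "GET /index.html HTTP/1.1" 200 2048',
    '74.53.110.146 - - [11/Jun/2007:10:05:01 -0400] "GET /feed/ HTTP/1.1" 200 5120',
]

hits = count_hits_by_ip(sample_log)
for ip, total in hits.most_common():
    print(ip, total)
```

An address that keeps requesting the feed far more often than any human reader would is a good candidate for the suspect list, though it is worth confirming it against the trackback or ping result before blocking anything.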

Now that I had the infringer’s IP, I added this code to my .htaccess file:

Deny from 74.53.110.146

Finally, I FTPed the revised .htaccess file and, like magic, 4Comedy.com’s content thievery came to a halt.  With that address denied, my server refuses its requests, so the scraper can no longer fetch my feed.  MadKane.com has been freed from the slings and arrows of 4Comedy.com’s feed scraping infringements.
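If the list of offenders grows, typing each line by hand invites typos. The following is a rough Python sketch, offered as an assumption-laden convenience rather than part of my actual fix, that validates addresses and prints the matching "Deny from" lines. The deny_lines function name and the 74.53.110.0/24 range are hypothetical examples; only the single address comes from my logs. It uses the same older Allow/Deny syntax shown above, and newer Apache versions prefer Require directives, so check which one your host supports.

```python
import ipaddress


def deny_lines(addresses):
    """Turn IP addresses or CIDR ranges into .htaccess 'Deny from' lines.

    Raises ValueError if an entry is not a valid address or range.
    """
    lines = []
    for addr in addresses:
        # strict=False lets an entry like 74.53.110.146/24 be normalized
        # to its enclosing network instead of raising an error.
        network = ipaddress.ip_network(addr, strict=False)
        if network.num_addresses == 1:
            # Single host: write the bare address, as in the post.
            lines.append(f"Deny from {network.network_address}")
        else:
            # Range: write it in CIDR notation.
            lines.append(f"Deny from {network}")
    return lines


# The first entry is the scraper's address; the second is a
# hypothetical range included only to show the CIDR form.
for line in deny_lines(["74.53.110.146", "74.53.110.0/24"]):
    print(line)
```

Blocking a whole range is a blunt instrument, since it can also shut out innocent visitors who share that hosting provider, so a single address is the safer starting point.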

Of course, that low-life feed scraper is still taking material from other sites like Comedy Central. Hey Comedy Central!  Try this. You’ll like it.

UPDATE WITH ADDITIONAL RESOURCES: As I learn about additional good resources on this topic, I’ll be adding them here.  Feel free to make suggestions via a comment to this post.

a: What To Do When Someone Steals Your Content is a must read.

b: Dnsstuff.com is a source of many fine tools, including a “DNS Lookup” tool — helpful in ascertaining IP addresses.

c: Plagiarism Today is an excellent source of information about plagiarism, content theft, and copyright issues online.

UPDATE 2: 4Comedy.com seems to have disappeared.  I can only hope it stays that way. 

UPDATE 3: 4Comedy.com now points to a different domain (domainnamesbusiness.info/4comedy/index.php). But the IP address is the same, so my blog should still be protected from its feed scraping.