(Updated 6 November 2014) One of the wonderful things about the web is that most of the world’s information is accessible online. Better still, a large portion of the world has access to all of that information.
Search engines play a huge role in making it easy to sift through that information and find the stuff you are looking for. Problems arise, however, when less scrupulous people decide that the best way to publish content is to steal it.
Unfortunately, the web makes theft of your content quite easy, and enforcement of your rights somewhat difficult.
Assessing the Consequences
One of the first things you need to do is assess the level of damage to you. Getting your stolen content removed from someone else’s website requires a fair amount of work, and you should only pursue it if you are likely to be impacted at some level.
In general, the search engines are pretty good at recognizing the original author of a piece of content. However, search engines can make mistakes. For example, if you just launched a new blog that has little visibility on the web, and your article is stolen by the New York Times, it is likely the Times version will outrank your original (note that the New York Times does not steal content; this is just an example!). That said, if a prominent site steals your content, it is usually quite easy to address, as we will explain below.
Detecting Stolen Content
The first thing I would do if you’re worried about possible content theft is take several different unique strings from your content and search on each of them within double quotes. For example, if your content included the phrase: “The slow gray fox tripped over the startled dog”, you can search on “slow gray fox tripped over” (including the quotes) in Google and Bing. If your article comes up first, that is a good sign that the search engines know that you are the authoritative source for that content.
Try this with several phrases to make sure that you are OK. One key tip: avoid picking phrases that include punctuation, such as commas, hyphens, and quote characters. Phrases with punctuation seem to work less well for these types of searches. Once your testing is done, even if you show up first for everything, you still need to consider whether you are suffering any damages.
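If you want to automate the phrase-picking step, here is a minimal sketch of how I might do it. It is only an illustration: the function names (candidatePhrases, quotedQuery, searchUrls) are my own, and the five-word phrase length is an arbitrary assumption. It splits the text at any punctuation so that no phrase contains a comma, hyphen, or quote character, then wraps each phrase in double quotes for searching.

```typescript
// Sketch only: function names and the 5-word default are illustrative assumptions.

// Pick short word sequences that contain no punctuation at all.
function candidatePhrases(text: string, wordsPerPhrase = 5, maxPhrases = 3): string[] {
  // Anything that is not an ASCII letter, digit, or whitespace ends a run,
  // so a phrase never spans a comma, hyphen, apostrophe, or quote character.
  // (Limitation: accented letters also end a run in this simple version.)
  const runs = text.split(/[^A-Za-z0-9\s]+/);
  const phrases: string[] = [];
  for (const run of runs) {
    const words = run.trim().split(/\s+/).filter((w) => w.length > 0);
    // Take consecutive, non-overlapping windows of wordsPerPhrase words.
    for (let i = 0; i + wordsPerPhrase <= words.length; i += wordsPerPhrase) {
      phrases.push(words.slice(i, i + wordsPerPhrase).join(" "));
      if (phrases.length >= maxPhrases) return phrases;
    }
  }
  return phrases;
}

// Wrap a phrase in double quotes so the search engine matches it exactly.
function quotedQuery(phrase: string): string {
  return `"${phrase}"`;
}

// Build search URLs for Google and Bing to open by hand in a browser.
function searchUrls(phrase: string): string[] {
  const q = encodeURIComponent(quotedQuery(phrase));
  return [
    `https://www.google.com/search?q=${q}`,
    `https://www.bing.com/search?q=${q}`,
  ];
}

const sample = "The slow gray fox tripped over the startled dog, which barked.";
for (const phrase of candidatePhrases(sample)) {
  console.log(quotedQuery(phrase), searchUrls(phrase));
}
```

You still have to open each URL yourself and check whether your own page comes up first; the sketch only saves you the work of choosing and quoting the phrases.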
As a techie-type, I tend to approach things this way, but you can also use third-party tools, such as CopyScape, which also do an excellent job of detecting whether your content has been copied.
Another component to consider is whether or not the stolen article contains links back to your site. If it does, the search engines are pretty good at unraveling this type of theft, and knowing that you are the original author. Chances are that you passed the quoted strings test above if this is the case.
Making Content Harder to Steal
There are some things that you can do to make your content harder to steal effectively, or to lower the consequences of the damage if it is stolen.
- Use relative links for images, that is, something that looks like “/images/yourimage.gif” instead of “http://www.yourdomain.com/images/yourimage.gif”. This forces the thief either to copy all of the images in your content over to their web servers or to rewrite the links as absolute links.
- For the same reason, use relative links when referring to your CSS files or any JavaScript you have on the site. Note that if you use third-party tools such as Google Analytics, you will need to use absolute links to refer to those elements. Just make sure you use relative links for any JavaScript you have developed for the site that is hosted on your web server.
- Use absolute links when linking to other pages on your site, that is, “http://www.yourdomain.com/page1.html” instead of “/page1.html”. This ensures that you get links back to your site unless the thief goes through all the content and changes those links to relative links.
- For fun, you can also create a custom piece of JavaScript that recognizes what domain it is running on and, if it finds it is not on your web servers, displays a big bold image that says “STOLEN CONTENT” on the stolen pages.
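For the last item in the list, a rough sketch of such a script might look like the following. Treat it as an illustration under stated assumptions: the hostnames, the image URL, and the function names (isOwnHost, flagIfStolen) are placeholders of mine, not part of any library. Note that the warning image has to use an absolute URL so that it still loads from your server when the page is copied elsewhere.

```typescript
// Sketch only. The hostnames and image URL below are hypothetical placeholders.
const ALLOWED_HOSTS = ["yourdomain.com", "www.yourdomain.com"];
const WARNING_IMAGE_URL = "http://www.yourdomain.com/images/stolen-content.gif";

// True when the page is being served from one of your own hostnames.
function isOwnHost(hostname: string, allowed: string[] = ALLOWED_HOSTS): boolean {
  return allowed.indexOf(hostname.toLowerCase()) !== -1;
}

// Browser-only part: put the warning image at the top of the page.
// Does nothing outside a browser, or before the page body exists.
function flagIfStolen(): void {
  const doc = (globalThis as any).document;
  const loc = (globalThis as any).location;
  if (!doc || !loc || !doc.body || isOwnHost(String(loc.hostname))) return;
  const img = doc.createElement("img");
  img.src = WARNING_IMAGE_URL; // absolute on purpose, so it loads from your server
  img.alt = "STOLEN CONTENT";
  doc.body.insertBefore(img, doc.body.firstChild);
}

flagIfStolen();
```

The script only helps if it travels with the copied content, so in practice you would compile it to JavaScript and place it inline near the end of the page body. A thief who strips scripts from the copy will defeat it, which is why I file this one under "for fun".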
The general idea here is to make your content more work to steal than someone else’s content. Few publishers will take all the steps outlined above, so those other publishers will represent easier targets for thieves than you.
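If you want to check your own pages against the linking tips above, a small helper along these lines could flag links that break them. This is a sketch under my own assumptions: the hostnames are placeholders, the names (hostOf, auditLink) are made up for illustration, and you would still need to gather the links from your pages yourself.

```typescript
// Sketch only. OWN_HOSTS is a hypothetical placeholder for your hostnames.
type LinkKind = "asset" | "page"; // asset = image, CSS, or your own script

const OWN_HOSTS = ["yourdomain.com", "www.yourdomain.com"];

// Returns the hostname of an absolute http(s) URL, or null for anything else.
function hostOf(url: string): string | null {
  const match = /^(?:https?:)?\/\/([^\/:?#]+)/i.exec(url);
  return match ? match[1].toLowerCase() : null;
}

// Returns a warning when a link breaks the tips, or null when it is fine.
function auditLink(kind: LinkKind, url: string): string | null {
  const host = hostOf(url);
  // Your own images, CSS, and scripts should be relative.
  // Third-party assets (such as analytics scripts) must stay absolute.
  if (kind === "asset" && host !== null && OWN_HOSTS.indexOf(host) !== -1) {
    return `asset should be relative: ${url}`;
  }
  // Links to your own pages should be absolute. In-page anchors such as
  // "#top" and non-http schemes such as "mailto:" are left alone.
  const isAnchorOrOtherScheme = url.charAt(0) === "#" || /^[a-z][a-z0-9+.-]*:/i.test(url);
  if (kind === "page" && host === null && !isAnchorOrOtherScheme) {
    return `page link should be absolute: ${url}`;
  }
  return null;
}

console.log(auditLink("asset", "http://www.yourdomain.com/images/yourimage.gif"));
console.log(auditLink("page", "/page1.html"));
```

Both example calls print a warning, since the first is an absolute link to your own image and the second is a relative link to one of your own pages.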
Taking Action
Of course, there are times when it is worth taking action. I recommend a three-step process when doing this:
- Contact the site owner. Use whatever means they provide for doing so, tell them where the offending content is, and tell them they need to take it down, or you will take action. Even though you are angry, there is no need to be nasty about it. Focus on your goal, which is to get it taken down. However, do be very clear that you intend to pursue this further.
- If that does not work, the next step is to contact the hosting company for the website. You can often get this information from the site’s WHOIS records at the registrar, but if it is not there, try using a third-party service such as Who is Hosting This. The reason for contacting the hosting company is that they can be held liable for the content theft if you have notified them and they do not act on it. They may be more motivated to avoid the liability than your thief.
- The third step is to file a DMCA request with the search engines. Here is the Google DMCA form and the Bing DMCA instructions. The beauty of this is that the search engines also have an obligation to respond. Do not do this lightly! Do it only if you are in fact the original author. If you used a contractor to write that article, do some due diligence to make sure that they did give you original content.
This three-step process should address most issues. You can also file something with Chilling Effects, as this makes the request visible to others. Or, if this all seems like too much work, enlist a third-party service, such as DMCA.com, to do it for you (as of November 2014, the cost is $199).
Since it will take a lot of time and effort, do make sure you evaluate whether or not it is worth it. If there are no real damages to you, then it probably is not worth taking action, unless someone is copying your whole site, or otherwise extensively stealing from you.