What Duplicate Content in Google’s Index Means to Me


One of the problems facing digital strategists today is duplicate content, and search engines like Google are working harder than ever to discourage website owners from publishing it on the web. Duplicate content is just that: content that duplicates something that already exists. If you aren’t familiar with this problem, here’s why you should care.

You’re going to be approached by people wanting to sell, co-brand, and white-label duplicate content. You will have more than enough chances to host duplicate content on your website, and at least a few people will tell you that you can get away with it. Here’s why, in 2011, I’d limit the time and money invested in content already found elsewhere on the web.

For a moment let’s say you aren’t starting up a website, but that you are starting a local library in your community. Let’s call this library Google, and say that it contains books that summarize all the world’s information.

You finish construction of your beautiful library and find that you’re missing one critical piece of the puzzle: books! You have all this empty space, hundreds of shelves just waiting to be filled with everything from encyclopedias to sports almanacs.

So you begin searching for books, looking everywhere you can find them. In the beginning your focus is filling every shelf with a book, so you aren’t as picky about what you grab. Donated books, used books, and maybe even multiple books focusing on the same topic will do for now.

You finally fill your library to the brim, and people begin to come looking for information on what is relevant to them. As word gets out your library becomes popular, and people begin using your building as a trusted source for information. A few years go by and people begin requesting books that you don’t have, and often people want a newer edition.

Now your task becomes not just filling up the library with books, but making sure those books are of high quality for all your customers. You begin tossing out old books in favor of new ones, and making sure that you only keep the most popular books on hand for people.

After a while you begin noticing that you have multiple copies of some books. In your haste to fill up those shelves you grabbed three copies of the same encyclopedia, as well as six copies of the exact same 1988 sports almanac.

Rather than keep that stuff around, you decide on the best copy to keep on hand, and throw away the rest. Space is now at a premium in your library, and it’s pointless to keep around multiple copies when usually one or two at most will keep your customers happy.

Fast forward a few more years and your library is becoming one of the most popular places in the world! People come from all around the globe just to read your books and use your information for research. Now authors have taken notice too, and famous publishers and authors from around the world line up at your office hoping that you will include their books in your library.

But now you can afford to be picky. You can afford to reject more and more books each day. Space is limited, but you are still open to adding new books to your index, as long as they are worth it to your customers!

Google built a library, and now it’s the largest in the world. It’s also the most important one in the world to be listed in, and that means they can afford to be picky. Would you have two copies of the exact same book in your library? I don’t think I would either. Can you blame them?