I love to visit social bookmarking sites such as Digg and Reddit. Those sites account for about 75% of my news gathering on a typical day. They are a great way of keeping up not only with news events, but also with funny, informative, and sometimes even blasphemous material. However, these sites are a two-way street: your reputation on them depends not only on what you vote up or down, but also on the items and comments you submit and how other users judge them.

You find something that’s interesting and newsworthy, and you try to submit it. Chances are, though, that someone has beaten you to the punch. So what do you do? Submit anyway?

On most of these sites, the more votes your story gets, the higher your reputation becomes and the more visible that story becomes. If your story has already been submitted, then it is a duplicate, or dupe for short. Dupe stories rarely make their way to the top, and if one does become more visible, it may receive many down-votes from users who have noticed that it is indeed a dupe. However, this does not always apply to people with an extremely high rep. Such a person can submit an item that has already been submitted, and because of that higher rep the site algorithm promotes the story faster, so it gets more votes even though it is a dupe.

So this is a chicken-and-egg problem. To make sure your story is promoted to the front page, you need a high rep, but to get a high rep, you need stories that get promoted to the front page. What does a person who doesn’t have a high rep do here? What if that person just wants to submit a story that may have already been submitted?

My point is that dupes should not be frowned upon. Sure, you may see seventeen thousand stories about Apple releasing a polished turd that wipes your butt for you, but the point is that these people are making a contribution to the site. Dupes should also contribute to rep. Obviously not as much as original stories, but they should still count. The amount of rep awarded should be a function of 1) how many dupes have been posted beforehand, 2) the time elapsed between the original article and this dupe, and 3) whether a different source was used from the original article.
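To make that a little more concrete, here is a rough Python sketch of what such a function might look like. Everything in it is my own assumption for illustration: the name dupe_rep, the idea that later dupes earn less, the two-hour half-life, and the bonus for a different source are made-up choices, not how Digg or Reddit actually score anything.

```python
def dupe_rep(base_rep, prior_dupes, minutes_since_original, different_source,
             half_life_minutes=120.0, source_bonus=1.5, max_fraction=0.5):
    """Hypothetical rep award for a dupe; all weights are illustrative guesses.

    base_rep is what an original submission would have earned.
    """
    # 1) The more dupes already posted, the smaller the share for this one.
    crowding = 1.0 / (1 + prior_dupes)

    # 2) Assumed decay: the award halves every half_life_minutes
    #    after the original article was submitted.
    freshness = 0.5 ** (minutes_since_original / half_life_minutes)

    # 3) A different source adds something new, so it earns a bonus.
    source = source_bonus if different_source else 1.0

    # max_fraction * source_bonus stays below 1 with the defaults,
    # so a dupe never earns as much as the original story.
    return base_rep * max_fraction * crowding * freshness * source


if __name__ == "__main__":
    # First dupe, posted 10 minutes later, from a different source.
    print(dupe_rep(100, prior_dupes=0, minutes_since_original=10,
                   different_source=True))
    # Second dupe, posted two hours later, from a different source.
    print(dupe_rep(100, prior_dupes=1, minutes_since_original=120,
                   different_source=True))
```

The exact curve matters less than the shape: every dupe earns something, but always less than the original would have.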

Suppose there is a topic about Microsoft announcing details concerning Windows 7. Submitter A may submit an article from source A, which may have a slight bias towards Microsoft. Submitter B may submit an article about the same subject, but it may originate from source B, which has a slight bias against Microsoft. Submitter C may submit an article from source C, which may analyze the details of the subject at hand. Submitter A may be first, but submitter B may submit 10 minutes later, while submitter C may submit two hours later. Should submitter A get all the rep while B and C get none?

One solution to this would be to form a sort of “topic tree” where a single item is listed, and underneath that item you have each and every submission that is related to it. For our Windows 7 example, it may look something like this:

  • Microsoft announces Windows 7 Details
    • Windows 7 to deliver faster performance, new interface
    • Windows 7 to deliver new interface, performance tweaks
    • Windows 7 to be incompatible with much hardware, new drivers needed
    • A closer look at Windows 7 Beta 1
    • Screenshots of leaked Windows 7 Build 6023
    • Microsoft won’t be able to deliver in time with changes

All of the items have to do with Windows 7 and the new details surrounding it, but each article takes a different look at the topic. Some of the items may be dupes, but they still contribute to the overall topic at hand. Perhaps the number of dupes an article receives could contribute to the rep of the article or of the subject matter itself.
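For the curious, the topic tree could be modeled with something as simple as the Python sketch below. The Topic and Submission classes, their fields, and the source names are hypothetical and only meant to show the shape of the idea, not any real site's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    title: str
    source: str      # where the article came from (hypothetical field)
    submitter: str   # who posted it (hypothetical field)


@dataclass
class Topic:
    title: str
    submissions: list = field(default_factory=list)

    def add(self, submission):
        """File a submission under this topic, dupe or not."""
        self.submissions.append(submission)

    def distinct_sources(self):
        """Number of different sources covering the topic."""
        return len({s.source for s in self.submissions})

    def render(self):
        """Return the topic and its submissions as an indented bullet list."""
        lines = ["• " + self.title]
        lines += ["  • " + s.title for s in self.submissions]
        return "\n".join(lines)


if __name__ == "__main__":
    topic = Topic("Microsoft announces Windows 7 Details")
    topic.add(Submission("Windows 7 to deliver faster performance, new interface",
                         "source A", "submitter A"))
    topic.add(Submission("Windows 7 to deliver new interface, performance tweaks",
                         "source B", "submitter B"))
    topic.add(Submission("A closer look at Windows 7 Beta 1",
                         "source C", "submitter C"))
    print(topic.render())
    print(len(topic.submissions), "submissions from",
          topic.distinct_sources(), "sources")
```

A count like len(topic.submissions) or distinct_sources() is the kind of number that could feed into the rep of the topic itself.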

What do you think?