Monday, September 1, 2008

Wikitrust

Luca de Alfaro is a researcher who has been working on an awesome new MediaWiki extension: Wikitrust. Wikitrust is a text-coloring system for MediaWiki sites: it computes a trust metric for each author and colors the words on a page according to how reliable they are judged to be. The words of a well-trusted author appear white, while the words of an untrusted trouble-maker show up in bright orange; everybody else falls somewhere in the middle of the scale. With Wikitrust installed, there would be a "Trust" tab at the top of every page that you could click to see which parts of the page are the most stable, and which are the most suspicious.
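
To make the coloring idea concrete, here is a toy sketch of how a trust score might map onto that orange-to-white scale. It's a Python illustration of the concept only, not the extension's actual rendering code, and the function name and the particular shade of orange are my own assumptions:

```python
# Toy illustration: map a trust score in [0.0, 1.0] onto an
# orange-to-white gradient. Not Wikitrust's real code.

def trust_to_color(trust: float) -> str:
    """Return a CSS hex color: 0.0 = bright orange, 1.0 = white."""
    trust = max(0.0, min(1.0, trust))   # clamp to the valid range
    orange = (255, 140, 0)              # a stand-in "bright orange"
    white = (255, 255, 255)
    r, g, b = (round(o + (w - o) * trust) for o, w in zip(orange, white))
    return f"#{r:02x}{g:02x}{b:02x}"

print(trust_to_color(0.0))   # "#ff8c00": untrusted, bright orange
print(trust_to_color(0.5))   # pale orange, somewhere in the middle
print(trust_to_color(1.0))   # "#ffffff": fully trusted, white
```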

He announced version 2.0 of his extension on wikiquality-l last week, and we've been discussing it ever since. He was looking for a place to install and test it that wouldn't be too big (we need to evaluate server load before installing it on en.wikipedia), but would be bigger than a small test wiki. I, of course, nominated en.wikibooks for the honor, and I think there is a little bit of support for that on the list. It's one more item in the queue of desired features, behind flaggedrevs and a few other home-brew extensions we've been clamoring for.

4 comments:

  1. Sounds like an interesting concept. I think it could bring up some interesting questions about how we decide who is a "trusted" and, of course, an "untrusted" contributor. I guess it could be based on number of edits, though editing a lot does not necessarily mean the author knows what they are talking about. Alternatively, community consensus could make people "trusted" contributors, though branding people "untrusted" in the same way sounds like a recipe for conflict. Perhaps "untrusted" would just mean anonymous contributors? All questions ahead of us if this extension works out... -AdRiley

  2. The algorithm that decides the trust value of the text is the central focus of the group's research, so it's not something that's arbitrary or that we can set ourselves.

    What his algorithm does (and I only know the rough details) is this:
    1) Go over the revisions to a page and find text that stays in place for a long time. Mark that text as trusted.
    2) Find authors whose text is trusted, and increase the trust level of those authors.
    3) Find edits which have been deleted, immediately reverted, or overwritten. Mark that text as untrusted.
    4) Find the authors of untrusted text, and decrease the trust level of those authors.
    5) Go through and color the text in an article based on the trust level of the author of that text at the time of the last edit.

    I'm sure I'm screwing up some details, but that's how I understand it. The loop might look something like the sketch below.
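
    Everything in this sketch (the revision model, the constants, the function names) is invented for illustration; it's my reading of the steps above, not Luca's actual code, which is far more sophisticated:

    ```python
    from collections import defaultdict

    # Assumed toy revision model: each revision is
    # {"words": [(word, author), ...], "reverted": bool}.

    def update_author_trust(revisions, rounds=5):
        trust = defaultdict(lambda: 0.5)    # everyone starts neutral

        for _ in range(rounds):
            for i, rev in enumerate(revisions):
                # Steps 1-2: text still present in every later revision
                # is "stable"; credit its authors.
                stable = set(rev["words"])
                for later in revisions[i + 1:]:
                    stable &= set(later["words"])
                for _word, author in stable:
                    trust[author] = min(1.0, trust[author] + 0.01)
                # Steps 3-4: a reverted edit marks its text untrusted
                # and costs its authors some trust.
                if rev["reverted"]:
                    for _word, author in rev["words"]:
                        trust[author] = max(0.0, trust[author] - 0.05)
        return trust

    # Step 5: color each word in the latest revision by the trust of
    # whoever wrote it.
    def color_article(latest, trust):
        return [(word, trust[author]) for word, author in latest["words"]]
    ```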

  3. Sounds a little like Google's algorithm for ranking pages, only in reverse. My understanding is that Google promotes pages based on citations from other pages, and citations from higher-ranked pages are worth more. This, OTOH, demotes text based on whether the contributor is frequently reverted. I would hope that reversions by untrusted users would count for less.

    I'm not making a value judgment here - just trying to understand it in terms of something more familiar. A toy version of that weighting might look like the sketch below.
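
    (Numbers and names here are made up; nothing in this sketch comes from Wikitrust itself.)

    ```python
    # Hypothetical sketch: scale the revert penalty by the reverter's
    # own trust, so that reverts from untrusted users count for less.

    def trust_after_revert(author_trust: float, reverter_trust: float) -> float:
        base_penalty = 0.05                  # arbitrary flat penalty
        return max(0.0, author_trust - base_penalty * reverter_trust)

    print(trust_after_revert(0.8, reverter_trust=1.0))   # ~0.75: full penalty
    print(trust_after_revert(0.8, reverter_trust=0.1))   # ~0.795: much milder
    ```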

  4. Jomegat: It's definitely a concept that's tricky to grasp, but Luca is adept at explaining what's going on behind the scenes. You should consider joining the wikiquality-l mailing list; archives are at http://lists.wikimedia.org/pipermail/wikiquality-l/
