Happyninja42 said:
I think it would be pretty easy to develop some kind of accuracy metric, based on cited sources for the information versus random opinions without any evidence to back them up.
But then you get into a complicated situation trying to verify the validity of a statement. There are two types of problems (that I can think of) you can hit:
1. There are sources that back up a statement. The sources are also backed up. The sources of those, however, are bogus. Example: propagated misinformation, such as "Humans use only 10% of their brains" or some shit. The chain could be really long, you have to parse the entirety of it, and if the source is free text (so, the most likely scenario), you have to somehow make the computer understand enough to verify that the fact is indeed there. This makes verifying a statement quite hard and quite time-consuming.
2. The bottom-line sources that back up a statement are "common sense". It's really hard to find common sense on the Internet. And that's not a pun, even though I often feel that is also true, but I digress. Usually real-life common sense knowledge is not on the web because... well, it's common sense. You rarely need to document that, for example, riding a bicycle is a faster means of transportation than walking.
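To make problem 1 a bit more concrete, here is a rough sketch of what "parsing the entire chain" could look like if the citations were already in a nice machine-readable graph (which, as I said, they usually aren't). Everything in it is made up for illustration: the `CITES` graph, the `TRUSTED_ROOTS` set and the `is_backed` function are my own hypothetical names, and the sketch assumes somebody has already judged the bottom-level sources by hand.

```python
# Hypothetical citation graph: each source maps to the sources it cites.
# An empty list means the chain bottoms out there.
CITES = {
    "blog post": ["magazine article"],
    "magazine article": ["self-help book"],
    "self-help book": [],          # repeats the 10% brain myth, cites nothing
    "textbook": ["peer-reviewed study"],
    "peer-reviewed study": [],
}

# Assumption: the bottom-level sources have been vetted by a human.
TRUSTED_ROOTS = {"peer-reviewed study"}


def is_backed(source, seen=None):
    """Return True only if every chain starting at `source` ends in a trusted root."""
    seen = set() if seen is None else seen
    if source in seen:
        # Circular citation: the sources only back each other up.
        return False
    seen = seen | {source}
    cited = CITES.get(source, [])
    if not cited:
        # Bottom of the chain: it counts only if the root itself is trusted.
        return source in TRUSTED_ROOTS
    return all(is_backed(s, seen) for s in cited)


print(is_backed("textbook"))
print(is_backed("blog post"))
```

Note that this only walks the chain. It does nothing about the hard part, which is reading free text and deciding whether a source really says what it is cited for, and it does not help with problem 2 at all.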
But actually, what the KV is doing is not that complicated or new, if we look at it in simplified form. It's a lot like backward chaining, but with fuzzy logic applied to the new facts. In effect, it has a bunch of data expressing a number of statements with varying degrees of certainty. When it comes across a new statement, it checks it against the previously known statements, assigns it a confidence based on them, then adds it to the list of known statements. That's really a basic explanation of how it works; the actual one involves more maths and, frankly, I struggled to follow it myself. It makes sense, though. As I said, it's a lot like backward chaining.
Let's illustrate how things are supposed to go. Let's say the KV comes across the statement:
"tippy2k2 is the sexiest person in the solar system"
Now, it runs this through the things it already knows, and it turns out that it knows the following:
tippy2k2 was unanimously voted the most sexy Escapist user
tippy2k2 is the sexiest person in the USA
tippy2k2 is the sexiest person in South America
tippy2k2 is the sexiest person on the moon
tippy2k2 is the sexiest person on Jupiter
tippy2k2 is the sexiest user in the Atlantic Ocean
Now, of course we know the new statement is true, but the KV is going to reason only with the above facts. So, it assigns some confidence value to the new statement, let's say 90%, since it seems likely based on these facts.
Now, it comes across the statement "shrekfan246 is the sexiest person in the universe". For simplicity's sake, let's assume there are three solar systems in the entire universe (small world, eh). And the KV knows that shrekfan246 is the sexiest person in the other two solar systems. However, seeing that he isn't unanimously the sexiest person in all of them (remember, tippy2k2 is 90% likely to be the sexiest in our solar system), the KV decides that it's pretty likely but not certain, so it assigns a confidence of 98% to the statement.
And so on and so forth.
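If you want to see the general shape of that loop in code, here is a toy sketch. To be clear, this is not the KV's actual maths (which, again, I struggled to follow). I'm swapping in a simple noisy-OR rule as a stand-in, and the names `noisy_or` and `add_statement` as well as the 0.3 "strength" values are invented for the example. It is not tuned to reproduce the 90% and 98% figures above, which were made up anyway.

```python
from math import prod


def noisy_or(evidence):
    """Combine independent pieces of evidence, each a probability in [0, 1].

    The statement is only left unsupported if every piece of evidence
    fails to support it. With no evidence at all the result is 0.
    """
    return 1.0 - prod(1.0 - e for e in evidence)


def add_statement(known, statement, support):
    """Score a new statement against known ones, then store it.

    known:   dict mapping statement -> confidence in [0, 1]
    support: list of (known_statement, strength) pairs, where strength is
             a made-up guess at how strongly that fact suggests the new one
    """
    evidence = [known[fact] * strength for fact, strength in support]
    known[statement] = noisy_or(evidence)
    return known[statement]


# The six facts from the example, all taken as certain (confidence 1.0).
known = {
    "voted most sexy Escapist user": 1.0,
    "sexiest in the USA": 1.0,
    "sexiest in South America": 1.0,
    "sexiest on the moon": 1.0,
    "sexiest on Jupiter": 1.0,
    "sexiest in the Atlantic Ocean": 1.0,
}

# Assumption: each fact weakly suggests the new statement (strength 0.3).
support = [(fact, 0.3) for fact in list(known)]
confidence = add_statement(known, "sexiest in the solar system", support)
print(f"{confidence:.2f}")
```

The important bit is the last step of `add_statement`: the new statement goes into `known` with its confidence attached, so it can itself be used as (uncertain) support the next time a statement comes along.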