News: Google considering ranking sites on factual content rather than popularity

Skatologist

Choke On Your Nazi Cookies
Jan 25, 2014
628
0
21

Our favorite internet overlords are currently working on ways to rank sites based on truthfulness.

Google, still the most popular search engine on the web, is attempting to replace its long-standing model of ranking sites with one that aims to fix its flaws.

The current model of Google's search engine can be summarized as follows: the number of links to a webpage is treated as a measure of that page's quality, which determines where it appears in search results. The most glaring flaw is that sites full of misinformation may rise up the ranks if enough people link to them.
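
To make that flaw concrete, here is a deliberately crude sketch that ranks pages purely by how many distinct pages link to them. It is not Google's actual algorithm (real link analysis is far more involved); the function name and the example domains are made up for illustration.

```python
from collections import Counter


def rank_by_inbound_links(links):
    """Rank pages by number of inbound links (hypothetical toy model).

    links: iterable of (from_page, to_page) pairs.
    Returns the target pages ordered from most-linked to least-linked.
    """
    # Drop duplicate (from, to) pairs while keeping first-seen order.
    unique_links = list(dict.fromkeys(links))
    counts = Counter(target for _, target in unique_links)
    return [page for page, _ in counts.most_common()]


# Made-up example data: three pages link to a hoax, one links to a factual page.
links = [
    ("blog1.example", "hoax.example"),
    ("blog2.example", "hoax.example"),
    ("forum.example", "hoax.example"),
    ("university.example", "facts.example"),
]
print(rank_by_inbound_links(links))
```

Nothing in the ranking looks at what the pages actually say, so the heavily linked hoax page comes out on top.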

A Google research team is developing a system to count the number of incorrect facts on a page. The statement of the research team can be found here.

The software will harness Google's Knowledge Vault, an information base dedicated to storing information that both machines and people can read.

Similar apps that already help internet users weed out falsehoods include LazyTruth, which flags hoax emails in inboxes, and Columbia University's Emergent, a tool that focuses on rumors in the media.

The prospect of sites being valued on their honesty rather than on how many people link to them in an online debate seems to be a rather promising one.

Source: NewScientist
 

Sleepy Sol

New member
Feb 15, 2011
1,831
0
0
Huh...more power to them if they can make it work. Hopefully it will.

Preliminary golf clap to Google.
 

tippy2k2

Beloved Tyrant
Legacy
Mar 15, 2008
14,870
2,349
118
While I appreciate the effort, I wonder how this will actually (or if it CAN actually) work. Some things have cold hard facts to them; like water freezes at 32 F, the sun is a star, tippy2k2 is the sexist Escapist user, etc.

However, how is that going to work with things that are not 100% factual? I don't even mean subjective things like best movie ever or anything like that but things like how much water is needed every day for someone to be healthy, how much sleep you should get in a day, the best way to treat a fever, which body part is the sexist on tippy2k2. These are all questions that don't necessarily have a "right" answer and even experts have different opinions...

It would be nice to have a more reliable way to search through information but I'm not sure how possible it is.
 

Zombie_Fish

Opiner of Mottos
Mar 20, 2009
4,584
0
0
I'm more interested in seeing what the Knowledge Vault is like to be honest. I've done some very brief skimming of the paper and it seems cool.[footnote]A version of the paper is available here: https://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf[/footnote]

Unfortunately that will probably remain private for a very long time, if not forever. Making the vault public would probably make it easier for illegitimate sites to boost their rankings.

Oh well, yet another reason to get a job at Google then.

EDIT: Just to clarify, this is the Knowledge Vault paper. There's a separate paper on the proposed ranking algorithm, with an ArXiv link on the New Scientist article.
 

Skatologist

Choke On Your Nazi Cookies
Jan 25, 2014
628
0
21
tippy2k2 said:
However, how is that going to work with things that are not 100% factual? I don't even mean subjective things like best movie ever or anything like that but things like how much water is needed every day for someone to be healthy, how much sleep you should get in a day, the best way to treat a fever, which body part is the sexist on tippy2k2. These are all questions that don't necessarily have a "right" answer and even experts have different opinions...
You bring up an interesting point; however, I don't believe these methods will be applied universally to all of Google's searches. I'm fairly certain sites dedicated to answering questions like "How much should the average person walk in a day" will most likely stay high. I'll quote the source and hopefully it calms some of your worries.

The software works by tapping into the Knowledge Vault, the vast store of facts that Google has pulled off the internet. Facts the web unanimously agrees on are considered a reasonable proxy for truth. Web pages that contain contradictory information are bumped down the rankings.
I'm a bit skeptical myself of the "unanimous" aspect of what the internet agrees on, but hopefully we'll see notable positive change from this project.

Captcha: laser beams

Aw yeah son
 

PainInTheAssInternet

The Ship Magnificent
Dec 30, 2011
826
0
0
I can already imagine how this would turn out. Accusations of conspiracies abound. What about various news sites? Who judges honesty and truth?
 

Zombie_Fish

Opiner of Mottos
Mar 20, 2009
4,584
0
0
tippy2k2 said:
However, how is that going to work with things that are not 100% factual? I don't even mean subjective things like best movie ever or anything like that but things like how much water is needed every day for someone to be healthy, how much sleep you should get in a day, the best way to treat a fever, which body part is the sexist on tippy2k2. These are all questions that don't necessarily have a "right" answer and even experts have different opinions...
Skatologist said:
I'm a bit skeptical myself of the "unanimous" aspect of what the internet agrees on, but hopefully we'll see notable positive change from this project.
Yeah, that unanimous point to me seems incorrect. At the very least, I cannot see anything about the Knowledge Vault only storing points that are unanimously agreed upon, and it would be extremely limiting if that was the case.

Going on the original paper about the Knowledge Vault, they seem to base it probabilistically on how many sources agree with each other:

The feature vector is composed of two numbers for each extractor: the square root of the number of sources that the extractor extracted this triple from, and the mean score of the extractions from this extractor, averaging over sources (or 0 if the system did not produce this triple).
In English: For each fact the system manages to find, and for each extractor, they store the square root of the number of sources that extractor pulled the fact from, and the average score the extractor gave the fact across those sources.
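
A minimal sketch of that feature vector, going only by the sentence quoted from the paper. The function name, the `(extractor, source, score)` input layout, and the extractor names are assumptions for illustration, not anything from Google's code.

```python
import math


def build_feature_vector(extractions, extractors):
    """Build the two-numbers-per-extractor vector for ONE candidate fact.

    extractions: list of (extractor_name, source, score) tuples (assumed layout).
    extractors:  ordered list of every extractor name in the system.
    """
    # Group scores by extractor, one score per source.
    # Assumption: if a source repeats, the last score seen wins.
    scores = {}
    for extractor, source, score in extractions:
        scores.setdefault(extractor, {})[source] = score

    features = []
    for name in extractors:
        per_source = scores.get(name, {})
        if per_source:
            # Square root of the number of sources this extractor found the fact in.
            features.append(math.sqrt(len(per_source)))
            # Mean extraction score, averaged over those sources.
            features.append(sum(per_source.values()) / len(per_source))
        else:
            # The extractor never produced this fact: both numbers are 0.
            features.extend([0.0, 0.0])
    return features


# Made-up example: "text" saw the fact on four pages, "table" on one, "dom" on none.
example = [
    ("text", "a.example", 0.8),
    ("text", "b.example", 0.6),
    ("text", "c.example", 0.7),
    ("text", "d.example", 0.9),
    ("table", "a.example", 0.5),
]
print(build_feature_vector(example, ["text", "table", "dom"]))
```

The square root means the hundredth source agreeing adds much less than the second one did, so sheer repetition only helps so much.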
 

DoPo

"You're not cleared for that."
Jan 30, 2012
8,665
0
0
tippy2k2 said:
However, how is that going to work with things that are not 100% factual? I don't even mean subjective things like best movie ever or anything like that but things like how much water is needed every day for someone to be healthy, how much sleep you should get in a day, the best way to treat a fever, which body part is the sexist on tippy2k2. These are all questions that don't necessarily have a "right" answer and even experts have different opinions...
Without knowing much about the implementation or Google's intention, I can nonetheless make an educated guess and say that, in those cases, they wouldn't weigh the results one way or another. This would be the way I'd do it, and this would be the way I believe any sensible system would do it - very simplistically, there would be an input (a crawled resource) and an output (a weight), and this would be just a plugin to their overall system of categorising search results. The output of the "truth evaluator" would be either positive (truth-y content), negative (false-y content) or neither (content which doesn't have rules associated with correctness).

Thus, a personal blog which is just somebody's diary wouldn't be penalised, since it wouldn't be offering any information that can be evaluated.
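
Very roughly, the three-way evaluator described above could look something like this. It is a guess at the shape of such a plugin, not Google's design: `Verdict`, `KNOWN_FACTS`, `evaluate` and the claim format are all invented here, and a real system would first have to extract the claims from the page text, which is skipped entirely.

```python
from enum import Enum


class Verdict(Enum):
    TRUTHY = 1           # page mostly agrees with known facts
    FALSY = -1           # page mostly contradicts known facts
    NOT_APPLICABLE = 0   # nothing checkable, so no effect on ranking


# Hypothetical stand-in for a fact store: (subject, attribute) -> value.
KNOWN_FACTS = {
    ("water", "freezing point (F)"): "32",
    ("sun", "type"): "star",
}


def evaluate(claims):
    """claims: list of ((subject, attribute), value) pairs already extracted from a page."""
    # Only claims with a known answer can be judged at all.
    checkable = [(key, value) for key, value in claims if key in KNOWN_FACTS]
    if not checkable:
        return Verdict.NOT_APPLICABLE

    correct = sum(1 for key, value in checkable if KNOWN_FACTS[key] == value)
    wrong = len(checkable) - correct
    if correct > wrong:
        return Verdict.TRUTHY
    if wrong > correct:
        return Verdict.FALSY
    return Verdict.NOT_APPLICABLE  # assumption: an even split stays neutral


print(evaluate([(("water", "freezing point (F)"), "32")]))
print(evaluate([(("sun", "type"), "planet")]))
print(evaluate([(("my cat", "mood"), "grumpy")]))  # diary-style content
```

The third call is the diary case: none of its claims are in the fact store, so the page gets a neutral verdict rather than a penalty.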

With that said, I don't really know what Google's plan is. I'd be interested to know, though.

Zombie_Fish said:
A version of the paper is available here: https://www.cs.cmu.edu/~nlao/publication/2014.kdd.pdf
Thanks for that - I'll certainly have a look at this.
 

Skatologist

Choke On Your Nazi Cookies
Jan 25, 2014
628
0
21
PainInTheAssInternet said:
I can already imagine how this would turn out. Accusations of conspiracies abound. What about various news sites? Who judges honesty and truth?
Funnily enough, if you type this into Google, the first result is, BAM!, Infowars.

Proof:



It's almost like Google did this to show how necessary it was to change their search system. It's almost like a conspiracy.
<_<
 

shrekfan246

Not actually a Japanese pop star
May 26, 2011
6,374
0
0
tippy2k2 said:
tippy2k2 is the sexist Escapist user,
Uh...

[sub]I don't know if I have the heart to tell you...[/sub]

OT: I already voiced my concerns about this. It seems like it would be equally exploitable, and while computers aren't bogged down by things like "emotional bias", I'm not quite convinced of their ability to tell the absolute truth yet.
 

Sleepy Sol

New member
Feb 15, 2011
1,831
0
0
shrekfan246 said:
tippy2k2 said:
tippy2k2 is the sexist Escapist user,
Uh...

[sub]I don't know if I have the heart to tell you...[/sub]

OT: I already voiced my concerns about this. It seems like it would be equally exploitable, and while computers aren't bogged down by things like "emotional bias", I'm not quite convinced of their ability to tell the absolute truth yet.
Well, he did say the SEXIST Escapist user. I think you're still in the running.

It's totally me, though.
 

shrekfan246

Not actually a Japanese pop star
May 26, 2011
6,374
0
0
Solaire of Astora said:
shrekfan246 said:
tippy2k2 said:
tippy2k2 is the sexist Escapist user,
Uh...

[sub]I don't know if I have the heart to tell you...[/sub]

OT: I already voiced my concerns about this. It seems like it would be equally exploitable, and while computers aren't bogged down by things like "emotional bias", I'm not quite convinced of their ability to tell the absolute truth yet.
Well, he did say the SEXIST Escapist user. I think you're still in the running.

It's totally me, though.
I'm holding out hope that it was a typo, but he did it twice. D:
 

Thaluikhain

Elite Member
Legacy
Jan 16, 2010
19,538
4,128
118
Adam Jensen said:
Ha. Fox News website won't even appear on Google if they did that :D
Or, potentially, other news services people at Google happened not to like.

Yeah, I'm a bit wary of this.
 

tippy2k2

Beloved Tyrant
Legacy
Mar 15, 2008
14,870
2,349
118
shrekfan246 said:
Solaire of Astora said:
shrekfan246 said:
tippy2k2 said:
tippy2k2 is the sexist Escapist user,
Uh...

[sub]I don't know if I have the heart to tell you...[/sub]

OT: I already voiced my concerns about this. It seems like it would be equally exploitable, and while computers aren't bogged down by things like "emotional bias", I'm not quite convinced of their ability to tell the absolute truth yet.
Well, he did say the SEXIST Escapist user. I think you're still in the running.

It's totally me, though.
I'm holding out hope that it was a typo, but he did it twice. D:
Wait...what?

.....damn it!!! Curse these fat sausage fingers! I mean...curse these sexy sexy fingers!!!

Well that's OK, I can erase the evidence! It's not true until it's on Goo...


CURSE YOU GOOGLE!!!!
 

shrekfan246

Not actually a Japanese pop star
May 26, 2011
6,374
0
0
tippy2k2 said:
shrekfan246 said:
Solaire of Astora said:
shrekfan246 said:
tippy2k2 said:
tippy2k2 is the sexist Escapist user,
Uh...

[sub]I don't know if I have the heart to tell you...[/sub]

OT: I already voiced my concerns about this. It seems like it would be equally exploitable, and while computers aren't bogged down by things like "emotional bias", I'm not quite convinced of their ability to tell the absolute truth yet.
Well, he did say the SEXIST Escapist user. I think you're still in the running.

It's totally me, though.
I'm holding out hope that it was a typo, but he did it twice. D:
Wait...what?

.....damn it!!! Curse these fat sausage fingers! I mean...curse these sexy sexy fingers!!!

Well that's OK, I can erase the evidence! It's not true until it's on Goo...


CURSE YOU GOOGLE!!!!
I bet that's going straight into Google's Knowledge Vault.

It's going to be known as hard fact for the rest of internet time.

Poor tippy...
 

DoPo

"You're not cleared for that."
Jan 30, 2012
8,665
0
0
shrekfan246 said:
tippy2k2 said:
tippy2k2 is the sexist Escapist user,
Uh...

[sub]I don't know if I have the heart to tell you...[/sub]
It's OK, I will: tippy, you're actually the sexiest Escapist user in the universe. Well done, keep up the good work! :p

shrekfan246 said:
OT: I already voiced my concerns about this. It seems like it would be equally exploitable, and while computers aren't bogged down by things like "emotional bias", I'm not quite convinced of their ability to tell the absolute truth yet.
What computers are good at is following instructions and rules. And Google are really good at making them obey complex rules, especially ones to do with enormous quantities of data. Google are, maybe not the best, but certainly top tier at machine learning, AI, agent systems and most stuff surrounding those.

The "absolute truth" is actually relatively simple to work out. Heck, it doesn't even need to be worked out, as much as memorized. However, you could have the system work out what to memorize (possibly having it verified by somebody).

The trickier part would actually be finding out what's on a page. This is the odd thing in AI - what's easy and hard for humans is usually reversed for computers. As a simple example, it's usually hard to be a doctor; however, it's easy to just...well, communicate. We do the latter every day with ease, while for the former we need 5 years (at least) of specialised schooling. For a computer, parsing out communication is hard, but giving a diagnosis is relatively easy, especially in comparison. The doctor one falls under what's called "expert systems" - it is mostly what it sounds like - a system that deals with expert knowledge for a given domain: this could be medical diagnosis, technical troubleshooting, mechanical fault analysis and so on. They are generally quite good at quite specialised tasks that are not as easy for humans. But try to get an AI to understand language, or other things we do every day, and the thing turns out to be really friggin' hard.

As always, xkcd is spot on



There is also the alt-text:
In the 60s, Marvin Minsky assigned a couple of undergrads to spend the summer programming a computer to use a camera to identify objects in a scene. He figured they'd have the problem solved by the end of the summer. Half a century later, we're still working on it.
 

shrekfan246

Not actually a Japanese pop star
May 26, 2011
6,374
0
0
DoPo said:
Well, I mostly meant that I'm skeptical of a computer's capability to pull absolute truth from something like the internet when all of the "facts" it's going to be pulling are influenced by human nature and bias in the first place.

But hey, I'm not a programmer, so I don't know how hard it would be or how accurate they'll be able to make it.
 

DoPo

"You're not cleared for that."
Jan 30, 2012
8,665
0
0
shrekfan246 said:
DoPo said:
Well, I mostly meant that I'm skeptical of a computer's capability to pull absolute truth from something like the internet when all of the "facts" it's going to be pulling are influenced by human nature and bias in the first place.
That's why usually domain experts assist in expert systems. And "domain expert" is just a fancy word for "somebody who knows what they are talking about" - doctors would help out with a diagnosing system, mechanics would help out with a system that checks cars, etc. Similarly, I assume this is what Google would do.

Google could, I assume, run a task that would crawl the web and try to find out possible answers for topics and try to classify them according to likelihood. It's not even that hard...well, more or less - it would involve something like finding resources for a topic and then analysing them (when you have enough) and synthesizing the resources into general facts. Well, if you're able to parse out meaning from the text, that is, you'd be able to read two articles on "boiling water" and find that they both mention "100 degrees Celsius". Or you could find that one says that, and the other says "212 degrees Fahrenheit", but as soon as you have that, you can convert them to a common base and know what the boiling point is. You can then ask a physicist over to look at that result and say if it's correct or not and make amendments if needed.

Later on, if a Google crawler comes across something new and identifies it's about "boiling water" but mentions "55 degrees", it should be able to tell that's not a correct number.
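
As a rough sketch of that "common base" idea: the snippet below assumes the hard part (parsing a number and a unit out of the text) is already done, and that a new claim comes with a unit too. The function names and the one-degree tolerance are made up for illustration.

```python
import statistics


def to_celsius(value, unit):
    """Convert a temperature to Celsius; only C and F are handled in this sketch."""
    if unit == "C":
        return float(value)
    if unit == "F":
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown unit: {unit}")


def derive_reference(claims):
    """Combine (value, unit) claims from several sources into one value in Celsius."""
    # Median rather than mean, so one wild outlier cannot drag the result.
    return statistics.median(to_celsius(value, unit) for value, unit in claims)


def is_consistent(claim, reference_celsius, tolerance=1.0):
    """Check a new (value, unit) claim against the reference (tolerance is an assumption)."""
    value, unit = claim
    return abs(to_celsius(value, unit) - reference_celsius) <= tolerance


# Two articles on boiling water, one in Celsius and one in Fahrenheit.
reference = derive_reference([(100, "C"), (212, "F")])
print(reference)                            # both convert to 100 degrees Celsius
print(is_consistent((55, "C"), reference))  # the "55 degrees" page is flagged
```

A page that says "55 degrees" fails the check whichever of the two units it meant, since 55 F is even further from the reference than 55 C.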

Still, I am pretty sure a lot of scientific facts (like boiling water) would actually be readily available, so they don't need to derive them.