Aerodyamic said:
Here's my point, in a nutshell:
spoiler=I can prove damn near anything
If:
100% of people that (ate tomatoes) prior to 1890 [are dead]
99% of people that (ate tomatoes) prior to 1910 [are dead]
95% of people that (ate tomatoes) prior to 1930 [are dead]
etc...
Based on that example, I can infer that tomatoes are poisonous. I can alter the variables within each set of brackets, and define different parameters, but the fact remains that I have demonstrated that a bias can be created relatively easily.
So:
If I say hello to 100 women and some number of them greet me in return, I can state that that percentage of women are LIKELY to return a social greeting.
If I then greet 100 men, and some number return the greeting, I can then state that that percentage of men are LIKELY to return a social greeting.
If I compare those 2 percentages, I can then outline the differences in the percentage of women and men that are LIKELY to return a social greeting.
That's not sexist, that's statistics. Captcha = Easy as cake, which it really is.
Aerodynamic, if you don't mind I'll try to explain this to you in a different way.
Take the experiment above. With sample sizes of 100 from each of two populations, yielding X and Y positive responses respectively, we then assume that X% of one population and Y% of the other will yield a positive outcome.
N.B. This is not how this works! This neglects everything about bias, sampling and confidence levels that elevates statistics to a useful tool. If you ignore these things you cannot claim that you have made an argument based on statistics.
I cannot overstate how wrong this is.
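To give a rough idea of what "confidence levels" means here, this is a minimal Python sketch of a 95% Wilson score interval for a proportion. The response counts (60 of 100 women, 50 of 100 men) are made up purely for illustration, and the function name wilson_interval is my own. It also only covers sampling error; it does nothing about bias in how the 100 people were chosen.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical counts, not data from anyone's actual experiment.
women = wilson_interval(60, 100)
men = wilson_interval(50, 100)

print("women: %.3f to %.3f" % women)
print("men:   %.3f to %.3f" % men)
```

With samples this small each interval is roughly ten percentage points either side of the observed value, so two observed percentages that look different can easily have overlapping ranges.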
Let's say for the sake of argument that we don't know anything that might let us make a statistically reasonable claim from this data. So, X% and Y% of each population is expected to yield a positive result.
Now, we want to use that data to construct a framework for social interaction. Suppose we decide that membership of one of these populations will be the basis for whether or not we continue to provide the stimulus with the expected result.
The standard by which we could say that this framework fails is when we give the wrong social cue to an individual and are perceived as rude** (otherwise, why would you construct the framework at all?). If Y% of men were expected to return a greeting, then this is the percentage to whom we will have been rude by basing our framework solely on gender "statistics".
So, how long can we go without being rude? After one encounter with a man, there is a Y% chance that we have been rude. After two encounters (assuming independence of trials, which is another bad assumption) there is a (1 - (1-Y/100)^2) chance; after three, (1 - (1-Y/100)^3).
The probability of having been rude to at least one man out of any N men you meet is (1 - (1-Y/100)^N). This approaches 100% as the number of encounters N increases. If we want to see how many men you could interact with before you are 95% sure that you have been rude to at least one of them, we need to rearrange this a little (95% is a bit arbitrary, but it's the arbitrary level that everyone uses).
Setting 1 - (1-Y/100)^N = 0.95 and solving for N, we get N = log(0.05) / log(1 - Y/100). If we plug in some numbers we can see how many men we would have to meet before we can be fairly sure that we have been rude to at least one, for varying values of Y.
Y -------- N
50 - - - - 4.3
40 - - - - 5.9
30 - - - - 8.4
20 - - - - 13.4
10 - - - - 28.4
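If anyone wants to check the table, here is a short Python sketch that recomputes it from N = log(0.05) / log(1 - Y/100), with Y given as a percentage. The function names are my own, and it keeps the same (bad) independence assumption as above.

```python
import math

def prob_at_least_one_rude(y_percent, n):
    """Chance of having been rude at least once in n independent encounters."""
    return 1.0 - (1.0 - y_percent / 100.0) ** n

def encounters_until_likely_rude(y_percent, confidence=0.95):
    """Encounters N at which prob_at_least_one_rude reaches `confidence`."""
    return math.log(1.0 - confidence) / math.log(1.0 - y_percent / 100.0)

for y in (50, 40, 30, 20, 10):
    print("%2d - - - - %.1f" % (y, encounters_until_likely_rude(y)))
```

Changing the confidence argument shows how little the conclusion depends on the 95% level: a stricter threshold only raises N by a modest factor.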
So, even at a very generous value for Y, you don't have to interact with more than thirty men before you can be reasonably certain that your social framework has broken down.
This got pretty lengthy, but I wanted to keep it as simple as possible. The point is that statistics and probability aren't simple things to apply, and the way you tried to apply them in your earlier post was wrong on several levels.
**This brings up a point that I forgot to make about insurance companies above. Insurance companies offset all the individuals who vary from the mean against each other. Unless you want to claim that each person that you are rude to is offset by a certain number of people to whom you have been polite, this is another important difference.