Questions about Quiz #4?
https://bgsu.instructure.com/courses/1137459/quizzes/1526334/statistics
Lesson 4.0: Generalizations
- Generalizations - definition
- Formalizing & assessing arguments for generalizations
- Review selected homework questions
Lesson 4.1: Statistical syllogisms
- Statistical syllogism - definition
- Formalizing & assessing statistical syllogisms
- Review selected homework questions
[Takeaway points for today @ bottom of post]
Lesson 4.0: Generalizations
General claims are probabilistic:
- Most BGSU students like candy
- 70% of Americans will hand out Halloween candy this year
- Only a small proportion of spiders are dangerous (to humans)
Think of generalizations as two-premise arguments:
P1) S is a representative sample of group P.
P2) [Some proportion] of S is Y (Y=the trait we're interested in).
C) [Some proportion] of P is Y.
Example:
P1) The students in this class are a representative sample of BGSU undergraduates.
P2) 20% of the students in this class have seen those creepy clowns everyone's talking about.
C) Therefore, 20% of BGSU undergraduates have seen those creepy clowns.
1001 Definitions - make sure to familiarize yourself with these prior to the exam
- Sampling frame - Defines the trait and the total population (group) you want to study
- Target population (group) - The total population you want to study
- Sample - The subset of individuals (from the total population) you plan to study
- Sample size - The number of subjects being studied, i.e., your chosen subset of the total population
- Hasty generalization - You commit this fallacy when you rely on a sample that is too small to support a general conclusion (cannot represent the diversity of the total population)
- Representative sample - A sample in which the relevant traits or variables appear in the same proportion as they do in the total population
- Biased sample - Does not have the same distribution of variables represented in the same proportion as in the total population
- Random sample - To get a random sample you must use a selection method to ensure that every member of the total population has an equal chance of being included in the sample
- Margin of error - The range around a sample result within which the true value for the total population probably falls. As sample size goes down (under a certain limit) the margin of error goes up
- Stratified random sampling - First, identify relevant population clusters; second, make sure that they are represented in the sample in the same proportion that they exist in the population
- Anecdotal evidence - Evidence collected in a casual or informal manner that relies heavily or entirely on personal testimony. Usually involves both too small a sample and a biased sample
- Operationalize - A fancy word for defining your terms clearly
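To see why small samples are risky, here's a rough sketch of how the margin of error shrinks as sample size grows. It uses the standard 95% approximation for a proportion from a simple random sample, which is not something we derived in class. The function name and the sample sizes are just illustrative choices.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    p: proportion of the sample with the trait (0 to 1)
    n: sample size
    z: 1.96 corresponds to a 95% confidence level
    Assumes a simple random sample from a much larger population.
    """
    return z * math.sqrt(p * (1 - p) / n)

# p = 0.5 is the worst case (it gives the widest margin of error).
for n in (10, 100, 1000, 2000):
    moe = margin_of_error(0.5, n)
    print(f"n = {n:>4}: +/- {moe * 100:.1f} percentage points")
```

Notice that going from 10 to 100 subjects helps a lot, while going from 1,000 to 2,000 helps much less. Also notice what the formula can't do: it says nothing about whether the sample is biased.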
Takeaway points
Roughly speaking:
1) Sufficiently large sample size, representative sample, low margin of error, stratified random sampling, terms operationalized --> good generalizations
2) Small sample size, non-representative (i.e. biased) sample, high margin of error, hasty generalization, anecdotal evidence, terms not operationalized --> bad generalizations
THIS WEEK: ASSESS PREMISES FOR ADEQUATE SAMPLE SIZE & REPRESENTATIVENESS
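If you want to see what "represented in the same proportion" looks like in practice, here's a minimal sketch of the second step of stratified random sampling. The enrollment numbers and the function name are made up for illustration; they are not real BGSU data.

```python
def stratified_allocation(strata_sizes, sample_size):
    """Split a sample across clusters in proportion to each cluster's size.

    strata_sizes: dict mapping cluster name -> number of people in the population
    sample_size: total number of subjects we can afford to study
    Rounding means the parts may not add up to sample_size exactly.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}

# Hypothetical population clusters (invented numbers).
population = {
    "first-years": 4000,
    "sophomores": 3000,
    "juniors": 2000,
    "seniors": 1000,
}

print(stratified_allocation(population, 200))
```

You would still need to pick the individuals within each cluster at random. This sketch only works out how many to pick from each.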
Homework 4.0, Part A
A. Suppose you are asked to conduct a study but only given vague terms. Operationalize the sampling frame to eliminate vagueness in the target population and in the trait we're interested in. You are free to operationalize any way you like so long as you eliminate vagueness.
3. Small dogs are aggressive.
Target population: ?
Trait: ?
For those interested, here's a great article (note: focus is on dog bite-related fatalities, and not dog bites simpliciter):
Patronek, Gary J., et al. "Co-occurrence of potentially preventable factors in 256 dog bite–related fatalities in the United States (2000–2009)." Journal of the American Veterinary Medical Association 243.12 (2013): 1726-1736.
Search Google Scholar to find PDF.
Homework 4.0, Part B
B. Put the argument into standard form for generalizations. Evaluate each premise according to criteria we covered in class:
(P1) Check for sample size, representativeness.
(P2) Check for measurement errors. (Probably not relevant for this set of exercises. We'll look at this in more detail next class).
(C) Is the proportion in (P2) the same as the one in the conclusion?
3. I never found calculus to be useful, therefore it isn't useful.
P1)?
P2)?
C)?
P1, Sample size - sufficient, insufficient, or not sure?
P1, Representativeness - representative, not representative, or not sure?
6. 40% of bees in Nebraska die every winter, therefore 40% of bees in the US die every winter.
7. 90% of people in LA, San Francisco, Manhattan, San Diego, Austin, and Seattle support gay marriage, therefore 90% of Americans support it.
More examples
1. My cats love Fancy Feast. Therefore, your cats will love Fancy Feast too.
2. A randomized study of 2,000 adult Americans found that the average person has three friends.
3. Students in my class are really good at recognizing invalid inferences. BGSU undergraduates must be good at recognizing invalid inferences.
Measurement errors <-- More on this subject in upcoming lectures, so not our focus today
- Sometimes the way we collect our data affects whether we are actually measuring what we think we are measuring.
- Sometimes two properties are closely correlated, and we confuse one (specifically, the causal influence of one) for the other.
- Often with human trials, participants will drop out over the course of the study. This affects the size of the sample, and it may affect whether we can trust the results.
Lesson 4.1: Statistical syllogisms
Syllogism = def. (for our purposes) a two-premise argument
Don't be intimidated by terminology. It's a fancy word for something really simple!
Example of a deductive syllogism:
P1) All humans are mortal
P2) Socrates is human
C) Therefore, Socrates is mortal
This comic will only be funny to those who've taken Phil 1010 (or read some Plato on their own):
Socrates "apologizes"
We're talking about inductive syllogisms, aka, statistical syllogisms. Statistical syllogisms are also two-premise arguments... but you'll need to fill in the implicit premise. Filling in the implicit premise is really simple.
You're given an argument like this, which relies on a generalization:
Most clowns are creepy, so the clown at your niece's birthday party is probably going to be creepy.
Filling in the implicit premise, you wind up with a two-premise argument that looks like this:
P1) Most clowns are creepy
P2) There's going to be a clown at my niece's birthday party
C) The clown at my niece's party will probably be creepy
Statistical syllogisms formalized - general form
P1) Generalization
P2) Instance
C) Conclusion regarding the instance (based on the generalization)
Example 2:
P1) 99% of cats like boxes
P2) I have a box & a cat
C) 99% chance my cat's going to like the box
Assessing statistical syllogisms:
Probability of conclusion being true depends on two things
1) The proportion of the target population that has the trait in question (as indicated by the generalization)
More likely to be true, in decreasing order:
99%...75%... 66%... 51%*
Notice that all these percentages fall under the umbrella term "the majority."
2) Homogeneity of the target population with respect to the trait in question.
In other words: How similar the individuals in a certain group are, with respect to a certain trait.
Related: Are there important subgroups within my target population that affect the likelihood of the conclusion's being true?
The less homogeneous the target population is, the less likely the conclusion is to be true.
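Here's a small sketch of why subgroups matter. The subgroups and all the numbers are invented for illustration: a population where 25% has the trait overall, but where the trait is spread very unevenly.

```python
# Hypothetical subgroups: (share of the target population, proportion with the trait).
# These numbers are made up; they are not real data.
subgroups = {
    "STEM majors": (0.20, 0.65),
    "all other majors": (0.80, 0.15),
}

# The generalization about the whole population is a weighted average.
overall = sum(share * rate for share, rate in subgroups.values())
print(f"Whole population: {overall:.0%} have the trait")

# Once we know which subgroup the individual belongs to, the estimate changes.
for name, (share, rate) in subgroups.items():
    print(f"If the individual is in '{name}': {rate:.0%}")
```

The generalization is true of the group as a whole, but it's a poor guide to any particular individual until we know which subgroup that individual falls into.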
Homework 4.1, Part A
(a) Put the statistical syllogism into standard form and
(b) Identify what relevant sub-groups we'd need to know about to strengthen the inference to the conclusion.
1. 25% of BGSU students scored well on the math component of the SAT, therefore there's a 25% chance that my classmate scored well on the math component of the SAT.
a) P1) Generalization
P2) Instance
C) Conclusion regarding an instance (based on generalization)
b) More information needed? Relevant sub-groups?
2. 51.1% of Americans voted for Obama last election, therefore there's a 51.1% chance that the person sitting next to me on my flight to Texas voted for Obama.
a)
b)
Are you still with me?? Here's a cat video for Halloween:
(Cats who hate their costumes)
Homework 4.1, Part B
- How to manipulate your audience: Averages and percentages vs. absolute numbers
- Depending on whether we express a value as a percentage or as an absolute value we can tell contradictory stories using the exact same data!
Basic math review
1) Calculating percentages ("percentage of X")
Convert percentages to decimals, then multiply.
Example: What is 1% of 20?
1% = 1/100= 0.01
0.01 x 20 = 0.2
2) Calculating percentages ("X out of Y = Z%")
Divide the part (X) by the whole (Y), then multiply by 100
Example: 300 is what percentage of 100,000?
300/100,000 = 0.003
0.003 x 100 = 0.3%
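If you'd rather check your arithmetic with code, the two calculations above can be sketched like this. The function names are my own; there's nothing special about them.

```python
def percent_of(percent, amount):
    """Return `percent`% of `amount`: convert to a decimal, then multiply."""
    return (percent / 100) * amount

def as_percentage(part, whole):
    """Return `part` as a percentage of `whole`: divide, then multiply by 100."""
    return (part / whole) * 100

# The two worked examples from the math review.
print(f"1% of 20 is {percent_of(1, 20):.1f}")
print(f"300 is {as_percentage(300, 100_000):.1f}% of 100,000")
```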
1. Suppose you're the CEO of a company that did 10 billion dollars in annual sales last year. Last year the economy grew by 5% but your sales only grew by 3% compared to the previous year. You have a shareholder meeting coming up and these numbers don't look good. Your sales haven't kept pace with the economy. Thinking about absolute numbers vs. percentages, how can you present the sales numbers to the shareholder meeting in a positive way? Hint: Translate your annual sales into an absolute number.
How are you going to pitch your company's sales data to the shareholders?
4. Here are some statistics on Syrian refugees and US immigration. Obama has set a goal to accept 10,000 Syrian refugees currently residing in Jordanian refugee camps. There are about 10 million Syrian refugees and 500,000 Syrian refugees in Jordan are awaiting resettlement to a new country.
(a) Suppose you write a pro-Syrian immigration blog. Thinking about absolute numbers vs. percentages, how could you present the numbers in a way that makes it seem like the US isn't doing very much to help refugees? Present them in a headline that conveys this point of view.
Your answer:
(b) If you were writing an anti-immigration blog, how would you present the numbers? Present them in a headline that conveys this point of view.
Your answer:
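Before writing your headlines, it may help to see the same figure expressed three ways. This is only a sketch using the numbers given in the question; the variable names are my own, and which framing you pick (and how you word it) is still up to you.

```python
# Figures taken from the question above.
accepted = 10_000                # refugees the US has set a goal to accept
all_refugees = 10_000_000        # approximate total number of Syrian refugees
in_jordan_awaiting = 500_000     # refugees in Jordan awaiting resettlement

# Same number, two different denominators.
share_of_all = accepted / all_refugees * 100
share_of_jordan = accepted / in_jordan_awaiting * 100

print(f"Absolute number: {accepted:,} refugees")
print(f"Share of all Syrian refugees: {share_of_all:.1f}%")
print(f"Share of those awaiting resettlement in Jordan: {share_of_jordan:.1f}%")
```

Nothing about the data changes between the three lines. Only the framing does.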
Key points from today's class (not the cat video):
- Evaluating generalizations
- Sample size: definition and importance
- Hasty generalizations
- Representativeness: definition and importance
- Biased samples
- Evaluating statistical syllogisms
- Form of the (implied) argument
- Reliance on generalizations (averages)
- Importance of relevant sub-groups
- How to manipulate audiences: Absolute vs. average values