SOCI 3P11

QUANTITATIVE DATA ANALYSIS I

REVIEW FOR MID-TERM TEST

February 2015

TEST DATE:

A make-up test will only be granted if one of the following is produced: (a) a note from

a medical doctor on official letterhead fully documenting your illness; (b) a death

certificate or a published newspaper obituary that documents the death of an immediate

family member; or (c) a wedding invitation for the wedding of an immediate family

member that takes place the day of or 24 hours prior to your scheduled test. NO

OTHER DOCUMENTATION WILL BE ACCEPTED.

During the test each student may make use of a non-programmable electronic

calculator. Smart phones, tablets and laptop computers are strictly prohibited.

Calculators may not be shared between students.

———————————-

1. Identify the level of measurement for each of the following variables. In addition, (a) indicate

the possible measures of central tendency, and (b) underline the MOST

APPROPRIATE measure of central tendency (in the case of numerical variables, assume

normality).

(a) Interest rate

(b) Province of residence

(c) Total family income

(d) Self-esteem score on a standardized instrument

(e) Hours of training

(f) Profit margin

(g) Birth order

(h) Degree of test anxiety (high, medium, low)

(i) Academic letter grade

(j) Marital status

(k) Religious affiliation

(l) Depression score on a standardized instrument

(m) Favourite Buffalo Sabres hockey player

(n) Number of abuse incidents reported

(o) Height

(p) Educational aspirations (e.g., university, college, high school, etc.)

(q) Ethnicity

(r) Number of unsafe injection drug use episodes in the past month (0; between 1 and

5; between 6 and 10; 10 or more)

SOCI 3P11 Dr. Kevin Gosine, Instructor

2. The scores for eight (8) students on a quiz out of 10 are as follows: 6, 6, 8, 7, 3, 5, 4, and 9.

(a) Construct a tabular frequency distribution for these data. In this table, include the raw

counts, percentages, and the cumulative percentage.

(b) Find or compute the following statistics: mean, median, mode, range, interquartile

range, variance, and standard deviation.

3. (i) At Brock University, a group of 32 political science students were randomly selected

and asked the average number of hours they spent watching politically oriented news

and talk show programs on television each week. The sum of the squared deviations

from the mean in this sample of students = 357. The variance of the population is not

known.

(a) What formula for the variance should you use to calculate an estimate of the

population variance?

(b) Calculate the best estimate of the population variance from the information given.

(ii) You have discovered that the population for question 3(i) totals 32 individuals. You

therefore have the entire population.

(a) Given this information, what formula should one use to calculate the population

variance?

(b) What is the population variance?

(c) Comment on the difference between the variances found in Q. 3(i) and 3(ii). Why is there

a difference? Does the difference make sense?

4. (a) Identify three characteristics of the normal distribution.

(b) Identify two variables in your area of research/scholarly interest that you feel would

be normally distributed. Draw hypothetical distributions and clearly label the X

(horizontal) and Y (vertical) axes.

(c) Sketch a positively skewed distribution. Indicate where the three measures of central

tendency would fall in the distribution. Clearly explain why the mean would fall

where you have placed it relative to the other two measures of central tendency.

(d) Identify two variables in your area of interest that you feel would NOT be normally

distributed. Draw hypothetical distributions, and label the X and Y axes. Describe the

non-normal shape (whatever that might be).

5. (a) Sketch a standard normal distribution indicating the appropriate values for the mean

and the standard deviation units.

(b) Explain what a z-score is. Carefully explain how it differs from the first step in the

calculation of the standard deviation (i.e., where you subtract the mean from each raw

score).

For questions 6, 7, & 8 please summarize your final answer to each question in sentence

form.

6. A researcher wanted to find out the mean income of Brock University undergraduate

students who were employed in the labour market while attending school. Looking only at

those students in the population who are employed part-time or full-time (let’s say, N =

20,000), she calculated a mean weekly income of $295 and a standard deviation of $20.

The data for this population are normally distributed.

(a) Sketch a standard normal distribution complete with standard deviation confidence

intervals. Identify the mean (i.e., $295) and indicate the income dollar values

associated with the boundaries for the three SD confidence intervals. In addition,

indicate the percentage of data that we know fall within each interval on either side of

the mean.

(b) What percentage of students in the Brock undergraduate population earn between $335

and $355 a week?

(c) Chatting with an undergraduate student on the bus to Brock, you learn that she holds a

part-time job waiting tables. You’re curious to know how much she makes, but feel it

would be rude to ask. If you were to ask, what is the probability that this student would

report earning between $255 and $275 a week?

(d) What percentage of students earn less than $255 or more than $335 a week?

7. The graduate programme director in the Faculty of Education at Brock wanted to develop

a demographic profile of the Ph.D. candidates in the faculty. Where age is concerned, her

research assistant calculated a mean of 30 years with a standard deviation of 2 years.

Assuming that doctoral students’ ages are normally distributed, what

percentage of candidates in the programme:

(a) are older than 35 years of age?

(b) are younger than 27 years of age?

(c) are between 25 and 29 years?

(d) are between 32 and 34 years?

(e) Suppose there are 78 students in the Ph.D. programme; for questions 7(a), (b), (c), and (d)

above, indicate the exact number of students who would fall beyond or between the

designated ages/scores.

8. The mean weekly allowance of students in Ms. Attieh’s second grade class is $11 with a

standard deviation of $3.50. Assuming that the kids’ allowances are normally distributed,

how much allowance does little Timmy get from his parents if:

(a) he’s in the top 10% of students according to weekly allowance?

(b) he’s in the top 2% of students according to weekly allowance?

(c) Mary Lou is in the bottom 3% of students where weekly allowance is concerned. What

is the maximum amount of weekly allowance Mary Lou receives from her parents?

Unless otherwise stated, state your final answer to all probability questions as a percentage.

Also, state your final answer in sentence form.

9. What is the probability of tossing 3 heads in a row with a fair coin?

10. A player rolls a pair of dice, one red and the other green.

(a) Find the probability of rolling a 2 on the red die and a 3 on the green die.

(b) Find the probability of rolling a 2 on one die and a 3 on the other die.

11. Two letters are chosen at random from the English alphabet. If y is considered to be a

consonant, find the probability that

(a) both are vowels

(b) both are consonants

(Note: For this problem, assume sampling with replacement.)

12. A bag contains 4 white, 3 blue, and 6 red marbles. A marble is drawn from the bag,

replaced, and another marble is drawn. Find the probability that:

(a) both marbles are red

(b) both marbles are blue

(c) the first marble is red and the second is blue

(d) one marble is red and the other is blue

(e) neither is red

13. Do question 12 assuming that the first marble is not replaced.

14. Five-thousand (5000) raffle tickets have been sold and Sidney has purchased 4 of

them. There are 10 prizes. What is the probability that Sidney will win:

(a) first prize?

(b) first and second prize?

15. A dime and a quarter are tossed, and a die is rolled. What is the probability of getting:

(a) two heads and a 6?

(b) a head on the dime, a tail on the quarter, and a 2?

(c) a head on the quarter, a tail on the dime, and a number greater than 2?

16. What is the probability that the sum of two dice will be greater than 8 if you

rolled a 6 on the first die?

17. On the current active team roster, the New York Rangers have 22 players listed, 7 of

whom are defencemen. The Toronto Maple Leafs have 23 players, 7 of whom play

defence. If we combine the rosters of the two hockey teams and select a player at

random, what is the probability of choosing a New York Ranger or a defenceman?

18. A social work researcher examined outcomes that provide insight into the relationship

between psychotherapy and type of professional visited by the client – a psychiatrist or a

psychologist. The study results are summarized in the following table of probabilities:

Psychotherapy          Type of Professional
Outcome          Psychologist   Psychiatrist   Total
Good                 .30            .12         .42
Poor                 .15            .43         .58
Total                .45            .55        1.00

(a) What is the probability that a person will have success in psychotherapy?

(b) Given that a person saw a psychiatrist, what is the probability that the

person will have a good psychotherapy outcome?

(c) What is the probability that a person saw a psychologist?

(d) Given that a person has a poor psychotherapy outcome, what is the probability

that s/he saw a psychologist?

19. A scholar of organizational behaviour wants to investigate the effectiveness of two forms

of leadership within a large bureaucracy. After an extensive review of both goal

achievement and client satisfaction, the scholar determines the following probabilities:

- the probability of an organization possessing Type A leadership is 0.7

- for those organizations possessing Type A leadership, the probability of meeting established goals is 0.6

- for organizations characterized by Type B leadership, the probability of meeting established goals is 0.65

- for organizations possessing Type A leadership, the probability of a client being satisfied is 0.4

- for organizations possessing Type B leadership, the probability of a client being satisfied is 0.3

(a) Produce two probability trees – one for leadership type/goal achievement, and one

for leadership type/client satisfaction.

(b) What is the probability of:

(i) an organization with leadership Type B achieving its goals?

(ii) an organization satisfying its clients?

(iii) an organization possessing Type A leadership, given that it has achieved its

goals?

(iv) an organization possessing Type B leadership, given that its clients are not

satisfied?

20. Suppose a fair coin is flipped three (3) times – heads is considered a “success” and tails is

considered a “failure.”

(a) Using the binomial formula, compute the probability associated with each possible

number of successes in three trials (i.e., 0 successes in 3 trials; 1 success in 3 trials; 2

successes in 3 trials; 3 successes in 3 trials). Construct a table to display your results.

(Note: Express probabilities as proportions.)

(b) Construct a binomial probability distribution histogram based on the probabilities

that you computed in Q. 20(a). Label the X and Y axes.

(c) Compute the mean and standard deviation for this distribution.

(d) What is the probability of getting heads at least once in three flips?

21. If you roll a die 1000 times, how many times would you expect to roll a 4? Explain your

answer making explicit reference to the law of large numbers. (Note: You do not need to

perform any calculations for this question.)

22. A new UFC fighter has four fights scheduled in 2015 (the first four fights of his UFC

career). Odds makers give him a 25% probability of winning each fight.

(a) Complete the blanks in the table below. When answering this question, feel free to

take the shortest route possible to achieve the answers, but show your work

somewhere on this page.

(b) Determine the probability that the rookie fighter will win 2 or more fights in 2015.

(Show your work and state your final answer in sentence form.)

Number of wins in four fights:    Probability

0 wins                            ________
1 win                             ________
2 wins                            ________
3 wins                            ________
4 wins                            ________

Total probability:                ________

ANSWERS TO SELECTED QUESTIONS:

2. (b) Mean = 6

Median = 6

Mode = 6

Range = 6

IQR = 4

Variance = 3.5

Std deviation = 1.87
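
The answers to 2(b) can be checked with a short Python sketch. It assumes the standard-library statistics module and treats the eight scores as a complete population (dividing by N), which is the convention that reproduces the variance and standard deviation listed above. The IQR is left out because its value depends on which quartile convention is used.

```python
import statistics

# Quiz scores from question 2.
scores = [6, 6, 8, 7, 3, 5, 4, 9]

mean = statistics.mean(scores)            # 48 / 8
median = statistics.median(scores)        # average of the two middle scores
mode = statistics.mode(scores)            # most frequent score
value_range = max(scores) - min(scores)   # highest minus lowest

# Population formulas: sum of squared deviations divided by N.
# Dividing by N - 1 instead would give a variance of 4.0.
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print(mean, median, mode, value_range, variance, round(std_dev, 2))
```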

3. i. (b) Estimated variance = 11.5

ii. (b) Population variance = 11.2
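
As a rough illustration of question 3, the only thing that changes between parts (i) and (ii) is the divisor. The variable names below are made up for this sketch.

```python
# Sum of squared deviations and number of students from question 3.
sum_sq_dev = 357
n = 32

# Part (i): the 32 students are a sample, so divide by n - 1.
sample_variance = sum_sq_dev / (n - 1)

# Part (ii): the 32 students are the whole population, so divide by N.
population_variance = sum_sq_dev / n

print(round(sample_variance, 1), round(population_variance, 1))
```

The sample estimate is slightly larger because dividing by n - 1 compensates for the tendency of a sample to understate the spread of the population.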

6. (b) 2.15%

(c) 13.59%

(d) 4.56%

7. (a) 0.62%

(b) 6.68%

(c) 30.2%

(d) 13.6%

(e) 0.4836; 5.2; 23.6; 10.6
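
The question 7 answers can be reproduced, under the stated normality assumption, with the NormalDist class from Python's standard-library statistics module. This sketch uses exact areas rather than a printed z-table, so the head counts in part (e) may differ from the key in the last digit.

```python
from statistics import NormalDist

# Ages of Ph.D. candidates in question 7: mean 30, standard deviation 2.
ages = NormalDist(mu=30, sigma=2)

older_than_35 = 1 - ages.cdf(35)                  # z = +2.5
younger_than_27 = ages.cdf(27)                    # z = -1.5
between_25_and_29 = ages.cdf(29) - ages.cdf(25)   # z from -2.5 to -0.5
between_32_and_34 = ages.cdf(34) - ages.cdf(32)   # z from +1 to +2

n_students = 78  # part (e): expected number of students in each region
for label, p in [("(a)", older_than_35), ("(b)", younger_than_27),
                 ("(c)", between_25_and_29), ("(d)", between_32_and_34)]:
    print(label, f"{p:.2%}", round(p * n_students, 1))
```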

8. (a) 15.48 (b) 18.18 (c) 4.42
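
Question 8 runs the other way: from a percentage back to a raw score. A minimal sketch, again assuming statistics.NormalDist, uses the inverse of the cumulative distribution. Because the key rounds z to two decimals from a table, the values here can differ from it by about a cent.

```python
from statistics import NormalDist

# Weekly allowances in question 8: mean $11, standard deviation $3.50.
allowance = NormalDist(mu=11, sigma=3.50)

# Top 10% starts where 90% of the class falls below.
top_10_cutoff = allowance.inv_cdf(0.90)

# Top 2% starts where 98% of the class falls below.
top_2_cutoff = allowance.inv_cdf(0.98)

# Bottom 3% ends where 3% of the class falls below.
bottom_3_cutoff = allowance.inv_cdf(0.03)

print(round(top_10_cutoff, 2), round(top_2_cutoff, 2), round(bottom_3_cutoff, 2))
```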

9. 12.5% 10. (a) 2.8% (b) 5.6% 11. (a) 3.7% (b) 65.2%

12. (a) 21.3% (b) 5.3% (c) 10.7% (d) 21.3% (e) 29%

13. (a) 19.2% (b) 3.8% (c) 11.5% (d) 23.1% (e) 27%
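
Questions 12 and 13 differ only in whether the first marble goes back into the bag. The sketch below works with exact fractions; the helper names p_pair and answers are invented for this illustration.

```python
from fractions import Fraction

# Bag contents from question 12.
counts = {"white": 4, "blue": 3, "red": 6}
total = sum(counts.values())  # 13 marbles


def p_pair(first, second, replace):
    """Probability of drawing colour `first` and then colour `second`."""
    p_first = Fraction(counts[first], total)
    if replace:
        return p_first * Fraction(counts[second], total)
    # Without replacement, one marble of the first colour is gone.
    left = counts[second] - (1 if first == second else 0)
    return p_first * Fraction(left, total - 1)


def answers(replace):
    not_red = ("white", "blue")
    return {
        "a": p_pair("red", "red", replace),
        "b": p_pair("blue", "blue", replace),
        "c": p_pair("red", "blue", replace),
        # One red and one blue can occur in either order.
        "d": p_pair("red", "blue", replace) + p_pair("blue", "red", replace),
        # Neither red: both draws come from the white and blue marbles.
        "e": sum(p_pair(x, y, replace) for x in not_red for y in not_red),
    }


q12 = answers(replace=True)    # question 12: with replacement
q13 = answers(replace=False)   # question 13: without replacement

for part in "abcde":
    print(part, f"{float(q12[part]):.1%}", f"{float(q13[part]):.1%}")
```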

14. (a) 0.08% (b) 0.000048% 15. (a) 4.2% (b) 4.2% (c) 16.7% 16. 67%
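
For the dice questions (10 and 16), one way to check an answer is to list all 36 equally likely outcomes and count. This is only a sketch of that counting approach, not a substitute for the multiplication and conditional probability rules.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes for two fair dice.
rolls = list(product(range(1, 7), repeat=2))

# Question 10(a): red (first) shows 2 and green (second) shows 3.
p_10a = Fraction(sum(1 for r, g in rolls if (r, g) == (2, 3)), len(rolls))

# Question 10(b): a 2 on one die and a 3 on the other, in either order.
p_10b = Fraction(sum(1 for r, g in rolls if {r, g} == {2, 3}), len(rolls))

# Question 16: keep only outcomes where the first die is 6,
# then count how many of those have a sum greater than 8.
first_is_six = [(a, b) for a, b in rolls if a == 6]
p_16 = Fraction(sum(1 for a, b in first_is_six if a + b > 8), len(first_is_six))

print(f"{float(p_10a):.1%}", f"{float(p_10b):.1%}", f"{float(p_16):.0%}")
```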

17. 64% 18. (a) 42% (b) 22% (c) 45% (d) 26%
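
For question 18, the marginal and conditional probabilities all come from the four cells of the table. A hedged sketch follows, with the cell values copied from the question and a hypothetical helper named marginal.

```python
# Joint probabilities from the table in question 18:
# (outcome, professional) -> probability.
joint = {
    ("good", "psychologist"): 0.30,
    ("good", "psychiatrist"): 0.12,
    ("poor", "psychologist"): 0.15,
    ("poor", "psychiatrist"): 0.43,
}


def marginal(index, value):
    """Row total (index 0 = outcome) or column total (index 1 = professional)."""
    return sum(p for key, p in joint.items() if key[index] == value)


p_good = marginal(0, "good")                          # part (a)
p_psychologist = marginal(1, "psychologist")          # part (c)

# Conditional probability = joint cell / total of the given row or column.
p_good_given_psychiatrist = (
    joint[("good", "psychiatrist")] / marginal(1, "psychiatrist")   # part (b)
)
p_psychologist_given_poor = (
    joint[("poor", "psychologist")] / marginal(0, "poor")           # part (d)
)

print(f"{p_good:.0%}", f"{p_good_given_psychiatrist:.0%}",
      f"{p_psychologist:.0%}", f"{p_psychologist_given_poor:.0%}")
```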

19. (b) i. 65% ii. 37% iii. 68.3% iv. 33%
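
The question 19 answers follow from the probability trees: multiply along the branches, add across branches for a total, and divide for a "given that" question. The sketch below is one way to lay that out; the variable names are assumptions made for this illustration.

```python
# Probabilities stated in question 19.
p_a = 0.7                 # Type A leadership
p_b = 1 - p_a             # Type B leadership (the only other type)
p_goal = {"A": 0.6, "B": 0.65}        # P(goals met | leadership type)
p_satisfied = {"A": 0.4, "B": 0.3}    # P(client satisfied | leadership type)

# (i) Goals met for a Type B organization is read straight off the tree.
p_goal_given_b = p_goal["B"]

# (ii) Total probability of satisfied clients: add the two "satisfied" branches.
p_sat = p_a * p_satisfied["A"] + p_b * p_satisfied["B"]

# (iii) P(Type A | goals met): the A-and-goals branch over all goals-met branches.
p_goal_total = p_a * p_goal["A"] + p_b * p_goal["B"]
p_a_given_goal = p_a * p_goal["A"] / p_goal_total

# (iv) P(Type B | not satisfied): the B-and-not-satisfied branch over all
# not-satisfied branches.
p_b_given_not_sat = p_b * (1 - p_satisfied["B"]) / (1 - p_sat)

print(f"{p_goal_given_b:.0%}", f"{p_sat:.0%}",
      f"{p_a_given_goal:.1%}", f"{p_b_given_not_sat:.0%}")
```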

20. (a) 3 successes = .125; 2 successes = .375; 1 success = .375; 0 successes = .125

(c) Mean = 1.5; s = .87 (d) .875

22. (a) 0 wins = .3164; 1 win = .4219; 2 wins = .2109; 3 wins = .0469; 4 wins = .0039;

Total probability = 1

(b) .2617
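
The binomial answers to questions 20 and 22 can be checked with a small helper built on math.comb from the standard library. The function name binomial_pmf is made up for this sketch, and probabilities are printed to six decimals so that the rounding to the values in the key is visible.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


# Question 22: four fights, 25% chance of winning each.
wins = {k: binomial_pmf(k, 4, 0.25) for k in range(5)}
total_probability = sum(wins.values())
p_two_or_more = sum(p for k, p in wins.items() if k >= 2)

# Question 20: three flips of a fair coin.
heads = {k: binomial_pmf(k, 3, 0.5) for k in range(4)}
mean_heads = 3 * 0.5                  # n * p
sd_heads = sqrt(3 * 0.5 * 0.5)        # sqrt(n * p * (1 - p))
p_at_least_one_head = 1 - heads[0]    # complement of "no heads"

for k, p in wins.items():
    print(k, f"{p:.6f}")
print(f"{p_two_or_more:.6f}", mean_heads, round(sd_heads, 2), p_at_least_one_head)
```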

Note: You may approach the course instructor or the TAs for assistance with questions for

which answers are not provided, but only after you have made an honest attempt to answer

the questions yourself.