Monday, 30 March 2015
Monday, 23 March 2015
Q1 Ten individuals are chosen at random, from a normal population and their weights (in kg) are found to be 63, 63, 66, 67, 68, 69, 70, 70, 71 and 71. In the light of this data set, test the claim that the mean weight in population is 66 kg at 5% level of significance.
Ans. The population is normal, its standard deviation is unknown and the sample is small (n = 10), so a one-sample t-test is appropriate.
H0: μ = 66 kg against H1: μ ≠ 66 kg, at the 5% level of significance.
Sample mean = Σx/n = 678/10 = 67.8 kg
Sum of squared deviations = Σ(x – 67.8)² = 81.6
Sample standard deviation s = √(81.6/9) = 3.01 kg
t = (67.8 – 66) / (3.01/√10) = 1.8/0.952 = 1.89
The critical value of t for 9 degrees of freedom at the 5% level (two-tailed) is 2.262.
Since |t| = 1.89 is less than 2.262, we do not reject H0. The data are consistent with the claim that the mean weight in the population is 66 kg.
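The test for Q1 can be checked with a short Python sketch. It uses only the standard library; the critical value 2.262 is read from a t-table rather than computed, and the variable names are illustrative.

```python
import math
import statistics

weights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]  # kg, from the question
mu0 = 66  # hypothesised population mean

n = len(weights)
mean = statistics.mean(weights)
s = statistics.stdev(weights)          # sample SD (divisor n - 1)
t = (mean - mu0) / (s / math.sqrt(n))  # one-sample t statistic

T_CRIT = 2.262  # two-tailed 5% critical value for 9 d.f., taken from a t-table
reject = abs(t) > T_CRIT

print(f"mean={mean:.1f} s={s:.3f} t={t:.3f} reject H0: {reject}")
```

The null hypothesis is rejected only when |t| exceeds the tabulated value, which is not the case here.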
Q2 I bought two packets of apples, 25 in each packet. The mean and standard deviation of weights of apples in the first packet are 235 and 3; and the mean and standard deviation for the second packet are 237.5 and 4. Write down the mean and standard deviation formulae for all the fifty apples and compute them.
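Ans. Apples in each packet: n1 = n2 = 25
Mean of first packet: x̄1 = 235
Standard deviation of first packet: σ1 = 3
Mean of second packet: x̄2 = 237.5
Standard deviation of second packet: σ2 = 4
Mean and standard deviation for all fifty apples:
Combined mean = (n1·x̄1 + n2·x̄2) / (n1 + n2)
= (25 × 235 + 25 × 237.5) / 50
= 11812.5/50
= 236.25
Combined variance = [n1(σ1² + d1²) + n2(σ2² + d2²)] / (n1 + n2), where d1 = x̄1 – combined mean = –1.25 and d2 = x̄2 – combined mean = 1.25
= [25(9 + 1.5625) + 25(16 + 1.5625)] / 50
= 703.125/50
= 14.0625
Combined standard deviation = √14.0625 = 3.75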
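A minimal Python sketch of the combined mean and standard deviation for the two packets. It assumes the given standard deviations were computed with divisor n; the function name `combine` is illustrative.

```python
import math

def combine(n1, mean1, sd1, n2, mean2, sd2):
    """Pool two groups' means and SDs (SDs assumed to use divisor n)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    d1, d2 = mean1 - mean, mean2 - mean   # shift of each group mean from the pooled mean
    var = (n1 * (sd1**2 + d1**2) + n2 * (sd2**2 + d2**2)) / n
    return mean, math.sqrt(var)

mean, sd = combine(25, 235, 3, 25, 237.5, 4)
print(mean, sd)
```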
Q3 A consumer research organization tests three brands of tires to see how many miles they can be driven before they should be replaced. One tyre of each brand is tested in each of five types of cars. The results (in thousands of miles) are as follows: (10)
Type of car | Brand A | Brand B | Brand C |
I | 6 | 9 | 4 |
II | 3 | 2 | 7 |
III | 2 | 3 | 6 |
IV | 8 | 8 | 5 |
V | 9 | 1 | 8 |
Compute the ANOVA and interpret your result.
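Ans.
STEP 1:
Do the ANOVA table
Each brand is tested once on each type of car, so both brand and type of car are treated as factors (two-way ANOVA with one observation per cell). Assuming the 15 results are stored in a data frame d with columns v (thousands of miles), TR (brand) and car (type of car), the table can be produced in R with:
d.fit <- aov(v ~ TR + car, data = d)
summary(d.fit)
Interpretation:
Makes an ANOVA table of the data set d, analysing if the factors TR and car have a significant effect on v. The function summary shows the ANOVA table.
Calculated by hand from the data:
Brand totals: A = 28, B = 23, C = 30. Car totals: I = 19, II = 12, III = 11, IV = 21, V = 18. Grand total = 81.
Correction factor = 81²/15 = 437.4
Total SS = 543 – 437.4 = 105.6
Brand SS = (28² + 23² + 30²)/5 – 437.4 = 5.2
Car SS = (19² + 12² + 11² + 21² + 18²)/3 – 437.4 = 26.267
Residual SS = 105.6 – 5.2 – 26.267 = 74.133
Source | Df | Sum Sq | Mean Sq | F value |
Brand (TR) | 2 | 5.200 | 2.600 | 0.281 |
Type of car | 4 | 26.267 | 6.567 | 0.709 |
Residuals | 8 | 74.133 | 9.267 | |
Total | 14 | 105.600 | | |
STEP 2:
Decision:
The tabulated 5% critical values are F(2, 8) = 4.46 for brands and F(4, 8) = 3.84 for types of car. Both calculated F values are far below them, so neither effect is significant: the data give no evidence that the three brands differ in mean mileage, or that the type of car affects it.
Interpretation:
Exactly the same as for the “by hand” calculated table. With R we do not have the critical values for a given level, but we have the P-value (the Pr(>F) column): an effect is significant at the 5% level when its P-value is below 0.05.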
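For the tyre data in Q3, the two-way sums of squares and F ratios can be checked with the Python sketch below. It assumes a two-way layout with one observation per cell (brand by type of car); the variable names are illustrative and no critical values or P-values are computed.

```python
# Tyre life in thousands of miles: one row per type of car, columns = brands A, B, C
data = {
    "I":   [6, 9, 4],
    "II":  [3, 2, 7],
    "III": [2, 3, 6],
    "IV":  [8, 8, 5],
    "V":   [9, 1, 8],
}

rows = list(data.values())
r, c = len(rows), len(rows[0])           # 5 car types, 3 brands
grand_total = sum(sum(row) for row in rows)
cf = grand_total**2 / (r * c)            # correction factor

ss_total = sum(x * x for row in rows for x in row) - cf
ss_brand = sum(sum(col) ** 2 for col in zip(*rows)) / r - cf
ss_car = sum(sum(row) ** 2 for row in rows) / c - cf
ss_resid = ss_total - ss_brand - ss_car

df_brand, df_car = c - 1, r - 1
df_resid = df_brand * df_car
ms_resid = ss_resid / df_resid
f_brand = (ss_brand / df_brand) / ms_resid
f_car = (ss_car / df_car) / ms_resid

print(f"brand: SS={ss_brand:.3f} F={f_brand:.3f}")
print(f"car:   SS={ss_car:.3f} F={f_car:.3f}")
print(f"resid: SS={ss_resid:.3f} MS={ms_resid:.3f}")
```

The F ratios are then compared with tabulated values for (2, 8) and (4, 8) degrees of freedom.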
Q4 A building has 11 flats. A sample of 4 flats is to be selected using (i) linear systematic sampling and (ii) circular systematic sampling. List all possible samples for each of these cases separately. (10)
Ans. N = 11
n = 4
N/n = 11/4 = 2.75; the nearest integer is 3
So let the sampling interval be k = 3
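1) Linear systematic sampling: the random start i is chosen from 1 to k = 3, and every 3rd flat after it is taken.
i = 1, sample: 1, 4, 7, 10
i = 2, sample: 2, 5, 8, 11
i = 3, sample: 3, 6, 9
Since N = 11 is not a multiple of k = 3, the third sample contains only 3 flats.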
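2) Circular systematic sampling: the random start i is chosen from 1 to N = 11, and every 3rd flat is taken, going round from flat 11 back to flat 1.
i = 1, sample: 1, 4, 7, 10
i = 2, sample: 2, 5, 8, 11
i = 3, sample: 3, 6, 9, 1
i = 4, sample: 4, 7, 10, 2
i = 5, sample: 5, 8, 11, 3
i = 6, sample: 6, 9, 1, 4
i = 7, sample: 7, 10, 2, 5
i = 8, sample: 8, 11, 3, 6
i = 9, sample: 9, 1, 4, 7
i = 10, sample: 10, 2, 5, 8
i = 11, sample: 11, 3, 6, 9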
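Both listings for Q4 can be generated mechanically. The sketch below is one way to do it in Python, assuming the interval k is the nearest integer to N/n; the function names are illustrative.

```python
def linear_systematic(N, k):
    """All linear systematic samples: start i = 1..k, then every k-th unit up to N."""
    return {i: list(range(i, N + 1, k)) for i in range(1, k + 1)}

def circular_systematic(N, n, k):
    """All circular systematic samples: start i = 1..N, n units, wrapping past N."""
    return {i: [(i - 1 + j * k) % N + 1 for j in range(n)] for i in range(1, N + 1)}

N, n = 11, 4
k = round(N / n)  # 2.75 rounds to 3

for i, s in linear_systematic(N, k).items():
    print("linear  ", i, s)
for i, s in circular_systematic(N, n, k).items():
    print("circular", i, s)
```

The modulo arithmetic is what makes the circular scheme wrap from flat 11 back to flat 1, so every start gives a sample of exactly n flats.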
Q5 Calculate Probabilities for following situations :
a) There are 1000 pages in a book out of which 100 pages are defective. What is the probability that out of first 50 pages 10 pages will be defective?
Ans. Probability Functions
A probability function is a function which assigns probabilities to the values of a random variable.
- All the probabilities must be between 0 and 1 inclusive
- The sum of the probabilities of the outcomes must be 1.
If these two conditions aren’t met, then the function isn’t a probability function. There is no requirement that the values of the random variable only be between 0 and 1, only that the probabilities be between 0 and 1.
Probability Distributions
A listing of all the values the random variable can assume, with their corresponding probabilities, makes a probability distribution.
A note about random variables. A random variable does not mean that the values can be anything (a random number). Random variables have a well defined set of outcomes and well defined probabilities for the occurrence of each outcome. The random refers to the fact that the outcomes happen by chance — that is, you don’t know which outcome will occur next.
Here’s an example probability distribution that results from the rolling of a single fair die.
x | 1 | 2 | 3 | 4 | 5 | 6 | sum |
p(x) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 6/6=1 |
Mean, Variance, and Standard Deviation
Consider the following.
The definitions for population mean and variance used with an ungrouped frequency distribution are:
μ = Σ(x·f)/N
σ² = Σ((x – μ)²·f)/N
Some of you might be confused by only dividing by N. Recall that this is the population variance; the sample variance, which is the unbiased estimator of the population variance, is divided by n – 1.
Since f/N is the relative frequency of x, it can be replaced by the probability p(x). What's even better is that, after expanding the square, the last portion of the variance is the mean squared. So, the two formulas that we will be using are:
μ = Σ x·p(x)
σ² = Σ x²·p(x) – μ²
x | 1 | 2 | 3 | 4 | 5 | 6 | sum |
p(x) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 6/6 = 1 |
x p(x) | 1/6 | 2/6 | 3/6 | 4/6 | 5/6 | 6/6 | 21/6 = 3.5 |
x^2 p(x) | 1/6 | 4/6 | 9/6 | 16/6 | 25/6 | 36/6 | 91/6 = 15.1667 |
The mean is 7/2 or 3.5
The variance is 91/6 – (7/2)^2 = 35/12 = 2.916666…
The standard deviation is the square root of the variance = 1.7078
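The fair-die figures can be reproduced with exact fractions. This is a small Python sketch using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
import math

# Fair six-sided die: each face has probability 1/6
dist = {x: Fraction(1, 6) for x in range(1, 7)}

total = sum(dist.values())                                # must equal 1
mean = sum(x * p for x, p in dist.items())                # sum of x * p(x)
second_moment = sum(x * x * p for x, p in dist.items())   # sum of x^2 * p(x)
variance = second_moment - mean**2
sd = math.sqrt(variance)

print(total, mean, second_moment, variance, round(sd, 4))
```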
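For part (a) itself, one hedged way to put a number on the probability is sketched below. Two models are shown, both assumptions rather than anything stated in the question: a binomial model in which each page is defective independently with p = 100/1000, and a hypergeometric model in which the first 50 pages are treated as a random draw without replacement from the 1000 pages.

```python
from math import comb

TOTAL_PAGES, DEFECTIVE, SAMPLE, K = 1000, 100, 50, 10

# Binomial model: each page defective independently with p = 100/1000
p = DEFECTIVE / TOTAL_PAGES
binomial = comb(SAMPLE, K) * p**K * (1 - p) ** (SAMPLE - K)

# Hypergeometric model: 50 pages drawn without replacement from the 1000
hypergeometric = (
    comb(DEFECTIVE, K) * comb(TOTAL_PAGES - DEFECTIVE, SAMPLE - K)
    / comb(TOTAL_PAGES, SAMPLE)
)

print(f"binomial={binomial:.4f} hypergeometric={hypergeometric:.4f}")
```

Because the sample is only 5% of the book, the two models give similar values, both small.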
Q 5 b) A die is tossed twice. Getting an odd number in at least a toss is termed as a success. Find the probability distribution of number of successes. Also find expected number of successes.
Ans. Taking a success on a toss to mean that the toss shows an odd number, let X = number of successes in the two tosses.
Let S = event that an odd number turns up on a toss = success
Let F = event that no odd number turns up on the toss = failure
In one toss, probability of an odd number turning up = 3/6 = 1/2
P(S) = 1/2 So P(F) = 1 – 1/2 = 1/2
The tosses of the die are independent, so the probabilities can be multiplied.
P(X = 0) = P(F) * P(F) = 1/2 * 1/2 = 1/4
P(X = 1) = P(S) * P(F) + P(F) * P(S) = 1/4 + 1/4 = 1/2
P(X = 2) = P(S) * P(S) = 1/2 * 1/2 = 1/4
The probability distribution is
X | 0 | 1 | 2 | sum |
P(X) | 1/4 | 1/2 | 1/4 | 1 |
Expected number of successes = 0 * 1/4 + 1 * 1/2 + 2 * 1/4 = 1
The probability of at least one odd number in the two tosses is 1 – 1/2 * 1/2 = 1 – 1/4 = 3/4.
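The distribution of the number of odd faces in two tosses can also be found by brute-force enumeration. The Python sketch below assumes a fair die and counts a success whenever a toss shows an odd number.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}

# Enumerate all 36 equally likely outcomes of two tosses
for a, b in product(faces, repeat=2):
    successes = (a % 2) + (b % 2)   # number of odd faces in this outcome
    dist[successes] += Fraction(1, 36)

expected = sum(k * p for k, p in dist.items())
print(dist, expected)
```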
Q 5 c) Find the probability that at most 5 defective fuses will be found in a box of 200, if experience shows that 20% of such fuses are defective.
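Ans. 20% of 200 = 40, so on average 40 fuses in a box are defective.
Let X = number of defective fuses in the box. X follows a binomial distribution with n = 200 and p = 0.2, with mean np = 40 and standard deviation √(npq) = √32 = 5.66.
P(X ≤ 5) = Σ C(200, k) (0.2)^k (0.8)^(200 – k), summed over k = 0, 1, ..., 5
Five defectives is more than six standard deviations below the mean, so this probability is practically zero (of the order of 10⁻¹³).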
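For part (c), the exact binomial sum is easy to evaluate numerically. This Python sketch assumes each of the 200 fuses is defective independently with probability 0.2, as the question's 20% figure suggests.

```python
from math import comb

n, p, at_most = 200, 0.2, 5

# P(X <= 5) for X ~ Binomial(200, 0.2): add up the terms k = 0..5
prob = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(at_most + 1))
print(f"P(X <= {at_most}) = {prob:.3e}")
```

The sum is dominated by the k = 5 term, since each lower term is roughly ten times smaller than the one above it.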
Q6 Following data are given for marks in subject A and B in a certain examination :
Statistic | SUBJECT A | SUBJECT B |
MEAN MARKS | 36 | 85 |
STANDARD DEVIATION | 11 | 8 |
Coefficient of correlation between A and B = ±0.66
- i) Determine the two equations of regression
- ii) Calculate the expected marks in A corresponding to 75 marks obtained in B
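Ans. i)
The Regression Line
With one independent variable, we may write the regression equation as:
Y = a + bX + e
Where Y is an observed score on the dependent variable, a is the intercept, b is the slope, X is the observed score on the independent variable, and e is an error or residual.
We can extend this to any number of independent variables:
Y = a + b1X1 + b2X2 + ... + bkXk + e
Note that we have k independent variables and a slope for each. We still have one error and one intercept. Again we want to choose the estimates of a and b so as to minimize the sum of squared errors of prediction. The prediction equation is:
Y' = a + b1X1 + b2X2 + ... + bkXk
Finding the values of b is tricky for k>2 independent variables, and will be developed after some matrix algebra. It’s simpler for k=2 IVs, which we will discuss here.
For the one variable case, the calculation of b and a was:
b = Σxy/Σx², a = Ȳ – bX̄ (where x and y are deviations from the means)
For the two variable case:
b1 = [(Σx2²)(Σx1y) – (Σx1x2)(Σx2y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
b2 = [(Σx1²)(Σx2y) – (Σx1x2)(Σx1y)] / [(Σx1²)(Σx2²) – (Σx1x2)²]
At this point, you should notice that all the terms from the one variable case appear in the two variable case. In the two variable case, the other X variable also appears in the equation. For example, X2 appears in the equation for b1. Note that terms corresponding to the variance of both X variables occur in the slopes. Also note that a term corresponding to the covariance of X1 and X2 (sum of deviation cross-products) also appears in the formula for the slope.
The equation for a with two independent variables is:
a = Ȳ – b1X̄1 – b2X̄2
This equation is a straightforward generalization of the case for one independent variable.
For the marks in A and B only the one variable case is needed. In terms of the given summary statistics, the slope b = Σxy/Σx² equals r × (s.d. of the dependent variable / s.d. of the independent variable). Writing X for marks in A, Y for marks in B and taking r = +0.66:
Regression of B on A: Y – 85 = 0.66 × (8/11) × (X – 36), i.e. Y = 0.48X + 67.72
Regression of A on B: X – 36 = 0.66 × (11/8) × (Y – 85), i.e. X = 0.9075Y – 41.1375
(With r = –0.66 both slopes change sign.)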
Ans.6 ii) The expected marks in A for given marks in B come from the regression equation of A on B. Writing X for marks in A and Y for marks in B, and taking r = +0.66:
X – 36 = r × (σ_A/σ_B) × (Y – 85) = 0.66 × (11/8) × (Y – 85) = 0.9075 (Y – 85)
For Y = 75:
X = 36 + 0.9075 × (75 – 85) = 36 – 9.075 = 26.925
So about 27 marks are expected in A corresponding to 75 marks in B.
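For Q6, the two regression slopes and the prediction for part ii) follow directly from the summary statistics. The Python sketch below assumes r = +0.66 (the question states ±0.66); the function names are illustrative.

```python
mean_a, mean_b = 36, 85
sd_a, sd_b = 11, 8
r = 0.66   # assumed positive; the question states ±0.66

b_a_on_b = r * sd_a / sd_b     # slope of the regression of A on B
b_b_on_a = r * sd_b / sd_a     # slope of the regression of B on A

def predict_a(marks_b):
    """Expected marks in A for given marks in B (regression of A on B)."""
    return mean_a + b_a_on_b * (marks_b - mean_b)

def predict_b(marks_a):
    """Expected marks in B for given marks in A (regression of B on A)."""
    return mean_b + b_b_on_a * (marks_a - mean_a)

print(round(b_a_on_b, 4), round(b_b_on_a, 4), round(predict_a(75), 3))
```

Both lines pass through the point of means (36, 85), which is why each prediction is written as a shift from the corresponding mean.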
Q7 A sample of 900 members has a mean of 3.4 cm and a standard deviation of 2.61 cm. Test whether the sample is from a large population of mean 3.25 cm and standard deviation 2.61 cm. If the population is normal and its mean is unknown, find the 95% confidence interval for the population mean.
Ans.
N_s = sample size = 900. μ_s = mean of the sample = 3.4 cm
σ_s = Standard deviation of the sample = 2.61 cm
Population mean μ₀ = 3.25 cm
Population standard deviation = σ₀ = 2.61 cm
H0: the sample comes from a population with mean μ₀ = 3.25 cm.
Since the sample is large and σ₀ is known, the test statistic is
z = (μ_s – μ₀) / [σ₀ / √N_s ]
z = (3.4 – 3.25) * √900 / 2.61 = 1.7241
(Student’s t with 899 degrees of freedom takes the same value here and is practically indistinguishable from z.)
Find the probability that -1.7241 <= Z <= 1.7241 from the normal distribution table or from a website that calculates these. The probability is 0.915, so the two-tailed P-value is 1 – 0.915 = 0.085.
Since |z| = 1.7241 is less than 1.96 (equivalently, 0.085 is greater than 0.05), H0 is not rejected at the 5% level of significance: the sample can be regarded as drawn from a population with mean 3.25 cm.
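The test statistic and its two-tailed P-value for Q7 can be checked with the standard library's `statistics.NormalDist`. This is a sketch under the assumption that a large-sample z-test with known σ = 2.61 is intended.

```python
import math
from statistics import NormalDist

n, sample_mean = 900, 3.4
mu0, sigma = 3.25, 2.61

z = (sample_mean - mu0) / (sigma / math.sqrt(n))   # standardised difference
p_value = 2 * (1 - NormalDist().cdf(abs(z)))       # two-tailed

print(f"z={z:.4f} p={p_value:.3f} reject at 5%: {p_value < 0.05}")
```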
The population is normal. We don’t know its mean. We know the sample size, mean and standard deviation.
Z = (μ_s – μ₀) / (σ_s / √N_s)
Here σ_s = 2.61 cm and N_s = 900, so the standard error is σ_s / √N_s = 2.61/30 = 0.087 cm
The value z of the normal distribution variable Z for which the probability
P( -z < Z < z) = 95% = 0.95 is z = 1.96
-1.96 * 0.087 <= (μ_s – μ₀) <= 1.96 * 0.087
– 0.1705 <= (μ_s – μ₀) <= 0.1705 cm
Taking μ_s = 3.4 cm, the 95% confidence interval for the population mean is
3.2295 <= μ₀ <= 3.5705 cm
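The 95% confidence interval for Q7 can be computed as sample mean ± z × (standard deviation / √n). The Python sketch below assumes the normal (z) interval is appropriate for n = 900 and obtains z from `NormalDist.inv_cdf` rather than a table.

```python
import math
from statistics import NormalDist

n, sample_mean, sd = 900, 3.4, 2.61
confidence = 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96
margin = z * sd / math.sqrt(n)                        # z times the standard error
lower, upper = sample_mean - margin, sample_mean + margin

print(f"{confidence:.0%} CI: ({lower:.4f}, {upper:.4f})")
```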
Q8 The mean yield for one acre plot is 662 kg with a s.d. 32 kg. Assuming normal distribution, how many one acre plots in a batch of 1000 plots would you expect to have yield between 600 and 750 kg?
Ans. Let X = yield of a one acre plot, normally distributed with mean 662 kg and s.d. 32 kg.
z1 = (600 – 662)/32 = –1.94
z2 = (750 – 662)/32 = 2.75
P(600 < X < 750) = P(–1.94 < Z < 2.75) = P(Z < 2.75) – P(Z < –1.94) = 0.9970 – 0.0262 = 0.9708
Expected number of plots = 1000 × 0.9708 ≈ 971
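For Q8, the expected number of plots can be checked with `statistics.NormalDist`. This is a minimal sketch, assuming yields follow a normal distribution with mean 662 kg and s.d. 32 kg as stated in the question.

```python
from statistics import NormalDist

yield_dist = NormalDist(mu=662, sigma=32)   # kg per one acre plot
plots = 1000

prob = yield_dist.cdf(750) - yield_dist.cdf(600)   # P(600 < X < 750)
expected_plots = plots * prob

print(f"P={prob:.4f} expected plots={expected_plots:.0f}")
```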