5. Find the coefficient of correlation between the variables X and Y using Karl Pearson's method:
X | 1 | 3 | 4 | 6 | 8 | 9 | 11 | 14
Y | 1 | 2 | 4 | 4 | 5 | 7 | 8  | 9
[Ans: r = 0.977]
Answers
Explanation:
Karl Pearson Correlation Coefficient Formula
The coefficient of correlation rxy between two variables x and y, for the bivariate dataset (xi, yi), where i = 1, 2, 3, ..., N, is given by:
$$r(x,y) = \frac{\operatorname{cov}(x,y)}{\sigma_x \, \sigma_y}$$
where,
⇒ cov(x,y): the covariance between x and y
$$\operatorname{cov}(x,y) = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{N} = \frac{\sum x_i y_i}{N} - \bar{x}\,\bar{y}$$
Here, $\bar{x}$ and $\bar{y}$ are simply the respective means of the distributions of x and y.
⇒ σx and σy are the standard deviations of the distributions x and y.
$$\sigma_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}} = \sqrt{\frac{\sum x_i^2}{N} - \bar{x}^2}$$
$$\sigma_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{N}} = \sqrt{\frac{\sum y_i^2}{N} - \bar{y}^2}$$
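The definitions above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `pearson_r` is an assumption made for this example, and the divide-by-N (population) form of the covariance and standard deviations is used, as in the formulas above. The data are the X and Y values from question 5.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r as cov(x, y) / (sigma_x * sigma_y), using divide-by-N forms."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance: mean of the products of deviations from the means.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    # Standard deviations: square root of the mean squared deviation.
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if sigma_x == 0 or sigma_y == 0:
        raise ValueError("r is undefined when either variable is constant")
    return cov_xy / (sigma_x * sigma_y)

X = [1, 3, 4, 6, 8, 9, 11, 14]
Y = [1, 2, 4, 4, 5, 7, 8, 9]
print(round(pearson_r(X, Y), 3))  # 0.977
```

For this dataset the means are 7 and 5, the covariance is 10.5, and the variances are 16.5 and 7, which gives r = 10.5 / √115.5 ≈ 0.977, matching the stated answer.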
Alternate Formula
If the data are given as a grouped (class-interval) bivariate frequency distribution, you may use the following formulae:
⇒ cov(x,y): the covariance between x and y
$$\operatorname{cov}(x,y) = \frac{\sum_{i,j} x_i \, y_j \, f_{ij}}{N} - \bar{x}\,\bar{y}$$
Here, $\bar{x}$ and $\bar{y}$ are simply the respective means of the distributions of x and y.
⇒ σx and σy are the standard deviations of the distributions x and y.
$$\sigma_x = \sqrt{\frac{\sum_i f_{io} \, x_i^2}{N} - \bar{x}^2}$$
$$\sigma_y = \sqrt{\frac{\sum_j f_{oj} \, y_j^2}{N} - \bar{y}^2}$$
where,
xi: The central value of the i’th class of x
yj: The central value of the j’th class of y
fio, foj: Marginal frequencies of x and y, respectively
fij: Frequency of the (i,j)th cell
In any case, the following equality must always hold:
$$\text{Total frequency} = N = \sum_{i,j} f_{ij} = \sum_i f_{io} = \sum_j f_{oj}$$
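The grouped-data formulae can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `grouped_pearson_r` and the small 2×2 frequency table are hypothetical and not taken from the text. Row totals of the table play the role of the marginal frequencies of x, and column totals the marginal frequencies of y.

```python
from math import sqrt

def grouped_pearson_r(x_mid, y_mid, freq):
    """Pearson's r from a two-way frequency table.

    x_mid[i]   : central value of the i-th class of x
    y_mid[j]   : central value of the j-th class of y
    freq[i][j] : frequency of the (i, j)-th cell
    """
    row_tot = [sum(row) for row in freq]        # marginal frequencies of x
    col_tot = [sum(col) for col in zip(*freq)]  # marginal frequencies of y
    n = sum(row_tot)                            # total frequency N
    assert n == sum(col_tot)                    # both marginals must total N

    mean_x = sum(f * x for f, x in zip(row_tot, x_mid)) / n
    mean_y = sum(f * y for f, y in zip(col_tot, y_mid)) / n

    # cov(x, y) = sum over cells of x_i * y_j * f_ij / N, minus mean_x * mean_y
    cov_xy = sum(
        freq[i][j] * x_mid[i] * y_mid[j]
        for i in range(len(x_mid))
        for j in range(len(y_mid))
    ) / n - mean_x * mean_y

    sigma_x = sqrt(sum(f * x * x for f, x in zip(row_tot, x_mid)) / n - mean_x ** 2)
    sigma_y = sqrt(sum(f * y * y for f, y in zip(col_tot, y_mid)) / n - mean_y ** 2)
    return cov_xy / (sigma_x * sigma_y)

# Hypothetical table: two classes of x (midpoints 0 and 1), two of y.
table = [[2, 1],
         [1, 2]]
print(round(grouped_pearson_r([0, 1], [0, 1], table), 3))  # 0.333
```

In this toy table N = 6, both means are 0.5, the covariance is 2/6 − 0.25 ≈ 0.0833, and both standard deviations are 0.5, so r = 0.0833 / 0.25 ≈ 0.333.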
A Single Formula for Discrete Datasets –
$$r_{xy} = \frac{N \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{N \sum x_i^2 - \left(\sum x_i\right)^2} \; \sqrt{N \sum y_i^2 - \left(\sum y_i\right)^2}}$$
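The single formula needs only five running sums, which makes it convenient for hand or programmatic calculation. The sketch below applies it to the data of question 5; the function name `pearson_r_from_sums` is an assumption made for this example.

```python
from math import sqrt

def pearson_r_from_sums(xs, ys):
    """Pearson's r from N, sum(x), sum(y), sum(xy), sum(x^2) and sum(y^2)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_xx - sum_x ** 2) * sqrt(n * sum_yy - sum_y ** 2)
    return numerator / denominator

X = [1, 3, 4, 6, 8, 9, 11, 14]
Y = [1, 2, 4, 4, 5, 7, 8, 9]
print(round(pearson_r_from_sums(X, Y), 3))  # 0.977
```

With N = 8, Σx = 56, Σy = 40, Σxy = 364, Σx² = 524 and Σy² = 256, the numerator is 8·364 − 56·40 = 672 and the denominator is √1056 · √448 ≈ 687.81, so r ≈ 0.977, in agreement with the stated answer.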