In the vast landscape of data analysis, understanding how variables relate to one another is crucial. The Pearson correlation coefficient is a fundamental statistical tool that measures the linear relationship between two continuous variables.
Table of Contents
- What is Correlation?
- Pearson Correlation Coefficient
- Mathematics Behind Pearson Correlation Coefficient
- Mathematical Example
- Code for Pearson Correlation Coefficient
- Conclusion
What is Correlation?
Correlation is a statistical measure of the strength and direction of the relationship between two variables. It is an invaluable tool for determining whether an increase in one variable corresponds to an increase or a decrease in another.
Pearson Correlation Coefficient
The Pearson Correlation Coefficient (denoted as $r$) measures the strength and direction of the linear relationship between two continuous variables. The value of $r$ ranges from $-1$ to $+1$:
- $r = +1$: Perfect positive linear relationship
- $r = 0$: No linear relationship
- $r = -1$: Perfect negative linear relationship
Graph Representation
Mathematics Behind Pearson's Correlation Coefficient
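Pearson's $r$ assesses how much two variables change together relative to how much they change independently. The formula for finding the value of $r$ is:
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$
Where:
- $x$ and $y$ are the two variables.
- $\bar{x}$ and $\bar{y}$ are their respective mean values.
- $x_i$ and $y_i$ are individual data points, and $n$ is the number of data points.
This formula normalizes the covariance between $x$ and $y$ by the product of their standard deviations, ensuring that the value of $r$ always remains between $-1$ and $+1$.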
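To make the normalization explicit, the relationship can be written out step by step. This is a sketch using the population definitions of covariance and standard deviation (an assumption; the sample definitions with $n - 1$ lead to the same result because the factors cancel in the same way):

$$
r = \frac{\operatorname{cov}(x, y)}{\sigma_x \, \sigma_y}
  = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2} \; \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2}}
  = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
$$

The bound on $r$ follows from the Cauchy-Schwarz inequality applied to the deviations:

$$
\left| \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \right|
\le \sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}
\quad\Longrightarrow\quad -1 \le r \le +1
$$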
Interpreting the Coefficient
Understanding the value of $r$ involves both its sign and its magnitude.
- Sign (+/-): indicates the direction of the relationship.
  - Positive: when $x$ increases, $y$ also tends to increase.
  - Negative: when $x$ increases, $y$ tends to decrease.
- Magnitude: the absolute value $|r|$ represents the strength of the relationship.
  - 0.00 to 0.19: Very weak
  - 0.20 to 0.39: Weak
  - 0.40 to 0.59: Moderate
  - 0.60 to 0.79: Strong
  - 0.80 to 1.00: Very Strong
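The interpretation rules above can be sketched as two small C helpers. The function names `correlation_direction` and `correlation_strength` are hypothetical, and treating each band's upper edge as exclusive (so 0.195 counts as "Very weak") is an assumption, since the bands are only listed to two decimal places.

```c
#include <math.h>

/* Hypothetical helper: direction of the relationship from the sign of r. */
const char *correlation_direction(double r)
{
    if (r > 0.0)
        return "Positive";
    if (r < 0.0)
        return "Negative";
    return "None";
}

/* Hypothetical helper: strength label from the magnitude |r|.
   Assumption: each band's upper edge is exclusive.
   Returns "Undefined" for NaN or for values outside [-1, 1]. */
const char *correlation_strength(double r)
{
    double magnitude = fabs(r);

    if (isnan(r) || magnitude > 1.0)
        return "Undefined";
    if (magnitude < 0.20)
        return "Very weak";
    if (magnitude < 0.40)
        return "Weak";
    if (magnitude < 0.60)
        return "Moderate";
    if (magnitude < 0.80)
        return "Strong";
    return "Very strong";
}
```

Both helpers are meant to be called from your own `main`; for example, a value of -0.85 would be reported as "Negative" and "Very strong".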
Graph
Mathematical Example
Let us take an example of Price versus Demand:
Demand | Price |
---|---|
65 | 67 |
66 | 68 |
67 | 65 |
67 | 68 |
68 | 72 |
69 | 72 |
70 | 69 |
72 | 71 |
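- Solution:
Considering Demand as $x$ and Price as $y$, the means are $\bar{x} = 544 / 8 = 68$ and $\bar{y} = 552 / 8 = 69$. Then:
$$\sum (x_i - \bar{x})(y_i - \bar{y}) = 24$$
and
$$\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2} = \sqrt{36 \times 44} \approx 39.799$$
so the correlation coefficient will be:
$$r = \frac{24}{39.799} \approx 0.6030$$
This falls in the 0.60 to 0.79 band, which indicates a strong positive correlation.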
Code for Pearson Correlation Coefficient
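#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main(void)
{
    int n;

    printf("Enter the number of elements -> ");
    if (scanf("%d", &n) != 1 || n < 2)
    {
        printf("Please enter a whole number of at least 2\n");
        return EXIT_FAILURE;
    }

    /* Allocate the arrays only after n is known */
    double *y = malloc(n * sizeof *y);
    double *x = malloc(n * sizeof *x);
    if (x == NULL || y == NULL)
    {
        printf("Out of memory\n");
        free(x);
        free(y);
        return EXIT_FAILURE;
    }

    printf("Enter the value of y followed by x\n");
    for (int i = 0; i < n; i++)
    {
        if (scanf("%lf %lf", &y[i], &x[i]) != 2)
        {
            printf("Invalid input\n");
            free(x);
            free(y);
            return EXIT_FAILURE;
        }
    }

    /* Accumulators must start at zero before summing */
    double sum_x = 0.0;
    double sum_y = 0.0;
    for (int i = 0; i < n; i++)
    {
        sum_x += x[i];
        sum_y += y[i];
    }
    double mean_x = sum_x / n; // Mean value for x
    double mean_y = sum_y / n; // Mean value for y

    double sxx = 0.0; // Sum of squared deviations of x
    double syy = 0.0; // Sum of squared deviations of y
    double sxy = 0.0; // Sum of products of deviations
    for (int i = 0; i < n; i++)
    {
        double dx = x[i] - mean_x;
        double dy = y[i] - mean_y;
        sxx += dx * dx;
        syy += dy * dy;
        sxy += dx * dy;
    }
    free(x);
    free(y);

    /* r is undefined when either variable is constant (zero variance) */
    if (sxx == 0.0 || syy == 0.0)
    {
        printf("Correlation is undefined: one variable is constant\n");
        return EXIT_FAILURE;
    }

    double cor = sxy / sqrt(sxx * syy);
    printf("The Correlation Coefficient is %lf\n\n", cor);

    if (cor > 0)
    {
        printf("Positively Correlated\n");
    }
    else if (cor < 0)
    {
        printf("Negatively Correlated\n");
    }
    else
    {
        printf("No linear relation found\n");
    }
    return 0;
}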
- Output
Conclusion
The Pearson Correlation Coefficient offers a straightforward method to quantify the linear relationship between two continuous variables. By mastering Pearson's $r$ and recognizing when to use alternative measures, one can enhance data analysis proficiency and derive more accurate insights from any data.