Hey, folks! In this article, we will be focusing on **Correlation Regression analysis to find the correlation between variables** in Python.

So, let us begin!

## What is Correlation Regression Analysis?

**Correlation Regression Analysis** is an important step in pre-processing a dataset for modeling. For any dataset, it is important to understand how the variables relate to one another and how each of them affects the target/response variable.

This is where Correlation Regression Analysis comes into the picture.

Correlation Analysis helps us analyze the following aspects of the data:

- **Relationship between the independent variables, i.e. the information depicted by them and their correlation.**
- **Effect of the independent variables on the dependent variable.**

It is crucial for any developer to understand the correlation between the independent variables.

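The correlation coefficient ranges from **-1 to 1**. Values near 1 indicate a strong positive relationship, values near -1 a strong negative relationship, and values near 0 little or no linear relationship. A high absolute correlation between two independent variables indicates that both variables carry largely the same information.

This gives rise to multicollinearity, and we can drop either of those variables.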
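To make the idea of dropping one of two highly correlated variables concrete, here is a minimal sketch. The helper `find_correlated_pairs`, the threshold of 0.9, and the toy columns are all assumptions made for illustration; they are not part of the dataset used later in this article.

```python
import numpy as np
import pandas as pd


def find_correlated_pairs(df, threshold=0.9):
    """Return (col_a, col_b, r) for every column pair whose |r| exceeds threshold."""
    corr = df.corr()
    cols = corr.columns
    pairs = []
    for i in range(len(cols)):
        # Start at i + 1: upper triangle only, so the diagonal and duplicates are skipped
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) > threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


# Synthetic data, invented for illustration only
rng = np.random.default_rng(0)
income = rng.normal(50, 10, size=200)
toy = pd.DataFrame({
    "income": income,
    "income_in_thousands": income * 1000,  # same information, different unit
    "age": rng.normal(35, 8, size=200),    # generated independently of income
})

pairs = find_correlated_pairs(toy, threshold=0.9)
print(pairs)

# Drop the second column of each flagged pair
to_drop = {b for _, b, _ in pairs}
reduced = toy.drop(columns=sorted(to_drop))
print(list(reduced.columns))
```

Which column of a pair to drop, and what threshold to use, are judgment calls that depend on the dataset and the model; the sketch simply keeps the first column of each pair.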

Having understood the concept of Correlation, let us now try to implement it practically in the upcoming section.

## Finding Correlation between variables

Let us start by importing the dataset. You can find the dataset **here**. We load the dataset into the environment using the `read_csv()` function.

Next, we select the numeric variables of the dataset and store them separately, because correlation works only on numeric data. We then apply the `corr()` function to these columns to obtain the correlation matrix.

```python
import pandas

# Load the dataset
data = pandas.read_csv("Bank_loan.csv")

# Use correlation analysis to depict the relationship between the numeric/continuous variables
numeric_col = ['age', 'employ', 'address', 'income', 'debtinc', 'creddebt', 'othdebt']
corr = data.loc[:, numeric_col].corr()
print(corr)
```

**Output:**
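The numeric columns above are listed by hand. As a possible alternative, pandas can pick them by dtype with `select_dtypes()`. The sketch below uses a small made-up DataFrame: some column names mirror those in the article, but the values and the `ed_level` column are invented for illustration.

```python
import pandas as pd

# Hypothetical mini-dataset; all values are invented
data = pd.DataFrame({
    "age": [41, 27, 40, 24, 33],
    "income": [120, 31, 55, 28, 64],
    "debtinc": [9.3, 17.3, 5.5, 12.1, 8.0],
    "ed_level": ["college", "school", "school", "college", "school"],  # non-numeric
})

# Keep only the numeric columns, then build the correlation matrix
numeric_data = data.select_dtypes(include="number")
corr = numeric_data.corr()  # Pearson correlation is the default method

print(list(numeric_data.columns))
print(corr.round(2))
```

This avoids typos in a hand-written column list, but it also picks up numeric columns that are really identifiers or category codes, so the selection is still worth checking.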

We can use the `seaborn.heatmap()` function to visualize the correlation matrix, with each cell colored according to its correlation value, as shown below:

```python
import seaborn as sn
sn.heatmap(corr, annot=True)
```

**Output:**
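Since Pearson correlation can be negative, it may help to pin the color scale to the full range of -1 to 1 and to hide the redundant upper triangle of the symmetric matrix. The following is a sketch under assumptions: the toy data, the `coolwarm` colormap, and the output file name `corr_heatmap.png` are arbitrary choices for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sn

# Synthetic data, invented for illustration only
rng = np.random.default_rng(1)
x = rng.normal(size=100)
toy = pd.DataFrame({
    "x": x,
    "x_plus_noise": x + rng.normal(scale=0.3, size=100),
    "neg_x": -x,                       # perfectly negatively correlated with x
    "noise": rng.normal(size=100),
})
corr = toy.corr()

# Boolean mask that is True above the diagonal; masked cells are not drawn
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(6, 5))
sn.heatmap(corr, mask=mask, annot=True, fmt=".2f",
           vmin=-1, vmax=1, cmap="coolwarm", ax=ax)
ax.set_title("Correlation matrix (toy data)")
fig.tight_layout()
fig.savefig("corr_heatmap.png")
```

With `vmin=-1` and `vmax=1`, the same color always means the same correlation value, which makes heatmaps from different datasets comparable. In an interactive session, `plt.show()` can replace the `savefig` call and the `Agg` backend line can be dropped.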

## Conclusion

With this, we have come to the end of this topic. Feel free to comment below in case you have any questions.

For more such posts related to Python, stay tuned @ Python with JournalDev and till then, Happy Learning!!