-
Summarisation
Start by determining basic insights about the data:
-
Mean
- Avoid when the data contains large outliers, since they pull the mean away from the typical value
-
Median
-
Mode
-
IQR
-
Standard Deviation
-
Class Imbalance
- Occurs when one class dominates or a specific class has a low count
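
The summary statistics listed above can be sketched with Python's standard library. This is a minimal illustration rather than a prescribed workflow: the `values` and `labels` lists are invented sample data, and the "inclusive" quartile method is an assumption, since other quartile conventions give slightly different IQR values.

```python
import statistics
from collections import Counter

# Hypothetical sample data, invented purely for illustration.
values = [2, 3, 3, 4, 5, 6, 100]               # 100 is a deliberate outlier
labels = ["a", "a", "a", "a", "a", "a", "b"]   # class "b" is rare

mean = statistics.mean(values)      # sensitive to the outlier
median = statistics.median(values)  # robust to the outlier
mode = statistics.mode(values)      # most frequent value

# quantiles(n=4) returns the three quartile cut points (Q1, Q2, Q3).
# The "inclusive" method is an assumed convention.
q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

std_dev = statistics.stdev(values)  # sample standard deviation

# Class imbalance: count each label and check the share of the rarest class.
class_counts = Counter(labels)
minority_share = min(class_counts.values()) / len(labels)

print("mean:", mean, "median:", median, "mode:", mode)
print("IQR:", iqr, "std dev:", std_dev)
print("class counts:", dict(class_counts), "minority share:", minority_share)
```

With this sample the single outlier lifts the mean well above the median, which is the situation the note on the mean warns about.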
-
-
Visualisation
Select a subset of data and choose an appropriate graph to visualise it:
-
Histograms
- For Numerical Data
-
Bar Charts
- For Categorical Data
-
Scatter Plots
- To reveal correlation between two numerical variables and to spot outliers
-
Line Plots
- Show how a value changes across ordered, connected data points (for example, over time)
-
Box Plots
- To determine the distribution, outliers and quartiles of a numerical dimension
-
Contour Plots
- Show how a value varies across two continuous dimensions; used for temporal data when one of those dimensions is time
-
Correlation Matrix
-
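
The numbers behind the correlation matrix entry above can be sketched as follows. This assumes the matrix holds pairwise Pearson correlation coefficients, which is the common default but not the only option. The helper names `pearson` and `correlation_matrix` and the sample columns are hypothetical, introduced only for this illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Nested dict holding the coefficient for every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}

# Hypothetical numerical columns, invented for illustration.
data = {
    "x": [1, 2, 3, 4, 5],
    "double_x": [2, 4, 6, 8, 10],  # rises with x
    "neg_x": [5, 4, 3, 2, 1],      # falls as x rises
}

matrix = correlation_matrix(data)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near 1 or -1 indicate a strong linear relationship, and values near 0 indicate little linear relationship. A plotting library would typically render this matrix as a heatmap.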