What is Descriptive Statistics? #
Descriptive statistics summarize and describe the main features of a dataset.
They answer questions such as:
- What is the average?
- How spread out is the data?
- Are there unusual values?
Sample Dataset #
The examples below use this dataset unless a different one is shown:
10, 20, 30, 40, 50
Mean (Average) #
Definition #
The mean is the sum of all values divided by the number of values.
Formula #
$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$
Example #
(10 + 20 + 30 + 40 + 50) / 5 = 150 / 5 = 30
Mean = 30
Use #
- A general-purpose measure of the center when the data has no extreme values
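As a minimal sketch, the calculation above can be reproduced in Python. The variable names are illustrative, and `statistics.mean` from the standard library is included only as a cross-check.

```python
from statistics import mean

data = [10, 20, 30, 40, 50]

# Sum of all values divided by the number of values
manual_mean = sum(data) / len(data)

# The standard library should agree with the manual calculation
library_mean = mean(data)

print(manual_mean, library_mean)
```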
Median #
Definition #
The middle value when the data is sorted. With an even number of values, it is the average of the two middle values.
Example #
10, 20, 30, 40, 50
Middle value = 30
Even Dataset Example #
10, 20, 30, 40
Median = (20 + 30) / 2 = 25
Use #
- Preferred over the mean when the data has outliers or is skewed, because extreme values barely affect it
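The odd and even cases can be sketched as follows. `manual_median` is a hypothetical helper written for illustration; `statistics.median` is the standard-library equivalent.

```python
from statistics import median

odd_data = [50, 10, 40, 20, 30]   # deliberately unsorted
even_data = [10, 20, 30, 40]

def manual_median(values):
    """Return the middle value, or the average of the two middle values."""
    ordered = sorted(values)          # the median is defined on sorted data
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:         # odd count: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two

odd_result = manual_median(odd_data)
even_result = manual_median(even_data)

print(odd_result, even_result)
print(median(odd_data), median(even_data))
```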
Mode #
Definition #
The value that appears most frequently. A dataset can have more than one mode.
Example #
10, 20, 20, 30, 40
Mode = 20
Use #
- Useful for categorical data, where a mean is not defined
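A short sketch using Python's standard library (3.8 or later assumed). The `colors` list is made-up illustrative data, added only to show the categorical case and a tie between two modes.

```python
from statistics import mode, multimode

data = [10, 20, 20, 30, 40]
numeric_mode = mode(data)            # the single most frequent value

# Hypothetical categorical data with two equally frequent values
colors = ["red", "blue", "red", "blue", "green"]
color_modes = multimode(colors)      # every value tied for most frequent

print(numeric_mode)
print(color_modes)
```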
Range #
Definition #
Difference between maximum and minimum values
Formula #
$\text{Range} = \text{Max} - \text{Min}$
Example #
50−10=40
Range = 40
Use #
- Gives a quick sense of spread, but depends only on the two most extreme values, so a single outlier can distort it
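In Python the range is a one-line calculation. The second list is the outlier dataset used later in this article, included here as an illustration of how one extreme value changes the result.

```python
data = [10, 20, 30, 40, 50]
data_range = max(data) - min(data)

# Same data with the last value replaced by an extreme one
with_outlier = [10, 20, 30, 40, 100]
outlier_range = max(with_outlier) - min(with_outlier)

print(data_range, outlier_range)
```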
Variance #
Definition #
Measures how far values are from the mean, calculated as the average of the squared differences from the mean
Formula (Conceptual) #
$\text{Variance} = \frac{\sum (x - \text{mean})^2}{n}$
Meaning #
- High variance → data spread out
- Low variance → data close to mean
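As a sketch, here is the formula applied to the sample dataset. Dividing by n gives the population variance (`statistics.pvariance`). Many tools default to the sample variance, which divides by n − 1 (`statistics.variance`), so it is worth checking which one a library reports.

```python
from statistics import pvariance, variance

data = [10, 20, 30, 40, 50]
n = len(data)
data_mean = sum(data) / n

# Squared distance of each value from the mean
squared_deviations = [(x - data_mean) ** 2 for x in data]

# Population variance: divide by n, as in the formula above
manual_variance = sum(squared_deviations) / n

print(squared_deviations)     # 400, 100, 0, 100, 400
print(manual_variance)        # 1000 / 5 = 200
print(pvariance(data))        # population variance, divides by n
print(variance(data))         # sample variance, divides by n - 1
```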
Standard Deviation #
Definition #
The square root of the variance, expressed in the same units as the data
Formula #
$\text{Std Dev} = \sqrt{\text{Variance}}$
Meaning #
- Low → data is consistent
- High → data is spread out
Example Insight #
For two datasets measured on the same scale:
- Std Dev = 2 → values are tightly grouped around the mean
- Std Dev = 20 → values are widely spread
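A minimal sketch for the sample dataset, assuming the population form (variance divided by n). `statistics.pstdev` is the standard-library equivalent of taking the square root by hand.

```python
import math
from statistics import pstdev, pvariance

data = [10, 20, 30, 40, 50]

# Square root of the population variance
manual_std = math.sqrt(pvariance(data))
library_std = pstdev(data)

print(round(manual_std, 2), round(library_std, 2))
```

For this data the variance is 200, so the standard deviation is about 14.14, in the same units as the values themselves.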
Quartiles #
Definition #
Values that divide the sorted data into four equal parts
Types #
| Quartile | Meaning |
|---|---|
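| Q1 | 25% of the data lies at or below this value |
| Q2 | Median (50% of the data lies at or below this value) |
| Q3 | 75% of the data lies at or below this value |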
Example #
10, 20, 30, 40, 50
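- Q1 = 20
- Q2 = 30
- Q3 = 40

These values use linear interpolation between data points. Other quartile conventions can give slightly different Q1 and Q3 values on small datasets.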
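The sketch below uses `statistics.quantiles` (Python 3.8 or later). Choosing `method="inclusive"` is an assumption made here because it reproduces the quartiles shown above. The function's default, `method="exclusive"`, gives different Q1 and Q3 on this small dataset.

```python
from statistics import quantiles

data = [10, 20, 30, 40, 50]

# n=4 splits the data into four parts and returns the three cut points
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)            # Q1 = 20, Q2 = 30, Q3 = 40

# The default convention treats the data as a sample of a larger population
e1, e2, e3 = quantiles(data, n=4, method="exclusive")
print(e1, e2, e3)            # Q1 = 15, Q2 = 30, Q3 = 45
```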
Interquartile Range (IQR) #
Formula #
$\text{IQR} = Q3 - Q1$
Example #
40−20=20
IQR = 20
Outliers #
Definition #
Values that are very far from the rest of the data
Detection (Using IQR) #
Lower limit: $Q1 - 1.5 \times \text{IQR}$
Upper limit: $Q3 + 1.5 \times \text{IQR}$
Example #
If:
- Q1 = 20
- Q3 = 40
- IQR = 20
Lower = 20 − (1.5 × 20) = 20 − 30 = −10
Upper = 40 + (1.5 × 20) = 40 + 30 = 70
Any value below −10 or above 70 is an outlier
Example Data #
10, 20, 30, 40, 100
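Q1 and Q3 are still 20 and 40 for this data, so the limits are unchanged. 100 is an outlier because it is above the upper limit of 70.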
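The detection rule can be sketched as a small function. `iqr_outliers` is a hypothetical helper name, and the `"inclusive"` quartile method is an assumption chosen so that Q1 and Q3 match the values used in this article.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return (lower_limit, upper_limit, outliers) using the IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Keep only the values that fall outside the two limits
    outliers = [x for x in values if x < lower or x > upper]
    return lower, upper, outliers

lower, upper, outliers = iqr_outliers([10, 20, 30, 40, 100])
print(lower, upper, outliers)
```

The multiplier `k` is left as a parameter because 1.5 is a convention rather than a fixed law. A larger value flags only more extreme points.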
Summary Table #
| Measure | Purpose |
|---|---|
| Mean | Average |
| Median | Middle value |
| Mode | Most frequent |
| Range | Spread (max-min) |
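| Variance | Dispersion (average squared distance from the mean) |
| Std Dev | Spread, in the same units as the data |
| Quartiles | Split the sorted data into four equal parts |
| IQR | Spread of the middle 50% of the data |
| Outliers | Extreme values |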
When to Use What #
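- Use Mean → roughly symmetric data without outliers
- Use Median → skewed data or data with outliers
- Use Std Dev → measuring variability around the mean
- Use IQR → measuring spread robustly and detecting outliers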
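The first two guidelines can be illustrated by comparing the sample dataset with the outlier dataset from this article. This is only a sketch: a single extreme value shifts the mean but leaves the median where it was.

```python
from statistics import mean, median

symmetric = [10, 20, 30, 40, 50]
with_outlier = [10, 20, 30, 40, 100]

# On symmetric data the mean and median agree
sym_mean, sym_median = mean(symmetric), median(symmetric)

# One extreme value pulls the mean upward; the median does not move
out_mean, out_median = mean(with_outlier), median(with_outlier)

print(sym_mean, sym_median)
print(out_mean, out_median)
```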

