Descriptive Statistics

What is Descriptive Statistics? #

Descriptive statistics is used to summarize and describe the main features of a dataset.

It answers questions such as:

  • What is the average?
  • How spread out is the data?
  • Are there unusual values?

Sample Dataset #

Let’s use this data:

10, 20, 30, 40, 50

Mean (Average) #

Definition #

The mean is the sum of all values divided by the number of values.

Formula #

$$\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$$

Example #

$$(10 + 20 + 30 + 40 + 50) / 5 = 30$$

Mean = 30

Use #

  • General-purpose average, most representative when the data has no extreme values
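
The calculation above can be reproduced in a few lines of Python. This is a minimal sketch using the sample dataset; the variable names are illustrative, and `statistics.mean` comes from the Python standard library.

```python
from statistics import mean

data = [10, 20, 30, 40, 50]

# Manual calculation: sum of all values divided by the number of values.
manual_mean = sum(data) / len(data)

# The standard library computes the same quantity.
library_mean = mean(data)

print(manual_mean)                  # 30.0
print(manual_mean == library_mean)  # True
```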

Median #

Definition #

The median is the middle value when the data is sorted. When the number of values is even, it is the average of the two middle values.

Example #

10, 20, 30, 40, 50

Middle value = 30

Even Dataset Example #

10, 20, 30, 40

Median = (20 + 30) / 2 = 25

Use #

  • Better than the mean when the data has outliers, because extreme values do not shift the middle value
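
Both the odd and even cases can be sketched in Python as follows. The helper `manual_median` is a hypothetical name used only for illustration; `statistics.median` from the standard library gives the same results.

```python
from statistics import median

odd_data = [10, 20, 30, 40, 50]
even_data = [10, 20, 30, 40]

def manual_median(values):
    """Return the middle value of the sorted data.

    For an even count, average the two middle values.
    """
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(manual_median(odd_data))   # 30
print(manual_median(even_data))  # 25.0
```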

Mode #

Definition #

The mode is the value that appears most frequently in the data.

Example #

10, 20, 20, 30, 40

Mode = 20

Use #

  • Useful for categorical data
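
As a rough illustration, the standard library's `statistics.mode` works for both numeric and categorical values. The `colors` list and the tied dataset below are made-up examples, not part of the original data.

```python
from statistics import mode, multimode

data = [10, 20, 20, 30, 40]
print(mode(data))  # 20

# Hypothetical categorical data: the mode still works where a mean would not.
colors = ["red", "blue", "red", "green"]
print(mode(colors))  # red

# When several values tie for most frequent, multimode returns all of them.
tied = [10, 10, 20, 20, 30]
print(multimode(tied))  # [10, 20]
```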

Range #

Definition #

The range is the difference between the maximum and minimum values.

Formula #

$$\text{Range} = \text{Max} - \text{Min}$$

Example #

$$50 - 10 = 40$$

Range = 40

Use #

  • Shows the total spread of the data, but depends only on the two most extreme values
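
In Python the range is a one-line calculation. The sketch below also uses a second list with one extreme value (the same one used in the outlier example later on this page) to show how a single value changes the result.

```python
data = [10, 20, 30, 40, 50]
with_extreme = [10, 20, 30, 40, 100]

# Range = maximum value minus minimum value.
data_range = max(data) - min(data)
extreme_range = max(with_extreme) - min(with_extreme)

print(data_range)     # 40
print(extreme_range)  # 90
```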

Variance #

Definition #

Variance measures how far the values are from the mean, as the average of the squared differences.

Formula (Conceptual) #

$$\text{Variance} = \frac{\sum (x - \text{mean})^2}{n}$$

Meaning #

  • High variance → data spread out
  • Low variance → data close to mean
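
A worked example on the sample dataset may help here. The formula above divides by n, which is usually called the population variance; the sample variance divides by n - 1 instead. The sketch below computes both, and the standard library functions `statistics.pvariance` and `statistics.variance` correspond to the two versions.

```python
from statistics import pvariance, variance

data = [10, 20, 30, 40, 50]
n = len(data)
mean_value = sum(data) / n  # 30.0

# Squared distance of each value from the mean: 400, 100, 0, 100, 400.
squared_deviations = [(x - mean_value) ** 2 for x in data]

# Population variance divides by n (the formula shown above).
population_variance = sum(squared_deviations) / n

# Sample variance divides by n - 1.
sample_variance = sum(squared_deviations) / (n - 1)

print(population_variance)  # 200.0
print(sample_variance)      # 250.0
```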

Standard Deviation #

Definition #

The standard deviation is the square root of the variance, which expresses the spread in the same units as the data.

Formula #

$$\text{Std Dev} = \sqrt{\text{Variance}}$$

Meaning #

  • Low → data is consistent
  • High → data is spread out

Example Insight #

  • Std Dev = 2 → values are tightly grouped around the mean
  • Std Dev = 20 → values are widely spread (for data on the same scale)
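
To make the comparison concrete, the sketch below computes the population standard deviation for the sample dataset and for a second, tightly grouped list. The `tight` list is a made-up example with the same mean of 30.

```python
from math import sqrt
from statistics import pstdev

wide = [10, 20, 30, 40, 50]   # sample dataset, population variance = 200
tight = [28, 29, 30, 31, 32]  # hypothetical data, population variance = 2

# Standard deviation is the square root of the variance.
wide_std = sqrt(200)
tight_std = sqrt(2)

print(round(wide_std, 2))       # 14.14
print(round(tight_std, 2))      # 1.41

# statistics.pstdev computes the same values directly from the data.
print(round(pstdev(wide), 2))   # 14.14
print(round(pstdev(tight), 2))  # 1.41
```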

Quartiles #

Definition #

Quartiles divide the sorted data into four equal parts.

Types #

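| Quartile | Meaning |
|----------|---------|
| Q1 | 25% of the data lies at or below this value |
| Q2 | Median (50% of the data lies at or below this value) |
| Q3 | 75% of the data lies at or below this value |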

Example #

10, 20, 30, 40, 50
  • Q1 = 20
  • Q2 = 30
  • Q3 = 40
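
Quartile values depend on the calculation method, and different tools may give slightly different answers for small datasets. As a sketch, Python's `statistics.quantiles` with `method="inclusive"` reproduces the values above, while its default `"exclusive"` method gives different results for Q1 and Q3.

```python
from statistics import quantiles

data = [10, 20, 30, 40, 50]

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
print(q1, q2, q3)  # 20.0 30.0 40.0

# The default "exclusive" method interpolates differently.
e1, e2, e3 = quantiles(data, n=4, method="exclusive")
print(e1, e2, e3)  # 15.0 30.0 45.0
```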

Interquartile Range (IQR) #

Formula #

$$\text{IQR} = Q3 - Q1$$

Example #

$$40 - 20 = 20$$

IQR = 20
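
The same result can be computed in Python. This sketch assumes the "inclusive" quartile method, which matches the quartiles used on this page; the function name `iqr` is illustrative.

```python
from statistics import quantiles

def iqr(values):
    """Interquartile range: Q3 minus Q1 (inclusive quartile method assumed)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [10, 20, 30, 40, 50]
print(iqr(data))  # 20.0
```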

Outliers #

Definition #

Outliers are values that lie very far from the rest of the data.

Detection (Using IQR) #

Lower limit: $Q1 - 1.5 \times \text{IQR}$

Upper limit: $Q3 + 1.5 \times \text{IQR}$

Example #

If:

  • Q1 = 20
  • Q3 = 40
  • IQR = 20

Lower = 20 - (1.5 × 20) = 20 - 30 = -10
Upper = 40 + (1.5 × 20) = 40 + 30 = 70

Any value below -10 or above 70 is an outlier.

Example Data #

10, 20, 30, 40, 100

Using Q1 = 20 and Q3 = 40 for this data, 100 is above the upper limit of 70, so it is an outlier.
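
The IQR rule can be sketched as a small Python function. The name `find_outliers` and the `factor` parameter are illustrative, and the quartiles again assume the "inclusive" method. With a different quartile method the limits change, so the same value may or may not be flagged.

```python
from statistics import quantiles

def find_outliers(values, factor=1.5):
    """Return (lower_limit, upper_limit, outliers) using the IQR rule."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - factor * iqr
    upper = q3 + factor * iqr
    # Keep only the values that fall outside the two limits.
    outliers = [x for x in values if x < lower or x > upper]
    return lower, upper, outliers

data = [10, 20, 30, 40, 100]
lower, upper, outliers = find_outliers(data)
print(lower, upper)  # -10.0 70.0
print(outliers)      # [100]
```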

Summary Table #

| Measure | Purpose |
|---------|---------|
| Mean | Average |
| Median | Middle value |
| Mode | Most frequent |
| Range | Spread (max - min) |
| Variance | Data dispersion |
| Std Dev | Spread (interpretable) |
| Quartiles | Data distribution |
| Outliers | Extreme values |

When to Use What #

  • Use Mean → roughly symmetric (normal) data without outliers
  • Use Median → skewed data or data with outliers
  • Use Std Dev → measuring variability around the mean
  • Use IQR → outlier detection
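
A short comparison shows why these guidelines matter. The sketch below puts the sample dataset next to the outlier example from earlier and computes each measure for both; the helper `iqr` is an illustrative name and assumes the "inclusive" quartile method.

```python
from statistics import mean, median, pstdev, quantiles

def iqr(values):
    """Q3 minus Q1, using the inclusive quartile method (an assumption)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

clean = [10, 20, 30, 40, 50]
skewed = [10, 20, 30, 40, 100]  # same data with one extreme value

# The mean is pulled toward the extreme value (30 becomes 40),
# while the median stays at 30.
clean_mean, skewed_mean = mean(clean), mean(skewed)
clean_median, skewed_median = median(clean), median(skewed)

# The standard deviation more than doubles, while the IQR stays at 20.
clean_std, skewed_std = pstdev(clean), pstdev(skewed)
clean_iqr, skewed_iqr = iqr(clean), iqr(skewed)

print(round(clean_std, 2), round(skewed_std, 2))  # 14.14 31.62
```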