
Data Cleaning in Excel

What is Data Cleaning? #

Data Cleaning means fixing messy, incorrect, or inconsistent data so it becomes usable for analysis.

Raw data is often:

  • Duplicated
  • Misformatted
  • Combined in one column
  • Inconsistent

Removing Duplicates #

What are Duplicates? #

Duplicate data means the same record appears more than once in a dataset.

Example:

Name | Email
Alex | a@gmail.com
Alex | a@gmail.com

Why Remove Duplicates? #

  • Avoid wrong analysis
  • Prevent double counting
  • Improve data quality

Steps to Remove Duplicates #

  1. Select your dataset
  2. Go to Data tab
  3. Click Remove Duplicates
  4. Select columns to check
  5. Click OK

Example #

Before:

Alex
John
Alex

After:

Alex
John
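
The same logic can be sketched outside Excel. The Python snippet below is only an illustration, not how Excel implements the feature: it keeps the first occurrence of each record and drops later repeats, comparing only the chosen columns. The function name `remove_duplicates` and the sample email addresses are assumptions made for this example.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each distinct combination of key_columns."""
    seen = set()
    unique_rows = []
    for row in rows:
        # Build the comparison key from the selected columns only
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


# Hypothetical sample data mirroring the Before list above
rows = [
    {"Name": "Alex", "Email": "a@gmail.com"},
    {"Name": "John", "Email": "j@gmail.com"},
    {"Name": "Alex", "Email": "a@gmail.com"},
]

cleaned = remove_duplicates(rows, ["Email"])
print([row["Name"] for row in cleaned])
```

Passing `["Email"]` plays the role of ticking only the Email column in the Remove Duplicates dialog.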

Important Tips #

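  • Always keep a backup copy of the data before removing duplicates, because the duplicate rows are deleted from the sheet
  • Choose the columns that define uniqueness (e.g., Email), since only the selected columns are compared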

Text to Columns #

What is Text to Columns? #

Text to Columns splits the contents of one column into multiple columns.

Example:

"Alex, 25, USA"

→ Split into:

  • Name
  • Age
  • Country

Types #

Type        | Description
Delimited   | Split by comma, space, etc.
Fixed Width | Split by position
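
To make the Fixed Width type concrete, here is a minimal Python sketch of splitting by position. It is an illustration only: the function name `split_fixed_width`, the sample record, and the column widths are all assumptions, and the trimming of padding spaces is an addition for readability.

```python
def split_fixed_width(text, widths):
    """Cut text into fields of the given character widths."""
    fields = []
    start = 0
    for width in widths:
        # Slice out the next field and drop its padding spaces
        fields.append(text[start:start + width].strip())
        start += width
    return fields


# Hypothetical record: 6 characters for name, 3 for age, 3 for country
record = "Alex  25 USA"
print(split_fixed_width(record, [6, 3, 3]))
```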

Steps (Delimited) #

  1. Select column
  2. Go to Data → Text to Columns
  3. Choose Delimited
  4. Select delimiter (comma, space, etc.)
  5. Click Finish

Example #

Before:

Alex,25,USA

After:

Name | Age | Country
Alex | 25  | USA
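
The Delimited split above can be sketched in Python as follows. This is a rough equivalent for illustration, not Excel's implementation; the function name `split_delimited` is an assumption, and the header names are taken from the example.

```python
def split_delimited(cell, delimiter=","):
    """Split one cell's text into separate values at each delimiter."""
    return cell.split(delimiter)


headers = ["Name", "Age", "Country"]
values = split_delimited("Alex,25,USA")

# Pair each header with its value to form one row
row = dict(zip(headers, values))
print(row)
```

Like the Excel feature, this sketch keeps any spaces around the delimiter, so values may still need trimming afterwards.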

Use Cases #

  • Split full names
  • Separate addresses
  • Clean imported data

Flash Fill #

What is Flash Fill? #

Flash Fill detects the pattern in an example you type and fills the rest of the column to match. No formula is needed.

Example #

Dataset: #

Full Name
Alex John

Extract First Name: #

  1. Type Alex manually in the next column
  2. Press Ctrl + E
  3. Excel fills the remaining rows automatically

More Examples #

Extract Last Name: #

John

Combine Names: #

Alex_John
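
Flash Fill infers the rule from the typed example. Outside Excel, the same results require writing the rule out explicitly. The Python sketch below shows the three patterns from the examples, assuming every name is exactly two words separated by a single space; the function names are hypothetical.

```python
def first_name(full_name):
    """Return the text before the first space."""
    return full_name.split(" ")[0]


def last_name(full_name):
    """Return the text after the last space."""
    return full_name.split(" ")[-1]


def combine_names(full_name):
    """Join first and last name with an underscore."""
    return f"{first_name(full_name)}_{last_name(full_name)}"


name = "Alex John"
print(first_name(name), last_name(name), combine_names(name))
```

Names with middle names or extra spaces would need a different rule, which is also where Flash Fill's guesses are worth double-checking.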

Steps #

  1. Type one example of the desired result manually
  2. Press Ctrl + E
  3. Excel fills the remaining rows

Use Cases #

  • Split names
  • Format phone numbers
  • Create custom patterns

Combined Real Example #

Raw Data: #

" Alex John , 25 , USA "

Cleaning Steps: #

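  1. TRIM → remove leading, trailing, and repeated spaces (apply it again after splitting if spaces remain around the delimiters)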
  2. Text to Columns → split data
  3. Flash Fill → format names
  4. Remove Duplicates → clean repeats
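
The four steps can be sketched end to end in Python. This is only an approximation of the Excel workflow under stated assumptions: the helper names `trim` and `clean_rows` are hypothetical, the second input row is invented to show deduplication, and `trim` collapses all whitespace, whereas Excel's TRIM targets the ordinary space character.

```python
def trim(text):
    """Approximate TRIM: drop outer spaces and collapse repeated inner ones."""
    return " ".join(text.split())


def clean_rows(raw_rows):
    """Split, trim, extract the first name, and drop repeated records."""
    seen = set()
    cleaned = []
    for raw in raw_rows:
        # Text to Columns + TRIM: split on commas, then trim each field
        name, age, country = (trim(part) for part in raw.split(","))
        # Flash Fill-style formatting: pull out the first name
        first = name.split(" ")[0]
        record = (name, first, age, country)
        # Remove Duplicates: keep only the first copy of each record
        if record not in seen:
            seen.add(record)
            cleaned.append(record)
    return cleaned


raw_rows = [" Alex John , 25 , USA ", "Alex John,25,USA"]
print(clean_rows(raw_rows))
```

Both input rows reduce to the same record once the stray spaces are gone, so only one survives.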

Summary Table #

Feature           | Purpose              | Shortcut / Menu Path
Remove Duplicates | Delete repeated data | Data → Remove Duplicates
Text to Columns   | Split data           | Data → Text to Columns
Flash Fill        | Auto pattern fill    | Ctrl + E

  • Data Cleaning is essential before analysis
  • Remove Duplicates → clean repeated data
  • Text to Columns → split combined data
  • Flash Fill → automate formatting

These skills apply to almost every real-world dataset.
