How to find and delete near-duplicate Excel rows using Levenshtein similarity
- Step 1: Upload your file. Drop your Excel or CSV file onto the Fuzzy Deduplicator tool.
- Step 2: Select key column. Enter the name of the column containing the values to deduplicate on (e.g. company_name).
- Step 3: Set threshold. Choose a similarity threshold between 50% and 100%. 85% is a good starting point for company names.
- Step 4: Review and download. Review the list of removed rows and their match scores, then download the clean file.
Frequently asked questions
What similarity algorithm is used?
Levenshtein edit distance, normalized by the length of the longer string to give a 0–100% similarity score, where 100% means the two values are identical.
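To make the scoring concrete, here is a minimal TypeScript sketch of a normalized Levenshtein score. It is an illustration under stated assumptions, not the tool's actual source: the function names `levenshtein` and `similarityPercent` are hypothetical, the score is assumed to be computed as (longer length minus distance) divided by longer length, and no case folding or whitespace trimming is applied.

```typescript
// Levenshtein distance via dynamic programming, keeping only two rows in memory.
function levenshtein(a: string, b: string): number {
  // Split by Unicode code point so characters outside the BMP count once.
  const s = Array.from(a);
  const t = Array.from(b);

  // prev[j] holds the distance between the first i-1 chars of s and the first j chars of t.
  let prev: number[] = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j] + 1,        // delete a character from s
          curr[j - 1] + 1,    // insert a character into s
          prev[j - 1] + cost  // substitute, or keep a matching character
        )
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Similarity as a percentage, normalized by the length of the longer string.
function similarityPercent(a: string, b: string): number {
  const longer = Math.max(Array.from(a).length, Array.from(b).length);
  // Assumption: two empty strings are treated as identical.
  if (longer === 0) return 100;
  return (100 * (longer - levenshtein(a, b))) / longer;
}

console.log(similarityPercent("Acme Corp", "Acme Corp.")); // 90
```

In this sketch, "Acme Corp" and "Acme Corp." differ by one edit over a longer length of 10, so they score 90% and would count as near-duplicates at an 85% threshold.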
Does it keep the first or last occurrence?
The first occurrence in each cluster is kept. All subsequent near-duplicates are removed.
Can I preview matches before deleting?
Yes — the results panel shows each removed row paired with the representative row it matched.
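The keep-first rule and the paired preview described in the answers above can be sketched as a single greedy pass over the rows. This is a hedged illustration rather than the tool's real implementation: `dedupeRows`, `Row`, `Match`, and the sample data are hypothetical names, rows are assumed to be already parsed into plain objects, the threshold is assumed to be inclusive, and a row is assumed to be paired with the first kept row that meets the threshold rather than the best-scoring one.

```typescript
type Row = Record<string, string>;

interface Match {
  removed: Row;        // the row that was dropped
  representative: Row; // the earlier kept row it matched
  score: number;       // similarity in percent
}

// Compact normalized Levenshtein score (0 to 100) so this block runs on its own.
function similarityPercent(a: string, b: string): number {
  const s = Array.from(a);
  const t = Array.from(b);
  const longer = Math.max(s.length, t.length);
  if (longer === 0) return 100; // assumption: two empty values are identical
  let prev: number[] = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
    }
    prev = curr;
  }
  return (100 * (longer - prev[t.length])) / longer;
}

// Greedy pass: each row is compared against the rows kept so far, in order.
function dedupeRows(
  rows: Row[],
  keyColumn: string,
  thresholdPercent: number
): { kept: Row[]; matches: Match[] } {
  const kept: Row[] = [];
  const matches: Match[] = [];
  for (const row of rows) {
    const value = row[keyColumn] ?? "";
    let hit: Match | undefined;
    for (const representative of kept) {
      const score = similarityPercent(value, representative[keyColumn] ?? "");
      if (score >= thresholdPercent) {
        hit = { removed: row, representative, score };
        break; // assumption: stop at the first representative that qualifies
      }
    }
    if (hit) {
      matches.push(hit); // near-duplicate of an earlier row, so drop it
    } else {
      kept.push(row);    // first occurrence becomes a representative
    }
  }
  return { kept, matches };
}

// Made-up sample data for illustration only.
const sampleRows: Row[] = [
  { company_name: "Acme Corp" },
  { company_name: "Globex Inc" },
  { company_name: "Acme Corp." },
  { company_name: "Globx Inc" },
  { company_name: "Initech" },
];

const result = dedupeRows(sampleRows, "company_name", 85);
for (const m of result.matches) {
  console.log(`${m.removed.company_name} -> ${m.representative.company_name} (${m.score}%)`);
}
```

With the sample data and an 85% threshold, three rows are kept and two are removed: "Acme Corp." is paired with "Acme Corp" and "Globx Inc" with "Globex Inc", each at 90%. The `matches` array plays the role of the preview list, so it can be inspected before anything is written back to a file.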
Privacy first
Every JAD Excel tool runs entirely in your browser using SheetJS and ExcelJS. Your spreadsheets, formulas, and data never leave your device — verified by zero outbound network requests during processing.