How to compare two rows to dedupe?

Johg_Ananda · June 16, 2020, 3:49pm

I am trying to dedupe a contact database. There are many fields, name, phone, email, address. Many of the records are duplicates … which to keep? How can I compare the rows to see which are exact matches, and which have differences?

Federico_Stefanato · June 16, 2020, 4:16pm

Hi @Johg_Ananda,
this is a recurring issue that - sooner or later - tends to happen in data models that grow without specific blueprints.

Let’s see if this helps:

Determine what it means duplicated i.e. what are the set of rules to assess if B is copy or a variation of A. Usually there are a set of fields that act as a unique key e.g. first name, last name (and/or username, email…)
Implement a way to generate that unique key: a concatenation of those fields into a new column, for instance
Highlight duplicates.
Remove/merge them: if you already have some rules, this can even de done automatically. Otherwise, a manual check on duplicated rows is necessary.

I’d assume you can’t share your table (but if you can, it would be useful).
So I provided a very basic sample:

Let me know if its a good starting point.

Johg_Ananda · June 16, 2020, 4:35pm

@Federico_Stefanato yes this is a great starting point. The key model seems to solve it and allows me to choose the key fields to compare.

Is the regexreplace() to remove spaces // trim() would work?

Federico_Stefanato · June 16, 2020, 4:40pm

Actually, Trim() just stripes off spaces at the beginning and at the end of a string. Usually (in these cases) it’s better to “compress” all the spaces so that the char sequence is easier to match

Topic		Replies	Views
How to detect duplicates in a row?	2	415	October 8, 2023
Removing Duplicates by Email using a button	3	958	March 8, 2021
Creating a button to identify & remove duplicate rows	5	3370	April 25, 2024
How to deduplicate form lists	10	553	June 14, 2023
Find duplicates across several columns and show only first type	11	1337	September 9, 2021

How to compare two rows to dedupe?

Related topics