
Duplicate Detection – First step towards maintaining Data Quality

Today, maintaining data quality is one of the most critical problems businesses face. Since data is entered into the system by different people, to different standards, and at different levels of completeness, several data quality issues result. One of the top issues is having multiple representations of the same logical real-world entity, or, in other words, duplicate records. Such duplicate records can cause significant problems for a business, for example:

· Increased cost to company

· Increased customer dissatisfaction

· Erroneous data mining and analytics

Titan introduces a powerful new feature called “Duplicate Detection” that will help organizations take the first step towards maintaining the quality of data in their system. Here is a brief introduction to this feature:

· The duplicate detection feature in Titan allows organizations to define a duplicate detection policy (a set of duplicate detection rules) for different record types (supported for all record types, including custom entities). Using an intuitive and familiar (Advanced Find-like) user interface, an administrator can define the duplicate detection policies based on which the system should report a record as a duplicate of another existing record. These policies can be defined across different record types; for example, an organization may define that a lead is a duplicate if a contact already exists with the same name and phone number. Necessary knobs are exposed with which an organization can turn duplicate detection on or off for different record types and different data entry points.

· Based on the duplicate detection policy (rules) defined by the administrator, the system will alert the user about potential duplicates when the user tries to create a new record or update an existing one. The same policy applies at other data entry points as well, for example data import, offline-to-online data sync, promoting contacts in Outlook, the SDK, etc. In short, all data entry points are guarded by duplicate detection so that only high-quality data enters the system.

· Even when the entry points are guarded by duplicate detection, there can be legitimate business reasons for duplicate data to enter the system. So, to help organizations maintain data quality on an ongoing basis, we have introduced the notion of “system-wide duplicate detection.” With this feature you can schedule a duplicate detection background job that checks for duplicates across all records matching a criterion (an Advanced Find query) and reports back the result (a list of duplicates). Users can then cleanse the data by deleting, deactivating, or merging the duplicates reported by the background job.
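To make the rule-based matching idea concrete, here is a minimal Python sketch of how a duplicate detection rule might pair fields across two record types and flag duplicates at record creation time. The names (`DuplicateRule`, `find_duplicates`, the field names) are purely illustrative assumptions, not the actual product or SDK API:

```python
def normalize(value):
    """Case- and whitespace-insensitive comparison key for a field value."""
    return "".join(str(value).lower().split())

class DuplicateRule:
    """A hypothetical rule: which fields of a new record must match an existing one."""
    def __init__(self, base_type, matched_type, field_pairs):
        self.base_type = base_type        # e.g. "lead"
        self.matched_type = matched_type  # e.g. "contact"
        self.field_pairs = field_pairs    # (base_field, matched_field) pairs

    def matches(self, new_record, existing_record):
        return all(
            normalize(new_record.get(b)) == normalize(existing_record.get(m))
            for b, m in self.field_pairs
        )

def find_duplicates(rule, new_record, existing_records):
    """Create-time check: return existing records the new one would duplicate."""
    return [r for r in existing_records if rule.matches(new_record, r)]

# Example from the post: a lead is a duplicate if a contact already exists
# with the same name and phone number.
rule = DuplicateRule("lead", "contact",
                     [("fullname", "fullname"), ("phone", "phone")])

contacts = [{"fullname": "Jane Doe", "phone": "555 0100"}]
new_lead = {"fullname": "jane doe", "phone": "5550100"}
print(find_duplicates(rule, new_lead, contacts))  # the existing contact is flagged
```

In the real feature the alert would be raised at every entry point (form, import, sync, SDK); the sketch only shows the matching step itself.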
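The system-wide background job can be thought of as a batch version of the same matching: scan all records selected by a query, group the ones that share a matching key, and hand each group to a cleanse step. The sketch below, again with illustrative names rather than the real API, groups records by normalized field values and shows a naive merge:

```python
from collections import defaultdict

def normalize(value):
    """Case- and whitespace-insensitive comparison key for a field value."""
    return "".join(str(value).lower().split())

def system_wide_scan(records, match_fields):
    """Group records that share the same matching key; groups of 2+ are duplicates."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(normalize(rec.get(f)) for f in match_fields)
        groups[key].append(rec)
    return [grp for grp in groups.values() if len(grp) > 1]

def merge_duplicates(group):
    """Naive cleanse step: keep the first record, fold in missing fields from the rest."""
    master = dict(group[0])
    for rec in group[1:]:
        for field, value in rec.items():
            master.setdefault(field, value)
    return master

accounts = [
    {"name": "Contoso Ltd", "city": "Seattle"},
    {"name": "contoso ltd", "phone": "555-0199"},
    {"name": "Fabrikam", "city": "Redmond"},
]
dupes = system_wide_scan(accounts, ["name"])
print(merge_duplicates(dupes[0]))
# → {'name': 'Contoso Ltd', 'city': 'Seattle', 'phone': '555-0199'}
```

The actual job would also support the other cleanse options mentioned above (deleting or deactivating the reported duplicates) rather than only merging.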

This is just an overview; expect more details on each and every aspect of this feature soon.

This posting is provided “AS IS” with no warranties, and confers no rights.

Rohit Bhatia