TABLE OF CONTENTS

1. Login

DEDUPE

2. Add Objects

3. Matching Models

4. Master Record/Merge Rules

CLEANSE

5. Transform Rules

6. Data Quality Modeling

VERIFY

7. Verifications

IMPORT

8. Import by CSV, Object, Salesforce, SFTP or AWS Cloud

AUTOMATE

9. Scheduled Jobs

10. How do I create a Dataset?



1. Login


Connect to DataGroomr through the external login option at app.datagroomr.com, and choose either Production or Sandbox Login.


Optionally, install the DataGroomr Managed Package from the AppExchange to enable DataGroomr as a tab in Salesforce and to use Lightning Components.



Please see: 

Duplicates Lightning Component Install, Configure and Permission Sets

Manage Users

Manage Roles


2. Add Objects


Upon your initial login, DataGroomr profiles the standard objects (leads, contacts, accounts, and leads in contacts) and provides the option to add any additional objects (custom or standard) of your choosing.


Objects can be added or removed manually in the Supervisr module, under Objects.



3. Matching Models

While DataGroomr provides out-of-the-box classic and machine learning models, you can create and customize your own models to fit your unique business requirements. Before building a model, consider the following questions:


What are we trying to deduplicate?

What field(s) present key indicators of a match? 

How strict should the matches be?

What are my score thresholds and match confidence?

Should I include Synonyms, Ignore Words or Groupings? 



Classic Matching Model

Rule-based approach: You manually define which fields to compare (e.g., First Name, Email, Company) and assign weights to each.

  • Deterministic logic: Matches are found based on exact or fuzzy logic using predefined thresholds.
  • Transparent scoring: You can clearly see which fields influenced a match and by how much.
  • Best for: Simple or predictable data scenarios where you want full control over match logic.
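
As a rough illustration of the rule-based approach described above, the sketch below compares two records field by field, applies a weight to each field, and checks the total against a threshold. The field names, weights, threshold, and the use of Python's difflib for fuzzy comparison are all assumptions made for the example; this is not DataGroomr's actual scoring logic.

```python
from difflib import SequenceMatcher

# Hypothetical field weights for illustration; they sum to 1.0.
WEIGHTS = {"first_name": 0.2, "email": 0.5, "company": 0.3}
# Hypothetical threshold at or above which a pair counts as a match.
MATCH_THRESHOLD = 0.85


def field_similarity(a: str, b: str) -> float:
    """Fuzzy similarity in [0, 1], ignoring case and surrounding spaces."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def match_score(rec_a: dict, rec_b: dict) -> tuple:
    """Return the weighted score and each field's contribution to it."""
    contributions = {
        field: weight * field_similarity(rec_a.get(field, ""), rec_b.get(field, ""))
        for field, weight in WEIGHTS.items()
    }
    return sum(contributions.values()), contributions


a = {"first_name": "Jon", "email": "jon.smith@acme.com", "company": "Acme Inc"}
b = {"first_name": "John", "email": "jon.smith@acme.com", "company": "ACME Inc."}
score, parts = match_score(a, b)
is_match = score >= MATCH_THRESHOLD
```

The per-field contributions in `parts` mirror the idea of transparent scoring: you can see which field moved the score and by how much.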



Machine Learning Matching Model

AI-assisted approach: Uses a machine learning algorithm trained on labeled examples of duplicates and non-duplicates. Manually train these models, or allow AI to profile and select duplicates within your dataset.

  • Pattern recognition: Learns complex relationships between fields that may not be obvious or linear.
  • Adaptive: Gets better with more training data (you label examples, the model adjusts).
  • Confidence scores: Matches are scored based on statistical confidence, not rigid rules.
  • Best for: Large or messy datasets where manual rules may miss patterns or create false positives.
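
To make the idea of training on labeled examples more concrete, here is a minimal sketch that fits a tiny logistic model on a handful of labeled pairs and returns a confidence score for a new pair. The fields, sample records, learning rate, and the choice of logistic regression are assumptions for illustration only; the algorithm DataGroomr actually uses is not described in this guide.

```python
import math
from difflib import SequenceMatcher

FIELDS = ("name", "email")  # hypothetical fields compared for each pair


def features(a: dict, b: dict) -> list:
    """Per-field fuzzy similarity in [0, 1] for a pair of records."""
    return [SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in FIELDS]


def predict(weights: list, bias: float, x: list) -> float:
    """Logistic model: confidence that the pair is a duplicate."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: list, epochs: int = 2000, lr: float = 0.5) -> tuple:
    """Fit weights with plain stochastic gradient descent on log loss."""
    weights, bias = [0.0] * len(FIELDS), 0.0
    for _ in range(epochs):
        for x, label in samples:
            error = predict(weights, bias, x) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


jon = {"name": "Jon Smith", "email": "jon.smith@acme.com"}
ann = {"name": "Ann Lee", "email": "ann.lee@corp.io"}
# Labeled examples: 1 = duplicate, 0 = not a duplicate.
labeled_pairs = [
    (jon, {"name": "John Smith", "email": "jon.smith@acme.com"}, 1),
    (ann, {"name": "Anne Lee", "email": "alee@corp.io"}, 1),
    (jon, {"name": "Maria Gomez", "email": "mgomez@widgets.net"}, 0),
    (ann, {"name": "Bob Stone", "email": "bob@stone.org"}, 0),
]
samples = [(features(a, b), label) for a, b, label in labeled_pairs]
weights, bias = train(samples)

likely_dup = predict(
    weights, bias, features(jon, {"name": "Jon Smyth", "email": "jon.smith@acme.com"})
)
unlikely_dup = predict(
    weights, bias, features(jon, {"name": "Paula Reyes", "email": "preyes@other.org"})
)
```

Adding more labeled pairs to `labeled_pairs` and retraining is the toy equivalent of the model adjusting as you label more examples.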


Please see the articles below for basic and advanced configuration of classic and machine learning models:

Training Machine Learning Model

Manage Matching Models


4. Master Record/Merge Rules

Master Record Rules: Automatically choose which record survives during a merge. 


When selecting a Master Record Rule, base it on your business logic requirements.


What record represents the most complete, reliable, or authoritative version?

Is this record used for billing, support, reporting, marketing, integration, or all of these?


Use that answer to drive logic like:


Oldest/most established record?

Most recently updated?

Most complete?

CRM system of origin?

Annual revenue?
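
The options above can be pictured as interchangeable ranking functions: each rule scores every record in a duplicate group, and the highest-scoring record survives. The sketch below assumes hypothetical rule names and record fields (`created`, `updated`); it is only meant to show the shape of the decision, not how DataGroomr implements it.

```python
from datetime import date


def completeness(record: dict) -> int:
    """Count populated fields (None and empty strings count as blank)."""
    return sum(1 for value in record.values() if value not in (None, ""))


# Each hypothetical rule maps a record to a sort key; the highest key wins.
MASTER_RULES = {
    "oldest": lambda r: -r["created"].toordinal(),
    "most_recently_updated": lambda r: r["updated"].toordinal(),
    "most_complete": completeness,
}


def pick_master(records: list, rule: str) -> dict:
    """Return the record that survives the merge under the given rule."""
    return max(records, key=MASTER_RULES[rule])


group = [
    {"Id": "001A", "created": date(2019, 3, 1), "updated": date(2021, 6, 1),
     "email": "a@x.com", "phone": ""},
    {"Id": "001B", "created": date(2022, 1, 15), "updated": date(2024, 2, 10),
     "email": "a@x.com", "phone": "555-0100"},
]
master_by_age = pick_master(group, "oldest")
master_by_update = pick_master(group, "most_recently_updated")
master_by_completeness = pick_master(group, "most_complete")
```

The same duplicate group can yield a different master under each rule, which is why the choice should follow from how the data is used.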


DataGroomr provides commonly used scenarios to apply immediately, while also providing the ability to customize the rule with Functions, Values and Fields to your unique requirements. 


Try our AI Assistant (Beta), and prompt creation of Master, Merge, Data Quality and Transform rules with Generative AI.


Field Value Rules: Specify how data from child (non-surviving) records should be retained in the master record.


When selecting a Field Merge Rule, base it on your business logic requirements, with the overall goal of preserving the best version of the data for each field.


Should we keep the master’s value, or pull a better value from another record?

Should we keep the master's value and also pull an additional value from another record?

What makes a field value the most trustworthy?

Are there specific records we need to prioritize, or have standalone field merge logic applied? 

What would we like to do with empty fields? 


Much of this will vary by field type and business logic requirements.
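
As one way to picture per-field merge logic, the sketch below applies a different rule to each field: fill the master's blank from a child record, keep the longest value, or combine values. The rule names and sample fields are hypothetical and chosen for illustration; they are not DataGroomr function names.

```python
def merge_fields(master: dict, children: list, rules: dict) -> dict:
    """Apply one hypothetical merge rule per field and return the merged record."""
    merged = dict(master)
    for field, rule in rules.items():
        candidates = [record.get(field) for record in [master, *children]]
        populated = [value for value in candidates if value not in (None, "")]
        if not populated:
            continue  # every record is blank for this field; nothing to merge
        if rule == "keep_master_fill_blank":
            # Keep the master's value unless it is blank.
            merged[field] = master.get(field) or populated[0]
        elif rule == "longest":
            merged[field] = max(populated, key=len)
        elif rule == "concatenate":
            # dict.fromkeys drops repeated values while preserving order.
            merged[field] = "; ".join(dict.fromkeys(populated))
        else:
            raise ValueError(f"unknown merge rule: {rule}")
    return merged


master = {"Phone": "", "Title": "VP", "Description": "Met at expo"}
child = {"Phone": "555-0100", "Title": "Vice President of Sales",
         "Description": "Requested demo"}
rules = {"Phone": "keep_master_fill_blank", "Title": "longest",
         "Description": "concatenate"}
merged = merge_fields(master, [child], rules)
```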


DataGroomr provides commonly used scenarios to apply immediately, while also providing the ability to customize the rule with Functions, Logic, Values, Fields and Records to your unique requirements.



For more information on Merge Rules see below:

AI Assistant in Rule Designer

Merge Rules Master Record Selection

Supervisr Field Merge Rule


Assign rules to a dataset from the dataset configuration window.


5. Transform Rules

Transform rules allow you to normalize or standardize your data en masse, based on specific criteria.


Business requirements to consider


What is the purpose of the data? (e.g., lead scoring, segmentation, routing)

Who consumes this data—Sales, Marketing, Support, etc.?

Are there dependencies on exact formatting (e.g., standardized titles, phone numbers)?

Will the transformed values be clear and intuitive to users in Salesforce?

Are there regulatory requirements (e.g., GDPR, HIPAA) around how data must be stored or displayed?

Do we have specific owners/values that need to be replaced?


DataGroomr provides commonly used scenarios to apply immediately, while also providing the ability to customize the rule with Functions, Logic, Values, Fields, Text, Math and Date to complement your unique requirements.
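
For a sense of what a transform rule does to a record, the sketch below standardizes a phone number and expands a job-title abbreviation. The target phone format, the synonym table, and the function names are assumptions made for this example; in DataGroomr, equivalent logic would be configured through rules rather than written as code.

```python
import re

# Hypothetical abbreviation table used to standardize job titles.
TITLE_SYNONYMS = {"vp": "Vice President", "ceo": "Chief Executive Officer"}


def normalize_phone(raw: str) -> str:
    """Format 10-digit US numbers as (XXX) XXX-XXXX; leave others untouched."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the US country code
    if len(digits) != 10:
        return raw
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def normalize_title(raw: str) -> str:
    """Expand known abbreviations; otherwise only trim whitespace."""
    key = raw.strip().lower().rstrip(".")
    return TITLE_SYNONYMS.get(key, raw.strip())


def transform(record: dict, rules: dict) -> dict:
    """Apply each field's transform function to populated fields only."""
    out = dict(record)
    for field, fn in rules.items():
        if out.get(field):
            out[field] = fn(out[field])
    return out


lead = {"Phone": "+1 555.010.0199", "Title": " VP ", "Email": "jane@acme.com"}
cleaned = transform(lead, {"Phone": normalize_phone, "Title": normalize_title})
```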



Try our AI Assistant (Beta), and prompt creation of Master, Merge, Data Quality and Transform rules with Generative AI.



Transform configurations can be applied directly within a Brushr dataset, as well as in Import and Classic Matching Models.


For more information on Transform Rules see below:

Transform Rules

Transform Records in CSV 

Apply Transform in Matching Model


6. Data Quality Modeling


Data Quality Models help enforce consistency, completeness, and accuracy across your records. Before creating them, it's important to align with key business needs to avoid disrupting downstream processes or misclassifying valuable data.


What does “good” or “usable” data mean for your teams?

For Sales: Does it mean valid phone numbers and job titles?

For Marketing: Does it mean industry, company size, and email format are standardized?

For Support: Do you need account ID linkage or complete address data?

Which fields must be accurate or complete for business operations?

Can you build DQ rules to flag bad entry habits?



DataGroomr provides commonly used scenarios to apply immediately for Data Quality Modeling, while also providing the ability to customize the rule with Functions, Scoring, Patterns, Logic, Values, Fields, Text, Math and Date to complement your unique requirements.
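
One simple way to think about a data quality model is as a list of checks, each with a penalty that is subtracted from a starting score when the check fails. The sketch below uses that idea with hypothetical fields, patterns, and penalty values; DataGroomr's own scoring scheme may differ.

```python
import re

# Hypothetical rules: (field, check function, penalty if the check fails).
RULES = [
    ("Email", lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v)), 40),
    ("Phone", lambda v: len(re.sub(r"\D", "", v)) >= 10, 30),
    ("Title", lambda v: v.strip() != "", 30),
]


def quality_score(record: dict) -> tuple:
    """Start at 100, subtract each failed rule's penalty, list failed fields."""
    score, failed = 100, []
    for field, check, penalty in RULES:
        if not check(record.get(field) or ""):
            score -= penalty
            failed.append(field)
    return score, failed


good = {"Email": "jane@acme.com", "Phone": "555-010-0123", "Title": "CFO"}
bad = {"Email": "jane@acme", "Phone": "n/a", "Title": ""}
good_result = quality_score(good)
bad_result = quality_score(bad)
```

Listing the failed fields alongside the score makes it easier to spot recurring entry habits, such as placeholder phone values.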


Try our AI Assistant (Beta), and prompt creation of Master, Merge, Data Quality and Transform rules with Generative AI.


Data Quality Models can be applied within a Brushr dataset.


For more information on Data Quality Models see below:

Manage Data Quality Models

Data Quality Model Editor


7. Verifications


There are several ways to improve or confirm the validity score of your data through Verification:


Email Address Verification - DataGroomr will verify the validity of email addresses.

Phone Number Verification - DataGroomr will verify the validity of phone numbers.

Mailing Address Verification - DataGroomr will verify the validity of the entered mailing address.

Website Address Verification - DataGroomr will verify the validity of the entered website address.




Verify can be applied within a Brushr dataset.

 


Click an individual verification action to configure how its results synchronize to Salesforce, including Verification Status, Verification Score and Verification Date, and to enable automatic replacement of invalid values with the suggested values for emails and addresses. Clicking "+ Create new field" will synchronize the verification status of each field back into Salesforce.


Validations can optionally be displayed through the Lightning Verify Component and added to user profiles of your choice.

 

For more information on Verify, see below:

How does email verification work?

How does phone verification work?

How does address verification work?

How does website verification work?

Lightning Verify Component

Mass Verify Option


8. Import by CSV, Object, Salesforce, SFTP or AWS Cloud


Navigate to the Importr module.


Add a Data Source, and select CSV or Object (to transfer data between Salesforce objects).


For CSV, you can import data from your local device, Salesforce, SFTP or AWS S3.


Map CSV fields to Object Fields (API name provided for reference). 



Review and confirm Matching Model, Merge Rule, and Match Confidence. 



Upon import, DataGroomr will sort your data into four bucketed categories:


Unmatched Records - records that are not duplicates.

Matched Records - records that already have a duplicate in Salesforce.

Imported Records - any records imported into Salesforce will move to this bucket.

Mass Processed Records - records updated in Salesforce through an automation or a manually triggered mass update appear in this section.
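
The first two buckets can be pictured with a simplified sketch that splits incoming CSV rows by whether they match an existing record. Here an exact, case-insensitive email comparison stands in for the configured Matching Model and Match Confidence, and the column names are hypothetical; the Imported and Mass Processed buckets are left out because they only fill once records are pushed to Salesforce.

```python
import csv
import io


def bucket_rows(csv_text: str, existing_emails: set) -> dict:
    """Split CSV rows into matched and unmatched by normalized email."""
    known = {email.strip().lower() for email in existing_emails}
    buckets = {"unmatched": [], "matched": []}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["Email"].strip().lower()
        buckets["matched" if key in known else "unmatched"].append(row)
    return buckets


csv_text = "Name,Email\nJane Roe,JANE@acme.com\nSam Poe,sam@new.io\n"
buckets = bucket_rows(csv_text, {"jane@acme.com"})
```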


Unmatched records can be Mass Imported into Salesforce. Matched records can be updated individually, or you can use a field merge rule to Update Values in Mass.


Apply Transform Rules to your Unmatched/Matched data to ensure standardization before syncing to Salesforce.


For more information on Data Import, see below:

Import Data from CSV Files

Working with Imported CSV Records

Transfer Data Between Salesforce Objects

Export Records in Importr


9. Scheduled Jobs


Schedulr supports multiple types of jobs that can be fully automated, including:


Analyze - analyze data for duplicates and data quality

Mass Merge - mass merge duplicates

Mass Convert - mass convert leads to contacts or accounts

Mass Delete - mass delete records

Mass Transform - mass standardize, normalize or update data

Mass Import - mass import records from Importr dataset

Mass Verify - mass verify records in Brushr dataset

Sync To Salesforce - synchronize duplicates with Salesforce



Choose which datasets to analyze and who will receive job report emails, and set the Match Confidence the job requires to perform a Mass Action.


Choose a specific date and time for your job, and whether it should recur hourly, daily, weekly or monthly.
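
To illustrate how a recurrence setting translates into run times, the sketch below computes the next run from the previous one. The function name and the monthly rule (clamping to the last day of a shorter month) are assumptions for the example; how Schedulr handles these edge cases is not stated in this guide.

```python
import calendar
from datetime import datetime, timedelta


def next_run(last: datetime, recurrence: str) -> datetime:
    """Return the next run time after `last` for a given recurrence."""
    if recurrence == "hourly":
        return last + timedelta(hours=1)
    if recurrence == "daily":
        return last + timedelta(days=1)
    if recurrence == "weekly":
        return last + timedelta(weeks=1)
    if recurrence == "monthly":
        year = last.year + last.month // 12   # roll over after December
        month = last.month % 12 + 1
        # Assumption: clamp to the last day when the next month is shorter.
        day = min(last.day, calendar.monthrange(year, month)[1])
        return last.replace(year=year, month=month, day=day)
    raise ValueError(f"unknown recurrence: {recurrence}")


first = datetime(2024, 1, 31, 2, 0)
second = next_run(first, "monthly")
```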



Because DataGroomr connects through the API, scheduled jobs can be set to run even at peak hours without degrading performance.


For more information on Schedulr, see below:

Accessing Schedulr

Jobs: Task Scheduling

Triggers: Real Time Verification and Transformation


10. How do I create a Dataset?


To create a dataset, navigate to the Trimmr module and add a single-object or cross-object dataset.



Datasets consist of five tabs that allow fine-grained control of the configuration:


  • General - This tab allows you to name your dataset and select the Salesforce object that will be analyzed for duplicates.
  • Field - Use this tab to choose which fields will be visible during the deduplication process.
  • Filter - This tab lets you apply custom filters to focus deduplication efforts on specific subsets of data.
  • Match - This tab specifies the models used for duplicate detection. (Please see section 3, Matching Models.)
  • Merge - This tab sets the Master Record Selection (automatically choose which record survives during a merge) and the Field Value Rules (specify how data from child, non-surviving records should be retained in the master record).

Once complete, save your dataset. Click into the desired dataset to view duplicate matches, compare them against your business requirements, and merge them individually or with Mass Merge.


DataGroomr provides standard rules for both Master Record selection and Field Merges, as well as pre-configured classic matching models, complemented by the ability to create your own machine learning models.


For more information on Dataset creation see below:

How to Configure Datasets

Compare Datasets for Duplicates