Product Documentation

Configure Consolidation

Overview

Canopy provides a rich set of features that help you consolidate (i.e., merge) entities to form a Master Entity. This document outlines the steps needed to use Consolidation Rules to achieve the entity detection results best suited to your data.

A Master Entity is a collection of merged entities that identify a single, unique person. There are two types of merged entities: entities merged automatically via consolidation rules and manually merged entities.

Import Consolidation Rules

Canopy’s Global Rule Set can be replaced by importing custom consolidation rules. To import rules, click on Consolidation Rules from the kebab menu on the Entity List page:

img_50.png

Click on the Delete icon to either import rules or create your own:

img_52.png

Create rules interactively by clicking on the Create Rule button or via a JSON configuration file by clicking on the Import JSON button.

Landing Screen for Custom Consolidation

Canopy recommends that you start by using Global Rules, a universal set of rules suitable for all regions. Alternatively, you can select a trimmed down set of rules focused on a specific region:

  • Australia
  • Canada
  • United Kingdom
  • United States
⬇️ Download Canopy JSON consolidation rules

Rules Last Updated May 24, 2023

Edit Consolidation Rules

You can edit Canopy’s Global Rule Set, or whichever rule set is currently loaded.

Click on the Pencil icon to edit a rule set.

img_53.png

Editing examples:
Users may want to consolidate entities when the First Name AND Last Name AND Social Security Number are the same.

In each condition subgroup below, the selection on top says Any of these, indicating that any of these conditions is enough to consolidate the entities.

A basic rule on First Name, Last Name and SSN

In the rule below, entities would be consolidated if they share the First Name AND Last Name AND any of Passport, Military ID and SSN. This is indicated by the All of these toggle in the top condition set and Any of these toggle in the bottom condition set.

A single rule combining AND and OR conditions

In the example below, multiple condition groups are joined with an OR condition, meaning that at least one of the rule sets needs to match.

Multiple rule sets to consolidate based on different fields
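
The AND/OR logic in the examples above can be sketched in a few lines of Python. This is only an illustration of how "All of these" and "Any of these" subgroups combine; the names `Subgroup`, `rule_matches`, and `any_rule_matches`, and the entity field keys, are hypothetical and do not reflect Canopy's internal implementation or JSON rule format.

```python
from dataclasses import dataclass


@dataclass
class Subgroup:
    """A set of fields compared with 'all' (AND) or 'any' (OR) semantics."""
    fields: list
    mode: str  # "all" or "any" (hypothetical labels for the two toggles)


def subgroup_matches(sub, a, b):
    # A field matches when both entities have a non-empty, identical value.
    results = [bool(a.get(f)) and a.get(f) == b.get(f) for f in sub.fields]
    return all(results) if sub.mode == "all" else any(results)


def rule_matches(subgroups, a, b):
    # Subgroups inside one rule are joined with AND.
    return all(subgroup_matches(s, a, b) for s in subgroups)


def any_rule_matches(rules, a, b):
    # Multiple rules (condition groups) are joined with OR.
    return any(rule_matches(r, a, b) for r in rules)


# First Name AND Last Name AND any of Passport, Military ID, SSN
name_and_id = [
    Subgroup(["first_name", "last_name"], "all"),
    Subgroup(["passport", "military_id", "ssn"], "any"),
]

e1 = {"first_name": "John", "last_name": "Federer", "ssn": "123-45-6789", "passport": "X1"}
e2 = {"first_name": "John", "last_name": "Federer", "ssn": "123-45-6789", "passport": "Y2"}
e3 = {"first_name": "John", "last_name": "Federer", "ssn": "987-65-4321", "passport": "Z3"}

print(rule_matches(name_and_id, e1, e2))  # names match and SSN matches
print(rule_matches(name_and_id, e1, e3))  # names match but no identifier matches
```

In this sketch, `e1` and `e2` would be consolidated because they share both names and one identifier, while `e1` and `e3` would not.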

Exact and Phonetic Match Options

There are two types of field matching options: Exact and Phonetic. Phonetic matching is only applicable to name fields. Selecting Phonetic Match treats “Jon” and “John” as the same name.

Canopy uses a combination of phonetic match algorithms to detect names that sound the same but have different spellings. These algorithms are called Soundex and Metaphone.

Soundex converts word sounds into codes and then compares these codes to report similarities. Patented in 1918 and later used to index US census records from 1880, 1900, and 1910, Soundex was designed around English pronunciation and doesn’t fare well with non-English names.

Metaphone was created as an alternative to Soundex. Consonant sounds were added to its analysis, helping it perform better on non-English names.

Canopy considers two words to be phonetically matched only when both Soundex and Metaphone report exact matches. Requiring both algorithms to agree reduces false matches compared with using either one alone.
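
As a rough illustration of the "both algorithms must agree" idea, the sketch below implements the standard American Soundex encoding and a helper that requires every supplied encoder to produce identical codes. It is not Canopy's implementation: the function names are hypothetical, and Metaphone is not implemented here, so a Metaphone encoder would have to be supplied from elsewhere and added to the `encoders` tuple.

```python
# Standard American Soundex digit groups.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the four-character Soundex code for a name ('' if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # vowels reset prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def phonetic_match(a, b, encoders=(soundex,)):
    """True only when every encoder gives both names the same code."""
    return all(enc(a) == enc(b) for enc in encoders)


print(soundex("Jon"), soundex("John"))   # both J500
print(phonetic_match("Smith", "Smyth"))  # True
print(phonetic_match("Smith", "Jones"))  # False
```

Soundex alone is fairly coarse: in this sketch "Jon" and "Jane" also share the code J500, which is one reason a second algorithm is useful as a cross-check.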

Canopy recommends using phonetic match detection on either “First Name” or “Last Name”, but not both. If used on both, a name like “Emmanuel Moreno” will match with “Immanuel Morano,” who may be two different people.

Ignore Options

There are three types of ignore options: Blanks, Case Sensitivity, and Special Characters. Ignore options cannot be set to Numbers for numeric fields.

Blanks

If you choose to ignore blank fields, Entity Consolidation will not group two entities where the fields are blank.

For example, you have two entities:

| Entity | First Name | Last Name |
| --- | --- | --- |
| 1 | blank | Smith |
| 2 | blank | Smith |

  • Rule configuration 1: First Name AND Last Name Exact Match

    Entity Consolidation will cluster all people with the last name of Smith and a blank entry for the first name.

    Results: Entity 1 and 2 will be clustered together.

  • Rule configuration 2: First Name (Ignore Blanks) AND Last Name Exact Match

    Entity Consolidation will ignore the blank fields and not create clusters based on them. In this example, these two entities with last name of Smith and a blank first name would not be clustered together.

    Results: Entity 1 and 2 will not be clustered together.
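
The two rule configurations above can be sketched as follows. This is one interpretation of the described behavior, not Canopy's code: with Ignore Blanks enabled, a blank value never counts as a match, so an AND rule that includes that field cannot cluster the entities. The function names and dictionary keys are hypothetical.

```python
def field_matches(a, b, ignore_blanks=False):
    """Exact match on one field; blanks never match when ignore_blanks is set."""
    a, b = a.strip(), b.strip()
    if ignore_blanks and (not a or not b):
        return False
    return a == b


def first_and_last_match(e1, e2, ignore_blank_first_name):
    # First Name AND Last Name, Exact Match
    return (field_matches(e1["first_name"], e2["first_name"], ignore_blank_first_name)
            and field_matches(e1["last_name"], e2["last_name"]))


entity_1 = {"first_name": "", "last_name": "Smith"}
entity_2 = {"first_name": "", "last_name": "Smith"}

# Rule configuration 1: blanks compare as equal, so the entities cluster.
print(first_and_last_match(entity_1, entity_2, ignore_blank_first_name=False))
# Rule configuration 2: blank first names are ignored, so they do not cluster.
print(first_and_last_match(entity_1, entity_2, ignore_blank_first_name=True))
```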

Case Sensitivity

If selected, Entity Consolidation will ignore differences in letter case when comparing fields, so “SMITH” and “smith” are treated as the same value.

Special Characters

If selected, Entity Consolidation will ignore these characters when comparing fields:

  • Hyphen -
  • Parentheses ()
  • Plus +
  • Underscores _
  • Numbers 1, 2, 3, …
  • Spaces
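
A minimal sketch of how the Case Sensitivity and Special Characters options could be applied before comparison is shown below. It assumes the character list above is exhaustive; the function name `normalize` and its parameters are hypothetical and are not part of Canopy.

```python
import re

# Hyphen, parentheses, plus, underscore, digits, and spaces (per the list above).
_SPECIAL = re.compile(r"[-()+_0-9 ]")


def normalize(value, ignore_case=False, ignore_special=False):
    """Return the value as it would be compared under the selected options."""
    if ignore_special:
        value = _SPECIAL.sub("", value)
    if ignore_case:
        value = value.casefold()
    return value


a = normalize("Smith-Jones (Jr)", ignore_case=True, ignore_special=True)
b = normalize("smith jones jr", ignore_case=True, ignore_special=True)
print(a, b, a == b)  # smithjonesjr smithjonesjr True
```

Because digits are in the ignored set, values that differ only by a number (for example "Unit 5" and "Unit 7") would compare as equal in this sketch, which is worth keeping in mind when enabling the option on fields that contain numbers.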

Configure Merge Settings

Once conditions have been set, merge settings can be defined. Merge Settings can be accessed via the blue button on the bottom right of the screen:

img_49.png

In Merge Settings, the following options can be selected:

img_20.png

Use Canopy’s nickname database to resolve first name conflicts
Toggle this option ON to use the internal Canopy nickname database to resolve conflicts, e.g., Jon Favreau and Jonathan Favreau will not be treated as a conflict.
Combine address fields into a single line before comparing
Toggle this option ON to collate all Address fields before comparing. Address will be combined only if the address field is configured to Do Not Merge in Field Conflict Settings. For example, the following two addresses will be treated the same:
  • Street: 751 ML King Avenue, Unit: Apt 51, City: Neverland, State: Narnia, Country: US
  • Street: 751 ML King Avenue, Apt 51, Unit: , City: Neverland, State: Narnia, Country: US
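
The address example above can be sketched as follows, assuming the fields are joined in a fixed order and blank fields are skipped. The field names and the `combine_address` helper are hypothetical; Canopy's actual combining logic may differ.

```python
ADDRESS_FIELDS = ("street", "unit", "city", "state", "country")


def combine_address(address):
    """Join the non-blank address fields into a single comma-separated line."""
    parts = (address.get(field, "").strip() for field in ADDRESS_FIELDS)
    return ", ".join(part for part in parts if part)


addr_1 = {"street": "751 ML King Avenue", "unit": "Apt 51",
          "city": "Neverland", "state": "Narnia", "country": "US"}
addr_2 = {"street": "751 ML King Avenue, Apt 51", "unit": "",
          "city": "Neverland", "state": "Narnia", "country": "US"}

print(combine_address(addr_1))
print(combine_address(addr_1) == combine_address(addr_2))  # True
```

Field by field, the two addresses differ in both Street and Unit, but once combined into a single line they are identical.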

Configure Field Conflict Settings

The Field Conflict Settings determine how to resolve conflicts between a Master Entity and a Raw Entity.

Field Conflict Settings

Create your own field conflict settings interactively, or import JSON field settings by clicking on Import Merge Settings.

These settings can be set on a field-by-field basis. There are three columns to consider: Action, Merging Method, and Blank Field. For example, let there be two entities, as follows:

| Entity | First Name | Middle Name | Last Name | Entity Type |
| --- | --- | --- | --- | --- |
| 1 | John | William | Federer | Master |
| 2 | John | blank | Federer | Raw |

Action

The Action column determines what happens when the Middle Name is a conflict between two entities to be consolidated according to the rules. If this is set to Do not merge, the two entities above will remain separate, but clustered. If this is set to Merge, the merging behavior will be determined by the next two columns.

Merging Method

The Merging Method column will only activate when the Action column is set to Merge. Select Append to show both values in the field, separated by a comma. Select Secondary Field to add the value of the secondary entity to the Additional Info field.

Only some fields, like “Names,” have the secondary field capability.

Blank Field

This column determines what happens if one of the values is blank (like Entity 2 above). This column will only activate if Action is set to Do not Merge. If Prevent Merge is selected, the entities will stay clustered, but will not be automatically merged. If Merge is selected, the two entities will be merged.
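
The interaction of the three columns can be sketched as a small decision function. This is an interpretation of the descriptions above, not Canopy's implementation: the `resolve_field` function, the `Resolution` type, and the handling of a blank value when Action is Merge (the non-blank value is kept) are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resolution:
    merged: bool                           # False means "stay clustered, not merged"
    value: Optional[str] = None            # resulting field value when merged
    additional_info: Optional[str] = None  # used by the Secondary Field method


def resolve_field(master, raw, action,
                  merging_method="Append", blank_field="Prevent Merge"):
    master, raw = master.strip(), raw.strip()
    if master == raw:
        return Resolution(True, master)  # no conflict
    if not master or not raw:
        # One value is blank: the Blank Field column applies under "Do not merge".
        if action == "Do not merge" and blank_field == "Prevent Merge":
            return Resolution(False)
        return Resolution(True, master or raw)  # assumption: keep the non-blank value
    if action == "Do not merge":
        return Resolution(False)
    if merging_method == "Append":
        return Resolution(True, f"{master}, {raw}")
    return Resolution(True, master, additional_info=raw)  # Secondary Field


# Middle Name for Entity 1 (Master, "William") and Entity 2 (Raw, blank)
print(resolve_field("William", "", "Do not merge", blank_field="Prevent Merge"))
print(resolve_field("William", "", "Do not merge", blank_field="Merge"))
```

With Prevent Merge, the two example entities stay clustered but unmerged; with Merge, they are merged and the Middle Name remains "William".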

Update and Run Consolidation

The Update and Run option will be enabled when one or more of the following conditions are met:

  • New rules are imported
  • Current rules are edited
  • Reset consolidation and/or Delete manual decisions is checked, as explained below.

img_55.png

Reset Consolidation

Consolidation can be reset via a checkbox on the lower left of the Update Consolidation Rules page. Checking this box resets consolidation by deleting all automated clustering, grouping, and merging actions. This checkbox defaults to un-checked each time Update and Run is clicked.

img_46.png

If the Reset consolidation box is checked, the current consolidated entity list will be deleted, including automated clustering, grouping, and merging actions. Manual decisions will be remembered. Running consolidation will create clusters and automatically merge entities using your new rules and merge settings.

Delete Manual Decisions

All manual decisions will be deleted during consolidation when the Delete manual decisions box is checked.

img_45.png

If the Delete manual decisions box is checked, the manual decisions made to the consolidated entity list will be deleted when consolidation is run.

Checking both of these boxes and clicking Update and Run will completely delete the consolidated entity list, including manual and automated clustering, grouping, and merging actions.

img_47.png

If both Reset consolidation and Delete manual decisions boxes are checked, the current consolidated entity list will be deleted, including manual and automated clustering, grouping, and merging actions.
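
The checkbox behavior described in this section can be summarized in a short sketch. The function and parameter names are hypothetical and only restate the rules above; they are not a Canopy API.

```python
def update_and_run_enabled(rules_imported, rules_edited,
                           reset_consolidation, delete_manual_decisions):
    """Update and Run is enabled when at least one condition is met."""
    return any((rules_imported, rules_edited,
                reset_consolidation, delete_manual_decisions))


def actions_deleted(reset_consolidation, delete_manual_decisions):
    """Return which kinds of clustering, grouping, and merging actions are deleted."""
    deleted = set()
    if reset_consolidation:
        deleted.add("automated")  # manual decisions are remembered
    if delete_manual_decisions:
        deleted.add("manual")
    return deleted


print(actions_deleted(reset_consolidation=True, delete_manual_decisions=False))
print(sorted(actions_deleted(reset_consolidation=True, delete_manual_decisions=True)))
```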

Terminology

Condition Group
Condition groups contain one or more subgroups of conditions separated by AND or OR statements. Condition groups can also be separated by AND or OR statements.
Clustering
Consolidation will create clusters based upon the values in Raw Entities, according to your consolidation rules. Based upon your merge settings, clustered entities will fall into two groups, Merged Entities or Related Entities.
Master Entity
A Master Entity is a collection of entities merged to identify a single, unique person. Merging entities can be done manually or via a consolidation rule. There is one Master Entity per cluster.
Changes to the elements in Master or Clustered entities will not be considered during consolidation. For changes to be considered during consolidation, edits must be made at the Raw entity level.
Merged Entities
Merged Entities form a Master Entity. There are two types of Merged Entities: automatically merged and manually merged. You can compare entities and remove a merged entity from a cluster using the Unmerge and Ungroup functions. Click on the View Details icon in the Action column to access these functions.

img_8.png

img_5.png

img_2.png

Order of Evaluation
Rules are processed sequentially.
Related Entities
Related entities are entities that are clustered together during consolidation, but could not be merged. You can manually merge the entity, or remove the related entity from the cluster using the Ungroup function. Click on this box in the Action column to access these functions:

img_4.png

You can also compare, merge, ungroup or delete entities from the detail view:

img_7.png

Subgroup
A group of conditions that you can configure to return a value of true if any of the conditions are met or if all of the conditions are met.