Overview

Identity resolution is the process of linking a record from an external system to its corresponding record in SyncHive.

Terminology

Term | Definition
Hive ID | The primary key of a record in SyncHive. The Hive ID is generated by SyncHive. The current format of a Hive ID is a UUIDv4. Although it is possible to generate a UUIDv4 and publish it as a Hive ID, doing so is not recommended, as the format of the Hive ID may change.
Core ID | An alternate name for Hive ID.
Alternate Key | An alternate key for a record in SyncHive.
External ID/External Identity | Refer to: External ID
Identifying property | An ID property that can identify the record, e.g. an external ID or a Hive ID.
Root Record | The top-level record of the document.
Message | Refer to: Message

Inbound Pattern

When resolving identities inbound to SyncHive, the provided identifying properties are used to attempt to link to a pre-existing record. At least one set of identifying properties needs to be provided, and when multiple are provided a priority is applied.

The priority of matching with a record in SyncHive is:

  1. Hive ID
  2. External IDs
  3. Alternate Keys

Note: This differs from the sequencing order outlined below.

If a pre-existing record is not found within SyncHive, a new record will be created.
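
The matching priority above can be illustrated with a minimal in-memory sketch. This is not SyncHive's implementation: the `Record` and `InMemoryHive` names, the tuple shape used for external IDs, and the choice to reuse a supplied Hive ID when creating a record are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a record stored in SyncHive."""
    hive_id: str
    external_ids: set = field(default_factory=set)    # {(integration_key, external_id)}
    alternate_keys: set = field(default_factory=set)
    data: dict = field(default_factory=dict)


class InMemoryHive:
    """Hypothetical store that only demonstrates the matching priority."""

    def __init__(self):
        self.records = []

    def resolve(self, hive_id=None, external_ids=(), alternate_keys=()):
        # Priority 1: Hive ID.
        if hive_id is not None:
            for record in self.records:
                if record.hive_id == hive_id:
                    return record
        # Priority 2: external IDs.
        wanted_external = set(external_ids)
        for record in self.records:
            if record.external_ids & wanted_external:
                return record
        # Priority 3: alternate keys.
        wanted_alternate = set(alternate_keys)
        for record in self.records:
            if record.alternate_keys & wanted_alternate:
                return record
        return None

    def publish(self, data, hive_id=None, external_ids=(), alternate_keys=()):
        if hive_id is None and not external_ids and not alternate_keys:
            raise ValueError("at least one identifying property is required")
        record = self.resolve(hive_id, external_ids, alternate_keys)
        if record is None:
            # No match: create a new record. Reusing a supplied Hive ID here
            # is an assumption of this sketch; otherwise one is generated.
            record = Record(hive_id=hive_id or str(uuid.uuid4()))
            self.records.append(record)
        record.external_ids |= set(external_ids)
        record.alternate_keys |= set(alternate_keys)
        record.data.update(data)
        return record
```

Publishing twice with the same external ID, for example `("MyExampleIntegration", "101")`, resolves to the same record in this sketch, and the second publish updates it rather than creating a duplicate.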

Although it is possible to integrate solely with the Hive ID, it is recommended to provide all the identifying properties possible. In other words, provide the external ID and, if available, the Hive ID.

Not providing the external ID when it is available comes with downsides:

  1. The external ID may become required later on, which introduces a data backfilling exercise
  2. It is hard to reconcile records that have already been integrated into an external system

Integrating with just the Hive ID

Integrating solely with Hive IDs requires the Hive ID to have been written into the external system by the outbound Hive ID Identity Resolution pattern. Using a mix of the Hive ID and external ID can cause sequencing issues, and is not recommended.

Inbound Sequencing Order

SyncHive sequences records with the same identity. When several identifying properties are published, the property used for sequencing is chosen in the following priority order:

  1. External ID
  2. Hive ID
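
One way to picture the sequencing priority is as a function that derives a sequencing key from the published identifying properties. The sketch below is illustrative only: the `sequencing_key` name, the key shape, and the choice of the first external ID when several are present are assumptions, not documented SyncHive behaviour.

```python
from typing import Optional, Sequence, Tuple


def sequencing_key(
    hive_id: Optional[str],
    external_ids: Sequence[Tuple[str, str]],
) -> Tuple[str, ...]:
    """Pick the identifying property used for sequencing (hypothetical helper)."""
    # Priority 1: external ID. Using the first one is an assumption of this sketch.
    if external_ids:
        integration_key, external_id = external_ids[0]
        return ("external", integration_key, external_id)
    # Priority 2: Hive ID.
    if hive_id is not None:
        return ("hive", hive_id)
    raise ValueError("at least one identifying property is required")
```

Under these assumptions, a message carrying both the Hive ID and an external ID gets an external-ID key, while a message for the same record carrying only the Hive ID gets a Hive-ID key. The two keys differ, which suggests why mixing the two styles for one record can lead to sequencing issues.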

Example Message

An example Product Message is provided below. The document provides a single identifying property: its external identity for the integration MyExampleIntegration, with the value 101. If this document were published to SyncHive, the external identity would first be used to look for a pre-existing document with that external identity. If one is found, the pre-existing document is updated with the new data. Otherwise a new record is created and a Hive ID is issued.

{
  "schemaName": "limber",
  "schemaVersion": "2.1.0",
  "shapeName": "Product",
  "dataId": {
    "@type": "DataReference",
    "schemaName": "limber",
    "shapeName": "Product",
    "externalIdentities": [
      {
        "integrationKey": "MyExampleIntegration",
        "externalId": "101"
      }
    ]
  },
  "dataAction": "PUBLISH",
  "mode": "LIVE",
  "integrationKey": "MyExampleIntegration",
  "data": {
    "@type": "Product",
    "name": "My Example Product",
    "externalIdentity": [
      {
        "@type": "ExternalID",
        "internalType": "Product",
        "externalSystemCode": "MyExampleIntegration",
        "externalId": "101"
      }
    ]
  },
  "storeKey": "limber"
}
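
As a rough sketch, a connector could assemble a message of this shape programmatically. The `build_product_message` helper below is hypothetical; it only mirrors the field names and values shown in the example above and makes no claim about which fields SyncHive requires.

```python
import json


def build_product_message(integration_key: str, external_id: str, name: str) -> dict:
    """Build a Product message shaped like the example above (hypothetical helper)."""
    return {
        "schemaName": "limber",
        "schemaVersion": "2.1.0",
        "shapeName": "Product",
        "dataId": {
            "@type": "DataReference",
            "schemaName": "limber",
            "shapeName": "Product",
            # The single identifying property: the external identity.
            "externalIdentities": [
                {"integrationKey": integration_key, "externalId": external_id}
            ],
        },
        "dataAction": "PUBLISH",
        "mode": "LIVE",
        "integrationKey": integration_key,
        "data": {
            "@type": "Product",
            "name": name,
            "externalIdentity": [
                {
                    "@type": "ExternalID",
                    "internalType": "Product",
                    "externalSystemCode": integration_key,
                    "externalId": external_id,
                }
            ],
        },
        "storeKey": "limber",
    }


if __name__ == "__main__":
    message = build_product_message("MyExampleIntegration", "101", "My Example Product")
    print(json.dumps(message, indent=2))
```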

Outbound Patterns

For resolving identities outbound, there are two recommended patterns:

  1. Hive ID Identity Resolution
  2. External ID Identity Resolution

Both identity resolution mechanisms have their own pros and cons, but wherever possible, the Hive ID Identity Resolution pattern is recommended. The prime reason for this is robustness. In cases where Hive ID Identity Resolution is not possible, External IDs are a fine backup.

Hive ID Identity Resolution

Hive ID Identity Resolution is a pattern where the Hive ID is written into the external system. The Hive ID is then used to link to the record in the external system. All records published by SyncHive to a connector have a Hive ID.

Pros

  • Can handle asynchronous publications
  • More robust as there is no need to write the External ID back into SyncHive

Cons

  • Generally requires an efficient read of the external system (by Hive ID) before an update
  • External system needs to have a field to hold the Hive ID
  • Hive ID in the external system may be user editable
  • Ideally the external system can atomically create a record and write the Hive ID

The steps to follow for Hive ID Identity Resolution are:

  1. Check if a record exists in the external system using the Hive ID from SyncHive
    1. If so then update the record in the external system
    2. Else create a new record in the external system, ensuring the Hive ID is written as well
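
The steps above can be sketched as follows. `FakeExternalSystem` is a hypothetical in-memory stand-in for an external system with a field for the Hive ID, and `write_with_hive_id` is an illustrative name; a real connector would call the external system's own API instead.

```python
class FakeExternalSystem:
    """Hypothetical external system that stores rows keyed by its own ID."""

    def __init__(self):
        self.rows = {}
        self._next_id = 1

    def find_by_hive_id(self, hive_id):
        # A linear scan keeps the sketch short; a real system would
        # ideally support an efficient lookup on the Hive ID field.
        for external_id, row in self.rows.items():
            if row.get("hive_id") == hive_id:
                return external_id
        return None

    def create(self, fields):
        external_id = str(self._next_id)
        self._next_id += 1
        self.rows[external_id] = dict(fields)
        return external_id

    def update(self, external_id, fields):
        self.rows[external_id].update(fields)


def write_with_hive_id(system, hive_id, fields):
    """Update the record linked by Hive ID, or create it with the Hive ID set."""
    external_id = system.find_by_hive_id(hive_id)
    if external_id is not None:
        system.update(external_id, fields)
        return external_id
    # Create the record and write the Hive ID in the same call,
    # so the link exists as soon as the record does.
    return system.create({**fields, "hive_id": hive_id})
```

Calling `write_with_hive_id` twice with the same Hive ID leaves a single row in the fake system, with the second call applied as an update.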

Asynchronous Writes

Hive ID Identity Resolution is necessary for asynchronous integrations with an external system. In this scenario an external ID is generally not available to link to immediately, so a Hive ID is required to link the record in the external system with the record in SyncHive.

Linking External IDs

Although not strictly necessary, it is still recommended to link an external ID for the record.

External ID Identity Resolution

External ID Identity Resolution is a pattern where, after writing into an external system, the external system's record identifier is written back into SyncHive against the record. The external ID in SyncHive is then used to link to the record in the external system.

Pros

  • External System doesn't need a field to hold the Hive ID

Cons

  • Need to store the External ID back into SyncHive (which may transiently fail)
  • Cannot handle asynchronous writes

The steps to follow for External ID Identity Resolution are:

  1. Check if the document from SyncHive has an external ID
    1. If so use the external ID to update the record in the external system
    2. Else write the record into the external system and return the external ID to SyncHive
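
A minimal sketch of these steps is shown below. It assumes the outbound document carries external IDs in an `externalIdentity` list with `externalSystemCode` and `externalId` fields, as in the example message earlier. `FakeExternalSystem`, `find_external_id`, and `write_with_external_id` are hypothetical names, and writing the returned ID back into SyncHive is left to the caller.

```python
class FakeExternalSystem:
    """Hypothetical external system that issues its own record IDs."""

    def __init__(self):
        self.rows = {}
        self._next_id = 1

    def create(self, fields):
        external_id = str(self._next_id)
        self._next_id += 1
        self.rows[external_id] = dict(fields)
        return external_id

    def update(self, external_id, fields):
        self.rows[external_id].update(fields)


def find_external_id(document, integration_key):
    """Return this integration's external ID from the document, if present."""
    for identity in document.get("externalIdentity", []):
        if identity.get("externalSystemCode") == integration_key:
            return identity["externalId"]
    return None


def write_with_external_id(system, document, integration_key):
    """Return (external_id, created); a created ID must be sent back to SyncHive."""
    fields = {"name": document.get("name")}
    external_id = find_external_id(document, integration_key)
    if external_id is not None:
        system.update(external_id, fields)
        return external_id, False
    # No external ID yet: create the record and hand the new ID to the caller.
    return system.create(fields), True
```

When `created` is true, the caller is responsible for writing the new external ID back into SyncHive. If that write-back fails transiently, the next publication would take the create path again, which is the robustness concern listed under Cons.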

Outbound Sequencing Order

Outbound sequencing is done solely by Hive IDs.