Overview
Identity resolution is the process of mapping/linking a record from an external system to a record in SyncHive.
Terminology
Term | Definition |
---|---|
Hive ID | The primary key of a record in SyncHive. The Hive ID is generated by SyncHive. The current format of a Hive ID is a UUIDv4. Although it is possible to generate a UUIDv4 and publish it as a Hive ID, it is not recommended to do so, as the format of the Hive ID may change. |
Core ID | An alternate name for Hive ID. |
Alternate Key | An alternate key for a record in SyncHive. |
External ID/External Identity | Refer to: External ID |
Identifying property | An ID property which can identify the record. E.g. an external ID or a Hive ID |
Root Record | The top level record of the document. |
Message | Refer to: Message |
Inbound Pattern
To resolve identities inbound to SyncHive, the provided identifying properties are used to attempt to link to a pre-existing record. At least one set of identifying properties needs to be provided; when multiple are provided, a priority is applied.
The priority of matching with a record in SyncHive is:
- Hive ID
- External IDs
- Alternate Keys
Note: This is different from the sequencing order outlined below.
If a pre-existing record is not found within SyncHive, a new record will be created.
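The matching priority above can be sketched as follows. This is a minimal illustration, not SyncHive's implementation: `Record`, `resolve_inbound`, and the in-memory list used as a store are hypothetical names. The sketch also assumes that a supplied Hive ID which matches nothing simply falls through to the next identifying property, a case this page does not specify.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Record:
    """Hypothetical stand-in for a record held in SyncHive."""
    hive_id: str
    external_ids: Dict[str, str] = field(default_factory=dict)  # integration key -> external ID
    alternate_keys: Set[str] = field(default_factory=set)


def resolve_inbound(
    store: List[Record],
    hive_id: Optional[str] = None,
    external_ids: Optional[Dict[str, str]] = None,
    alternate_keys: Optional[Set[str]] = None,
) -> Tuple[Record, bool]:
    """Return (record, created), matching in priority order."""
    external_ids = external_ids or {}
    alternate_keys = alternate_keys or set()
    if not (hive_id or external_ids or alternate_keys):
        raise ValueError("at least one identifying property is required")

    # 1. Hive ID has the highest matching priority.
    if hive_id:
        for record in store:
            if record.hive_id == hive_id:
                return record, False

    # 2. External IDs are tried next.
    for record in store:
        if any(record.external_ids.get(k) == v for k, v in external_ids.items()):
            return record, False

    # 3. Alternate keys are tried last.
    for record in store:
        if record.alternate_keys & alternate_keys:
            return record, False

    # No match: create a new record with a freshly issued Hive ID.
    # (Assumption: the sketch always issues its own ID, mirroring the
    # guidance that Hive IDs are generated by SyncHive.)
    record = Record(str(uuid.uuid4()), dict(external_ids), set(alternate_keys))
    store.append(record)
    return record, True
```

Publishing the same external ID twice resolves to the same record the second time, while a message carrying both a Hive ID and an external ID is matched on the Hive ID first.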
Recommended Pattern
Although it is possible to integrate solely with the Hive ID, it is recommended to provide all the identifying properties possible. In other words, provide the external ID and, if available, the Hive ID.
Not providing the external ID (when available) comes with downsides:
- The external ID may become required later on, which introduces a data backfilling exercise
- It is hard to reconcile records which have already been integrated into an external system
Integrating with just the Hive ID
Integrating solely with Hive IDs requires the Hive ID to have been written into the external system by the outbound Hive ID Identity Resolution pattern. Using a mix of the Hive ID and external ID can cause sequencing issues and is not recommended.
Inbound Sequencing Order
SyncHive sequences records with the same identity. When more than one identifying property is published, the property used for sequencing is chosen in the following priority order:
- External ID
- Hive ID
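As a rough illustration of the sequencing priority, the sketch below picks the key a message would be sequenced under. The function name `sequencing_key` and the tuple shape of the key are hypothetical; how SyncHive represents sequences internally is not described here.

```python
from typing import Optional, Tuple


def sequencing_key(
    external_id: Optional[Tuple[str, str]],  # (integration key, external ID)
    hive_id: Optional[str],
) -> Tuple[str, ...]:
    """Pick the identifying property used for sequencing (hypothetical helper)."""
    # External ID takes priority for inbound sequencing.
    if external_id is not None:
        integration_key, value = external_id
        return ("external", integration_key, value)
    # Hive ID is used only when no external ID was published.
    if hive_id is not None:
        return ("hive", hive_id)
    raise ValueError("at least one identifying property is required")
```

Under this assumption, one message carrying only a Hive ID and another carrying only the external ID of the same record would produce different keys, which is one way a mix of the two could lead to sequencing issues.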
Example Message
An example Product Message is provided below. The document provides a single identifying property: its external identity for the integration MyExampleIntegration. The value of the external identity is 101. If this document were published to SyncHive, the external identity would first be used to find a pre-existing document with that external identity. If one is found, the pre-existing document is updated with the new data; otherwise a new record is created and a Hive ID is issued.
{
  "schemaName": "limber",
  "schemaVersion": "2.1.0",
  "shapeName": "Product",
  "dataId": {
    "@type": "DataReference",
    "schemaName": "limber",
    "shapeName": "Product",
    "externalIdentities": [
      {
        "integrationKey": "MyExampleIntegration",
        "externalId": "101"
      }
    ]
  },
  "dataAction": "PUBLISH",
  "mode": "LIVE",
  "integrationKey": "MyExampleIntegration",
  "data": {
    "@type": "Product",
    "name": "My Example Product",
    "externalIdentity": [
      {
        "@type": "ExternalID",
        "internalType": "Product",
        "externalSystemCode": "MyExampleIntegration",
        "externalId": "101"
      }
    ]
  },
  "storeKey": "limber"
}
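A connector would typically assemble such a message in code rather than by hand. The sketch below rebuilds the example above from three inputs. The helper name `build_product_message` is hypothetical, and the field names and fixed values are copied from the example rather than from a schema reference, so which of them are required is an assumption.

```python
import json
from typing import Any, Dict


def build_product_message(integration_key: str, external_id: str, name: str) -> Dict[str, Any]:
    """Build a Product message shaped like the example above (hypothetical helper)."""
    return {
        "schemaName": "limber",
        "schemaVersion": "2.1.0",
        "shapeName": "Product",
        # Identity of the root record: a single external identity.
        "dataId": {
            "@type": "DataReference",
            "schemaName": "limber",
            "shapeName": "Product",
            "externalIdentities": [
                {"integrationKey": integration_key, "externalId": external_id}
            ],
        },
        "dataAction": "PUBLISH",
        "mode": "LIVE",
        "integrationKey": integration_key,
        # Payload: the same external identity is repeated on the record itself.
        "data": {
            "@type": "Product",
            "name": name,
            "externalIdentity": [
                {
                    "@type": "ExternalID",
                    "internalType": "Product",
                    "externalSystemCode": integration_key,
                    "externalId": external_id,
                }
            ],
        },
        "storeKey": "limber",
    }


if __name__ == "__main__":
    message = build_product_message("MyExampleIntegration", "101", "My Example Product")
    print(json.dumps(message, indent=2))
```

Deriving both identity entries from the same two arguments keeps the external ID in `dataId` and in `data` from drifting apart.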
Outbound Patterns
For resolving identities outbound, there are two recommended patterns:
- Hive ID Identity Resolution
- External ID Identity Resolution
Both identity resolution mechanisms have their own pros and cons, but wherever possible, the Hive ID Identity Resolution pattern is recommended. The prime reason for this is robustness. In cases where Hive ID Identity Resolution is not possible, External IDs are a fine backup.
Hive ID Identity Resolution
Hive ID Identity Resolution is a pattern where the Hive ID is written into the external system. The Hive ID is then used to link to a record in the external system. All records published by SyncHive to a connector have a Hive ID.
Pros
- Can handle asynchronous publications
- More robust as there is no need to write the External ID back into SyncHive
Cons
- Generally requires an efficient read of the external system (a lookup by Hive ID) before each update
- External system needs to have a field to hold the Hive ID
- Hive ID in the external system may be user editable
- Ideally the external system can atomically create a record and write the Hive ID
The steps to follow for Hive ID Identity Resolution are:
- Check if a record exists in the external system using the Hive ID from SyncHive
- If so then update the record in the external system
- Else create a new record in the external system, ensuring the Hive ID is written as well
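The steps above can be sketched as a small upsert routine. `ExternalSystem` here is a hypothetical in-memory stand-in for an external system that has a field for the Hive ID. A real connector would call that system's own API instead, ideally creating the record and writing the Hive ID in one atomic operation.

```python
from typing import Any, Dict, Optional, Tuple


class ExternalSystem:
    """Hypothetical in-memory external system with a Hive ID field."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}  # external ID -> row
        self._next_id = 1

    def find_by_hive_id(self, hive_id: str) -> Optional[str]:
        # A real system would need an efficient (indexed) lookup here.
        for external_id, row in self.rows.items():
            if row["hive_id"] == hive_id:
                return external_id
        return None

    def create(self, hive_id: str, data: Dict[str, Any]) -> str:
        # The record and its Hive ID are written together.
        external_id = str(self._next_id)
        self._next_id += 1
        self.rows[external_id] = {"hive_id": hive_id, **data}
        return external_id

    def update(self, external_id: str, data: Dict[str, Any]) -> None:
        self.rows[external_id].update(data)


def upsert_by_hive_id(system: ExternalSystem, hive_id: str, data: Dict[str, Any]) -> Tuple[str, str]:
    """Return (external ID, action), where action is 'created' or 'updated'."""
    external_id = system.find_by_hive_id(hive_id)
    if external_id is not None:
        system.update(external_id, data)
        return external_id, "updated"
    return system.create(hive_id, data), "created"
```

Because the link is stored in the external system itself, publishing the same Hive ID again resolves to the same record without anything being written back to SyncHive.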
Asynchronous Writes
Hive ID Identity Resolution is necessary for asynchronous integrations with an external system. Generally in this scenario, an external ID is not immediately available to link to, so a Hive ID is required to link the record in the external system with the record in SyncHive.
Linking External IDs
Although not strictly necessary, it is still recommended to link an external ID for the record.
External ID Identity Resolution
External ID Identity Resolution is a pattern where, after writing into an external system, the external system's record identifier is written back into SyncHive against the record. The external ID in SyncHive is then used to link to a record in the external system.
Pros
- External System doesn't need a field to hold the Hive ID
Cons
- Need to store the External ID back into SyncHive (which may transiently fail)
- Cannot handle asynchronous writes
The steps to follow for External ID Identity Resolution are:
- Check if the document from SyncHive has an external ID
- If so use the external ID to update the record in the external system
- Else write the record into the external system and return the external ID to SyncHive
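The steps above can be sketched as follows. The document shape used here (a dict with `external_ids` and `data` keys), the `SimpleExternalSystem` class, and the function name are all hypothetical. How the new external ID is actually returned to SyncHive depends on the connector and is left to the caller in this sketch.

```python
from typing import Any, Dict, Optional


class SimpleExternalSystem:
    """Hypothetical in-memory external system without a Hive ID field."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Any]] = {}  # external ID -> row
        self._next_id = 1

    def create(self, data: Dict[str, Any]) -> str:
        external_id = str(self._next_id)
        self._next_id += 1
        self.rows[external_id] = dict(data)
        return external_id

    def update(self, external_id: str, data: Dict[str, Any]) -> None:
        self.rows[external_id].update(data)


def write_by_external_id(
    system: SimpleExternalSystem,
    document: Dict[str, Any],
    integration_key: str,
) -> Optional[str]:
    """Write the document; return a new external ID to report back, or None."""
    external_id = document.get("external_ids", {}).get(integration_key)
    if external_id is not None:
        # The document is already linked: update the existing record.
        system.update(external_id, document["data"])
        return None
    # Not linked yet: create the record. The caller must return this ID
    # to SyncHive, a step that may transiently fail and need a retry.
    return system.create(document["data"])
```

If the write-back is lost, the next publication of the same document arrives without an external ID and would create a duplicate, which is the robustness cost listed in the cons above.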
Outbound Sequencing Order
Outbound sequencing is done solely by Hive ID.