The Send to CDP Sync Module, Part 3 - Cache and Date Stamps

So far, we've covered the Send to CDP module's introduction, getting data from Send, and preparing a list of guests for upload. Since there's a hard limit of 50 MB per upload, I wanted to use a local cache so only changes are sent on each iteration. We'll go over that in this post.

We're now in the home stretch! This is the third of four articles for this module.

  1. Module Overview, the Extension and Module Configuration
  2. Download Users from Send Lists
  3. Cache and Date Stamps <- We're here
  4. Upload to CDP

How Caching is Accomplished

The rest of the module operates in the CdpSupport.cs file, and it starts with loading a cache of the last iteration. 

Caching is accomplished by saving the complete batch, before unchanged records are removed, as a local JSON file (original.json) in a datetime-named folder in App_Data. The module also saves a file called import.json (per CDP documentation) in the same directory. Later, import.json gets compressed and uploaded to CDP.

So now we have a representation of all records in Send in JSON format from the last run. How do we get it this time around? Glad you asked. 

var batchFiles = BatchFileInformation();

The above line calls a method that returns a Tuple<string, string>. The first string is the latest directory from a previous run (remember, it's datetime-named) if there is one, and the second string is the newly created folder for this run.

This method also purges old directories and loads all guest records from the last run (using original.json) into CachedGuestRecords.
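To make the folder handling concrete, here is a minimal sketch of the idea, not the module's actual BatchFileInformation. The class and method names, the folder-name format, and the foldersToKeep parameter are all assumptions made for illustration. It finds the newest existing run folder, purges older ones, and creates a folder for the current run.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Linq;

public static class BatchFolderSketch
{
    // Hypothetical folder-name format. It sorts chronologically when sorted by name.
    private const string FolderFormat = "yyyyMMdd-HHmmss";

    // Returns (newest folder from an earlier run, or null) and (new folder for this run).
    public static Tuple<string, string> GetBatchFolders(string root, DateTime utcNow, int foldersToKeep)
    {
        Directory.CreateDirectory(root);

        // Existing run folders, newest first (name order equals time order).
        var existing = Directory.GetDirectories(root)
            .OrderByDescending(d => Path.GetFileName(d), StringComparer.Ordinal)
            .ToList();

        var previous = existing.FirstOrDefault();

        // Always keep at least the newest folder, because its cache is still needed.
        foreach (var stale in existing.Skip(Math.Max(1, foldersToKeep)))
        {
            Directory.Delete(stale, recursive: true);
        }

        var current = Path.Combine(root, utcNow.ToString(FolderFormat, CultureInfo.InvariantCulture));
        Directory.CreateDirectory(current);

        return Tuple.Create(previous, current);
    }
}
```

On the very first run there is no earlier folder, so Item1 is null and the caller has to treat the cache as empty.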

var existingRecords = JsonConvert.DeserializeObject<List<GuestRecord>>
  (oldFileContent, IgnoreMissingSerializerSettings);
CachedGuestRecords = existingRecords?.ToDictionary(gr => gr.Value.Email, gr => gr);
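One thing to keep in mind with this pattern: ToDictionary throws an ArgumentException on duplicate keys, and the null-conditional operator leaves the dictionary null when the file deserializes to null. The sketch below shows a more defensive way to build the same kind of index. GuestSketch is a hypothetical flat stand-in for the module's GuestRecord, and the case-insensitive key comparison is an assumption about how emails should match, not something the module is confirmed to do.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the module's GuestRecord type.
public sealed class GuestSketch
{
    public string Email { get; set; }
}

public static class CacheIndexSketch
{
    // Builds an email-keyed index that never returns null and tolerates duplicates.
    public static Dictionary<string, GuestSketch> IndexByEmail(IEnumerable<GuestSketch> records)
    {
        // Assumption: emails that differ only by letter case are the same guest.
        var index = new Dictionary<string, GuestSketch>(StringComparer.OrdinalIgnoreCase);
        if (records == null)
        {
            return index; // no cache yet, so every lookup simply misses
        }

        foreach (var record in records)
        {
            if (record == null || string.IsNullOrWhiteSpace(record.Email))
            {
                continue; // nothing usable to key on
            }

            index[record.Email] = record; // the last duplicate wins instead of throwing
        }

        return index;
    }
}
```

With an empty dictionary instead of null, later TryGetValue calls against the cache are safe on the first run.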


Loading Timestamps From the Latest Cache

Now that we have the complete list of records from the last (CachedGuestRecords) and current runs (NewCopyOfGuestRecords), we need to perform one more step before removing identical records, and that is updating the timestamps.

The extension in CDP will show the last time a subscription was true or false, but these timestamps are not available to us, so the cache will retain them.



The method iterates over each new record in NewCopyOfGuestRecords and tries to get a matching record from the cache based on the identifier (email).

foreach (var newlyDownloadedGuestRecord in NewCopyOfGuestRecords)
{
    if (CachedGuestRecords.TryGetValue(newlyDownloadedGuestRecord.Key, out var cachedGuestRecord))
    {

If found, it loads the same Ref (not mandatory) and the extensions for each one.

var cachedExention = cachedGuestRecord.Value.Extensions.FirstOrDefault();
var newExtension = newlyDownloadedGuestRecord.Value.Value.Extensions.FirstOrDefault();

Yes, there's more than one extension possible in CDP, but this module only knows about the one we're uploading.

It then iterates over each property that starts with subscription and doesn't end with LastUpdated. 

foreach (var key in newExtension.Keys.Where(x => x.StartsWith("subscription") 
  && !x.EndsWith("LastUpdated")))

If the values are not null and are the same, it keeps the timestamp from the cache, which was itself carried over from the run before, and so on. If they're different, or the cache has no timestamp for that property yet, the timestamp is set to now.

if (cachedExention[key].ToLower() == newExtension[key].ToLower())
{
    var timestampToKeep = cachedExention.ContainsKey($"{key}LastUpdated") 
        ? cachedExention[$"{key}LastUpdated"] 
        : "";
    if (string.IsNullOrWhiteSpace(timestampToKeep))
    {
        tempKeys[$"{key}LastUpdated"] = DateTime.UtcNow.ToString("o");
    }
    else
    {
        tempKeys[$"{key}LastUpdated"] = timestampToKeep;
    }
}
else
{
    tempKeys[$"{key}LastUpdated"] = DateTime.UtcNow.ToString("o");
}
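// After the loop above: copy the staged timestamps into the new extension.
// They are staged in tempKeys first because adding keys to a dictionary
// while enumerating it throws an InvalidOperationException.
foreach (var key in tempKeys.Keys)
{
    if (newExtension.ContainsKey(key))
    {
        newExtension[key] = tempKeys[key];
    }
    else
    {
        newExtension.Add(key, tempKeys[key]);
    }
}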
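Pulled together, the timestamp logic above can be sketched as one self-contained method. This is an illustration under assumptions, not the module's code: the class and method names are hypothetical, the extensions are assumed to be string-to-string dictionaries, the current time is passed in so the behaviour is repeatable, and OrdinalIgnoreCase is swapped in for the ToLower() comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimestampSketch
{
    // Writes a "<key>LastUpdated" entry into 'fresh' for every subscription key.
    public static void CarryOverTimestamps(
        IDictionary<string, string> cached,
        IDictionary<string, string> fresh,
        DateTime utcNow)
    {
        var now = utcNow.ToString("o"); // ISO 8601 round-trip format
        var staged = new Dictionary<string, string>();

        foreach (var key in fresh.Keys.Where(k =>
                     k.StartsWith("subscription", StringComparison.Ordinal) &&
                     !k.EndsWith("LastUpdated", StringComparison.Ordinal)))
        {
            var stampKey = key + "LastUpdated";
            cached.TryGetValue(key, out var oldValue);
            cached.TryGetValue(stampKey, out var oldStamp);

            // Unchanged means the cache has a value and it matches, ignoring case.
            var unchanged = oldValue != null &&
                string.Equals(oldValue, fresh[key], StringComparison.OrdinalIgnoreCase);

            // Keep the old stamp only if the value is unchanged and a stamp exists.
            staged[stampKey] = unchanged && !string.IsNullOrWhiteSpace(oldStamp) ? oldStamp : now;
        }

        // Apply after the loop so 'fresh' is not modified while it is being enumerated.
        foreach (var pair in staged)
        {
            fresh[pair.Key] = pair.Value;
        }
    }
}
```

A subscription that was "true" last run and is "True" this run keeps its old timestamp, while a subscription the cache has never seen gets the current time.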


Removing Unchanged Records

OK, so now NewCopyOfGuestRecords is loaded with date stamps from the cached copy wherever the Boolean properties haven't changed. The next step is a simple removal of unchanged records by calling RemoveUnchangedRecords, so only records that have indeed changed since the last run are left.

The key line in the method below is the SequenceEqual call, where the extensions in each record are compared. Records that are the same are added to a temporary list and removed after the iteration:

foreach (var newlyDownloadedGuestRecord in NewCopyOfGuestRecords)
{
    if (CachedGuestRecords.TryGetValue(newlyDownloadedGuestRecord.Key, out var cachedGuestRecord))
    {
        var cachedExention = cachedGuestRecord.Value.Extensions.FirstOrDefault();
        var newExtension = newlyDownloadedGuestRecord.Value.Value.Extensions.FirstOrDefault();
        if (cachedExention == null)
        {
            continue;
        }
        else if (newExtension == null)
        {
            recordsToRemove.Add(newlyDownloadedGuestRecord.Key);
            continue;
        }
        else if (cachedExention.OrderBy(kvp => kvp.Key)
                .SequenceEqual(newExtension.OrderBy(kvp => kvp.Key)))
        {
            recordsToRemove.Add(newlyDownloadedGuestRecord.Key);
            skippedRecordUnchangedCount++;
        }
    }
}
if (recordsToRemove.Any())
{
    foreach (var key in recordsToRemove)
    {
        NewCopyOfGuestRecords.Remove(key);
    }
}
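The comparison at the heart of the method above can be tried on its own. The sketch below assumes the extensions are string-to-string dictionaries; the class and method names are hypothetical. Sorting both sides by key first means insertion order does not matter, and SequenceEqual then compares the key/value pairs one by one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtensionComparisonSketch
{
    // True when both dictionaries hold exactly the same keys and values.
    public static bool SameContent(
        IDictionary<string, string> left,
        IDictionary<string, string> right)
    {
        // Order both sides by key so insertion order has no effect on the result.
        return left.OrderBy(kvp => kvp.Key, StringComparer.Ordinal)
            .SequenceEqual(right.OrderBy(kvp => kvp.Key, StringComparer.Ordinal));
    }
}
```

Note that the pairs are compared with default equality, so string values are compared case-sensitively: "True" and "true" count as different here, unlike in the timestamp step, which lower-cases both values first.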


Should I Stay or Should I Go?

The last check before deciding if an upload is necessary looks to see if the new guest records have anything left in them. 

if (NewCopyOfGuestRecords.Count == 0)
{
    message = "No guest records to update.";
    TaskSummary.AppendLine(message);
    Log.Info($"{logPrefix} {message}", this);
    ArchiveBatchFiles(batchFiles.Item1, batchFiles.Item2);
    return true;
}

I haven't mentioned the TaskSummary StringBuilder or the logging in earlier examples. I've removed them from some of the blog code samples, but they're everywhere in the module and in Git. The task summary updates the main settings item for an easy view of the last run.


What's Next?

The next and last post for this module will cover uploading to CDP. See you there.