The Send to CDP Sync Module, Part 4 - Deploying to CDP

Ok, that's it! We're at the final piece of this four-part series on the Send to CDP Sync module. In this episode, we'll cover the API calls used for deploying the batch file, monitoring its progress, and securing the local cache that was covered in the last post.


Deploying to CDP

We have the JSON file with the changes that are being uploaded and the method UploadBatchFileAsync in CdpGateway.cs will handle this. You can go over the entire method but I'm going to cover the key points here.

First, and most importantly, the format CDP needs is not a standard JSON array. There must be one record per line, with no commas separating the records and no enclosing array brackets (essentially newline-delimited JSON). You can see the formatting here:

importJson = importJson.TrimStart('[').TrimEnd(']').Replace(",{\"ref\":", Environment.NewLine + "{\"ref\":");
importJson = Regex.Replace(importJson, ":\"false\"", ":false", RegexOptions.IgnoreCase);
importJson = Regex.Replace(importJson, ":\"true\"", ":true", RegexOptions.IgnoreCase);

Boolean values that were previously serialized as the strings "true" and "false" are converted to real JSON Booleans here as well.
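As an alternative to the string replacements above, the same one-record-per-line output can be produced by parsing the array first. This is only a minimal sketch, not the module's code: it assumes Newtonsoft.Json is available (the module already uses JsonConvert), and the class and method names (NdjsonSketch, ToLineDelimited) are hypothetical.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class NdjsonSketch
{
    // Hypothetical helper: converts a JSON array into one compact JSON object per line.
    public static string ToLineDelimited(string jsonArray)
    {
        var records = JArray.Parse(jsonArray);

        // Replace string values "true"/"false" (any casing) with real Booleans.
        // ToList() snapshots the values so the tree can be modified while looping.
        var stringValues = records.Descendants()
            .OfType<JValue>()
            .Where(v => v.Type == JTokenType.String)
            .ToList();
        foreach (var value in stringValues)
        {
            if (bool.TryParse((string)value, out var parsed))
                value.Replace(new JValue(parsed));
        }

        // Formatting.None keeps each record on a single line.
        return string.Join(Environment.NewLine,
            records.Select(r => r.ToString(Formatting.None)));
    }
}
```

Parsing avoids depending on every record starting with "ref", at the cost of loading the whole array into memory. Note that this sketch converts any string that parses as a Boolean, so a text field that legitimately contains "true" would be converted too.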

The importJson string is then saved as a file, compressed with gzip, and checked against CDP's maximum file size limit of 50 MB.

File.WriteAllText(importFilePath, importJson);
gzipPath = $"{importFilePath}.gz";
using (var originalFileStream = File.OpenRead(importFilePath))
using (var compressedFileStream = File.Create(gzipPath))
using (var compressionStream = new GZipStream(compressedFileStream, CompressionLevel.Optimal))
{
    originalFileStream.CopyTo(compressionStream);
}
long gzipFileSize = new FileInfo(gzipPath).Length;
double percentOf50MB = (double)gzipFileSize / (50 * 1024 * 1024) * 100;
string percentString = $"{percentOf50MB:F2}% of 50MB";
if (percentOf50MB > 100)
{
    Log.Error($"{logPrefix} The gzip file exceeds 50MB. Current size: {gzipFileSize} bytes ({percentString}).", this);
    throw new InvalidOperationException("The gzip file exceeds the maximum allowed size of 50MB.");
}

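Next, we prepare the values the CDP API calls need: the MD5 checksum of the gzip file as a hex string, the same hash Base64-encoded, a new GUID for the batch, and the JSON request body.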

string md5Checksum;
using (var md5 = MD5.Create())
using (var stream = File.OpenRead(gzipPath))
{
    var hash = md5.ComputeHash(stream);
    // Hex form of the hash goes in the batch request body.
    md5Checksum = BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
    // Base64 form of the same raw bytes goes in the Content-Md5 upload header.
    base64 = System.Convert.ToBase64String(hash);
}
guid = Guid.NewGuid().ToString();
var batchUploadRequestBody = new BatchUploadRequestBody { Checksum = md5Checksum, Size = gzipFileSize };
batchUploadRequestBodyJson = JsonConvert.SerializeObject(batchUploadRequestBody, new JsonSerializerSettings
{
    ContractResolver = new LowercaseContractResolver(),
    Formatting = Formatting.None
});

At this point we have four key variables that the CDP API will need:

  • gzipFileSize = 1263
  • md5Checksum = 39461463ae2e42e0b7aa5237cf9d863b
  • base64 = OUYUY64uQuC3qlI3z52GOw==
  • guid = 6160324e-11be-4aec-85ad-325c0e85cd08
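The md5Checksum and base64 values in the list above are two encodings of the same 16 hash bytes. The short sketch below illustrates that relationship using the sample values; the class and method names (ChecksumSketch, HexToBase64) are hypothetical and are not part of the module.

```csharp
using System;
using System.Linq;

public static class ChecksumSketch
{
    // Hypothetical helper: decodes a hex string into bytes, then Base64-encodes those bytes.
    public static string HexToBase64(string hex)
    {
        var bytes = Enumerable.Range(0, hex.Length / 2)
            .Select(i => Convert.ToByte(hex.Substring(i * 2, 2), 16))
            .ToArray();
        return Convert.ToBase64String(bytes);
    }
}
```

Calling ChecksumSketch.HexToBase64("39461463ae2e42e0b7aa5237cf9d863b") returns "OUYUY64uQuC3qlI3z52GOw==", matching the sample values listed above.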


The first call is a PUT to the batch API with the GUID appended, which gives https://api-engage-us.sitecorecloud.io/v2/batches/6160324e-11be-4aec-85ad-325c0e85cd08. The body, as seen below, is the batchUploadRequestBodyJson from above.

var clientKey = GlobalConfigItem.Fields[Templates.SendToCdpSettings.CdpGateway.Fields.ApiClientKey]?.Value;
var apiToken = GlobalConfigItem.Fields[Templates.SendToCdpSettings.CdpGateway.Fields.ApiToken]?.Value;
var authToken = System.Convert.ToBase64String(Encoding.ASCII.GetBytes($"{clientKey}:{apiToken}"));
_httpClient.DefaultRequestHeaders.Authorization = new System.Net.Http.Headers.AuthenticationHeaderValue("Basic", authToken);
using (var request = new HttpRequestMessage(HttpMethod.Put, endpoint)
{
    Content = new StringContent(batchUploadRequestBodyJson, Encoding.UTF8, "application/json")
})
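For reference, here is a rough sketch of how the request body for this call could be serialized. The post does not show BatchUploadRequestBody or LowercaseContractResolver, so both definitions below are assumptions reconstructed from how they are used; only the Newtonsoft.Json calls are known library APIs.

```csharp
using System;
using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// Assumed shape of the request body, based on the two properties set in the post.
public class BatchUploadRequestBody
{
    public string Checksum { get; set; }
    public long Size { get; set; }
}

// Assumed implementation: writes every property name in lowercase.
public class LowercaseContractResolver : DefaultContractResolver
{
    protected override string ResolvePropertyName(string propertyName)
    {
        return propertyName.ToLowerInvariant();
    }
}

public static class BatchBodySketch
{
    // Hypothetical helper mirroring the serialization call shown earlier.
    public static string Serialize(string md5Checksum, long gzipFileSize)
    {
        var body = new BatchUploadRequestBody { Checksum = md5Checksum, Size = gzipFileSize };
        return JsonConvert.SerializeObject(body, new JsonSerializerSettings
        {
            ContractResolver = new LowercaseContractResolver(),
            Formatting = Formatting.None
        });
    }
}
```

With the sample values, BatchBodySketch.Serialize("39461463ae2e42e0b7aa5237cf9d863b", 1263) produces {"checksum":"39461463ae2e42e0b7aa5237cf9d863b","size":1263}.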

Later in this method, the uploadLocation variable is populated:

uploadLocation = batchUploadRequestResponse.Location.Href;

The URL looks something like this:

https://boxever-batch-service-production-us-east-1.s3.us-east-1.amazonaws.com/3106ccd4123.......

The final piece of the process is to send a PUT to the uploadLocation URL with the gzip file as the content:

using (var fileStream = File.OpenRead(gzipPath))
{
    using (var request = new HttpRequestMessage(HttpMethod.Put, uploadLocation)
    {
        Content = new StreamContent(fileStream)
    })
    {
        request.Content.Headers.Add("x-amz-server-side-encryption", "AES256");
        request.Content.Headers.Add("Content-Md5", base64);
        request.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/octet-stream");
        using (var uploadResponse = await _httpClient.SendAsync(request))
        {
            if (uploadResponse.IsSuccessStatusCode)
            {
                MnpLogger.Info($"{logPrefix} Upload complete.");
                return true;
            }
            else
            {
                MnpLogger.Error($"{logPrefix} Error uploading batch file to CDP.");
                return false;
            }
        }
    }
}




Cleaning up the Cache Files

Once deployed using the gateway above, the method ArchiveBatchFiles is called from the CdpSupport.cs file. In it, the previously extracted JSON files from the last run are deleted.

var cachedOriginalFile = $"{originalFolderPath}\\{Constants.OriginalBatchFileName}";
if (File.Exists(cachedOriginalFile))
    File.Delete(cachedOriginalFile);
var cachedBatchFile = $"{originalFolderPath}\\{Constants.BatchFileName}";
if (File.Exists(cachedBatchFile))
    File.Delete(cachedBatchFile);

Then, the new files that were created for this run are added to a password-protected zip file, so that if someone gains access to App_Data, the files are still protected.

var archivePath = $"{newFolderPath}\\{Constants.ArchiveFileName}";
var newOriginalFile = $"{newFolderPath}\\{Constants.OriginalBatchFileName}";
var newBatchFile = $"{newFolderPath}\\{Constants.BatchFileName}";
using (var zip = new Ionic.Zip.ZipFile())
{
    zip.Encryption = Ionic.Zip.EncryptionAlgorithm.WinZipAes256;
    zip.Password = CompressionPassword;
    if (File.Exists(newOriginalFile))
        zip.AddFile(newOriginalFile, "");
    if (File.Exists(newBatchFile))
        zip.AddFile(newBatchFile, "");
    foreach (var entry in zip.Entries)
    {
        entry.Password = CompressionPassword;
        entry.Encryption = Ionic.Zip.EncryptionAlgorithm.WinZipAes256;
    }
    zip.Save(archivePath);
}
if (File.Exists(newOriginalFile))
    File.Delete(newOriginalFile);
if (File.Exists(newBatchFile))
    File.Delete(newBatchFile);


Saving Results for Review

The main settings item has a multiline text field that gets updated with a summary of the run. This is handy for Sitecore users to review:

Added email list SaleSubscribers to the processing queue.
Added email list NewsletterSubscribers to the processing queue.
2 email lists are ready for processing.
Processing mailing list: subscriptionSaleSubscribers
Processing list complete. 1 total records so far.
Processing list complete. 1 total records so far.
Processing list complete. 1 total records so far.
Processing list complete. 1 total records so far.
Processing mailing list: subscriptionNewsletterSubscribers
Processing list complete. 5 total records so far.
Processing list complete. 5 total records so far.
Processing list complete. 5 total records so far.
Processing list complete. 5 total records so far.
Getting previous values from cached batch files.
Loaded 5 records from cache.
Updating extension time stamps based on cache.
Creating unedited cache file representing all records in Send.
Removing unchanged records from this batch.
Records processed: Removed from batch as unchanged: 3, leaving 2.
Batch file uploaded successfully.
Removing temporary cached files.
Archiving new batch files and compressing with password set in settings.
Successfully uploaded changes to CDP.
Task completed in 0.04 minutes.

Checking the Upload Status

In your CDP portal, go to Developer center -> Batch uploads, where you'll see the progress of the job. You can see here the current one and a completed one. 



We're Done

This is the end of the line for this series. Thanks for reading and drop me a line if you want to know more!