parsing – Peter Eliason :: Analytics Leader

When sending emails from ExactTarget (aka: SalesForce Marketing Cloud) — you know, we’ll just call it “ET” for short — we occasionally receive error logs which contain ET subscriber keys where a small subset of targeted customers match profiles of other customers who have unsubscribed or been flagged as undeliverable. Our goal is to process these log files and mark these customers as “undeliverable” in our parent system. Since the log files are not formatted for data import, we need to use a parsing tool to extract the subscriber keys. I chose Python.

In the code below, I enter the path of the log file which was emailed to me from ET, and then I use re (regular expressions) to find all instances that match the subscriber key format, which is “[varchars-varchars-varchars-varchars-varchars]”.

Your version of subscriber keys will undoubtedly look different than mine, but you should be able to modify the regular expression in the re.compile() function to search for the right format. More info about Python’s regular expression class.

Let me know what you think!

I have included sample files with fake data for reference:
Input File (9989FFA6-BD29-8DBA-B712-C6E8ED32F0X9 Results 988623122.txt)
Output File (cleanFile.txt)

[su_heading]Python Code[/su_heading]


# -*- coding: utf-8 -*-
"""
Created on Wed Feb  4 13:59:50 2015

@author: peliason

Purpose: Parse out subscriber key from ET List detective.

"""

from __future__ import division, print_function



##PARAMETERS
inputFile = "C://Users//peliason//Documents//reference//python//ET parse subscriber//for post//9989FFA6-BD29-8DBA-B712-C6E8ED32F0X9 Results 988623122.txt"

outputFile = "C://Users//peliason//Documents//reference//python//ET parse subscriber//for post/cleanFile.txt"
###END PARAMETERS





import re
import codecs


#initialize output list
found = []

#this is my search function
subscriberKeySearch = re.compile(r'[([A-Za-z0-9_] - [A-Za-z0-9_] - [A-Za-z0-9_] - [A-Za-z0-9_] - [A-Za-z0-9_] )]')

## Open the file with read only permit
f = codecs.open(inputFile, 'r','utf-16')

##IMPORTANT: if your input file is utf-8, then you should be able to run this 
#   file open code instead (without the codes class):
#   f = open(inputFile, 'r')


## Read the first line 
line = f.readline()

## till the file is empty
while line:
    
    found.extend(subscriberKeySearch.findall(line))
    
    line = f.readline()
f.close()

#write the list to the outputFile
myOutput = open(outputFile, 'w')
for ele in found:
    myOutput.write(ele 'n')
    
myOutput.close()

Tag: parsing

Parsing Exact Target (aka SalesForce Marketing Cloud) List Detective Import Error Files