Sunday, March 30, 2014

Doing the Right Thing: Are your measures correct?

"A lot of good analysis is wasted doing the wrong thing."
Anyone who has worked with data on business problems is probably aware of this adage.  And this past week, I was reminded of it once again while analyzing a marketing program.  The example is striking because the difference between doing the "right" thing and the "almost-right" thing turned out to be more than a factor of 10 -- a really big variance on a financial calculation.

Some background.  One of my clients does a lot of prospecting on the web.  They have various campaigns to increase leads to their web site.  These campaigns cost money.  Is it worth it to invest in a particular program?

This seems easy enough to answer, assuming the incoming leads are coded with their source (and they seem to be).  Just look at the leads coming in.  Compare them to the customers who sign up.  And the rest, as they say, is just arithmetic.

Let's say that a customer who signs up on the web has an estimated value of $300.  And we can all agree on this number, because it is the Finance Number.  No need to argue with that.

The first estimate of the number of leads brought in, produced by the Business Intelligence Group, was around 160.  With an estimated value of $300 each, the pilot program was generating long-term revenue of $48,000 -- much more than the cost of the program.  No brainer here.  The program worked!  Expand the program!  Promote the manager!

The second estimate of the number of leads brought in was 12.  With an estimated value of $300 each, the pilot was generating $3,600 in long-term revenue -- way less than the cost of the program.  Well, we might as well burn the cash and roast marshmallows over the flame.  No promotion here.  Know any good recruiters?

Both these estimates used the same data sources.  The difference was in the understanding of how the "visitor experience" is represented in the data.

For instance, suppose a visitor has come to the site 300 times in the past.  The 301st visit comes through the new marketing program.  Then, two weeks later, on the 320th visit, magic happens and the visitor becomes a customer.  Is the lead responsible for the acquisition?  This problem is called channel attribution.  If the customer had signed up on the visit when s/he clicked through as a lead, then yes, you could attribute all or most of the value to that marketing program.  But two weeks and 20 visits later?  Not likely.  The lead was already interested.
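To make the attribution rule concrete, here is a minimal sketch in Python.  This is not the client's actual logic; the seven-day window and the visit threshold are assumptions for illustration.

from datetime import timedelta

ATTRIBUTION_WINDOW = timedelta(days=7)  # assumed cutoff; tune to the business

def attribute_signup(lead_visit_time, signup_time, intervening_visits):
    # Credit the campaign only if the signup follows the lead visit
    # closely, with few unrelated visits in between.
    if signup_time - lead_visit_time > ATTRIBUTION_WINDOW:
        return False  # too much time has passed
    if intervening_visits > 2:
        return False  # the visitor was already engaged on his or her own
    return True

In the example above, 20 visits and two weeks after the lead visit, a rule like this would correctly decline to credit the campaign.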

A more serious problem occurs through the complexities of web visits.  If a visitor is not logged in, there is no perfect way to track him or her (or "it" if it were a dog).  Of course, this company uses cookies and browser caches and tries really, really hard to keep track of visitors over time.  But the visitor cannot be identified as a customer until s/he has logged in.  So, I may be a real customer, but happen to be trying out a new browser on my machine.  Or, I visit from an airport lounge and don't log in.  Or some other anonymous visit.  Such a visit looks like a bona fide lead when it arrives through the marketing program.

And then . . .  the visitor keeps using the new browser (or whatever).  And then later, s/he decides to log in.  At that point, the visitor is identified as a customer.  And, more importantly, the VisitorId associated with the visitor now belongs to a known customer.  But that doesn't mean that the lead created the customer.  Logging in merely identified an existing customer.

Guess what?  This happened more times than you might imagine.  Many of the 160 "customers" generated by the leads had been customers for months or years prior to this marketing campaign.  It doesn't make sense to attribute their value to the campaign.

The moral of this story: it is important to understand the data and, more importantly, to understand what the data is telling you about the real world.  Sometimes, in our eagerness to get answers, we miss very important details.

As a final note, we found the problem through a very simple request.  Instead of just believing the number 160 in the report generated by the Business Intelligence Group, we insisted on the list of leads and the account numbers created by the program.  With the list in hand, the problems were fairly obvious.
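As a sketch of that kind of sanity check (the file and column names here are hypothetical), the heart of it is just comparing each lead's account-creation date to the campaign start date:

import pandas as pd

# One row per lead and one row per account; column names are assumptions.
leads = pd.read_csv("campaign_leads.csv", parse_dates=["lead_date"])
accounts = pd.read_csv("accounts.csv", parse_dates=["account_created"])

CAMPAIGN_START = pd.Timestamp("2014-01-01")  # assumed campaign start date

matched = leads.merge(accounts, on="account_id", how="inner")
pre_existing = matched[matched["account_created"] < CAMPAIGN_START]
print(len(pre_existing), "of", len(matched),
      "'new' customers had accounts before the campaign began")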


Monday, December 28, 2009

Differential Response or Uplift Modeling

Some time before the holidays, we received the following inquiry from a reader:

Dear Data Miners,



I’ve read interesting arguments for uplift modeling (also called incremental response modeling) [1], but I’m not sure how to implement it. I have responses from a direct mailing with a treatment group and a control group. Now what? Without data mining, I can calculate the uplift between the two groups but not for individual responses. With the data mining techniques I know, I can identify the ‘do not disturbs,’ but there’s more to it than avoiding mailing that group. How is uplift modeling implemented in general, and how could it be done in R or Weka?



[1] http://www.stochasticsolutions.com/pdf/CrossSell.pdf

I first heard the term "uplift modeling" from Nick Radcliffe, then of Quadstone. I think he may have invented it. In our book, Data Mining Techniques, we use the term "differential response analysis." It turns out that "differential response" has a very specific meaning in the child welfare world, so perhaps we'll switch to "incremental response" or "uplift" in the next edition.

But whatever it is called, you can approach this problem in a cell-based fashion without any special tools. Cell-based approaches divide customers into cells or segments in such a way that all members of a cell are similar to one another along some set of dimensions considered important for the particular application. You can then measure whatever you wish to optimize (order size, response rate, . . .) by cell and, going forward, apply the treatment to the cells where it has the greatest effect.

Here, the quantity to measure is the difference in response rate or average order size between treated and untreated groups of otherwise similar customers. Within each cell, we need a randomly selected treatment group and a randomly selected control group; the incremental response or uplift is the difference in average order size (or whatever) between the two. Of course, some cells will have higher or lower overall average order size, but that is not the focus of incremental response modeling. The question is not "What is the average order size of women between 40 and 50 who have made more than 2 previous purchases and live in a neighborhood where average household income is two standard deviations above the regional average?" It is "What is the change in order size for this group when it is treated?"
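As a minimal sketch of the cell-based calculation in Python with pandas (the data layout and column names are assumptions, not the reader's actual data):

import pandas as pd

# One row per customer: cell assignment, True/False treated flag, and
# order size (0 for non-responders). Column names are hypothetical.
df = pd.read_csv("mailing_results.csv")

by_cell = df.groupby(["cell", "treated"])["order_size"].mean().unstack("treated")
by_cell["uplift"] = by_cell[True] - by_cell[False]  # treated minus control
print(by_cell.sort_values("uplift", ascending=False))

The cells at the top of that list are the ones where the treatment earns its keep.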

Ideally, of course, you should design the segmentation and assignment of customers to treatment and control groups before the test, but the reader who submitted the question has already done the direct mailing and tallied the responses. Is it now too late to analyze incremental response?  That depends: If the control group is a true random control group and if it is large enough that it can be partitioned into segments that are still large enough to provide statistically significant differences in order size, it is not too late. You could, for instance, compare the incremental response of male and female responders.

A cell-based approach is only useful if the segment definitions are such that incremental response really does vary across cells. Dividing customers into male and female segments won't help if men and women are equally responsive to the treatment. This is the advantage of the special-purpose uplift modeling software developed by Quadstone (now Portrait Software). This tool builds a decision tree where the splitting criterion is the difference in incremental response, so each split maximizes that difference. This automatically leads to segments (the leaves of the tree) characterized by either high or low uplift. That is a really cool idea, but the lack of such a tool is not a reason to avoid incremental response analysis.
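In the same spirit, here is a toy sketch of the splitting idea (this is not Quadstone's actual algorithm, just the criterion it optimizes): score each candidate split by how far apart the uplift is on its two sides, and keep the best one.

import pandas as pd

def uplift(group: pd.DataFrame) -> float:
    # Mean order size of the treated minus mean of the control.
    treated = group.loc[group["treated"], "order_size"].mean()
    control = group.loc[~group["treated"], "order_size"].mean()
    return treated - control

def score_split(df: pd.DataFrame, mask: pd.Series) -> float:
    # A split is good if uplift differs a lot between the two sides.
    return abs(uplift(df[mask]) - uplift(df[~mask]))

# Hypothetical columns: is gender a better first split than prior purchases?
# score_split(df, df["gender"] == "F")
# score_split(df, df["prior_purchases"] > 2)

Applied recursively, splits chosen this way yield leaves with uniformly high or low uplift.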

Thursday, May 1, 2008

Statistical Test for Measuring ROI on Direct Mail Test

If I want to test the return on investment of a mail/no-mail sample, however, I cannot use a parametric test, since the distribution of dollar amounts does not follow a normal distribution. What non-parametric test could I use that would give me something similar to a hypothesis test of two samples?

Recently, we received an email with the question above. Since it was addressed to [email protected], it seems quite reasonable to answer it here.

First, I need to note that Michael and I are not statisticians. We don't even play one on TV (hmm, that's an interesting idea). However, we have gleaned some knowledge of statistics over the years, much from friends and colleagues who are respected statisticians.

Second, the question I am going to answer is the following: Assume that we do a test, with a test group and a control group. What we want to measure is whether the average dollars per customer is significantly different for the test group as compared to the control group. The challenge is that the dollar amounts themselves do not follow a known distribution, or the distribution is known not to be normal. For instance, we might only have two products, one that costs $10 and one that costs $100.

The reason that I'm restating the problem is because a term such as ROI (return on investment) gets thrown around a lot. In some cases, it could mean the current value of discounted future cash flows. Here, though, I think it simply means the dollar amount that customers spend (or invest, or donate, or whatever depending on the particular business).

The overall approach is that we want to measure the average and standard error for each of the groups. Then, we'll apply a simple "standard error" of the difference to see if the difference is consistently positive or negative. This is a very typical use of a z-score. And, it is a topic that I discuss in more detail in Chapter 3 of my book "Data Analysis Using SQL and Excel". In fact, the example here is slightly modified from the example in the book.

A good place to start is the Central Limit Theorem, a fundamental theorem of statistics. Assume that I have a population of things -- such as customers who are going to spend money in response to a marketing campaign. Assume that I take a sample of these customers and measure an average over the sample. As I take more and more samples, the distribution of those averages follows a normal distribution, regardless of the original distribution of values. (This is a slight oversimplification of the Central Limit Theorem, but it captures the important ideas.)

In addition, I can measure the relationship between the characteristics of the overall population and the characteristics of the sample:

(1) The average of the sample is as good an approximation as any of the average of the overall population.

(2) The standard error on the average of the sample is the standard deviation of the overall population divided by the square root of the size of the sample. Alternatively, we can phrase this in terms of variance: the variance of the sample average is the variance of the population divided by the size of the sample.

Well, we are close. We know the average of each sample, because we can measure it. If we also knew the standard deviation of the overall population, then we could compute the standard error for each group, and we would be done. It turns out that:

(3) The standard deviation of the sample is as good an approximation as any for the standard deviation of the population. This is convenient!
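In symbols (a standard statement of these facts, written in LaTeX notation): for a sample of size n drawn from a population with mean \mu and standard deviation \sigma,

    \bar{x} \approx \mu, \qquad \mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}}

where \bar{x} is the sample average and s is the sample standard deviation.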

Let's assume that we have the following scenario.

Our test group has 17,839 customers, and the overall average purchase is $85.48. The control group has 53,537 customers, and the average purchase is $70.14. Is this statistically different?

We need some additional information, namely the standard deviation for each group. For the test group, the standard deviation is $197.23. For the control group, it is $196.67.

The standard error for the two groups is then $197.23/sqrt(17,839) and $196.67/sqrt(53,537), which comes to $1.48 and $0.85, respectively.

So, now the question is: is the difference of the means ($85.48 - $70.14 = $15.34) significantly different from zero? We need another formula from statistics to calculate the standard error of the difference: it is the square root of the sum of the squares of the two standard errors. So the value is $1.71 = sqrt(0.85^2 + 1.48^2).

And we have arrived at a place where we can use the z-score. The difference of $15.34 is about 9 standard errors from 0 (that is, 9*1.71 is about 15.34). It is highly, highly, highly unlikely that a difference this large would arise if the true difference were 0, so we can say that the test group is significantly better than the control group.
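The whole calculation fits in a few lines of Python (the numbers are the ones from the example above):

from math import sqrt

# Group sizes, mean purchases, and standard deviations from the example.
n_test, mean_test, sd_test = 17839, 85.48, 197.23
n_ctrl, mean_ctrl, sd_ctrl = 53537, 70.14, 196.67

se_test = sd_test / sqrt(n_test)         # about 1.48
se_ctrl = sd_ctrl / sqrt(n_ctrl)         # about 0.85
se_diff = sqrt(se_test**2 + se_ctrl**2)  # about 1.71

z = (mean_test - mean_ctrl) / se_diff    # about 9 standard errors from zero
print("difference =", round(mean_test - mean_ctrl, 2), "z-score =", round(z, 1))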

In short, we can apply the concepts of normal distributions, even to calculations on dollar amounts. We do need to be careful and pay attention to what we are doing, but the Central Limit Theorem makes this possible. If you are interested in this subject, I do strongly recommend Data Analysis Using SQL and Excel, particularly Chapter 3.

--gordon