
DIRECT MARKETING ANALYTICS JOURNAL
Winter 2012
An Annual Publication from the Direct Marketing Association Analytics Council

Contents

Letter From The Chair

One Problem: Two Solutions – Difference Scores vs. Regression Scores
Sam Koslowsky, Harte-Hanks

Should I Build a Segmented Model? A Practitioner's Perspective
Krishna Mehta, Jigyasa; Varun Aggarwal, EXL Service

Identifying Key Sales Influencers in Facebook Using Orbit Analytics: A Single Centroid-based Clustering Approach to Create Automatic Social Circles
Harikrishna S. Aravapalli, IBM Business Analytics and Optimization

Dynamically Tie the Right Offer to the Right Customer in the Telecommunications Industry
Kunal Sawarkar, IBM India Software Lab - SWG, Business Analytics, India; Sanket Jain, GBS Business Analytics and Optimization Center of Competence, CMS Analytics India




Letter From The Chair

March 13, 2012

Dear Analytics-CRM Council Member,

I hope that this new issue of the DMA Analytics-CRM Council Journal finds you well. Although the DMA Analytics Journal has been published for over a decade, this is technically the first release from the DMA Analytics-CRM Council, the new organization created through the merger of two of the DMA's premier member Councils, the Analytics Council and the Customer Relationship Management (CRM) Council. This forward-thinking fusion of the Analytics and CRM Councils will be better able to provide industry leadership to DMA member and non-member companies. The merger combines the data-driven insights enabled by analytics with customer-oriented business processes such as social media CRM, where companies track customer behavior to segment their target markets, customize their messaging, set marketing strategy, and design and implement campaigns. The Analytics-CRM Council Board includes prominent thought leaders and experts from industry and academia. Please visit us at www.the-dma.org/segment/analytics for a list of Council Board Members and details on Council offerings.

I want to take a moment to commend Leo Kluger, Vice Chair and Journal Editor, and Gary Cao, Associate Journal Editor, for their considerable time and effort in compiling a quality professional journal. I believe that you will enjoy reading the four interesting articles covering a wide range of topics in the 2012 issue of the Direct Marketing Analytics-CRM Journal.

Social Media and Next Generation Analytics are listed among the top 10 technology trends in Gartner's forecasts. In line with the projected growth in social media analytics, the current issue includes an article on identifying key sales influencers in Facebook. Harikrishna S. Aravapalli discusses the role of Orbit Analytics: a single centroid-based clustering approach to create automatic social circles. On the other side of the curve, where analytics are primarily conducted at the customer level, we know that each year marketers lose millions of dollars providing offers to customers who might have purchased with a lower offer, or even without one. In the genre of offer optimization, Kunal Sawarkar and Sanket Jain discuss how to dynamically tie the right offer to the right customer.

Even as pioneers reach out to explore new and previously uncharted territories in the analytics space, many marketers struggle to acquire budgets to build even a single model. Building differentiated models for distinct segments is an undertaking that often necessitates rigorous business-case justification. Krishna Mehta and Varun Aggarwal tackle the timely question of whether to build a segmented model. Last but not least, we have an article that covers the evergreen problem of detecting differences between groups. Sam Koslowsky compares two methods: difference scores vs. regression scores. Armed with the appropriate methods for experimental design and evaluation, marketers will be better able to evaluate their programs correctly.

I hope that you'll be as pleased by the return of the hard-copy version of the Journal as I am. We are inviting articles for the next release, as well as comments and suggestions on ways to grow and expand the Council's offerings and membership. Please send me an email at [email protected] or call me at (203) 964-9733 Ext. 102 with your comments, feedback and contributions. I would love to hear from you.

Make it a great day!

Devyani Sadh, Ph.D.
Chair, DMA Analytics-CRM Council
CEO, Data Square


One Problem: Two Solutions – Difference Scores vs. Regression Scores

By Sam Koslowsky, Harte-Hanks

Key to the analysis of experiments is the ability to detect differences between groups. In medicine, the introduction of a new treatment is evaluated against existing treatments. In education, new teaching approaches are considered and compared to existing methods. In marketing, a manager may gauge the effect of a discount offer vs. the current coupon offering. Researchers refer to these assessment paradigms as the gold standard. While several approaches are typically employed in evaluating statistical differences between segments, the data that emerge from such efforts must be interpreted cautiously, for many potential problems confront both the novice and the more experienced analyst.

Let's introduce one such problem using a marketing example. We have two market segments, which we'll call older and younger purchasers. Each group is presented with a generous incentive designed to increase spending. Behavior is noted before the test begins and then recorded again after the test is complete, and the change in spending is calculated. Our objective is to determine whether or not the change in one segment is significantly (from a statistical perspective) different from the change in the other. While there are a variety of methods that market researchers employ to evaluate tests and results, two of the more standard techniques are (1) difference scores and (2) regression analysis.

Application of Difference Scores Technique

If the change for the older group was $25 in increased spending and the corresponding change for the younger segment was $19 in increased spending, then we may say that the older group had a $6 incremental change (Case 1 below). If both segments increased by $6, then no difference in change is observed (Case 2 below). This can be viewed in the following table:

Two Age Segments in Two Cases: Impact of Identical Purchase Incentive Before & After Promotion

CASE 1      TIME 1    TIME 2    DIFF
OLDER        $79      $104      $25
YOUNGER      $63       $82      $19
DIFF                             $6

CASE 2      TIME 1    TIME 2    DIFF
OLDER        $79       $85       $6
YOUNGER      $63       $69       $6
DIFF                             $0

Application of Regression Analysis Technique

Here we use, for example, our two age groups (younger or older) and initial spending to predict spending at a subsequent point in time, say Time 2. As a by-product of the analysis, we observe whether or not age had any effect on spending. To highlight the sensitivity dimension of regression, let's look at a simple illustration based on another set of simulated data (not shown). Assume we are looking to analyze spending as a function of someone's age. We do not know the precise age; what we do know is whether an individual falls into one of two categories:

younger group (less than age 45)
older group (greater than or equal to age 45)

Investigators, in their statistical modeling, can use only numeric data. How do we incorporate an older/younger, non-numeric, categorical piece of data into our analysis? Researchers employ a "trick" converting this categorical data into a usable format: dummy, or indicator, variables are created. Here we will assign a value of '0' to all those in the younger group and a value of '1' to those in the older class. With this, our statistician developed the following relationship:

Spending = 3.12 – 13.41 × (Older Age Group = 1)

The 13.41 figure, the coefficient, can be interpreted as the average difference in spending between people in the younger group and people in the older group during a given time segment. The 3.12 figure is the constant. Notice the negative sign in front of the coefficient: it leads to the conclusion that we would anticipate the older population to have spending that is $13.41 lower per person, on average.
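To make the dummy-variable mechanics concrete, here is a minimal sketch in Python using statsmodels. The article's underlying sample is not shown, so the data below are simulated for illustration only; the point is the 0/1 encoding and how to read the fitted coefficient.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
older = rng.integers(0, 2, size=200)              # dummy variable: 1 = older, 0 = younger
# Simulated spending built around the article's relationship, plus noise.
spending = 3.12 - 13.41 * older + rng.normal(0, 2.0, size=200)

X = sm.add_constant(older.astype(float))          # column of 1s (constant) plus the dummy
fit = sm.OLS(spending, X).fit()
print(fit.params)                                 # approx. [3.12, -13.41]
```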


One of the most perplexing analytic issues faced by researchers is that these two methods frequently used to analyze change – difference scores and regression analysis – at times generate inconsistent and contradictory results! Why? And which technique should be used, and when, to gain better insight? Let's see how this happens with a deeper dive into a third marketing example, and then discuss which analysis technique should be used and why.

Difference Scores Approach in Detail

Again, this analysis mode investigates average scores (spending, for example) at one point in time and then re-examines spending levels at a subsequent period. During the intervening time frame, the marketer has presented both younger and older test groups with a 25%-discount offer. While the incentive should appeal to all customers, the question to be answered is whether the older population changed its spending at a greater or lesser pace than the younger population. Older here refers to those over age 55, and younger is defined as those with ages between 21 and 39. Let's look at some simulated data for 40 study participants, 20 in each age segment.

Cust #    Spending at T1    Spending at T2    AGE SEGMENT    Difference T2-T1
1         $185.37           $197.30           OLDER           $11.93
2         $203.83           $178.38           OLDER          -$25.45
3         $198.79           $229.08           OLDER           $30.29
4         $207.00           $191.37           OLDER          -$15.63
5         $165.37           $196.69           OLDER           $31.32
6         $167.25           $225.84           OLDER           $58.59
7         $197.91           $192.47           OLDER           -$5.44
8         $239.18           $190.26           OLDER          -$48.91
9         $200.67           $211.71           OLDER           $11.03
10        $197.92           $176.26           OLDER          -$21.66
11        $197.42           $196.93           OLDER           -$0.49
12        $180.08           $219.37           OLDER           $39.29
13        $234.81           $167.71           OLDER          -$67.10
14        $186.18           $191.07           OLDER            $4.89
15        $195.16           $180.51           OLDER          -$14.65
16        $166.86           $190.35           OLDER           $23.48
17        $172.38           $172.91           OLDER            $0.52
18        $226.05           $221.27           OLDER           -$4.78
19        $220.17           $225.65           OLDER            $5.49
20        $216.97           $204.15           OLDER          -$12.82
AVERAGE   $197.96           $197.96           OLDER            $0.00


Cust #    Spending at T1    Spending at T2    AGE SEGMENT    Difference T2-T1
21        $182.32           $204.24           YOUNGER         $21.92
22        $191.04           $202.46           YOUNGER         $11.42
23        $183.53           $161.51           YOUNGER        -$22.02
24        $191.18           $198.02           YOUNGER          $6.84
25        $185.56           $192.85           YOUNGER          $7.29
26        $176.61           $170.35           YOUNGER         -$6.25
27        $199.62           $181.24           YOUNGER        -$18.38
28        $184.33           $205.77           YOUNGER         $21.44
29        $184.20           $203.78           YOUNGER         $19.59
30        $189.19           $141.64           YOUNGER        -$47.55
31        $159.01           $217.01           YOUNGER         $58.00
32        $134.89           $174.70           YOUNGER         $39.81
33        $198.97           $192.50           YOUNGER         -$6.48
34        $204.62           $181.82           YOUNGER        -$22.80
35        $191.23           $155.81           YOUNGER        -$35.42
36        $163.49           $200.51           YOUNGER         $37.02
37        $173.94           $174.07           YOUNGER          $0.13
38        $230.19           $165.65           YOUNGER        -$64.54
39        $169.46           $144.33           YOUNGER        -$25.13
40        $173.24           $198.41           YOUNGER         $25.16
AVERAGE   $183.33           $183.33           YOUNGER          $0.00

The last column computes the difference, or change, in spending between these two periods. First, let's look at some basic facts:

- The average spending for OLDER was $197.96 at TIME 1 and at TIME 2
- The average spending for YOUNGER was $183.33 at TIME 1 and at TIME 2
- The average spending increase for OLDER is '0'
- The average spending increase for YOUNGER is '0'

Confirmed by the table below, the difference in spending (column labeled ‘Mean’) is effectively $0 for both age groups. We observe that there are no differences in the spending distributions between Time 1 and Time 2 for younger or older adults.

Difference Scores: Descriptive Statistics for Differences in Spending

AGE       Minimum    Maximum    Mean (Difference in Spending)    Standard Deviation
OLDER     -67.10     58.59      -0.00                            29.41
YOUNGER   -64.54     58.00       0.00                            30.99


It thus appears that the incentive offer presented by the marketer had no effect, and there is no evidence of a differential impact across the age groups, as neither group exhibits any change. The table below reports a T value of 0.000, confirming the above results and suggesting no difference between the groups. (A T value of roughly 3 or more would suggest that a significant statistical difference exists.)

T-Test – Difference in Spending between Younger and Older Groups

              T        Mean Difference
DIFFERENCE    0.000    $0.000
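For readers who want to reproduce this test, here is a minimal sketch using SciPy's two-sample t-test on the difference scores from the 40-customer table above; the variable names are our own.

```python
from scipy import stats

# T2 - T1 differences, copied from the table of 40 customers.
older_diff = [11.93, -25.45, 30.29, -15.63, 31.32, 58.59, -5.44, -48.91, 11.03,
              -21.66, -0.49, 39.29, -67.10, 4.89, -14.65, 23.48, 0.52, -4.78,
              5.49, -12.82]
younger_diff = [21.92, 11.42, -22.02, 6.84, 7.29, -6.25, -18.38, 21.44, 19.59,
                -47.55, 58.00, 39.81, -6.48, -22.80, -35.42, 37.02, 0.13,
                -64.54, -25.13, 25.16]

t, p = stats.ttest_ind(older_diff, younger_diff)   # two-sample t-test on the changes
print(f"T = {t:.3f}, p = {p:.3f}")                 # T near 0: no group difference
```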

Regression Analysis Approach in Detail

Let's now view this same example of 40 individuals through the prism of basic regression analysis. As suggested earlier, a by-product of regression analysis is a measure of how sensitive the outcome is to each predictor. Here we will use two predictors, age segment and spending at TIME 1, to predict spending at TIME 2. This diagram exhibits the process:

[Diagram: SPENDING and AGE SEGMENT at TIME 1 are used to predict SPENDING at TIME 2]

Note that age segment can take one of two values, older or younger. To employ this critical data element in our analysis, as explained earlier, we must convert each segment into a language understood by the regression procedure: we recode OLDER to a value of '1' and assign the value '0' to the YOUNGER segment. We're all set now, as all input is in numeric format, the necessary ingredient for processing. A 'T' value, designed to determine the significance of each coefficient, is again generated, this time through regression analysis. As exhibited in the following table, the 'T' value for the row titled 'AGE' (=1 for older, =0 for younger) is greater than 2, suggesting that there may be a relationship, albeit a weak one, between the AGE segment and subsequent spending. Again, this approach uses the same base sample data as the difference scores analysis – but it indicates a different, and opposing, insight.

View 1: Regression Analysis between our Age Groups

                      Unstandardized Coefficients    Standardized Coefficients
                      B          Std. Error          Beta        T        Sig.
(Constant)            199.486    29.789                          6.697    .000
AGE                   15.920     6.880               .378        2.314    .026
Spending at TIME 1    -.088      .161                -.090       -.549    .586
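As a cross-check, the following sketch refits this regression in Python with statsmodels (our stand-in, not the tool used in the article), using the 40-customer table; it should reproduce the B and T columns up to rounding.

```python
import numpy as np
import statsmodels.api as sm

# Spending at TIME 1 and TIME 2 for customers 1-20 (OLDER) then 21-40 (YOUNGER).
t1 = np.array([185.37, 203.83, 198.79, 207.00, 165.37, 167.25, 197.91, 239.18,
               200.67, 197.92, 197.42, 180.08, 234.81, 186.18, 195.16, 166.86,
               172.38, 226.05, 220.17, 216.97,
               182.32, 191.04, 183.53, 191.18, 185.56, 176.61, 199.62, 184.33,
               184.20, 189.19, 159.01, 134.89, 198.97, 204.62, 191.23, 163.49,
               173.94, 230.19, 169.46, 173.24])
t2 = np.array([197.30, 178.38, 229.08, 191.37, 196.69, 225.84, 192.47, 190.26,
               211.71, 176.26, 196.93, 219.37, 167.71, 191.07, 180.51, 190.35,
               172.91, 221.27, 225.65, 204.15,
               204.24, 202.46, 161.51, 198.02, 192.85, 170.35, 181.24, 205.77,
               203.78, 141.64, 217.01, 174.70, 192.50, 181.82, 155.81, 200.51,
               174.07, 165.65, 144.33, 198.41])
age = np.array([1] * 20 + [0] * 20)               # dummy: 1 = OLDER, 0 = YOUNGER

X = sm.add_constant(np.column_stack([age, t1]))   # constant, AGE, spending at TIME 1
fit = sm.OLS(t2, X).fit()
print(fit.params)     # approx. [199.486, 15.920, -0.088]
print(fit.tvalues)    # approx. [6.697, 2.314, -0.549]
```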

To further support the regression conclusion, we can view results slightly differently by examining transaction levels (shown in the following table). Note the last column. On average, for a given TIME 1 spending level, older adults spend more at TIME 2 than do younger adults. Thus, these regression analysis results tell us the marketer’s incentive leads to more spending for the older population. Results are identical to what we saw in the more formal regression procedure.

View 2: Regression Analysis by Spending Level for our Two Age Groups

$ at TIME 1   % of all OLDER   % of all YOUNGER   Avg $/older at T2   Avg $/younger at T2   Difference
< $170        15.00            20.00              204.29              184.14                20.15
$170-$185     10.00            35.00              196.14              188.30                 7.84
$185-$200     35.00            35.00              194.80              180.64                14.16
$200-$215     15.00             5.00              193.82              181.82                12.00
> $215        25.00             5.00              201.81              165.65                36.16
AVERAGE                                           197.96              183.33                14.63
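A sketch of how such a spending-level view can be assembled with pandas, assuming the 40-customer table has been loaded into a DataFrame (abbreviated to a few rows here); the band edges follow the table above, and the column names are our own.

```python
import numpy as np
import pandas as pd

# Abbreviated rows from the 40-customer table; extend to the full sample.
df = pd.DataFrame({
    "t1": [185.37, 239.18, 216.97, 166.86, 182.32, 134.89, 204.62, 230.19],
    "t2": [197.30, 190.26, 204.15, 190.35, 204.24, 174.70, 181.82, 165.65],
    "segment": ["OLDER"] * 4 + ["YOUNGER"] * 4,
})

# Band TIME 1 spending exactly as in View 2, then average TIME 2 within bands.
bands = pd.cut(df["t1"], bins=[-np.inf, 170, 185, 200, 215, np.inf],
               labels=["< $170", "$170-$185", "$185-$200", "$200-$215", "> $215"])
view2 = df.groupby([bands, "segment"], observed=False)["t2"].mean().unstack("segment")
view2["difference"] = view2["OLDER"] - view2["YOUNGER"]
print(view2)
```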

But hold on. Once again, by utilizing our first analysis approach, difference scores, we saw no difference. By proceeding with regression, we do detect differences. What is going on? Am I being tricked? Is this a hoax? Which method should I employ, and rely upon, in my analyses to inform the marketing plan?


A History Lesson: Lord's Paradox – And We Reveal Our Answer

Some 40 years ago, statistician Frederic Lord first presented these conflicting results. His example involved the weight gain of students before and after their freshman year in college; different diets were hypothesized to be a driver of the weight change. He performed essentially the same analysis as we described above and arrived at the same ambiguous result. The issue has come to be known as "Lord's Paradox." Lord, however, did not resolve his dilemma, and left many researchers searching for a satisfactory response.

In our example, note the difference in pre-spending levels for each segment. These variations may indeed cloud subsequent analysis. While typical comparisons can be attempted, pre-existing conditions in the analysis groups may lead to misleading conclusions. Scientists have to be precise about the research questions that they pose. An examination of difference scores helps answer the question of whether adults modified their spending habits from TIME 1 to the subsequent time frame. Regression analysis, however, does a better job with the question of whether adults who have similar initial spending will have different subsequent spending – which is not the question at hand in our example. These two analytic problems are distinct, and, as a result, it is unreasonable to assume they would present the manager with identical results. Subtle distinctions between questions can generate different results.

The analyst must separate the contributory impact of the marketing offer from the original spending differences recorded at TIME 1. Because we know from the TIME 1 data that the behavior of the two age segments is dissimilar at the onset, the analyst must choose difference scores as the correct analytic tool. We now know that the coupon used to promote TIME 2 sales did not result in any statistical difference between the two age groups.

One of the attractive features that dominate direct marketing is the ability to test, assess and modify as needed. Frequently, test groups are employed throughout the process. While managers may be using this gold standard, it's possible they may not be evaluating their programs correctly. Fortunately, through clear articulation of objectives and careful selection and evaluation of the analysis, they can improve their results. Just ask Frederic Lord.

About the author

Sam Koslowsky is vice president of modeling solutions for Harte-Hanks. Harte-Hanks, Inc. (NYSE:HHS), San Antonio, TX, is a worldwide direct and targeted marketing company that provides direct marketing services and shopper advertising opportunities to a wide range of local, regional, national and international consumer and business-to-business marketers. Contact Koslowsky at (212) 520-3259 or via e-mail at [email protected]. Visit the Harte-Hanks Web site at http://www.harte-hanks.com. Editorial contact: Drew Hansen, [email protected], or Chet Dalzell, [email protected], consultant to Harte-Hanks.


Identifying Key Sales Influencers in Facebook Using Orbit Analytics: A Single Centroid-based Clustering Approach to Create Automatic Social Circles

By Harikrishna S. Aravapalli, IBM Business Analytics and Optimization

INTRODUCTION

Where are the new yuppie customers of today? How do we target them for a possible sales or advertising campaign? How do we leverage them to influence future customers and increase our sales? If these are some of the questions that come to mind, then it is high time we realized the potential of social media networks as a platform for answering them. Just imagine: even without any direct monetary benefit accruing to the users, there are close to 500 million users hooked on various social network websites. Even more remarkable, there are no cross-country or cross-cultural borders in social media, which literally means that all 500 million users can, technically speaking, interact with each other. And now for the final imagination-stretching exercise: if there were some mechanism to create monetary benefit for the users of social media by recruiting them for our sales, marketing and advertising needs, it would benefit not only the companies but also the users, who could then make informed, recommended and guided decisions instead of depending on commission-based sales personnel for all their consumer transactions.

In order to achieve this, we need to leverage the underlying data and metadata of Facebook or any other social media website, and apply some basic concepts of analytics to the data in order to pull out the key influencers/users on social media across domains such as industries, cultures, religions, sports, education and practically any field on Earth. One analytics concept that can help in identifying key influencers is "Orbit Analytics", the focus of this article. Orbit Analytics helps create automatic and dynamic social circles for each user on social media.

Facebook similar to Transaction Systems

Looking at the various data and metadata on Facebook, you would realize that it is a constantly changing data system within the specific bounds of the platform's users. At a gross level, there is a lot of transaction data for each user in the social media platform. This is the "First Wave" of the social media platform: enabling a critical mass of transaction data to be generated continuously. This first wave ensures that the social media platform survives in the first place and continues to thrive in the long run.


Facebook Analytics similar to Business Intelligence and Analytics

Once the "First Wave" of social media is achieved, one can leverage the thought processes of Business Intelligence and Analytics to derive analysis and meaningful ideas from this "First Wave" transaction data generated in Facebook. This is the "Second Wave" of the social media platform: generating insights out of the underlying transaction data. These insights are what sales, marketing and advertising companies can dig out and use for their benefit, and also for the benefit of the social media users.

Introducing the concept of Orbit Analytics

One possible way of digging out insights from social media platforms like Facebook is the concept of "Orbit Analytics". Orbit Analytics is a concept where, based on the interactions with his or her friends, a user's friends are automatically classified into three basic circles. The innermost circle is called the "Coterie" and groups all those friends with whom there is the most communication. The next outer circle is called the "Inner Circle" and groups all those friends with whom communication is at an average level. The third circle is called the "Outer Circle", and here only those friends with whom the user has minimal communication are grouped. The basis for defining the level of communication can vary and depends on the input parameters to the algorithm. There is no limit to the number of circles that each user can have. This automatic classification of each user's friends into orbits/circles is the core of the concept of "Orbit Analytics".

How Different is Orbit Analytics from Google Plus Circles

"Orbit Analytics" is based on automatic classification of each user's friends into circles. It draws on social media transaction data and hence is dynamic in nature. Google Plus Circles, by contrast, requires the user to classify friends into the various circles manually. This is a self-defeating concept, as we all know that whom we interact with most or least varies from time to time, from age to age, from generation to generation, and from circumstance to circumstance. Just because one user is your brother does not automatically qualify him for your closest circle; it is very likely that a classmate currently has more contact with you than your brother does. These dynamic relationship statuses and the associated communication patterns are not captured by Google Plus Circles. Because "Orbit Analytics" is based on the user's underlying social media transaction data, the classification is more real-time and realistic than one based on our preconceived notions. Hence I strongly recommend classifying your friends on social media using "Orbit Analytics". This would not only help the user see his or her relationship patterns more realistically, but also help businesses in their target marketing initiatives on social media. The next step is to discuss how to create these social circles automatically using "Orbit Analytics".


Realizing Orbit Analytics using a Single-Centroid-based K-means Clustering Approach

A simple yet effective way to create these social circles is a single-centroid-based K-means clustering technique.

Fig. 1: A schematic depiction of the social circles created using Orbit Analytics

The step-by-step approach is explained below, followed by a short code sketch:

1) Identify the various data and metadata fields within a social media platform like Facebook that can be the members of a feature vector. This completes the first step of creating a feature vector.

2) Define a single centroid, which in this case is the current user for whom the dynamic clustering is being done.

3) Calculate the distance of each of the current user's friends from the centroid, based on the feature vector and the K-means clustering technique (i.e., based on Euclidean distances).

4) Define the distance ranges for the "Coterie" ("d"), the "Inner Circle" ("d+d1") and the "Outer Circle" ("d+d1+d2").

5) Classify the current user's friends into these three categories based on the distance range each friend falls into.

6) This completes the automatic classification of the current user's friends into social circles, based on "Orbit Analytics".
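A minimal sketch of steps 2-5 in Python. The feature vectors, friend names, and the thresholds d, d1, d2 are illustrative assumptions, since the article does not prescribe specific features; a real system would derive the vectors from interaction data (messages, likes, comments, and so on).

```python
import numpy as np

user = np.array([0.9, 0.8, 0.7])             # the single centroid: the current user
friends = {                                  # hypothetical friends' feature vectors
    "alice": np.array([0.85, 0.75, 0.72]),
    "bob":   np.array([0.40, 0.30, 0.20]),
    "carol": np.array([0.10, 0.05, 0.02]),
}
d, d1, d2 = 0.3, 0.4, 0.5                    # assumed circle boundary widths

def circle(friend_vec):
    """Classify one friend by Euclidean distance from the user centroid."""
    dist = np.linalg.norm(friend_vec - user)
    if dist <= d:
        return "Coterie"
    if dist <= d + d1:
        return "Inner Circle"
    if dist <= d + d1 + d2:
        return "Outer Circle"
    return "Beyond Outer Circle"

for name, vec in friends.items():
    print(name, "->", circle(vec))
```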


Applying Orbit Analytics to Facebook User Data

This concept of "Orbit Analytics" can easily be applied to popular social media platforms like Facebook with a large user base. It gives rise to various higher-level concepts such as:

a) Inter-Coterie Dynamics – how one coterie interacts with other coteries.
b) Inter-Coterie Maps – visual analytics to generate visual insights.
c) Coterie-Inner Circle Ratio – to determine the number of Inner Circle users who can be influenced, as compared to the Coterie users.
d) Multiple Coterie Memberships – identify those users who belong to the largest number of coteries. This can help in identifying the key influencers.
e) Coterie Eclipses – identify those users who stand in between the coterie users and the users in other circles. This helps in identifying alternate connection paths, or in targeting this key link to get connected to other circles of users for sales and marketing purposes.

Fig. 2: A schematic depiction of Inter-Coterie Relationships in Social Media


Fig. 3: A schematic depiction of Multiple Coterie Memberships in Social Media

It is important to note that once the "Orbits", or social circles, for the users of the social media platform are created, the types and dimensions of analysis are limited only by one's imagination.

Analytic Applications of Orbit Analytics on Facebook Data

Once the various coteries are identified, we can create many analytic applications to identify the key influencers who can sell, promote, influence or introduce a particular product or service for companies. Some examples of analytics applications based on these "orbital clusters" are:

1) Targeting the users with the densest coteries. These users can be recruited by the business to perform sales, marketing and advertising functions on social media.

2) Identifying the coteries that are in turn connected to the largest number of other coteries. This would ensure an exponential increase in the number of key influencers who can be contacted to perform sales, marketing and advertising functions.

Potential of Orbit Analytics to be leveraged by Target Marketers and Advertisers

With the classification of each user's friends on a social media platform like Facebook into "Orbits" or "Circles", it becomes easy for a business to identify the key influencers for its products and services. A particular set of users can be recruited based on the strength of their relationships with their friends and on the size of their social circles. Businesses can not only identify users who have the densest coteries, but also devise ways of further propagating their sales and marketing campaigns to the "Inner Circles" and "Outer Circles" of friends. It is also interesting to note that by applying Orbit Analytics, we can even create many higher-level clusters of coteries, as shown below:

Fig. 4: Higher-level clusters formed from the coteries created using Orbit Analytics

Analytics can be run on these social circles built using Orbit Analytics to identify all the key influencers across a chosen set of users and form a "Key Influencer Club" for each domain: business, health, education, sports, music, gadgets, electronics and many more.

Introducing the concept of Interactive Sales, Marketing and Advertising using "Orbit Analytics"

This automatic classification into "Orbits" or "Circles" using Orbit Analytics would also create a matchmaking platform for businesses and key influencers. While businesses can broadcast their need to hire key influential coteries, the key influential coteries can also broadcast their current potential to push, promote, sell or market particular products or services for a mutually agreed fee. An example could be a key influential user who has a dense coterie and also currently has a newborn baby. This user may have many other users in his or her coterie who also have newborn babies, owing to the life stage or circumstance that the user is in. Thus this particular key influential user can be recruited to promote baby-related products and services to his or her friends in the coterie, and even to his or her friends in the "inner" and "outer" circles.

Fig. 5: A schematic depiction of Interactive Marketing in Social Media

There can be similar examples among users and their coteries in other life situations, such as health, money, investments, travel, education, weddings, politics, sports, etc.

Conclusion

While there is a lot of buzz about social media platforms like Facebook, it is mostly restricted to the fun and social interactions among the huge user base of close to 500 million. This is the "First Wave" of social media. The "Second Wave" of social media is based on treating social media data as transactional data and applying analytics to it to generate insights about user relationships and communication patterns. Orbit Analytics is one such analytics concept that is useful to business from the sales, marketing and advertising perspective. Orbit Analytics would open up the concept of interactive sales, marketing and advertising: not only will the business try to advertise, market and sell a product or service, but the key influencers on social media will also approach the business to perform advertising, marketing and sales functions (on behalf of the business) within their coteries and other social circles. "Orbit Analytics" will ensure that the business identifies and appoints the right key influencers to promote its products and services.


This "Second Wave" of social media would be a strong imperative for the survival and further progress of the concept of social media, as it would be a revenue-based model and hence of interest to the 500 million users on social media platforms like Facebook.

About the author

Harikrishna S. Aravapalli is an Application Architect, focusing on Entity Analytics, Social Media Analytics and Text Analytics, at IBM's Business Analytics and Optimization - Center of Competence (BAO CoC), Bangalore, India. He focuses on newer ways of applying available technology to promote thought leadership in applied areas of Business Intelligence and Analytics. He has around 15 years of experience in Business Intelligence, data warehousing and analytics. He can be reached at [email protected].


Dynamically Tie the Right Offer to the Right Customer in the Telecommunications Industry

By Kunal Sawarkar, IBM India Software Lab - SWG, Business Analytics, India
Sanket Jain, GBS Business Analytics and Optimization Center of Competence,

CMS Analytics India

ABSTRACT

For a successful business, engaging in an effective campaign is a key task for marketers. Most previous studies used various mathematical models to segment customers without considering the correlation between customer segmentation and a campaign. This work presents a conceptual model that studies the significant campaign-dependent variables of customer targeting in a customer segmentation context. In this way, the processes of customer segmentation and targeting can be linked and solved together, and the outcomes of customer segmentation could be more meaningful and relevant for marketers. This investigation applies a customer lifetime value (LTV) model to assess the fit between targeted customer groups and marketing strategies. To integrate customer segmentation and customer targeting, this work uses the genetic algorithm (GA) to determine the optimized marketing strategy. Later, we suggest using C&RT (Classification and Regression Tree) in SPSS PASW Modeler as a replacement for the genetic algorithm technique to accomplish these results. We also suggest using LOSSYCOUNTING and a Counting Bloom Filter to dynamically design the right, up-to-date offer for the right customer.

Search keywords: Genetic Algorithm; C&RT; SPSS; LOSSYCOUNTING; Counting Bloom Filter.

1. INTRODUCTION

For a successful business, engaging in an effective campaign is a key task for marketers. Traditionally, marketers must first identify market segmentation using a mathematical model and then implement an efficient campaign plan to target profitable customers (Fraley & Thearting, 1999). This process confronts considerable problems. First, most previous studies used various mathematical models to segment customers without considering the correlation between customer segmentation and a campaign; the link between customer segmentation and campaign activities was mostly manual or missing (Fraley & Thearting, 1999). For marketing researchers, segmentation should not be an end in itself, but rather a means to an end (Jonker, Piersma, & Poel, 2004). Following the notion proposed by Jonker, this work presents a conceptual model that counts the significant campaign-dependent variables of customer targeting in customer segmentation. In this way, the processes of customer segmentation and targeting can be linked and solved together, and the outcomes of customer segmentation could be more meaningful and relevant for marketers. To solve the core problem facing marketers, this investigation applies a customer lifetime value (LTV) model to assess the fit between targeted customer groups and marketing strategies. To integrate customer segmentation and customer targeting, this work uses the genetic algorithm (GA) to determine the optimized marketing strategy (Jonker et al., 2004; Kim & Street, 2004; Kim et al., 2005; Tsai & Chiu, 2004). We can explore using C&RT (Classification and Regression Tree) in SPSS PASW Modeler as a substitute for the genetic algorithm to accomplish this objective.

2. BACKGROUND AND RESEARCH MOTIVATION

According to Chan (2008), most previous studies classified RFM and LTV models as two different methods of segmenting customers. Generally, RFM models represent customer dynamic behavior, while LTV models evaluate customer value or contribution. Identifying the behavior of high-value customers is a key task in customer segmentation research. This study proposes an intelligent model that uses a GA to select customer RFM behavior using an LTV evaluation model, and then explores the possibility of the C&RT model in SPSS as a potential replacement. Customer lifetime value is taken as the fitness value of the GA. If the proposed methodology is applied, high-value customers can be identified for campaign programs. Another advantage of the proposed methodology is that it considers the correlation between customer values and campaigns, so valuable customers can be identified for a campaign program. However, this approach of using a genetic algorithm has some limitations. First, the proposed method requires a large amount of customer data. Second, if you need more breakpoints for your variables, it can quickly become a very complex exercise. In the future, more breakpoints can be investigated to determine the optimal number of breakpoints for customer segmentation. Finally, future studies should also explore the possibility of targeting and finding new customers.

This work will focus on devising an approach for dealing with customer segmentation problems during a promotion campaign for an existing customer base. Currently, in telecom, we typically design only a handful of offers for the entire customer base. These offers can be for retention, revenue maximization, or upgrades. It would be worth exploring whether we can tailor-make offers for each customer, or at least increase the number of offers so that more customers can be fitted to their offer. However, the current scenario does not allow you to map the offer to an individual customer's usage. To do this, we need to adopt the following methodology: take as input the customer demographic data along with profile data that includes certain keywords taken from his e-billing statement. To draw a parallel with the credit card and banking industry, let us see how it can be accomplished there, and then we can design a similar approach for the telecommunications industry.

Scenario: A user logs onto the ICICI bank portal to check his mini-statement. There he finds that a salary of $100,000 has been credited to his account this month. He might also find that a $30,000 home loan EMI has been deducted, and that $5,000 for a car loan got deducted too. So, in real time, the application can calculate his net savings (say, $10,000).

It can then dynamically offer him a tailor-made credit card that would encourage him to spend in the region of $10,000. Along with this, all the banking corporations keep a standard keyword dictionary of terms such as ATM withdrawal codes, EMI, car loan, etc., so they can easily map each keyword found to its corresponding term in that standard dictionary.

Note: This specific illustration offers a credit card based on a real-time income vs. expenses assessment. Further, it should also leverage the entire 360-degree view of the customer, including his average credit card expenses (if he already has a card with the same bank), payment history and ratings (bureau data).

3. APPROACH

According to Chan (2008), the company must first plan and develop a marketing strategy for this year's promotion campaigns for the existing customer base. Second, marketers must gather customer data to establish customer profiles and devise associated campaign information. This study uses an RFM model to represent customer behavior: customer data are encoded using the RFM model and transformed into a binary string as the input format of the genetic algorithm (GA). Conversely, this study collects and calculates customer lifetime values as the fitness values of the GA. The proposed LTV model considers the correlation between campaign strategy and customer value. The fourth step is segmenting customers into several homogeneous groups using the GA. The fifth step involves targeting and matching segmented customers with the developed campaign strategies and programs. The final step involves classifying customers and turning a campaign plan into action.

Apart from this, it could be interesting to explore whether there exist clearly distinct categories of customers according to their offer status. For instance, there could be some customers (A) who have never qualified for any offer in the last six months. Some (B) have qualified for some offer(s) in the last six months but have either rejected or not accepted them. The remaining customers (C) have accepted their offer(s). So, using the genetic algorithm or C&RT, you might further need to customize your offers according to these segments. It would also be better if the system could be trained with previous campaign feedback. For example, a customer who has never qualified for any offer in the past six months should be treated differently from one who has accepted all the offers sent to him in the same timeframe.

Now, after using the RFM model to represent customer behavior, Chan (2008) encodes data into a binary string by dividing the values of Recency (R), Frequency (F) and Monetary value (M) into five sections: 0-20%, 20-40%, 40-60%, 60-80% and 80-100%. If the value lies between 20% and 40%, the code is set to 2; similarly, if the value is between 60% and 80%, the code is 4. In this way, a mapping can be established between the input data and binary codes. This proposed encoding scheme transforms the points of the parameter space into a binary string representation. For instance, a point (1, 2, 5) in a three-dimensional parameter space can be represented as the concatenated binary string 0001 0010 0101 (Jang, Sun, & Mizutani, 1997), which decodes back to 1, 2, 5.

At first, the input parameters must be set up and customer data must be encoded as a binary string. Second, the GA initializes the chromosomes randomly. Third, each chromosome is evaluated. Fourth, higher-fitness members are selected as parents for the next generation. Fifth, crossover is used to generate new chromosomes at a preset crossover rate, in the hope of preserving good genes from the parents. Sixth, mutation is used to flip a bit with the probability of a fixed mutation rate; this step can generate new chromosomes that prevent the entire population from being trapped in local optima. Seventh, a new generation is produced. The eighth step is evaluating the new generation against the stop criteria: if the stop criteria remain unsatisfied, the process repeats; if they are satisfied, the evolution stops. Finally, the best chromosomes are chosen and decoded as the final solutions.

The method of segmentation in this study is defined by variable breakpoints (Jonker et al., 2004). The number of segments increases rapidly whenever the number of breakpoints increases. Later in this study, we explore how the C&RT algorithm in SPSS can achieve similar results. To mirror the process of the genetic algorithm using the C&RT approach, all the relevant input predictor variables would need to be encoded akin to the genetic algorithm's 0-1 format. To do this, we first transform each continuous predictor variable into an ordinal variable, and then into multiple binary variables. If a predictor variable has only two possible values (such as Gender, with values Male or Female), those values can be directly transformed to 1 and 0, respectively.
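A small sketch of this encoding step in Python. The quintile-ranking helper is our own interpretation of the five-section scheme, and the 4-bit width follows the article's 0001 0010 0101 example; the sample values are made up.

```python
def quintile_code(value, all_values):
    """Map a raw R, F or M value to its section code: 0-20% -> 1, ..., 80-100% -> 5."""
    rank = sorted(all_values).index(value) / max(len(all_values) - 1, 1)  # 0..1
    return min(int(rank * 5) + 1, 5)

def encode_rfm(r_code, f_code, m_code):
    """Concatenate three 4-bit binary codes, e.g. (1, 2, 5) -> '0001 0010 0101'."""
    return " ".join(format(code, "04b") for code in (r_code, f_code, m_code))

monetary = [120, 45, 300, 80, 210, 150, 60, 95, 400, 20]   # hypothetical M values
print(quintile_code(300, monetary))    # a high-M customer lands in section 5
print(encode_rfm(1, 2, 5))             # 0001 0010 0101
```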

OVERVIEW OF C&RT

C&RT modeling is an exploratory data analysis method used to study the relationships between a dependent measure and a large series of possible predictor variables, with potential interactions among them. The dependent measure may be qualitative (nominal or ordinal) or a quantitative indicator. We can use the Gini measure of association and prune the branches.


READING THE OUTPUT DIAGRAM OF C&RT

In a C&RT diagram, a series of predictor variables are assessed to see whether splitting the sample based on these predictors leads to better discrimination in the dependent measure. For instance, if our dependent measure is whether the patient has received medical case management services, we would first assess whether the levels of receiving this service differ between two groups formed on the basis of one of the predictor variables. The most significant of these predictors would define the first split of the sample, or the first branching of the tree. Then, for each of the new groups formed, we ask whether the subgroup can be further significantly split by another of the predictor variables, and the process continues: after each split, we ask whether the new subgroup can be further split on another variable so that there are significant differences in the dependent variable. The result at the end of the tree-building process is a series of groups that are maximally different from one another on the dependent variable. At each step, the optimal binary split is made. Different orientations of the same tree are sometimes useful to highlight different portions of the results. Here, all splits are binary; however, the same variable may be split repeatedly.

The C&RT method has certain advantages as a way of looking for patterns in complicated datasets. First, the level of measurement for the dependent variable and the predictor variables can be nominal or ordinal (categorical) or interval (a "scale"). Also, missing values in predictor variables can be estimated from other predictor ("surrogate") variables, so that partial data can be used whenever possible within the tree. But C&RT modeling is essentially a "stepwise" statistical method, and there is always the potential to read too much into the data, even when very conservative statistical criteria are used.

For us, the dependent variable could be whether the offer is relevant for a customer or not.
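As an illustration of what such a tree might look like outside SPSS, here is a sketch using scikit-learn's CART implementation with the Gini criterion (a stand-in for SPSS C&RT, with max_depth standing in crudely for pruning). The features and labels are invented stand-ins for the offer-relevance question.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical predictors: age, margin amount, outgoing call minutes.
X = [
    [22, 35, 120], [25, 32, 90], [31, 8, 30], [45, 12, 200],
    [24, 40, 150], [52, 6, 15], [38, 20, 60], [29, 33, 110],
]
y = [1, 1, 0, 0, 1, 0, 0, 1]   # dependent variable: 1 = offer relevant, 0 = not

tree = DecisionTreeClassifier(criterion="gini", max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["age", "margin", "minutes"]))
print(tree.predict([[23, 31, 100]]))   # score a new customer
```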

Note: There is no guarantee that a C&RT model will yield good accuracy.

4. DATA AND VARIABLES

Now, taking this analogy to telecom, consider the case of a customer who opts for a Vodafone recharge of his prepaid account. The application can then offer him an upgrade to a postpaid connection. In a way, the application system will automatically find business rules. For example, if there is a customer who always spends only $10 for a broadband connection, then a business rule should automatically get created for him. An example offer could be: "unlimited access to Facebook by spending $10 upfront". To do all this, we will have to keep an account of the billing statement of the customer's mobile phone, which would be akin to his bank mini-statement. Some keywords in a POST-PAID mobile phone connection's statement could be:

1. Voice calls – outgoing local to Airtel mobile, to other mobiles, to fixed landline,
2. Voice calls – outgoing STD to Airtel mobile, to other mobiles, to fixed landline,
3. Voice calls – outgoing ISD,
4. Last bill period late fee,
5. Value Added Services: SMS – local to other mobiles,
6. Value Added Services: SMS – national to Airtel mobiles,
7. CUG (this indicates the social network of your customer), and
8. XXX nat sms free (this is the "National SMS free" discount package), where XXX could be 100, 150, 200, etc.

A similar analysis can be conducted for a pre-paid connection. The next question that arises is: how real-time can we make this? To my understanding, we can do batch processing, because we are not being time-specific or location-bound here (such as offering a dinner coupon or a late-evening movie ticket in Bangalore if you are currently in Bangalore). Actually, you might not need any bills or statements at all: all the data used to generate the statements is present in the DW (data warehouse) anyway, so the analysis can be done directly with the data from the DW (the TDW, in this case, for IBM). Key questions worth exploring would be as follows:

- How does data analytics drive my product/offer/solution innovation?
- How does an analytics-based rule link with the right product/service/campaign offer mix?
- How can analytics get closer to the product or marketing mix development activity?

Here is another example: in your Gmail account, when you are looking for a GMAT website (such as www.gmat.com), it shows you links to Princeton Review, GRE and other relevant information. Can we do the same for telecom? Here is a list of offers that can be given to customers (after having mapped their profiles to your strategy or campaign objective):

1. Offer free talk time of, say, 50 minutes (if no free time exists as of now).
2. Or, paid talk time of, say, 100 minutes by paying $10 upfront (if he is a habitual $10 payer), if there exist enough relevant keywords in his billing statement.
3. A VAS offer.
4. A discount package – based on 150 nat sms free (i.e., 150 national SMS free).
5. An upgrade offer (if no late fee in the last 6 months).

The challenge here is that, unlike in the banking industry, there is no common database of standardized terms that can be used to map these keywords. The CDR (Call Detail Record) is our best bet.


But deploying, or even understanding, a solution based on a genetic algorithm can become quite complex. So, we can use LOSSYCOUNTING to find the most frequent business rules. We start by knowing the campaign objective (e.g., retention) and then its parameters (e.g., churn probability, days since last upgrade, days of zero usage). Based on these parameters, the Response Log feature of Telecom Packaging in Decision Management will come up with a certain number of business rules. We can then find the most effective (or most 'popular') business rules by looking for the most frequently occurring ones using LOSSYCOUNTING. Next, if a rule is "age < 27 and margin amount > $30", it should be promoted, and an inferior rule like "age < 27 and margin amount < $10" should be retired. This can be accomplished with LOSSYCOUNTING. In real life, there could be many business rules, so selecting the most effective ones is an objective that LOSSYCOUNTING can help us attain. Example offers could be:

– If the most effective business rule is "age < 27 years and margin amount > $30 and heavy night SMS user", then offer the customer 100 free night SMS.

– If the most effective business rule is "age < 24 years and margin amount < $10 and outgoing call minutes < 25 and heavy Internet user", then offer the customer one month of access to Facebook.com for a $10 upfront fee. (Notice how the customer's low usage was 'mapped' to an offer requiring a $10 upfront fee.)

Let us see how the LOSSYCOUNTING algorithm works. Imagine that you observe a large number of individual transactions (such as Amazon book sales) and want to calculate the top sellers today. Or imagine that you are monitoring network traffic and want to know which hosts/subnets are responsible for most of the traffic. This is the problem of finding heavy hitters in a stream of elements. An easy method is to store each element identifier with a counter of the number of occurrences of that element, then sort the elements according to their counters and read off the most frequent ones. However, in many real scenarios this simple solution is neither efficient nor computationally feasible. For instance, to track the pairs of IP addresses that generate the most traffic over some time period, you would need 16,384 PBytes of memory, plus a lot of time to sort and scan that memory array, which often makes the problem computationally infeasible. Hence, in recent years, techniques for computing heavy hitters using limited memory have been investigated. They cannot find exact heavy hitters; instead, they approximate the heavy hitters of a data stream, and the approximation typically means the computed heavy hitters may include false positives.

LOSSYCOUNTING is an algorithm for finding heavy hitters using limited memory. An important parameter for each distinct element identifier in its table is an error bound, which reflects the potential error in the estimated frequency of that element. Elements with small error bounds are more likely to be removed from the LOSSYCOUNTING process than equal-frequency elements having a larger error bound. The LOSSYCOUNTING algorithm was proposed by Manku and Motwani in 2002, alongside a randomized sampling-based algorithm and techniques for extending from frequent items to frequent item-sets.

The algorithm stores tuples comprising an item, a lower bound on its count, and a 'delta' value that records the difference between the upper bound and the lower bound. When processing the i-th item, if it is currently stored, its lower bound is increased by one; otherwise, a new tuple is created with the lower bound set to one and delta set to ⌊i/k⌋. Periodically, all tuples whose upper bound is less than ⌊i/k⌋ are deleted. These are correct upper and lower bounds on the count of each item, so at the end of the stream all items whose count exceeds n/k must be stored. As with the FREQUENT algorithm, setting k = 1/ε ensures that the error in any approximate count is at most εn. A careful argument demonstrates that the worst-case space used by this algorithm is O((1/ε) log εn), and for certain input distributions it is O(1/ε).

Now that we know how LOSSYCOUNTING works, we should tie it to a mechanism that inserts and deletes rules. A hash function can be used to look up the most commonly occurring rules. However, a hash function may map two or more keys to the same hash value. We wish to minimize the occurrence of such collisions, so the hash function must map the keys to the hash values as evenly as possible. For any fixed choice of hash function there exists a bad set of keys that all hash to the same slot; the idea, therefore, is to choose the hash function at random, independently of the keys (with a hash function drawn from a universal family, the probability that any two given keys collide is at most 1/m for a table of m slots). When testing a hash function, the uniformity of the distribution of hash values can be evaluated by a chi-square test.

A Bloom filter is a probabilistic data structure that uses this hashing concept to test whether an element is a member of a set. An empty Bloom filter is a bit array of m bits, all set to 0, together with k different hash functions, each of which maps a set element to one of the m array positions with a uniform random distribution. To add an element, feed it to each of the k hash functions to get k array positions, and set the bits at all these positions to 1. To query for an element (i.e., to test whether it is in the set), feed it to each of the k hash functions to get k array positions: if any of the bits at these positions is 0, the element is not in the set (had it been inserted, all those bits would have been set to 1); if all are 1, then either the element is in the set, or the bits were set to 1 during the insertion of other elements. Removing an element from a plain Bloom filter is impossible.

So we will use a Counting Bloom Filter (CBF) to perform insertion and deletion of business rules; this will automatically and dynamically promote the best rules and retire the others. A CBF implements a delete operation without recreating the filter afresh: the array positions (buckets) are extended from a single bit to an n-bit counter. The insert operation increments the value of each required bucket, the lookup operation checks that each required bucket is non-zero, and the delete operation decrements each of the respective buckets.
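To make the mechanics concrete, here is a minimal Python sketch of LOSSYCOUNTING as just described. It is an illustrative rendering of the published algorithm, not production code, and the sample 'rules' stream is hypothetical:

    import math

    def lossy_counting(stream, epsilon):
        # Approximate frequency counts; any item's undercount is at most epsilon * n.
        w = math.ceil(1.0 / epsilon)            # bucket width
        counts = {}                             # item -> [lower_bound, delta]
        for i, item in enumerate(stream, start=1):
            bucket = math.ceil(i / w)           # current bucket id
            if item in counts:
                counts[item][0] += 1            # raise the lower bound
            else:
                counts[item] = [1, bucket - 1]  # delta = possible undercount so far
            if i % w == 0:                      # at each bucket boundary, prune
                for key in [k for k, (f, d) in counts.items() if f + d <= bucket]:
                    del counts[key]
        return counts

    # Items whose true count exceeds n/k (with k = 1/epsilon) are guaranteed to survive.
    rules = ['age<27 & margin>$30'] * 60 + ['age<27 & margin<$10'] * 40
    print(lossy_counting(rules, epsilon=0.1))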

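Continuing the sketch, a minimal Counting Bloom Filter for inserting, looking up, and retiring business rules could look as follows; the array size, the number of hash functions, and the salted SHA-1 hashing are illustrative assumptions, not the authors' parameters:

    import hashlib

    class CountingBloomFilter:
        def __init__(self, m=1024, k=4):
            self.m, self.k = m, k
            self.buckets = [0] * m              # counters instead of single bits

        def _positions(self, rule):
            # Derive k array positions from salted digests of the rule text.
            for seed in range(self.k):
                digest = hashlib.sha1(f'{seed}:{rule}'.encode()).hexdigest()
                yield int(digest, 16) % self.m

        def insert(self, rule):                 # promote a rule
            for p in self._positions(rule):
                self.buckets[p] += 1

        def delete(self, rule):                 # retire a rule
            if rule in self:
                for p in self._positions(rule):
                    self.buckets[p] -= 1

        def __contains__(self, rule):           # lookup: every bucket non-zero
            return all(self.buckets[p] > 0 for p in self._positions(rule))

    cbf = CountingBloomFilter()
    cbf.insert('age<27 & margin>$30')
    print('age<27 & margin>$30' in cbf)         # True (false positives are possible)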

The advantage of using a Bloom filter is that, while risking false positives, it has a strong space advantage over other data structures for representing sets, because it does not require storing the data itself. Mining frequent item-sets inherently builds on finding frequent items as a basic building block; in our case, we intend to find the most frequent business rules and/or the most frequent keywords within those rules. (Finding the entropy of a stream likewise requires learning the most frequent items, in order to compute their contribution to the entropy directly and remove it before approximating the entropy of the residual stream.) So we need to promote the most frequent (and hence most effective) business rules and retire the ones that are not frequent. Our technique will use hashing (the CBF) to derive multiple sub-streams, whose frequent elements will be extracted to estimate the frequency moments of the stream using LOSSYCOUNTING.

REFERENCES

Chan, Chu Chai Henry (2008): Intelligent value-based customer segmentation method for campaign management: A case study of an automobile retailer. E-Business Research Laboratory, Department of Industrial Engineering and Management, Chaoyang University of Technology, Taichung County, Taiwan, ROC.
Chan, Chu-Chai Henry (2005): Online auction customer segmentation using a neural network model. International Journal of Applied Science and Engineering, 3(2), 101–109.
Chung, Kyoo Yup, Oh, Seok Youn, Kim, Seong Seop, & Han, Seung Youb (2004): Three representative market segmentation methodologies for hotel guest room customers. Tourism Management, 25, 429–441.
Cormode, Graham, & Hadjieleftheriou, Marios (2008): Finding frequent items in data streams. AT&T Labs–Research, Florham Park, NJ.
Doyle, Shaun (2005): Software review: Business requirements for campaign management – A sample framework. Database Marketing & Customer Strategy Management, 12(2), 177–192.
Fraley, Andrew, & Thearling, Kurt (1999): Increasing customer value by integrating data mining and campaign management software. Data Management, 49–53.
Holland, J. H. (1975): Adaptation in natural and artificial systems. Ann Arbor: University of Michigan Press.
Hsieh, Nan-Chen (2004): An integrated data mining and behavioral scoring model for analyzing bank customers. Expert Systems with Applications, 27, 623–633.
Hu, Tung-Lai, & Sheu, Jiuh-Biing (2003): A fuzzy-based customer classification method for demand-responsive logistical distribution operations. Fuzzy Sets and Systems, 139, 431–450.
Hwang, Hyunseok, Jung, Taesoo, & Suh, Euiho (2004): An LTV model and customer segmentation based on customer value: A case study on the wireless telecommunication industry. Expert Systems with Applications, 26, 181–188.
Jang, J.-S. R., Sun, C. T., & Mizutani, E. (1997): Neuro-fuzzy and soft computing. USA: Prentice Hall Inc.
Jiao, Jianxin, & Zhang, Yiyang (2005): Product portfolio identification based on association rule mining. Computer-Aided Design, 37, 149–172.
Jones, Joni L., Easley, Robert F., & Koehler, Gary J. (2006): Market segmentation within consolidated e-markets: A generalized combinatorial auction approach. Journal of Management Information Systems, 23(1), 161–182.
Jonker, Jedid-Jah, Piersma, Nanda, & Poel, Dirk Van den (2004): Joint optimization of customer segmentation and marketing policy to maximize long-term profitability. Expert Systems with Applications, 27, 159–168.


Kim, Su-Yeon, Jung, Tae-Soo, Suh, Eui-Ho, & Hwang, Hyun-Seok (2006): Customer segmentation and strategy development based on customer lifetime value: A case study. Expert Systems with Applications, 31, 101–107.
Kim, Yong Seog, & Street, W. Nick (2004): An intelligent system for customer targeting: A data mining approach. Decision Support Systems, 37, 215–228.
Kim, Yong Seog, Street, W. Nick, Russell, Gary J., & Menczer, Filippo (2005): Customer targeting: A neural network approach guided by genetic algorithms. Management Science, 51(2), 264–276.
Kuo, R. J., An, Y. L., Wang, H. S., & Chung, W. J. (2006): Integration of self-organizing feature map neural network and genetic K-means algorithm for market segmentation. Expert Systems with Applications, 30, 313–324.
Kwang, Yun-Chiang (2006): Integrate genetic algorithm and clustering technique to formulate the appropriate campaign strategies – A case study of automobile dealership. Master's thesis, Chaoyang University of Technology.
Liu, Emmy (2001): CRM in the e-business era. In CRM Conference.
Shin, H. W., & Sohn, S. Y. (2004): Segmentation of stock trading customers according to potential value. Expert Systems with Applications, 27, 27–33.
Tsai, C.-Y., & Chiu, C.-C. (2004): A purchase-based market segmentation methodology. Expert Systems with Applications, 27, 265–276.
Vellido, A., Lisboa, P. J. G., & Meehan, K. (1999): Segmentation of the on-line shopping market using neural networks. Expert Systems with Applications, 17, 303–314.
Wedel, M., & Kamakura, W. A. (2000): Market segmentation: Conceptual and methodological foundations (2nd ed.). Dordrecht: Kluwer.
Woo, Ji Young, Bae, Sung Min, & Park, Sang Chan (2005): Visualization method for customer targeting using customer map. Expert Systems with Applications, 28, 763–772.
Yao, Jingtao, Li, Yili, & Tan, Chew Lim (2000): Option price forecasting using neural networks. The International Journal of Management Science, 28, 455–466.
Arasu, A., & Manku, G. S. (2004): Approximate counts and quantiles over sliding windows. In ACM PODS.
Blum, A., Gibbons, P., Song, D., & Venkataraman, S. (2004): New streaming algorithms for fast detection of superspreaders. Technical Report IRP-TR-04-23, Intel Research.
Chakrabarti, A., Cormode, G., & McGregor, A. (2007): A near-optimal algorithm for computing the entropy of a stream. In ACM-SIAM Symposium on Discrete Algorithms.
Datar, M., Gionis, A., Indyk, P., & Motwani, R. (2002): Maintaining stream statistics over sliding windows. In ACM-SIAM Symposium on Discrete Algorithms.
Kollios, G., Byers, J., Considine, J., Hadjieleftheriou, M., & Li, F. (2005): Robust aggregation in sensor networks. IEEE Data Engineering Bulletin, 28(1).
Lee, L., & Ting, H. (2006): A simpler and more efficient deterministic scheme for finding frequent items over sliding windows. In ACM PODS.
Manku, G., & Motwani, R. (2002): Approximate frequency counts over data streams. In International Conference on Very Large Data Bases, pages 346–357.


Should I Build a Segmented Model? A Practitioner’s Perspective

By Krishna Mehta, Jigyasa Analytics

Varun Aggarwal, EXL Service

1. Abstract

Modelers are often faced with the million-dollar question: when should one replace an aggregate-level model with a more complex segmented model? A reasonable response is that a segment-level strategy is more appropriate when the business needs or the data dynamics call for it. While this high-level guideline is useful, many blanks need to be filled in to make the concept operational. In this paper, we develop a few useful guidelines for when segmentation is appropriate. Our approach looks for evidence of the need for segmentation throughout the model development process, and provides the reader with a process flow for making this important decision. In making this decision, we consider (a) business needs, (b) data coverage, (c) stability issues due to over-dependence on some variables, (d) consistency of the relationship between predictors and the predicted variable across sub-samples, and (e) residual analysis of the aggregate model to identify segments of sub-par performance. In addition, we present some practical examples where we have implemented our approach and share the associated improvement in model performance.

2. Introduction

Segmentation strategy is a key element in the process of identifying true patterns in different portions of the data. This is particularly useful when dealing with the huge customer base of any business entity. By looking at segments in detail, a modeler is likely to generate more meaningful insights. While adding value for marketing departments, segmentation also wins the votes of statistical experts: model performance often goes up. We have seen that appropriate segmentation in the pre-modeling stage provides an incremental benefit to regression models. In this paper, we provide a set of strategies that can be adopted to examine whether there is a need for segmented modeling. Each strategy corresponds to a particular scenario. These strategies are listed below:

(a) When there are clear business reasons or external knowledge that call for segmentation;
(b) When data availability/coverage varies across sub-segments;
(c) When the model is over-dependent on certain predictors;
(d) When the relationship of certain key predictors with the target variable is not stable across sub-pockets of the population;
(e) When it is possible to identify patterns in the error terms of the base model.

3. Business Sense

No doubt computers have algorithms to read datasets. But they don't have ears, and datasets do speak. Work experience in the relevant industry, or a reasonable amount of time spent in market research, can save the modeler a lot of time and effort. Segmentation based on business intuition often gets validated. In other situations, non-data-driven reasons related to business strategy might also call for appropriate segmentation in the pre-modeling stage.

Illustration

Assume we want to predict the prices of Central Government securities of India. One approach, given minimal familiarity with the concepts of security pricing, may be a straightaway attempt at building a model. On the contrary, modelers with industry experience or access to prior research may have a fair idea of some valid criteria for segmentation. In our example, 'residual maturity' can be used as a key segmentation driver. By classifying securities into short-term, medium-term and long-term residual maturity, we have been able to identify distinct patterns specific to each segment.

Short-Term Residual Maturity Model: The bond's issue date and maturity date have no significant impact on the trading price; only the coupon rate matters.

Medium-Term Residual Maturity Model: The longer the time to maturity, the higher the price! The bond's issue date has no significant impact.

Long-Term Residual Maturity Model: The longer the time since issuance, the higher the price! The bond's maturity date has no significant impact.

Segmentation along these lines has not only helped us generate some good insights, but has also provided us with better-performing models. We have computed the root mean square error (RMSE) for each segment using the base model equation as well as the respective segment model equation.


We can see that the segmented model outperforms the unsegmented base model in every segment, and also on the entire population.

4. Data Coverage

In some situations, data coverage can vary across pockets of the population. If the data fields in question are important in predicting the event being modeled, this can be a reason for segmentation.

Illustration

A large Internet service provider had introduced a new product and wanted to build customer churn models for its entire product suite. However, customer usage data was not available for the newer product. Since usage data was a very powerful predictor of churn, we decided to develop segmented models to account for the data availability issues. There are other situations where the data fields are available for the entire population, but fill-in rates are too low for some sub-segments, which might also call for segmenting the model.

5. Over-Dependence on Certain Predictors

An unsegmented model (the base model built on the entire population under consideration) can be of great help in assessing the need for segmentation.


Illustration

The predominance of a binary variable V1, with a whopping 43% contribution in our base model, urged us to look at two segments:

(a) When V1 is true;
(b) When V1 is false.

We developed separate models for these two segments, whose outcomes were then combined by the 'Capture Rate Method' (refer to the annexure for details) to get the predicted scores. The lift comparison highlights the value of segmentation here.

It is important to note that the segmented model's lift chart is not only better but also smoother than that of the unsegmented base model.


6. Stability of Data Dynamics

Segmentation might be needed if the correlation coefficients of predictors with the target variable vary significantly, or flip sign, across pockets of the data.

Illustration

For one of our clients, an unsegmented base model was built using 13 variables. We then reviewed the stability of the model by exploring the correlations of the predictors across pockets of the population — across months as well as across segments derived by splitting on individual variables. In the example below, one of the variables was used to split the entire dataset into two segments, and the correlations of the remaining 12 predictors with the dependent variable were computed for each segment. We observed that the correlation signs flipped for 5 variables. This finding provided a reason for developing separate models for the two segments. The following Venn diagram gives a snapshot of the distribution of variables across models; each letter denotes an area, and the number in parentheses is the number of variables in that area.

The table below shows that variables with flipping correlation signs across segments are much less likely to turn up in the overlapping areas. This highlights the importance of building separate models for segments with distinct characteristics.
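A minimal sketch of this stability check in Python with pandas is shown below; the column names are hypothetical, and splitting the data at the median of the chosen variable is an assumption made for illustration:

    import pandas as pd

    def correlation_sign_flips(df, target, split_var):
        # Split the data into two segments on the median of split_var, then
        # compare each predictor's correlation with the target across segments.
        predictors = [c for c in df.columns if c not in (target, split_var)]
        cut = df[split_var].median()
        seg_a, seg_b = df[df[split_var] <= cut], df[df[split_var] > cut]
        flips = []
        for p in predictors:
            ca = seg_a[p].corr(seg_a[target])
            cb = seg_b[p].corr(seg_b[target])
            if ca * cb < 0:                     # opposite signs in the two segments
                flips.append((p, round(ca, 3), round(cb, 3)))
        return flips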


As far as model performance is concerned, the segmented model (combining the models for the segments) showed a reasonable improvement in lift over the unsegmented base model.

7. Patterns in Error Terms

There might be a case for segmentation if the unsegmented base model shows systematically different error rates across some sub-segments of the population. To explore this, one can develop a classification/regression tree model in which the target variable is the base model error and the predictors are the variables included in the model.

Illustration

A leading transportation company wanted to improve the accuracy of its existing customer-choice model. Accordingly, the objective was to develop a segmentation strategy driven by revealed passenger-choice data, and to demonstrate its effectiveness and fit. We built a classification tree on the residuals of the base model, used the tree to identify potential segments, and then built segment-level models.
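As an illustrative sketch of the residual-tree step (hypothetical data and tree parameters; scikit-learn is our choice here, not necessarily the authors'):

    from sklearn.tree import DecisionTreeRegressor

    def residual_segments(X, y, base_model, max_leaf_nodes=5):
        # Fit a shallow tree to the base model's residuals; each terminal
        # node (leaf) becomes a candidate segment for its own model.
        residuals = y - base_model.predict(X)
        tree = DecisionTreeRegressor(max_leaf_nodes=max_leaf_nodes,
                                     min_samples_leaf=500)
        tree.fit(X, residuals)
        return tree.apply(X), tree              # leaf id per observation, and the tree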


[Figure: Population distribution across segments based on residuals of the aggregate base model; the classification tree shows a parent node splitting into terminal nodes. Notes: (i) each terminal node denotes a segment; (ii) these nodes are subject to refinement.]

The following table shows that this segmentation strategy led to an improvement of over 14% in RMSE compared to the base model.

[Table: RMSE Comparison]


8. Flow Chart

In this section, we present a flow chart for a model-driven segmentation strategy.

START: There is a need to develop a model.

Step 1: Are there any business knowledge or data availability reasons to work on pre-defined segments?
− If YES, go to Step 2; if NO, go to Step 3.

Step 2: Develop segment-level models [refer to scenarios (a) and (b) in Sec. 2] and go to Step 1.

Step 3: Build an aggregate model and go to Step 4.

Step 4: Is there any binary predictor whose contribution is very high?
− If YES, go to Step 5; if NO, go to Step 6.

Step 5: Develop segment-level models [refer to scenario (c) in Sec. 2] and go to Step 4.

Step 6: Does splitting the data by the values of any predictor cause the sign of the correlation of the other predictors with the target variable to flip or change a lot?
− If YES, go to Step 7; if NO, go to Step 8.

Step 7: Develop segment-level models [refer to scenario (d) in Sec. 2] and go to Step 4.

Step 8: Are there any patterns (based on a classification tree) in the residuals of the aggregate model?
− If YES, go to Step 9; if NO, go to STOP.

Step 9: Develop segment-level models [refer to scenario (e) in Sec. 2] and go to Step 4.

STOP: No further segmentation is needed.

9. Conclusion

Qualitative research to gain the required business sense helps in building better models, and can be a great starting point for defining segments. Additional effort, such as a few profiling exercises before finalizing an aggregate model, may yield better pay-offs in terms of answers to the questions below:

– Why is one variable contributing so much to this model?
– Why do the magnitudes of the target-variable correlations for key predictors vary across data subsets?
– Does the unsegmented base model have systematically higher errors for some pockets of the population?


10. Acknowledgements

[1] SAS is a registered trademark of the SAS Institute, Inc. of Cary, North Carolina.
[2] This paper was written while the authors were colleagues at EXL Services.

11. Annexure

Capture Rate Method

Consider two segments with 'm' and 'n' observations respectively.

[Illustrative snapshots: Segment 1 and Segment 2, each sorted by the model's predicted probability in descending order.]

For a reasonably good model (in either segment), the top percentiles of the predicted score should contain more events than non-events (for instance, in the illustrative snapshots, the first 6 records capture 5 events). Note the striking difference between the maximum predicted scores of the two segments. If you simply append the two segments and sort by score to evaluate model performance, the top observations of Segment 2 (which do capture events) would lose their importance. To resolve this, capture rates are used in one form or another to compute new scores. Note that the capture rate at the percentile level is the lift value for each percentile of the predicted score.
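A minimal sketch of turning a segment's raw scores into percentile capture rates (hypothetical column names; not necessarily the authors' exact procedure):

    import pandas as pd

    def capture_rate_scores(df, score_col='score', event_col='event', bins=100):
        # Replace raw scores with each percentile's capture rate (lift), so that
        # scores become comparable before the segments are appended together.
        # Assumes len(df) >= bins.
        out = df.copy()
        ranks = out[score_col].rank(method='first', ascending=False)
        out['pctile'] = pd.qcut(ranks, bins, labels=False)
        lift = out.groupby('pctile')[event_col].mean() / out[event_col].mean()
        out['new_score'] = out['pctile'].map(lift)
        return out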

About the Authors

Krishna is co-founder and Principal at Jigyasa Analytics. A Yale alumnus, Krishna has experience consulting for several Fortune 500 companies in Banking, Credit Cards, Insurance, Telecommunications, Publishing, Transportation and Retail. His areas of expertise include Customer Acquisition, Cross-Sell and Upsell, Fraud Analytics, Retention Analytics, Segmentation, Experiment Design, Marketing Mix Modeling and Forecasting. Email: [email protected]

Varun is a modeler at EXL Services. A graduate of the Delhi School of Economics, Varun has extensive experience developing customized analytics solutions that help businesses make better marketing decisions, particularly in the Telecommunications and Financial Services verticals. Email: [email protected]


Analytics-CRM Council Advisory Committee

Chair: Devyani Sadh, Ph.D. (CEO, Data Square)
Vice Chair & Journal Editor: Leo Kluger (WW Program Director, IBM)
Associate Journal Editor: Gary Cao (SVP Decision Analytics, IXI)
Newsletter Lead: John Young (SVP Analytics, Epsilon)
Academic Liaison: Pete Fader, Ph.D. (Professor, Wharton, UPenn)
CRM Strategy Lead: Mark Picone (Experian)
CRM Strategy Lead: Denise Gatto (VP, Astoria Federal)
Analytic Challenge Lead: Krishna Mehta (Principal, Jigyasa Analytics)
Seminar and Webinar Lead: Jacque Paige (Recruiter, Smith Hanley Assoc.)
Mentoring Program Lead: Peter Zajonc (Sr. Director Analytics, Epsilon)
Tough-Nuts Program Lead: Dave Miller (SVP Analytics, Claritas)
Membership Committee: Onder Oguzhan (Partner, Managed Analytics)
Advisor: Joe Devanny (Readers Digest)
Membership Lead: Open Position
Social Networking Lead: Open Position


Data Square: Thought-Leadership in Analytics and CRM

– Big Data and Real-time Data Management
– Statistical and Artificial Intelligence Methodologies
– Traditional, Digital, Email, and Social Media Analytics
– Real-time Analytics and Dynamic Model Scoring Engine
– Analytic Integration with Marketing Automation Technology
– Integrated Business Insights driven by Analytics and Technology

1-877-DATASET | 1-203-964-9733 Ext. 200
www.datasquare.com | [email protected]


Smith Hanley Associates: Bridging Talent & Opportunity
Recruitment Specialists since 1980

– Marketing Analytics
– Marketing Mix Modeling
– Survey Methodology
– Forecasting
– Predictive Modeling
– Credit Policy Analysis
– Risk Management

Jacqueline Paige | [email protected]
203-319-4300 x287 | www.smithhanley.com
