Next-Purchase Prediction Using Projections of Discounted Purchasing Sequences

Shapoval, Katerina; Setzer, Thomas

doi:10.1007/s12599-017-0485-1

Research Paper
Published: 27 June 2017

Volume 60, pages 151–166, (2018)

Business & Information Systems Engineering

Abstract

A primary task of customer relationship management (CRM) is the transformation of customer data into business value related to customer binding and development, for instance by offering additional products that meet customers’ needs. A customer’s purchasing history (or sequence) is a promising feature for better anticipating customer needs, such as the next purchase intention. To operationalize this feature, sequences need to be aggregated before supervised prediction is applied. This is because numerous sequences might exist with little support (number of observations) per unique sequence, which discourages inferences from past observations at the level of individual sequences. In this paper, the authors propose mechanisms to aggregate sequences into generalized purchasing types. The mechanisms group sequences according to their similarity but allow higher weights to be given to more recent purchases. The observed conversion rate per purchasing type can then be used to predict a customer’s probability of a next purchase and to target the customers most prone to purchasing a particular product. The bias–variance trade-off that arises when the models are applied to target customers with respect to the lift criterion is discussed. The mechanisms are tested on empirical data in the realm of cross-selling campaigns. Results show that the expected bias–variance behavior predicts the lift achieved with the mechanisms well. Results also show a superior performance of the proposed methods compared to commonly used segmentation-based approaches, different similarity measures, and popular class predictors. While the authors tested the approaches for CRM campaigns, their parameterization can be adjusted to operationalize sequential features of high cardinality in other domains or business functions as well.
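
The abstract's step from conversion rates per purchasing type to the lift criterion can be illustrated with a minimal sketch. It assumes the common definition of lift as the conversion rate among the top-ranked fraction of customers divided by the overall conversion rate; the function names `conversion_rates` and `lift_at` and the toy data are hypothetical, and the rates are estimated and evaluated on the same data purely for brevity, which is not how the article's out-of-sample evaluation works.

```python
from collections import defaultdict


def conversion_rates(types, outcomes):
    """Observed conversion rate per purchasing type (hypothetical helper)."""
    counts = defaultdict(lambda: [0, 0])  # type -> [conversions, observations]
    for t, y in zip(types, outcomes):
        counts[t][0] += y
        counts[t][1] += 1
    return {t: conv / n for t, (conv, n) in counts.items()}


def lift_at(scores, outcomes, fraction):
    """Conversion rate in the top `fraction` of customers by score,
    divided by the overall conversion rate (assumed lift definition)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not scores or len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must be non-empty and equally long")
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate == 0:
        raise ValueError("lift is undefined when nobody converts")
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(zip(scores, outcomes), key=lambda p: p[0], reverse=True)
    top_rate = sum(y for _, y in ranked[:k]) / k
    return top_rate / base_rate


# Toy data: one purchasing type and one observed outcome per customer.
types = ["A", "A", "B", "B", "B", "C", "C", "C", "C", "C"]
outcomes = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

rates = conversion_rates(types, outcomes)
scores = [rates[t] for t in types]  # each customer is scored by the rate of their type
top_lift = lift_at(scores, outcomes, 0.2)
```

In this toy example the top 20% of customers all belong to type A, so targeting them converts at a multiple of the overall rate. With few observations per type, such rates become noisy, which is the bias–variance concern the abstract raises.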

Notes

The company operates in the US and European markets for telecommunication services. The company achieved annual revenues in the double-digit billion Euro range. The product portfolio ranges from basic starter products such as Internet domains, through various hosting solutions, up to professional server solutions for large-scale businesses, and also includes mobile telephony and access products such as digital subscriber lines.
Such approaches are often used within a broader category of methods such as the Sequence Alignment Method (SAM; Kruskal 1983).
Geometrically descending weights are a widely used technique for modeling a discounted importance of observations, for example in time series forecasting (Brown 2004).
This is very different from tasks such as class prediction, where a classifier is typically assessed by the total accuracy or its (potentially weighted) confusion matrix computed over all test data instances. The discrepancy between the business-oriented objective of lift and the traditional accuracy measures, as well as its implications, are extensively discussed in Baumann et al. (2015).
Evaluations with higher and lower parameter values delivered clearly worse results and are not considered further in this article.
This results in a 48-dimensional binary vector encoding 9 + 9 potential products for the first two purchases, and 10 + 10 + 10 potential products when including $P_0$ for the three prior purchases.
We apply $\lambda_{\text{Box-Cox}} = 0.26$, as in our dataset we observe approximately white-noise error structures with this value.
We used the Wilcoxon test as the more conservative approach; a t-test was also conducted on the Box–Cox-transformed values and likewise confirmed the significance of the difference in lift.
Note that the loss in lift and the normalized out-of-sample lift sum to 1.
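
The note on geometrically descending weights can be made concrete with a small sketch. It assumes products are indexed from 0 and that a purchase made `age` steps before the most recent one receives weight `delta ** age`; the function name `discounted_encoding`, the value `delta = 0.5`, and the use of Euclidean distance are illustrative assumptions and not the article's actual projection or parameterization.

```python
import math


def discounted_encoding(sequence, n_products, delta=0.5):
    """Encode a purchasing sequence (oldest purchase first) as a vector in
    which each purchase adds delta**age to its product's coordinate,
    with age 0 for the most recent purchase (hypothetical scheme)."""
    if not 0 < delta <= 1:
        raise ValueError("delta must be in (0, 1]")
    vec = [0.0] * n_products
    last = len(sequence) - 1
    for pos, product in enumerate(sequence):
        if not 0 <= product < n_products:
            raise ValueError(f"unknown product index: {product}")
        vec[product] += delta ** (last - pos)
    return vec


# Product 2 bought first, then product 0, then product 2 again.
example = discounted_encoding([2, 0, 2], n_products=3)

# Two sequences that differ only in the OLDER purchase ...
d_old = math.dist(discounted_encoding([1, 2], 3), discounted_encoding([0, 2], 3))
# ... versus two sequences that differ only in the MOST RECENT purchase.
d_recent = math.dist(discounted_encoding([2, 1], 3), discounted_encoding([2, 0], 3))
```

Under this weighting, sequences that disagree in the most recent purchase end up farther apart than sequences that disagree only in an older one, which is the sense in which recent purchases carry more weight when sequences are grouped by similarity.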

References

Back B, Holmbom A, Eklund T (2011) Customer portfolio analysis using the SOM. Int J Bus Inf Syst 8(4):396–412
Baumann A, Lessmann S, Coussement K, De Bock KW (2015) Maximize what matters: predicting customer churn with decision-centric ensemble selection. In: ECIS 2015 completed research papers. http://aisel.aisnet.org/ecis2015_cr/15/. Accessed 25 June 2017
Bicego M, Murino V, Figueiredo MA (2003) Similarity-based clustering of sequences using hidden Markov models. Machine learning and data mining in pattern recognition. Springer, Heidelberg, pp 86–95
Bose I, Chen X (2009) Quantitative models for direct marketing: a review from systems perspective. Eur J Oper Res 195(1):1–16
Brown RG (2004) Smoothing, forecasting and prediction of discrete time series. Courier Dover Publications, Mineola, NY
Chan CCH (2008) Intelligent value-based customer segmentation method for campaign management: a case study of automobile retailer. Expert Syst Appl 34(4):2754–2762
Cho YB, Cho YH, Kim SH (2005) Mining changes in customer buying behavior for collaborative recommendations. Expert Syst Appl 28(2):359–369
Daoud RA, Amine A, Bouikhalene B, Lbibb R (2015) Combining RFM model and clustering techniques for customer value analysis of a company selling online. In: Computer systems and applications (AICCSA), 2015 IEEE/ACS 12th international conference, IEEE, pp 1–6
Domingos P (2000) A unified bias-variance decomposition. In: Proceedings of 17th international conference on machine learning. Morgan Kaufmann, Stanford, CA, pp 231–238
Dunlavy DM, Kolda TG, Acar E (2011) Temporal link prediction using matrix and tensor factorizations. ACM Trans Knowl Discov Data TKDD 5(2):10
Han SH, Lu SX, Leung SC (2012) Segmentation of telecom customers based on customer value by decision tree model. Expert Syst Appl 39(4):3964–3973
Hsu MW, Lessmann S, Sung MC, Ma T, Johnson JE (2016) Bridging the divide in financial market forecasting: machine learners vs. financial economists. Expert Syst Appl 61:215–234
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning, vol 6. Springer, Heidelberg
Joh CH, Timmermans HJ, Popkowski-Leszczyc PT (2003) Identifying purchase-history sensitive shopper segments using scanner panel data and sequence alignment methods. J Retail Consum Serv 10(3):135–144
Kaski S, Nikkilä J, Kohonen T (1998) Methods for interpreting a self-organized map in data analysis. In: Proceedings of the 6th European symposium on artificial neural networks (ESANN98). D-Facto, Bruges
Khajvand M, Tarokh MJ (2011) Estimating customer future value of different customer segments based on adapted RFM model in retail banking context. Proced Comput Sci 3:1327–1332
Kohonen T (2001) Self-organizing maps. Springer, Heidelberg
Kruskal JB (1983) An overview of sequence comparison: time warps, string edits, and macromolecules. SIAM Rev 25(2):201–237
Levenshtein VI (1966) Binary codes capable of correcting deletions, insertions and reversals. Cybern Control Theory 10:845–848
Li S, Sun B, Wilcox RT (2005) Cross-selling sequentially ordered products: an application to consumer banking services. J Mark Res 42(2):233–239
MacQueen J et al (1967) Some methods for classification and analysis of multivariate observations, vol 1. In: Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, California, pp 281–297
Miguéis V, Van den Poel D, Camanho A, Cunha J (2012) Predicting partial customer churn using Markov for discrimination for modeling first purchase sequences. Adv Data Anal Classif 6(4):337–353
Moeyersoms J, Martens D (2015) Including high-cardinality attributes in predictive models: A case study in churn prediction in the energy sector. Decis Support Syst 72:72–81
Moon S, Russell GJ (2008) Predicting product purchase from inferred customer similarity: an autologistic model approach. Manag Sci 54(1):71–82
Mooney CH, Roddick JF (2013) Sequential pattern mining—approaches and algorithms. ACM Comput Surv 45(2):19:1–19:39
Netzer O, Lattin JM, Srinivasan V (2008) A hidden Markov model of customer relationship dynamics. Mark Sci 27(2):185–204
Ngai E, Xiu L, Chau D (2009) Application of data mining techniques in customer relationship management: a literature review and classification. Expert Syst Appl 36(2):2592–2602
Park DH, Kim HK, Choi IY, Kim JK (2012) A literature review and classification of recommender systems research. Expert Syst Appl 39(11):10059–10072
Piatetsky-Shapiro G, Masand B (1999) Estimating campaign benefits and modeling lift. In: Proceedings of the fifth ACM SIGKDD international conference on knowledge discovery and data mining, ACM, New York, KDD ’99, pp 185–193. doi:10.1145/312129.312225
Prinzie A, Van den Poel D (2007) Predicting home-appliance acquisition sequences: Markov/Markov for discrimination and survival analysis for modeling sequential information in NPTB models. Decis Support Syst 44(1):28–45
Sahoo N, Singh PV, Mukhopadhyay T (2012) A hidden Markov model for collaborative filtering. MIS Q 36(4):1329–1356
Schweidel DA, Bradlow ET, Fader PS (2011) Portfolio dynamics for customers of a multiservice provider. Manag Sci 57(3):471–486
Shirley KE, Small DS, Lynch KG, Maisto SA, Oslin DW (2010) Hidden Markov models for alcoholism treatment trial data. Ann Appl Stat 4:366–395
Steinmann S, Silberer G (2010) Clustering customer contact sequences—results of a customer survey in retailing. European Retail Research. Gabler, Wiesbaden, pp 97–120
Van den Poel D, Buckinx W (2005) Predicting online-purchasing behaviour. Eur J Oper Res 166(2):557–575
Wong KW, Zhou S, Yang Q, Yeung JMS (2005) Mining customer value: from association rules to direct marketing. Data Min Knowl Discov 11(1):57–79

Author information

Authors and Affiliations

Karlsruhe Institute of Technology (KIT), Institute of Information Systems and Marketing (IISM), Information and Market Engineering (IM), Fritz-Erler-Straße 23, 76133, Karlsruhe, Germany
Katerina Shapoval & Thomas Setzer

Corresponding author

Correspondence to Katerina Shapoval.

Additional information

Accepted after two revisions by Prof. Dr. Suhl.

About this article

Cite this article

Shapoval, K., Setzer, T. Next-Purchase Prediction Using Projections of Discounted Purchasing Sequences. Bus Inf Syst Eng 60, 151–166 (2018). https://doi.org/10.1007/s12599-017-0485-1

Received: 27 May 2016
Accepted: 23 February 2017
Published: 27 June 2017
Issue Date: April 2018
DOI: https://doi.org/10.1007/s12599-017-0485-1
