A Reinforcement Learning Approach to Optimize Discount and Reputation Tradeoffs in E-commerce Systems

Authors:

John C. S. Lui

ACM Transactions on Internet Technology (TOIT), Volume 20, Issue 4

Article No.: 37, Pages 1 - 26

https://doi.org/10.1145/3400024

Published: 27 October 2020

Abstract

Feedback-based reputation systems are widely deployed in E-commerce systems. Evidence shows that sellers in such systems may need a substantial amount of time to earn a reputable label, which reduces their profit. We propose to enhance sellers’ reputation via price discounts. This raises two challenges: (1) the demand from buyers depends on both the discount and the reputation, and (2) the demand is unknown to the seller. To address these challenges, we first formulate a profit maximization problem as a semi-Markov decision process to explore the optimal tradeoffs in selecting price discounts. We prove the monotonicity of the optimal profit and the optimal discount. Based on this monotonicity, we design a Q-learning with forward projection (QLFP) algorithm, which infers the optimal discount from historical transaction data. We prove that the QLFP algorithm converges to the optimal policy. We conduct trace-driven simulations using a dataset from eBay to evaluate the QLFP algorithm. Evaluation results show that QLFP improves the profit by up to 50% over both Q-learning and Speedy Q-learning. The QLFP algorithm also improves both the reputation and the profit by up to two times over the scheme that provides no price discount.
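To make the learning setting in the abstract concrete, the following is a minimal sketch of plain tabular Q-learning (one of the baselines named above, not the QLFP algorithm itself) on a toy discount and reputation model. Everything in it is an assumption made for illustration: the five reputation levels, the discount grid, the price and cost, the demand function inside `step`, and the reputation dynamics. It also uses a discrete-time Markov decision process rather than the semi-Markov formulation of the article.

```python
import random

# Hypothetical toy model: every constant below is an illustrative assumption,
# not a value taken from the article or from the eBay dataset.
REPUTATION_LEVELS = 5               # states 0..4; 4 is the highest reputation
DISCOUNTS = [0.0, 0.1, 0.2, 0.3]    # candidate price discounts (actions)
PRICE, COST = 10.0, 6.0             # list price and unit cost
GAMMA, ALPHA, EPSILON = 0.9, 0.1, 0.1


def step(state, action, rng):
    """Simulate one sales period. The learner never sees this demand model."""
    discount = DISCOUNTS[action]
    # Assumed demand: sale probability grows with reputation and discount.
    p_sale = min(1.0, 0.2 + 0.15 * state + 1.0 * discount)
    sold = rng.random() < p_sale
    reward = (PRICE * (1.0 - discount) - COST) if sold else 0.0
    # Assumed dynamics: a sale may raise the reputation level by one.
    next_state = state
    if sold and state < REPUTATION_LEVELS - 1 and rng.random() < 0.3:
        next_state = state + 1
    return reward, next_state


def q_learning(episodes=2000, horizon=50, seed=0):
    """Standard epsilon-greedy tabular Q-learning over (reputation, discount)."""
    rng = random.Random(seed)
    q = [[0.0] * len(DISCOUNTS) for _ in range(REPUTATION_LEVELS)]
    for _ in range(episodes):
        state = 0  # each episode starts as a new seller with no reputation
        for _ in range(horizon):
            if rng.random() < EPSILON:
                action = rng.randrange(len(DISCOUNTS))
            else:
                action = max(range(len(DISCOUNTS)), key=lambda a: q[state][a])
            reward, next_state = step(state, action, rng)
            # Temporal-difference update toward the one-step bootstrap target.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the discount with the highest learned value at each level."""
    return [DISCOUNTS[max(range(len(DISCOUNTS)), key=lambda a: row[a])]
            for row in q]


if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

Running the script prints one learned discount per reputation level. The sketch only shows how a discount policy can be learned from simulated transactions when demand is unknown; it does not exploit the monotonicity result or the forward projection step described in the abstract.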

Cited By

Wang L. (2024). Consumer evaluation mechanisms on e-commerce platforms: reputation management and analysis of influencing factors. Applied Mathematics and Nonlinear Sciences 9:1. Online publication date: 5-Aug-2024.
https://doi.org/10.2478/amns-2024-1872
Sharma D., Maurya S., Punhan R., Ojha M., Ojha P. (2023). E-Commerce: Reach Customers and Drive Sales with Data Science and Big Data Analytics. 2023 2nd International Conference for Innovation in Technology (INOCON), 1-6. Online publication date: 3-Mar-2023.
https://doi.org/10.1109/INOCON57975.2023.10101132
Lan R., Wang J., Huang W., Deng Z., Sun X., Chen Z., Luo X. (2021). Chinese Emotional Dialogue Response Generation via Reinforcement Learning. ACM Transactions on Internet Technology 21:4, 1-17. Online publication date: 22-Jul-2021.
https://dl.acm.org/doi/10.1145/3446390

Index Terms

A Reinforcement Learning Approach to Optimize Discount and Reputation Tradeoffs in E-commerce Systems
1. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms
2. Information systems
  1. World Wide Web
    1. Web applications
      1. Electronic commerce

Published In

ACM Transactions on Internet Technology Volume 20, Issue 4

November 2020

391 pages

ISSN:1533-5399

EISSN:1557-6051

DOI:10.1145/3427795

Editor:
Ling Liu
Georgia Institute of Technology, USA

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 October 2020

Accepted: 01 May 2020

Revised: 01 April 2020

Received: 01 November 2019

Published in TOIT Volume 20, Issue 4

Qualifiers

Research-article
Research
Refereed

Funding Sources

GRF
Chongqing High-Technology Innovation and Application Development Funds
National Natural Science Foundation of China
