2020 (English) In: ACM Transactions on Interactive Intelligent Systems, ISSN 2160-6455, E-ISSN 2160-6463, Vol. 10, no. 1, pp. 4:1-4:32, article id 4. Journal article (Peer reviewed) Published
Abstract [en]
This work presents an extension of the Thompson Sampling bandit policy for orchestrating a collection of base recommendation algorithms for e-commerce. We focus on the problem of item-to-item recommendations, in which multiple behavioral and attribute-based predictors are provided to an ensemble learner. In addition, we detail the construction of a personalized predictor based on k-Nearest Neighbors (kNN), with temporal decay capabilities and event weighting. We show how to adapt Thompson Sampling to realistic situations in which neither action availability nor reward stationarity is guaranteed. Furthermore, we investigate the effects of priming the sampler with pre-set parameters of the reward probability distributions, derived from the product catalog and/or event history when such information is available. We report experimental results based on the analysis of three real-world e-commerce datasets.
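The orchestration described above can be illustrated with a minimal Bernoulli Thompson Sampling sketch: each base recommender is an arm with a Beta posterior over its reward probability, priors can be "primed" with pre-set parameters, and selection is restricted to the arms currently available. The class and arm names are illustrative assumptions, not the paper's actual implementation.

```python
import random


class ThompsonSampler:
    """Bernoulli Thompson Sampling over a set of base recommenders (arms).

    Each arm keeps a Beta(alpha, beta) posterior over its reward
    probability. Priors may be primed with pre-set parameters (e.g. from
    the product catalog or event history); unprimed arms start at the
    uniform Beta(1, 1). This is a hedged sketch, not the paper's code.
    """

    def __init__(self, arms, priors=None):
        # priors: optional {arm: (alpha, beta)} of pre-set parameters
        self.params = {a: (priors or {}).get(a, (1.0, 1.0)) for a in arms}

    def select(self, available=None):
        # Sample only among currently available arms, since action
        # availability is not guaranteed in a streaming setting.
        candidates = available if available is not None else list(self.params)
        samples = {a: random.betavariate(*self.params[a]) for a in candidates}
        return max(samples, key=samples.get)

    def update(self, arm, reward):
        # Bernoulli reward: 1 = positive interaction, 0 = none.
        alpha, beta = self.params[arm]
        self.params[arm] = (alpha + reward, beta + 1.0 - reward)
```

A primed arm such as `ThompsonSampler(["knn", "popularity"], priors={"knn": (5.0, 1.0)})` starts with an optimistic posterior for the kNN predictor, while `select(available=[...])` handles the case where some recommenders cannot produce a suggestion for the current item.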
Place, publisher, year, edition, pages
ACM Digital Library, 2020
Keywords
E-commerce recommender systems, Thompson Sampling, multi-arm bandit ensembles, session-based recommendations, streaming recommendations
Identifiers
urn:nbn:se:mau:diva-2479 (URN)
10.1145/3237187 (DOI)
000564083500004 ()
2-s2.0-85075692959 (Scopus ID)
30500 (Local ID)
30500 (Archive number)
30500 (OAI)
Available from: 2020-02-27 Created: 2020-02-27 Last updated: 2024-09-19 Bibliographically approved