Product matching in digital marketplaces: A multimodal model based on the transformer architecture

Artem Yu. Varnukhov; Dmitry M. Nazarov

Artem Yu. Varnukhov Ural State University of Economics, Yekaterinburg, Russia
Dmitry M. Nazarov Ural State University of Economics, Yekaterinburg, Russia https://orcid.org/0000-0002-5847-9718

Keywords: digital marketplace, contextual-semantic identification, competitive offers search, product matching, machine learning, deep learning, transformer architecture, data mining

Abstract

In this paper we analyze the problem of intelligent product matching in digital marketplaces, which requires evaluating the similarity of records that describe products but may differ in the format, content or volume of their multimodal data. The subject area lies at the intersection of methods for solving the entity resolution (ER) problem: record matching and multimodal data analysis. The problem is highly relevant in a fast-growing platform economy in which the e-commerce market is expanding exponentially. The main purpose of this research is to develop and test an intelligent multimodal model based on the transformer architecture that improves the accuracy and robustness of product matching in digital marketplaces. We developed a model that integrates textual, visual and tabular attributes, which makes it possible to identify similar products, find competitive offers, detect duplicates, and cluster and segment products more effectively. The proposed approach relies on the self-attention mechanism, which models contextual-semantic relations between data of different kinds. Vector representations of text descriptions are extracted with language models, in particular the Sentence-BERT architecture; the visual component uses Vision Transformer; and tabular data are processed with specialized mechanisms for learning on structured data based on TabTransformer. Our experiment demonstrated that the developed multimodal model solves the product matching task efficiently even when product items vary significantly and the data are heterogeneous. The results also suggest that the model can be adapted to other product categories. Overall, the results confirm that the multimodal approach is an effective and practical way to implement product matching in digital marketplaces. This allows e-commerce market participants to significantly improve the quality of inventory management, increase pricing efficiency and strengthen their competitive advantages.
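
The fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy four-dimensional vectors stand in for embeddings that Sentence-BERT, Vision Transformer and TabTransformer would produce, the attention uses identity query/key/value projections instead of learned weights, and the names `fuse`, `self_attention` and `cosine` are hypothetical helpers introduced only for this illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(xs):
    # Subtract the maximum for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: identity Q/K/V projections (no learned weights).
    """
    scale = math.sqrt(len(tokens[0]))
    attended = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        attended.append([
            sum(w * value[i] for w, value in zip(weights, tokens))
            for i in range(len(query))
        ])
    return attended

def fuse(text_vec, image_vec, table_vec):
    """Let the three modality vectors attend to each other, then mean-pool."""
    attended = self_attention([text_vec, image_vec, table_vec])
    dim = len(text_vec)
    return [sum(tok[i] for tok in attended) / len(attended) for i in range(dim)]

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical toy embeddings: (text, image, tabular) for three offers.
# Offers A and B describe the same product; offer C is a different one.
offer_a = fuse([0.9, 0.1, 0.0, 0.2], [0.8, 0.2, 0.1, 0.1], [1.0, 0.0, 0.0, 0.3])
offer_b = fuse([0.85, 0.15, 0.05, 0.2], [0.75, 0.25, 0.1, 0.15], [1.0, 0.0, 0.0, 0.3])
offer_c = fuse([0.0, 0.9, 0.8, 0.1], [0.1, 0.7, 0.9, 0.0], [0.0, 1.0, 0.0, 0.5])

score_ab = cosine(offer_a, offer_b)  # similarity of the matching pair
score_ac = cosine(offer_a, offer_c)  # similarity of the non-matching pair
print(f"A vs B: {score_ab:.3f}, A vs C: {score_ac:.3f}")
```

In this sketch a pair would be declared a match when its cosine similarity exceeds a threshold chosen on validation data; a trained model would instead learn the projections and the decision layer from labelled product pairs.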
