Gender information is very important for the recommendation system in the online shopping website. However, gender data often face label missing and incorrect labelling problems caused by consumers’ unwillingness to… Click to show full abstract
Gender information is very important for the recommendation system in the online shopping website. However, gender data often face label missing and incorrect labelling problems caused by consumers’ unwillingness to actively disclose personal information, which leads to gender estimation results that cannot meet the needs of the product recommendation system. To discover the customers’ gender information, we explore the customers’ online shopping behavior, especially the items viewed in the shopping session, from the dataset provided by Vietnam FPT Group. The dataset is very imbalanced while the number of female samples is $3\times $ of the male samples. To address the imbalance issue, we cluster the female samples into three subsets and then train a two-layer classifier model to estimate the customers’ gender. Experimental results demonstrate that our proposed method could achieve a combined accuracy 78% on average, and takes less than 6 seconds on average. As a data mining model for gender prediction, our approach has a lightweight network structure and less time consumption.
               
Click one of the above tabs to view related content.