Commodity information must be matched to an HS code so that exports can clear customs quickly. It is therefore particularly important to identify the entity name in commodity titles on e-commerce platforms quickly and accurately. To address this problem, we propose an approach based on TWs-LSTM to identify commodity entity names. We apply the TF-IDF algorithm to the commodity text corpus to obtain a weight matrix for the commodity words. Meanwhile, we use the Word2Vec model to represent the semantic meanings of the words extracted from the bag of words. The weight vector of each commodity title and the word vectors of the words in that title are then combined into a new one-dimensional vector. We use these one-dimensional vectors to represent the commodity titles and call this representation the TWs model. Finally, we feed the TWs vectors into an LSTM for commodity entity name recognition. In the experiments, we divide the commodity entity name data into a training set and a testing set and compare the TWs-LSTM model with other text processing models. On the commodity title corpus of the Tmall platform, the TWs-LSTM model reaches an F1-score of 64.58%, achieving state-of-the-art results compared with the baseline models and previous studies.
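
The construction of a TWs vector described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the weights and word vectors are combined, so plain concatenation with zero padding is assumed here. The smoothed IDF formula, the random vectors standing in for trained Word2Vec embeddings, the toy titles, and the names `tfidf_weights`, `toy_embeddings`, `tws_vector`, `DIM`, and `MAX_LEN` are likewise hypothetical.

```python
import math
import random
from collections import Counter


def tfidf_weights(titles):
    """Return one {word: TF-IDF weight} dict per tokenized title.

    Assumption: a smoothed IDF variant; the paper's exact formula is not given.
    """
    n_docs = len(titles)
    doc_freq = Counter(word for title in titles for word in set(title))
    weights = []
    for title in titles:
        counts = Counter(title)
        weights.append({
            word: (count / len(title))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, count in counts.items()
        })
    return weights


def toy_embeddings(vocab, dim, seed=0):
    """Random vectors standing in for trained Word2Vec embeddings (assumption)."""
    rng = random.Random(seed)
    return {word: [rng.uniform(-1, 1) for _ in range(dim)] for word in sorted(vocab)}


def tws_vector(title, weights, embeddings, max_len, dim):
    """Build a one-dimensional vector for one title.

    Assumption: the per-word TF-IDF weights are concatenated with the
    flattened word vectors, and short titles are zero-padded to max_len.
    """
    words = title[:max_len]
    pad = max_len - len(words)
    weight_part = [weights[word] for word in words] + [0.0] * pad
    vector_part = [x for word in words for x in embeddings[word]] + [0.0] * (pad * dim)
    return weight_part + vector_part


# Hypothetical tokenized commodity titles.
titles = [
    ["cotton", "men", "shirt"],
    ["leather", "women", "handbag"],
    ["cotton", "women", "dress", "summer"],
]
DIM, MAX_LEN = 4, 4

weights = tfidf_weights(titles)
embeddings = toy_embeddings({word for title in titles for word in title}, DIM)
tws = [
    tws_vector(title, title_weights, embeddings, MAX_LEN, DIM)
    for title, title_weights in zip(titles, weights)
]
print(len(tws[0]))  # MAX_LEN + MAX_LEN * DIM = 20
```

Each resulting vector has a fixed length, so a batch of them could be passed to an LSTM tagger or classifier; that recognition step is omitted here because the abstract gives no details of the network.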
               