Lexical semantic change detection has developed rapidly in recent years. Existing algorithms for lexical semantic change detection face difficulties when applied to words denoting named entities. This paper proposes a method for identifying a word in a large corpus that has started to be used as a named entity, and for dating its first use as a proper name. To solve this problem, we first present an algorithm that detects words denoting named entities in a large corpus. The recognizer is based on an analysis of co-occurrences with the most frequent words and was trained on data from the English subcorpus of the Google Books Ngram corpus. It achieves a named entity recognition accuracy of 98.44% on the test sample. Second, we test whether the trained recognizer can be applied to diachronic data. The analysed cases show that the recognizer, although initially trained on total bigram frequencies over a long time interval, gives stable results on annual frequency values, at least for frequent words. This can make the recognizer a good tool for language evolution studies, especially for detecting new meanings of words. The analysed cases also show that the proposed method reveals new word meanings associated with named entities and detects genericized meanings of words that were earlier used as proper names.
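
To make the idea of features built from co-occurrences with the most frequent words more concrete, the following is a minimal sketch in Python. It is not the recognizer described in the paper: the function name `cooccurrence_features`, the choice of context words, the normalisation, and the toy counts are all assumptions made for illustration. It only shows how bigram counts could be turned into a fixed-length vector that a classifier might consume.

```python
def cooccurrence_features(target, bigram_counts, context_words):
    """Relative frequencies of `target` next to each context word.

    bigram_counts maps (left_word, right_word) -> count.
    The vector has 2 * len(context_words) entries:
    left-neighbour shares first, then right-neighbour shares.
    (Hypothetical helper; the paper's actual features may differ.)
    """
    left = [bigram_counts.get((c, target), 0) for c in context_words]
    right = [bigram_counts.get((target, c), 0) for c in context_words]
    total = sum(left) + sum(right)
    if total == 0:
        # Target never occurs next to any context word.
        return [0.0] * (2 * len(context_words))
    return [n / total for n in left + right]


# Toy counts invented for illustration; not taken from Google Books Ngram.
toy_bigrams = {
    ("the", "apple"): 6, ("an", "apple"): 3, ("apple", "is"): 1,
    ("of", "Apple"): 4, ("Apple", "is"): 4,
}
context = ["the", "an", "of", "is"]

common_noun_vec = cooccurrence_features("apple", toy_bigrams, context)
proper_name_vec = cooccurrence_features("Apple", toy_bigrams, context)
```

In this toy data the common noun is dominated by article contexts while the capitalised form is not, which is the kind of contrast a classifier could learn. Computing the same vector from the counts of a single year, rather than from totals over a long interval, would correspond to the diachronic use mentioned in the abstract.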
               