We extracted 3,291,101 Tweets using hashtags associated with African American-related discourse (#BlackTwitter, #BlackLivesMatter, #StayWoke) and 1,382,441 Tweets from a control set (general or no hashtags) from September 1, 2019 to December 31, 2019 using the Twitter API. We also extracted a literary historical corpus of 14,692 poems and prose writings by African American authors and 66,083 items authored by others as a control, including poems, plays, short stories, novels and essays, using a cloud-based machine learning platform (Amazon SageMaker) via ProQuest TDM Studio. Lastly, we combined statistics from log likelihood and Fisher's exact tests as well as feature analysis of a batch-trained Naive Bayes classifier to select lexicons of terms most strongly associated with the target or control texts. The resulting Tweet-derived African American lexicon contains 1,734 unigrams, while the control contains 2,266 unigrams. This initial version of a lexicon-based African American Tweet detection algorithm developed using Tweet texts will be useful to inform culturally sensitive Twitter-based social support interventions for African American dementia caregivers.
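To illustrate the log likelihood step, here is a minimal sketch of how a unigram can be scored for association with the target or control texts. It is not the authors' implementation. It uses the two-term log likelihood (G2) form common in corpus linguistics; the function names `log_likelihood` and `keyness`, the whitespace tokenization, and the signed-score convention are all assumptions made for illustration. The Fisher's exact test and the Naive Bayes feature analysis mentioned above are not shown.

```python
import math
from collections import Counter


def log_likelihood(a, b, n_target, n_control):
    """G2 for a term seen `a` times in the target corpus (n_target tokens)
    and `b` times in the control corpus (n_control tokens)."""
    total = n_target + n_control
    # Expected counts if the term were equally likely in both corpora.
    e_target = n_target * (a + b) / total
    e_control = n_control * (a + b) / total
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / e_target)
    if b > 0:
        g2 += b * math.log(b / e_control)
    return 2.0 * g2


def keyness(target_docs, control_docs):
    """Signed G2 per unigram: positive means relatively more frequent in
    the target corpus, negative means more frequent in the control.
    Assumes both corpora are non-empty; tokenization is a naive split."""
    t = Counter(tok for doc in target_docs for tok in doc.lower().split())
    c = Counter(tok for doc in control_docs for tok in doc.lower().split())
    n_t, n_c = sum(t.values()), sum(c.values())
    scores = {}
    for term in set(t) | set(c):
        a, b = t[term], c[term]
        sign = 1 if a / n_t >= b / n_c else -1
        scores[term] = sign * log_likelihood(a, b, n_t, n_c)
    return scores
```

Under these assumptions, ranking the terms by score and keeping those above a chosen threshold would yield a candidate target lexicon, and those below the negated threshold a candidate control lexicon. The threshold, and how this score would be combined with the other two statistics, are not specified here.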
               