9:55 - 10:15 am, Sunday, September 18
LK 101
Extracting adverse drug events from Twitter messages in real time using Naive Bayes classifier
Aristotle University of Thessaloniki

Description

Background 
Traditional reporting systems for adverse events (AE) have been slow to adapt to online AE reporting by patients. Meanwhile, the growth and popularity of social media have led many patients and drug users to share their experiences with drugs online. Twitter is a social media service with increasing adoption and, although it is limited to 140-character messages, it can be a valuable source of AE-related information. In this study, we describe the development and use of a real-time system that gathers AE information from users' tweets. We compare the collected AE frequencies with those reported in the clinical trials of the drug and examine whether the results are a reliable source of information.

Method
The system uses the Twitter Search API to retrieve tweets that contain a specific drug name, e.g. Xanax. We carefully choose up to 25 tweets to create a training set for the Naive Bayes (NB) classifier. The NB classifier is fast and requires only a small amount of training data to estimate the parameters necessary for classification; it is therefore suitable for a real-time application. The software can run on a mid-size server, making it fast and cost-effective.
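To illustrate how a classifier of this kind can be trained from a handful of labelled tweets, here is a minimal sketch. It assumes a multinomial NB with add-one (Laplace) smoothing and a simple word tokenizer; the abstract does not say which NB variant or features were used. The example tweets and the names `tokenize`, `train_nb`, and `ae_probability` are hypothetical and serve only as illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_nb(labeled_messages):
    """Estimate NB parameters from (text, is_ae) pairs.

    Assumes both classes (AE and non-AE) appear in the training set.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_ae in labeled_messages:
        doc_counts[is_ae] += 1
        word_counts[is_ae].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def ae_probability(model, text):
    """Return P(AE | text), a value between 0 and 1."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    log_scores = {}
    for label in (True, False):
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        # Log prior of the class.
        score = math.log(model["doc_counts"][label] / total_docs)
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                # Add-one smoothed log likelihood of the token.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        log_scores[label] = score
    # Normalise the two log scores into a probability.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    return weights[True] / (weights[True] + weights[False])


# Hypothetical hand-labelled tweets (invented for this sketch, not study data).
TRAINING_SET = [
    ("xanax made me so drowsy today", True),
    ("took xanax and now i have a headache", True),
    ("xanax gave me insomnia last night", True),
    ("listening to a song about xanax", False),
    ("picking up my xanax prescription", False),
    ("xanax jokes are all over my feed", False),
]

model = train_nb(TRAINING_SET)
p_ae = ae_probability(model, "xanax gave me a headache and made me drowsy")
p_other = ae_probability(model, "listening to a song about xanax")
```

With so few training messages, the estimated probabilities depend heavily on which tweets are chosen, which is consistent with the care the authors describe in selecting the training set.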

The system periodically collects new messages. We apply the NB classifier to each message and obtain a value between 0 and 1, which represents the probability that the message refers to an AE. Finally, messages with values above a threshold form the set of AE incidents.
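One polling cycle of this procedure could be sketched as follows. The function `collect_incidents` and its parameters are hypothetical: `fetch_new_messages` stands in for whatever call retrieves tweets from the Twitter Search API, and `score_message` for the trained NB classifier. Skipping message ids that were already seen is an assumption added here so that repeated polls do not count the same tweet twice; the default threshold of 0.6 is the value reported in the Results.

```python
def collect_incidents(fetch_new_messages, score_message, threshold=0.6, seen_ids=None):
    """Run one polling cycle and return the messages scored above the threshold."""
    seen_ids = set() if seen_ids is None else seen_ids
    incidents = []
    for message_id, text in fetch_new_messages():
        if message_id in seen_ids:
            continue  # already processed in this or an earlier cycle
        seen_ids.add(message_id)
        probability = score_message(text)
        if probability > threshold:
            incidents.append(
                {"id": message_id, "text": text, "probability": probability}
            )
    return incidents


# Stand-ins for the real retrieval and classification steps (illustration only).
def fake_fetch():
    return [
        (1, "xanax made me drowsy"),
        (2, "new song about xanax"),
        (1, "xanax made me drowsy"),  # duplicate returned by the search
    ]


def fake_score(text):
    return 0.9 if "drowsy" in text else 0.2


seen = set()
incidents = collect_incidents(fake_fetch, fake_score, threshold=0.6, seen_ids=seen)
```

Passing the same `seen_ids` set to every cycle lets a scheduler call `collect_incidents` repeatedly while the set of AE incidents grows only with new messages.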

Results
Using the method described above, we searched for a specific drug name (Xanax) and extracted around 320 AE incidents at a threshold of 0.6. Within these tweets, we counted the frequency of specific words (e.g. drowsiness, insomnia) taken from the AE reported in placebo-controlled trials of the drug. Comparing the two frequencies gives us a glimpse of how reliable the collected information is: although there is a recognisable pattern between the two frequency tables, there are differences in the percentages of the observed AE.
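The counting and comparison step might look like the sketch below. It assumes simple substring matching of AE terms and expresses each frequency as the percentage of incident messages that mention the term; the abstract does not state how the percentages were computed. The function names are hypothetical, and the messages and reference percentages are made-up placeholders, not figures from the study or from any trial.

```python
from collections import Counter


def ae_term_frequencies(incident_texts, ae_terms):
    """Return, per AE term, the percentage of incident messages mentioning it."""
    counts = Counter()
    for text in incident_texts:
        lowered = text.lower()
        for term in ae_terms:
            if term in lowered:
                counts[term] += 1
    total = len(incident_texts)
    if total == 0:
        return {term: 0.0 for term in ae_terms}
    return {term: 100.0 * counts[term] / total for term in ae_terms}


def compare_frequencies(observed, reference):
    """Return observed minus reference percentage for each shared term."""
    return {
        term: observed[term] - reference[term]
        for term in reference
        if term in observed
    }


# Placeholder incident messages and reference percentages (illustration only).
incident_texts = [
    "so much drowsiness on xanax",
    "insomnia again",
    "drowsiness and insomnia",
    "headache",
]
placeholder_trial_percentages = {"drowsiness": 40.0, "insomnia": 30.0}

observed = ae_term_frequencies(incident_texts, ["drowsiness", "insomnia"])
differences = compare_frequencies(observed, placeholder_trial_percentages)
```

Exact word matching misses informal phrasing such as "sleepy" for drowsiness, so a sketch like this likely undercounts compared with a lexicon of lay terms.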

Conclusion
This study proposes an innovative way to use well-known techniques, such as NB classification, to build a fast and cost-effective real-time system for collecting AE incidents from social media platforms such as Twitter. The system can automatically collect and report messages with AE-related information. A comparison between the frequencies of AE in the messages and those reported in clinical trials shows that, although there are differences in the percentages (and thus the data cannot be considered reliable), there is a pattern that indicates a possible correlation. Future work includes extending the research to more drugs and AE, as well as improving the NB classifier.
