Shereen Oraby
  • I'm a 3rd year PhD student at the Natural Language and Dialog Systems Lab at the University of California, Santa Cruz...
Informal first-person narratives are a unique resource for computational models of everyday events and people's affective reactions to them. People blogging about their day tend not to explicitly say "I am happy". Instead they describe situations from which other humans can readily infer their affective reactions. However, current sentiment dictionaries are missing much of the information needed to make similar inferences. We build on recent work that models affect in terms of lexical predicate functions and affect on the predicate's arguments. We present a method to learn proxies for these functions from first-person narratives. We construct a novel fine-grained test set, and show that the patterns we learn improve our ability to predict first-person affective reactions to everyday events, from a Stanford sentiment baseline of .67F to .75F.
Effective models of social dialog must understand a broad range of rhetorical and figurative devices. Rhetorical questions (RQs) are a type of figurative language whose aim is to achieve a pragmatic goal, such as structuring an argument, being persuasive, emphasizing a point, or being ironic. While there are computational models for other forms of figurative language, rhetorical questions have received little attention to date. We expand a small dataset from previous work, presenting a corpus of 10,270 RQs from debate forums and Twitter that represent different discourse functions. We show that we can clearly distinguish between RQs and sincere questions (0.76 F1). We then show that RQs can be used both sarcastically and non-sarcastically, observing that non-sarcastic (other) uses of RQs are frequently argumentative in forums, and persuasive in tweets. We present experiments to distinguish between these uses of RQs using SVM and LSTM models that represent linguistic features and post-level context, achieving results as high as 0.76 F1 for SARCASTIC and 0.77 F1 for OTHER in forums, and 0.83 F1 for both SARCASTIC and OTHER in tweets. We supplement our quantitative experiments with an in-depth characterization of the linguistic variation in RQs.
Given the increasing popularity of customer service dialogue on Twitter, analysis of conversation data is essential to understand trends in customer and agent behavior for the purpose of automating customer service interactions. In this work, we develop a novel taxonomy of fine-grained "dialogue acts" frequently observed in customer service, showcasing acts that are more suited to the domain than the more generic existing taxonomies. Using a sequential SVM-HMM model, we model conversation flow, predicting the dialogue act of a given turn in real time. We characterize differences between customer and agent behavior in Twitter customer service conversations, and investigate the effect of testing our system on different customer service industries. Finally, we use a data-driven approach to predict important conversation outcomes: customer satisfaction, customer frustration, and overall problem resolution. We show that the type and location of certain dialogue acts in a conversation have a significant effect on the probability of desirable and undesirable outcomes, and present actionable rules based on our findings. The patterns and rules we derive can be used as guidelines for outcome-driven automated customer service platforms.
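As a rough illustration of how the type and location of a dialogue act can be related to an outcome, the sketch below estimates the fraction of satisfied conversations among those in which each act occurs, optionally restricted to the first or last turn. It is not the paper's method: the act names, the toy conversations, and the function `outcome_rate_by_act` are all hypothetical, and a real analysis would need far more data and a significance test.

```python
from collections import defaultdict

def outcome_rate_by_act(conversations, position="any"):
    """Estimate P(satisfied | act occurs) for each dialogue act.

    conversations: list of (acts, satisfied) pairs, where acts is a
    non-empty list of act labels and satisfied is a bool.
    position: "any" (act anywhere), "first" (opening turn only),
    or "last" (closing turn only).
    """
    seen = defaultdict(int)
    satisfied_count = defaultdict(int)
    for acts, satisfied in conversations:
        if position == "first":
            present = {acts[0]}
        elif position == "last":
            present = {acts[-1]}
        else:
            present = set(acts)
        for act in present:
            seen[act] += 1
            satisfied_count[act] += int(satisfied)
    return {act: satisfied_count[act] / seen[act] for act in seen}

# Toy, made-up conversations (hypothetical act labels, not the paper's taxonomy).
toy_conversations = [
    (["complaint", "apology", "answer", "thanks"], True),
    (["question", "answer", "thanks"], True),
    (["complaint", "request_info", "complaint"], False),
    (["complaint", "apology", "complaint"], False),
]

rates_any = outcome_rate_by_act(toy_conversations)
rates_last = outcome_rate_by_act(toy_conversations, position="last")
print(rates_any["complaint"], rates_last["complaint"])
```

In this toy data, a complaint anywhere in the conversation is weakly associated with dissatisfaction, while a complaint in the closing turn is strongly associated with it, which is the kind of location effect the abstract describes.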
The use of irony and sarcasm in social media allows us to study them at scale for the first time. However, their diversity has made it difficult to construct a high-quality corpus of sarcasm in dialogue. Here, we describe the process of creating a large-scale, highly-diverse corpus of online debate forums dialogue, and our novel methods for operationalizing classes of sarcasm in the form of rhetorical questions and hyperbole. We show that we can use lexico-syntactic cues to reliably retrieve sarcastic utterances with high accuracy. To demonstrate the properties and quality of our corpus, we conduct supervised learning experiments with simple features, and show that we achieve both higher precision and F than previous work on sarcasm in debate forums dialogue. We apply a weakly-supervised linguistic pattern learner and qualitatively analyze the linguistic differences in each class.
We investigate the characteristics of factual and emotional argumentation styles observed in online debates. Using an annotated set of FACTUAL and FEELING debate forum posts, we extract patterns that are highly correlated with factual and emotional arguments, and then apply a bootstrapping methodology to find new patterns in a larger pool of unannotated forum posts. This process automatically produces a large set of patterns representing linguistic expressions that are highly correlated with factual and emotional language. Finally, we analyze the most discriminating patterns to better understand the defining characteristics of factual and emotional arguments.
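The bootstrapping idea can be sketched in a few lines, under strong simplifying assumptions: word bigrams stand in for the linguistic patterns, the frequency and probability thresholds (`min_freq`, `min_prob`) are hypothetical, and the toy posts are invented for illustration. The abstract does not specify the pattern learner or its settings, so this is only a minimal sketch of the general loop: score patterns on labeled posts, keep the reliable ones, use them to label matching unannotated posts, and repeat.

```python
from collections import Counter

def bigrams(tokens):
    """Word bigrams, used here as a stand-in for richer linguistic patterns."""
    return set(zip(tokens, tokens[1:]))

def pattern_scores(posts, labels, target):
    """Return {pattern: (frequency, P(target | pattern))} over labeled posts."""
    total, hits = Counter(), Counter()
    for tokens, label in zip(posts, labels):
        for pat in bigrams(tokens):
            total[pat] += 1
            if label == target:
                hits[pat] += 1
    return {p: (total[p], hits[p] / total[p]) for p in total}

def bootstrap(labeled, unlabeled, target, min_freq=2, min_prob=0.8, rounds=3):
    """Grow a pattern set for `target` by iteratively labeling matching posts."""
    posts = [tokens for tokens, _ in labeled]
    labels = [label for _, label in labeled]
    remaining = list(unlabeled)
    learned = set()
    for _ in range(rounds):
        scores = pattern_scores(posts, labels, target)
        good = {p for p, (freq, prob) in scores.items()
                if freq >= min_freq and prob >= min_prob}
        if not good - learned:
            break  # no new patterns were found in this round
        learned |= good
        still_unlabeled = []
        for tokens in remaining:
            if bigrams(tokens) & learned:
                posts.append(tokens)   # treat the matched post as `target`
                labels.append(target)
            else:
                still_unlabeled.append(tokens)
        remaining = still_unlabeled
    return learned

# Toy, made-up posts for illustration only.
labeled_posts = [
    ("the study shows that".split(), "FACT"),
    ("the study shows results".split(), "FACT"),
    ("i feel that".split(), "FEEL"),
    ("i feel sad".split(), "FEEL"),
]
unlabeled_posts = [
    "the study found evidence".split(),
    "the study found proof".split(),
    "they found evidence again".split(),
]

fact_patterns = bootstrap(labeled_posts, unlabeled_posts, "FACT")
print(sorted(fact_patterns))
```

Here the bigram ("study", "found") never appears in the annotated posts; it is learned only in the second round, after the first-round patterns pull in two unannotated posts.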
With increasing interest in sentiment analysis research and opinionated web content always on the rise, focus on analysis of text in various domains and different languages is a relevant and important task. This paper explores the problems of sentiment analysis and opinion strength measurement using a rule-based approach tailored to the Arabic language. The approach takes into account language-specific traits that are valuable for syntactically segmenting a text, and that allow for closer analysis of opinion-bearing language cues. By using an adapted sentiment lexicon along with sets of opinion indicators, a rule-based methodology for opinion-phrase extraction is introduced, followed by a method to rate the parsed opinions and offer a measure of opinion strength for the text under analysis. The proposed method, even with a small set of rules, shows potential for a simple and scalable opinion-rating system, which is of particular interest for morphologically-rich languages such as Arabic.
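A lexicon-plus-rules scorer of this general kind might look like the sketch below. Everything in it is an assumption made for illustration: the lexicon entries, the negator and intensifier lists, the weights, and the averaging step are hypothetical, and English placeholder tokens are used instead of Arabic. The paper's actual rules, and Arabic word order (where modifiers may follow the word they modify), would require different tables and matching logic.

```python
# Hypothetical resources; a real system would load an adapted sentiment lexicon.
LEXICON = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never"}
INTENSIFIERS = {"very": 1.5, "extremely": 2.0}

def opinion_phrases(tokens):
    """Return (phrase, score) pairs, one per lexicon word.

    Negators and intensifiers immediately preceding a lexicon word are
    folded into its phrase: negators flip the sign, intensifiers scale it.
    """
    phrases = []
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        score = LEXICON[tok]
        start = i
        j = i - 1
        while j >= 0 and (tokens[j] in NEGATORS or tokens[j] in INTENSIFIERS):
            if tokens[j] in NEGATORS:
                score = -score
            else:
                score *= INTENSIFIERS[tokens[j]]
            start = j
            j -= 1
        phrases.append((" ".join(tokens[start:i + 1]), score))
    return phrases

def opinion_strength(tokens):
    """Average phrase score; 0.0 when no opinion phrase is found."""
    scores = [score for _, score in opinion_phrases(tokens)]
    return sum(scores) / len(scores) if scores else 0.0

example = "the service was not very good but the food was excellent".split()
example_phrases = opinion_phrases(example)
example_strength = opinion_strength(example)
print(example_phrases, example_strength)
```

Averaging is only one possible way to rate the extracted phrases; a sum or a length-normalized score would be equally easy to substitute.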
The inherent morphological complexity of languages such as Arabic entails the exploration of language traits that could be valuable to the task of detecting and classifying sentiment within text. This paper investigates the relevance of using the roots of words as input features into a sentiment analysis system under two distinct domains, in order to tailor the task more suitably to morphologically-rich languages such as Arabic. Different word-rooting solutions are employed in conjunction with a basic sentiment classifier to demonstrate the potential of mapping Arabic words to their basic roots as a language-specific adaptation of the sentiment analysis task, showing a noteworthy improvement over baseline performance.