- Speech and language scientist
The concept of vocal charisma has changed in the past decades from something that people have to something that people do, thereby stimulating research on how vocal charisma can be created and improved. Broadening the perspective on vocal charisma beyond the speaker’s performance itself to the context of the speech, we conducted acoustic-prosodic analyses of public speeches of two prominent Israeli politicians – Benjamin Netanyahu and Benny Gantz. The speech material consisted of 311–516 prosodic phrases per politician from the 2019–2020 election campaigns and, crucially, was balanced so as to include an equal number of pre- and post-election speeches. Results show a superiority of Netanyahu over Gantz in almost all facets of vocal charisma, although Gantz caught up over time. Moreover, unlike Gantz, Netanyahu showed a strong adaptation of his vocal charisma patterns to pre- and post-election contexts. To scrutinize this difference in versatility, we carried out an additional perception experiment with 42 listeners and excerpts from the two politicians’ speeches. Results show that Netanyahu’s speech excerpts were, unlike those of Gantz, mainly rated as more charismatic in those contexts in which they were performed. Gantz’ post-election speech excerpts, by contrast, were primarily rated as not fitting into that context, i.e., as unfolding their charisma better in a pre-election context. Moreover, listeners also rated Netanyahu as overall more charismatic than Gantz. The combined production and perception evidence suggests the relevance of context in the evaluation and interpretation of vocal charisma signals.
This study investigates the realization of the two most common word-level stress patterns in Hebrew, final and penultimate, in utterance-final position. Twenty-six disyllabic words that form minimal pairs, which differ only in their stress pattern, were embedded in 52 sentences. The mean values of three acoustic parameters (duration, F0, and intensity) were measured for the vowels of the target words. Findings show that duration is significantly longer in stressed vowels, similar to previous findings on words in utterance-medial position. Lower intensity is assigned to the utterance-final vowels regardless of the stress pattern, but the degree of lowering does depend on the stress pattern. Finally, lower F0 values are found in the utterance-final vowels, but the degree of lowering is similar for both stress patterns. We conclude that duration is the main cue at the prosodic word level, while F0 is used by Hebrew speakers to cue higher prosodic units.
A recent production study found that duration is the main cue in realization of lexical stress in Hebrew. In order to examine this perceptually, two minimal pairs of bisyllabic words differing only in their stress patterns were uttered within a carrier sentence. Target words were then extracted and their vowel durations were manipulated in eight steps, either downward or upward, to form smooth transitions from original penultimate duration patterns to ultimate, and vice versa. These stimuli were then presented to 15 listeners. Results show that for most listeners, the changes in duration were sufficient to cause a categorical change in perception of the stress pattern.
Understanding speech depends on its segmentation into units, and prosody – the tone, intonation and rhythm of speech – is a crucial tool used by speakers for this purpose. This book provides a comprehensive description of the prosodic boundary patterns of spoken Israeli Hebrew (IH), by examining how Hebrew-speakers express sequences of utterances using a defined set of prosodic boundary tones. The study specifically focuses on the relationship between the syntagmatic and prosodic layers of spoken IH, thereby clarifying our understanding of the relationship between prosodic form and its linguistic function. This interface is modeled as the "speeCHain perspective", which demonstrates the chaining of speech units and the subsequent systematic linkage of syntactic units. The research was carried out on authentic IH everyday conversations, thus providing a unique contribution to present-day research. The analysis sheds light on the overall study of prosodic patterning in speech...
Prepositions in Israeli Hebrew, as in other languages, are clitics. They are often regarded as proclitics, cliticizing to their object, probably because they both introduce this object (conceptually) and select for it (morpho-syntactically). In this paper, we make the opposite claim, namely that these prosodically-dependent items are in fact enclitic: they cliticize to the preceding word, mainly the sentence predicate. This claim is first supported by the analysis of evidence from natural speech prosodic segmentation. Then, it is shown that within the morpho-syntactic theory of Distributed Morphology, this segmentation is in fact predicted. The morpho-syntactic analysis is confirmed by its capacity to account for the distribution of non-radical [h] in the verbal system of Israeli Hebrew, hitherto considered an arbitrary trait. Finally, the analysis is shown to correctly predict the encliticization of definite articles preceding prepositions.
Currently, via the mediation of audio mining technology and conversational user interfaces, and after years of constant improvements in Automatic Speech Recognition technology, conversation intelligence is an emerging concept, significant to the understanding of human-human communication in its most natural and primitive channel – our voice. This paper introduces the concept of Conversation Intelligence (CI), which is becoming crucial to the study of human-human speech interaction and communication management and is part of the field of speech analytics. CI is demonstrated on two established discourse terms – power relations and convergence. Finally, this paper highlights the importance of visualization for large-scale speech analytics.
The current rapid technological changes confront researchers of learning technologies with the challenge of evaluating them, predicting trends, and improving their adoption and diffusion. This study utilizes a data-driven discourse analysis approach, namely culturomics, to investigate changes over time in the research of learning technologies. The patterns and changes were examined on a corpus of articles published over the past decade (2006-2014) in the proceedings of Chais Conference for the Study of Innovation and Learning Technologies – the leading research conference on learning technologies in Israel. The interesting findings of the exhaustive process of analyzing all the words in the corpus were that the most commonly used terms (e.g., pupil, teacher, student) and the most commonly used phrases (e.g., face-to-face) in the field of learning technologies reflect a pedagogical rather than a technological aspect of learning technologies. The study also demonstrates two cases of c...
This study examines the diversity of silences in unbalanced dialogues, i.e. dialogues between speakers with different participation levels: responder and reporter. We examined two genres: therapeutic sessions and private dialogues that are based on this responder-reporter structure. When looking at silence-to-speech ratios, we found no differences between the genres or between the roles. However, when grouping the silences by type, namely pauses (intra-speaker silences), gaps (inter-speaker silences), and silences that occur in the vicinity of speech overlaps, we found that the duration of pauses is role dependent in both genres, while the duration of gaps is genre dependent but not role dependent. Moreover, speech rate was not found to be genre dependent. It seems that although silences in unbalanced dialogues vary considerably, genre and speaker’s role are influential.
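The distinction between pauses and gaps described in this abstract can be illustrated with a minimal sketch. It assumes speech is available as time-stamped segments with a speaker label; the `Segment` class and `classify_silences` function are hypothetical names introduced here, not the authors' tooling, and the third silence type (silences near speech overlaps) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A stretch of speech by one speaker (hypothetical structure, times in seconds)."""
    speaker: str
    start: float
    end: float


def classify_silences(segments):
    """Label each silence as a 'pause' (same speaker before and after)
    or a 'gap' (speaker change), returning (kind, duration) tuples."""
    ordered = sorted(segments, key=lambda s: s.start)
    if not ordered:
        return []
    silences = []
    last = ordered[0]  # segment with the latest end time seen so far
    for seg in ordered[1:]:
        if seg.start > last.end:
            kind = "pause" if seg.speaker == last.speaker else "gap"
            silences.append((kind, round(seg.start - last.end, 3)))
        if seg.end >= last.end:
            last = seg
    return silences


dialogue = [
    Segment("reporter", 0.0, 1.0),
    Segment("reporter", 1.5, 2.0),
    Segment("responder", 2.2, 3.0),
]
print(classify_silences(dialogue))
```

Grouping the resulting durations by speaker role and by genre would then allow the kind of comparison the abstract reports.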
This paper proposes a new framework for prosodic pattern analysis, based on the study of excessive prolongation in spontaneous Israeli Hebrew. In order to reveal whether it is a random phenomenon or a predictable prosodic pattern, a multi-layer linguistic analysis was conducted. First, the phenomenon was taken out of its canonical research framework as a type of speech disfluency. Second, its acoustic characteristics were defined, and the phonological environments of these prolongations were accounted for. Finally, prolongations and their interface with the syntagmatic layer were analyzed. The proposed framework can serve as a format for other prosodic patterns as well.
An effective approach to the study of prosody in spoken language seeks to identify prosodic patterns and their communicative values, and to subsequently find a correlation between these prosodic patterns and other layers of linguistic structure. The present research strives to define a single prosodic boundary pattern: the boundary tone of hesitation disfluencies in spontaneous Israeli Hebrew. This entails uncovering the phonological environments in which they occur. Results show two distinct domains for such disfluencies with regard to word-level phonology: word-final syllables and appended e vowels that are inserted after a word, but within the same intonation unit. Statistically significant relations were found between these domains and the phonological structures of the disfluent syllables. Index Terms: prosody, spontaneous Hebrew, disfluencies
The human tendency to adapt to interlocutors to become more similar, known as entrainment, has been studied for many languages. To our knowledge, however, there have only been two studies relating to the phenomenon in any Semitic language, specifically Hebrew, which had limited scope. We greatly expand on this by conducting an analysis of acoustic-prosodic entrainment in a corpus of task-oriented Hebrew dialogues. We use previously established methodology to facilitate comparison with prior results for other languages. We find that acoustic-prosodic entrainment at turn exchanges is present in Hebrew interactions to a similar degree as for Indo-European languages. The most notable difference with those languages is a greater tendency for divergent behavior in Hebrew, particularly among mixed gender speaker pairs. Compared to American English, we also note a lack of global similarity between speakers’ mean feature values. We do not attribute these distinctions to specific linguistic differences but discuss possible sources of variation based on language and other factors. Our data reveals no clear pattern of differences between gender pairs or between speakers responding to male or female interlocutors, respectively, at turn exchanges. There is also no difference at all between responding speakers based on their gender. However, we do find that speakers who depend on information tend to match their interlocutors more closely at turn exchanges than those who possess it.
This article focuses on one decade, 1874–1883, in the relatively long lifespan of the Hebrew weekly Ha-Tzefirah, which was founded in Warsaw in 1862. Applying computational tools to the study of the early Hebrew press requires a unique effort. The Hebrew language in general is distinct in its characters, morphological structure, and word order. The contribution of this proof-of-concept study is two-fold: First, computational analysis provides a long-term indication of trends in the discourse that cannot be attained through qualitative study. The second contribution is on the micro level: Computational analysis can potentially shed light, in a diachronic perspective, on the use of a specific term or the discussion of a specific geographical location.
This study examines the potential of big data discourse analysis (i.e., culturomics) to produce valuable knowledge, and suggests a mixed methods model for improving the effectiveness of culturomics. We argue that the importance and magnitude of using qualitative methods as complementing quantitative ones, depends on the scope of the analyzed data (i.e., the volume of data and the period it spans over). We demonstrate the merit of a mixed methods approach for culturomics analyses in the context of identifying research trends, by analyzing changes over a period of 15 years (2000-2014) in the terms used in the research literature related to learning technologies. The dataset was based on Google Scholar search query results. Three perspectives of analysis are presented: (1) Curves describing five main types of relative frequency trends (i.e., rising; stable; fall; rise and fall; rise and stable); (2) The top key-terms identified for each year; (3) A comparison of data from three dataset...
This study investigates the acoustic realization of primary and secondary stress in polysyllabic words in Modern Hebrew (MH). The study focuses on the production of target words embedded in a meaningful carrier sentence, with three primary stress types: ultimate, penultimate and antepenultimate stress. We measured the duration, intensity and F0 of each vowel. Results show that duration is the sole reliable cue for stress in MH, and that there is no phonetic realization of secondary stress in MH, and therefore no true surface alternating pattern. These findings may have phonological implications regarding the prosodic organization of language, and provide a solid basis for future studies on the perception of primary and secondary stress by speakers.
It has been well-documented for several languages that human interlocutors tend to adapt their linguistic productions to become more similar to each other. This behavior, known as entrainment, affects lexical choice as well, both with regard to specific words, such as referring expressions, and overall style. We offer what we believe to be the first investigation of such lexical entrainment in Hebrew. Using two existing measures, we analyze Hebrew speakers interacting in a Map Task, a popular experimental setup, and find rich evidence of lexical entrainment. Analyzing speaker pairs by the combination of their genders as well as speakers by their individual gender, we find no clear pattern of differences. We do, however, find that speakers in a position of less power entrain more than those with greater power, which matches theoretical accounts. Overall, our results mostly accord with those for American English, with a lack of entrainment on hedge words being the main difference.
The aim of the present study is to assess the importance of prosody in the perceptual delimitation of “units” for the spoken language, by resorting to experiments involving non-speakers of the language, filtered speech, and automatic segmentation by the software ANALOR. The three experiments test in various ways the lack of access to semantic and syntactic information, as opposed to the expert’s segmentation. Results show that quantitatively stronger prosodic cues are needed for informants without access to the syntax-semantics of the sample, especially when they are non-speakers. The analysis also suggests the existence of both universal and language-specific prosodic cues. One of the fundamental questions underlying the analysis of spoken languages is their decomposition into units that can be considered basic in terms of informational processing and communication. Even once the importance of prosody has been recognized, the basis of the decomposition remains problematic, because ...
In this paper we try to improve the human-machine interaction of a voice-activated system by adding prosodic characteristics to the system. We focus on verbal hesitation, which is manifested by speech disfluencies. In human-human communication, recent research shows that moderate disfluencies make speakers more credible. In addition, people tend to react more leniently to an erroneous answer if the answer was given by the conversant in a hesitating manner, implying that the responding person is unsure of the correct answer. In this study we investigate the hypothesis that users will react in a similar way to voice-activated systems. Specifically, we hypothesized that adding prosodic features to the system’s speech responses will increase the user’s perception of the system’s credibility and his/her overall satisfaction, and reduce frustration while using the system.
In this research we diagnose two commercial automatic speech recognizers (ASRs) on a corpus of academic lectures in Hebrew. Our goal is not only to measure the engines' performance but also to find out whether current Hebrew ASRs' transcription can be a reasonable replacement for human transcription, or at least a significant bootstrap for manual post-processing of the automatic output. We performed a word error rate (WER) diagnosis and a linguistic error classification on two automatic transcriptions – Nuance's and Google's – and compared them to a real-time (RT) stenographer's records, as well as to an exact transcription that reflects exactly the speaker's speech. Results show that the ASRs' WER is caused by massive substitutions, while the RT transcription's errors were mainly due to deletions. This research provides an opportunity to explore cost/benefit aspects of automatic vs. manual audio transcriptions.
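A WER diagnosis of the kind mentioned in this abstract separates errors into substitutions, deletions, and insertions. The sketch below shows one common way to obtain that breakdown, using a word-level edit-distance alignment. It is an illustration under assumptions (whitespace tokenization, no text normalization), not the procedure used in the study, and `wer_breakdown` is a hypothetical name.

```python
def wer_breakdown(reference, hypothesis):
    """Word error rate with counts of substitutions, deletions, and insertions.

    Uses standard Levenshtein alignment over whitespace-separated words.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # dp[i][j] = (total edits, substitutions, deletions, insertions)
    # for aligning the first i reference words with the first j hypothesis words.
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # only insertions

    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]  # match, no extra cost
                continue
            t, s, d, n = dp[i - 1][j - 1]
            substitution = (t + 1, s + 1, d, n)
            t, s, d, n = dp[i - 1][j]
            deletion = (t + 1, s, d + 1, n)
            t, s, d, n = dp[i][j - 1]
            insertion = (t + 1, s, d, n + 1)
            dp[i][j] = min(substitution, deletion, insertion, key=lambda c: c[0])

    total, subs, dels, ins = dp[-1][-1]
    return {
        "wer": total / len(ref),
        "substitutions": subs,
        "deletions": dels,
        "insertions": ins,
    }


print(wer_breakdown("the cat sat on the mat", "the cat sit on mat"))
```

When several alignments have the same total cost, the split between error types can depend on tie-breaking, so the per-type counts should be read as one valid alignment rather than the only one.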
The main objective of the current study is to learn how one's voice changes according to non-linguistic conditions. This report on work in progress aims to study the common and different qualities in non-linguistic variables using state-of-the-art Voice Conversion (VC) methods. As far as we know, this is the first report of VC tests on Hebrew speech. We first report on some baseline results of gender conversions, which were evaluated perceptually by thirty-one native Hebrew speakers. None of the original audio excerpts was evaluated correctly by 100% of the participants. In the second listening test, subjects were asked to indicate how close the voice conversion of a mimic voice (Source voice) is to an original famous figure's voice (Target voice). Results suggest that the VC process managed to shift the subjects' votes' weight from "absolutely Source voice" to relatively higher uncertain votes and higher percentage of "absolutely Target voice", com...
In this paper we present a proof-of-concept (POC) study which aims to model a conceptual framework to analyze structures of dialogues. We demonstrate our approach on a specific research question: how is the speaker’s role realized along the dialogue? To this end, we use a unified set of Map Task dialogues that are unique in the sense that each speaker participated twice – once as a follower and once as a leader, with the same interlocutor playing the other role. This pairwise setting enables a comparison of prosodic differences in three facets: Role, Speaker, and Session. For this POC, we analyze a basic set of prosodic features: talk proportions, pitch, and intensity. To create a comparable methodological framework for dialogues, we created three plots of the three prosodic features, in ten equal-sized intervals along the session. We used a simple distance measure between the resulting ten-dimensional vectors of each facet for each feature. The prosodic plots of these dialogues reveal the interact...
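The ten-interval representation described in this abstract can be sketched as follows. The abstract does not name the distance measure, so Euclidean distance is used here purely as an assumption; `interval_means` and `euclidean_distance` are hypothetical helper names, and the feature values are illustrative.

```python
import math


def interval_means(samples, session_end, n_intervals=10):
    """Average a feature over equal-sized time intervals of a session.

    samples: iterable of (time_in_seconds, value) pairs.
    Returns a list of n_intervals means (0.0 where an interval has no samples).
    """
    sums = [0.0] * n_intervals
    counts = [0] * n_intervals
    width = session_end / n_intervals
    for time, value in samples:
        # Clamp so a sample exactly at session_end falls in the last interval.
        index = min(int(time / width), n_intervals - 1)
        sums[index] += value
        counts[index] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Illustrative pitch samples for one speaker in two roles over a 10-second session.
as_leader = interval_means([(i + 0.5, 200.0 + i) for i in range(10)], 10.0)
as_follower = interval_means([(i + 0.5, 190.0 + i) for i in range(10)], 10.0)
print(euclidean_distance(as_leader, as_follower))
```

Comparing such vectors across Role, Speaker, and Session for each feature gives one number per facet, which is what makes dialogues of different lengths comparable.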
In this paper we examine a certain aspect of the prosody-syntax interface, that of hesitation disfluencies (HDs) that occur within phrases or within morphemes. Such cases were found in two spontaneous corpora of two syntactically distinct languages – Israeli Hebrew (IH) and Japanese. It was found that intra-phrasal hesitations in the two languages call for different explanations, since in Japanese the noun (e.g., in NP) precedes the case marking particle while in IH the preposition (e.g., in PP) precedes the noun. In this paper we will present qualitative findings and suggest a unified view of the phenomenon of intra-phrasal HDs.
Despite the growing importance of Automatic Speech Recognition (ASR), its application is still challenging, limited, language-dependent, and requires considerable resources. The resources required for ASR are not only technical, they also need to reflect technological trends and cultural diversity. The purpose of this research is to explore ASR performance gaps by a comparative study of American English, German, and Hebrew. Apart from different languages, we also investigate different speaking styles – utterances from spontaneous dialogues and utterances from frontal lectures (TED-like genre). The analysis includes a comparison of the performance of four ASR engines (Google Cloud, Google Search, IBM Watson, and WIT.ai) using four commonly used metrics: Word Error Rate (WER); Character Error Rate (CER); Word Information Lost (WIL); and Match Error Rate (MER). As expected, findings suggest that English ASR systems provide the best results. Contrary to our hypothesis regarding ASR’s lo...
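For readers unfamiliar with the metrics listed in this abstract, the sketch below computes WER, MER, and WIL from alignment counts using their commonly cited definitions. The function name `asr_metrics` and the example counts are illustrative assumptions, not figures from the study. CER follows the same formula as WER with characters in place of words.

```python
def asr_metrics(hits, substitutions, deletions, insertions):
    """Compute WER, MER, and WIL from word-alignment counts.

    hits: correctly recognized words; the other arguments are error counts
    from aligning the ASR hypothesis against the reference transcript.
    """
    ref_len = hits + substitutions + deletions   # words in the reference
    hyp_len = hits + substitutions + insertions  # words in the hypothesis
    if ref_len == 0 or hyp_len == 0:
        raise ValueError("reference and hypothesis must both be non-empty")

    errors = substitutions + deletions + insertions
    return {
        # Word Error Rate: errors relative to reference length (can exceed 1).
        "wer": errors / ref_len,
        # Match Error Rate: share of alignment positions that are errors.
        "mer": errors / (hits + errors),
        # Word Information Lost: 1 minus the product of the hit rates
        # relative to reference and hypothesis lengths.
        "wil": 1 - (hits / ref_len) * (hits / hyp_len),
    }


print(asr_metrics(hits=4, substitutions=1, deletions=1, insertions=0))
```

Unlike WER, both MER and WIL are bounded between 0 and 1, which can make them easier to compare across engines that insert many spurious words.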
The notion of intonation units is very basic to the study of discourse. Nevertheless, a clear-cut definition of what comprises an intonation unit has not been forthcoming. In reality, it seems that the boundaries delineating intonation units are somewhat easier to define, though this is by no means a closed subject. In this preliminary study of spoken Israeli Hebrew, we took four common criteria for intonation unit boundaries (fast initial speech, slow terminating speech, pitch reset, pauses) and analyzed their occurrences in a segment taken from a spontaneous speech corpus, containing approximately 54 such units. This segment was parsed perceptually by four researchers, and the resultant boundaries were analyzed acoustically to determine which were present at each boundary. A number of interesting conclusions result: only a quarter of the boundaries conformed to all cues, while two boundaries that were agreed upon by all the listeners conformed to none. Final lengthening was most p...
The purpose of the present study was to examine the influence of lexical stress on formant values (mainly F1 and F2) in spontaneous Hebrew speech. Speech samples taken from a Hebrew version of the well-known Map Task dialogues were analysed, comparing stressed/unstressed vowels in word-final/non-final positions. The results showed that lexical stress has a different effect on the different vowels. Of the five vowels in the Hebrew vowel system, the vowels /a/ and /e/ were most clearly affected in a consistent manner across men and women. Similar behaviour was observed for both vowels: in word-final position the vowels were centralized similarly, regardless of stress. In non-final position, the unstressed vowels were significantly more centralized than their stressed counterparts.