Showing 1–15 of 15 results for author: Shang, G

Searching in archive cs.
  1. arXiv:2406.18957  [pdf, other]

    cs.DC cs.GT

    A Treatment of EIP-1559: Enhancing Transaction Fee Mechanism through Nth-Price Auction

    Authors: Kun Li, Guangpeng Qi, Guangyong Shang, Wanli Deng, Minghui Xu, Xiuzhen Cheng

    Abstract: With the widespread adoption of blockchain technology, the transaction fee mechanism (TFM) in blockchain systems has become a prominent research topic. An ideal TFM should satisfy user incentive compatibility (UIC), miner incentive compatibility (MIC), and miner-user side contract proofness ($c$-SCP). However, state-of-the-art works either fail to meet these three properties simultaneously or only…

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2405.19099  [pdf, other]

    cs.CR

    DataSafe: Copyright Protection with PUF Watermarking and Blockchain Tracking

    Authors: Xiaolong Xue, Guangyong Shang, Zhen Ma, Minghui Xu, Hechuan Guo, Kun Li, Xiuzhen Cheng

    Abstract: Digital watermarking methods are commonly used to safeguard digital media copyrights by confirming ownership and deterring unauthorized use. However, without reliable third-party oversight, these methods risk security vulnerabilities during watermark extraction. Furthermore, digital media lacks tangible ownership attributes, posing challenges for secure copyright transfer and tracing. This study i…

    Submitted 29 May, 2024; originally announced May 2024.

  3. arXiv:2405.11055  [pdf, other]

    cs.CL cs.AI

    Leveraging Discourse Structure for Extractive Meeting Summarization

    Authors: Virgile Rennard, Guokan Shang, Michalis Vazirgiannis, Julie Hunter

    Abstract: We introduce an extractive summarization system for meetings that leverages discourse structure to better identify salient information from complex multi-party discussions. Using discourse graphs to represent semantic relations between the contents of utterances in a meeting, we train a GNN-based node classification model to select the most important utterances, which are then combined to create a…

    Submitted 21 May, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

  4. arXiv:2312.04843  [pdf, other]

    cs.CL cs.AI

    FREDSum: A Dialogue Summarization Corpus for French Political Debates

    Authors: Virgile Rennard, Guokan Shang, Damien Grari, Julie Hunter, Michalis Vazirgiannis

    Abstract: Recent advances in deep learning, and especially the invention of encoder-decoder architectures, have significantly improved the performance of abstractive summarization systems. The majority of research has focused on written documents, however, neglecting the problem of multi-party dialogue summarization. In this paper, we present a dataset of French political debates for the purpose of enhancing…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Accepted at EMNLP 2023 Findings

  5. arXiv:2311.16840  [pdf, ps, other]

    cs.CL cs.AI

    The Claire French Dialogue Dataset

    Authors: Julie Hunter, Jérôme Louradour, Virgile Rennard, Ismaïl Harrando, Guokan Shang, Jean-Pierre Lorré

    Abstract: We present the Claire French Dialogue Dataset (CFDD), a resource created by members of LINAGORA Labs in the context of the OpenLLM France initiative. CFDD is a corpus containing roughly 160 million words from transcripts and stage plays in French that we have assembled and publicly released in an effort to further the development of multilingual, open source language models. This paper describes t…

    Submitted 28 November, 2023; originally announced November 2023.

  6. arXiv:2311.11967  [pdf, other]

    cs.CL

    Automatic Analysis of Substantiation in Scientific Peer Reviews

    Authors: Yanzhu Guo, Guokan Shang, Virgile Rennard, Michalis Vazirgiannis, Chloé Clavel

    Abstract: With the increasing amount of problematic peer reviews in top AI conferences, the community is urgently in need of automatic quality control measures. In this paper, we restrict our attention to substantiation -- one popular quality aspect indicating whether the claims in a review are sufficiently supported by evidence -- and provide a solution automatizing this evaluation process. To achieve this…

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: Accepted to EMNLP 2023 Findings

  7. arXiv:2311.09807  [pdf, other]

    cs.CL

    The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text

    Authors: Yanzhu Guo, Guokan Shang, Michalis Vazirgiannis, Chloé Clavel

    Abstract: This study investigates the consequences of training language models on synthetic data generated by their predecessors, an increasingly prevalent practice given the prominence of powerful generative models. Diverging from the usual emphasis on performance metrics, we focus on the impact of this training methodology on linguistic diversity, especially when conducted recursively over time. To assess…

    Submitted 16 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to NAACL 2024 Findings

  8. arXiv:2210.06576  [pdf, other]

    cs.CL

    DATScore: Evaluating Translation with Data Augmented Translations

    Authors: Moussa Kamal Eddine, Guokan Shang, Michalis Vazirgiannis

    Abstract: The rapid development of large pretrained language models has revolutionized not only the field of Natural Language Generation (NLG) but also its evaluation. Inspired by the recent work of BARTScore: a metric leveraging the BART language model to evaluate the quality of generated text from various aspects, we introduce DATScore. DATScore uses data augmentation techniques to improve the evaluation…

    Submitted 12 October, 2022; originally announced October 2022.

  9. arXiv:2208.04163  [pdf, other]

    cs.CL

    Abstractive Meeting Summarization: A Survey

    Authors: Virgile Rennard, Guokan Shang, Julie Hunter, Michalis Vazirgiannis

    Abstract: A system that could reliably identify and sum up the most important points of a conversation would be valuable in a wide variety of real-world contexts, from business meetings to medical consultations to customer service calls. Recent advances in deep learning, and especially the invention of encoder-decoder architectures, have significantly improved language generation systems, opening the door to…

    Submitted 25 April, 2023; v1 submitted 8 August, 2022; originally announced August 2022.

    Comments: pre-MIT Press publication version for TACL journal

  10. arXiv:2205.09978  [pdf, other]

    cs.HC cs.LG

    HeadText: Exploring Hands-free Text Entry using Head Gestures by Motion Sensing on a Smart Earpiece

    Authors: Songlin Xu, Guanjie Wang, Ziyuan Fang, Guangwei Zhang, Guangzhu Shang, Rongde Lu, Liqun He

    Abstract: We present HeadText, a hands-free technique on a smart earpiece for text entry by motion sensing. Users input text utilizing only 7 head gestures for key selection, word selection, word commitment and word cancelling tasks. Head gesture recognition is supported by motion sensing on a smart earpiece to capture head moving signals and machine learning algorithms (K-Nearest-Neighbor (KNN) with a Dyna…

    Submitted 22 May, 2022; v1 submitted 20 May, 2022; originally announced May 2022.

    Comments: 23 pages

  11. arXiv:2112.00566  [pdf, ps, other]

    cs.CL

    NLP Research and Resources at DaSciM, Ecole Polytechnique

    Authors: Hadi Abdine, Yanzhu Guo, Moussa Kamal Eddine, Giannis Nikolentzos, Stamatis Outsios, Guokan Shang, Christos Xypolopoulos, Michalis Vazirgiannis

    Abstract: DaSciM (Data Science and Mining), part of LIX at Ecole Polytechnique, was established in 2013 and has since produced research results in the area of large scale data analysis via methods of machine and deep learning. The group has been specifically active in the area of NLP and text mining with interesting results at methodological and resources level. Here follow our different contributions of inter…

    Submitted 1 December, 2021; originally announced December 2021.

  12. arXiv:2110.08559  [pdf, other]

    cs.CL

    FrugalScore: Learning Cheaper, Lighter and Faster Evaluation Metrics for Automatic Text Generation

    Authors: Moussa Kamal Eddine, Guokan Shang, Antoine J.-P. Tixier, Michalis Vazirgiannis

    Abstract: Fast and reliable evaluation metrics are key to R&D progress. While traditional natural language generation metrics are fast, they are not very reliable. Conversely, new metrics based on large pretrained language models are much more reliable, but require significant computational resources. In this paper, we propose FrugalScore, an approach to learn a fixed, low cost version of any expensive NLG…

    Submitted 16 October, 2021; originally announced October 2021.

  13. arXiv:2004.02913  [pdf, other]

    cs.CL cs.LG

    Speaker-change Aware CRF for Dialogue Act Classification

    Authors: Guokan Shang, Antoine Jean-Pierre Tixier, Michalis Vazirgiannis, Jean-Pierre Lorré

    Abstract: Recent work in Dialogue Act (DA) classification approaches the task as a sequence labeling problem, using neural network models coupled with a Conditional Random Field (CRF) as the last layer. CRF models the conditional probability of the target DA label sequence given the input utterance sequence. However, the task involves another important input sequence, that of speakers, which is ignored by p…

    Submitted 24 June, 2023; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: typo fix: argmin -> argmax

  14. arXiv:1904.09491  [pdf, other]

    cs.CL cs.LG

    Energy-based Self-attentive Learning of Abstractive Communities for Spoken Language Understanding

    Authors: Guokan Shang, Antoine Jean-Pierre Tixier, Michalis Vazirgiannis, Jean-Pierre Lorré

    Abstract: Abstractive community detection is an important spoken language understanding task, whose goal is to group utterances in a conversation according to whether they can be jointly summarized by a common abstractive sentence. This paper provides a novel approach to this task. We first introduce a neural contextual utterance encoder featuring three types of self-attention mechanisms. We then train it u…

    Submitted 7 November, 2019; v1 submitted 20 April, 2019; originally announced April 2019.

    Comments: Update baselines

  15. arXiv:1805.05271  [pdf, other]

    cs.CL

    Unsupervised Abstractive Meeting Summarization with Multi-Sentence Compression and Budgeted Submodular Maximization

    Authors: Guokan Shang, Wensi Ding, Zekun Zhang, Antoine Jean-Pierre Tixier, Polykarpos Meladianos, Michalis Vazirgiannis, Jean-Pierre Lorré

    Abstract: We introduce a novel graph-based framework for abstractive meeting speech summarization that is fully unsupervised and does not rely on any annotations. Our work combines the strengths of multiple recent approaches while addressing their weaknesses. Moreover, we leverage recent advances in word embeddings and graph degeneracy applied to NLP to take exterior semantic knowledge into account, and to…

    Submitted 14 November, 2018; v1 submitted 14 May, 2018; originally announced May 2018.

    Comments: Published as a long paper at ACL 2018. v2: updated Figure 3