Showing 1–50 of 64 results for author: de Melo, G

  1. arXiv:2406.17639  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Mitigate the Gap: Investigating Approaches for Improving Cross-Modal Alignment in CLIP

    Authors: Sedigheh Eslami, Gerard de Melo

    Abstract: Contrastive Language--Image Pre-training (CLIP) has manifested remarkable improvements in zero-shot classification and cross-modal vision-language tasks. Yet, from a geometrical point of view, the CLIP embedding space has been found to have a pronounced modality gap. This gap renders the embedding space overly sparse and disconnected, with different modalities being densely distributed in distinct…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

  2. arXiv:2406.01551  [pdf, other]

    cs.CV

    ELSA: Evaluating Localization of Social Activities in Urban Streets

    Authors: Maryam Hosseini, Marco Cipriano, Sedigheh Eslami, Daniel Hodczak, Liu Liu, Andres Sevtsuk, Gerard de Melo

    Abstract: Why do some streets attract more social activities than others? Is it due to street design, or do land use patterns in neighborhoods create opportunities for businesses where people gather? These questions have intrigued urban sociologists, designers, and planners for decades. Yet, most research in this area has remained limited in scale, lacking a comprehensive perspective on the various factors…

    Submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2405.01660  [pdf, other]

    cs.CL cs.AI

    Investigating Wit, Creativity, and Detectability of Large Language Models in Domain-Specific Writing Style Adaptation of Reddit's Showerthoughts

    Authors: Tolga Buz, Benjamin Frost, Nikola Genchev, Moritz Schneider, Lucie-Aimée Kaffee, Gerard de Melo

    Abstract: Recent Large Language Models (LLMs) have shown the ability to generate content that is difficult or impossible to distinguish from human writing. We investigate the ability of differently-sized LLMs to replicate human writing style in short, creative texts in the domain of Showerthoughts, thoughts that may occur during mundane activities. We compare GPT-2 and GPT-Neo fine-tuned on Reddit data as w…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: Accepted to *SEM 2024 (StarSEM) conference

  4. arXiv:2403.05188  [pdf, other]

    cs.CL cs.SE

    CommitBench: A Benchmark for Commit Message Generation

    Authors: Maximilian Schall, Tamara Czinczoll, Gerard de Melo

    Abstract: Writing commit messages is a tedious daily task for many software developers, and often remains neglected. Automating this task has the potential to save time while ensuring that messages are informative. A high-quality dataset and an objective benchmark are vital preconditions for solid research and evaluation towards this goal. We show that existing datasets exhibit various problems, such as the…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: Submitted and accepted at SANER 2024

  5. arXiv:2403.00025  [pdf, ps, other]

    cs.LG cs.AI

    On the Challenges and Opportunities in Generative AI

    Authors: Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin

    Abstract: The field of deep generative modeling has grown rapidly and consistently over the years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue t…

    Submitted 28 February, 2024; originally announced March 2024.

  6. arXiv:2402.17682  [pdf, other]

    cs.CL

    NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents

    Authors: Tamara Czinczoll, Christoph Hönes, Maximilian Schall, Gerard de Melo

    Abstract: While (large) language models have significantly improved over the last years, they still struggle to sensibly process long sequences found, e.g., in books, due to the quadratic scaling of the underlying attention mechanism. To address this, we propose NextLevelBERT, a Masked Language Model operating not on tokens, but on higher-level semantic representations in the form of text embeddings. We pre…

    Submitted 13 June, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: accepted at ACL 2024; camera-ready version; 9 pages

  7. arXiv:2312.12806  [pdf, other]

    cs.CL cs.AI

    MedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models

    Authors: Yan Cai, Linlin Wang, Ye Wang, Gerard de Melo, Ya Zhang, Yanfeng Wang, Liang He

    Abstract: The emergence of various medical large language models (LLMs) in the medical domain has highlighted the need for unified evaluation standards, as manual evaluation of LLMs proves to be time-consuming and labor-intensive. To address this issue, we introduce MedBench, a comprehensive benchmark for the Chinese medical domain, comprising 40,041 questions sourced from authentic examination exercises an…

    Submitted 20 December, 2023; originally announced December 2023.

    Comments: accepted by AAAI-24

  8. arXiv:2312.06530  [pdf, other]

    cs.HC

    Study of Non-Verbal Behavior in Conversational Agents

    Authors: Camila Vicari Maccari, Gustavo Galle de Melo, Paulo Ricardo Knob, Soraia Raupp Musse

    Abstract: This paper studies the non-verbal behavior of a conversational agent named Arthur. We propose the development of body movements for this agent, which interacts solely through voice commands, chat, and videos with facial animations. This research aims to analyze users' perceptions regarding the gestures performed by Arthur. This study was conducted with participants who agreed to interact directly…

    Submitted 11 December, 2023; originally announced December 2023.

    Comments: Paper presented as Final Project in Computer Science - Pontifical Catholic University of Rio Grande do Sul (Brazil)

  9. arXiv:2311.05610  [pdf, other]

    cs.LG cs.DC

    Efficient Parallelization Layouts for Large-Scale Distributed Model Training

    Authors: Johannes Hagemann, Samuel Weinbach, Konstantin Dobler, Maximilian Schall, Gerard de Melo

    Abstract: Efficiently training large language models requires parallelizing across hundreds of hardware accelerators and invoking various compute and memory optimizations. When combined, many of these strategies have complex interactions regarding the final training efficiency. Prior work tackling this problem did not have access to the latest set of optimizations, such as FlashAttention or sequence paralle…

    Submitted 10 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

    Comments: Camera-ready version for the Workshop on Advancing Neural Network Training at 37th Conference on Neural Information Processing Systems (WANT@NeurIPS 2023)

  10. arXiv:2310.11818  [pdf, other]

    cs.AI

    IntentDial: An Intent Graph based Multi-Turn Dialogue System with Reasoning Path Visualization

    Authors: Zengguang Hao, Jie Zhang, Binxia Xu, Yafang Wang, Gerard de Melo, Xiaolong Li

    Abstract: Intent detection and identification from multi-turn dialogue has become a widely explored technique in conversational agents, for example, voice assistants and intelligent customer services. The conventional approaches typically cast the intent mining process as a classification task. Although neural classifiers have proven adept at such classification tasks, the issue of neural network models oft…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: 4 pages, 5 figures

  11. arXiv:2309.14468  [pdf, other]

    cs.CV cs.LG cs.SI

    FARSEC: A Reproducible Framework for Automatic Real-Time Vehicle Speed Estimation Using Traffic Cameras

    Authors: Lucas Liebe, Franz Sauerwald, Sylwester Sawicki, Matthias Schneider, Leo Schuhmann, Tolga Buz, Paul Boes, Ahmad Ahmadov, Gerard de Melo

    Abstract: Estimating the speed of vehicles using traffic cameras is a crucial task for traffic surveillance and management, enabling more optimal traffic flow, improved road safety, and lower environmental impact. Transportation-dependent systems, such as for navigation and logistics, have great potential to benefit from reliable speed estimation. While there is prior research in this area reporting competi…

    Submitted 25 September, 2023; originally announced September 2023.

    ACM Class: I.4.9

  12. arXiv:2308.06374  [pdf, other]

    cs.AI cs.CL

    Large Language Models and Knowledge Graphs: Opportunities and Challenges

    Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, Damien Graux

    Abstract: Large Language Models (LLMs) have taken Knowledge Representation -- and the world -- by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and…

    Submitted 11 August, 2023; originally announced August 2023.

    Comments: 30 pages

  13. arXiv:2307.10018  [pdf, other]

    cs.RO cs.AI

    RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

    Authors: Aline Lima de Oliveira, Cauê Addae da Silva Gomes, Cecília Virginia Santos da Silva, Charles Matheus de Sousa Alves, Danilo Andrade Martins de Souza, Driele Pires Ferreira Araújo Xavier, Edgleyson Pereira da Silva, Felipe Bezerra Martins, Lucas Henrique Cavalcanti Santos, Lucas Dias Maciel, Matheus Paixão Gumercindo dos Santos, Matheus Lafayette Vasconcelos, Matheus Vinícius Teotonio do Nascimento Andrade, João Guilherme Oliveira Carvalho de Melo, João Pedro Souza Pereira de Moura, José Ronald da Silva, José Victor Silva Cruz, Pedro Henrique Santana de Morais, Pedro Paulo Salman de Oliveira, Riei Joaquim Matos Rodrigues, Roberto Costa Fernandes, Ryan Vinicius Santos Morais, Tamara Mayara Ramos Teobaldo, Washington Igor dos Santos Silva, Edna Natividade Silva Barros

    Abstract: RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-times Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Ou…

    Submitted 19 July, 2023; originally announced July 2023.

  14. Connecting the Dots: What Graph-Based Text Representations Work Best for Text Classification Using Graph Neural Networks?

    Authors: Margarita Bugueño, Gerard de Melo

    Abstract: Given the success of Graph Neural Networks (GNNs) for structure-aware machine learning, many studies have explored their use for text classification, but mostly in specific domains with limited data characteristics. Moreover, some strategies prior to GNNs relied on graph mining and classical machine learning, making it difficult to assess their effectiveness in modern settings. This work extensive…

    Submitted 22 January, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of the Association for Computational Linguistics: EMNLP 2023 (Long Paper). 17 pages, 2 figures, 15 tables. The Appendix starts on page 12

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2023, pages 8943-8960

  15. FOCUS: Effective Embedding Initialization for Monolingual Specialization of Multilingual Models

    Authors: Konstantin Dobler, Gerard de Melo

    Abstract: Using model weights pretrained on a high-resource language as a warm start can reduce the need for data and compute to obtain high-quality language models for other, especially low-resource, languages. However, if we want to use a new tokenizer specialized for the target language, we cannot transfer the source model's embedding matrix. In this paper, we propose FOCUS - Fast Overlapping Token Combi…

    Submitted 6 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to EMNLP 2023 Main Conference (Long Paper). Code: https://github.com/konstantinjdobler/focus

  16. arXiv:2303.12734  [pdf, other]

    cs.CV cs.CL cs.LG

    MultiModal Bias: Introducing a Framework for Stereotypical Bias Assessment beyond Gender and Race in Vision Language Models

    Authors: Sepehr Janghorbani, Gerard de Melo

    Abstract: Recent breakthroughs in self supervised training have led to a new class of pretrained vision language models. While there have been investigations of bias in multimodal models, they have mostly focused on gender and racial bias, giving much less attention to other relevant groups, such as minorities with regard to religion, nationality, sexual orientation, or disabilities. This is mainly due to l…

    Submitted 16 March, 2023; originally announced March 2023.

  17. arXiv:2301.00170  [pdf, other]

    q-fin.ST cs.CY cs.SI

    Democratization of Retail Trading: Can Reddit's WallStreetBets Outperform Investment Bank Analysts?

    Authors: Tolga Buz, Gerard de Melo

    Abstract: The recent hype around Reddit's WallStreetBets (WSB) community has inspired research on its impact on our economy and society. Still, one important question remains: Can WSB's community of anonymous contributors actually provide valuable investment advice and possibly even outperform top financial institutions? We present a data-driven empirical study of investment recommendations of WSB in compar…

    Submitted 31 December, 2022; originally announced January 2023.

  18. arXiv:2210.05556  [pdf, other]

    cs.CV cs.CL

    ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities

    Authors: Terry Yue Zhuo, Yaqing Liao, Yuecheng Lei, Lizhen Qu, Gerard de Melo, Xiaojun Chang, Yazhou Ren, Zenglin Xu

    Abstract: We introduce ViLPAct, a novel vision-language benchmark for human activity planning. It is designed for a task where embodied AI agents can reason and forecast future actions of humans based on video clips about their initial activities and intents in text. The dataset consists of 2.9k videos from Charades extended with intents via crowdsourcing, a multi-choice question test set, and four strong…

    Submitted 9 March, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: Accepted at EACL2023 (Findings)

  19. arXiv:2208.03550  [pdf, other]

    cs.CV

    Frozen CLIP Models are Efficient Video Learners

    Authors: Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li

    Abstract: Video recognition has been dominated by the end-to-end learning paradigm -- first initializing a video recognition model with weights of a pretrained image model and then conducting end-to-end training on videos. This enables the video network to benefit from the pretrained image model. However, this requires substantial computation and memory resources for finetuning on videos and the alternative…

    Submitted 6 August, 2022; originally announced August 2022.

    Comments: ECCV 2022

  20. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  21. arXiv:2203.00281  [pdf, other]

    cs.CL

    Fast-R2D2: A Pretrained Recursive Neural Network based on Pruned CKY for Grammar Induction and Text Representation

    Authors: Xiang Hu, Haitao Mi, Liang Li, Gerard de Melo

    Abstract: Recently CKY-based models show great potential in unsupervised grammar induction thanks to their human-like encoding paradigm, which runs recursively and hierarchically, but requires $O(n^3)$ time-complexity. Recursive Transformer based on Differentiable Trees (R2D2) makes it possible to scale to large language model pre-training even with complex tree encoder by introducing a heuristic pruning me…

    Submitted 2 November, 2022; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: EMNLP 2022

  22. arXiv:2202.11777  [pdf, other]

    cs.CV cs.AI

    Art Creation with Multi-Conditional StyleGANs

    Authors: Konstantin Dobler, Florian Hübscher, Jan Westphal, Alejandro Sierra-Múnera, Gerard de Melo, Ralf Krestel

    Abstract: Creating meaningful art is often viewed as a uniquely human endeavor. A human artist needs a combination of unique skills, understanding, and genuine intention to create artworks that evoke deep feelings and emotions. In this paper, we introduce a multi-conditional Generative Adversarial Network (GAN) approach trained on large amounts of human paintings to synthesize realistic-looking paintings th…

    Submitted 23 February, 2022; originally announced February 2022.

  23. arXiv:2112.13906  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Does CLIP Benefit Visual Question Answering in the Medical Domain as Much as it Does in the General Domain?

    Authors: Sedigheh Eslami, Gerard de Melo, Christoph Meinel

    Abstract: Contrastive Language--Image Pre-training (CLIP) has shown remarkable success in learning with cross-modal supervision from extensive amounts of image--text pairs collected online. Thus far, the effectiveness of CLIP has been investigated primarily in general-domain multimodal problems. This work evaluates the effectiveness of CLIP for the task of Medical Visual Question Answering (MedVQA). To this…

    Submitted 27 December, 2021; originally announced December 2021.

  24. arXiv:2112.11734  [pdf, other]

    cs.LG cs.AI

    D-HYPR: Harnessing Neighborhood Modeling and Asymmetry Preservation for Digraph Representation Learning

    Authors: Honglu Zhou, Advith Chegu, Samuel S. Sohn, Zuohui Fu, Gerard de Melo, Mubbasir Kapadia

    Abstract: Digraph Representation Learning (DRL) aims to learn representations for directed homogeneous graphs (digraphs). Prior work in DRL is largely constrained (e.g., limited to directed acyclic graphs), or has poor generalizability across tasks (e.g., evaluated solely on one task). Most Graph Neural Networks (GNNs) exhibit poor performance on digraphs due to the neglect of modeling neighborhoods and pre…

    Submitted 28 September, 2022; v1 submitted 22 December, 2021; originally announced December 2021.

    Comments: CIKM 2022

  25. arXiv:2112.02721  [pdf, other]

    cs.CL cs.AI cs.LG

    NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

    Authors: Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, et al. (101 additional authors not shown)

    Abstract: Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on. In this paper, we present NL-Augmenter, a new participatory Python-based natural language augmentation framework which supports the creation of both transformations (modifications to the data) and filters (data split…

    Submitted 11 October, 2022; v1 submitted 5 December, 2021; originally announced December 2021.

    Comments: 39 pages, repository at https://github.com/GEM-benchmark/NL-Augmenter

  26. arXiv:2109.11778  [pdf, other]

    cs.CV cs.CL

    Dense Contrastive Visual-Linguistic Pretraining

    Authors: Lei Shi, Kai Shuang, Shijie Geng, Peng Gao, Zuohui Fu, Gerard de Melo, Yunpeng Chen, Sen Su

    Abstract: Inspired by the success of BERT, several multimodal representation learning approaches have been proposed that jointly represent image and text. These approaches achieve superior performance by capturing high-level semantic information from large-scale multimodal pretraining. In particular, LXMERT and UNITER adopt visual region feature regression and label classification as pretext tasks. However,…

    Submitted 24 September, 2021; originally announced September 2021.

    Comments: Accepted by ACM Multimedia 2021. arXiv admin note: text overlap with arXiv:2007.13135

  27. R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling

    Authors: Xiang Hu, Haitao Mi, Zujie Wen, Yafang Wang, Yi Su, Jing Zheng, Gerard de Melo

    Abstract: Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY style binary trees to emulate…

    Submitted 3 March, 2022; v1 submitted 2 July, 2021; originally announced July 2021.

    Comments: ACL-IJCNLP 2021

  28. arXiv:2105.02728  [pdf]

    q-fin.ST cs.CY cs.SI

    Should You Take Investment Advice From WallStreetBets? A Data-Driven Approach

    Authors: Tolga Buz, Gerard de Melo

    Abstract: Reddit's WallStreetBets (WSB) community has come to prominence in light of its notable role in affecting the stock prices of what are now referred to as meme stocks. Yet very little is known about the reliability of the highly speculative investment advice disseminated on WSB. This paper analyses WSB data spanning from January 2019 to April 2021 in order to assess how successful an investment stra…

    Submitted 6 May, 2021; originally announced May 2021.

  29. arXiv:2104.08679  [pdf, other]

    cs.CL

    Guilt by Association: Emotion Intensities in Lexical Representations

    Authors: Shahab Raji, Gerard de Melo

    Abstract: What do word vector representations reveal about the emotions associated with words? In this study, we consider the task of estimating word-level emotion intensity scores for specific emotions, exploring unsupervised, supervised, and finally a self-supervised method of extracting emotional associations from word vector representations. Overall, we find that word vectors carry substantial potential…

    Submitted 17 April, 2021; originally announced April 2021.

  30. arXiv:2104.08451  [pdf, other]

    cs.CL cs.AI

    Context-Aware Interaction Network for Question Matching

    Authors: Zhe Hu, Zuohui Fu, Yu Yin, Gerard de Melo

    Abstract: Impressive milestones have been achieved in text matching by adopting a cross-attention mechanism to capture pertinent semantic connections between two sentence representations. However, regular cross-attention focuses on word-level links between the two input sequences, neglecting the importance of contextual information. We propose a context-aware interaction network (COIN) to properly align two…

    Submitted 18 September, 2021; v1 submitted 17 April, 2021; originally announced April 2021.

  31. arXiv:2104.07869  [pdf, other]

    cs.IR

    Faithfully Explainable Recommendation via Neural Logic Reasoning

    Authors: Yaxin Zhu, Yikun Xian, Zuohui Fu, Gerard de Melo, Yongfeng Zhang

    Abstract: Knowledge graphs (KG) have become increasingly important to endow modern recommender systems with the ability to generate traceable reasoning paths to explain the recommendation process. However, prior research rarely considers the faithfulness of the derived explanations to justify the decision making process. To the best of our knowledge, this is the first work that models and evaluates faithful…

    Submitted 15 April, 2021; originally announced April 2021.

    Comments: Accepted in NAACL 2021

  32. arXiv:2103.05028  [pdf, other]

    cs.CL

    Fast and Effective Biomedical Entity Linking Using a Dual Encoder

    Authors: Rajarshi Bhowmik, Karl Stratos, Gerard de Melo

    Abstract: Biomedical entity linking is the task of identifying mentions of biomedical concepts in text documents and mapping them to canonical entities in a target thesaurus. Recent advancements in entity linking using BERT-based models follow a retrieve and rerank paradigm, where the candidate entities are first selected using a retriever model, and then the retrieved candidates are ranked by a reranker mo…

    Submitted 8 March, 2021; originally announced March 2021.

  33. arXiv:2101.00430  [pdf]

    cs.CL cs.IR cs.SI

    Assessing Emoji Use in Modern Text Processing Tools

    Authors: Abu Awal Md Shoeb, Gerard de Melo

    Abstract: Emojis have become ubiquitous in digital communication, due to their visual appeal as well as their ability to vividly convey human emotion, among other factors. The growing prominence of emojis in social media and other instant messaging also leads to an increased need for systems and tools to operate on text containing emojis. In this study, we assess this support by considering test sets of twe…

    Submitted 2 January, 2021; originally announced January 2021.

  34. arXiv:2012.09411  [pdf, other]

    cs.CL

    Interactive Question Clarification in Dialogue via Reinforcement Learning

    Authors: Xiang Hu, Zujie Wen, Yafang Wang, Xiaolong Li, Gerard de Melo

    Abstract: Coping with ambiguous questions has been a perennial problem in real-world dialogue systems. Although clarification by asking questions is a common form of human interaction, it is hard to define appropriate questions to elicit more specific intents from a user. In this work, we propose a reinforcement model to clarify ambiguous questions by suggesting refinements of the original query. We first f…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: COLING industry track

  35. arXiv:2011.06844  [pdf, other]

    cs.CL

    Cross-Domain Learning for Classifying Propaganda in Online Contents

    Authors: Liqiang Wang, Xiaoyu Shen, Gerard de Melo, Gerhard Weikum

    Abstract: As news and social media exhibit an increasing amount of manipulative polarized content, detecting such propaganda has received attention as a new task for content analysis. Prior work has focused on supervised learning with training data from the same domain. However, as propaganda can be subtle and keeps evolving, manual identification and proper labeling are very demanding. As a consequence, tr…

    Submitted 22 November, 2020; v1 submitted 13 November, 2020; originally announced November 2020.

    Comments: TTO 2020

  36. CAFE: Coarse-to-Fine Neural Symbolic Reasoning for Explainable Recommendation

    Authors: Yikun Xian, Zuohui Fu, Handong Zhao, Yingqiang Ge, Xu Chen, Qiaoying Huang, Shijie Geng, Zhou Qin, Gerard de Melo, S. Muthukrishnan, Yongfeng Zhang

    Abstract: Recent research explores incorporating knowledge graphs (KG) into e-commerce recommender systems, not only to achieve better recommendation performance, but more importantly to generate explanations of why particular decisions are made. This can be achieved by explicit KG reasoning, where a model starts from a user node, sequentially determines the next step, and walks towards an item node of pote…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: Accepted in CIKM 2020

  37. arXiv:2010.04366  [pdf, other]

    cs.SI cs.AI cs.LG

    GitEvolve: Predicting the Evolution of GitHub Repositories

    Authors: Honglu Zhou, Hareesh Ravi, Carlos M. Muniz, Vahid Azizi, Linda Ness, Gerard de Melo, Mubbasir Kapadia

    Abstract: Software development is becoming increasingly open and collaborative with the advent of platforms such as GitHub. Given its crucial role, there is a need to better understand and model the dynamics of GitHub as a social platform. Previous work has mostly considered the dynamics of traditional social networking sites like Twitter and Facebook. We propose GitEvolve, a system to predict the evolution…

    Submitted 9 October, 2020; originally announced October 2020.

  38. arXiv:2008.09237  [pdf, other]

    cs.IR cs.AI cs.CL

    COOKIE: A Dataset for Conversational Recommendation over Knowledge Graphs in E-commerce

    Authors: Zuohui Fu, Yikun Xian, Yaxin Zhu, Yongfeng Zhang, Gerard de Melo

    Abstract: In this work, we present a new dataset for conversational recommendation over knowledge graphs in e-commerce platforms called COOKIE. The dataset is constructed from an Amazon review corpus by integrating both user-agent dialogue and custom knowledge graphs for recommendation. Specifically, we first construct a unified knowledge graph and extract key entities between user--product pairs, which ser…

    Submitted 20 August, 2020; originally announced August 2020.

  39. arXiv:2007.15072  [pdf, other]

    cs.CL

    Leveraging Adversarial Training in Self-Learning for Cross-Lingual Text Classification

    Authors: Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard de Melo

    Abstract: In cross-lingual text classification, one seeks to exploit labeled data from one language to train a text classification model that can then be applied to a completely different language. Recent multilingual representation models have made it much easier to achieve this. Still, there may still be subtle differences between languages that are neglected when doing so. To address this, we present a s…

    Submitted 29 July, 2020; originally announced July 2020.

    Comments: SIGIR 2020 (Short Paper)

  40. arXiv:2007.13135  [pdf, other]

    cs.CV eess.IV

    Contrastive Visual-Linguistic Pretraining

    Authors: Lei Shi, Kai Shuang, Shijie Geng, Peng Su, Zhengkai Jiang, Peng Gao, Zuohui Fu, Gerard de Melo, Sen Su

    Abstract: Several multi-modality representation learning approaches such as LXMERT and ViLBERT have been proposed recently. Such approaches can achieve superior performance due to the high-level semantic information captured during large-scale multimodal pretraining. However, as ViLBERT and LXMERT adopt visual region regression and classification loss, they often suffer from domain gap and noisy label probl…

    Submitted 26 July, 2020; originally announced July 2020.

  41. arXiv:2006.04109  [pdf, other]

    cs.AI cs.CL cs.MA

    Incorporating Pragmatic Reasoning Communication into Emergent Language

    Authors: Yipeng Kang, Tonghan Wang, Gerard de Melo

    Abstract: Emergentism and pragmatics are two research fields that study the dynamics of linguistic communication along substantially different timescales and intelligence levels. From the perspective of multi-agent reinforcement learning, they correspond to stochastic games with reinforcement training and stage games with opponent awareness. Given that their combination has been explored in linguistics, we…

    Submitted 15 December, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

    Comments: 9 pages. Accepted as a spotlight paper to NeurIPS 2020

  42. arXiv:2006.02046  [pdf, other]

    cs.IR cs.AI cs.SI

    Fairness-Aware Explainable Recommendation over Knowledge Graphs

    Authors: Zuohui Fu, Yikun Xian, Ruoyuan Gao, Jieyu Zhao, Qiaoying Huang, Yingqiang Ge, Shuyuan Xu, Shijie Geng, Chirag Shah, Yongfeng Zhang, Gerard de Melo

    Abstract: There has been growing attention on fairness considerations recently, especially in the context of intelligent decision making systems. Explainable recommendation systems, in particular, may suffer from both explanation bias and performance disparity. In this paper, we analyze different groups of users according to their level of activity, and find that bias exists in recommendation performance be…

    Submitted 27 June, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

  43. arXiv:2005.13192  [pdf, other]

    cs.CV

    TIME: Text and Image Mutual-Translation Adversarial Networks

    Authors: Bingchen Liu, Kunpeng Song, Yizhe Zhu, Gerard de Melo, Ahmed Elgammal

    Abstract: Focusing on text-to-image (T2I) generation, we propose Text and Image Mutual-Translation Adversarial Networks (TIME), a lightweight but effective model that jointly learns a T2I generator G and an image captioning discriminator D under the Generative Adversarial Network framework. While previous methods tackle the T2I problem as a uni-directional task and use pre-trained language models to enforce…

    Submitted 22 December, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

    Comments: AAAI-2021

  44. arXiv:2005.08646  [pdf, other]

    cs.CV eess.IV

    Character Matters: Video Story Understanding with Character-Aware Relations

    Authors: Shijie Geng, Ji Zhang, Zuohui Fu, Peng Gao, Hang Zhang, Gerard de Melo

    Abstract: Different from short videos and GIFs, video stories contain clear plots and lists of principal characters. Without identifying the connection between appearing people and character names, a model is not able to obtain a genuine understanding of the plots. Video Story Question Answering (VSQA) offers an effective way to benchmark higher-level comprehension abilities of a model. However, current VSQ…

    Submitted 9 May, 2020; originally announced May 2020.

  45. arXiv:2005.01158  [pdf, other]

    cs.CL cs.IR cs.LG

    Correcting the Autocorrect: Context-Aware Typographical Error Correction via Training Data Augmentation

    Authors: Kshitij Shah, Gerard de Melo

    Abstract: In this paper, we explore the artificial generation of typographical errors based on real-world statistics. We first draw on a small set of annotated data to compute spelling error statistics. These are then invoked to introduce errors into substantially larger corpora. The generation methodology allows us to generate particularly challenging errors that require context-aware error detection. We u…

    Submitted 3 May, 2020; originally announced May 2020.

    Comments: Accepted for publication at LREC 2020

  46. arXiv:2005.00693  [pdf, other]

    cs.CL cs.AI cs.CY

    Are Emojis Emotional? A Study to Understand the Association between Emojis and Emotions

    Authors: Abu Shoeb, Gerard de Melo

    Abstract: Given the growing ubiquity of emojis in language, there is a need for methods and resources that shed light on their meaning and communicative role. One conspicuous aspect of emojis is their use to convey affect in ways that may otherwise be non-trivial to achieve. In this paper, we seek to explore the connection between emojis and emotions by means of a new dataset consisting of human-solicited a…

    Submitted 2 May, 2020; originally announced May 2020.

  47. arXiv:2005.00637  [pdf, other]

    cs.CL

    Explainable Link Prediction for Emerging Entities in Knowledge Graphs

    Authors: Rajarshi Bhowmik, Gerard de Melo

    Abstract: Despite their large-scale coverage, cross-domain knowledge graphs invariably suffer from inherent incompleteness and sparsity. Link prediction can alleviate this by inferring a target entity, given a source entity and a query relation. Recent embedding-based approaches operate in an uninterpretable latent semantic vector space of entities and relations, while path-based approaches operate in the s…

    Submitted 25 September, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: To appear in the proceedings of International Semantic Web Conference, 2020 (ISWC 2020)

  48. arXiv:2003.04036  [pdf, ps, other]

    cs.CL

    Sentence Analogies: Exploring Linguistic Relationships and Regularities in Sentence Embeddings

    Authors: Xunjie Zhu, Gerard de Melo

    Abstract: While important properties of word vector representations have been studied extensively, far less is known about the properties of sentence vector representations. Word vectors are often evaluated by assessing to what degree they exhibit regularities with regard to relationships of the sort considered in word analogies. In this paper, we investigate to what extent commonly used sentence vector rep…

    Submitted 9 March, 2020; originally announced March 2020.

  49. arXiv:2003.02320  [pdf, other]

    cs.AI cs.DB cs.LG

    Knowledge Graphs

    Authors: Aidan Hogan, Eva Blomqvist, Michael Cochez, Claudia d'Amato, Gerard de Melo, Claudio Gutierrez, José Emilio Labra Gayo, Sabrina Kirrane, Sebastian Neumaier, Axel Polleres, Roberto Navigli, Axel-Cyrille Ngonga Ngomo, Sabbir M. Rashid, Anisa Rula, Lukas Schmelzeisen, Juan Sequeda, Steffen Staab, Antoine Zimmermann

    Abstract: In this paper we provide a comprehensive introduction to knowledge graphs, which have recently garnered significant attention from both industry and academia in scenarios that require exploiting diverse, dynamic, large-scale collections of data. After some opening remarks, we motivate and contrast various graph-based data models and query languages that are used for knowledge graphs. We discuss th…

    Submitted 11 September, 2021; v1 submitted 4 March, 2020; originally announced March 2020.

    Comments: Revision from v5: Correcting errata from previous version for entailment/models, and some other minor typos

    Journal ref: ACM Comput. Surv. 54(4): 71:1-71:37 (2021)

  50. arXiv:2003.00739  [pdf, other]

    cs.CV cs.CL

    Long Short-Term Sample Distillation

    Authors: Liang Jiang, Zujie Wen, Zhongping Liang, Yafang Wang, Gerard de Melo, Zhe Li, Liangzhuang Ma, Jiaxing Zhang, Xiaolong Li, Yuan Qi

    Abstract: In the past decade, there has been substantial progress at training increasingly deep neural networks. Recent advances within the teacher--student training paradigm have established that information about past training updates show promise as a source of guidance during subsequent training steps. Based on this notion, in this paper, we propose Long Short-Term Sample Distillation, a novel training…

    Submitted 2 March, 2020; originally announced March 2020.

    Comments: published as a conference paper at AAAI 2020