Showing 1–23 of 23 results for author: Menezes, A

  1. arXiv:2402.12624  [pdf, other]

    cs.CV cs.AI

    Efficient Parameter Mining and Freezing for Continual Object Detection

    Authors: Angelo G. Menezes, Augusto J. Peterlevitz, Mateus A. Chinelatto, André C. P. L. F. de Carvalho

    Abstract: Continual Object Detection is essential for enabling intelligent agents to interact proactively with humans in real-world settings. While parameter-isolation strategies have been extensively explored in the context of continual learning for classification, they have yet to be fully harnessed for incremental object detection scenarios. Drawing inspiration from prior research that focused on mining…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: In Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP, ISBN 978-989-758-679-8, ISSN 2184-4321, pages 466-474

  2. arXiv:2310.15987  [pdf, other]

    cs.CL cs.AI

    Dissecting In-Context Learning of Translations in GPTs

    Authors: Vikas Raunak, Hany Hassan Awadalla, Arul Menezes

    Abstract: Most of the recent work in leveraging Large Language Models (LLMs) such as GPT-3 for Machine Translation (MT) has focused on selecting the few-shot samples for prompting. In this work, we try to better understand the role of demonstration attributes for the in-context learning of translations through perturbations of high-quality, in-domain demonstrations. We find that asymmetric perturbation of t…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: EMNLP Findings (+ Minor Updates over Camera-Ready)

  3. arXiv:2305.19835  [pdf, ps, other]

    cs.CL cs.AI

    Deliberate then Generate: Enhanced Prompting Framework for Text Generation

    Authors: Bei Li, Rui Wang, Junliang Guo, Kaitao Song, Xu Tan, Hany Hassan, Arul Menezes, Tong Xiao, Jiang Bian, JingBo Zhu

    Abstract: Large language models (LLMs) have shown remarkable success across a wide range of natural language generation tasks, where proper prompt designs make great impacts. While existing prompting methods are normally restricted to providing correct information, in this paper, we encourage the model to deliberate by proposing a novel Deliberate then Generate (DTG) prompting framework, which consists of e…

    Submitted 31 May, 2023; originally announced May 2023.

  4. arXiv:2305.16806  [pdf, other]

    cs.CL cs.AI

    Do GPTs Produce Less Literal Translations?

    Authors: Vikas Raunak, Arul Menezes, Matt Post, Hany Hassan Awadalla

    Abstract: Large Language Models (LLMs) such as GPT-3 have emerged as general-purpose language models capable of addressing many natural language generation or understanding tasks. On the task of Machine Translation (MT), multiple works have investigated few-shot prompting mechanisms to elicit better translations from LLMs. However, there has been relatively little investigation on how such translations diff…

    Submitted 5 June, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: ACL 2023

  5. arXiv:2305.14878  [pdf, other]

    cs.CL cs.AI

    Leveraging GPT-4 for Automatic Translation Post-Editing

    Authors: Vikas Raunak, Amr Sharaf, Yiren Wang, Hany Hassan Awadallah, Arul Menezes

    Abstract: While Neural Machine Translation (NMT) represents the leading approach to Machine Translation (MT), the outputs of NMT models still require translation post-editing to rectify errors and enhance quality under critical settings. In this work, we formalize the task of direct translation post-editing with Large Language Models (LLMs) and explore the use of GPT-4 to automatically post-edit NMT outputs…

    Submitted 23 October, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: EMNLP Findings 2023

  6. arXiv:2305.03361  [pdf, other]

    cs.SE cs.PL

    CHAMELEON: OutSystems Live Bidirectional Transformations

    Authors: Hugo Lourenço, João Costa Seco, Carla Ferreira, Tiago Simões, Vasco Silva, Filipe Assunção, André Menezes

    Abstract: In model-driven engineering, the bidirectional transformation of models plays a crucial role in facilitating the use of editors that operate at different levels of abstraction. This is particularly important in the context of industrial-grade low-code platforms like OutSystems, which feature a comprehensive ecosystem of tools that complement the standard integrated development environment with dom…

    Submitted 5 May, 2023; originally announced May 2023.

  7. arXiv:2304.14802  [pdf, other]

    cs.CL cs.AI cs.LG cs.NE

    ResiDual: Transformer with Dual Residual Connections

    Authors: Shufang Xie, Huishuai Zhang, Junliang Guo, Xu Tan, Jiang Bian, Hany Hassan Awadalla, Arul Menezes, Tao Qin, Rui Yan

    Abstract: Transformer networks have become the preferred architecture for many tasks due to their state-of-the-art performance. However, the optimal way to implement residual connections in Transformer, which are essential for effective training, is still debated. Two widely used variants are the Post-Layer-Normalization (Post-LN) and Pre-Layer-Normalization (Pre-LN) Transformers, which apply layer normaliz…

    Submitted 28 April, 2023; originally announced April 2023.

  8. arXiv:2303.16870  [pdf, ps, other]

    physics.soc-ph cs.AI cs.HC

    Questions of science: chatting with ChatGPT about complex systems

    Authors: Nuno Crokidakis, Marcio Argollo de Menezes, Daniel O. Cajueiro

    Abstract: We present an overview of the complex systems field using ChatGPT as a representation of the community's understanding. ChatGPT has learned language patterns and styles from a large dataset of internet texts, allowing it to provide answers that reflect common opinions, ideas, and language patterns found in the community. Our exploration covers both teaching and learning, and research topics. We re…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: This is a work in progress

  9. arXiv:2212.00006  [pdf, other]

    cs.HC cs.CL cs.CV cs.CY

    Operationalizing Specifications, In Addition to Test Sets for Evaluating Constrained Generative Models

    Authors: Vikas Raunak, Matt Post, Arul Menezes

    Abstract: In this work, we present some recommendations on the evaluation of state-of-the-art generative models for constrained generation tasks. The progress on generative models has been rapid in recent years. These large-scale models have had three impacts: firstly, the fluency of generation in both language and vision modalities has rendered common average-case evaluation metrics much less useful in dia…

    Submitted 19 November, 2022; originally announced December 2022.

    Comments: NeurIPS 2022 Workshop on Human Evaluation of Generative Models

  10. arXiv:2211.16934  [pdf, other]

    cs.CL cs.AI cs.LG cs.MM eess.AS

    VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing

    Authors: Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian

    Abstract: Video dubbing aims to translate the original speech in a film or television program into speech in a target language, which can be achieved with a cascaded system consisting of speech recognition, machine translation and speech synthesis. To ensure that the translated speech is well aligned with the corresponding video, the length/duration of the translated speech should be as close as possible…

    Submitted 4 December, 2023; v1 submitted 30 November, 2022; originally announced November 2022.

    Comments: AAAI 2023 camera version

  11. arXiv:2211.13317  [pdf, other]

    cs.CL cs.AI

    Rank-One Editing of Encoder-Decoder Models

    Authors: Vikas Raunak, Arul Menezes

    Abstract: Large sequence to sequence models for tasks such as Neural Machine Translation (NMT) are usually trained over hundreds of millions of samples. However, training is just the origin of a model's life-cycle. Real-world deployments of models require further behavioral adaptations as new requirements emerge or shortcomings become known. Typically, in the space of model behaviors, behavior deletion requ…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: The Second Workshop On Interactive Learning For Natural Language Processing (InterNLP 2022), NeurIPS 2022

  12. arXiv:2210.12929  [pdf, other]

    cs.CL cs.AI cs.LG

    Finding Memo: Extractive Memorization in Constrained Sequence Generation Tasks

    Authors: Vikas Raunak, Arul Menezes

    Abstract: Memorization presents a challenge for several constrained Natural Language Generation (NLG) tasks such as Neural Machine Translation (NMT), wherein the proclivity of neural models to memorize noisy and atypical samples reacts adversely with the noisy (web crawled) datasets. However, previous studies of memorization in constrained NLG tasks have only focused on counterfactual memorization, linking…

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: EMNLP Findings 2022

  13. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  14. arXiv:2205.15445  [pdf, other]

    cs.CV cs.LG cs.RO

    Continual Object Detection: A review of definitions, strategies, and challenges

    Authors: Angelo G. Menezes, Gustavo de Moura, Cézanne Alves, André C. P. L. F. de Carvalho

    Abstract: The field of Continual Learning investigates the ability to learn consecutive tasks without losing performance on those previously learned. Its focus has been mainly on incremental classification tasks. We believe that research in continual object detection deserves even more attention due to its vast range of applications in robotics and autonomous vehicles. This scenario is more complex than con…

    Submitted 30 May, 2022; originally announced May 2022.

  15. arXiv:2205.09988  [pdf, other]

    cs.CL cs.AI

    SALTED: A Framework for SAlient Long-Tail Translation Error Detection

    Authors: Vikas Raunak, Matt Post, Arul Menezes

    Abstract: Traditional machine translation (MT) metrics provide an average measure of translation quality that is insensitive to the long tail of behavioral problems in MT. Examples include translation of numbers, physical units, dropped content and hallucinations. These errors, which occur rarely and unpredictably in Neural Machine Translation (NMT), greatly undermine the reliability of state-of-the-art MT…

    Submitted 20 May, 2022; originally announced May 2022.

  16. arXiv:2107.10821  [pdf, other]

    cs.CL

    To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation

    Authors: Tom Kocmi, Christian Federmann, Roman Grundkiewicz, Marcin Junczys-Dowmunt, Hitokazu Matsushita, Arul Menezes

    Abstract: Automatic metrics are commonly used as the exclusive tool for declaring the superiority of one machine translation system's quality over another. The community choice of automatic metric guides research directions and industrial developments by deciding which models are deemed better. Evaluating metrics correlations with sets of human judgements has been limited by the size of these sets. In this…

    Submitted 13 September, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

    Comments: Accepted to WMT 2021 research papers

  17. arXiv:2107.01017  [pdf, other]

    q-fin.ST cs.AI cs.CE cs.LG

    MegazordNet: combining statistical and machine learning standpoints for time series forecasting

    Authors: Angelo Garangau Menezes, Saulo Martiello Mastelini

    Abstract: Forecasting financial time series is considered to be a difficult task due to the chaotic feature of the series. Statistical approaches have shown solid results in some specific problems such as predicting market direction and single-price of stocks; however, with the recent advances in deep learning and big data techniques, new promising options have arisen to tackle financial time series forecas…

    Submitted 23 June, 2021; originally announced July 2021.

  18. arXiv:2104.06683  [pdf, other]

    cs.CL cs.AI cs.LG

    The Curious Case of Hallucinations in Neural Machine Translation

    Authors: Vikas Raunak, Arul Menezes, Marcin Junczys-Dowmunt

    Abstract: In this work, we study hallucinations in Neural Machine Translation (NMT), which lie at an extreme end on the spectrum of NMT pathologies. Firstly, we connect the phenomenon of hallucinations under source perturbation to the Long-Tail theory of Feldman (2020), and present an empirically validated hypothesis that explains hallucinations under source perturbation. Secondly, we consider hallucination…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: Accepted to NAACL 2021

  19. arXiv:2101.10845  [pdf, other]

    cs.CV cs.AI

    Analysis and evaluation of Deep Learning based Super-Resolution algorithms to improve performance in Low-Resolution Face Recognition

    Authors: Angelo G. Menezes

    Abstract: Surveillance scenarios are prone to several problems since they usually involve low-resolution footage, and there is no control of how far the subjects may be from the camera in the first place. This situation is suitable for the application of upsampling (super-resolution) algorithms since they may be able to recover the discriminant properties of the subjects involved. While general super-resolu…

    Submitted 18 January, 2021; originally announced January 2021.

    Comments: MSc Thesis under supervision of Carlos A. E. Montesco presented at the Federal University of Sergipe, Brazil (2019)

    ACM Class: I.4.0; I.4.9

  20. arXiv:2012.15547  [pdf, other]

    cs.CL

    XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders

    Authors: Shuming Ma, Jian Yang, Haoyang Huang, Zewen Chi, Li Dong, Dongdong Zhang, Hany Hassan Awadalla, Alexandre Muzio, Akiko Eriguchi, Saksham Singhal, Xia Song, Arul Menezes, Furu Wei

    Abstract: Multilingual machine translation enables a single model to translate between different languages. Most existing multilingual machine translation systems adopt a randomly initialized Transformer backbone. In this work, inspired by the recent success of language model pre-training, we present XLM-T, which initializes the model with an off-the-shelf pretrained cross-lingual Transformer encoder and fi…

    Submitted 31 December, 2020; originally announced December 2020.

  21. A continuous integration and web framework in support of the ATLAS Publication Process

    Authors: Juan Pedro Araque Espinosa, Gabriel Baldi Levcovitz, Riccardo-Maria Bianchi, Ian Brock, Tancredi Carli, Nuno Filipe Castro, Alessandra Ciocio, Maurizio Colautti, Ana Carolina Da Silva Menezes, Gabriel De Oliveira da Fonseca, Leandro Domingues Macedo Alves, Andreas Hoecker, Bruno Lange Ramos, Gabriela Lemos Lúcidi Pinhão, Carmen Maidantchik, Fairouz Malek, Robert McPherson, Gianluca Picco, Marcelo Teixeira Dos Santos

    Abstract: The ATLAS collaboration defines methods, establishes procedures, and organises advisory groups to manage the publication processes of scientific papers, conference papers, and public notes. All stages are managed through web systems, computing programs, and tools that are designed and developed by the collaboration. A framework called FENCE is integrated into the CERN GitLab software repository, t…

    Submitted 28 January, 2021; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: 22 pages in total, 11 figures, submitted to JINST. All figures including auxiliary figures are available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/GENR-2018-01/

    Report number: CERN-OPEN-2020-007

  22. arXiv:1803.05567  [pdf, other]

    cs.CL

    Achieving Human Parity on Automatic Chinese to English News Translation

    Authors: Hany Hassan, Anthony Aue, Chang Chen, Vishal Chowdhary, Jonathan Clark, Christian Federmann, Xuedong Huang, Marcin Junczys-Dowmunt, William Lewis, Mu Li, Shujie Liu, Tie-Yan Liu, Renqian Luo, Arul Menezes, Tao Qin, Frank Seide, Xu Tan, Fei Tian, Lijun Wu, Shuangzhi Wu, Yingce Xia, Dongdong Zhang, Zhirui Zhang, Ming Zhou

    Abstract: Machine translation has made rapid advances in recent years. Millions of people are using it today in online translation systems and mobile applications in order to communicate across language barriers. The question naturally arises whether such systems can approach or achieve parity with human translations. In this paper, we first address the problem of how to define and accurately measure human…

    Submitted 29 June, 2018; v1 submitted 14 March, 2018; originally announced March 2018.

  23. arXiv:1203.6673  [pdf, ps, other]

    physics.soc-ph cond-mat.stat-mech cs.SI q-bio.PE

    Critical behavior of the SIS epidemic model with time-dependent infection rate

    Authors: Nuno Crokidakis, Marcio Argollo de Menezes

    Abstract: In this work we study a modified Susceptible-Infected-Susceptible (SIS) model in which the infection rate $λ$ decays exponentially with the number of reinfections $n$, saturating after $n=l$. We find a critical decaying rate $ε_{c}(l)$ above which a finite fraction of the population becomes permanently infected. From the mean-field solution and computer simulations on hypercubic lattices we find e…

    Submitted 29 March, 2012; originally announced March 2012.

    Comments: 13 pages, 11 figures, Submitted for publication

    Journal ref: J. Stat. Mech. P05012 (2012)