Showing 1–50 of 212 results for author: Lee, R

  1. arXiv:2407.12882  [pdf, other]

    cs.CL cs.AI cs.LG

    InstructAV: Instruction Fine-tuning Large Language Models for Authorship Verification

    Authors: Yujia Hu, Zhiqiang Hu, Chun-Wei Seah, Roy Ka-Wei Lee

    Abstract: Large Language Models (LLMs) have demonstrated remarkable proficiency in a wide range of NLP tasks. However, when it comes to authorship verification (AV) tasks, which involve determining whether two given texts share the same authorship, even advanced models like ChatGPT exhibit notable limitations. This paper introduces a novel approach, termed InstructAV, for authorship verification. This appro…

    Submitted 16 July, 2024; originally announced July 2024.

  2. arXiv:2407.09105  [pdf, other]

    cs.LG cs.AI

    Enhancing Training Efficiency Using Packing with Flash Attention

    Authors: Achintya Kundu, Rhui Dih Lee, Laura Wynter, Raghu Kiran Ganti, Mayank Mishra

    Abstract: Padding is often used in tuning LLM models by adding special tokens to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. On the other hand, the Hugging Face SFT trainer offers the option to use packin…

    Submitted 18 July, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

  3. arXiv:2407.06362  [pdf, other]

    cs.RO physics.app-ph

    Self-deployable contracting-cord metamaterials with tunable mechanical properties

    Authors: Wenzhong Yan, Talmage Jones, Christopher L. Jawetz, Ryan H. Lee, Jonathan B. Hopkins, Ankur Mehta

    Abstract: Recent advances in active materials and fabrication techniques have enabled the production of cyclically self-deployable metamaterials with an expanded functionality space. However, designing metamaterials that possess continuously tunable mechanical properties after self-deployment remains a challenge, notwithstanding its importance. Inspired by push puppets, we introduce an efficient design stra…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 6 figures

    Journal ref: Materials Horizons (2024)

  4. arXiv:2406.17294  [pdf, other]

    cs.CL

    Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models

    Authors: Wenhao Shi, Zhiqiang Hu, Yi Bin, Junhua Liu, Yang Yang, See-Kiong Ng, Lidong Bing, Roy Ka-Wei Lee

    Abstract: Large language models (LLMs) have demonstrated impressive reasoning capabilities, particularly in textual mathematical problem-solving. However, existing open-source image instruction fine-tuning datasets, containing limited question-answer pairs per image, do not fully exploit visual information to enhance the multimodal mathematical reasoning capabilities of Multimodal LLMs (MLLMs). To bridge th…

    Submitted 26 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 8 pages

  5. arXiv:2406.12223  [pdf, other]

    cs.CL cs.CY

    ToxiCloakCN: Evaluating Robustness of Offensive Language Detection in Chinese with Cloaking Perturbations

    Authors: Yunze Xiao, Yujia Hu, Kenny Tsu Wei Choo, Roy Ka-wei Lee

    Abstract: Detecting hate speech and offensive language is essential for maintaining a safe and respectful digital environment. This study examines the limitations of state-of-the-art large language models (LLMs) in identifying offensive content within systematically perturbed data, with a focus on Chinese, a language particularly susceptible to such perturbations. We introduce ToxiCloakCN, an enhan…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 tables, 2 figures

  6. arXiv:2406.06717  [pdf, ps, other]

    cs.SI cs.HC

    Analyzing user archetypes in Singapore's Telegram groups on COVID-19 and climate change

    Authors: Val Alvern Cueco Ligo, Lan Tianxiang, Ying Zeng, Lam Yin Cheung, Pi Zonooz, Roy Ka-Wei Lee, Koustuv Saha, Edson C. Tandoc Jr., Navin Kumar

    Abstract: Social media platforms, particularly Telegram, play a pivotal role in shaping public perceptions and opinions on global and national issues. Unlike traditional news media, Telegram allows for the proliferation of user-generated content with minimal oversight, making it a significant venue for the spread of controversial and misinformative content. During the COVID-19 pandemic, Telegram's popularit…

    Submitted 10 June, 2024; originally announced June 2024.

  7. arXiv:2406.06474  [pdf, other]

    cs.AI cs.CL

    Towards a Personal Health Large Language Model

    Authors: Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, et al. (9 additional authors not shown)

    Abstract: In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 72 pages

  8. arXiv:2406.02352  [pdf, other]

    cs.LG

    System-Aware Neural ODE Processes for Few-Shot Bayesian Optimization

    Authors: Jixiang Qing, Becky D Langdon, Robert M Lee, Behrang Shafei, Mark van der Wilk, Calvin Tsay, Ruth Misener

    Abstract: We consider the problem of optimizing initial conditions and timing in dynamical systems governed by unknown ordinary differential equations (ODEs), where evaluating different initial conditions is costly and there are constraints on observation times. To identify the optimal conditions within several trials, we introduce a few-shot Bayesian Optimization (BO) framework based on the system's prior…

    Submitted 4 June, 2024; originally announced June 2024.

  9. arXiv:2406.00549  [pdf, other]

    stat.ME cs.AI

    Zero Inflation as a Missing Data Problem: a Proxy-based Approach

    Authors: Trung Phung, Jaron J. R. Lee, Opeyemi Oladapo-Shittu, Eili Y. Klein, Ayse Pinar Gurses, Susan M. Hannum, Kimberly Weems, Jill A. Marsteller, Sara E. Cosgrove, Sara C. Keller, Ilya Shpitser

    Abstract: A common type of zero-inflated data has certain true values incorrectly replaced by zeros due to data recording conventions (rare outcomes assumed to be absent) or details of data recording equipment (e.g. artificial zeros in gene expression data). Existing methods for zero-inflated data either fit the observed data likelihood via parametric mixture models that explicitly represent excess zeros,…

    Submitted 2 July, 2024; v1 submitted 1 June, 2024; originally announced June 2024.

    Comments: 28 pages, 8 figures, accepted for the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

  10. arXiv:2405.14791  [pdf, other]

    cs.LG cs.CV cs.DC

    Recurrent Early Exits for Federated Learning with Heterogeneous Clients

    Authors: Royson Lee, Javier Fernandez-Marques, Shell Xu Hu, Da Li, Stefanos Laskaridis, Łukasz Dudziak, Timothy Hospedales, Ferenc Huszár, Nicholas D. Lane

    Abstract: Federated learning (FL) has enabled distributed learning of a model across multiple clients in a privacy-preserving manner. One of the main challenges of FL is to accommodate clients with varying hardware capacities; clients have differing compute and memory requirements. To tackle this challenge, recent state-of-the-art approaches leverage the use of early exits. Nonetheless, these approaches fal…

    Submitted 27 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: Accepted at the 41st International Conference on Machine Learning (ICML 2024)

  11. arXiv:2405.10221  [pdf, other]

    math.OC cs.LG stat.ML

    Scalarisation-based risk concepts for robust multi-objective optimisation

    Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

    Abstract: Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker, whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective case of this problem. We identify that the maj…

    Submitted 15 July, 2024; v1 submitted 16 May, 2024; originally announced May 2024.

    Comments: The code is available at: https://github.com/benmltu/scalarize

  12. arXiv:2405.01842  [pdf, ps, other]

    cs.CL

    SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore

    Authors: Ri Chi Ng, Nirmalendu Prakash, Ming Shan Hee, Kenny Tsu Wei Choo, Roy Ka-Wei Lee

    Abstract: To address the limitations of current hate speech detection models, we introduce SGHateCheck, a novel framework designed for the linguistic and cultural context of Singapore and Southeast Asia. It extends the functional testing approach of HateCheck and MHC, employing large language models for translation and paraphrasing into Singapore's main languages, and refining these with native ann…

    Submitted 3 May, 2024; originally announced May 2024.

  13. arXiv:2405.01404  [pdf, other]

    stat.ML cs.LG math.OC stat.ME

    Random Pareto front surfaces

    Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

    Abstract: The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordina…

    Submitted 21 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: The code is available at: https://github.com/benmltu/scalarize

  14. arXiv:2404.17667  [pdf, other]

    eess.SP cs.LG

    SiamQuality: A ConvNet-Based Foundation Model for Imperfect Physiological Signals

    Authors: Cheng Ding, Zhicheng Guo, Zhaoliang Chen, Randall J Lee, Cynthia Rudin, Xiao Hu

    Abstract: Foundation models, especially those using transformers as backbones, have gained significant popularity, particularly in language and language-vision tasks. However, large foundation models are typically trained on high-quality data, which poses a significant challenge, given the prevalence of poor-quality real-world data. This challenge is more pronounced for developing foundation models for phys…

    Submitted 26 April, 2024; originally announced April 2024.

  15. arXiv:2404.15353  [pdf, other]

    eess.SP cs.AI cs.LG

    SQUWA: Signal Quality Aware DNN Architecture for Enhanced Accuracy in Atrial Fibrillation Detection from Noisy PPG Signals

    Authors: Runze Yan, Cheng Ding, Ran Xiao, Aleksandr Fedorov, Randall J Lee, Fadi Nahab, Xiao Hu

    Abstract: Atrial fibrillation (AF), a common cardiac arrhythmia, significantly increases the risk of stroke, heart disease, and mortality. Photoplethysmography (PPG) offers a promising solution for continuous AF monitoring, due to its cost efficiency and integration into wearable devices. Nonetheless, PPG signals are susceptible to corruption from motion artifacts and other factors often encountered in ambu…

    Submitted 14 April, 2024; originally announced April 2024.

    Comments: 15 pages; 9 figures; 2024 Conference on Health, Inference, and Learning (CHIL)

  16. arXiv:2404.14219  [pdf, other]

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra, et al. (90 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset…

    Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 19 pages

  17. arXiv:2404.03991  [pdf, other]

    eess.IV cs.CV cs.LG

    Towards Efficient and Accurate CT Segmentation via Edge-Preserving Probabilistic Downsampling

    Authors: Shahzad Ali, Yu Rim Lee, Soo Young Park, Won Young Tak, Soon Ki Jung

    Abstract: Downsampling images and labels, often necessitated by limited resources or to expedite network training, leads to the loss of small objects and thin boundaries. This undermines the segmentation network's capacity to interpret images accurately and predict detailed labels, resulting in diminished performance compared to processing at original resolutions. This situation exemplifies the trade-off be…

    Submitted 5 April, 2024; originally announced April 2024.

    Comments: 5 pages (4 figures, 1 table); This work has been submitted to the IEEE Signal Processing Letters. Copyright may be transferred without notice, after which this version may no longer be accessible

  18. arXiv:2404.01353  [pdf, other]

    cs.LG cs.AI cs.CL

    Efficiently Distilling LLMs for Edge Applications

    Authors: Achintya Kundu, Fabian Lim, Aaron Chew, Laura Wynter, Penny Chong, Rhui Dih Lee

    Abstract: Supernet training of LLMs is of great interest in industrial applications as it confers the ability to produce a palette of smaller models at constant cost, regardless of the number of models (of different size / latency) produced. We propose a new method called Multistage Low-rank Fine-tuning of Super-transformers (MLFS) for parameter-efficient supernet training. We show that it is possible to ob…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: This paper has been accepted for publication in NAACL 2024 (Industry Track)

  19. arXiv:2404.01104  [pdf, other]

    cs.CL

    SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity

    Authors: Jaemin Kim, Yohan Na, Kangmin Kim, Sang Rak Lee, Dong-Kyu Chae

    Abstract: Recently, sentiment-aware pre-trained language models (PLMs) have demonstrated impressive results in downstream sentiment analysis tasks. However, they neglect to evaluate the quality of their constructed sentiment representations; they just focus on improving the fine-tuning performance, which overshadows the representation quality. We argue that without guaranteeing the representation quality, their d…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 14 pages, 8 figures

    MSC Class: 68T50 ACM Class: I.2.7

    Journal ref: LREC-COLING2024

  20. arXiv:2403.14652  [pdf, other]

    cs.CY cs.AI cs.CL cs.MM

    MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation

    Authors: Han Wang, Roy Ka-Wei Lee

    Abstract: Online memes have emerged as powerful digital cultural artifacts in the age of social media, offering not only humor but also platforms for political discourse, social critique, and information dissemination. Their extensive reach and influence in shaping online communities' sentiments make them invaluable tools for campaigning and promoting ideologies. Despite the development of several meme-gene…

    Submitted 24 February, 2024; originally announced March 2024.

    Comments: 8 pages, 7 figures, ACM MM 2024

    ACM Class: I.2.7; I.2.10

  21. arXiv:2402.17971  [pdf, other]

    cs.CV cs.AI cs.CL

    All in an Aggregated Image for In-Image Learning

    Authors: Lei Wang, Wanyu Xu, Zhiqiang Hu, Yihuai Lan, Shan Dong, Hao Wang, Roy Ka-Wei Lee, Ee-Peng Lim

    Abstract: This paper introduces a new in-context learning (ICL) mechanism called In-Image Learning (I$^2$L) that combines demonstration examples, visual cues, and chain-of-thought reasoning into an aggregated image to enhance the capabilities of Large Multimodal Models (e.g., GPT-4V) in multimodal reasoning tasks. Unlike previous approaches that rely on converting images to text or incorporating visual inpu…

    Submitted 2 April, 2024; v1 submitted 27 February, 2024; originally announced February 2024.

    Comments: Preprint

  22. arXiv:2402.12647  [pdf, other]

    cs.CV cs.RO

    DiffusionNOCS: Managing Symmetry and Uncertainty in Sim2Real Multi-Modal Category-level Pose Estimation

    Authors: Takuya Ikeda, Sergey Zakharov, Tianyi Ko, Muhammad Zubair Irshad, Robert Lee, Katherine Liu, Rares Ambrus, Koichi Nishiwaki

    Abstract: This paper addresses the challenging problem of category-level pose estimation. Current state-of-the-art methods for this task face challenges when dealing with symmetric objects and when attempting to generalize to new environments solely through synthetic data training. In this work, we address these challenges by proposing a probabilistic model that relies on diffusion to estimate dense canonic…

    Submitted 5 March, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: 8 pages. 9 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  23. arXiv:2402.11845  [pdf, other]

    cs.CL cs.CV

    Modularized Networks for Few-shot Hateful Meme Detection

    Authors: Rui Cao, Roy Ka-Wei Lee, Jing Jiang

    Abstract: In this paper, we address the challenge of detecting hateful memes in the low-resource setting where only a few labeled examples are available. Our approach leverages the compositionality of Low-rank adaptation (LoRA), a widely used parameter-efficient tuning technique. We commence by fine-tuning large language models (LLMs) with LoRA on selected tasks pertinent to hateful meme detection, thereby…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: camera-ready for WWW, 2024, Web4Good

  24. arXiv:2402.08406  [pdf, other]

    cs.LG

    Transition Constrained Bayesian Optimization via Markov Decision Processes

    Authors: Jose Pablo Folch, Calvin Tsay, Robert M Lee, Behrang Shafei, Weronika Ormaniec, Andreas Krause, Mark van der Wilk, Ruth Misener, Mojmír Mutný

    Abstract: Bayesian optimization is a methodology to optimize black-box functions. Traditionally, it focuses on the setting where you can arbitrarily query the search space. However, many real-life problems do not offer this flexibility; in particular, the search space of the next query may depend on previous ones. Example challenges arise in the physical sciences in the form of local movement constraints, r…

    Submitted 29 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

    Comments: 10 pages main, 32 pages total, 16 figures, 2 tables, preprint

  25. arXiv:2402.01707  [pdf, other]

    cs.CY cs.HC

    Revitalizing Sex Education for Chinese Children: A Formative Study

    Authors: Kyrie Zhixuan Zhou, Yilin Zhu, Jingwen Shan, Madelyn Rose Sanfilippo, Hee Rin Lee

    Abstract: Sex education helps children obtain knowledge and awareness of sexuality, and protects them against sexually transmitted diseases, pregnancy, and sexual abuse. Sex education is not well taught to children in China -- both school-based education and parental communication on this topic are limited. To interrogate the status quo of sex education in China and explore suitable interventions, we conduc…

    Submitted 25 January, 2024; originally announced February 2024.

  26. arXiv:2401.16727  [pdf, other]

    cs.CL

    Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models

    Authors: Ming Shan Hee, Shivam Sharma, Rui Cao, Palash Nandi, Tanmoy Chakraborty, Roy Ka-Wei Lee

    Abstract: In the evolving landscape of online communication, moderating hate speech (HS) presents an intricate challenge, compounded by the multimodal nature of digital content. This comprehensive survey delves into the recent strides in HS moderation, spotlighting the burgeoning role of large language models (LLMs) and large multimodal models (LMMs). Our exploration begins with a thorough analysis of curre…

    Submitted 1 February, 2024; v1 submitted 29 January, 2024; originally announced January 2024.

    Comments: Preprint; Under-Review

  27. arXiv:2401.07856  [pdf]

    physics.optics cs.CV physics.app-ph

    Information hiding cameras: optical concealment of object information into ordinary images

    Authors: Bijie Bai, Ryan Lee, Yuhang Li, Tianyi Gan, Yuntian Wang, Mona Jarrahi, Aydogan Ozcan

    Abstract: Data protection methods like cryptography, despite being effective, inadvertently signal the presence of secret communication, thereby drawing undue attention. Here, we introduce an optical information hiding camera integrated with an electronic decoder, optimized jointly through deep learning. This information hiding-decoding system employs a diffractive optical processor as its front-end, which…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: 26 pages, 8 figures

    Journal ref: Science Advances (2024)

  28. Temporal and Between-Group Variability in College Dropout Prediction

    Authors: Dominik Glandorf, Hye Rin Lee, Gabe Avakian Orona, Marina Pumptow, Renzhe Yu, Christian Fischer

    Abstract: Large-scale administrative data is a common input in early warning systems for college dropout in higher education. Still, the terminology and methodology vary significantly across existing studies, and the implications of different modeling decisions are not fully understood. This study provides a systematic evaluation of contributing factors and predictive performance of machine learning models…

    Submitted 12 January, 2024; originally announced January 2024.

    Comments: Full paper accepted to Learning Analytics and Knowledge (LAK 2024)

  29. arXiv:2312.11804  [pdf, other]

    cs.RO

    Gravity-aware Grasp Generation with Implicit Grasp Mode Selection for Underactuated Hands

    Authors: Tianyi Ko, Takuya Ikeda, Thomas Stewart, Robert Lee, Koichi Nishiwaki

    Abstract: Learning-based grasp detectors typically assume a precision grasp, where each finger only has one contact point, and estimate the grasp probability. In this work, we propose a data generation and learning pipeline that can leverage power grasping, which has more contact points with an enveloping configuration and is robust against both positioning error and force disturbance. To train a grasp dete…

    Submitted 28 February, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  30. arXiv:2312.09693  [pdf, other]

    cs.AI

    Prompting Large Language Models for Topic Modeling

    Authors: Han Wang, Nirmalendu Prakash, Nguyen Khoi Hoang, Ming Shan Hee, Usman Naseem, Roy Ka-Wei Lee

    Abstract: Topic modeling is a widely used technique for revealing underlying thematic structures within textual data. However, existing models have certain limitations, particularly when dealing with short text datasets that lack co-occurring words. Moreover, these models often neglect sentence-level semantics, focusing primarily on token-level semantics. In this paper, we propose PromptTopic, a novel topic…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: 6 pages, 3 figures, IEEE International Conference on Big Data

    ACM Class: I.2.7

  31. arXiv:2312.08603  [pdf, other]

    eess.AS cs.SD

    NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification

    Authors: Hyun-Jun Heo, Ui-Hyeop Shin, Ran Lee, YoungJu Cheon, Hyung-Min Park

    Abstract: In speaker verification, ECAPA-TDNN has shown remarkable improvement by utilizing one-dimensional (1D) Res2Net block and squeeze-and-excitation (SE) module, along with multi-layer feature aggregation (MFA). Meanwhile, in vision tasks, ConvNet structures have been modernized by referring to Transformer, resulting in improved performance. In this paper, we present an improved block design for TDNN in…

    Submitted 14 December, 2023; v1 submitted 13 December, 2023; originally announced December 2023.

    Comments: Accepted by ICASSP 2024

  32. arXiv:2312.07399  [pdf, other]

    cs.CL cs.AI

    Large Language Models are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales

    Authors: Taeyoon Kwon, Kai Tzu-iunn Ong, Dongjin Kang, Seungjun Moon, Jeong Ryong Lee, Dosik Hwang, Yongsik Sim, Beomseok Sohn, Dongha Lee, Jinyoung Yeo

    Abstract: Machine reasoning has made great progress in recent years owing to large language models (LLMs). In the clinical domain, however, most NLP-driven projects mainly focus on clinical classification or reading comprehension, and under-explore clinical reasoning for disease diagnosis due to the expensive rationale annotation with clinicians. In this work, we present a "reasoning-aware" diagnosis framew…

    Submitted 10 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted to AAAI 2024

  33. arXiv:2312.06094  [pdf, other]

    cs.CL cs.CV cs.MM

    MATK: The Meme Analytical Tool Kit

    Authors: Ming Shan Hee, Aditi Kumaresan, Nguyen Khoi Hoang, Nirmalendu Prakash, Rui Cao, Roy Ka-Wei Lee

    Abstract: The rise of social media platforms has brought about a new digital culture called memes. Memes, which combine visuals and text, can strongly influence public opinions on social and cultural issues. As a result, people have become interested in categorizing memes, leading to the development of various datasets and multimodal models that show promising results in this field. However, there is curren…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: Accepted at ACM Multimedia'23 Open-Source Software Competition Track

    ACM Class: I.1.4

  34. arXiv:2312.06093  [pdf, other]

    cs.CL cs.CV cs.MM

    PromptMTopic: Unsupervised Multimodal Topic Modeling of Memes using Large Language Models

    Authors: Nirmalendu Prakash, Han Wang, Nguyen Khoi Hoang, Ming Shan Hee, Roy Ka-Wei Lee

    Abstract: The proliferation of social media has given rise to a new form of communication: memes. Memes are multimodal and often contain a combination of text and visual elements that convey meaning, humor, and cultural significance. While meme analysis has been an active area of research, little work has been done on unsupervised multimodal topic modeling of memes, which is important for content moderation…

    Submitted 10 December, 2023; originally announced December 2023.

    Comments: Accepted at ACM Multimedia'23 Research Track

    ACM Class: I.1.4; I.1.7

  35. arXiv:2312.02658  [pdf]

    cs.LG physics.ao-ph

    Do AI models produce better weather forecasts than physics-based models? A quantitative evaluation case study of Storm Ciarán

    Authors: Andrew J. Charlton-Perez, Helen F. Dacre, Simon Driscoll, Suzanne L. Gray, Ben Harvey, Natalie J. Harvey, Kieran M. R. Hunt, Robert W. Lee, Ranjini Swaminathan, Remy Vandaele, Ambrogio Volonté

    Abstract: There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impact weather events. We compare forecasts of Storm Ciarán, a European windstorm that caused sixteen deaths and ext…

    Submitted 19 February, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

  36. arXiv:2312.00622  [pdf, other]

    cs.LG math.OC stat.ME

    Practical Path-based Bayesian Optimization

    Authors: Jose Pablo Folch, James Odgers, Shiqiang Zhang, Robert M Lee, Behrang Shafei, David Walz, Calvin Tsay, Mark van der Wilk, Ruth Misener

    Abstract: There has been a surge in interest in data-driven experimental design with applications to chemical engineering and drug manufacturing. Bayesian optimization (BO) has proven to be adaptable to such cases, since we can model the reactions of interest as expensive black-box functions. Sometimes, the cost of these black-box functions can be separated into two parts: (a) the cost of the experiment itse…

    Submitted 1 December, 2023; originally announced December 2023.

    Comments: 6 main pages, 12 with references and appendix. 4 figures, 2 tables. To appear in NeurIPS 2023 Workshop on Adaptive Experimental Design and Active Learning in the Real World

    Journal ref: NeurIPS 2023 Workshop on Adaptive Experimental Design and Active Learning in the Real World

  37. arXiv:2311.18451  [pdf, other]

    cs.LG

    How Much Is Hidden in the NAS Benchmarks? Few-Shot Adaptation of a NAS Predictor

    Authors: Hrushikesh Loya, Łukasz Dudziak, Abhinav Mehrotra, Royson Lee, Javier Fernandez-Marques, Nicholas D. Lane, Hongkai Wen

    Abstract: Neural architecture search has proven to be a powerful approach to designing and refining neural networks, often boosting their performance and efficiency over manually-designed variations, but comes with computational overhead. While there has been a considerable amount of research focused on lowering the cost of NAS for mainstream tasks, such as image classification, a lot of those improvements…

    Submitted 30 November, 2023; originally announced November 2023.

  38. arXiv:2311.18260  [pdf, other]

    eess.IV cs.CL cs.CV cs.LG

    Consensus, dissensus and synergy between clinicians and specialist foundation models in radiology report generation

    Authors: Ryutaro Tanno, David G. T. Barrett, Andrew Sellergren, Sumedh Ghaisas, Sumanth Dathathri, Abigail See, Johannes Welbl, Karan Singhal, Shekoofeh Azizi, Tao Tu, Mike Schaekermann, Rhys May, Roy Lee, SiWai Man, Zahra Ahmed, Sara Mahdavi, Yossi Matias, Joelle Barral, Ali Eslami, Danielle Belgrave, Vivek Natarajan, Shravya Shetty, Pushmeet Kohli, Po-Sen Huang, Alan Karthikesalingam, et al. (1 additional author not shown)

    Abstract: Radiology reports are an instrumental part of modern medicine, informing key clinical decisions such as diagnosis and treatment. The worldwide shortage of radiologists, however, restricts access to expert care and imposes heavy workloads, contributing to avoidable errors and delays in report delivery. While recent progress in automated report generation with vision-language models offers clear pote…

    Submitted 20 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  39. arXiv:2311.18145  [pdf, ps, other]

    cs.DS math.FA

    Sparsifying generalized linear models

    Authors: Arun Jambulapati, James R. Lee, Yang P. Liu, Aaron Sidford

    Abstract: We consider the sparsification of sums $F : \mathbb{R}^n \to \mathbb{R}$ where $F(x) = f_1(\langle a_1,x\rangle) + \cdots + f_m(\langle a_m,x\rangle)$ for vectors $a_1,\ldots,a_m \in \mathbb{R}^n$ and functions $f_1,\ldots,f_m : \mathbb{R} \to \mathbb{R}_+$. We show that $(1+\varepsilon)$-approximate sparsifiers of $F$ with support size…

    Submitted 29 November, 2023; originally announced November 2023.

  40. arXiv:2311.13777  [pdf, other]

    cs.CV

    GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence

    Authors: Pengyuan Wang, Takuya Ikeda, Robert Lee, Koichi Nishiwaki

    Abstract: Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics. Recently, deep-learning-based approaches have made great progress, but are typically hindered by the need for large datasets of either pose-labelled real images or carefully tuned photorealistic simulators. This can be avoided by using only geometry inputs such as depth images to…

    Submitted 22 November, 2023; originally announced November 2023.

  41. arXiv:2311.11071  [pdf, other]

    cs.IR cs.AI cs.LG cs.SI

    SBTRec- A Transformer Framework for Personalized Tour Recommendation Problem with Sentiment Analysis

    Authors: Ngai Lam Ho, Roy Ka-Wei Lee, Kwan Hui Lim

    Abstract: When traveling to an unfamiliar city for holidays, tourists often rely on guidebooks, travel websites, or recommendation systems to plan their daily itineraries and explore popular points of interest (POIs). However, these approaches may lack optimization in terms of time feasibility, localities, and user preferences. In this paper, we propose the SBTRec algorithm: a BERT-based Trajectory Recommen…

    Submitted 18 November, 2023; originally announced November 2023.

    Report number: 01

  42. arXiv:2311.01697  [pdf, other]

    cs.RO

    CraterGrader: Autonomous Robotic Terrain Manipulation for Lunar Site Preparation and Earthmoving

    Authors: Ryan Lee, Benjamin Younes, Alexander Pletta, John Harrington, Russell Q. Wong, William "Red" Whittaker

    Abstract: Establishing lunar infrastructure is paramount to long-term habitation on the Moon. To meet the demand for future lunar infrastructure development, we present CraterGrader, a novel system for autonomous robotic earthmoving tasks within lunar constraints. In contrast to the current approaches to construction autonomy, CraterGrader uses online perception for dynamic mapping of deformable terrain, de…

    Submitted 4 June, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

    Comments: 13 pages, 10 figures

  43. arXiv:2310.20468  [pdf, other]

    cs.RO

    An Introduction to Causal Inference Methods for Observational Human-Robot Interaction Research

    Authors: Jaron J. R. Lee, Gopika Ajaykumar, Ilya Shpitser, Chien-Ming Huang

    Abstract: Quantitative methods in Human-Robot Interaction (HRI) research have primarily relied upon randomized, controlled experiments in laboratory settings. However, such experiments are not always feasible when external validity, ethical constraints, and ease of data collection are of concern. Furthermore, as consumer robots become increasingly available, increasing amounts of real-world data will be ava…

    Submitted 31 October, 2023; originally announced October 2023.

    Comments: 28 pages

  44. arXiv:2310.20463  [pdf, ps, other]

    cs.AI

    Interpretable Neural PDE Solvers using Symbolic Frameworks

    Authors: Yolanne Yi Ran Lee

    Abstract: Partial differential equations (PDEs) are ubiquitous in the world around us, modelling phenomena from heat and sound to quantum systems. Recent advances in deep learning have resulted in the development of powerful neural solvers; however, while these methods have demonstrated state-of-the-art performance in both accuracy and computational efficiency, a significant challenge remains in their inter…

    Submitted 10 November, 2023; v1 submitted 31 October, 2023; originally announced October 2023.

    Comments: Accepted to the NeurIPS 2023 AI for Science Workshop. arXiv admin note: text overlap with arXiv:2310.19763

  45. arXiv:2310.20159  [pdf, other]

    cs.CV cs.AI

    Language Guided Visual Question Answering: Elevate Your Multimodal Language Model Using Knowledge-Enriched Prompts

    Authors: Deepanway Ghosal, Navonil Majumder, Roy Ka-Wei Lee, Rada Mihalcea, Soujanya Poria

    Abstract: Visual question answering (VQA) is the task of answering questions about an image. The task assumes an understanding of both the image and the question to provide a natural language answer. VQA has gained popularity in recent years due to its potential applications in a wide range of fields, including robotics, education, and healthcare. In this paper, we focus on knowledge-augmented VQA, where an…

    Submitted 30 October, 2023; originally announced October 2023.

  46. arXiv:2310.19886  [pdf]

    cs.LG cs.IR cs.SI

    BTRec: BERT-Based Trajectory Recommendation for Personalized Tours

    Authors: Ngai Lam Ho, Roy Ka-Wei Lee, Kwan Hui Lim

    Abstract: An essential task for tourists having a pleasant holiday is to have a well-planned itinerary with relevant recommendations, especially when visiting unfamiliar cities. Many tour recommendation tools only take into account a limited number of factors, such as popular Points of Interest (POIs) and routing constraints. Consequently, the solutions they provide may not always align with the individual…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: RecSys 2023, Workshop on Recommenders in Tourism

  47. arXiv:2310.19763  [pdf, other]

    cs.LG cs.AI math.NA

    Autoregressive Renaissance in Neural PDE Solvers

    Authors: Yolanne Yi Ran Lee

    Abstract: Recent developments in the field of neural partial differential equation (PDE) solvers have placed a strong emphasis on neural operators. However, the paper "Message Passing Neural PDE Solver" by Brandstetter et al. published in ICLR 2022 revisits autoregressive models and designs a message passing graph neural network that is comparable with or outperforms both the state-of-the-art Fourier Neural…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Presented as a workshop poster at ICLR 2023

  48. arXiv:2310.10928  [pdf, ps, other]

    cs.HC cs.AI cs.LG

    Using Audio Data to Facilitate Depression Risk Assessment in Primary Health Care

    Authors: Adam Valen Levinson, Abhay Goyal, Roger Ho Chun Man, Roy Ka-Wei Lee, Koustuv Saha, Nimay Parekh, Frederick L. Altice, Lam Yin Cheung, Munmun De Choudhury, Navin Kumar

    Abstract: Telehealth is a valuable tool for primary health care (PHC), where depression is a common condition. PHC is the first point of contact for most people with depression, but about 25% of diagnoses made by PHC physicians are inaccurate. Many other barriers also hinder depression detection and treatment in PHC. Artificial intelligence (AI) may help reduce depression misdiagnosis in PHC and improve ove…

    Submitted 16 October, 2023; originally announced October 2023.

  49. arXiv:2310.09203  [pdf, other]

    cs.LG cs.AI

    SiamAF: Learning Shared Information from ECG and PPG Signals for Robust Atrial Fibrillation Detection

    Authors: Zhicheng Guo, Cheng Ding, Duc H. Do, Amit Shah, Randall J. Lee, Xiao Hu, Cynthia Rudin

    Abstract: Atrial fibrillation (AF) is the most common type of cardiac arrhythmia. It is associated with an increased risk of stroke, heart failure, and other cardiovascular complications, but can be clinically silent. Passive AF monitoring with wearables may help reduce adverse clinical outcomes related to AF. Detecting AF in noisy wearable data poses a significant challenge, leading to the emergence of var…

    Submitted 8 March, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

  50. arXiv:2310.08123  [pdf, other]

    cs.CL

    Who Wrote it and Why? Prompting Large-Language Models for Authorship Verification

    Authors: Chia-Yu Hung, Zhiqiang Hu, Yujia Hu, Roy Ka-Wei Lee

    Abstract: Authorship verification (AV) is a fundamental task in natural language processing (NLP) and computational linguistics, with applications in forensic analysis, plagiarism detection, and identification of deceptive content. Existing AV techniques, including traditional stylometric and deep learning approaches, face limitations in terms of data requirements and lack of explainability. To address thes…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 7 pages, 1 figure