Showing 1–50 of 977 results for author: Vivek

  1. arXiv:2407.11229  [pdf, other]

    cs.CL cs.AI cs.CV cs.HC cs.LG

    Unraveling the Truth: Do LLMs really Understand Charts? A Deep Dive into Consistency and Robustness

    Authors: Srija Mukhopadhyay, Adnan Qidwai, Aparna Garimella, Pritika Ramu, Vivek Gupta, Dan Roth

    Abstract: Chart question answering (CQA) is a crucial area of Visual Language Understanding. However, the robustness and consistency of current Visual Language Models (VLMs) in this field remain under-explored. This paper evaluates state-of-the-art VLMs on comprehensive datasets, developed specifically for this study, encompassing diverse question categories and chart formats. We investigate two key aspects…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 22 pages, 7 Tables, 3 Figures, 25 examples

  2. arXiv:2407.10380  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR

    NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models

    Authors: Pranshu Pandya, Agney S Talwarr, Vatsal Gupta, Tushar Kataria, Vivek Gupta, Dan Roth

    Abstract: Cognitive textual and visual reasoning tasks, such as puzzles, series, and analogies, demand the ability to quickly reason, decipher, and evaluate patterns both textually and spatially. While LLMs and VLMs, through extensive training on large amounts of human-curated data, have attained a high level of pseudo-human intelligence in some common sense reasoning tasks, they still struggle with more co…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 15 pages, 2 figures, 5 tables

  3. arXiv:2407.09481  [pdf]

    cs.CY cs.HC

    ChatGPT and Vaccine Hesitancy: A Comparison of English, Spanish, and French Responses Using a Validated Scale

    Authors: Saubhagya Joshi, Eunbin Ha, Yonaira Rivera, Vivek K. Singh

    Abstract: ChatGPT is a popular information system (over 1 billion visits in August 2023) that can generate natural language responses to user queries. It is important to study the quality and equity of its responses on health-related topics, such as vaccination, as they may influence public health decision-making. We use the Vaccine Hesitancy Scale (VHS) proposed by Shapiro et al.1 to measure the hesitancy…

    Submitted 6 May, 2024; originally announced July 2024.

    Comments: 11 pages. Appeared in the Proceedings of the AMIA Informatics Summit, 2024

  4. arXiv:2407.08349  [pdf]

    cs.CV

    Spine Vision X-Ray Image based GUI Planning of Pedicle Screws Using Enhanced YOLOv5 for Vertebrae Segmentation

    Authors: Yashwanth Rao, Gaurisankar S, Durga R, Aparna Purayath, Vivek Maik, Manojkumar Lakshmanan, Mohanasankar Sivaprakasm

    Abstract: In this paper, we propose an innovative Graphical User Interface (GUI) aimed at improving preoperative planning and intra-operative guidance for precise spinal screw placement through vertebrae segmentation. The methodology encompasses both front-end and back-end computations. The front end comprises a GUI that allows surgeons to precisely adjust the placement of screws on X-Ray images, thereby im…

    Submitted 11 July, 2024; originally announced July 2024.

  5. arXiv:2407.08347  [pdf]

    eess.IV cs.CV

    GUI-based Pedicle Screw Planning on Fluoroscopic Images Utilizing Vertebral Segmentation

    Authors: Vivek Maik, Aparna Purayath, Durga R, Manojkumar Lakshmanan, Mohanasankar Sivaprakasm

    Abstract: The proposed work establishes a novel Graphical User Interface (GUI) framework, primarily designed for intraoperative pedicle screw planning. Current planning workflow in Image Guided Surgeries primarily relies on pre-operative CT planning. Intraoperative CT planning can be time-consuming and expensive and thus is not a common practice. In situations where efficiency and cost-effectiveness are par…

    Submitted 11 July, 2024; originally announced July 2024.

  6. arXiv:2407.05952  [pdf, other]

    cs.DB cs.AI cs.CL cs.LG

    H-STAR: LLM-driven Hybrid SQL-Text Adaptive Reasoning on Tables

    Authors: Nikhil Abhyankar, Vivek Gupta, Dan Roth, Chandan K. Reddy

    Abstract: Tabular reasoning involves interpreting unstructured queries against structured tables, requiring a synthesis of textual understanding and symbolic reasoning. Existing methods rely on either of the approaches and are constrained by their respective limitations. Textual reasoning excels in semantic interpretation unlike symbolic reasoning (SQL logic), but falls short in mathematical reasoning where…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 13 pages, 14 tables, 9 figures

  7. arXiv:2407.04965  [pdf, other]

    cs.CL

    Beyond Perplexity: Multi-dimensional Safety Evaluation of LLM Compression

    Authors: Zhichao Xu, Ashim Gupta, Tao Li, Oliver Bentham, Vivek Srikumar

    Abstract: Large language models (LLMs) are increasingly deployed in real-world scenarios with the help of recent model compression techniques. Such momentum towards local deployment means the use of compressed LLMs will widely impact a large population. However, prior analysis works often prioritize on preserving perplexity which is a direct analogy to training loss. The impact of compression method on othe…

    Submitted 10 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

  8. arXiv:2407.01578  [pdf]

    cs.RO eess.IV eess.SY

    A Hybrid-Layered System for Image-Guided Navigation and Robot Assisted Spine Surgeries

    Authors: Suhail Ansari T, Vivek Maik, Minhas Naheem, Keerthi Ram, Manojkumar Lakshmanan, Mohanasankar Sivaprakasam

    Abstract: In response to the growing demand for precise and affordable solutions for Image-Guided Spine Surgery (IGSS), this paper presents a comprehensive development of a Robot-Assisted and Navigation-Guided IGSS System. The endeavor involves integrating cutting-edge technologies to attain the required surgical precision and limit user radiation exposure, thereby addressing the limitations of manual surgi…

    Submitted 7 June, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2406.04644

  9. arXiv:2407.00774  [pdf, other]

    quant-ph cs.LG

    Advantages of quantum support vector machine in cross-domain classification of quantum states

    Authors: Diksha Sharma, Vivek Balasaheb Sabale, Parvinder Singh, Atul Kumar

    Abstract: In this study, we use cross-domain classification using quantum machine learning for quantum advantages to address the entanglement versus separability paradigm. We further demonstrate the efficient classification of Bell diagonal states into zero and non-zero discord classes. The inherited structure of quantum states and its relation with a particular class of quantum states are exploited to intu…

    Submitted 30 June, 2024; originally announced July 2024.

  10. arXiv:2406.19237  [pdf, other]

    cs.CL cs.CV cs.IR cs.LG

    FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts

    Authors: Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth

    Abstract: Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark aimed at assessing the capabilities of visual question-answering multimodal language models in reasoning with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart im…

    Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted in ACL 2024 (Findings), 21 pages, 7 figures, 9 Tables

  11. arXiv:2406.18679  [pdf, other]

    eess.AS cs.AI cs.CL cs.LG

    Speakers Unembedded: Embedding-free Approach to Long-form Neural Diarization

    Authors: Xiang Li, Vivek Govindan, Rohit Paturi, Sundararajan Srinivasan

    Abstract: End-to-end neural diarization (EEND) models offer significant improvements over traditional embedding-based Speaker Diarization (SD) approaches but fall short on generalizing to long-form audio with a large number of speakers. The EEND-vector-clustering method mitigates this by combining local EEND with global clustering of speaker embeddings from local windows, but this requires an additional speaker…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted at INTERSPEECH 2024

  12. arXiv:2406.18675  [pdf, other]

    cs.HC cs.AI cs.CL

    Human-AI Collaborative Taxonomy Construction: A Case Study in Profession-Specific Writing Assistants

    Authors: Minhwa Lee, Zae Myung Kim, Vivek Khetan, Dongyeop Kang

    Abstract: Large Language Models (LLMs) have assisted humans in several writing tasks, including text revision and story generation. However, their effectiveness in supporting domain-specific writing, particularly in business contexts, is relatively less explored. Our formative study with industry professionals revealed the limitations in current LLMs' understanding of the nuances in such domain-specific wri…

    Submitted 15 July, 2024; v1 submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted to CHI 2024 In2Writing Workshop

  13. arXiv:2406.17098  [pdf, other]

    cs.LG cs.AI

    Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making

    Authors: Vivek Myers, Chongyi Zheng, Anca Dragan, Sergey Levine, Benjamin Eysenbach

    Abstract: Temporal distances lie at the heart of many algorithms for planning, control, and reinforcement learning that involve reaching goals, allowing one to estimate the transit time between two states. However, prior attempts to define such temporal distances in stochastic settings have been stymied by an important limitation: these prior approaches do not satisfy the triangle inequality. This is not me…

    Submitted 24 June, 2024; originally announced June 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  14. arXiv:2406.15053  [pdf, other]

    cs.CL

    PARIKSHA : A Large-Scale Investigation of Human-LLM Evaluator Agreement on Multilingual and Multi-Cultural Data

    Authors: Ishaan Watts, Varun Gumma, Aditya Yadavalli, Vivek Seshadri, Manohar Swaminathan, Sunayana Sitaram

    Abstract: Evaluation of multilingual Large Language Models (LLMs) is challenging due to a variety of factors -- the lack of benchmarks with sufficient linguistic diversity, contamination of popular benchmarks into LLM pre-training data and the lack of local, cultural nuances in translated benchmarks. In this work, we study human and LLM-based evaluation in a multilingual, multi-cultural setting. We evaluate…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Work in progress

  15. arXiv:2406.13127  [pdf, other]

    cs.AI

    Oralytics Reinforcement Learning Algorithm

    Authors: Anna L. Trella, Kelly W. Zhang, Stephanie M. Carpenter, David Elashoff, Zara M. Greer, Inbal Nahum-Shani, Dennis Ruenger, Vivek Shetty, Susan A. Murphy

    Abstract: Dental disease is still one of the most common chronic diseases in the United States. While dental disease is preventable through healthy oral self-care behaviors (OSCB), this basic behavior is not consistently practiced. We have developed Oralytics, an online, reinforcement learning (RL) algorithm that optimizes the delivery of personalized intervention prompts to improve OSCB. In this paper, we…

    Submitted 18 June, 2024; originally announced June 2024.

  16. arXiv:2406.11307  [pdf, other]

    cs.CL

    An Empirical Investigation of Matrix Factorization Methods for Pre-trained Transformers

    Authors: Ashim Gupta, Sina Mahdipour Saravani, P. Sadayappan, Vivek Srikumar

    Abstract: The increasing size of transformer-based models in NLP makes the question of compressing them important. In this work, we present a comprehensive analysis of factorization based model compression techniques. Specifically, we focus on comparing straightforward low-rank factorization against the recently introduced Monarch factorization, which exhibits impressive performance preservation on the GLUE…

    Submitted 17 June, 2024; originally announced June 2024.

  17. arXiv:2406.10085  [pdf, other]

    cs.CL

    Enhancing Question Answering on Charts Through Effective Pre-training Tasks

    Authors: Ashim Gupta, Vivek Gupta, Shuo Zhang, Yujie He, Ning Zhang, Shalin Shah

    Abstract: To completely understand a document, the use of textual information is not enough. Understanding visual cues, such as layouts and charts, is also required. While the current state-of-the-art approaches for document understanding (both OCR-based and OCR-free) work well, a thorough analysis of their capabilities and limitations has not yet been performed. Therefore, in this work, we address the li…

    Submitted 14 June, 2024; originally announced June 2024.

  18. arXiv:2406.07738  [pdf, other]

    cs.CV

    On the Application of Egocentric Computer Vision to Industrial Scenarios

    Authors: Vivek Chavan, Oliver Heimann, Jörg Krüger

    Abstract: Egocentric vision aims to capture and analyse the world from the first-person perspective. We explore the possibilities for egocentric wearable devices to improve and enhance industrial use cases w.r.t. data collection, annotation, labelling and downstream applications. This would contribute to easier data collection and allow users to provide additional context. We envision that this approach cou…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: To be presented at the First Joint Egocentric Vision (EgoVis) Workshop, held in conjunction with CVPR 2024

  19. arXiv:2406.06714  [pdf, other]

    cs.LG cs.AI cs.HC

    Coprocessor Actor Critic: A Model-Based Reinforcement Learning Approach For Adaptive Brain Stimulation

    Authors: Michelle Pan, Mariah Schrum, Vivek Myers, Erdem Bıyık, Anca Dragan

    Abstract: Adaptive brain stimulation can treat neurological conditions such as Parkinson's disease and post-stroke motor deficits by influencing abnormal neural activity. Because of patient heterogeneity, each patient requires a unique stimulation policy to achieve optimal neural responses. Model-free reinforcement learning (MFRL) holds promise in learning effective policies for a variety of similar control…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning (ICML 2024)

  20. arXiv:2406.06371  [pdf, other]

    cs.CL cs.SD eess.AS

    mHuBERT-147: A Compact Multilingual HuBERT Model

    Authors: Marcely Zanon Boito, Vivek Iyer, Nikolaos Lagos, Laurent Besacier, Ioan Calapodescu

    Abstract: We present mHuBERT-147, the first general-purpose massively multilingual HuBERT speech representation model trained on 90K hours of clean, open-license data. To scale up the multi-iteration HuBERT approach, we use faiss-based clustering, achieving 5.2x faster label assignment than the original method. We also apply a new multilingual batching up-sampling strategy, leveraging both language and data…

    Submitted 27 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Extended version of the Interspeech 2024 paper of same name

  21. arXiv:2406.06316  [pdf, other]

    cs.CL cs.AI cs.CE cs.LG

    Tx-LLM: A Large Language Model for Therapeutics

    Authors: Juan Manuel Zambrano Chaves, Eric Wang, Tao Tu, Eeshit Dhaval Vaishnav, Byron Lee, S. Sara Mahdavi, Christopher Semturs, David Fleet, Vivek Natarajan, Shekoofeh Azizi

    Abstract: Developing therapeutics is a lengthy and expensive process that requires the satisfaction of many different criteria, and AI models capable of expediting the process would be invaluable. However, the majority of current AI approaches address only a narrowly defined set of tasks, often circumscribed within a particular domain. To bridge this gap, we introduce Tx-LLM, a generalist large language mod…

    Submitted 10 June, 2024; originally announced June 2024.

  22. arXiv:2406.05184  [pdf, other]

    cs.CV

    The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better

    Authors: Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh, Ranjay Krishna

    Abstract: Generative text-to-image models enable us to synthesize unlimited amounts of images in a controllable manner, spurring many recent efforts to train vision models with synthetic data. However, every synthetic image ultimately originates from the upstream data used to train the generator. What additional value does the intermediate generator provide over directly training on relevant parts of the up…

    Submitted 3 July, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

    Comments: Correspondence to sgeng at cs dot washington dot edu. RK and PWK equally advised the project

  23. A Hybrid-Layered System for Image-Guided Navigation and Robot Assisted Spine Surgery

    Authors: Suhail Ansari T, Vivek Maik, Minhas Naheem, Keerthi Ram, Manojkumar Lakshmanan, Mohanasankar Sivaprakasam

    Abstract: In response to the growing demand for precise and affordable solutions for Image-Guided Spine Surgery (IGSS), this paper presents a comprehensive development of a Robot-Assisted and Navigation-Guided IGSS System. The endeavor involves integrating cutting-edge technologies to attain the required surgical precision and limit user radiation exposure, thereby addressing the limitations of manual surgi…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 6 Pages, 4 Figures, Published in IEEE SII Conference

    Journal ref: 2024 IEEE/SICE International Symposium on System Integration (SII)

  24. arXiv:2406.02749  [pdf, other]

    cs.DS

    Efficient Leverage Score Sampling for Tensor Train Decomposition

    Authors: Vivek Bharadwaj, Beheshteh T. Rakhshan, Osman Asif Malik, Guillaume Rabusseau

    Abstract: Tensor Train (TT) decomposition is widely used in the machine learning and quantum physics communities as a popular tool to efficiently compress high-dimensional tensor data. In this paper, we propose an efficient algorithm to accelerate computing the TT decomposition with the Alternating Least Squares (ALS) algorithm relying on exact leverage scores sampling. For this purpose, we propose a data s…

    Submitted 5 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

  25. arXiv:2406.02057  [pdf, other]

    cs.AI cs.LG

    Tabular and Deep Learning for the Whittle Index

    Authors: Francisco Robledo Relaño, Vivek Borkar, Urtzi Ayesta, Konstantin Avrachenkov

    Abstract: The Whittle index policy is a heuristic that has shown remarkably good performance (with guaranteed asymptotic optimality) when applied to the class of problems known as Restless Multi-Armed Bandit Problems (RMABPs). In this paper we present QWI and QWINN, two reinforcement learning algorithms, respectively tabular and deep, to learn the Whittle index for the total discounted criterion. The key fe…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2024

  26. arXiv:2406.01939  [pdf, other]

    cs.AI cs.DC cs.LG

    Speeding up Policy Simulation in Supply Chain RL

    Authors: Vivek Farias, Joren Gijsbrechts, Aryan Khojandi, Tianyi Peng, Andrew Zheng

    Abstract: Simulating a single trajectory of a dynamical system under some state-dependent policy is a core bottleneck in policy optimization algorithms. The many inherently serial policy evaluations that must be performed in a single simulation constitute the bulk of this bottleneck. To wit, in applying policy optimization to supply chain optimization (SCO) problems, simulating a single month of a supply ch…

    Submitted 3 June, 2024; originally announced June 2024.

  27. arXiv:2406.00859  [pdf, other]

    eess.IV cs.CV

    Streaming quanta sensors for online, high-performance imaging and vision

    Authors: Tianyi Zhang, Matthew Dutson, Vivek Boominathan, Mohit Gupta, Ashok Veeraraghavan

    Abstract: Recently quanta image sensors (QIS) -- ultra-fast, zero-read-noise binary image sensors -- have demonstrated remarkable imaging capabilities in many challenging scenarios. Despite their potential, the adoption of these sensors is severely hampered by (a) high data rates and (b) the need for new computational pipelines to handle the unconventional raw data. We introduce a simple, low-bandwidth comp…

    Submitted 2 June, 2024; originally announced June 2024.

  28. arXiv:2406.00529  [pdf, other]

    cs.LG cs.CV stat.ML

    On the Use of Anchoring for Training Vision Models

    Authors: Vivek Narayanaswamy, Kowshik Thopalli, Rushil Anirudh, Yamen Mubarka, Wesam Sakla, Jayaraman J. Thiagarajan

    Abstract: Anchoring is a recent, architecture-agnostic principle for training deep neural networks that has been shown to significantly improve uncertainty estimation, calibration, and extrapolation capabilities. In this paper, we systematically explore anchoring as a general protocol for training vision models, providing fundamental insights into its training and inference processes and their implications…

    Submitted 1 June, 2024; originally announced June 2024.

  29. arXiv:2405.18369  [pdf, other]

    cs.CL cs.AI cs.LG

    PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework

    Authors: Eshaan Agarwal, Vivek Dani, Tanuja Ganu, Akshay Nambi

    Abstract: Large language models (LLMs) have revolutionized AI across diverse domains, showcasing remarkable capabilities. Central to their success is the concept of prompting, which guides model output generation. However, manual prompt engineering is labor-intensive and domain-specific, necessitating automated solutions. This paper introduces PromptWizard, a novel framework leveraging LLMs to iteratively s…

    Submitted 28 May, 2024; originally announced May 2024.

    Report number: MSR-TR-VeLLM-02

  30. arXiv:2405.15585  [pdf, other]

    cs.CL

    Synergizing In-context Learning with Hints for End-to-end Task-oriented Dialog Systems

    Authors: Vishal Vivek Saley, Rocktim Jyoti Das, Dinesh Raghu, Mausam

    Abstract: End-to-end Task-Oriented Dialog (TOD) systems typically require extensive training datasets to perform well. In contrast, large language model (LLM) based TOD systems can excel even with limited data due to their ability to learn tasks through in-context exemplars. However, these models lack alignment with the style of responses in training data and often generate comprehensive responses, making i…

    Submitted 3 July, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  31. arXiv:2405.07518  [pdf, other]

    cs.AR cs.AI

    SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts

    Authors: Raghu Prabhakar, Ram Sivaramakrishnan, Darshan Gandhi, Yun Du, Mingran Wang, Xiangyu Song, Kejie Zhang, Tianren Gao, Angela Wang, Karen Li, Yongning Sheng, Joshua Brot, Denis Sokolov, Apurv Vivek, Calvin Leung, Arjun Sabnis, Jiayu Bai, Tuowen Zhao, Mark Gottscho, David Jackson, Mark Luttrell, Manish K. Shah, Edison Chen, Kaizhao Liang, Swayambhoo Jain, et al. (5 additional authors not shown)

    Abstract: Monolithic large language models (LLMs) like GPT-4 have paved the way for modern generative AI applications. Training, serving, and maintaining monolithic LLMs at scale, however, remains prohibitively expensive and challenging. The disproportionate increase in compute-to-memory ratio of modern AI accelerators has created a memory wall, necessitating new methods to deploy AI. Composition of Expert…

    Submitted 13 May, 2024; originally announced May 2024.

  32. arXiv:2405.06346  [pdf, other]

    cs.CL

    Akal Badi ya Bias: An Exploratory Study of Gender Bias in Hindi Language Technology

    Authors: Rishav Hada, Safiya Husain, Varun Gumma, Harshita Diddee, Aditya Yadavalli, Agrima Seth, Nidhi Kulkarni, Ujwal Gadiraju, Aditya Vashistha, Vivek Seshadri, Kalika Bali

    Abstract: Existing research in measuring and mitigating gender bias predominantly centers on English, overlooking the intricate challenges posed by non-English languages and the Global South. This paper presents the first comprehensive study delving into the nuanced landscape of gender bias in Hindi, the third most spoken language globally. Our study employs diverse mining techniques, computational models,…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: Accepted to FAccT 2024

  33. arXiv:2405.02770  [pdf, other]

    cs.LG

    PhilHumans: Benchmarking Machine Learning for Personal Health

    Authors: Vadim Liventsev, Vivek Kumar, Allmin Pradhap Singh Susaiyah, Zixiu Wu, Ivan Rodin, Asfand Yaar, Simone Balloccu, Marharyta Beraziuk, Sebastiano Battiato, Giovanni Maria Farinella, Aki Härmä, Rim Helaoui, Milan Petkovic, Diego Reforgiato Recupero, Ehud Reiter, Daniele Riboni, Raymond Sterling

    Abstract: The use of machine learning in Healthcare has the potential to improve patient outcomes as well as broaden the reach and affordability of Healthcare. The history of other application areas indicates that strong benchmarks are essential for the development of intelligent systems. We present Personal Health Interfaces Leveraging HUman-MAchine Natural interactions (PhilHumans), a holistic suite of be…

    Submitted 16 May, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

  34. arXiv:2405.02417  [pdf, other]

    cs.RO

    Hierarchies define the scalability of robot swarms

    Authors: Vivek Shankar Varadharajan, Karthik Soma, Sepand Dyanatkar, Pierre-Yves Lajoie, Giovanni Beltrame

    Abstract: The emerging behaviors of swarms have fascinated scientists and gathered significant interest in the field of robotics. Traditionally, swarms are viewed as egalitarian, with robots sharing identical roles and capabilities. However, recent findings highlight the importance of hierarchy for deploying robot swarms more effectively in diverse scenarios. Despite nature's preference for hierarchies, the…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 31 Pages, 7 Figures. Supplementary material attached to the paper

  35. arXiv:2405.01409  [pdf, other]

    cs.CV cs.AI

    Goal-conditioned reinforcement learning for ultrasound navigation guidance

    Authors: Abdoul Aziz Amadou, Vivek Singh, Florin C. Ghesu, Young-Ho Kim, Laura Stanciulescu, Harshitha P. Sai, Puneet Sharma, Alistair Young, Ronak Rajani, Kawal Rhode

    Abstract: Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance me…

    Submitted 22 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 11 pages, 3 figures

    ACM Class: I.4.0; I.5.0

  36. arXiv:2404.18880  [pdf, ps, other]

    cs.CL

    Spivavtor: An Instruction Tuned Ukrainian Text Editing Model

    Authors: Aman Saini, Artem Chernodub, Vipul Raheja, Vivek Kulkarni

    Abstract: We introduce Spivavtor, a dataset, and instruction-tuned models for text editing focused on the Ukrainian language. Spivavtor is the Ukrainian-focused adaptation of the English-only CoEdIT model. Similar to CoEdIT, Spivavtor performs text editing tasks by following instructions in Ukrainian. This paper describes the details of the Spivavtor-Instruct dataset and Spivavtor models. We evaluate Spivav…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted to UNLP Workshop 2024

  37. arXiv:2404.18416  [pdf, other]

    cs.AI cs.CL cs.CV cs.LG

    Capabilities of Gemini Models in Medicine

    Authors: Khaled Saab, Tao Tu, Wei-Hung Weng, Ryutaro Tanno, David Stutz, Ellery Wulczyn, Fan Zhang, Tim Strother, Chunjong Park, Elahe Vedadi, Juanma Zambrano Chaves, Szu-Yeu Hu, Mike Schaekermann, Aishwarya Kamath, Yong Cheng, David G. T. Barrett, Cathy Cheung, Basil Mustafa, Anil Palepu, Daniel McDuff, Le Hou, Tomer Golany, Luyang Liu, Jean-baptiste Alayrac, Neil Houlsby, et al. (42 additional authors not shown)

    Abstract: Excellence in a wide variety of medical applications poses considerable challenges for AI, requiring advanced reasoning, access to up-to-date medical knowledge and understanding of complex multimodal data. Gemini models, with strong general capabilities in multimodal and long-context reasoning, offer exciting possibilities in medicine. Building on these core strengths of Gemini, we introduce Med-G…

    Submitted 1 May, 2024; v1 submitted 29 April, 2024; originally announced April 2024.

  38. arXiv:2404.15894  [pdf, other]

    cs.CL cs.AI

    Assessing The Potential Of Mid-Sized Language Models For Clinical QA

    Authors: Elliot Bolton, Betty Xiong, Vijaytha Muralidharan, Joel Schamroth, Vivek Muralidharan, Christopher D. Manning, Roxana Daneshjou

    Abstract: Large language models, such as GPT-4 and Med-PaLM, have shown impressive performance on clinical tasks; however, they require access to compute, are closed-source, and cannot be deployed on device. Mid-size models such as BioGPT-large, BioMedLM, LLaMA 2, and Mistral 7B avoid these drawbacks, but their capacity for clinical tasks has been understudied. To help assess their potential for clinical us…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 25 pages, 8 figures

  39. arXiv:2404.15774  [pdf, other]

    cs.CV cs.AI eess.IV

    Toward Physics-Aware Deep Learning Architectures for LiDAR Intensity Simulation

    Authors: Vivek Anand, Bharat Lohani, Gaurav Pandey, Rakesh Mishra

    Abstract: Autonomous vehicles (AVs) heavily rely on LiDAR perception for environment understanding and navigation. LiDAR intensity provides valuable information about the reflected laser signals and plays a crucial role in enhancing the perception capabilities of AVs. However, accurately simulating LiDAR intensity remains a challenge due to the unavailability of material properties of the objects in the env…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: 7 pages, 7 figures

  40. arXiv:2404.14457  [pdf]

    cs.LG

    Graph Coloring Using Heat Diffusion

    Authors: Vivek Chaudhary

    Abstract: Graph coloring is a problem with varied applications in industry and science such as scheduling, resource allocation, and circuit design. The purpose of this paper is to establish if a new gradient based iterative solver framework known as heat diffusion can solve the graph coloring problem. We propose a solution to the graph coloring problem using the heat diffusion framework. We compare the solu…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 5 Pages, 3 Figures

    MSC Class: 05

  41. arXiv:2404.09493  [pdf, ps, other]

    eess.SP cs.HC cs.NE

    Novel entropy difference-based EEG channel selection technique for automated detection of ADHD

    Authors: Shishir Maheshwari, Kandala N V P S Rajesh, Vivek Kanhangad, U Rajendra Acharya, T Sunil Kumar

    Abstract: Attention deficit hyperactivity disorder (ADHD) is one of the common neurodevelopmental disorders in children. This paper presents an automated approach for ADHD detection using the proposed entropy difference (EnD)-based encephalogram (EEG) channel selection approach. In the proposed approach, we selected the most significant EEG channels for the accurate identification of ADHD using an EnD-base…

    Submitted 15 April, 2024; originally announced April 2024.

  42. arXiv:2404.08854  [pdf, other]

    cs.RO

    gnss_lib_py: Analyzing GNSS Data with Python

    Authors: Derek Knowles, Ashwin Vivek Kanhere, Daniel Neamati, Grace Gao

    Abstract: This paper presents gnss_lib_py, a Python library used to parse, analyze, and visualize data from a variety of GNSS (Global Navigation Satellite Systems) data sources. The gnss_lib_py library's ease of use, modular capabilities, testing coverage, and extensive documentation make it an attractive tool not only for scientific and industry users wanting a quick, out-of-the-box solution but also for a…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: Submitted to the SoftwareX journal

  43. arXiv:2404.07926  [pdf, ps, other]

    cs.HC cs.AI

    Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation

    Authors: Jinkyung Park, Pamela Wisniewski, Vivek Singh

    Abstract: In this position paper, we discuss the potential for leveraging LLMs as interactive research tools to facilitate collaboration between human coders and AI to effectively annotate online risk data at scale. Collaborative human-AI labeling is a promising approach to annotating large-scale and complex data for various tasks. Yet, tools and methods to support effective human-AI collaboration for data…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: This paper has been peer-reviewed and presented at the "CHI 2024 Workshop on LLMs as Research Tools: Applications and Evaluations in HCI Data Work, May 12, 2024, Honolulu, HI, USA."

  44. arXiv:2404.07795  [pdf, other]

    cs.RO

    From the Lab to the Theater: An Unconventional Field Robotics Journey

    Authors: Ali Imran, Vivek Shankar Varadharajan, Rafael Gomes Braga, Yann Bouteiller, Abdalwhab Bakheet Mohamed Abdalwhab, Matthis Di-Giacomo, Alexandra Mercader, Giovanni Beltrame, David St-Onge

    Abstract: Artistic performances involving robotic systems present unique technical challenges akin to those encountered in other field deployments. In this paper, we delve into the orchestration of robotic artistic performances, focusing on the complexities inherent in communication protocols and localization methods. Through our case studies and experimental insights, we demonstrate the breadth of technica…

    Submitted 20 April, 2024; v1 submitted 11 April, 2024; originally announced April 2024.

  45. arXiv:2404.05501  [pdf]

    q-bio.NC cs.AI cs.LG

    Data Science In Olfaction

    Authors: Vivek Agarwal, Joshua Harvey, Dmitry Rinberg, Vasant Dhar

    Abstract: Advances in neural sensing technology are making it possible to observe the olfactory process in great detail. In this paper, we conceptualize smell from a Data Science and AI perspective that relates the properties of odorants to how they are sensed and analyzed in the olfactory system from the nose to the brain. Drawing distinctions to color vision, we argue that smell presents unique measureme…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 20 pages, 10 Figures, 2 Appendix, 1 Table

  46. arXiv:2404.03023  [pdf, ps, other]

    cs.HC cs.AI

    Toward Safe Evolution of Artificial Intelligence (AI) based Conversational Agents to Support Adolescent Mental and Sexual Health Knowledge Discovery

    Authors: Jinkyung Park, Vivek Singh, Pamela Wisniewski

    Abstract: Following the recent release of various Artificial Intelligence (AI) based Conversational Agents (CAs), adolescents are increasingly using CAs for interactive knowledge discovery on sensitive topics, including mental and sexual health topics. Exploring such sensitive topics through online search has been an essential part of adolescent development, and CAs can support their knowledge discovery on su…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: This paper has been peer-reviewed and presented at the "CHI 2024 Workshop on Child-centred AI Design, May 11, 2024, Honolulu, HI, USA."

  47. arXiv:2404.00458  [pdf, other]

    cs.CL cs.IR

    Beyond One-Size-Fits-All: Multi-Domain, Multi-Task Framework for Embedding Model Selection

    Authors: Vivek Khetan

    Abstract: This position paper proposes a systematic approach towards developing a framework to help select the most effective embedding models for natural language processing (NLP) tasks, addressing the challenge posed by the proliferation of both proprietary and open-source encoder models.

    Submitted 30 March, 2024; originally announced April 2024.

  48. arXiv:2403.17615  [pdf, other]

    eess.IV cs.CV q-bio.QM

    Grad-CAMO: Learning Interpretable Single-Cell Morphological Profiles from 3D Cell Painting Images

    Authors: Vivek Gopalakrishnan, Jingzhe Ma, Zhiyong Xie

    Abstract: Despite their black-box nature, deep learning models are extensively used in image-based drug discovery to extract feature vectors from single cells in microscopy images. To better understand how these networks perform representation learning, we employ visual explainability techniques (e.g., Grad-CAM). Our analyses reveal several mechanisms by which supervised models cheat, exploiting biologicall…

    Submitted 26 March, 2024; originally announced March 2024.

  49. arXiv:2403.17016  [pdf, other]

    cs.CV cs.LG physics.ao-ph

    HEAL-ViT: Vision Transformers on a spherical mesh for medium-range weather forecasting

    Authors: Vivek Ramavajjala

    Abstract: In recent years, a variety of ML architectures and techniques have seen success in producing skillful medium-range weather forecasts. In particular, Vision Transformer (ViT)-based models (e.g., Pangu-Weather, FuXi) have shown strong performance, working nearly "out-of-the-box" by treating weather data as a multi-channel image on a rectilinear grid. While a rectilinear grid is appropriate for 2D ima…

    Submitted 14 February, 2024; originally announced March 2024.

    Comments: 18 pages, 14 figures, preprint

  50. A Hybrid Transformer-Sequencer approach for Age and Gender classification from in-wild facial images

    Authors: Aakash Singh, Vivek Kumar Singh

    Abstract: Advancements in computer vision and image processing techniques have led to the emergence of new applications in domains such as visual surveillance, targeted advertisement, content-based searching, and human-computer interaction. Among the various techniques in computer vision, face analysis in particular has gained much attention. Several previous studies have tried to explore different appl…

    Submitted 20 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

    Comments: 22 pages

    Journal ref: Neural Computing and Applications. 2024 Jan;36(3):1149-65