Showing 1–50 of 843 results for author: Singh, S

  1. arXiv:2407.07775  [pdf, other]

    cs.RO cs.AI

    Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs

    Authors: Hao-Tien Lewis Chiang, Zhuo Xu, Zipeng Fu, Mithun George Jacob, Tingnan Zhang, Tsang-Wei Edward Lee, Wenhao Yu, Connor Schenck, David Rendleman, Dhruv Shah, Fei Xia, Jasmine Hsu, Jonathan Hoech, Pete Florence, Sean Kirmani, Sumeet Singh, Vikas Sindhwani, Carolina Parada, Chelsea Finn, Peng Xu, Sergey Levine, Jie Tan

    Abstract: An elusive goal in navigation research is to build an intelligent agent that can understand multimodal instructions including natural language and image, and perform useful navigation. To achieve this, we study a widely useful category of navigation tasks we call Multimodal Instruction Navigation with demonstration Tours (MINT), in which the environment prior is provided through a previously recor…

    Submitted 12 July, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

  2. arXiv:2407.06727  [pdf, other]

    eess.IV cs.CV

    Towards Physics-informed Cyclic Adversarial Multi-PSF Lensless Imaging

    Authors: Abeer Banerjee, Sanjay Singh

    Abstract: Lensless imaging has emerged as a promising field within inverse imaging, offering compact, cost-effective solutions with the potential to revolutionize the computational camera market. By circumventing traditional optical components like lenses and mirrors, novel approaches like mask-based lensless imaging eliminate the need for conventional hardware. However, advancements in lensless image recon…

    Submitted 9 July, 2024; originally announced July 2024.

  3. arXiv:2407.05887  [pdf, other]

    cs.CL cs.AI cs.LG

    Generation and De-Identification of Indian Clinical Discharge Summaries using LLMs

    Authors: Sanjeet Singh, Shreya Gupta, Niralee Gupta, Naimish Sharma, Lokesh Srivastava, Vibhu Agarwal, Ashutosh Modi

    Abstract: The consequences of a healthcare data breach can be devastating for the patients, providers, and payers. The average financial impact of a data breach in recent months has been estimated to be close to USD 10 million. This is especially significant for healthcare organizations in India that are managing rapid digitization while still establishing data governance procedures that align with the lett…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted at BioNLP Workshop at ACL 2024; 21 pages (9 pages main content)

  4. arXiv:2407.03941  [pdf, other]

    cs.SE cs.AI cs.CL

    Narrow Transformer: Starcoder-Based Java-LM For Desktop

    Authors: Kamalkumar Rathinasamy, Balaji A J, Ankush Kumar, Gagan Gayari, Harshini K, Rajab Ali Mondal, Sreenivasa Raghavan K S, Swayam Singh

    Abstract: This paper presents NT-Java-1.1B, an open-source specialized code language model built on StarCoderBase-1.1B, designed for coding tasks in Java programming. NT-Java-1.1B achieves state-of-the-art performance, surpassing its base model and majority of other models of similar size on MultiPL-E Java code benchmark. While there have been studies on extending large, generic pre-trained models to improv…

    Submitted 4 July, 2024; originally announced July 2024.

    ACM Class: I.2.7

  5. arXiv:2407.03843  [pdf, other]

    cs.ET

    Resistive Memory for Computing and Security: Algorithms, Architectures, and Platforms

    Authors: Simranjeet Singh, Farhad Merchant, Sachin Patkar

    Abstract: Resistive random-access memory (RRAM) is gaining popularity due to its ability to offer computing within the memory and its non-volatile nature. The unique properties of RRAM, such as binary switching, multi-state switching, and device variations, can be leveraged to design novel techniques and algorithms. This thesis proposes a technique for utilizing RRAM devices in three major directions: i) di…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted as PhD Forum at VLSI-SoC 2024

  6. arXiv:2407.02921  [pdf, other]

    cs.ET

    In-Memory Mirroring: Cloning Without Reading

    Authors: Simranjeet Singh, Ankit Bende, Chandan Kumar Jha, Vikas Rana, Rolf Drechsler, Sachin Patkar, Farhad Merchant

    Abstract: In-memory computing (IMC) has gained significant attention recently as it attempts to reduce the impact of memory bottlenecks. Numerous schemes for digital IMC are presented in the literature, focusing on logic operations. Often, an application's description has data dependencies that must be resolved. Contemporary IMC architectures perform read followed by write operations for this purpose, which…

    Submitted 4 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted in IFIP/IEEE VLSI-SoC 2024

  7. arXiv:2406.19888  [pdf, other]

    cs.AI

    Fine-tuning of Geospatial Foundation Models for Aboveground Biomass Estimation

    Authors: Michal Muszynski, Levente Klein, Ademir Ferreira da Silva, Anjani Prasad Atluri, Carlos Gomes, Daniela Szwarcman, Gurkanwar Singh, Kewen Gu, Maciel Zortea, Naomi Simumba, Paolo Fraccaro, Shraddha Singh, Steve Meliksetian, Campbell Watson, Daiki Kimura, Harini Srinivasan

    Abstract: Global vegetation structure mapping is critical for understanding the global carbon cycle and maximizing the efficacy of nature-based carbon sequestration initiatives. Moreover, vegetation structure mapping can help reduce the impacts of climate change by, for example, guiding actions to improve water security, increase biodiversity and reduce flood risk. Global satellite measurements provide an i…

    Submitted 28 June, 2024; originally announced June 2024.

  8. arXiv:2406.19800  [pdf, other]

    cs.LG cs.RO

    Modeling the Real World with High-Density Visual Particle Dynamics

    Authors: William F. Whitney, Jacob Varley, Deepali Jain, Krzysztof Choromanski, Sumeet Singh, Vikas Sindhwani

    Abstract: We present High-Density Visual Particle Dynamics (HD-VPD), a learned world model that can emulate the physical dynamics of real scenes by processing massive latent point clouds containing 100K+ particles. To enable efficiency at this scale, we introduce a novel family of Point Cloud Transformers (PCTs) called Interlacers leveraging intertwined linear-attention Performer layers and graph-based neig…

    Submitted 28 June, 2024; originally announced June 2024.

  9. arXiv:2406.19237  [pdf, other]

    cs.CL cs.CV cs.IR cs.LG

    FlowVQA: Mapping Multimodal Logic in Visual Question Answering with Flowcharts

    Authors: Shubhankar Singh, Purvi Chaurasia, Yerram Varun, Pranshu Pandya, Vatsal Gupta, Vivek Gupta, Dan Roth

    Abstract: Existing benchmarks for visual question answering lack in visual grounding and complexity, particularly in evaluating spatial reasoning skills. We introduce FlowVQA, a novel benchmark aimed at assessing the capabilities of visual question-answering multimodal language models in reasoning with flowcharts as visual contexts. FlowVQA comprises 2,272 carefully generated and human-verified flowchart im…

    Submitted 28 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: Accepted in ACL 2024 (Findings), 21 pages, 7 figures, 9 Tables

  10. arXiv:2406.18899  [pdf, other]

    cs.RO cs.AI

    Autonomous Control of a Novel Closed Chain Five Bar Active Suspension via Deep Reinforcement Learning

    Authors: Nishesh Singh, Sidharth Ramesh, Abhishek Shankar, Jyotishka Duttagupta, Leander Stephen D'Souza, Sanjay Singh

    Abstract: Planetary exploration requires traversal in environments with rugged terrains. In addition, Mars rovers and other planetary exploration robots often carry sensitive scientific experiments and components onboard, which must be protected from mechanical harm. This paper deals with an active suspension system focused on chassis stabilisation and an efficient traversal method while encountering unavoi…

    Submitted 4 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 15 pages, 11 figures

    ACM Class: I.2.9

  11. arXiv:2406.16300  [pdf, other]

    cs.LG

    Landscaping Linear Mode Connectivity

    Authors: Sidak Pal Singh, Linara Adilova, Michael Kamp, Asja Fischer, Bernhard Schölkopf, Thomas Hofmann

    Abstract: The presence of linear paths in parameter space between two different network solutions in certain cases, i.e., linear mode connectivity (LMC), has garnered interest from both theoretical and practical fronts. There has been significant research that either practically designs algorithms catered for connecting networks by adjusting for the permutation symmetries as well as some others that more th…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: ICML 2024 HiLD workshop paper

  12. arXiv:2406.12056  [pdf, other]

    cs.LG q-bio.QM

    Learning Molecular Representation in a Cell

    Authors: Gang Liu, Srijit Seal, John Arevalo, Zhenwen Liang, Anne E. Carpenter, Meng Jiang, Shantanu Singh

    Abstract: Predicting drug efficacy and safety in vivo requires information on biological responses (e.g., cell morphology and gene expression) to small molecule perturbations. However, current molecular representation learning methods do not provide a comprehensive view of cell states under these perturbations and struggle to remove noise, hindering model generalization. We introduce the Information Alignme…

    Submitted 22 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: 21 pages, 8 tables, 7 figures

  13. arXiv:2406.10886  [pdf, other]

    cs.CL cs.LG

    Distilling Opinions at Scale: Incremental Opinion Summarization using XL-OPSUMM

    Authors: Sri Raghava Muddu, Rupasai Rangaraju, Tejpalsingh Siledar, Swaroop Nath, Pushpak Bhattacharyya, Swaprava Nath, Suman Banerjee, Amey Patil, Muthusamy Chelliah, Sudhanshu Shekhar Singh, Nikesh Garera

    Abstract: Opinion summarization in e-commerce encapsulates the collective views of numerous users about a product based on their reviews. Typically, a product on an e-commerce platform has thousands of reviews, each review comprising around 10-15 words. While Large Language Models (LLMs) have shown proficiency in summarization tasks, they struggle to handle such a large volume of reviews due to context limi…

    Submitted 16 June, 2024; originally announced June 2024.

  14. arXiv:2406.10209  [pdf, other]

    cs.CL

    Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

    Authors: Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele, Tom Goldstein

    Abstract: Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens are excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verba…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9.5 pages, 8 figures, and 1 table in the main body. Code available at https://github.com/ahans30/goldfish-loss

  15. arXiv:2406.09000  [pdf, other]

    cs.CR

    A Passwordless MFA Utlizing Biometrics, Proximity and Contactless Communication

    Authors: Sneha Shukla, Gaurav Varshney, Shreya Singh, Swati Goel

    Abstract: Despite being more secure and strongly promoted, two-factor (2FA) or multi-factor (MFA) schemes either fail to protect against recent phishing threats such as real-time MITM, controls/relay MITM, malicious browser extension-based phishing attacks, and/or need the users to purchase and carry other hardware for additional account protection. Leveraging the unprecedented popularity of NFC and BLE-ena…

    Submitted 13 June, 2024; originally announced June 2024.

  16. arXiv:2406.08649  [pdf, other]

    cs.LG

    MOTI$\mathcal{VE}$: A Drug-Target Interaction Graph For Inductive Link Prediction

    Authors: John Arevalo, Ellen Su, Anne E Carpenter, Shantanu Singh

    Abstract: Drug-target interaction (DTI) prediction is crucial for identifying new therapeutics and detecting mechanisms of action. While structure-based methods accurately model physical interactions between a drug and its protein target, cell-based assays such as Cell Painting can better capture complex DTI interactions. This paper introduces MOTI$\mathcal{VE}$, a Morphological cOmpound Target Interaction…

    Submitted 12 June, 2024; originally announced June 2024.

  17. arXiv:2406.07887  [pdf, other]

    cs.LG cs.CL

    An Empirical Study of Mamba-based Language Models

    Authors: Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: Selective state-space models (SSMs) like Mamba overcome some of the shortcomings of Transformers, such as quadratic computational complexity with sequence length and large inference-time memory requirements from the key-value cache. Moreover, recent studies have shown that SSMs can match or exceed the language modeling capabilities of Transformers, making them an attractive alternative. In a contr…

    Submitted 12 June, 2024; originally announced June 2024.

  18. arXiv:2406.07835  [pdf, other]

    cs.CL cs.AI

    SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

    Authors: David Wadden, Kejian Shi, Jacob Morrison, Aakanksha Naik, Shruti Singh, Nitzan Barzilay, Kyle Lo, Tom Hope, Luca Soldaini, Shannon Zejiang Shen, Doug Downey, Hannaneh Hajishirzi, Arman Cohan

    Abstract: We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed t…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS Datasets and Benchmarks 2024

  19. arXiv:2406.06799  [pdf, other]

    cs.DC cs.CL

    LLM-dCache: Improving Tool-Augmented LLMs with GPT-Driven Localized Data Caching

    Authors: Simranjit Singh, Michael Fore, Andreas Karatzas, Chaehong Lee, Yanan Jian, Longfei Shangguan, Fuxun Yu, Iraklis Anagnostopoulos, Dimitrios Stamoulis

    Abstract: As Large Language Models (LLMs) broaden their capabilities to manage thousands of API calls, they are confronted with complex data operations across vast datasets with significant overhead to the underlying system. In this work, we introduce LLM-dCache to optimize data accesses by treating cache operations as callable API functions exposed to the tool-augmented agent. We grant LLMs the autonomy to…

    Submitted 10 June, 2024; originally announced June 2024.

  20. arXiv:2406.06774  [pdf, other]

    eess.AS cs.SD

    ComFeAT: Combination of Neural and Spectral Features for Improved Depression Detection

    Authors: Orchid Chetia Phukan, Sarthak Jain, Shubham Singh, Muskaan Singh, Arun Balaji Buduru, Rajesh Sharma

    Abstract: In this work, we focus on the detection of depression through speech analysis. Previous research has widely explored features extracted from pre-trained models (PTMs) primarily trained for paralinguistic tasks. Although these features have led to sufficient advances in speech-based depression detection, their performance declines in real-world settings. To address this, in this paper, we introduce…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: Accepted to INTERSPEECH 2024 Show & Tell Demonstrations

  21. arXiv:2406.02542  [pdf, other]

    cs.LG

    Loki: Low-Rank Keys for Efficient Sparse Attention

    Authors: Prajwal Singhania, Siddharth Singh, Shwai He, Soheil Feizi, Abhinav Bhatele

    Abstract: Inference on large language models can be expensive in terms of the compute and memory costs involved, especially when long sequence lengths are used. In particular, the self-attention mechanism used in such models contributes significantly to these costs, which has resulted in several recent works that propose sparse attention approximations for inference. In this work, we propose to approximate…

    Submitted 4 June, 2024; originally announced June 2024.

  22. arXiv:2406.01057  [pdf, other]

    cs.DS cs.CC

    Knapsack with Vertex Cover, Set Cover, and Hitting Set

    Authors: Palash Dey, Ashlesha Hota, Sudeshna Kolay, Sipra Singh

    Abstract: Given an undirected graph $\GG=(\VV,\EE)$, with vertex weights $(w(u))_{u\in\VV}$, vertex values $(α(u))_{u\in\VV}$, a knapsack size $s$, and a target value $d$, the \vcknapsack problem is to determine if there exists a subset $\UU\subseteq\VV$ of vertices such that \UU forms a vertex cover, $w(\UU)=\sum_{u\in\UU} w(u) \le s$, and $α(\UU)=\sum_{u\in\UU} α(u) \ge d$. In this paper, we closely study…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  23. arXiv:2405.19563  [pdf, other]

    cs.CL

    Unlearning Climate Misinformation in Large Language Models

    Authors: Michael Fore, Simranjit Singh, Chaehong Lee, Amritanshu Pandey, Antonios Anastasopoulos, Dimitrios Stamoulis

    Abstract: Misinformation regarding climate change is a key roadblock in addressing one of the most serious threats to humanity. This paper investigates factual accuracy in large language models (LLMs) regarding climate information. Using true/false labeled Q&A data for fine-tuning and evaluating LLMs on climate-related claims, we compare open-source models, assessing their ability to generate truthful respo…

    Submitted 29 May, 2024; originally announced May 2024.

  24. arXiv:2405.18831  [pdf, other]

    cs.CV cs.LG

    Evaluating Zero-Shot GPT-4V Performance on 3D Visual Question Answering Benchmarks

    Authors: Simranjit Singh, Georgios Pavlakos, Dimitrios Stamoulis

    Abstract: As interest in "reformulating" the 3D Visual Question Answering (VQA) problem in the context of foundation models grows, it is imperative to assess how these new paradigms influence existing closed-vocabulary datasets. In this case study, we evaluate the zero-shot performance of foundational models (GPT-4 Vision and GPT-4) on well-established 3D VQA benchmarks, namely 3D-VQA and ScanQA. We provide…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted at 1st Workshop on Multimodalities for 3D Scenes CVPR 2024

  25. arXiv:2405.17438  [pdf, other]

    cs.PL cs.AI cs.LG

    An LLM-Tool Compiler for Fused Parallel Function Calling

    Authors: Simranjit Singh, Andreas Karatzas, Michael Fore, Iraklis Anagnostopoulos, Dimitrios Stamoulis

    Abstract: State-of-the-art sequential reasoning in Large Language Models (LLMs) has expanded the capabilities of Copilots beyond conversational tasks to complex function calling, managing thousands of API calls. However, the tendency of compositional prompting to segment tasks into multiple steps, each requiring a round-trip to the GPT APIs, leads to increased system latency and costs. Although recent advan…

    Submitted 7 May, 2024; originally announced May 2024.

  26. arXiv:2405.16330  [pdf, other]

    cs.CV

    LEAST: "Local" text-conditioned image style transfer

    Authors: Silky Singh, Surgan Jandial, Simra Shahid, Abhinav Java

    Abstract: Text-conditioned style transfer enables users to communicate their desired artistic styles through text descriptions, offering a new and expressive means of achieving stylization. In this work, we evaluate the text-conditioned image editing and style transfer techniques on their fine-grained understanding of user prompts for precise "local" style transfer. We find that current methods fail to acco…

    Submitted 25 May, 2024; originally announced May 2024.

    Comments: Accepted to AI for Content Creation (AI4CC) Workshop at CVPR 2024

  27. arXiv:2405.11295  [pdf]

    eess.IV cs.CV cs.LG cs.MM

    Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence Approaches

    Authors: Nand Lal Yadav, Satyendra Singh, Rajesh Kumar, Sudhakar Singh

    Abstract: X-ray is one of the prevalent image modalities for the detection and diagnosis of the human body. X-ray provides an actual anatomical structure of an organ present with disease or absence of disease. Segmentation of disease in chest X-ray images is essential for the diagnosis and treatment. In this paper, a framework for the segmentation of X-ray images using artificial intelligence techniques has…

    Submitted 18 May, 2024; originally announced May 2024.

    Comments: 10 pages, 3 figures

    Journal ref: International Journal of Microsystems and IoT, Vol. 1, Issue 5, pp. 278-287, 2023

  28. arXiv:2405.10880  [pdf]

    cs.CR

    The MESA Security Model 2.0: A Dynamic Framework for Mitigating Stealth Data Exfiltration

    Authors: Sanjeev Pratap Singh, Naveed Afzal

    Abstract: The rising complexity of cyber threats calls for a comprehensive reassessment of current security frameworks in business environments. This research focuses on Stealth Data Exfiltration, a significant cyber threat characterized by covert infiltration, extended undetectability, and unauthorized dissemination of confidential data. Our findings reveal that conventional defense-in-depth strategies oft…

    Submitted 17 May, 2024; originally announced May 2024.

    Journal ref: International Journal of Network Security & Its Applications (IJNSA) 2024

  29. arXiv:2405.09247  [pdf, other]

    cs.CV cs.LG

    Graph Neural Network based Handwritten Trajectories Recognition

    Authors: Anuj Sharma, Sukhdeep Singh, S Ratna

    Abstract: The graph neural networks has been proved to be an efficient machine learning technique in real life applications. The handwritten recognition is one of the useful area in real life use where both offline and online handwriting recognition are required. The chain code as feature extraction technique has shown significant results in literature and we have been able to use chain codes with graph neu…

    Submitted 15 May, 2024; originally announced May 2024.

  30. arXiv:2405.07674  [pdf, other]

    eess.IV cs.CV

    CoVScreen: Pitfalls and recommendations for screening COVID-19 using Chest X-rays

    Authors: Sonit Singh

    Abstract: The novel coronavirus (COVID-19), a highly infectious respiratory disease caused by the SARS-CoV-2 has emerged as an unprecedented healthcare crisis. The pandemic had a devastating impact on the health, well-being, and economy of the global population. Early screening and diagnosis of symptomatic patients plays crucial role in isolation of patient to help stop community transmission as well as pro…

    Submitted 13 May, 2024; originally announced May 2024.

    Comments: 21 pages

  31. arXiv:2405.06989  [pdf, other]

    cs.RO eess.SY

    Mobius Transformation-Based Circular Motion Control for Unicycle Robots in Nonconcentric Circular Geofences

    Authors: Shubham Singh, Anoop Jain

    Abstract: Nonuniform motion constraints are ubiquitous in robotic applications. Geofencing control is one such paradigm where the motion of a robot must be constrained within a predefined boundary. This paper addresses the problem of stabilizing a unicycle robot around a desired circular orbit while confining its motion within a nonconcentric external circular boundary. Our solution approach relies on the c…

    Submitted 17 July, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

  32. arXiv:2405.04324  [pdf, other]

    cs.AI cs.CL cs.SE

    Granite Code Models: A Family of Open Foundation Models for Code Intelligence

    Authors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang, Yikang Shen, Aditya Prasad, Adriana Meza Soria, Michele Merler, Parameswaran Selvam, Saptha Surendran, Shivdeep Singh, Manish Sethi, Xuan-Hong Dang, Pengyuan Li, Kun-Lung Wu, Syed Zawad, Andrew Coleman, Matthew White, Mark Lewis, Raju Pavuluri, Yan Koyfman, Boris Lublinsky, Maximilien de Bayser, Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, et al. (21 additional authors not shown)

    Abstract: Large Language Models (LLMs) trained on code are revolutionizing the software development process. Increasingly, code LLMs are being integrated into software development environments to improve the productivity of human programmers, and LLM-based agents are beginning to show promise for handling complex tasks autonomously. Realizing the full potential of code LLMs requires a wide range of capabili…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Corresponding Authors: Rameswar Panda, Ruchir Puri; Equal Contributors: Mayank Mishra, Matt Stallone, Gaoyuan Zhang

  33. Histogram-Based Federated XGBoost using Minimal Variance Sampling for Federated Tabular Data

    Authors: William Lindskog, Christian Prehofer, Sarandeep Singh

    Abstract: Federated Learning (FL) has gained considerable traction, yet, for tabular data, FL has received less attention. Most FL research has focused on Neural Networks while Tree-Based Models (TBMs) such as XGBoost have historically performed better on tabular data. It has been shown that subsampling of training data when building trees can improve performance but it is an open problem whether such subsa…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 6 figures, 5 tables, 8 pages, FLTA 2023 (together with FMEC 2023)

  34. arXiv:2405.01858  [pdf, other]

    cs.CL cs.CY

    SUKHSANDESH: An Avatar Therapeutic Question Answering Platform for Sexual Education in Rural India

    Authors: Salam Michael Singh, Shubhmoy Kumar Garg, Amitesh Misra, Aaditeshwar Seth, Tanmoy Chakraborty

    Abstract: Sexual education aims to foster a healthy lifestyle in terms of emotional, mental and social well-being. In countries like India, where adolescents form the largest demographic group, they face significant vulnerabilities concerning sexual health. Unfortunately, sexual education is often stigmatized, creating barriers to providing essential counseling and information to this at-risk population. Co…

    Submitted 3 May, 2024; originally announced May 2024.

  35. arXiv:2405.01699  [pdf, other]

    cs.CV cs.AI

    SOAR: Advancements in Small Body Object Detection for Aerial Imagery Using State Space Models and Programmable Gradients

    Authors: Tushar Verma, Jyotsna Singh, Yash Bhartari, Rishi Jarwal, Suraj Singh, Shubhkarman Singh

    Abstract: Small object detection in aerial imagery presents significant challenges in computer vision due to the minimal data inherent in small-sized objects and their propensity to be obscured by larger objects and background noise. Traditional methods using transformer-based models often face limitations stemming from the lack of specialized databases, which adversely affect their performance with objects…

    Submitted 5 May, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

    Comments: 7 pages, 5 figures

  36. arXiv:2405.00942  [pdf, other]

    cs.CV cs.CL

    LLaVA Finds Free Lunch: Teaching Human Behavior Improves Content Understanding Abilities Of LLMs

    Authors: Somesh Singh, Harini S I, Yaman K Singla, Veeky Baths, Rajiv Ratn Shah, Changyou Chen, Balaji Krishnamurthy

    Abstract: Communication is defined as "Who says what to whom with what effect." A message from a communicator generates downstream receiver effects, also known as behavior. Receiver behavior, being a downstream effect of the message, carries rich signals about it. Even after carrying signals about the message, the behavior data is often ignored while training large language models. We show that training LLM…

    Submitted 16 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  37. arXiv:2405.00709  [pdf, other]

    cs.CL cs.AI cs.LG

    Evaluating Tool-Augmented Agents in Remote Sensing Platforms

    Authors: Simranjit Singh, Michael Fore, Dimitrios Stamoulis

    Abstract: Tool-augmented Large Language Models (LLMs) have shown impressive capabilities in remote sensing (RS) applications. However, existing benchmarks assume question-answering input templates over predefined image-text data pairs. These standalone instructions neglect the intricacies of realistic user-grounded tasks. Consider a geospatial analyst: they zoom in a map area, they draw a region over which…

    Submitted 23 April, 2024; originally announced May 2024.

    Comments: ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop

  38. arXiv:2405.00588  [pdf, other]

    cs.CL cs.AI cs.CV cs.CY cs.LG

    Are Models Biased on Text without Gender-related Language?

    Authors: Catarina G Belém, Preethi Seshadri, Yasaman Razeghi, Sameer Singh

    Abstract: Gender bias research has been pivotal in revealing undesirable behaviors in large language models, exposing serious gender stereotypes associated with occupations, and emotions. A key observation in prior work is that models reinforce stereotypes as a consequence of the gendered correlations that are present in the training data. In this paper, we focus on bias where the effect from training data…

    Submitted 1 May, 2024; originally announced May 2024.

    Comments: In International Conference on Learning Representations 2024

  39. arXiv:2404.19345  [pdf, other]

    cond-mat.mes-hall cs.ET

    Connecting physics to systems with modular spin-circuits

    Authors: Kemal Selcuk, Saleh Bunaiyan, Nihal Sanjay Singh, Shehrin Sayed, Samiran Ganguly, Giovanni Finocchio, Supriyo Datta, Kerem Y. Camsari

    Abstract: An emerging paradigm in modern electronics is that of CMOS + $\sf X$ requiring the integration of standard CMOS technology with novel materials and technologies denoted by $\sf X$. In this context, a crucial challenge is to develop accurate circuit models for $\sf X$ that are compatible with standard models for CMOS-based circuits and systems. In this perspective we present physics-based, experime…

    Submitted 30 April, 2024; originally announced April 2024.

  40. arXiv:2404.15804  [pdf, other]

    cs.LG cs.AI

    GeckOpt: LLM System Efficiency via Intent-Based Tool Selection

    Authors: Michael Fore, Simranjit Singh, Dimitrios Stamoulis

    Abstract: In this preliminary study, we investigate a GPT-driven intent-based reasoning approach to streamline tool selection for large language models (LLMs) aimed at system efficiency. By identifying the intent behind user prompts at runtime, we narrow down the API toolset required for task execution, reducing token consumption by up to 24.6\%. Early results on a real-world, massively parallel Copilot pla…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: GLSVLSI 2024

  41. arXiv:2404.15500  [pdf, other]

    cs.AI cs.CL cs.LG

    GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots

    Authors: Simranjit Singh, Michael Fore, Dimitrios Stamoulis

    Abstract: Geospatial Copilots unlock unprecedented potential for performing Earth Observation (EO) applications through natural language instructions. However, existing agents rely on overly simplified single tasks and template-based prompts, creating a disconnect with real-world scenarios. In this work, we present GeoLLM-Engine, an environment for tool-augmented agents with intricate tasks routinely execut…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Earthvision 2024, CVPR Workshop

  42. arXiv:2404.14695  [pdf, other]

    cs.CL

    MisgenderMender: A Community-Informed Approach to Interventions for Misgendering

    Authors: Tamanna Hossain, Sunipa Dev, Sameer Singh

    Abstract: Content Warning: This paper contains examples of misgendering and erasure that could be offensive and potentially triggering. Misgendering, the act of incorrectly addressing someone's gender, inflicts serious harm and is pervasive in everyday technologies, yet there is a notable lack of research to combat it. We are the first to address this lack of research into interventions for misgendering b…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: NAACL 2024

  43. arXiv:2404.14062  [pdf, other]

    cs.CV cs.LG

    GatedLexiconNet: A Comprehensive End-to-End Handwritten Paragraph Text Recognition System

    Authors: Lalita Kumari, Sukhdeep Singh, Vaibhav Varish Singh Rathore, Anuj Sharma

    Abstract: The Handwritten Text Recognition problem has been a challenge for researchers for the last few decades, especially in the domain of computer vision, a subdomain of pattern recognition. Variability of texts amongst writers, cursiveness, and different font styles of handwritten texts with degradation of historical text images make it a challenging problem. Recognizing scanned document images in neur…

    Submitted 22 April, 2024; originally announced April 2024.

  44. arXiv:2404.13252  [pdf, other]

    cs.CV cs.LG eess.IV

    3D-Convolution Guided Spectral-Spatial Transformer for Hyperspectral Image Classification

    Authors: Shyam Varahagiri, Aryaman Sinha, Shiv Ram Dubey, Satish Kumar Singh

    Abstract: In recent years, Vision Transformers (ViTs) have shown promising classification performance over Convolutional Neural Networks (CNNs) due to their self-attention mechanism. Many researchers have incorporated ViTs for Hyperspectral Image (HSI) classification. HSIs are characterised by narrow contiguous spectral bands, providing rich spectral data. Although ViTs excel with sequential data, they cann…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Accepted in IEEE Conference on Artificial Intelligence, 2024

  45. arXiv:2404.12306  [pdf]

    cs.AR

    Switchable Single/Dual Edge Registers for Pipeline Architecture

    Authors: Suyash Vardhan Singh, Rakeshkumar Mahto

    Abstract: The demand for low power processing is increasing due to mobile and portable devices. In a processor unit, an adder is an important building block since it is used in Floating Point Units (FPU) and Arithmetic Logic Units (ALU). Also, pipeline techniques are used extensively to improve the throughput of the processing unit. To implement a pipeline requires adding a register at each sub-stage that r…

    Submitted 18 April, 2024; originally announced April 2024.

  46. arXiv:2404.11843  [pdf, other]

    eess.IV cs.CV cs.LG

    Computer-Aided Diagnosis of Thoracic Diseases in Chest X-rays using hybrid CNN-Transformer Architecture

    Authors: Sonit Singh

    Abstract: Medical imaging has been used for diagnosis of various conditions, making it one of the most powerful resources for effective patient care. Due to widespread availability, low cost, and low radiation, chest X-ray is one of the most sought after radiology examination for the diagnosis of various thoracic diseases. Due to advancements in medical imaging technologies and increasing patient load, curr…

    Submitted 18 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

    Comments: 24 pages, 13 Figures, 13 Tables. This article heavily draws from arXiv:1904.09925 where authors originally proposed attention-augmented convolutional network. arXiv admin note: text overlap with arXiv:1904.09925 by other authors

  47. arXiv:2404.09818  [pdf, other]

    cs.AR

    Error Detection and Correction Codes for Safe In-Memory Computations

    Authors: Luca Parrini, Taha Soliman, Benjamin Hettwer, Jan Micha Borrmann, Simranjeet Singh, Ankit Bende, Vikas Rana, Farhad Merchant, Norbert Wehn

    Abstract: In-Memory Computing (IMC) introduces a new paradigm of computation that offers high efficiency in terms of latency and power consumption for AI accelerators. However, the non-idealities and defects of emerging technologies used in advanced IMC can severely degrade the accuracy of inferred Neural Networks (NN) and lead to malfunctions in safety-critical applications. In this paper, we investigate a…

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: This paper will be presented at the 29th IEEE European Test Symposium (ETS) 2024

  48. arXiv:2404.08011  [pdf, other]

    cs.CV cs.LG

    An inclusive review on deep learning techniques and their scope in handwriting recognition

    Authors: Sukhdeep Singh, Sudhir Rohilla, Anuj Sharma

    Abstract: Deep learning expresses a category of machine learning algorithms that have the capability to combine raw inputs into intermediate features layers. These deep learning algorithms have demonstrated great results in different fields. Deep learning has particularly witnessed for a great achievement of human level performance across a number of domains in computer vision and pattern recognition. For t…

    Submitted 10 April, 2024; originally announced April 2024.

  49. arXiv:2404.05243  [pdf, other]

    cs.CL cs.AI

    Product Description and QA Assisted Self-Supervised Opinion Summarization

    Authors: Tejpalsingh Siledar, Rupasai Rangaraju, Sankara Sri Raghava Ravindra Muddu, Suman Banerjee, Amey Patil, Sudhanshu Shekhar Singh, Muthusamy Chelliah, Nikesh Garera, Swaprava Nath, Pushpak Bhattacharyya

    Abstract: In e-commerce, opinion summarization is the process of summarizing the consensus opinions found in product reviews. However, the potential of additional sources such as product description and question-answers (QA) has been considered less often. Moreover, the absence of any supervised training data makes this task challenging. To address this, we propose a novel synthetic dataset creation (SDC) s…

    Submitted 8 April, 2024; originally announced April 2024.

  50. arXiv:2404.04642  [pdf]

    eess.IV cs.AI cs.LG

    Power-Efficient Image Storage: Leveraging Super Resolution Generative Adversarial Network for Sustainable Compression and Reduced Carbon Footprint

    Authors: Ashok Mondal, Satyam Singh

    Abstract: In recent years, large-scale adoption of cloud storage solutions has revolutionized the way we think about digital data storage. However, the exponential increase in data volume, especially images, has raised environmental concerns regarding power and resource consumption, as well as the rising digital carbon footprint emissions. The aim of this research is to propose a methodology for cloud-based…

    Submitted 6 April, 2024; originally announced April 2024.

    Comments: 5 pages, 5 figures

    MSC Class: 68T07

    ACM Class: I.2.m; H.3.2