Showing 1–50 of 50 results for author: Misra, D

  1. arXiv:2405.20494  [pdf, other]

    cs.CV cs.AI cs.LG

    Slight Corruption in Pre-training Data Makes Better Diffusion Models

    Authors: Hao Chen, Yujin Han, Diganta Misra, Xiang Li, Kai Hu, Difan Zou, Masashi Sugiyama, Jindong Wang, Bhiksha Raj

    Abstract: Diffusion models (DMs) have shown remarkable capabilities in generating realistic high-quality images, audio, and videos. They benefit significantly from extensive pre-training on large-scale datasets, including web-crawled data with paired data and conditions, such as image-text and image-class pairs. Despite rigorous filtering, these pre-training datasets often inevitably contain corrupted pair…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 50 pages, 33 figures, 4 tables

  2. arXiv:2404.15269  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Aligning LLM Agents by Learning Latent Preference from User Edits

    Authors: Ge Gao, Alexey Taymanov, Eduardo Salinas, Paul Mineiro, Dipendra Misra

    Abstract: We study interactive learning of LLM-based language agents based on user edits made to the agent's output. In a typical setting such as writing assistants, the user interacts with a language agent to generate a response given a context, and may optionally edit the agent response to personalize it based on their latent preference, in addition to improving the correctness. The edit feedback is natur…

    Submitted 9 June, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

  3. arXiv:2404.09123  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Provable Interactive Learning with Hindsight Instruction Feedback

    Authors: Dipendra Misra, Aldo Pacchiano, Robert E. Schapire

    Abstract: We study interactive learning in a setting where the agent has to generate a response (e.g., an action or trajectory) given a context and an instruction. In contrast to typical approaches that train the system using reward or expert supervision on response, we study learning with hindsight instruction where a teacher provides an instruction that is most suitable for the agent's generated response…

    Submitted 13 April, 2024; originally announced April 2024.

  4. arXiv:2404.08495  [pdf, other]

    cs.LG cs.AI cs.CL

    Dataset Reset Policy Optimization for RLHF

    Authors: Jonathan D. Chang, Wenhao Zhan, Owen Oertell, Kianté Brantley, Dipendra Misra, Jason D. Lee, Wen Sun

    Abstract: Reinforcement Learning (RL) from Human Preference-based feedback is a popular paradigm for fine-tuning generative models, which has produced impressive models such as GPT-4 and Claude 3 Opus. This framework often consists of two steps: learning a reward model from an offline preference dataset followed by running online RL to optimize the learned reward model. In this work, leveraging the idea of r…

    Submitted 16 April, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

    Comments: 28 pages, 6 tables, 3 Figures, 3 Algorithms

  5. arXiv:2404.00399  [pdf, other]

    cs.CL cs.AI cs.LG

    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order

    Authors: Taishi Nakamura, Mayank Mishra, Simone Tedeschi, Yekun Chai, Jason T Stillerman, Felix Friedrich, Prateek Yadav, Tanmay Laud, Vu Minh Chien, Terry Yue Zhuo, Diganta Misra, Ben Bogin, Xuan-Son Vu, Marzena Karpinska, Arnav Varma Dantuluri, Wojciech Kusa, Tommaso Furlanello, Rio Yokota, Niklas Muennighoff, Suhas Pai, Tosin Adewumi, Veronika Laippala, Xiaozhe Yao, Adalberto Junior, Alpay Ariyak, et al. (20 additional authors not shown)

    Abstract: Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and StarCoder aim to democratize access to pretrained models for collaborative community development. However, such existing models face challenges: limited multilingual capabilities, continual pretraining causing catastrophic forgetting, where…

    Submitted 23 April, 2024; v1 submitted 30 March, 2024; originally announced April 2024.

    Comments: Preprint

  6. arXiv:2403.13765  [pdf, other]

    cs.LG cs.AI cs.CV

    Towards Principled Representation Learning from Videos for Reinforcement Learning

    Authors: Dipendra Misra, Akanksha Saran, Tengyang Xie, Alex Lamb, John Langford

    Abstract: We study pre-training representations for decision-making using video data, which is abundantly available for tasks such as game agents and software testing. Even though significant empirical advances have been made on this problem, a theoretical understanding remains absent. We initiate the theoretical investigation into principled approaches for representation learning and focus on learning the…

    Submitted 20 March, 2024; originally announced March 2024.

    Comments: ICLR 2024 Spotlight Conference Paper

  7. arXiv:2403.13106  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Knowing Your Nonlinearities: Shapley Interactions Reveal the Underlying Structure of Data

    Authors: Divyansh Singhvi, Andrej Erkelens, Raghav Jain, Diganta Misra, Naomi Saphra

    Abstract: Measuring nonlinear feature interaction is an established approach to understanding complex patterns of attribution in many models. In this paper, we use Shapley Taylor interaction indices (STII) to analyze the impact of underlying data structure on model representations in a variety of modalities, tasks, and architectures. Considering linguistic structure in masked and auto-regressive language mo…

    Submitted 19 March, 2024; originally announced March 2024.

  8. arXiv:2403.10853  [pdf, other]

    cs.LG cs.AI cs.CV

    Just Say the Name: Online Continual Learning with Category Names Only via Data Generation

    Authors: Minhyuk Seo, Diganta Misra, Seongwon Cho, Minjae Lee, Jonghyun Choi

    Abstract: In real-world scenarios, extensive manual annotation for continual learning is impractical due to prohibitive costs. Although prior art, influenced by large-scale webly supervised training, suggests leveraging web-scraped data in continual learning, this poses challenges such as data imbalance, usage restrictions, and privacy concerns. Addressing the risks of continual webly supervised training, w…

    Submitted 30 April, 2024; v1 submitted 16 March, 2024; originally announced March 2024.

  9. arXiv:2403.10696  [pdf, other]

    cs.CV cs.LG

    On the low-shot transferability of [V]-Mamba

    Authors: Diganta Misra, Jay Gala, Antonio Orvieto

    Abstract: The strength of modern large-scale neural networks lies in their ability to efficiently adapt to new tasks with few examples. Although extensive research has investigated the transferability of Vision Transformers (ViTs) to various downstream tasks under diverse constraints, this study shifts focus to explore the transfer learning potential of [V]-Mamba. We compare its performance with ViTs across…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Preprint (Work in progress)

  10. arXiv:2402.07876  [pdf, other]

    cs.LG cs.AI cs.CL

    Policy Improvement using Language Feedback Models

    Authors: Victor Zhong, Dipendra Misra, Xingdi Yuan, Marc-Alexandre Côté

    Abstract: We introduce Language Feedback Models (LFMs) that identify desirable behaviour - actions that help achieve tasks specified in the instruction - for imitation learning in instruction following. To train LFMs, we obtain feedback from Large Language Models (LLMs) on visual trajectories verbalized to language descriptions. First, by using LFMs to identify desirable behaviour to imitate, we improve in…

    Submitted 18 April, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  11. arXiv:2312.13558  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction

    Authors: Pratyusha Sharma, Jordan T. Ash, Dipendra Misra

    Abstract: Transformer-based Large Language Models (LLMs) have become a fixture in modern machine learning. Correspondingly, significant resources are allocated towards research that aims to further advance this technology, typically resulting in models of increasing size that are trained on increasing amounts of data. This work, however, demonstrates the surprising result that it is often possible to signif…

    Submitted 20 December, 2023; originally announced December 2023.

  12. arXiv:2312.06853  [pdf, other]

    cs.AI

    LLF-Bench: Benchmark for Interactive Learning from Language Feedback

    Authors: Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan

    Abstract: We introduce a new benchmark, LLF-Bench (Learning from Language Feedback Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively learn from natural language feedback and instructions. Learning from language feedback (LLF) is essential for people, largely because the rich information this feedback provides can help a learner avoid much of trial and error and the…

    Submitted 13 December, 2023; v1 submitted 11 December, 2023; originally announced December 2023.

  13. arXiv:2312.05212  [pdf, other]

    cs.AR

    Enabling Normally-off In-Situ Computing with a Magneto-Electric FET-based SRAM Design

    Authors: Deniz Najafi, Mehrdad Morsali, Ranyang Zhou, Arman Roohi, Andrew Marshall, Durga Misra, Shaahin Angizi

    Abstract: As an emerging post-CMOS Field Effect Transistor, Magneto-Electric FETs (MEFETs) offer compelling design characteristics for logic and memory applications, such as high-speed switching, low power consumption, and non-volatility. In this paper, for the first time, a non-volatile MEFET-based SRAM design named ME-SRAM is proposed for edge applications which can remarkably save the SRAM static power c…

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: 7 pages, 10 Figures, 4 Tables

  14. arXiv:2308.14969  [pdf, other]

    cs.LG cs.CV

    Uncovering the Hidden Cost of Model Compression

    Authors: Diganta Misra, Muawiz Chaudhary, Agam Goyal, Bharat Runwal, Pin Yu Chen

    Abstract: In an age dominated by resource-intensive foundation models, the ability to efficiently adapt to downstream tasks is crucial. Visual Prompting (VP), drawing inspiration from the prompting techniques employed in Large Language Models (LLMs), has emerged as a pivotal method for transfer learning in the realm of computer vision. As the importance of efficiency continues to rise, research into model c…

    Submitted 15 March, 2024; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: Preprint

  15. arXiv:2306.11816  [pdf, other]

    cs.LG cs.AI cs.CL

    Learning to Generate Better Than Your LLM

    Authors: Jonathan D. Chang, Kiante Brantley, Rajkumar Ramamurthy, Dipendra Misra, Wen Sun

    Abstract: Reinforcement learning (RL) has emerged as a powerful paradigm for fine-tuning Large Language Models (LLMs) for text generation. In particular, recent LLMs such as ChatGPT and GPT-4 can engage in fluent conversations with users after fine-tuning with RL. Capitalizing on key properties of text generation, we seek to investigate RL algorithms beyond general purpose algorithms like Proximal Policy Opt…

    Submitted 13 November, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: 23 pages, 5 figures, 7 tables, 4 algorithms

  16. arXiv:2306.03286  [pdf, other]

    cs.LG cs.AI

    Survival Instinct in Offline Reinforcement Learning

    Authors: Anqi Li, Dipendra Misra, Andrey Kolobov, Ching-An Cheng

    Abstract: We present a novel observation about the behavior of offline reinforcement learning (RL) algorithms: on many benchmark datasets, offline RL can produce well-performing and safe policies even when trained with "wrong" reward labels, such as those that are zero everywhere or are negatives of the true rewards. This phenomenon cannot be easily explained by offline RL's return maximization objective. M…

    Submitted 8 November, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

  17. arXiv:2211.07614  [pdf, other]

    cs.LG

    Towards Data-Driven Offline Simulations for Online Reinforcement Learning

    Authors: Shengpu Tang, Felipe Vieira Frujeri, Dipendra Misra, Alex Lamb, John Langford, Paul Mineiro, Sebastian Kochman

    Abstract: Modern decision-making systems, from robots to web recommendation engines, are expected to adapt: to user preferences, changing circumstances or even new tasks. Yet, it is still uncommon to deploy a dynamically learning agent (rather than a fixed policy) to a production system, as it's perceived as unsafe. Using historical data to reason about learning algorithms, similar to offline policy evaluat…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: Presented at the 3rd Offline Reinforcement Learning Workshop at NeurIPS 2022

  18. arXiv:2211.00164  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    Agent-Controller Representations: Principled Offline RL with Rich Exogenous Information

    Authors: Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang, Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des Combes, John Langford

    Abstract: Learning to control an agent from data collected offline in a rich pixel-based visual observation space is vital for real-world applications of reinforcement learning (RL). A major challenge in this setting is the presence of input information that is hard to model and irrelevant to controlling the agent. This problem has been approached by the theoretical RL community through the lens of exogenou…

    Submitted 13 August, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: ICML 2023

  19. arXiv:2210.14492  [pdf, other]

    cs.LG cs.AI stat.ML

    Provable Safe Reinforcement Learning with Binary Feedback

    Authors: Andrew Bennett, Dipendra Misra, Nathan Kallus

    Abstract: Safety is a crucial necessity in many applications of reinforcement learning (RL), whether robotic, automotive, or medical. Many existing approaches to safe RL rely on receiving numeric safety feedback, but in many cases this feedback can only take binary values; that is, whether an action in a given state is safe or unsafe. This is particularly true when feedback comes from human experts. We ther…

    Submitted 26 October, 2022; originally announced October 2022.

  20. arXiv:2207.08229  [pdf, other]

    cs.LG cs.RO stat.ML

    Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models

    Authors: Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford

    Abstract: In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information. For example, a person walking along a city street who tries to model all aspects of the world would quickly be overwhelmed by a multitude of shops, cars, and people moving in and out of view, each following their own complex…

    Submitted 27 December, 2022; v1 submitted 17 July, 2022; originally announced July 2022.

    Comments: Project Website: https://controllable-latent-state.github.io/

  21. arXiv:2207.04543  [pdf, other]

    cs.LG cs.AI

    Challenging Common Assumptions about Catastrophic Forgetting

    Authors: Timothée Lesort, Oleksiy Ostapenko, Diganta Misra, Md Rifat Arefin, Pau Rodríguez, Laurent Charlin, Irina Rish

    Abstract: Building learning agents that can progressively learn and accumulate knowledge is the core goal of the continual learning (CL) research field. Unfortunately, training a model on new data usually compromises the performance on past data. In the CL literature, this effect is referred to as catastrophic forgetting (CF). CF has been widely studied, and a plethora of methods have been proposed to addr…

    Submitted 15 May, 2023; v1 submitted 10 July, 2022; originally announced July 2022.

  22. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  23. arXiv:2206.04282  [pdf, ps, other]

    cs.LG

    Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information

    Authors: Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford

    Abstract: In real-world reinforcement learning applications, the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand. Learning from high-dimensional observations has been the subject of extensive investigation in supervised learning and statistics (e.g., via sparsity), but analogous issues in reinforcement learning are not well und…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2022

  24. arXiv:2205.14237  [pdf, other]

    cs.LG cs.AI stat.ML

    Provably Sample-Efficient RL with Side Information about Latent Dynamics

    Authors: Yao Liu, Dipendra Misra, Miro Dudík, Robert E. Schapire

    Abstract: We study reinforcement learning (RL) in settings where observations are high-dimensional, but where an RL agent has access to abstract knowledge about the structure of the state space, as is the case, for example, when a robot is tasked to go to a specific room in a building using observations from its own camera, while having access to the floor plan. We formalize this setting as transfer reinfor…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: 35 pages, 4 figures

  25. arXiv:2204.01640  [pdf, other]

    cs.LG cs.CV

    APP: Anytime Progressive Pruning

    Authors: Diganta Misra, Bharat Runwal, Tianlong Chen, Zhangyang Wang, Irina Rish

    Abstract: With the latest advances in deep learning, there has been a lot of focus on the online learning paradigm due to its relevance in practical settings. Although many methods have been investigated for optimal learning settings in scenarios where the data stream is continuous over time, sparse network training in such settings has often been overlooked. In this paper, we explore the problem of train…

    Submitted 1 June, 2022; v1 submitted 4 April, 2022; originally announced April 2022.

    Comments: 21 pages including 4 pages of references. Preprint version

  26. arXiv:2203.16683  [pdf, other]

    astro-ph.SR cs.LG

    Active Learning for Computationally Efficient Distribution of Binary Evolution Simulations

    Authors: Kyle Akira Rocha, Jeff J. Andrews, Christopher P. L. Berry, Zoheyr Doctor, Aggelos K. Katsaggelos, Juan Gabriel Serra Pérez, Pablo Marchant, Vicky Kalogera, Scott Coughlin, Simone S. Bavera, Aaron Dotter, Tassos Fragos, Konstantinos Kovlakas, Devina Misra, Zepei Xing, Emmanouil Zapartas

    Abstract: Binary stars undergo a variety of interactions and evolutionary phases, critical for predicting and explaining observed properties. Binary population synthesis with full stellar-structure and evolution simulations is computationally expensive, requiring a large number of mass-transfer sequences. The recently developed binary population synthesis code POSYDON incorporates grids of MESA binary star…

    Submitted 16 September, 2022; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: 21 pages, 10 figures, ApJ in press

    Journal ref: Astrophysical Journal; 938(1):64(15); 2022

  27. arXiv:2202.14037  [pdf, other]

    cs.LG cs.AI

    Understanding Contrastive Learning Requires Incorporating Inductive Biases

    Authors: Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy

    Abstract: Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs. Recent attempts to theoretically explain the success of contrastive learning on downstream classification tasks prove guarantees depending on properties of augmentations and the value of…

    Submitted 28 February, 2022; originally announced February 2022.

  28. arXiv:2110.08847  [pdf, other]

    cs.LG

    Provable RL with Exogenous Distractors via Multistep Inverse Dynamics

    Authors: Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

    Abstract: Many real-world applications of reinforcement learning (RL) require the agent to deal with high-dimensional observations such as those generated from a megapixel camera. Prior work has addressed such problems with representation learning, through which the agent can provably extract endogenous, latent state information from raw observations and subsequently plan efficiently. However, such approach…

    Submitted 5 March, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

    Comments: ICLR 2022

  29. arXiv:2106.09943  [pdf, other]

    cs.LG cs.CL stat.ML

    Investigating the Role of Negatives in Contrastive Representation Learning

    Authors: Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra

    Abstract: Noise contrastive learning is a popular technique for unsupervised representation learning. In this approach, a representation is obtained via reduction to supervised learning, where given a notion of semantic similarity, the learner tries to distinguish a similar (positive) example from a collection of random (negative) examples. The success of modern contrastive learning pipelines relies on many…

    Submitted 18 June, 2021; originally announced June 2021.

  30. arXiv:2105.10165  [pdf, ps, other]

    cs.CL cs.CY cs.IR cs.LG

    Have you tried Neural Topic Models? Comparative Analysis of Neural and Non-Neural Topic Models with Application to COVID-19 Twitter Data

    Authors: Andrew Bennett, Dipendra Misra, Nga Than

    Abstract: Topic models are widely used in studying social phenomena. We conduct a comparative study examining state-of-the-art neural versus non-neural topic models, performing a rigorous quantitative and qualitative assessment on a dataset of tweets about the COVID-19 pandemic. Our results show that not only do neural topic models outperform their classical counterparts on standard evaluation metrics, but…

    Submitted 21 May, 2021; originally announced May 2021.

  31. arXiv:2102.07024  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Interactive Learning from Activity Description

    Authors: Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudík, Patrick Shafto

    Abstract: We present a novel interactive learning protocol that enables training request-fulfilling agents by verbally describing their activities. Unlike imitation learning (IL), our protocol allows the teaching agent to provide feedback in a language that is most appropriate for them. Compared with reward in reinforcement learning (RL), the description feedback is richer and allows for improved sample com…

    Submitted 14 June, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  32. arXiv:2010.03799  [pdf, ps, other]

    cs.LG math.OC math.ST stat.ML

    Learning the Linear Quadratic Regulator from Nonlinear Observations

    Authors: Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

    Abstract: We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR. In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs, but the agent operates on high-dimensional, nonlinear observations such as images from a camera. To enable sample-efficient learning, we assume that the learn…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: To appear at NeurIPS 2020

  33. arXiv:2010.03045  [pdf, other]

    cs.CV

    Rotate to Attend: Convolutional Triplet Attention Module

    Authors: Diganta Misra, Trikay Nalamada, Ajay Uppili Arasanipalai, Qibin Hou

    Abstract: Benefiting from the capability of building inter-dependencies among channels or spatial locations, attention mechanisms have been extensively studied and broadly used in a variety of computer vision tasks recently. In this paper, we investigate light-weight but effective attention mechanisms and present triplet attention, a novel method for computing attention weights by capturing cross-dimension…

    Submitted 5 November, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: Accepted to WACV 2021

  34. arXiv:1911.05815  [pdf, other]

    cs.LG stat.ML

    Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

    Authors: Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

    Abstract: We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space. The algorithm interleaves representation learning to identify a new notion of kinematic state abstraction with strategic exploration to reach new states using the learned abstraction. The algorithm provably explores the environment with…

    Submitted 13 November, 2019; originally announced November 2019.

  35. arXiv:1909.07572  [pdf, other]

    cs.RO cs.AI cs.CV

    Is That a Chair? Imagining Affordances Using Simulations of an Articulated Human Body

    Authors: Hongtao Wu, Deven Misra, Gregory S. Chirikjian

    Abstract: For robots to exhibit a high level of intelligence in the real world, they must be able to assess objects for which they have no prior knowledge. Therefore, it is crucial for robots to perceive object affordances by reasoning about physical interactions with the object. In this paper, we propose a novel method to provide robots with an ability to imagine object affordances using physical simulatio…

    Submitted 7 April, 2020; v1 submitted 16 September, 2019; originally announced September 2019.

    Comments: 7 pages, 6 figures. Accepted to ICRA2020

  36. arXiv:1908.08681  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Mish: A Self Regularized Non-Monotonic Activation Function

    Authors: Diganta Misra

    Abstract: We propose Mish, a novel self-regularized non-monotonic activation function which can be mathematically defined as: $f(x) = x \tanh(\mathrm{softplus}(x))$. As activation functions play a crucial role in the performance and training dynamics in neural networks, we validated experimentally on several well-known benchmarks against the best combinations of architectures and activation functions. We als…

    Submitted 13 August, 2020; v1 submitted 23 August, 2019; originally announced August 2019.

    Comments: Accepted to BMVC 2020

  37. arXiv:1905.13320  [pdf, other]

    cs.LG cs.AI stat.ML

    Combating the Compounding-Error Problem with a Multi-step Model

    Authors: Kavosh Asadi, Dipendra Misra, Seungchan Kim, Michael L. Littman

    Abstract: Model-based reinforcement learning is an appealing framework for creating agents that learn, plan, and act in sequential environments. Model-based algorithms typically involve learning a transition model that takes a state and an action and outputs the next state (a one-step model). This model can be composed with itself to enable predicting multiple steps into the future, but one-step prediction…

    Submitted 30 May, 2019; originally announced May 2019.

  38. arXiv:1812.09702  [pdf]

    cs.CV cs.IR

    Advanced Image Processing for Astronomical Images

    Authors: Diganta Misra, Sparsha Mishra, Bhargav Appasani

    Abstract: Image Processing in Astronomy is a major field of research and involves many techniques for improving the analysis of the properties of celestial objects or obtaining preliminary inferences from image data. In this paper, we provide a comprehensive case study of advanced image processing techniques applied to Astronomical Galaxy Images for improved analysis, accurate inferences and fa…

    Submitted 23 December, 2018; originally announced December 2018.

    Comments: 7 pages, 13 figures, accepted at IEEE International Conference on Electrical, Communication, Electronics, Instrumentation and Computing (ICECEIC)

  39. arXiv:1812.09693  [pdf]

    cs.CV cs.IR

    Image Processing on IOPA Radiographs: A comprehensive case study on Apical Periodontitis

    Authors: Diganta Misra, Vanshika Arora

    Abstract: With the recent advancements in Image Processing Techniques and development of new robust computer vision algorithms, new areas of research within Medical Diagnosis and Biomedical Engineering are picking up pace. This paper provides a comprehensive in-depth case study of Image Processing, Feature Extraction and Analysis of Apical Periodontitis diagnostic cases in IOPA (Intra Oral Peri-Apical) Radi…

    Submitted 22 March, 2019; v1 submitted 23 December, 2018; originally announced December 2018.

    Comments: 15 pages, 42 figures; submitted to ICIAP 2019: 21st International Conference on Image Analysis and Processing

  40. arXiv:1811.12354  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments

    Authors: Howard Chen, Alane Suhr, Dipendra Misra, Noah Snavely, Yoav Artzi

    Abstract: We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task. We introduce the Touchdown task and dataset, where an agent must first follow navigation instructions in a real-life visual urban environment, and then identify a location described in natural language to find a hidden object at the goal position. The data contains 9,326 examples of…

    Submitted 16 May, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

    Comments: arXiv admin note: text overlap with arXiv:1809.00786

    Journal ref: Published in CVPR 2019

  41. arXiv:1811.08824  [pdf, other]

    cs.CV cs.RO

    Early Fusion for Goal Directed Robotic Vision

    Authors: Aaron Walsman, Yonatan Bisk, Saadia Gabriel, Dipendra Misra, Yoav Artzi, Yejin Choi, Dieter Fox

    Abstract: Building perceptual systems for robotics which perform well under tight computational budgets requires novel architectures which rethink the traditional computer vision pipeline. Modern vision architectures require the agent to build a summary representation of the entire scene, even if most of the input is irrelevant to the agent's current goal. In this work, we flip this paradigm, by introducing…

    Submitted 7 August, 2019; v1 submitted 21 November, 2018; originally announced November 2018.

  42. arXiv:1811.04179  [pdf, other]

    cs.RO cs.AI cs.CL cs.CV cs.LG

    Mapping Navigation Instructions to Continuous Control Actions with Position-Visitation Prediction

    Authors: Valts Blukis, Dipendra Misra, Ross A. Knepper, Yoav Artzi

    Abstract: We propose an approach for mapping natural language instructions and raw observations to continuous control of a quadcopter drone. Our model predicts interpretable position-visitation distributions indicating where the agent should go during execution and where it should stop, and uses the predicted distributions to select the actions to execute. This two-step model decomposition allows for simple…

    Submitted 10 December, 2018; v1 submitted 9 November, 2018; originally announced November 2018.

    Comments: Appeared in Conference on Robot Learning 2018

    Journal ref: In Conference on Robot Learning (pp. 505-518) (2018)

  43. arXiv:1811.00128  [pdf, other]

    cs.LG cs.AI stat.ML

    Towards a Simple Approach to Multi-step Model-based Reinforcement Learning

    Authors: Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman

    Abstract: When environmental interaction is expensive, model-based reinforcement learning offers a solution by planning ahead and avoiding costly mistakes. Model-based agents typically learn a single-step transition model. In this paper, we propose a multi-step model that predicts the outcome of an action sequence with variable length. We show that this model is easy to learn, and that the model can make po…

    Submitted 31 October, 2018; originally announced November 2018.

  44. arXiv:1809.01299  [pdf, other]

    cs.CL

    Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations

    Authors: Dipendra Misra, Ming-Wei Chang, Xiaodong He, Wen-tau Yih

    Abstract: Semantic parsing from denotations faces two key challenges in model training: (1) given only the denotations (e.g., answers), search for good candidate semantic parses, and (2) choose the best model update algorithm. We propose effective and general solutions to each of them. Using policy shaping, we bias the search procedure towards semantic parses that are more compatible to the text, which prov…

    Submitted 4 September, 2018; originally announced September 2018.

    Comments: Accepted at EMNLP 2018

  45. arXiv:1809.00786  [pdf, other]

    cs.CL

    Mapping Instructions to Actions in 3D Environments with Visual Goal Prediction

    Authors: Dipendra Misra, Andrew Bennett, Valts Blukis, Eyvind Niklasson, Max Shatkhin, Yoav Artzi

    Abstract: We propose to decompose instruction execution to goal prediction and action generation. We design a model that maps raw visual observations to goals using LINGUNET, a language-conditioned image generation network, and then generates the actions required to complete them. Our model is trained from demonstration only without external resources. To evaluate our approach, we introduce two benchmarks f…

    Submitted 18 March, 2019; v1 submitted 3 September, 2018; originally announced September 2018.

    Comments: Accepted at EMNLP 2018

  46. arXiv:1806.01265  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    Equivalence Between Wasserstein and Value-Aware Loss for Model-based Reinforcement Learning

    Authors: Kavosh Asadi, Evan Cater, Dipendra Misra, Michael L. Littman

    Abstract: Learning a generative model is a key component of model-based reinforcement learning. Though learning a good model in the tabular setting is a simple task, learning a useful model in the approximate setting is challenging. In this context, an important question is the loss function used for model learning as varying the loss function can have a remarkable impact on effectiveness of planning. Recen…

    Submitted 8 July, 2018; v1 submitted 1 June, 2018; originally announced June 2018.

    Comments: Accepted at the FAIM workshop "Prediction and Generative Modeling in Reinforcement Learning", Stockholm, Sweden, 2018

  47. arXiv:1804.07193  [pdf, other]

    cs.LG cs.AI stat.ML

    Lipschitz Continuity in Model-based Reinforcement Learning

    Authors: Kavosh Asadi, Dipendra Misra, Michael L. Littman

    Abstract: We examine the impact of learning Lipschitz continuous models in the context of model-based reinforcement learning. We provide a novel bound on multi-step prediction error of Lipschitz models where we quantify the error using the Wasserstein metric. We go on to prove an error bound for the value-function estimate arising from Lipschitz models and show that the estimated value function is itself Li…

    Submitted 27 July, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

    Comments: Accepted for the 35th International Conference on Machine Learning (ICML 2018)

  48. arXiv:1801.07357  [pdf, other]

    cs.AI

    CHALET: Cornell House Agent Learning Environment

    Authors: Claudia Yan, Dipendra Misra, Andrew Bennett, Aaron Walsman, Yonatan Bisk, Yoav Artzi

    Abstract: We present CHALET, a 3D house simulator with support for navigation and manipulation. CHALET includes 58 rooms and 10 house configuration, and allows to easily create new house and room layouts. CHALET supports a range of common household activities, including moving objects, toggling appliances, and placing objects inside closeable containers. The environment and actions available are designed to…

    Submitted 16 September, 2019; v1 submitted 22 January, 2018; originally announced January 2018.

  49. arXiv:1704.08795  [pdf, other]

    cs.CL

    Mapping Instructions and Visual Observations to Actions with Reinforcement Learning

    Authors: Dipendra Misra, John Langford, Yoav Artzi

    Abstract: We propose to directly map raw visual observations and text input to actions for instruction execution. While existing approaches assume access to structured environment representations or use a pipeline of separately trained models, we learn a single model to jointly reason about linguistic and visual input. We use reinforcement learning in a contextual bandit setting to train a neural network ag…

    Submitted 22 July, 2017; v1 submitted 27 April, 2017; originally announced April 2017.

    Comments: In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017

  50. arXiv:1412.0691  [pdf, other]

    cs.AI cs.RO

    RoboBrain: Large-Scale Knowledge Engine for Robots

    Authors: Ashutosh Saxena, Ashesh Jain, Ozan Sener, Aditya Jami, Dipendra K. Misra, Hema S. Koppula

    Abstract: In this paper we introduce a knowledge engine, which learns and shares knowledge representations, for robots to carry out a variety of tasks. Building such an engine brings with it the challenge of dealing with multiple data modalities including symbols, natural language, haptic senses, robot trajectories, visual features and many others. The knowledge stored in the engine comes from mult…

    Submitted 12 April, 2015; v1 submitted 1 December, 2014; originally announced December 2014.

    Comments: 10 pages, 9 figures