Showing 1–30 of 30 results for author: Doshi, P

  1. arXiv:2311.08393  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    MVSA-Net: Multi-View State-Action Recognition for Robust and Deployable Trajectory Generation

    Authors: Ehsan Asali, Prashant Doshi, Jin Sun

    Abstract: The learn-from-observation (LfO) paradigm is a human-inspired mode for a robot to learn to perform a task simply by watching it being performed. LfO can facilitate robot integration on factory floors by minimizing disruption and reducing tedious programming. A key component of the LfO pipeline is a transformation of the depth camera frames to the corresponding task state and action pairs, which ar…

    Submitted 7 April, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

    Comments: Presented at Deployable AI Workshop at AAAI-2024 and 'Towards Reliable and Deployable Learning-Based Robotic Systems' Workshop at CoRL 2023

  2. arXiv:2311.03698  [pdf, other]

    cs.LG cs.AI cs.RO

    A Novel Variational Lower Bound for Inverse Reinforcement Learning

    Authors: Yikang Gui, Prashant Doshi

    Abstract: Inverse reinforcement learning (IRL) seeks to learn the reward function from expert trajectories, to understand the task for imitation or collaboration thereby removing the need for manual reward engineering. However, IRL in the context of large, high-dimensional problems with unknown dynamics has been particularly challenging. In this paper, we present a new Variational Lower Bound for IRL (VLB-I…

    Submitted 10 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

  3. arXiv:2311.02305  [pdf, other]

    cs.CV cs.AI cs.RO

    OSM vs HD Maps: Map Representations for Trajectory Prediction

    Authors: Jing-Yan Liao, Parth Doshi, Zihan Zhang, David Paz, Henrik Christensen

    Abstract: While High Definition (HD) Maps have long been favored for their precise depictions of static road elements, their accessibility constraints and susceptibility to rapid environmental changes impede the widespread deployment of autonomous driving, especially in the motion forecasting task. In this context, we propose to leverage OpenStreetMap (OSM) as a promising alternative to HD Maps for long-ter…

    Submitted 3 November, 2023; originally announced November 2023.

  4. arXiv:2305.05159  [pdf, other]

    cs.LG cs.AI cs.MA

    Latent Interactive A2C for Improved RL in Open Many-Agent Systems

    Authors: Keyang He, Prashant Doshi, Bikramjit Banerjee

    Abstract: There is a prevalence of multiagent reinforcement learning (MARL) methods that engage in centralized training. But, these methods involve obtaining various types of information from the other agents, which may not be feasible in competitive or adversarial settings. A recent method, the interactive advantage actor critic (IA2C), engages in decentralized training coupled with decentralized execution…

    Submitted 9 May, 2023; originally announced May 2023.

  5. arXiv:2208.06988  [pdf, ps, other]

    cs.LG cs.IT

    IRL with Partial Observations using the Principle of Uncertain Maximum Entropy

    Authors: Kenneth Bogert, Yikang Gui, Prashant Doshi

    Abstract: The principle of maximum entropy is a broadly applicable technique for computing a distribution with the least amount of information possible while constrained to match empirically estimated feature expectations. However, in many real-world applications that use noisy sensors computing the feature expectations may be challenging due to partial observation of the relevant model variables. For examp…

    Submitted 14 August, 2022; originally announced August 2022.

    Comments: 7 pages, 5 figures

  6. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  7. arXiv:2202.11188  [pdf, other]

    cs.MA

    SIPOMDPLite-Net: Lightweight, Self-Interested Learning and Planning in POSGs with Sparse Interactions

    Authors: Gengyu Zhang, Prashant Doshi

    Abstract: This work introduces sIPOMDPLite-net, a deep neural network (DNN) architecture for decentralized, self-interested agent control in partially observable stochastic games (POSGs) with sparse interactions between agents. The network learns to plan in contexts modeled by the interactive partially observable Markov decision process (I-POMDP) Lite framework and uses hierarchical value iteration networks…

    Submitted 22 February, 2022; originally announced February 2022.

    Comments: 7 pages, 4 figures, 1 table without the appendix

  8. arXiv:2109.07788  [pdf, other]

    cs.RO cs.AI cs.CV

    Marginal MAP Estimation for Inverse RL under Occlusion with Observer Noise

    Authors: Prasanth Sengadu Suresh, Prashant Doshi

    Abstract: We consider the problem of learning the behavioral preferences of an expert engaged in a task from noisy and partially-observable demonstrations. This is motivated by real-world applications such as a line robot learning from observing a human worker, where some observations are occluded by environmental objects that cannot be removed. Furthermore, robotic perception tends to be imperfect and nois…

    Submitted 16 September, 2021; originally announced September 2021.

  9. arXiv:2107.05818  [pdf, ps, other]

    cs.LG cs.RO

    A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments

    Authors: Kenneth Bogert, Prashant Doshi

    Abstract: Robots learning from observations in the real world using inverse reinforcement learning (IRL) may encounter objects or agents in the environment, other than the expert, that cause nuisance observations during the demonstration. These confounding elements are typically removed in fully-controlled environments such as virtual simulations or lab settings. When complete removal is impossible the nuis…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: 8 pages, 10 figures

    ACM Class: I.2.6; I.2.9

    Journal ref: Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems. 2022

  10. arXiv:2106.09825  [pdf, other]

    cs.LG cs.AI cs.MA

    Many Agent Reinforcement Learning Under Partial Observability

    Authors: Keyang He, Prashant Doshi, Bikramjit Banerjee

    Abstract: Recent renewed interest in multi-agent reinforcement learning (MARL) has generated an impressive array of techniques that leverage deep reinforcement learning, primarily actor-critic architectures, and can be applied to a limited range of settings in terms of observability and communication. However, a continuing limitation of much of this work is the curse of dimensionality when it comes to repre…

    Submitted 17 June, 2021; originally announced June 2021.

  11. arXiv:2010.08030  [pdf, other]

    cs.LG

    Cooperative-Competitive Reinforcement Learning with History-Dependent Rewards

    Authors: Keyang He, Bikramjit Banerjee, Prashant Doshi

    Abstract: Consider a typical organization whose worker agents seek to collectively cooperate for its general betterment. However, each individual agent simultaneously seeks to act to secure a larger chunk than its co-workers of the annual increment in compensation, which usually comes from a fixed pot. As such, the individual agent in the organization must cooperate and compete. Another feature of man…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 9 pages, 6 figures

  12. arXiv:2007.09512  [pdf, other]

    cs.MA cs.CR

    Active Deception using Factored Interactive POMDPs to Recognize Cyber Attacker's Intent

    Authors: Aditya Shinde, Prashant Doshi, Omid Setayeshfar

    Abstract: This paper presents an intelligent and adaptive agent that employs deception to recognize a cyber adversary's intent. Unlike previous approaches to cyber deception, which mainly focus on delaying or confusing the attackers, we focus on engaging with them to learn their intent. We model cyber deception as a sequential decision-making problem in a two-agent context. We introduce factored finitely ne…

    Submitted 18 July, 2020; originally announced July 2020.

  13. arXiv:2006.07300  [pdf, other]

    cs.AI cs.LG

    Recurrent Sum-Product-Max Networks for Decision Making in Perfectly-Observed Environments

    Authors: Hari Teja Tatavarti, Prashant Doshi, Layton Hayes

    Abstract: Recent investigations into sum-product-max networks (SPMN) that generalize sum-product networks (SPN) offer a data-driven alternative for decision making, which has predominantly relied on handcrafted models. SPMNs computationally represent a probabilistic decision-making problem whose solution scales linearly in the size of the network. However, SPMNs are not well suited for sequential decision m…

    Submitted 12 June, 2020; originally announced June 2020.

  14. arXiv:2004.12873  [pdf, ps, other]

    cs.LG cs.AI cs.RO stat.ML

    Maximum Entropy Multi-Task Inverse RL

    Authors: Saurabh Arora, Bikramjit Banerjee, Prashant Doshi

    Abstract: Multi-task IRL allows for the possibility that the expert could be switching between multiple ways of solving the same problem, or interleaving demonstrations of multiple tasks. The learner aims to learn the multiple reward functions that guide these ways of solving the problem. We present a new method for multi-task IRL that generalizes the well-known maximum entropy approach to IRL by combining…

    Submitted 27 April, 2020; originally announced April 2020.

  15. arXiv:1911.08642  [pdf, other]

    cs.MA

    Scalable Decision-Theoretic Planning in Open and Typed Multiagent Systems

    Authors: Adam Eck, Maulik Shah, Prashant Doshi, Leen-Kiat Soh

    Abstract: In open agent systems, the set of agents that are cooperating or competing changes over time and in ways that are nontrivial to predict. For example, if collaborative robots were tasked with fighting wildfires, they may run out of suppressants and be temporarily unavailable to assist their peers. We consider the problem of planning in these contexts with the additional challenges that the agents a…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: Pre-print with appendices for AAAI 2020

  16. arXiv:1909.00902  [pdf, other]

    cs.CR

    GrAALF: Supporting Graphical Analysis of Audit Logs for Forensics

    Authors: Omid Setayeshfar, Christian Adkins, Matthew Jones, Kyu Hyung Lee, Prashant Doshi

    Abstract: System-level audit logs often play a critical role in computer forensics. They capture low-level interactions between programs and users in much detail, making them a rich source of insight and provenance on malicious user activity. However, using these logs to discover and understand malicious activities when a typical computer generates more than 2.5 million system events hourly is both compute…

    Submitted 21 April, 2020; v1 submitted 2 September, 2019; originally announced September 2019.

  17. SA-Net: Deep Neural Network for Robot Trajectory Recognition from RGB-D Streams

    Authors: Nihal Soans, Ehsan Asali, Yi Hong, Prashant Doshi

    Abstract: Learning from demonstration (LfD) and imitation learning offer new paradigms for transferring task behavior to robots. A class of methods that enable such online learning require the robot to observe the task being performed and decompose the sensed streaming data into sequences of state-action pairs, which are then input to the methods. Thus, recognizing the state-action pairs correctly and quick…

    Submitted 28 August, 2020; v1 submitted 10 May, 2019; originally announced May 2019.

    Comments: (in press)

    Journal ref: ICRA 2020, pp. 2153-2159

  18. arXiv:1806.06877  [pdf, other]

    cs.LG stat.ML

    A Survey of Inverse Reinforcement Learning: Challenges, Methods and Progress

    Authors: Saurabh Arora, Prashant Doshi

    Abstract: Inverse reinforcement learning (IRL) is the problem of inferring the reward function of an agent, given its policy or observed behavior. Analogous to RL, IRL is perceived both as a problem and as a class of methods. By categorically surveying the current literature in IRL, this article serves as a reference for researchers and practitioners of machine learning and beyond to understand the challeng…

    Submitted 18 November, 2020; v1 submitted 18 June, 2018; originally announced June 2018.

  19. Reinforcement Learning for Heterogeneous Teams with PALO Bounds

    Authors: Roi Ceren, Prashant Doshi, Keyang He

    Abstract: We introduce reinforcement learning for heterogeneous teams in which rewards for an agent are additively factored into local costs, stimuli unique to each agent, and global rewards, those shared by all agents in the domain. Motivating domains include coordination of varied robotic platforms, which incur different costs for the same action, but share an overall goal. We present two templates for le…

    Submitted 23 May, 2018; originally announced May 2018.

    Journal ref: Neurocomputing, Volume 420, 8 January 2021, Pages 36-56

  20. arXiv:1805.07871  [pdf, other]

    cs.LG cs.AI stat.ML

    A Framework and Method for Online Inverse Reinforcement Learning

    Authors: Saurabh Arora, Prashant Doshi, Bikramjit Banerjee

    Abstract: Inverse reinforcement learning (IRL) is the problem of learning the preferences of an agent from the observations of its behavior on a task. While this problem has been well investigated, the related problem of online IRL, where the observations are incrementally accrued, yet the demands of the application often prohibit a full rerun of an IRL method, has received relatively less attention…

    Submitted 20 May, 2018; originally announced May 2018.

    Journal ref: Journal of Autonomous Agents and Multi-Agent Systems, Volume 35, Article number: 4 (2021)

  21. arXiv:1710.10116  [pdf, other]

    cs.RO cs.AI cs.LG

    Inverse Reinforcement Learning Under Noisy Observations

    Authors: Shervin Shahryari, Prashant Doshi

    Abstract: We consider the problem of performing inverse reinforcement learning when the trajectory of the expert is not perfectly observed by the learner. Instead, a noisy continuous-time observation of the trajectory is provided to the learner. This problem exhibits wide-ranging applications and the specific application we consider here is the scenario in which the learner seeks to penetrate a perimeter pa…

    Submitted 27 October, 2017; originally announced October 2017.

    Comments: Full version of the extended abstract published in AAMAS 2017 conference, pages 1733 - 1735

    Journal ref: In Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems (AAMAS '17). International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC, 1733-1735, 2017

  22. Freeway Merging in Congested Traffic based on Multipolicy Decision Making with Passive Actor Critic

    Authors: Tomoki Nishi, Prashant Doshi, Danil Prokhorov

    Abstract: Freeway merging in congested traffic is a significant challenge toward fully automated driving. Merging vehicles need to decide not only how to merge into a spot, but also where to merge. We present a method for the freeway merging based on multi-policy decision making with a reinforcement learning method called passive actor-critic (pAC), which learns with less knowledge of the system and w…

    Submitted 14 July, 2017; originally announced July 2017.

    Comments: 6 pages, 5 figures. ICML Workshop on Machine Learning for Autonomous Vehicles

  23. arXiv:1706.01077  [pdf, other]

    cs.AI

    Actor-Critic for Linearly-Solvable Continuous MDP with Partially Known Dynamics

    Authors: Tomoki Nishi, Prashant Doshi, Michael R. James, Danil Prokhorov

    Abstract: In many robotic applications, some aspects of the system dynamics can be modeled accurately while others are difficult to obtain or model. We present a novel reinforcement learning (RL) method for continuous state and action spaces that learns with partial knowledge of the system and without active exploration. It solves linearly-solvable Markov decision processes (L-MDPs), which are well suited f…

    Submitted 4 June, 2017; originally announced June 2017.

    Comments: 10 pages, 7 figures

  24. arXiv:1511.04412  [pdf, other]

    cs.LG cs.AI stat.ML

    Dynamic Sum Product Networks for Tractable Inference on Sequence Data (Extended Version)

    Authors: Mazen Melibari, Pascal Poupart, Prashant Doshi, George Trimponias

    Abstract: Sum-Product Networks (SPN) have recently emerged as a new class of tractable probabilistic graphical models. Unlike Bayesian networks and Markov networks where inference may be exponential in the size of the network, inference in SPNs is in time linear in the size of the network. Since SPNs represent distributions over a fixed set of variables only, we propose dynamic sum product networks (DSPNs)…

    Submitted 15 July, 2016; v1 submitted 13 November, 2015; originally announced November 2015.

    Comments: Published in the Proceedings of the International Conference on Probabilistic Graphical Models (PGM), 2016

  25. arXiv:1503.07220  [pdf, ps, other]

    cs.MA cs.AI cs.GT

    Individual Planning in Agent Populations: Exploiting Anonymity and Frame-Action Hypergraphs

    Authors: Ekhlas Sonu, Yingke Chen, Prashant Doshi

    Abstract: Interactive partially observable Markov decision processes (I-POMDP) provide a formal framework for planning for a self-interested agent in multiagent settings. An agent operating in a multiagent environment must deliberate about the actions that other agents may take and the effect these actions have on the environment and the rewards it receives. Traditional I-POMDPs model this dependence on the…

    Submitted 2 April, 2015; v1 submitted 24 March, 2015; originally announced March 2015.

    Comments: 8 page article plus two page appendix containing proofs in Proceedings of 25th International Conference on Automated Planning and Scheduling, 2015

    Journal ref: In Proceedings of 25th International Conference on Automated Planning and Scheduling, 2015

  26. arXiv:1409.0302  [pdf, ps, other]

    cs.MA cs.AI

    Team Behavior in Interactive Dynamic Influence Diagrams with Applications to Ad Hoc Teams

    Authors: Muthukumaran Chandrasekaran, Prashant Doshi, Yifeng Zeng, Yingke Chen

    Abstract: Planning for ad hoc teamwork is challenging because it involves agents collaborating without any prior coordination or communication. The focus is on principled methods for a single agent to cooperate with others. This motivates investigating the ad hoc teamwork problem in the context of individual decision making frameworks. However, individual decision making in multiagent settings faces the tas…

    Submitted 1 September, 2014; originally announced September 2014.

    Comments: 8 pages, Appeared in the MSDM Workshop at AAMAS 2014, Extended Abstract version appeared at AAMAS 2014, France

    MSC Class: 68T37

  27. Exploiting Model Equivalences for Solving Interactive Dynamic Influence Diagrams

    Authors: Yifeng Zeng, Prashant Doshi

    Abstract: We focus on the problem of sequential decision making in partially observable environments shared with other agents of uncertain types having similar or conflicting objectives. This problem has been previously formalized by multiple frameworks one of which is the interactive dynamic influence diagram (I-DID), which generalizes the well-known influence diagram to the mult…

    Submitted 18 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 43, pages 211-255, 2012

  28. Monte Carlo Sampling Methods for Approximating Interactive POMDPs

    Authors: Prashant Doshi, Piotr J. Gmytrasiewicz

    Abstract: Partially observable Markov decision processes (POMDPs) provide a principled framework for sequential planning in uncertain single agent settings. An extension of POMDPs to multiagent settings, called interactive POMDPs (I-POMDPs), replaces POMDP belief spaces with interactive hierarchical belief systems which represent an agent's belief about the physical world, about beliefs of other agents, and…

    Submitted 15 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 34, pages 297-337, 2009

  29. arXiv:1210.0595  [pdf]

    cs.IR cs.DB

    From Questions to Effective Answers: On the Utility of Knowledge-Driven Querying Systems for Life Sciences Data

    Authors: Amir H. Asiaee, Prashant Doshi, Todd Minning, Satya Sahoo, Priti Parikh, Amit Sheth, Rick L. Tarleton

    Abstract: We compare two distinct approaches for querying data in the context of the life sciences. The first approach utilizes conventional databases to store the data and intuitive form-based interfaces to facilitate easy querying of the data. These interfaces could be seen as implementing a set of "pre-canned" queries commonly used by the life science researchers that we study. The second approach is bas…

    Submitted 1 October, 2012; originally announced October 2012.

  30. arXiv:1109.2135  [pdf, ps]

    cs.AI cs.MA

    A Framework for Sequential Planning in Multi-Agent Settings

    Authors: P. Doshi, P. J. Gmytrasiewicz

    Abstract: This paper extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the notion of agent models into the state space. Agents maintain beliefs over physical states of the environment and over models of other agents, and they use Bayesian updates to maintain their beliefs over time. The solutions map belief states to actions. Models of…

    Submitted 9 September, 2011; originally announced September 2011.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 24, pages 49-79, 2005