Showing 1–28 of 28 results for author: Misra, V

  1. arXiv:2407.10954  [pdf, other]

    cs.GR cs.AI cs.LG

    A Unified Differentiable Boolean Operator with Fuzzy Logic

    Authors: Hsueh-Ti Derek Liu, Maneesh Agrawala, Cem Yuksel, Tim Omernick, Vinith Misra, Stefano Corazza, Morgan McGuire, Victor Zordan

    Abstract: This paper presents a unified differentiable boolean operator for implicit solid shape modeling using Constructive Solid Geometry (CSG). Traditional CSG relies on min, max operators to perform boolean operations on implicit shapes. But because these boolean operators are discontinuous and discrete in the choice of operations, this makes optimization over the CSG representation challenging. Drawing…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: SIGGRAPH'24

  2. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2402.03175  [pdf, other]

    cs.LG cs.AI

    The Matrix: A Bayesian learning model for LLMs

    Authors: Siddhartha Dalal, Vishal Misra

    Abstract: In this paper, we introduce a Bayesian learning model to understand the behavior of Large Language Models (LLMs). We explore the optimization metric of LLMs, which is based on predicting the next token, and develop a novel model grounded in this principle. Our approach involves constructing an ideal generative text model represented by a multinomial transition probability matrix with a prior, and…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 12 pages, 6 figures

    ACM Class: I.2.7

  4. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  5. arXiv:2311.02287  [pdf, other]

    cs.LG cs.AI

    Predicting Ground Reaction Force from Inertial Sensors

    Authors: Bowen Song, Marco Paolieri, Harper E. Stewart, Leana Golubchik, Jill L. McNitt-Gray, Vishal Misra, Devavrat Shah

    Abstract: The study of ground reaction forces (GRF) is used to characterize the mechanical loading experienced by individuals in movements such as running, which is clinically applicable to identify athletes at risk for stress-related injuries. Our aim in this paper is to determine if data collected with inertial measurement units (IMUs), that can be worn by athletes during outdoor runs, can be used to pred…

    Submitted 3 November, 2023; originally announced November 2023.

  6. arXiv:2305.10403  [pdf, other]

    cs.CL cs.AI

    PaLM 2 Technical Report

    Authors: Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, et al. (103 additional authors not shown)

    Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on…

    Submitted 13 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  7. arXiv:2303.14084  [pdf, other]

    cs.LG cs.DS stat.ML

    Differentially Private Synthetic Control

    Authors: Saeyoung Rho, Rachel Cummings, Vishal Misra

    Abstract: Synthetic control is a causal inference tool used to estimate the treatment effects of an intervention by creating synthetic counterfactual data. This approach combines measurements from other similar observations (i.e., donor pool) to predict a counterfactual time series of interest (i.e., target unit) by analyzing the relationship between the target and the donor pool before the intervention. A…

    Submitted 24 March, 2023; originally announced March 2023.

  8. arXiv:2302.04850  [pdf, other]

    cs.CV

    Robot Synesthesia: A Sound and Emotion Guided AI Painter

    Authors: Vihaan Misra, Peter Schaldenbrand, Jean Oh

    Abstract: If a picture paints a thousand words, sound may voice a million. While recent robotic painting and image synthesis methods have achieved progress in generating visuals from text inputs, the translation of sound into images is vastly unexplored. Generally, sound-based interfaces and sonic interactions have the potential to expand accessibility and control for the user and provide a means to convey…

    Submitted 23 May, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: 9 pages, 10 figures

  9. arXiv:2207.04901  [pdf, other]

    cs.CL cs.LG

    Exploring Length Generalization in Large Language Models

    Authors: Cem Anil, Yuhuai Wu, Anders Andreassen, Aitor Lewkowycz, Vedant Misra, Vinay Ramasesh, Ambrose Slone, Guy Gur-Ari, Ethan Dyer, Behnam Neyshabur

    Abstract: The ability to extrapolate from short problem instances to longer ones is an important form of out-of-distribution generalization in reasoning tasks, and is crucial when learning from datasets where longer problem instances are rare. These include theorem proving, solving quantitative mathematics problems, and reading/summarizing novels. In this paper, we run careful empirical studies exploring th…

    Submitted 14 November, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

  10. arXiv:2206.14858  [pdf, other]

    cs.CL cs.AI cs.LG

    Solving Quantitative Reasoning Problems with Language Models

    Authors: Aitor Lewkowycz, Anders Andreassen, David Dohan, Ethan Dyer, Henryk Michalewski, Vinay Ramasesh, Ambrose Slone, Cem Anil, Imanol Schlag, Theo Gutman-Solo, Yuhuai Wu, Behnam Neyshabur, Guy Gur-Ari, Vedant Misra

    Abstract: Language models have achieved remarkable performance on a wide range of tasks that require natural language understanding. Nevertheless, state-of-the-art models have generally struggled with tasks that require quantitative reasoning, such as solving mathematics, science, and engineering problems at the college level. To help close this gap, we introduce Minerva, a large language model pretrained o…

    Submitted 30 June, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: 12 pages, 5 figures + references and appendices

  11. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  12. arXiv:2204.12588  [pdf, other]

    cs.GT

    Bandwidth Allocation Games

    Authors: Niloofar Bayat, Vishal Misra, Dan Rubenstein

    Abstract: Internet providers often offer data plans that, for each user's monthly billing cycle, guarantee a fixed amount of data at high rates until a byte threshold is reached, at which point the user's data rate is throttled to a lower rate for the remainder of the cycle. In practice, the thresholds and rates of throttling can appear and may be somewhat arbitrary. In this paper, we evaluate the choice of…

    Submitted 26 April, 2022; originally announced April 2022.

  13. arXiv:2204.02311  [pdf, other]

    cs.CL

    PaLM: Scaling Language Modeling with Pathways

    Authors: Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, et al. (42 additional authors not shown)

    Abstract: Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application. To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Tran…

    Submitted 5 October, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

  14. arXiv:2201.02177  [pdf, other]

    cs.LG

    Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets

    Authors: Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra

    Abstract: In this paper we propose to study generalization of neural networks on small algorithmically generated datasets. In this setting, questions about data efficiency, memorization, generalization, and speed of learning can be studied in great detail. In some situations we show that neural networks learn through a process of "grokking" a pattern in the data, improving generalization performance from ra…

    Submitted 6 January, 2022; originally announced January 2022.

    Comments: Correspondence to alethea@openai.com. Code available at: https://github.com/openai/grok

  15. arXiv:2107.03374  [pdf, other]

    cs.LG

    Evaluating Large Language Models Trained on Code

    Authors: Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, et al. (33 additional authors not shown)

    Abstract: We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J sol…

    Submitted 14 July, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

    Comments: corrected typos, added references, added authors, added acknowledgements

  16. arXiv:2101.05677  [pdf, other]

    stat.OT cs.AI stat.AP

    Improving non-deterministic uncertainty modelling in Industry 4.0 scheduling

    Authors: Ashwin Misra, Ankit Mittal, Vihaan Misra, Deepanshu Pandey

    Abstract: The latest Industrial revolution has helped industries in achieving very high rates of productivity and efficiency. It has introduced data aggregation and cyber-physical systems to optimize planning and scheduling. Although, uncertainty in the environment and the imprecise nature of human operators are not accurately considered for into the decision making process. This leads to delays in consignm…

    Submitted 8 January, 2021; originally announced January 2021.

  17. arXiv:2009.09987  [pdf, other]

    cs.CY

    Synthetic Control, Synthetic Interventions, and COVID-19 spread: Exploring the impact of lockdown measures and herd immunity

    Authors: Niloofar Bayat, Cody Morrin, Yuheng Wang, Vishal Misra

    Abstract: The synthetic control method is an empirical methodology for causal inference using observational data. By observing the spread of COVID-19 throughout the world, we analyze the data on the number of deaths and cases in different regions using the power of prediction, counterfactual analysis, and synthetic interventions of the synthetic control and its extensions. We observe that the number of deaths and…

    Submitted 26 September, 2020; v1 submitted 21 September, 2020; originally announced September 2020.

  18. arXiv:2005.11197  [pdf, other]

    cs.CL

    Simplify-then-Translate: Automatic Preprocessing for Black-Box Machine Translation

    Authors: Sneha Mehta, Bahareh Azarnoush, Boris Chen, Avneesh Saluja, Vinith Misra, Ballav Bihani, Ritwik Kumar

    Abstract: Black-box machine translation systems have proven incredibly useful for a variety of applications yet by design are hard to adapt, tune to a specific domain, or build on top of. In this work, we introduce a method to improve such systems via automatic pre-processing (APP) using sentence simplification. We first propose a method to automatically generate a large in-domain paraphrase corpus through…

    Submitted 27 May, 2020; v1 submitted 22 May, 2020; originally announced May 2020.

  19. arXiv:2003.13371  [pdf, other]

    cs.GT

    Zero-Rating and Net Neutrality: Who Wins, Who Loses?

    Authors: Niloofar Bayat, Richard Ma, Vishal Misra, Dan Rubenstein

    Abstract: An objective of network neutrality is that the design of regulations for the Internet will ensure that it remains a public, open platform where innovations can thrive. While there is broad agreement that preserving the content quality of service falls under the purview of net neutrality, the role of differential pricing, especially the practice of \emph{zero-rating} remains controversial. Even th…

    Submitted 13 February, 2020; originally announced March 2020.

  20. arXiv:2003.07074  [pdf]

    cs.CY cs.CL cs.LG

    A Machine Learning Application for Raising WASH Awareness in the Times of COVID-19 Pandemic

    Authors: Rohan Pandey, Vaibhav Gautam, Ridam Pal, Harsh Bandhey, Lovedeep Singh Dhingra, Himanshu Sharma, Chirag Jain, Kanav Bhagat, Arushi, Lajjaben Patel, Mudit Agarwal, Samprati Agrawal, Rishabh Jalan, Akshat Wadhwa, Ayush Garg, Vihaan Misra, Yashwin Agrawal, Bhavika Rana, Ponnurangam Kumaraguru, Tavpritesh Sethi

    Abstract: Background: The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this Infodemic requires strong health messaging systems that are engaging, vernacular, scalable, effective and c…

    Submitted 30 October, 2020; v1 submitted 16 March, 2020; originally announced March 2020.

    Comments: 14 pages, 7 figures

  21. arXiv:1912.03357  [pdf, other]

    cs.NI

    Down for Failure: Active Power Status Monitoring

    Authors: Niloofar Bayat, Kunal Mahajan, Sam Denton, Vishal Misra, Dan Rubenstein

    Abstract: Despite society's strong dependence on electricity, power outages remain prevalent. Standard methods for directly measuring power availability are complex, often inaccurate, and are prone to attack. This paper explores an alternative approach to identifying power outages through intelligent monitoring of IP address availability. In finding these outages, we explore the trade-off between the accura…

    Submitted 22 November, 2019; originally announced December 2019.

  22. arXiv:1803.09211  [pdf, other]

    cs.LG cs.AI stat.ML

    Bernoulli Embeddings for Graphs

    Authors: Vinith Misra, Sumit Bhatia

    Abstract: Just as semantic hashing can accelerate information retrieval, binary valued embeddings can significantly reduce latency in the retrieval of graphical data. We introduce a simple but effective model for learning such binary vectors for nodes in a graph. By imagining the embeddings as independent coin flips of varying bias, continuous optimization techniques can be applied to the approximate expect…

    Submitted 25 March, 2018; originally announced March 2018.

    Comments: The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18)

  23. arXiv:1605.05753  [pdf, ps, other]

    cs.PF

    Delay Bounds for Multiclass FIFO

    Authors: Yuming Jiang, Vishal Misra

    Abstract: FIFO is perhaps the simplest scheduling discipline. For single-class FIFO, its delay guarantee performance has been extensively studied: The well-known results include a stochastic delay bound for $GI/GI/1$ by Kingman and a deterministic delay bound for $D/D/1$ by Cruz. However, for multiclass FIFO, few such results are available. To fill the gap, we prove delay bounds for multiclass FIFO in this…

    Submitted 25 August, 2017; v1 submitted 18 May, 2016; originally announced May 2016.

  24. arXiv:1211.5852  [pdf, other]

    cs.NI

    On the Evolution of the Internet Economic Ecosystem

    Authors: Richard T. B. Ma, John C. S. Lui, Vishal Misra

    Abstract: The evolution of the Internet has manifested itself in many ways: the traffic characteristics, the interconnection topologies and the business relationships among the autonomous components. It is important to understand why (and how) this evolution came about, and how the interplay of these dynamics may affect future evolution and services. We propose a network aware, macroscopic model that captur…

    Submitted 25 November, 2012; originally announced November 2012.

    Comments: 25 pages, 18 Figures

  25. Distributed Functional Scalar Quantization Simplified

    Authors: John Z. Sun, Vinith Misra, Vivek K Goyal

    Abstract: Distributed functional scalar quantization (DFSQ) theory provides optimality conditions and predicts performance of data acquisition systems in which a computation on acquired data is desired. We address two limitations of previous works: prohibitively expensive decoder design and a restriction to sources with bounded distributions. We rigorously show that a much simpler decoder has equivalent asy…

    Submitted 6 June, 2012; originally announced June 2012.

    Journal ref: IEEE Trans. on Signal Processing, vol. 61, no. 14, pp. 3495-3508, July 2013

  26. arXiv:1205.6974  [pdf, ps, other]

    cs.IT

    The Porosity of Additive Noise Sequences

    Authors: Vinith Misra, Tsachy Weissman

    Abstract: Consider a binary additive noise channel with noiseless feedback. When the noise is a stationary and ergodic process $\mathbf{Z}$, the capacity is $1-\mathbb{H}(\mathbf{Z})$ ($\mathbb{H}(\cdot)$ denoting the entropy rate). It is shown analogously that when the noise is a deterministic sequence $z^\infty$, the capacity under finite-state encoding and decoding is $1-\bar{\rho}(z^\infty)$, where…

    Submitted 31 May, 2012; originally announced May 2012.

    Comments: 22 pages, 9 figures

  27. arXiv:1106.3242  [pdf, other]

    cs.NI cs.GT

    The Public Option: a Non-regulatory Alternative to Network Neutrality

    Authors: Richard T. B. Ma, Vishal Misra

    Abstract: Network neutrality and the role of regulation on the Internet have been heavily debated in recent times. Amongst the various definitions of network neutrality, we focus on the one which prohibits paid prioritization of content and we present an analytical treatment of the topic. We develop a model of the Internet ecosystem in terms of three primary players: consumers, ISPs and content providers. O…

    Submitted 1 July, 2011; v1 submitted 16 June, 2011; originally announced June 2011.

  28. Distributed Scalar Quantization for Computing: High-Resolution Analysis and Extensions

    Authors: Vinith Misra, Vivek K Goyal, Lav R. Varshney

    Abstract: Communication of quantized information is frequently followed by a computation. We consider situations of \emph{distributed functional scalar quantization}: distributed scalar quantization of (possibly correlated) sources followed by centralized computation of a function. Under smoothness conditions on the sources and function, companding scalar quantizer designs are developed to minimize mean-squ…

    Submitted 12 May, 2011; v1 submitted 21 November, 2008; originally announced November 2008.

    Comments: 36 pages, 10 figures

    Journal ref: IEEE Trans. on Information Theory, vol. 57, no. 8, pp. 5298-5325, August 2011