Showing 1–25 of 25 results for author: Chu, E

  1. arXiv:2406.14981  [pdf, other]

    cs.AI cs.HC

    Human-AI collectives produce the most accurate differential diagnoses

    Authors: N. Zöller, J. Berger, I. Lin, N. Fu, J. Komarneni, G. Barabucci, K. Laskowski, V. Shia, B. Harack, E. A. Chu, V. Trianni, R. H. J. M. Kurvers, S. M. Herzog

    Abstract: Artificial intelligence systems, particularly large language models (LLMs), are increasingly being employed in high-stakes decisions that impact both individuals and society at large, often without adequate safeguards to ensure safety, quality, and equity. Yet LLMs hallucinate, lack common sense, and are biased - shortcomings that may reflect LLMs' inherent limitations and thus may not be remedied…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2404.11018  [pdf, other]

    cs.LG cs.AI cs.CL

    Many-Shot In-Context Learning

    Authors: Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle

    Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative…

    Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  3. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  4. CharNeRF: 3D Character Generation from Concept Art

    Authors: Eddy Chu, Yiyang Chen, Chedy Raissi, Anand Bhojan

    Abstract: 3D modeling holds significant importance in the realms of AR/VR and gaming, allowing for both artistic creativity and practical applications. However, the process is often time-consuming and demands a high level of skill. In this paper, we present a novel approach to create volumetric representations of 3D characters from consistent turnaround concept art, which serves as the standard input in the…

    Submitted 26 February, 2024; originally announced February 2024.

  5. arXiv:2402.08806  [pdf, other]

    cs.AI

    Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy

    Authors: Gioele Barabucci, Victor Shia, Eugene Chu, Benjamin Harack, Nathan Fu

    Abstract: Background: Large language models (LLMs) such as OpenAI's GPT-4 or Google's PaLM 2 are proposed as viable diagnostic support tools or even spoken of as replacements for "curbside consults". However, even LLMs specifically trained on medical topics may lack sufficient diagnostic accuracy for real-life applications. Methods: Using collective intelligence methods and a dataset of 200 clinical vigne…

    Submitted 13 February, 2024; originally announced February 2024.

    Comments: 5 pages, 2 figures, 1 table

    ACM Class: I.2.1; J.3

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2309.11059  [pdf, other]

    eess.AS cs.SD

    Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement

    Authors: Shafique Ahmed, Chia-Wei Chen, Wenze Ren, Chin-Jou Li, Ernie Chu, Jun-Cheng Chen, Amir Hussain, Hsin-Min Wang, Yu Tsao, Jen-Cheng Hou

    Abstract: Recent studies have increasingly acknowledged the advantages of incorporating visual data into speech enhancement (SE) systems. In this paper, we introduce a novel audio-visual SE approach, termed DCUC-Net (deep complex U-Net with conformer network). The proposed DCUC-Net leverages complex domain features and a stack of conformer blocks. The encoder and decoder of DCUC-Net are designed using a com…

    Submitted 8 October, 2023; v1 submitted 20 September, 2023; originally announced September 2023.

  8. arXiv:2308.10079  [pdf, other]

    cs.CV

    MeDM: Mediating Image Diffusion Models for Video-to-Video Translation with Temporal Correspondence Guidance

    Authors: Ernie Chu, Tzuhsuan Huang, Shuo-Yen Lin, Jun-Cheng Chen

    Abstract: This study introduces an efficient and effective method, MeDM, that utilizes pre-trained image Diffusion Models for video-to-video translation with consistent temporal flow. The proposed framework can render videos from scene position information, such as a normal G-buffer, or perform text-guided editing on videos captured in real-world scenarios. We employ explicit optical flows to construct a pr…

    Submitted 20 December, 2023; v1 submitted 19 August, 2023; originally announced August 2023.

    Comments: Accepted as a conference paper in AAAI 2024. Project page: https://medm2023.github.io

  9. arXiv:2307.08076  [pdf, other]

    cs.CV

    Diffusion to Confusion: Naturalistic Adversarial Patch Generation Based on Diffusion Model for Object Detector

    Authors: Shuo-Yen Lin, Ernie Chu, Che-Hsien Lin, Jun-Cheng Chen, Jia-Ching Wang

    Abstract: Many physical adversarial patch generation methods are widely proposed to protect personal privacy from malicious monitoring using object detectors. However, they usually fail to generate satisfactory patch images in terms of both stealthiness and attack performance without making huge efforts on careful hyperparameter tuning. To address this issue, we propose a novel naturalistic adversarial patc…

    Submitted 16 July, 2023; originally announced July 2023.

  10. arXiv:2305.19193  [pdf, other]

    cs.CV

    Video ControlNet: Towards Temporally Consistent Synthetic-to-Real Video Translation Using Conditional Image Diffusion Models

    Authors: Ernie Chu, Shuo-Yen Lin, Jun-Cheng Chen

    Abstract: In this study, we present an efficient and effective approach for achieving temporally consistent synthetic-to-real video translation in videos of varying lengths. Our method leverages off-the-shelf conditional image diffusion models, allowing us to perform multiple synthetic-to-real image generations in parallel. By utilizing the available optical flow information from the synthetic videos, our a…

    Submitted 30 May, 2023; originally announced May 2023.

  11. arXiv:2305.10403  [pdf, other]

    cs.CL cs.AI

    PaLM 2 Technical Report

    Authors: Rohan Anil, Andrew M. Dai, Orhan Firat, Melvin Johnson, Dmitry Lepikhin, Alexandre Passos, Siamak Shakeri, Emanuel Taropa, Paige Bailey, Zhifeng Chen, Eric Chu, Jonathan H. Clark, Laurent El Shafey, Yanping Huang, Kathy Meier-Hellstern, Gaurav Mishra, Erica Moreira, Mark Omernick, Kevin Robinson, Sebastian Ruder, Yi Tay, Kefan Xiao, Yuanzhong Xu, Yujing Zhang, Gustavo Hernandez Abrego, et al. (103 additional authors not shown)

    Abstract: We introduce PaLM 2, a new state-of-the-art language model that has better multilingual and reasoning capabilities and is more compute-efficient than its predecessor PaLM. PaLM 2 is a Transformer-based model trained using a mixture of objectives. Through extensive evaluations on English and multilingual language, and reasoning tasks, we demonstrate that PaLM 2 has significantly improved quality on…

    Submitted 13 September, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

  12. arXiv:2304.06335  [pdf]

    cs.LG eess.SP

    Deep Learning-based Fall Detection Algorithm Using Ensemble Model of Coarse-fine CNN and GRU Networks

    Authors: Chien-Pin Liu, Ju-Hsuan Li, En-Ping Chu, Chia-Yeh Hsieh, Kai-Chun Liu, Chia-Tai Chan, Yu Tsao

    Abstract: Falls are a public health issue for the elderly all over the world, since fall-induced injuries are associated with a large amount of healthcare cost. Falls can cause serious injuries, even leading to death if the elderly person suffers a "long-lie". Hence, a reliable fall detection (FD) system is required to provide an emergency alarm for first aid. Due to the advances in wearable device technology…

    Submitted 13 April, 2023; originally announced April 2023.

  13. arXiv:2303.16779  [pdf, other]

    cs.CL cs.LG

    Language Models Trained on Media Diets Can Predict Public Opinion

    Authors: Eric Chu, Jacob Andreas, Stephen Ansolabehere, Deb Roy

    Abstract: Public opinion reflects and shapes societal behavior, but the traditional survey-based tools to measure it are limited. We introduce a novel approach to probe media diet models -- language models adapted to online news, TV broadcast, or radio show content -- that can emulate the opinions of subpopulations that have consumed a set of media. To validate this method, we use as ground truth the opinio…

    Submitted 28 March, 2023; originally announced March 2023.

  14. arXiv:2210.17152  [pdf, other]

    cs.SD eess.AS

    Audio Time-Scale Modification with Temporal Compressing Networks

    Authors: Ernie Chu, Ju-Ting Chen, Chia-Ping Chen

    Abstract: We propose a novel approach for time-scale modification of audio signals. Unlike traditional methods that rely on the framing technique or the short-time Fourier transform to preserve the frequency during temporal stretching, our neural network model encodes the raw audio into a high-level latent representation, dubbed Neuralgram, where each vector represents 1024 audio sample points. Due to a suf…

    Submitted 6 October, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

  15. arXiv:2210.05063  [pdf, other]

    cs.CV

    Improving Dense Contrastive Learning with Dense Negative Pairs

    Authors: Berk Iskender, Zhenlin Xu, Simon Kornblith, En-Hung Chu, Maryam Khademi

    Abstract: Many contrastive representation learning methods learn a single global representation of an entire image. However, dense contrastive representation learning methods such as DenseCL (Wang et al., 2021) can learn better representations for tasks requiring stronger spatial localization of features, such as multi-label classification, detection, and segmentation. In this work, we study how to improve…

    Submitted 10 January, 2023; v1 submitted 10 October, 2022; originally announced October 2022.

  16. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  17. arXiv:2111.04839  [pdf, other]

    cs.CV

    Evolving Evocative 2D Views of Generated 3D Objects

    Authors: Eric Chu

    Abstract: We present a method for jointly generating 3D models of objects and 2D renders at different viewing angles, with the process guided by ImageNet and CLIP-based models. Our results indicate that it can generate anamorphic objects, with renders that both evoke the target caption and look visually appealing.

    Submitted 8 November, 2021; originally announced November 2021.

    Journal ref: NeurIPS 2021 Workshop on Machine Learning for Creativity and Design

  18. arXiv:2007.12248  [pdf, other]

    cs.LG cs.AI cs.CV cs.HC stat.ML

    Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction

    Authors: Eric Chu, Deb Roy, Jacob Andreas

    Abstract: We present a randomized controlled trial for a model-in-the-loop regression task, with the goal of measuring the extent to which (1) good explanations of model predictions increase human accuracy, and (2) faulty explanations decrease human trust in the model. We study explanations based on visual saliency in an image-based age prediction task for which humans and learned models are individually ca…

    Submitted 23 July, 2020; originally announced July 2020.

  19. arXiv:2004.09551  [pdf, other]

    cs.CY cs.AI

    Games for Fairness and Interpretability

    Authors: Eric Chu, Nabeel Gillani, Sneha Priscilla Makini

    Abstract: As Machine Learning (ML) systems become more ubiquitous, ensuring the fair and equitable application of their underlying algorithms is of paramount importance. We argue that one way to achieve this is to proactively cultivate public pressure for ML developers to design and develop fairer algorithms -- and that one way to cultivate public pressure while simultaneously serving the interests and obj…

    Submitted 20 April, 2020; originally announced April 2020.

  20. arXiv:1810.08717  [pdf, other]

    cs.CL

    Learning Personas from Dialogue with Attentive Memory Networks

    Authors: Eric Chu, Prashanth Vijayaraghavan, Deb Roy

    Abstract: The ability to infer persona from dialogue can have applications in areas ranging from computational narrative analysis to personalized dialogue generation. We introduce neural models to learn persona embeddings in a supervised character trope classification task. The models encode dialogue snippets from IMDB into representations that can capture the various categories of film characters. The best…

    Submitted 19 October, 2018; originally announced October 2018.

    Comments: Accepted EMNLP Long Paper

  21. arXiv:1810.05739  [pdf, other]

    cs.CL

    MeanSum: A Neural Model for Unsupervised Multi-document Abstractive Summarization

    Authors: Eric Chu, Peter J. Liu

    Abstract: Abstractive summarization has been studied using neural sequence transduction methods with datasets of large, paired document-summary examples. However, such datasets are rare and the models trained from them do not generalize to other domains. Recently, some progress has been made in learning sequence-to-sequence mappings with only unpaired examples. In our work, we consider the setting where the…

    Submitted 22 May, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: Accepted to ICML 2019

  22. arXiv:1712.02896  [pdf, other]

    cs.CV cs.CL

    Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies

    Authors: Eric Chu, Deb Roy

    Abstract: Stories can have tremendous power -- not only useful for entertainment, they can activate our interests and mobilize our actions. The degree to which a story resonates with its audience may be in part reflected in the emotional journey it takes the audience upon. In this paper, we use machine learning methods to construct emotional arcs in movies, calculate families of arcs, and demonstrate the ab…

    Submitted 7 December, 2017; originally announced December 2017.

    Comments: Data Mining (ICDM), 2017 IEEE 17th International Conference on

  23. arXiv:1602.02426  [pdf, other]

    cs.SI physics.soc-ph

    Human Atlas: A Tool for Mapping Social Networks

    Authors: Martin Saveski, Eric Chu, Soroush Vosoughi, Deb Roy

    Abstract: Most social network analyses focus on online social networks. While these networks encode important aspects of our lives they fail to capture many real-world connections. Most of these connections are, in fact, public and known to the members of the community. Mapping them is a task very suitable for crowdsourcing: it is easily broken down in many simple and independent subtasks. Due to the nature…

    Submitted 10 February, 2016; v1 submitted 7 February, 2016; originally announced February 2016.

    Comments: WWW'16 Demonstration, WWW'16 Companion, April 11-15, 2016, Montreal, Quebec, Canada

    ACM Class: H.5.2, H.3.4

  24. arXiv:1204.1106  [pdf, ps, other]

    math.OC cs.DC eess.SY

    Message Passing for Dynamic Network Energy Management

    Authors: Matt Kraning, Eric Chu, Javad Lavaei, Stephen Boyd

    Abstract: We consider a network of devices, such as generators, fixed loads, deferrable loads, and storage devices, each with its own dynamic constraints and objective, connected by lossy capacitated lines. The problem is to minimize the total network objective subject to the device and line constraints, over a given time horizon. This is a large optimization problem, with variables for consumption or gener…

    Submitted 4 April, 2012; originally announced April 2012.

    Comments: Submitted to IEEE Transactions on Smart grid

  25. arXiv:0909.1783  [pdf]

    cs.DB cs.IR

    The Case for a Structured Approach to Managing Unstructured Data

    Authors: AnHai Doan, Jeff Naughton, Akanksha Baid, Xiaoyong Chai, Fei Chen, Ting Chen, Eric Chu, Pedro DeRose, Byron Gao, Chaitanya Gokhale, Jiansheng Huang, Warren Shen, Ba-Quy Vuong

    Abstract: The challenge of managing unstructured data represents perhaps the largest data management opportunity for our community since managing relational data. And yet we are risking letting this opportunity go by, ceding the playing field to other players, ranging from communities such as AI, KDD, IR, Web, and Semantic Web, to industrial players such as Google, Yahoo, and Microsoft. In this essay we e…

    Submitted 9 September, 2009; originally announced September 2009.

    Comments: CIDR 2009