Showing 1–50 of 206 results for author: Chang, T

  1. arXiv:2406.18865  [pdf, other]

    cs.LG stat.ML

    From Biased Selective Labels to Pseudo-Labels: An Expectation-Maximization Framework for Learning from Biased Decisions

    Authors: Trenton Chang, Jenna Wiens

    Abstract: Selective labels occur when label observations are subject to a decision-making process; e.g., diagnoses that depend on the administration of laboratory tests. We study a clinically-inspired selective label problem called disparate censorship, where labeling biases vary across subgroups and unlabeled individuals are imputed as "negative" (i.e., no diagnostic test = no illness). Machine learning mo…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: 39 pages, 33 figures. ICML 2024 conference paper

  2. arXiv:2406.13131  [pdf, other]

    cs.CL

    When Parts are Greater Than Sums: Individual LLM Components Can Outperform Full Models

    Authors: Ting-Yun Chang, Jesse Thomason, Robin Jia

    Abstract: This paper studies in-context learning (ICL) by decomposing the output of large language models into the individual contributions of attention heads and MLPs (components). We observe curious components: good-performing ones that individually do well on a classification task, even when the model performs poorly; bad-performing ones that do much worse than chance; and label-biased components that al…

    Submitted 24 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: fix typos and citations; appendix

  3. arXiv:2406.09923  [pdf, other]

    cs.CL cs.AI cs.LG

    CliBench: Multifaceted Evaluation of Large Language Models in Clinical Decisions on Diagnoses, Procedures, Lab Tests Orders and Prescriptions

    Authors: Mingyu Derek Ma, Chenchen Ye, Yu Yan, Xiaoxuan Wang, Peipei Ping, Timothy S Chang, Wei Wang

    Abstract: The integration of Artificial Intelligence (AI), especially Large Language Models (LLMs), into the clinical diagnosis process offers significant potential to improve the efficiency and accessibility of medical care. While LLMs have shown some promise in the medical domain, their application in clinical diagnosis remains underexplored, especially in real-world clinical practice, where highly sophis…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: Project page: https://clibench.github.io

  4. arXiv:2405.18881  [pdf, other]

    cs.LG cs.AI

    Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization

    Authors: Zhiwei Tang, Jiangweizhi Peng, Jiasheng Tang, Mingyi Hong, Fan Wang, Tsung-Hui Chang

    Abstract: In this work, we focus on the alignment problem of diffusion models with a continuous reward function, which represents specific objectives for downstream tasks, such as improving human preference. The central goal of the alignment problem is to adjust the distribution learned by diffusion models such that the generated samples maximize the target reward function. We propose a novel alignment appr…

    Submitted 3 July, 2024; v1 submitted 29 May, 2024; originally announced May 2024.

  5. arXiv:2405.12235  [pdf]

    cs.LG q-bio.QM

    Hypergraph: A Unified and Uniform Definition with Application to Chemical Hypergraph

    Authors: Daniel T. Chang

    Abstract: The conventional definition of hypergraph has two major issues: (1) there is not a standard definition of directed hypergraph and (2) there is not a formal definition of nested hypergraph. To resolve these issues, we propose a new definition of hypergraph that unifies the concepts of undirected, directed and nested hypergraphs, and that is uniform in using hyperedge as a single construct for repre…

    Submitted 18 June, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.03623 by other authors

  6. arXiv:2405.01610  [pdf, other]

    cs.CL cs.IR

    Automating the Analysis of Public Saliency and Attitudes towards Biodiversity from Digital Media

    Authors: Noah Giebink, Amrita Gupta, Diogo Verìssimo, Charlotte H. Chang, Tony Chang, Angela Brennan, Brett Dickson, Alex Bowmer, Jonathan Baillie

    Abstract: Measuring public attitudes toward wildlife provides crucial insights into our relationship with nature and helps monitor progress toward Global Biodiversity Framework targets. Yet, conducting such assessments at a global scale is challenging. Manually curating search terms for querying news and social media is tedious, costly, and can lead to biased results. Raw news and social media data returned…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: v0.1, 21 pages with 10 figures

  7. arXiv:2404.03586  [pdf, other]

    cs.LG stat.ML

    Leveraging Interpolation Models and Error Bounds for Verifiable Scientific Machine Learning

    Authors: Tyler Chang, Andrew Gillette, Romit Maulik

    Abstract: Effective verification and validation techniques for modern scientific machine learning workflows are challenging to devise. Statistical methods are abundant and easily deployed, but often rely on speculative assumptions about the data and methods involved. Error bounds for classical interpolation techniques can provide mathematically rigorous estimates of accuracy, but often are difficult or impr…

    Submitted 4 April, 2024; originally announced April 2024.

  8. arXiv:2404.00898  [pdf, other]

    cs.LG

    CAAP: Class-Dependent Automatic Data Augmentation Based On Adaptive Policies For Time Series

    Authors: Tien-Yu Chang, Hao Dai, Vincent S. Tseng

    Abstract: Data Augmentation is a common technique used to enhance the performance of deep learning models by expanding the training dataset. Automatic Data Augmentation (ADA) methods are getting popular because of their capacity to generate policies for various datasets. However, existing ADA methods primarily focused on overall performance improvement, neglecting the problem of class-dependent bias that le…

    Submitted 31 March, 2024; originally announced April 2024.

  9. arXiv:2403.17891  [pdf, other]

    cs.LG cs.AI

    Image-based Novel Fault Detection with Deep Learning Classifiers using Hierarchical Labels

    Authors: Nurettin Sergin, Jiayu Huang, Tzyy-Shuh Chang, Hao Yan

    Abstract: One important characteristic of modern fault classification systems is the ability to flag the system when faced with previously unseen fault types. This work considers the unknown fault detection capabilities of deep neural network-based fault classifiers. Specifically, we propose a methodology on how, when available, labels regarding the fault taxonomy can be used to increase unknown fault detec…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: Accepted in IISE Transaction

  10. arXiv:2403.14874  [pdf, other]

    cs.CV cs.LG

    WeatherProof: Leveraging Language Guidance for Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Nathan Wei, Matthew Waliman, Yunhao Ba, Celso de Melo, Alex Wong, Achuta Kadambi

    Abstract: We propose a method to infer semantic segmentation maps from images captured under adverse weather conditions. We begin by examining existing models on images degraded by weather conditions such as rain, fog, or snow, and find that they exhibit a large performance drop as compared to those captured under clear weather. To control for changes in scene structures, we propose WeatherProof, the first…

    Submitted 7 May, 2024; v1 submitted 21 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2312.09534

  11. arXiv:2403.13754  [pdf, other]

    cs.CL

    Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement

    Authors: Catherine Arnett, Pamela D. Rivière, Tyler A. Chang, Sean Trott

    Abstract: The relationship between language model tokenization and performance is an open area of research. Here, we investigate how different tokenization schemes impact number agreement in Spanish plurals. We find that morphologically-aligned tokenization performs similarly to other tokenization schemes, even when induced artificially for words that would not be tokenized that way during training. We then…

    Submitted 20 March, 2024; originally announced March 2024.

  12. arXiv:2403.09188  [pdf]

    cs.LG eess.SP

    Design of an basis-projected layer for sparse datasets in deep learning training using gc-ms spectra as a case study

    Authors: Yu Tang Chang, Shih Fang Chen

    Abstract: Deep learning (DL) models encompass millions or even billions of parameters and learn complex patterns from big data. However, not all data are initially stored in a suitable formation to effectively train a DL model, e.g., gas chromatography-mass spectrometry (GC-MS) spectra and DNA sequence. These datasets commonly contain many zero values, and the sparse data formation causes difficulties in op…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: 5 pages, 2 figures, 2 tables, conference

    MSC Class: 68-06 ACM Class: I.2.4; J.2

  13. arXiv:2403.08904  [pdf, other]

    cs.CL

    Detecting Hallucination and Coverage Errors in Retrieval Augmented Generation for Controversial Topics

    Authors: Tyler A. Chang, Katrin Tomanek, Jessica Hoffmann, Nithum Thain, Erin van Liemt, Kathleen Meier-Hellstern, Lucas Dixon

    Abstract: We explore a strategy to handle controversial topics in LLM-based chatbots based on Wikipedia's Neutral Point of View (NPOV) principle: acknowledge the absence of a single true answer and surface multiple perspectives. We frame this as retrieval augmented generation, where perspectives are retrieved from a knowledge base and the LLM is tasked with generating a fluent and faithful response from the…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  14. arXiv:2403.08553  [pdf, other]

    math.OC cs.LG eess.SY

    Regret Analysis of Policy Optimization over Submanifolds for Linearly Constrained Online LQG

    Authors: Ting-Jui Chang, Shahin Shahrampour

    Abstract: Recent advancement in online optimization and control has provided novel tools to study online linear quadratic regulator (LQR) problems, where cost matrices are varying adversarially over time. However, the controller parameterization of existing works may not satisfy practical conditions like sparsity due to physical connections. In this work, we study online linear quadratic Gaussian problems w…

    Submitted 13 March, 2024; originally announced March 2024.

  15. arXiv:2403.00686  [pdf, other]

    cs.CL

    A Bit of a Problem: Measurement Disparities in Dataset Sizes Across Languages

    Authors: Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

    Abstract: How should text dataset sizes be compared across languages? Even for content-matched (parallel) corpora, UTF-8 encoded text can require a dramatically different number of bytes for different languages. In our work, we define the byte premium between two languages as the ratio of bytes used to encode content-matched text in those languages. We compute byte premiums for 1155 languages, and we use li…

    Submitted 1 March, 2024; originally announced March 2024.

  16. arXiv:2402.09970  [pdf, other]

    cs.LG stat.ML

    Accelerating Parallel Sampling of Diffusion Models

    Authors: Zhiwei Tang, Jiasheng Tang, Hao Luo, Fan Wang, Tsung-Hui Chang

    Abstract: Diffusion models have emerged as state-of-the-art generative models for image generation. However, sampling from diffusion models is usually time-consuming due to the inherent autoregressive nature of their sampling process. In this work, we propose a novel approach that accelerates the sampling of diffusion models by parallelizing the autoregressive process. Specifically, we reformulate the sampl…

    Submitted 27 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

    Comments: ICML 2024

  17. arXiv:2402.09941  [pdf, other]

    cs.LG cs.AI stat.ML

    FedLion: Faster Adaptive Federated Optimization with Fewer Communication

    Authors: Zhiwei Tang, Tsung-Hui Chang

    Abstract: In Federated Learning (FL), a framework to train machine learning models across distributed data, well-known algorithms like FedAvg tend to have slow convergence rates, resulting in high communication costs during training. To address this challenge, we introduce FedLion, an adaptive federated optimization algorithm that seamlessly incorporates key elements from the recently proposed centralized a…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: ICASSP 2024

  18. arXiv:2401.15484  [pdf, other]

    cs.RO

    R$\times$R: Rapid eXploration for Reinforcement Learning via Sampling-based Reset Distributions and Imitation Pre-training

    Authors: Gagan Khandate, Tristan L. Saidi, Siqi Shang, Eric T. Chang, Yang Liu, Seth Dennis, Johnson Adams, Matei Ciocarlie

    Abstract: We present a method for enabling Reinforcement Learning of motor control policies for complex skills such as dexterous manipulation. We posit that a key difficulty for training such policies is the difficulty of exploring the problem state space, as the accessible and useful regions of this space form a complex structure along manifolds of the original high-dimensional state space. This work prese…

    Submitted 27 January, 2024; originally announced January 2024.

    Comments: 20 pages, 14 figures, submitted to Autonomous Robots, RSS 2023 Special Issue. arXiv admin note: substantial text overlap with arXiv:2303.03486

  19. arXiv:2401.12025  [pdf, other]

    cs.IT eess.SP math.OC

    A Survey of Recent Advances in Optimization Methods for Wireless Communications

    Authors: Ya-Feng Liu, Tsung-Hui Chang, Mingyi Hong, Zheyu Wu, Anthony Man-Cho So, Eduard A. Jorswieck, Wei Yu

    Abstract: Mathematical optimization is now widely regarded as an indispensable modeling and solution tool for the design of wireless communications systems. While optimization has played a significant role in the revolutionary progress in wireless communication and networking technologies from 1G to 5G and onto the future 6G, the innovations in wireless technologies have also substantially transformed the n…

    Submitted 7 June, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 39 pages, 5 figures, accepted for publication in IEEE Journal on Selected Areas in Communications

  20. arXiv:2401.06164  [pdf, other]

    q-fin.GN cs.LG

    Multimodal Gen-AI for Fundamental Investment Research

    Authors: Lezhi Li, Ting-Yu Chang, Hai Wang

    Abstract: This report outlines a transformative initiative in the financial investment industry, where the conventional decision-making process, laden with labor-intensive tasks such as sifting through voluminous documents, is being reimagined. Leveraging language models, our experiments aim to automate information summarization and investment idea generation. We seek to evaluate the effectiveness of fine-t…

    Submitted 23 December, 2023; originally announced January 2024.

  21. All Attention U-NET for Semantic Segmentation of Intracranial Hemorrhages In Head CT Images

    Authors: Chia Shuo Chang, Tian Sheuan Chang, Jiun Lin Yan, Li Ko

    Abstract: Head CT scans serve as a first-line tool to help specialists diagnose different types of intracranial hemorrhage. However, hemorrhages of the same type have diverse shapes, while different types can have confusingly similar shape, size and location. To solve this problem, this paper proposes an all attention U-Net. It uses channel attentions in the U-Net encoder side to enhance class specific feature extraction,…

    Submitted 16 December, 2023; originally announced December 2023.

    Comments: 2022 IEEE Biomedical Circuits and Systems Conference (BioCAS)

  22. arXiv:2312.09799  [pdf, other]

    eess.IV cs.AI cs.CV

    IQNet: Image Quality Assessment Guided Just Noticeable Difference Prefiltering For Versatile Video Coding

    Authors: Yu-Han Sun, Chiang Lo-Hsuan Lee, Tian-Sheuan Chang

    Abstract: Image prefiltering with just noticeable distortion (JND) improves coding efficiency in a visual lossless way by filtering the perceptually redundant information prior to compression. However, real JND cannot be well modeled with inaccurate masking equations in traditional approaches or image-level subject tests in deep learning approaches. Thus, this paper proposes a fine-grained JND prefiltering…

    Submitted 15 December, 2023; originally announced December 2023.

  23. arXiv:2312.09580  [pdf, other]

    cs.SD cs.AR eess.AS

    A 1.6-mW Sparse Deep Learning Accelerator for Speech Separation

    Authors: Chih-Chyau Yang, Tian-Sheuan Chang

    Abstract: Low power deep learning accelerators for speech processing enable real-time applications on edge devices. However, most of the existing accelerators suffer from high power consumption and focus on image applications only. This paper presents a low power accelerator for speech separation through algorithm and hardware optimizations. At the algorithm level, the model is compressed with structured…

    Submitted 15 December, 2023; originally announced December 2023.

    Journal ref: in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 31, no. 3, pp. 310-319, March 2023

  24. arXiv:2312.09534  [pdf, other]

    cs.CV

    WeatherProof: A Paired-Dataset Approach to Semantic Segmentation in Adverse Weather

    Authors: Blake Gella, Howard Zhang, Rishi Upadhyay, Tiffany Chang, Matthew Waliman, Yunhao Ba, Alex Wong, Achuta Kadambi

    Abstract: The introduction of large, foundational models to computer vision has led to drastically improved performance on the task of semantic segmentation. However, these existing methods exhibit a large performance drop when testing on images degraded by weather conditions such as rain, fog, or snow. We introduce a general paired-training method that can be applied to all current foundational model archi…

    Submitted 14 December, 2023; originally announced December 2023.

  25. arXiv:2312.08176  [pdf, other]

    cs.CV cs.AR eess.IV

    ASC: Adaptive Scale Feature Map Compression for Deep Neural Network

    Authors: Yuan Yao, Tian-Sheuan Chang

    Abstract: Deep-learning accelerators are increasingly in demand; however, their performance is constrained by the size of the feature map, leading to high bandwidth requirements and large buffer sizes. We propose an adaptive scale feature map compression technique leveraging the unique properties of the feature map. This technique adopts independent channel indexing given the weak channel correlation and ut…

    Submitted 13 December, 2023; originally announced December 2023.

  26. arXiv:2312.02213  [pdf, other]

    cs.LG cs.AI cs.DB stat.AP

    JarviX: A LLM No code Platform for Tabular Data Analysis and Optimization

    Authors: Shang-Ching Liu, ShengKun Wang, Wenqi Lin, Chung-Wei Hsiung, Yi-Chen Hsieh, Yu-Ping Cheng, Sian-Hong Luo, Tsungyao Chang, Jianwei Zhang

    Abstract: In this study, we introduce JarviX, a sophisticated data analytics framework. JarviX is designed to employ Large Language Models (LLMs) to facilitate an automated guide and execute high-precision data analyses on tabular datasets. This framework emphasizes the significance of varying column types, capitalizing on state-of-the-art LLMs to generate concise data insight summaries, propose relevant an…

    Submitted 3 December, 2023; originally announced December 2023.

  27. arXiv:2311.09205  [pdf, other]

    cs.CL

    When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages

    Authors: Tyler A. Chang, Catherine Arnett, Zhuowen Tu, Benjamin K. Bergen

    Abstract: Multilingual language models are widely used to extend NLP systems to low-resource languages. However, concrete evidence for the effects of multilinguality on language modeling performance in individual languages remains scarce. Here, we pre-train over 10,000 monolingual and multilingual language models for over 250 languages, including multiple language families that are under-studied in NLP. We…

    Submitted 15 November, 2023; originally announced November 2023.

  28. arXiv:2311.09194  [pdf, other]

    cs.CL

    Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models

    Authors: James A. Michaelov, Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

    Abstract: Abstract grammatical knowledge - of parts of speech and grammatical patterns - is key to the capacity for linguistic generalization in humans. But how abstract is grammatical knowledge in large language models? In the human literature, compelling evidence for grammatical abstraction comes from structural priming. A sentence that shares the same grammatical structure as a preceding sentence is proc…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted at EMNLP 2023

  29. arXiv:2311.09060  [pdf, other]

    cs.CL

    Do Localization Methods Actually Localize Memorized Data in LLMs? A Tale of Two Benchmarks

    Authors: Ting-Yun Chang, Jesse Thomason, Robin Jia

    Abstract: The concept of localization in LLMs is often mentioned in prior work; however, methods for localization have never been systematically and directly evaluated. We propose two complementary benchmarks that evaluate the ability of localization methods to pinpoint LLM components responsible for memorized data. In our INJ benchmark, we actively inject a piece of new information into a small subset of L…

    Submitted 2 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: accepted by NAACL 2024

  30. arXiv:2310.07929  [pdf, other]

    cs.CL

    Crosslingual Structural Priming and the Pre-Training Dynamics of Bilingual Language Models

    Authors: Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen

    Abstract: Do multilingual language models share abstract grammatical representations across languages, and if so, when do these develop? Following Sinclair et al. (2022), we use structural priming to test for abstract grammatical representations with causal effects on model outputs. We extend the approach to a Dutch-English bilingual setting, and we evaluate a Dutch-English language model during pre-trainin…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Extended abstract accepted to the 3rd Multilingual Representation Learning workshop at EMNLP 2023

  31. arXiv:2310.03206  [pdf, other]

    math.OC cs.LG eess.SY

    Regret Analysis of Distributed Online Control for LTI Systems with Adversarial Disturbances

    Authors: Ting-Jui Chang, Shahin Shahrampour

    Abstract: This paper addresses the distributed online control problem over a network of linear time-invariant (LTI) systems (with possibly unknown dynamics) in the presence of adversarial perturbations. There exists a global network cost that is characterized by a time-varying convex function, which evolves in an adversarial manner and is sequentially and partially observed by local agents. The goal of each…

    Submitted 4 October, 2023; originally announced October 2023.

  32. arXiv:2310.00206  [pdf, other]

    cs.RO

    An Investigation of Multi-feature Extraction and Super-resolution with Fast Microphone Arrays

    Authors: Eric T. Chang, Runsheng Wang, Peter Ballentine, Jingxi Xu, Trey Smith, Brian Coltin, Ioannis Kymissis, Matei Ciocarlie

    Abstract: In this work, we use MEMS microphones as vibration sensors to simultaneously classify texture and estimate contact position and velocity. Vibration sensors are an important facet of both human and robotic tactile sensing, providing fast detection of contact and onset of slip. Microphones are an attractive option for implementing vibration sensing as they offer a fast response and can be sampled qu…

    Submitted 7 March, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

    Comments: 6 pages, 4 figures, accepted to 2024 IEEE International Conference on Robotics and Automation (ICRA)

  33. arXiv:2309.14936  [pdf, other]

    cs.LG cs.DC

    Parallel Multi-Objective Hyperparameter Optimization with Uniform Normalization and Bounded Objectives

    Authors: Romain Egele, Tyler Chang, Yixuan Sun, Venkatram Vishwanath, Prasanna Balaprakash

    Abstract: Machine learning (ML) methods offer a wide range of configurable hyperparameters that have a significant influence on their performance. While accuracy is a commonly used performance objective, in many settings, it is not sufficient. Optimizing the ML models with respect to multiple objectives such as accuracy, confidence, fairness, calibration, privacy, latency, and memory consumption is becoming…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Preprint with appendices

  34. arXiv:2308.16139  [pdf, other]

    cs.CV cs.DB cs.LG

    MedShapeNet -- A Large-Scale Dataset of 3D Medical Shapes for Computer Vision

    Authors: Jianning Li, Zongwei Zhou, Jiancheng Yang, Antonio Pepe, Christina Gsaxner, Gijs Luijten, Chongyu Qu, Tiezheng Zhang, Xiaoxi Chen, Wenxuan Li, Marek Wodzinski, Paul Friedrich, Kangxian Xie, Yuan Jin, Narmada Ambigapathy, Enrico Nasca, Naida Solak, Gian Marco Melito, Viet Duc Vu, Afaque R. Memon, Christopher Schlachta, Sandrine De Ribaupierre, Rajnikant Patel, Roy Eagleson, Xiaojun Chen, et al. (132 additional authors not shown)

    Abstract: Prior to the deep learning era, shape was commonly used to describe the objects. Nowadays, state-of-the-art (SOTA) algorithms in medical imaging are predominantly diverging from computer vision, where voxel grids, meshes, point clouds, and implicit surface models are used. This is seen from numerous shape-related publications in premier vision conferences as well as the growing popularity of Shape…

    Submitted 12 December, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: 16 pages

    MSC Class: 68T01

  35. arXiv:2308.15807  [pdf, other]

    eess.IV cs.AR cs.CV

    ACNPU: A 4.75TOPS/W 1080P@30FPS Super Resolution Accelerator with Decoupled Asymmetric Convolution

    Authors: Tun-Hao Yang, Tian-Sheuan Chang

    Abstract: Deep learning-driven superresolution (SR) outperforms traditional techniques but also faces the challenge of high complexity and memory bandwidth. This challenge leads many accelerators to opt for simpler and shallow models like FSRCNN, compromising performance for real-time needs, especially for resource-limited edge devices. This paper proposes an energy-efficient SR accelerator, ACNPU, to tackl…

    Submitted 30 August, 2023; originally announced August 2023.

    Comments: 9 pages, 14 figures

  36. arXiv:2308.15419  [pdf, other]

    cs.CL

    Characterizing Learning Curves During Language Model Pre-Training: Learning, Forgetting, and Stability

    Authors: Tyler A. Chang, Zhuowen Tu, Benjamin K. Bergen

    Abstract: How do language models learn to make predictions during pre-training? To study this question, we extract learning curves from five autoregressive English language model pre-training runs, for 1M tokens in context. We observe that the language models generate short repetitive phrases before learning to generate longer and more coherent text. We quantify the final surprisal, within-run variability,…

    Submitted 29 August, 2023; originally announced August 2023.

  37. arXiv:2306.17089  [pdf]

    cs.LG cs.CL

    Concept-Oriented Deep Learning with Large Language Models

    Authors: Daniel T. Chang

    Abstract: Large Language Models (LLMs) have been successfully used in many natural-language tasks and applications including text generation and AI chatbots. They also are a promising new technology for concept-oriented deep learning (CODL). However, the prerequisite is that LLMs understand concepts and ensure conceptual consistency. We discuss these in this paper, as well as major uses of LLMs for CODL inc…

    Submitted 19 September, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

  38. arXiv:2306.15218  [pdf]

    cs.CV

    Semantic Segmentation Using Super Resolution Technique as Pre-Processing

    Authors: Chih-Chia Chen, Wei-Han Chen, Jen-Shiun Chiang, Chun-Tse Chien, Tingkai Chang

    Abstract: Combining high-level and low-level visual tasks is a common technique in the field of computer vision. This work integrates the technique of image super resolution to semantic segmentation for document image binarization. It demonstrates that using image super-resolution as a preprocessing step can effectively enhance the results and performance of semantic segmentation.

    Submitted 27 June, 2023; originally announced June 2023.

  39. arXiv:2306.13962  [pdf, other]

    cs.IT eess.SP math.OC

    QoS-based Beamforming and Compression Design for Cooperative Cellular Networks via Lagrangian Duality

    Authors: Xilai Fan, Ya-Feng Liu, Liang Liu, Tsung-Hui Chang

    Abstract: This paper considers the quality-of-service (QoS)-based joint beamforming and compression design problem in the downlink cooperative cellular network, where multiple relay-like base stations (BSs), connected to the central processor via rate-limited fronthaul links, cooperatively transmit messages to the users. The problem of interest is formulated as the minimization of the total transmit power o…

    Submitted 24 June, 2023; originally announced June 2023.

    Comments: 15 pages, 7 figures, submitted for possible publication

  40. arXiv:2305.18446  [pdf, other]

    cs.LG

    Trompt: Towards a Better Deep Neural Network for Tabular Data

    Authors: Kuan-Yu Chen, Ping-Han Chiang, Hsin-Rung Chou, Ting-Wei Chen, Tien-Hao Chang

    Abstract: Tabular data is arguably one of the most commonly used data structures in various practical domains, including finance, healthcare and e-commerce. The inherent heterogeneity allows tabular data to store rich information. However, based on a recently published tabular benchmark, we can see deep neural networks still fall behind tree-based models on tabular datasets. In this paper, we propose Trompt…

    Submitted 30 May, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: ICML'23 (poster)

  41. arXiv:2305.17127  [pdf, other]

    cs.CL

    Characterizing and Measuring Linguistic Dataset Drift

    Authors: Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth

    Abstract: NLP models often degrade in performance when real world data distributions differ markedly from training data. However, existing dataset drift metrics in NLP have generally not considered specific dimensions of linguistic drift that affect model performance, and they have not been validated in their ability to predict model performance at the individual example level, where such metrics are often…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted to ACL 2023

  42. arXiv:2304.09981  [pdf, other]

    stat.ME cs.LG q-bio.QM

    Interpretable (not just posthoc-explainable) heterogeneous survivor bias-corrected treatment effects for assignment of postdischarge interventions to prevent readmissions

    Authors: Hongjing Xia, Joshua C. Chang, Sarah Nowak, Sonya Mahajan, Rohit Mahajan, Ted L. Chang, Carson C. Chow

    Abstract: We used survival analysis to quantify the impact of postdischarge evaluation and management (E/M) services in preventing hospital readmission or death. Our approach avoids a specific pitfall of applying machine learning to this problem, which is an inflated estimate of the effect of interventions due to survivor bias, where the magnitude of inflation may be conditional on heterogeneous confoun…

    Submitted 3 August, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: Submitted

    Journal ref: PMLR 219:884-905, 2023

  43. arXiv:2304.07445  [pdf, other]

    cs.LG

    A framework for fully autonomous design of materials via multiobjective optimization and active learning: challenges and next steps

    Authors: Tyler H. Chang, Jakob R. Elias, Stefan M. Wild, Santanu Chaudhuri, Joseph A. Libera

    Abstract: In order to deploy machine learning in a real-world self-driving laboratory where data acquisition is costly and there are multiple competing design criteria, systems need to be able to intelligently sample while balancing performance trade-offs and constraints. For these reasons, we present an active learning process based on multiobjective black-box optimization with continuously updated machine…

    Submitted 14 April, 2023; originally announced April 2023.

  44. arXiv:2304.06881  [pdf, other]

    math.OC cs.MS

    Designing a Framework for Solving Multiobjective Simulation Optimization Problems

    Authors: Tyler H. Chang, Stefan M. Wild

    Abstract: Multiobjective simulation optimization (MOSO) problems are optimization problems with multiple conflicting objectives, where evaluation of at least one of the objectives depends on a black-box numerical code or real-world experiment, which we refer to as a simulation. This paper describes the design goals driving the development of the parallel MOSO library ParMOO. We derive these goals from the r…

    Submitted 6 July, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

  45. arXiv:2303.11504  [pdf, ps, other]

    cs.CL

    Language Model Behavior: A Comprehensive Survey

    Authors: Tyler A. Chang, Benjamin K. Bergen

    Abstract: Transformer language models have received widespread public attention, yet their generated text is often surprising even to NLP researchers. In this survey, we discuss over 250 recent studies of English language model behavior before task-specific fine-tuning. Language models possess basic capabilities in syntax, semantics, pragmatics, world knowledge, and reasoning, but these capabilities are sen…

    Submitted 25 August, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: 32 pages, accepted to Computational Linguistics

  46. arXiv:2303.03751  [pdf, other]

    cs.LG cs.AI

    Zeroth-Order Optimization Meets Human Feedback: Provable Learning via Ranking Oracles

    Authors: Zhiwei Tang, Dmitry Rybin, Tsung-Hui Chang

    Abstract: In this study, we delve into an emerging optimization challenge involving a black-box objective function that can only be gauged via a ranking oracle, a situation frequently encountered in real-world scenarios, especially when the function is evaluated by human judges. This challenge is inspired by Reinforcement Learning with Human Feedback (RLHF), an approach recently employed to enhance the per…

    Submitted 13 April, 2024; v1 submitted 7 March, 2023; originally announced March 2023.

    Comments: ICLR 2024

  47. arXiv:2303.03486  [pdf, other]

    cs.RO

    Sampling-based Exploration for Reinforcement Learning of Dexterous Manipulation

    Authors: Gagan Khandate, Siqi Shang, Eric T. Chang, Tristan Luca Saidi, Yang Liu, Seth Matthew Dennis, Johnson Adams, Matei Ciocarlie

    Abstract: In this paper, we present a novel method for achieving dexterous manipulation of complex objects, while simultaneously securing the object without the use of passive support surfaces. We posit that a key difficulty for training such policies in a Reinforcement Learning framework is the difficulty of exploring the problem state space, as the accessible regions of this space form a complex structure…

    Submitted 23 May, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

    Comments: 10 pages, 7 figures, accepted at Robotics Science & Systems 2023

  48. arXiv:2303.02469  [pdf]

    cs.CL cs.LG

    Variational Quantum Classifiers for Natural-Language Text

    Authors: Daniel T. Chang

    Abstract: As part of the recent research effort on quantum natural language processing (QNLP), variational quantum sentence classifiers (VQSCs) have been implemented and supported in lambeq / DisCoPy, based on the DisCoCat model of sentence meaning. We discuss in some detail VQSCs, including category theory, DisCoCat for modeling sentence as string diagram, and DisCoPy for encoding string diagram as paramet…

    Submitted 4 March, 2023; originally announced March 2023.

  49. arXiv:2302.13571  [pdf, other]

    cs.LG cs.AI

    FLAG: Fast Label-Adaptive Aggregation for Multi-label Classification in Federated Learning

    Authors: Shih-Fang Chang, Benny Wei-Yun Hsu, Tien-Yu Chang, Vincent S. Tseng

    Abstract: Federated learning aims to share private data to maximize the data utility without privacy leakage. Previous federated learning research mainly focuses on multi-class classification problems. However, multi-label classification is a crucial research problem close to real-world data properties. Nevertheless, a limited number of federated learning studies explore this research problem. Existing stud…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 16 pages, 6 figures, and 2 tables

  50. arXiv:2302.12320  [pdf, other]

    math.OC cs.LG eess.SY

    Dynamic Regret Analysis of Safe Distributed Online Optimization for Convex and Non-convex Problems

    Authors: Ting-Jui Chang, Sapana Chaudhary, Dileep Kalathil, Shahin Shahrampour

    Abstract: This paper addresses safe distributed online optimization over an unknown set of linear safety constraints. A network of agents aims at jointly minimizing a global, time-varying function, which is only partially observable to each individual agent. Therefore, agents must engage in local communications to generate a safe sequence of actions competitive with the best minimizer sequence in hindsight,…

    Submitted 23 February, 2023; originally announced February 2023.