Showing 1–50 of 2,685 results for author: Lee, J

  1. arXiv:2407.13515  [pdf, other]

    cs.HC

    CookAR: Affordance Augmentations in Wearable AR to Support Kitchen Tool Interactions for People with Low Vision

    Authors: Jaewook Lee, Andrew D. Tjahjadi, Jiho Kim, Junpu Yu, Minji Park, Jiawen Zhang, Jon E. Froehlich, Yapeng Tian, Yuhang Zhao

    Abstract: Cooking is a central activity of daily living, supporting independence and both mental and physical health. However, prior work has highlighted key barriers for people with low vision (LV) to cook, particularly around safely interacting with cooking tools, such as sharp knives or hot pans. Drawing on recent advancements in computer vision (CV) and robotics, we present CookAR, a head-mounted AR sys…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.13427  [pdf, other]

    cs.CE cs.AI

    DeepClair: Utilizing Market Forecasts for Effective Portfolio Selection

    Authors: Donghee Choi, Jinkyu Kim, Mogan Gim, Jinho Lee, Jaewoo Kang

    Abstract: Utilizing market forecasts is pivotal in optimizing portfolio selection strategies. We introduce DeepClair, a novel framework for portfolio selection. DeepClair leverages a transformer-based time-series forecasting model to predict market trends, facilitating more informed and adaptable portfolio decisions. To integrate the forecasting model into a deep reinforcement learning-driven portfolio sele…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: CIKM 2024 Accepted

  3. arXiv:2407.13399  [pdf, other]

    cs.AI cs.CL cs.LG

    Correcting the Mythos of KL-Regularization: Direct Alignment without Overparameterization via Chi-squared Preference Optimization

    Authors: Audrey Huang, Wenhao Zhan, Tengyang Xie, Jason D. Lee, Wen Sun, Akshay Krishnamurthy, Dylan J. Foster

    Abstract: Language model alignment methods, such as reinforcement learning from human feedback (RLHF), have led to impressive advances in language model capabilities, but existing techniques are limited by a widely observed phenomenon known as overoptimization, where the quality of the language model plateaus or degrades over the course of the alignment process. Overoptimization is often attributed to overf…

    Submitted 18 July, 2024; originally announced July 2024.

  4. arXiv:2407.13146  [pdf, other]

    cs.LG cs.AI

    PG-Rainbow: Using Distributional Reinforcement Learning in Policy Gradient Methods

    Authors: WooJae Jeon, KanJun Lee, Jeewoo Lee

    Abstract: This paper introduces PG-Rainbow, a novel algorithm that incorporates a distributional reinforcement learning framework with a policy gradient algorithm. Existing policy gradient methods are sample inefficient and rely on the mean of returns when calculating the state-action value function, neglecting the distributional nature of returns in reinforcement learning tasks. To address this issue, we u…

    Submitted 18 July, 2024; originally announced July 2024.

  5. arXiv:2407.13064  [pdf]

    cs.DL cs.CY

    On the modification and revocation of open source licences

    Authors: Paul Gagnon, Misha Benjamin, Justine Gauthier, Catherine Regis, Jenny Lee, Alexei Nordell-Markovits

    Abstract: Historically, open source commitments have been deemed irrevocable once materials are released under open source licenses. In this paper, the authors argue for the creation of a subset of rights that allows open source contributors to force users to (i) update to the most recent version of a model, (ii) accept new use case restrictions, or even (iii) cease using the software entirely. While this w…

    Submitted 28 May, 2024; originally announced July 2024.

  6. arXiv:2407.12863  [pdf, other]

    cs.CL cs.AI

    Token-Supervised Value Models for Enhancing Mathematical Reasoning Capabilities of Large Language Models

    Authors: Jung Hyun Lee, June Yong Yang, Byeongho Heo, Dongyoon Han, Kang Min Yoo

    Abstract: Large Language Models (LLMs) have demonstrated impressive problem-solving capabilities in mathematics through step-by-step reasoning chains. However, they are susceptible to reasoning errors that impact the quality of subsequent reasoning chains and the final answer due to language models' autoregressive token-by-token generating nature. Recent works have proposed adopting external verifiers to gu…

    Submitted 12 July, 2024; originally announced July 2024.

  7. arXiv:2407.12637  [pdf, other]

    cs.CV

    Toward INT4 Fixed-Point Training via Exploring Quantization Error for Gradients

    Authors: Dohyung Kim, Junghyup Lee, Jeimin Jeon, Jaehyeon Moon, Bumsub Ham

    Abstract: Network quantization generally converts full-precision weights and/or activations into low-bit fixed-point values in order to accelerate an inference process. Recent approaches to network quantization further discretize the gradients into low-bit fixed-point values, enabling an efficient training. They typically set a quantization interval using a min-max range of the gradients or adjust the inter…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  8. arXiv:2407.12529  [pdf, other]

    cs.CL

    Crafting the Path: Robust Query Rewriting for Information Retrieval

    Authors: Ingeol Baek, Jimin Lee, Joonho Yang, Hwanhee Lee

    Abstract: Query rewriting aims to generate a new query that can complement the original query to improve the information retrieval system. Recent studies on query rewriting, such as query2doc (Q2D), query2expand (Q2E) and query2cot (Q2C), rely on the internal knowledge of Large Language Models (LLMs) to generate a relevant passage to add information to the query. Nevertheless, the efficacy of these methodo…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 1 figure, 12 tables

  9. arXiv:2407.12374  [pdf, other]

    cs.IR cs.AI

    Graph Signal Processing for Cross-Domain Recommendation

    Authors: Jeongeun Lee, Seongku Kang, Won-Yong Shin, Jeongwhan Choi, Noseong Park, Dongha Lee

    Abstract: Cross-domain recommendation (CDR) extends conventional recommender systems by leveraging user-item interactions from dense domains to mitigate data sparsity and the cold start problem. While CDR offers substantial potential for enhancing recommendation performance, most existing CDR methods suffer from sensitivity to the ratio of overlapping users and intrinsic discrepancy between source and targe…

    Submitted 17 July, 2024; originally announced July 2024.

  10. arXiv:2407.11714  [pdf, other]

    cs.CV

    Improving Unsupervised Video Object Segmentation via Fake Flow Generation

    Authors: Suhwan Cho, Minhyeok Lee, Jungho Lee, Donghyeong Kim, Seunghoon Lee, Sungmin Woo, Sangyoun Lee

    Abstract: Unsupervised video object segmentation (VOS), also known as video salient object detection, aims to detect the most prominent object in a video at the pixel level. Recently, two-stream approaches that leverage both RGB images and optical flow maps have gained significant attention. However, the limited amount of training data remains a substantial challenge. In this study, we propose a novel data…

    Submitted 16 July, 2024; originally announced July 2024.

  11. arXiv:2407.11534  [pdf, other]

    cs.LG cs.AI

    LRQ: Optimizing Post-Training Quantization for Large Language Models by Learning Low-Rank Weight-Scaling Matrices

    Authors: Jung Hyun Lee, Jeonghoon Kim, June Yong Yang, Se Jung Kwon, Eunho Yang, Kang Min Yoo, Dongsoo Lee

    Abstract: With the commercialization of large language models (LLMs), weight-activation quantization has emerged to compress and accelerate LLMs, achieving high throughput while reducing inference costs. However, existing post-training quantization (PTQ) techniques for quantizing weights and activations of LLMs still suffer from non-negligible accuracy drops, especially on massive multitask language underst…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Preprint

  12. arXiv:2407.11451  [pdf, other]

    cs.LG cs.CV

    Isometric Representation Learning for Disentangled Latent Space of Diffusion Models

    Authors: Jaehoon Hahm, Junho Lee, Sunghyun Kim, Joonseok Lee

    Abstract: The latent space of diffusion model mostly still remains unexplored, despite its great success and potential in the field of generative modeling. In fact, the latent space of existing diffusion models are entangled, with a distorted mapping from its latent space to image space. To tackle this problem, we present Isometric Diffusion, equipping a diffusion model with a geometric regularizer to guide…

    Submitted 16 July, 2024; originally announced July 2024.

    Journal ref: Forty-first International Conference on Machine Learning (ICML 2024)

  13. arXiv:2407.11330  [pdf]

    cs.NE cs.LG cs.MA nlin.AO

    Navigating the swarm: Deep neural networks command emergent behaviours

    Authors: Dongjo Kim, Jeongsu Lee, Ho-Young Kim

    Abstract: Interacting individuals in complex systems often give rise to coherent motion exhibiting coordinated global structures. Such phenomena are ubiquitously observed in nature, from cell migration, bacterial swarms, animal and insect groups, and even human societies. Primary mechanisms responsible for the emergence of collective behavior have been extensively identified, including local alignments base…

    Submitted 15 July, 2024; originally announced July 2024.

  14. arXiv:2407.11214  [pdf, ps, other]

    cs.AI cs.CL cs.LG cs.LO cs.PL

    PutnamBench: Evaluating Neural Theorem-Provers on the Putnam Mathematical Competition

    Authors: George Tsoukalas, Jasper Lee, John Jennings, Jimmy Xin, Michelle Ding, Michael Jennings, Amitayush Thakur, Swarat Chaudhuri

    Abstract: We present PutnamBench, a new multilingual benchmark for evaluating the ability of neural theorem-provers to solve competition mathematics problems. PutnamBench consists of 1697 hand-constructed formalizations of 640 theorems sourced from the William Lowell Putnam Mathematical Competition, the premier undergraduate-level mathematics competition in North America. All the theorems have formalization…

    Submitted 15 July, 2024; originally announced July 2024.

  15. arXiv:2407.11203  [pdf]

    cs.CY cs.CL

    The Life Cycle of Large Language Models: A Review of Biases in Education

    Authors: Jinsook Lee, Yann Hicke, Renzhe Yu, Christopher Brooks, René F. Kizilcec

    Abstract: Large Language Models (LLMs) are increasingly adopted in educational contexts to provide personalized support to students and teachers. The unprecedented capacity of LLM-based applications to understand and generate natural language can potentially improve instructional effectiveness and learning outcomes, but the integration of LLMs in education technology has renewed concerns over algorithmic bi…

    Submitted 3 June, 2024; originally announced July 2024.

    Comments: 20 pages, 2 figures, preprint for British Journal of Educational Technology submission

  16. arXiv:2407.11199  [pdf, other]

    cs.CY

    Algorithms for College Admissions Decision Support: Impacts of Policy Change and Inherent Variability

    Authors: Jinsook Lee, Emma Harvey, Joyce Zhou, Nikhil Garg, Thorsten Joachims, Rene F. Kizilcec

    Abstract: Each year, selective American colleges sort through tens of thousands of applications to identify a first-year class that displays both academic merit and diversity. In the 2023-2024 admissions cycle, these colleges faced unprecedented challenges. First, the number of applications has been steadily growing. Second, test-optional policies that have remained in place since the COVID-19 pandemic limi…

    Submitted 24 June, 2024; originally announced July 2024.

    Comments: 25 pages, 8 figures

  17. arXiv:2407.10972  [pdf, other]

    cs.CV cs.AI cs.LG

    VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation

    Authors: Bocheng Zou, Mu Cai, Jianrui Zhang, Yong Jae Lee

    Abstract: In the realm of vision models, the primary mode of representation is using pixels to rasterize the visual world. Yet this is not always the best or unique way to represent visual content, especially for designers and artists who depict the world using geometry primitives such as polygons. Vector graphics (VG), on the other hand, offer a textual representation of visual content, which can be more c…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Project Page: https://vgbench.github.io

  18. arXiv:2407.10542  [pdf, other]

    cs.CV cs.AI

    3D Geometric Shape Assembly via Efficient Point Cloud Matching

    Authors: Nahyuk Lee, Juhong Min, Junha Lee, Seungwook Kim, Kanghee Lee, Jaesik Park, Minsu Cho

    Abstract: Learning to assemble geometric shapes into a larger target structure is a pivotal task in various practical applications. In this work, we tackle this problem by establishing local correspondences between point clouds of part shapes in both coarse- and fine-levels. To this end, we introduce Proxy Match Transform (PMT), an approximate high-order feature transform layer that enables reliable matchin…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted to ICML 2024

  19. arXiv:2407.10454  [pdf, other]

    cs.LG math.OC

    Deflated Dynamics Value Iteration

    Authors: Jongmin Lee, Amin Rakhsha, Ernest K. Ryu, Amir-massoud Farahmand

    Abstract: The Value Iteration (VI) algorithm is an iterative procedure to compute the value function of a Markov decision process, and is the basis of many reinforcement learning (RL) algorithms as well. As the error convergence rate of VI as a function of iteration $k$ is $O(γ^k)$, it is slow when the discount factor $γ$ is close to $1$. To accelerate the computation of the value function, we propose Defla…

    Submitted 15 July, 2024; originally announced July 2024.

  20. arXiv:2407.10330  [pdf, other]

    cs.CV

    Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors

    Authors: Jae Joong Lee, Bosheng Li, Sara Beery, Jonathan Huang, Songlin Fei, Raymond A. Yeh, Bedrich Benes

    Abstract: We introduce Tree D-fusion, featuring the first collection of 600,000 environmentally aware, 3D simulation-ready tree models generated through Diffusion priors. Each reconstructed 3D tree model corresponds to an image from Google's Auto Arborist Dataset, comprising street view images and associated genus labels of trees across North America. Our method distills the scores of two tree-adapted diffu…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV24

  21. arXiv:2407.10277  [pdf, other]

    cs.CV cs.AI cs.LG

    Disrupting Diffusion-based Inpainters with Semantic Digression

    Authors: Geonho Son, Juhun Lee, Simon S. Woo

    Abstract: The fabrication of visual misinformation on the web and social media has increased exponentially with the advent of foundational text-to-image diffusion models. Namely, Stable Diffusion inpainters allow the synthesis of maliciously inpainted images of personal and private figures, and copyrighted contents, also known as deepfakes. To combat such generations, a disruption framework, namely Photogua…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 16 pages, 13 figures, IJCAI 2024

  22. arXiv:2407.10206  [pdf]

    cs.CE cs.AI cs.NE cs.SI

    Dominant Design Prediction with Phylogenetic Networks

    Authors: Youwei He, Jeong-Dong Lee, Dawoon Jeong, Sungjun Choi, Jiyong Kim

    Abstract: This study proposes an effective method to predict technology development from an evolutionary perspective. Product evolution is the result of technological evolution and market selection. A phylogenetic network is the main method to study product evolution. The formation of the dominant design determines the trajectory of technology development. How to predict future dominant design has become a…

    Submitted 14 July, 2024; originally announced July 2024.

  23. arXiv:2407.09541  [pdf, other]

    cs.CL cs.AI cs.CV

    MATE: Meet At The Embedding -- Connecting Images with Long Texts

    Authors: Young Kyun Jang, Junmo Kang, Yong Jae Lee, Donghyun Kim

    Abstract: While advancements in Vision Language Models (VLMs) have significantly improved the alignment of visual and textual data, these models primarily focus on aligning images with short descriptive captions. This focus limits their ability to handle complex text interactions, particularly with longer texts such as lengthy captions or documents, which have not been extensively explored yet. In this pape…

    Submitted 26 June, 2024; originally announced July 2024.

  24. arXiv:2407.09484  [pdf]

    cs.HC cs.CY

    GPTutor: Great Personalized Tutor with Large Language Models for Personalized Learning Content Generation

    Authors: Eason Chen, Jia-En Lee, Jionghao Lin, Kenneth Koedinger

    Abstract: We developed GPTutor, a pioneering web application designed to revolutionize personalized learning by leveraging the capabilities of Generative AI at scale. GPTutor adapts educational content and practice exercises to align with individual students' interests and career goals, enhancing their engagement and understanding of critical academic concepts. The system uses a serverless architecture to d…

    Submitted 16 May, 2024; originally announced July 2024.

  25. arXiv:2407.09012  [pdf, other]

    cs.CV cs.AI

    TCAN: Animating Human Images with Temporally Consistent Pose Guidance using Diffusion Models

    Authors: Jeongho Kim, Min-Jung Kim, Junsoo Lee, Jaegul Choo

    Abstract: Pose-driven human-image animation diffusion models have shown remarkable capabilities in realistic human video synthesis. Despite the promising results achieved by previous approaches, challenges persist in achieving temporally consistent animation and ensuring robustness with off-the-shelf pose detectors. In this paper, we present TCAN, a pose-driven human image animation method that is robust to…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: The first two authors contributed equally

  26. arXiv:2407.08503  [pdf, other]

    eess.IV cs.CV

    DIOR-ViT: Differential Ordinal Learning Vision Transformer for Cancer Classification in Pathology Images

    Authors: Ju Cheon Lee, Keunho Byeon, Boram Song, Kyungeun Kim, Jin Tae Kwak

    Abstract: In computational pathology, cancer grading has been mainly studied as a categorical classification problem, which does not utilize the ordering nature of cancer grades such as the higher the grade is, the worse the cancer is. To incorporate the ordering relationship among cancer grades, we introduce a differential ordinal learning problem in which we define and learn the degree of difference in th…

    Submitted 10 July, 2024; originally announced July 2024.

  27. arXiv:2407.07329  [pdf, other]

    cs.CL

    Probability of Differentiation Reveals Brittleness of Homogeneity Bias in Large Language Models

    Authors: Messi H. J. Lee, Calvin K. Lai

    Abstract: Homogeneity bias in Large Language Models (LLMs) refers to their tendency to homogenize the representations of some groups compared to others. Previous studies documenting this bias have predominantly used encoder models, which may have inadvertently introduced biases. To address this limitation, we prompted GPT-4 to generate single word/expression completions associated with 18 situation cues - s…

    Submitted 9 July, 2024; originally announced July 2024.

  28. arXiv:2407.07024  [pdf, other]

    cs.CV cs.AI

    Exploring Scalability of Self-Training for Open-Vocabulary Temporal Action Localization

    Authors: Jeongseok Hyun, Su Ho Han, Hyolim Kang, Joon-Young Lee, Seon Joo Kim

    Abstract: The vocabulary size in temporal action localization (TAL) is constrained by the scarcity of large-scale annotated datasets. To address this, recent works incorporate powerful pre-trained vision-language models (VLMs), such as CLIP, to perform open-vocabulary TAL (OV-TAL). However, unlike VLMs trained on extensive image/video-text pairs, existing OV-TAL methods still rely on small, fully labeled TA…

    Submitted 9 July, 2024; originally announced July 2024.

  29. arXiv:2407.06869  [pdf, ps, other]

    math.CO cs.DM

    Forcing quasirandomness with 4-point permutations

    Authors: Daniel Kráľ, Jae-baek Lee, Jonathan A. Noel

    Abstract: A combinatorial object is said to be quasirandom if it exhibits certain properties that are typically seen in a truly random object of the same kind. It is known that a permutation is quasirandom if and only if the pattern density of each of the twenty-four 4-point permutations is close to 1/24, which is its expected value in a random permutation. In other words, the set of all twenty-four 4-point…

    Submitted 9 July, 2024; originally announced July 2024.

  30. arXiv:2407.06613  [pdf, other]

    cs.CV

    Sparse-DeRF: Deblurred Neural Radiance Fields from Sparse View

    Authors: Dogyoon Lee, Donghyeong Kim, Jungho Lee, Minhyeok Lee, Seunghoon Lee, Sangyoun Lee

    Abstract: Recent studies construct deblurred neural radiance fields (DeRF) using dozens of blurry images, which are not practical scenarios if only a limited number of blurry images are available. This paper focuses on constructing DeRF from sparse-view for more pragmatic real-world scenarios. As observed in our experiments, establishing DeRF from sparse views proves to be a more challenging problem due to…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: Project page: https://dogyoonlee.github.io/sparsederf/

  31. arXiv:2407.06460  [pdf, other]

    cs.CL cs.AI

    MUSE: Machine Unlearning Six-Way Evaluation for Language Models

    Authors: Weijia Shi, Jaechan Lee, Yangsibo Huang, Sadhika Malladi, Jieyu Zhao, Ari Holtzman, Daogao Liu, Luke Zettlemoyer, Noah A. Smith, Chiyuan Zhang

    Abstract: Language models (LMs) are trained on vast amounts of text data, which may include private and copyrighted content. Data owners may request the removal of their data from a trained model due to privacy or copyright concerns. However, exactly unlearning only these datapoints (i.e., retraining with the data removed) is intractable in modern-day models. This has led to the development of many approxim…

    Submitted 14 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

  32. arXiv:2407.06194  [pdf, other]

    cs.CV cs.AI cs.CL

    More Distinctively Black and Feminine Faces Lead to Increased Stereotyping in Vision-Language Models

    Authors: Messi H. J. Lee, Jacob M. Montgomery, Calvin K. Lai

    Abstract: Vision Language Models (VLMs), exemplified by GPT-4V, adeptly integrate text and vision modalities. This integration enhances Large Language Models' ability to mimic human perception, allowing them to process image inputs. Despite VLMs' advanced capabilities, however, there is a concern that VLMs inherit biases of both modalities in ways that make biases more pervasive and difficult to mitigate. O…

    Submitted 21 May, 2024; originally announced July 2024.

  33. arXiv:2407.05872  [pdf, other]

    cs.LG

    Scaling Exponents Across Parameterizations and Optimizers

    Authors: Katie Everett, Lechao Xiao, Mitchell Wortsman, Alexander A. Alemi, Roman Novak, Peter J. Liu, Izzeddin Gur, Jascha Sohl-Dickstein, Leslie Pack Kaelbling, Jaehoon Lee, Jeffrey Pennington

    Abstract: Robust and effective scaling of models from small to large width typically requires the precise adjustment of many algorithmic and architectural details, such as parameterization and optimizer choices. In this work, we propose a new perspective on parameterization by investigating a key assumption in prior work about the alignment between parameters and data and derive new theoretical results unde…

    Submitted 16 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 63 pages, International Conference on Machine Learning 2024

  34. arXiv:2407.05551  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    Read, Watch and Scream! Sound Generation from Text and Video

    Authors: Yujin Jeong, Yunji Kim, Sanghyuk Chun, Jiyoung Lee

    Abstract: Multimodal generative models have shown impressive advances with the help of powerful diffusion models. Despite the progress, generating sound solely from text poses challenges in ensuring comprehensive scene depiction and temporal alignment. Meanwhile, video-to-sound generation limits the flexibility to prioritize sound synthesis for specific objects within the scene. To tackle these challenges,…

    Submitted 7 July, 2024; originally announced July 2024.

    Comments: Project page: https://naver-ai.github.io/rewas

  35. arXiv:2407.05516  [pdf, other]

    eess.AS cs.AI cs.SD eess.SP

    Differentiable Modal Synthesis for Physical Modeling of Planar String Sound and Motion Simulation

    Authors: Jin Woo Lee, Jaehyun Park, Min Jun Choi, Kyogu Lee

    Abstract: While significant advancements have been made in music generation and differentiable sound synthesis within machine learning and computer audition, the simulation of instrument vibration guided by physical laws has been underexplored. To address this gap, we introduce a novel model for simulating the spatio-temporal motion of nonlinear strings, integrating modal synthesis and spectral modeling wit…

    Submitted 7 July, 2024; originally announced July 2024.

  36. arXiv:2407.04844  [pdf, other]

    cs.CV cs.AI

    Neural varifolds: an aggregate representation for quantifying the geometry of point clouds

    Authors: Juheon Lee, Xiaohao Cai, Carola-Bibian Schönlieb, Simon Masnou

    Abstract: Point clouds are popular 3D representations for real-life objects (such as in LiDAR and Kinect) due to their detailed and compact representation of surface-based geometry. Recent approaches characterise the geometry of point clouds by bringing deep learning based techniques together with geometric fidelity metrics such as optimal transportation costs (e.g., Chamfer and Wasserstein metrics). In thi…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: The first author, Juheon Lee, is an unaffiliated, independent researcher. This work is a personal endeavor, unrelated to his current job

  37. arXiv:2407.04345  [pdf, other]

    cs.CV

    CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images

    Authors: Jisu Shin, Junmyeong Lee, Seongmin Lee, Min-Gyu Park, Ju-Mi Kang, Ju Hong Yoon, Hae-Gon Jeon

    Abstract: We present a novel framework for reconstructing animatable human avatars from multiple images, termed CanonicalFusion. Our central concept involves integrating individual reconstruction results into the canonical space. To be specific, we first predict Linear Blend Skinning (LBS) weight maps and depth maps using a shared-encoder-dual-decoder network, enabling direct canonicalization of the 3D mesh…

    Submitted 15 July, 2024; v1 submitted 5 July, 2024; originally announced July 2024.

    Comments: ECCV 2024 Accepted (18 pages, 9 figures)

  38. arXiv:2407.04271  [pdf, other]

    cs.CV cs.AI cs.LG

    Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts

    Authors: Hyunsu Kim, Yegon Kim, Hongseok Yang, Juho Lee

    Abstract: Group Equivariant CNNs (G-CNNs) have shown promising efficacy in various tasks, owing to their ability to capture hierarchical features in an equivariant manner. However, their equivariance is fixed to the symmetry of the whole group, limiting adaptability to diverse partial symmetries in real-world datasets, such as limited rotation symmetry of handwritten digit images and limited color-shift sym…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: ICML2024

  39. arXiv:2407.04061  [pdf, other]

    cs.CV

    Detect Closer Surfaces that can be Seen: New Modeling and Evaluation in Cross-domain 3D Object Detection

    Authors: Ruixiao Zhang, Yihong Wu, Juheon Lee, Adam Prugel-Bennett, Xiaohao Cai

    Abstract: The performance of domain adaptation technologies has not yet reached an ideal level in the current 3D object detection field for autonomous driving, which is mainly due to significant differences in the size of vehicles, as well as the environments they operate in when applied across domains. These factors together hinder the effective transfer and application of knowledge learned from specific d…

    Submitted 12 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: Accepted by the 27th European Conference on Artificial Intelligence (ECAI 2024)

  40. arXiv:2407.03923  [pdf, other]

    cs.CV cs.AI

    CRiM-GS: Continuous Rigid Motion-Aware Gaussian Splatting from Motion Blur Images

    Authors: Junghe Lee, Donghyeong Kim, Dogyoon Lee, Suhwan Cho, Sangyoun Lee

    Abstract: Neural radiance fields (NeRFs) have received significant attention due to their high-quality novel view rendering ability, prompting research to address various real-world cases. One critical challenge is the camera motion blur caused by camera movement during exposure time, which prevents accurate 3D scene reconstruction. In this study, we propose continuous rigid motion-aware gaussian splatting…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Project Page: https://jho-yonsei.github.io/CRiM-Gaussian/

  41. arXiv:2407.03593  [pdf, other]

    math.NA cs.LG

    Green Multigrid Network

    Authors: Ye Lin, Young Ju Lee, Jiwei Jia

    Abstract: GreenLearning networks (GL) directly learn Green's function in physical space, making them an interpretable model for capturing unknown solution operators of partial differential equations (PDEs). For many PDEs, the corresponding Green's function exhibits asymptotic smoothness. In this paper, we propose a framework named Green Multigrid networks (GreenMGNet), an operator learning algorithm designe…

    Submitted 3 July, 2024; originally announced July 2024.

  42. arXiv:2407.03051  [pdf, other]

    cs.CL

    Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment

    Authors: Janghwan Lee, Seongmin Park, Sukjin Hong, Minsoo Kim, Du-Seong Chang, Jungwook Choi

    Abstract: The rapid advancement of large language models (LLMs) has facilitated their transformation into conversational chatbots that can grasp contextual nuances and generate pertinent sentences, closely mirroring human values through advanced techniques such as instruction tuning and reinforcement learning from human feedback (RLHF). However, the computational efficiency required for LLMs, achieved throu…

    Submitted 18 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

    Comments: ACL 2024 Main

  43. arXiv:2407.02362  [pdf, other]

    cs.AR cs.AI cs.LG

    Fast, Scalable, Energy-Efficient Non-element-wise Matrix Multiplication on FPGA

    Authors: Xuqi Zhu, Huaizhi Zhang, JunKyu Lee, Jiacheng Zhu, Chandrajit Pal, Sangeet Saha, Klaus D. McDonald-Maier, Xiaojun Zhai

    Abstract: Modern Neural Network (NN) architectures heavily rely on vast numbers of multiply-accumulate arithmetic operations, constituting the predominant computational cost. Therefore, this paper proposes a high-throughput, scalable and energy efficient non-element-wise matrix multiplication unit on FPGAs as a basic component of the NNs. We firstly streamline inter-layer and intra-layer redundancies of MAD…

    Submitted 7 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  44. arXiv:2407.02245  [pdf, other]

    cs.RO cs.AI

    Safe CoR: A Dual-Expert Approach to Integrating Imitation Learning and Safe Reinforcement Learning Using Constraint Rewards

    Authors: Hyeokjin Kwon, Gunmin Lee, Junseo Lee, Songhwai Oh

    Abstract: In the realm of autonomous agents, ensuring safety and reliability in complex and dynamic environments remains a paramount challenge. Safe reinforcement learning addresses these concerns by introducing safety constraints, but still faces challenges in navigating intricate environments such as complex driving situations. To overcome these challenges, we present the safe constraint reward (Safe CoR)…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to the Proc. of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2024

  45. arXiv:2407.02232  [pdf, other]

    cs.RO

    Efficient Extrinsic Self-Calibration of Multiple IMUs using Measurement Subset Selection

    Authors: Jongwon Lee, David Hanley, Timothy Bretl

    Abstract: This paper addresses the problem of choosing a sparse subset of measurements for quick calibration parameter estimation. A standard solution to this is selecting a measurement only if its utility -- the difference between posterior (with the measurement) and prior information (without the measurement) -- exceeds some threshold. Theoretically, utility, a function of the parameter estimate, should b…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at the 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS2024)

  46. arXiv:2407.01626  [pdf, other]

    cs.IR cs.AI cs.CL

    SPARKLE: Enhancing SPARQL Generation with Direct KG Integration in Decoding

    Authors: Jaebok Lee, Hyeonjeong Shin

    Abstract: Existing KBQA methods have traditionally relied on multi-stage methodologies, involving tasks such as entity linking, subgraph retrieval and query structure generation. However, multi-stage approaches are dependent on the accuracy of preceding steps, leading to cascading errors and increased inference time. Although a few studies have explored the use of end-to-end models, they often suffer from l…

    Submitted 29 June, 2024; originally announced July 2024.

  47. arXiv:2407.01624  [pdf, other]

    cs.LG cs.AI

    Guided Trajectory Generation with Diffusion Models for Offline Model-based Optimization

    Authors: Taeyoung Yun, Sujin Yun, Jaewoo Lee, Jinkyoo Park

    Abstract: Optimizing complex and high-dimensional black-box functions is ubiquitous in science and engineering fields. Unfortunately, the online evaluation of these functions is restricted due to time and safety constraints in most cases. In offline model-based optimization (MBO), we aim to find a design that maximizes the target function using only a pre-existing offline dataset. While prior methods consid…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 29 pages, 11 figures, 17 tables

  48. arXiv:2407.01222  [pdf, other]

    cs.RO

    Deep Learning Models for Flapping Fin Unmanned Underwater Vehicle Control System Gait Optimization

    Authors: Brian Zhou, Kamal Viswanath, Jason Geder, Alisha Sharma, Julian Lee

    Abstract: The last few decades have led to the rise of research focused on propulsion and control systems for bio-inspired unmanned underwater vehicles (UUVs), which provide more maneuverable alternatives to traditional UUVs in underwater missions. Recent work has explored the use of time-series neural network surrogate models to predict thrust and power from vehicle design and fin kinematics. We develop a…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: 28 pages, 20 figures. arXiv admin note: text overlap with arXiv:2310.14135

  49. arXiv:2407.00740  [pdf, other]

    cs.CL cs.LG

    Locate&Edit: Energy-based Text Editing for Efficient, Flexible, and Faithful Controlled Text Generation

    Authors: Hye Ryung Son, Jay-Yoon Lee

    Abstract: Recent approaches to controlled text generation (CTG) often involve manipulating the weights or logits of base language models (LMs) at decoding time. However, these methods are inapplicable to the latest black-box LMs and ineffective at preserving the core semantics of the base LM's original generations. In this work, we propose Locate&Edit (L&E), an efficient and flexible energy-based approach to CTG…

    Submitted 30 June, 2024; originally announced July 2024.

    Comments: 18 pages, 2 figures

  50. arXiv:2407.00369  [pdf, other]

    cs.CL

    How to Train Your Fact Verifier: Knowledge Transfer with Multimodal Open Models

    Authors: Jaeyoung Lee, Ximing Lu, Jack Hessel, Faeze Brahman, Youngjae Yu, Yonatan Bisk, Yejin Choi, Saadia Gabriel

    Abstract: Given the growing influx of misinformation across news and social media, there is a critical need for systems that can provide effective real-time verification of news claims. Large language or multimodal model based verification has been proposed to scale up online policing mechanisms for mitigating the spread of false and harmful content. While these can potentially reduce the burden on human fact-check…

    Submitted 29 June, 2024; originally announced July 2024.