Showing 1–50 of 143 results for author: Tong, X

  1. arXiv:2407.07094  [pdf, other]

    cs.CL cs.AI

    AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning

    Authors: Jiaxi Cui, Wentao Zhang, Jing Tang, Xudong Tong, Zhenwei Zhang, Amie, Jing Wen, Rongsheng Wang, Pengfei Wu

    Abstract: The pervasive deployment of Large Language Models (LLMs) in various sectors often neglects the nuanced requirements of individuals and small organizations, who benefit more from models precisely tailored to their specific business contexts rather than those with broadly superior general capabilities. This work introduces AnyTaskTune, a novel fine-tuning methodology coined as Task-Fi…

    Submitted 9 July, 2024; originally announced July 2024.

  2. arXiv:2407.03952  [pdf, other]

    cs.CL

    A framework for annotating and modelling intentions behind metaphor use

    Authors: Gianluca Michelli, Xiaoyu Tong, Ekaterina Shutova

    Abstract: Metaphors are part of everyday language and shape the way in which we conceptualize the world. Moreover, they play a multifaceted role in communication, making their understanding and generation a challenging task for language models (LMs). While there has been extensive work in the literature linking metaphor to the fulfilment of individual intentions, no comprehensive taxonomy of such intentions…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2407.02657  [pdf, other]

    cs.LG stat.ME

    Large Scale Hierarchical Industrial Demand Time-Series Forecasting incorporating Sparsity

    Authors: Harshavardhan Kamarthi, Aditya B. Sasanur, Xinjie Tong, Xingyu Zhou, James Peters, Joe Czyzyk, B. Aditya Prakash

    Abstract: Hierarchical time-series forecasting (HTSF) is an important problem for many real-world business applications where the goal is to simultaneously forecast multiple time-series that are related to each other via a hierarchical relation. Recent works, however, do not address two important challenges that are typically observed in many demand forecasting applications at large companies. First, many t…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted at KDD 2024

  4. arXiv:2406.16473  [pdf, other]

    cs.CV cs.AI

    Seeking Certainty In Uncertainty: Dual-Stage Unified Framework Solving Uncertainty in Dynamic Facial Expression Recognition

    Authors: Haoran Wang, Xinji Mai, Zeng Tao, Xuan Tong, Junxiong Lin, Yan Wang, Jiawen Yu, Boyang Wang, Shaoqi Yan, Qing Zhao, Ziheng Zhou, Shuyong Gao, Wenqiang Zhang

    Abstract: The contemporary state-of-the-art of Dynamic Facial Expression Recognition (DFER) technology facilitates remarkable progress by deriving emotional mappings of facial expressions from video content, underpinned by training on voluminous datasets. Yet, the DFER datasets encompass a substantial volume of noise data. Noise arises from low-quality captures that defy logical labeling, and instances that…

    Submitted 24 June, 2024; originally announced June 2024.

  5. arXiv:2406.16459  [pdf, other]

    cs.CV

    Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution

    Authors: Junxiong Lin, Zeng Tao, Xuan Tong, Xinji Mai, Haoran Wang, Boyang Wang, Yan Wang, Qing Zhao, Jiawen Yu, Yuxuan Lin, Shaoqi Yan, Shuyong Gao, Wenqiang Zhang

    Abstract: The problem of blind image super-resolution aims to recover high-resolution (HR) images from low-resolution (LR) images with unknown degradation modes. Most existing methods model the image degradation process using blur kernels. However, this explicit modeling approach struggles to cover the complex and varied degradation processes encountered in the real world, such as high-order combinations of…

    Submitted 24 June, 2024; originally announced June 2024.

  6. arXiv:2406.00914  [pdf, other]

    math.OC cs.AI

    Wasserstein gradient flow for optimal probability measure decomposition

    Authors: Jiangze Han, Christopher Thomas Ryan, Xin T. Tong

    Abstract: We examine the infinite-dimensional optimization problem of finding a decomposition of a probability measure into K probability sub-measures to minimize specific loss functions inspired by applications in clustering and user grouping. We analytically explore the structures of the support of optimal sub-measures and introduce algorithms based on Wasserstein gradient flow, demonstrating their conver…

    Submitted 2 June, 2024; originally announced June 2024.

  7. arXiv:2406.00891  [pdf, other]

    cs.CV

    Global High Categorical Resolution Land Cover Mapping via Weak Supervision

    Authors: Xin-Yi Tong, Runmin Dong, Xiao Xiang Zhu

    Abstract: Land cover information is indispensable for advancing the United Nations' sustainable development goals, and land cover mapping under a more detailed category system would significantly contribute to economic livelihood tracking and environmental degradation measurement. However, the substantial difficulty in acquiring fine-grained training data makes the implementation of this task particularly c…

    Submitted 2 June, 2024; originally announced June 2024.

  8. arXiv:2405.18769  [pdf, other]

    cs.CV

    OUS: Scene-Guided Dynamic Facial Expression Recognition

    Authors: Xinji Mai, Haoran Wang, Zeng Tao, Junxiong Lin, Shaoqi Yan, Yan Wang, Jing Liu, Jiawen Yu, Xuan Tong, Yating Li, Wenqiang Zhang

    Abstract: Dynamic Facial Expression Recognition (DFER) is crucial for affective computing but often overlooks the impact of scene context. We have identified a significant issue in current DFER tasks: human annotators typically integrate emotions from various angles, including environmental cues and body language, whereas existing DFER methods tend to consider the scene as noise that needs to be filtered ou…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures, 6 tables

    ACM Class: I.4; I.5.1

  9. arXiv:2405.17272  [pdf, other]

    cs.LG cs.AI

    DPN: Decoupling Partition and Navigation for Neural Solvers of Min-max Vehicle Routing Problems

    Authors: Zhi Zheng, Shunyu Yao, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Ke Tang

    Abstract: The min-max vehicle routing problem (min-max VRP) traverses all given customers by assigning several routes and aims to minimize the length of the longest route. Recently, reinforcement learning (RL)-based sequential planning methods have exhibited advantages in solving efficiency and optimality. However, these methods fail to exploit the problem-specific properties in learning representations, re…

    Submitted 6 June, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  10. arXiv:2405.15646  [pdf, other]

    cs.RO

    LLM-based Robot Task Planning with Exceptional Handling for General Purpose Service Robots

    Authors: Ruoyu Wang, Zhipeng Yang, Zinan Zhao, Xinyan Tong, Zhi Hong, Kun Qian

    Abstract: The development of a general purpose service robot for daily life necessitates the robot's ability to deploy a myriad of fundamental behaviors judiciously. Recent advancements in training Large Language Models (LLMs) can be used to generate action sequences directly, given an instruction in natural language with no additional domain information. However, while the outputs of LLMs are semantically…

    Submitted 24 May, 2024; originally announced May 2024.

  11. arXiv:2405.13729  [pdf, other]

    cs.LG cs.AI cs.CV cs.GR

    ComboStoc: Combinatorial Stochasticity for Diffusion Generative Models

    Authors: Rui Xu, Jiepeng Wang, Hao Pan, Yang Liu, Xin Tong, Shiqing Xin, Changhe Tu, Taku Komura, Wenping Wang

    Abstract: In this paper, we study an under-explored but important factor of diffusion generative models, i.e., the combinatorial complexity. Data samples are generally high-dimensional, and for various structured generation tasks, there are additional attributes which are combined to associate with data samples. We show that the space spanned by the combination of dimensions and attributes is insufficiently…

    Submitted 24 May, 2024; v1 submitted 22 May, 2024; originally announced May 2024.

  12. arXiv:2405.12262  [pdf, other]

    cs.LG cs.AI

    Prompt Learning for Generalized Vehicle Routing

    Authors: Fei Liu, Xi Lin, Weiduo Liao, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan

    Abstract: Neural combinatorial optimization (NCO) is a promising learning-based approach to solving various vehicle routing problems without much manual algorithm design. However, the current NCO methods mainly focus on the in-distribution performance, while the real-world problem instances usually come from different distributions. A costly fine-tuning approach or generalized model retraining from scratch…

    Submitted 20 May, 2024; originally announced May 2024.

  13. arXiv:2405.01906  [pdf, other]

    cs.AI cs.LG

    Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization

    Authors: Changliang Zhou, Xi Lin, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

    Abstract: The neural combinatorial optimization (NCO) approach has shown great potential for solving routing problems without the requirement of expert knowledge. However, existing constructive NCO methods cannot directly solve large-scale instances, which significantly limits their application prospects. To address these crucial shortcomings, this work proposes a novel Instance-Conditioned Adaptation Model…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: 17 pages, 6 figures

  14. arXiv:2405.00514  [pdf, other]

    cs.CV

    Get Your Embedding Space in Order: Domain-Adaptive Regression for Forest Monitoring

    Authors: Sizhuo Li, Dimitri Gominski, Martin Brandt, Xiaoye Tong, Philippe Ciais

    Abstract: Image-level regression is an important task in Earth observation, where visual domain and label shifts are a core challenge hampering generalization. However, cross-domain regression with remote sensing data remains understudied due to the absence of suited datasets. We introduce a new dataset with aerial and satellite imagery in five countries with three forest-related regression tasks. To match…

    Submitted 1 May, 2024; originally announced May 2024.

  15. arXiv:2404.10667  [pdf, other]

    cs.CV

    VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

    Authors: Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, Baining Guo

    Abstract: We introduce VASA, a framework for generating lifelike talking faces with appealing visual affective skills (VAS) given a single static image and a speech audio clip. Our premiere model, VASA-1, is capable of not only producing lip movements that are exquisitely synchronized with the audio, but also capturing a large spectrum of facial nuances and natural head motions that contribute to the percep…

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: Tech Report. Project webpage: https://www.microsoft.com/en-us/research/project/vasa-1/

  16. arXiv:2403.19561  [pdf, other]

    cs.LG cs.AI

    Self-Improved Learning for Scalable Neural Combinatorial Optimization

    Authors: Fu Luo, Xi Lin, Zhenkun Wang, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

    Abstract: The end-to-end neural combinatorial optimization (NCO) method shows promising performance in solving complex combinatorial optimization problems without the need for expert design. However, existing methods struggle with large-scale problems, hindering their practical applicability. To overcome this limitation, this work proposes a novel Self-Improved Learning (SIL) method for better scalability o…

    Submitted 2 May, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

  17. arXiv:2403.11810  [pdf, other]

    cs.CL

    Metaphor Understanding Challenge Dataset for LLMs

    Authors: Xiaoyu Tong, Rochelle Choenni, Martha Lewis, Ekaterina Shutova

    Abstract: Metaphors in natural language are a reflection of fundamental cognitive processes such as analogical reasoning and categorisation, and are deeply rooted in everyday communication. Metaphor understanding is therefore an essential task for large language models (LLMs). We release the Metaphor Understanding Challenge Dataset (MUNCH), designed to evaluate the metaphor understanding capabilities of LLM…

    Submitted 18 March, 2024; originally announced March 2024.

  18. arXiv:2403.11503  [pdf, other]

    cs.CV

    Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors

    Authors: Ruicheng Wang, Jianfeng Xiang, Jiaolong Yang, Xin Tong

    Abstract: We propose a novel image editing technique that enables 3D manipulations on single images, such as object rotation and translation. Existing 3D-aware image editing approaches typically rely on synthetic multi-view datasets for training specialized models, thus constraining their effectiveness on open-domain images featuring significantly more varied layouts and styles. In contrast, our method dire…

    Submitted 13 July, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: Project page: https://wangrc.site/Diff3DEdit/

  19. arXiv:2403.10082  [pdf, other]

    cs.CV

    CrossGLG: LLM Guides One-shot Skeleton-based 3D Action Recognition in a Cross-level Manner

    Authors: Tingbing Yan, Wenzheng Zeng, Yang Xiao, Xingyu Tong, Bo Tan, Zhiwen Fang, Zhiguo Cao, Joey Tianyi Zhou

    Abstract: Most existing one-shot skeleton-based action recognition focuses on raw low-level information (e.g., joint location), and may suffer from local information loss and low generalization ability. To alleviate these, we propose to leverage text description generated from large language models (LLM) that contain high-level human knowledge, to guide feature learning, in a global-local-global way. Partic…

    Submitted 15 March, 2024; originally announced March 2024.

  20. arXiv:2403.05808  [pdf, other]

    cs.CV eess.IV

    Adaptive Multi-modal Fusion of Spatially Variant Kernel Refinement with Diffusion Model for Blind Image Super-Resolution

    Authors: Junxiong Lin, Yan Wang, Zeng Tao, Boyang Wang, Qing Zhao, Haorang Wang, Xuan Tong, Xinji Mai, Yuxuan Lin, Wei Song, Jiawen Yu, Shaoqi Yan, Wenqiang Zhang

    Abstract: Pre-trained diffusion models utilized for image generation encapsulate a substantial reservoir of a priori knowledge pertaining to intricate textures. Harnessing the potential of leveraging this a priori knowledge in the context of image super-resolution presents a compelling avenue. Nonetheless, prevailing diffusion-based methodologies presently overlook the constraints imposed by degradation inf…

    Submitted 9 July, 2024; v1 submitted 9 March, 2024; originally announced March 2024.

  21. arXiv:2403.04294  [pdf, other]

    cs.CV

    A$^{3}$lign-DFER: Pioneering Comprehensive Dynamic Affective Alignment for Dynamic Facial Expression Recognition with CLIP

    Authors: Zeng Tao, Yan Wang, Junxiong Lin, Haoran Wang, Xinji Mai, Jiawen Yu, Xuan Tong, Ziheng Zhou, Shaoqi Yan, Qing Zhao, Liyuan Han, Wenqiang Zhang

    Abstract: The performance of CLIP in dynamic facial expression recognition (DFER) task doesn't yield exceptional results as observed in other CLIP-based classification tasks. While CLIP's primary objective is to achieve alignment between images and text in the feature space, DFER poses challenges due to the abstract nature of text and the dynamic nature of video, making label representation limited and perf…

    Submitted 7 March, 2024; originally announced March 2024.

  22. arXiv:2402.16891  [pdf, other]

    cs.LG cs.AI

    Multi-Task Learning for Routing Problem with Cross-Problem Zero-Shot Generalization

    Authors: Fei Liu, Xi Lin, Zhenkun Wang, Qingfu Zhang, Xialiang Tong, Mingxuan Yuan

    Abstract: Vehicle routing problems (VRPs), which can be found in numerous real-world applications, have been an important research topic for several decades. Recently, the neural combinatorial optimization (NCO) approach that leverages a learning-based model to solve VRPs without manual algorithm design has gained substantial attention. However, current NCO methods typically require building one model for e…

    Submitted 12 April, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

  23. arXiv:2402.14253  [pdf, other]

    cs.CV cs.GR

    MVD$^2$: Efficient Multiview 3D Reconstruction for Multiview Diffusion

    Authors: Xin-Yang Zheng, Hao Pan, Yu-Xiao Guo, Xin Tong, Yang Liu

    Abstract: As a promising 3D generation technique, multiview diffusion (MVD) has received a lot of attention due to its advantages in terms of generalizability, quality, and efficiency. By finetuning pretrained large image diffusion models with 3D data, the MVD methods first generate multiple views of a 3D object based on an image or text prompt and then reconstruct 3D shapes with multiview 3D reconstruction…

    Submitted 21 February, 2024; originally announced February 2024.

  24. DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation

    Authors: Chong Zeng, Yue Dong, Pieter Peers, Youkang Kong, Hongzhi Wu, Xin Tong

    Abstract: This paper presents a novel method for exerting fine-grained lighting control during text-driven diffusion-based image generation. While existing diffusion models already have the ability to generate images under any lighting condition, without additional guidance these models tend to correlate image content and lighting. Moreover, text prompts lack the necessary expressional power to describe det…

    Submitted 27 May, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted to SIGGRAPH 2024. Project page: https://dilightnet.github.io/

    Journal ref: ACM SIGGRAPH 2024 Conference Proceedings

  25. arXiv:2402.07234  [pdf, other]

    cs.AI

    CPSDBench: A Large Language Model Evaluation Benchmark and Baseline for Chinese Public Security Domain

    Authors: Xin Tong, Bo Jin, Zhi Lin, Binjun Wang, Ting Yu, Qiang Cheng

    Abstract: Large Language Models (LLMs) have demonstrated significant potential and effectiveness across multiple application domains. To assess the performance of mainstream LLMs in public security tasks, this study aims to construct a specialized evaluation benchmark tailored to the Chinese public security domain--CPSDbench. CPSDbench integrates datasets related to public security collected from real-world…

    Submitted 21 March, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  26. arXiv:2401.04730  [pdf, other]

    cs.CV

    A Simple Baseline for Spoken Language to Sign Language Translation with 3D Avatars

    Authors: Ronglai Zuo, Fangyun Wei, Zenggui Chen, Brian Mak, Jiaolong Yang, Xin Tong

    Abstract: The objective of this paper is to develop a functional system for translating spoken languages into sign languages, referred to as Spoken2Sign translation. The Spoken2Sign task is orthogonal and complementary to traditional sign language to spoken language (Sign2Spoken) translation. To enable Spoken2Sign translation, we present a simple baseline consisting of three steps: 1) creating a gloss-video…

    Submitted 3 July, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

    Comments: Accepted by ECCV 2024

  27. arXiv:2401.02992  [pdf]

    cs.CL cs.AI

    Advanced Unstructured Data Processing for ESG Reports: A Methodology for Structured Transformation and Enhanced Analysis

    Authors: Jiahui Peng, Jing Gao, Xin Tong, Jing Guo, Hang Yang, Jianchuan Qi, Ruiqiao Li, Nan Li, Ming Xu

    Abstract: In the evolving field of corporate sustainability, analyzing unstructured Environmental, Social, and Governance (ESG) reports is a complex challenge due to their varied formats and intricate content. This study introduces an innovative methodology utilizing the "Unstructured Core Library", specifically tailored to address these challenges by transforming ESG reports into structured, analyzable for…

    Submitted 4 January, 2024; originally announced January 2024.

  28. arXiv:2401.02678  [pdf, other]

    cs.SD cs.MM eess.AS

    MusicAOG: an Energy-Based Model for Learning and Sampling a Hierarchical Representation of Symbolic Music

    Authors: Yikai Qian, Tianle Wang, Xinyi Tong, Xin Jin, Duo Xu, Bo Zheng, Tiezheng Ge, Feng Yu, Song-Chun Zhu

    Abstract: In addressing the challenge of interpretability and generalizability of artificial music intelligence, this paper introduces a novel symbolic representation that amalgamates both explicit and implicit musical information across diverse traditions and granularities. Utilizing a hierarchical and-or graph representation, the model employs nodes and edges to encapsulate a broad spectrum of musical ele…

    Submitted 5 January, 2024; originally announced January 2024.

  29. arXiv:2401.02051  [pdf, other]

    cs.NE cs.AI

    Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model

    Authors: Fei Liu, Xialiang Tong, Mingxuan Yuan, Xi Lin, Fu Luo, Zhenkun Wang, Zhichao Lu, Qingfu Zhang

    Abstract: Heuristics are widely used for dealing with complex search and optimization problems. However, manual design of heuristics can be often very labour extensive and requires rich working experience and knowledge. This paper proposes Evolution of Heuristic (EoH), a novel evolutionary paradigm that leverages both Large Language Models (LLMs) and Evolutionary Computation (EC) methods for Automatic Heuri…

    Submitted 1 June, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

  30. arXiv:2312.14828  [pdf, other]

    cs.CV

    Plan, Posture and Go: Towards Open-World Text-to-Motion Generation

    Authors: Jinpeng Liu, Wenxun Dai, Chunyu Wang, Yiji Cheng, Yansong Tang, Xin Tong

    Abstract: Conventional text-to-motion generation methods are usually trained on limited text-motion pairs, making them hard to generalize to open-world scenarios. Some works use the CLIP model to align the motion space and the text space, aiming to enable motion generation from natural language motion descriptions. However, they are still constrained to generate limited and unrealistic in-place motions. To…

    Submitted 22 December, 2023; originally announced December 2023.

  31. arXiv:2311.17707  [pdf, other]

    cs.CV

    SAMPro3D: Locating SAM Prompts in 3D for Zero-Shot Scene Segmentation

    Authors: Mutian Xu, Xingyilang Yin, Lingteng Qiu, Yang Liu, Xin Tong, Xiaoguang Han

    Abstract: We introduce SAMPro3D for zero-shot 3D indoor scene segmentation. Given the 3D point cloud and multiple posed 2D frames of 3D scenes, our approach segments 3D scenes by applying the pretrained Segment Anything Model (SAM) to 2D frames. Our key idea involves locating 3D points in scenes as natural 3D prompts to align their projected pixel prompts across frames, ensuring frame-consistency in both pi…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Project page: https://mutianxu.github.io/sampro3d/

  32. arXiv:2311.17510  [pdf, other]

    cs.CV

    StructRe: Rewriting for Structured Shape Modeling

    Authors: Jiepeng Wang, Hao Pan, Yang Liu, Xin Tong, Taku Komura, Wenping Wang

    Abstract: Man-made 3D shapes are naturally organized in parts and hierarchies; such structures provide important constraints for shape reconstruction and generation. Modeling shape structures is difficult, because there can be multiple hierarchies for a given shape, causing ambiguity, and across different categories the shape structures are correlated with semantics, limiting generalization. We present Stru…

    Submitted 29 November, 2023; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Our project page: https://jiepengwang.github.io/StructRe/

  33. arXiv:2311.15249  [pdf, other]

    cs.NE cs.AI cs.LG

    Algorithm Evolution Using Large Language Model

    Authors: Fei Liu, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

    Abstract: Optimization can be found in many real-life applications. Designing an effective algorithm for a specific optimization problem typically requires a tedious amount of effort from human experts with domain knowledge and algorithm design skills. In this paper, we propose a novel approach called Algorithm Evolution using Large Language Model (AEL). It utilizes a large language model (LLM) to automatic…

    Submitted 26 November, 2023; originally announced November 2023.

  34. arXiv:2311.10990  [pdf, other]

    cs.CY cs.CR econ.GN q-fin.TR

    "Centralized or Decentralized?": Concerns and Value Judgments of Stakeholders in the Non-Fungible Tokens (NFTs) Market

    Authors: Yunpeng Xiao, Bufan Deng, Siqi Chen, Kyrie Zhixuan Zhou, Ray LC, Luyao Zhang, Xin Tong

    Abstract: Non-fungible tokens (NFTs) are decentralized digital tokens to represent the unique ownership of items. Recently, NFTs have been gaining popularity and at the same time bringing up issues, such as scams, racism, and sexism. Decentralization, a key attribute of NFT, contributes to some of the issues that are easier to regulate under centralized schemes, which are intentionally left out of the NFT m…

    Submitted 21 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Accepted by CSCW 2024

    ACM Class: J.4; K.4.1

  35. arXiv:2310.16676  [pdf, other]

    cs.CL

    SSLCL: An Efficient Model-Agnostic Supervised Contrastive Learning Framework for Emotion Recognition in Conversations

    Authors: Tao Shi, Xiao Liang, Yaoyuan Liang, Xinyi Tong, Shao-Lun Huang

    Abstract: Emotion recognition in conversations (ERC) is a rapidly evolving task within the natural language processing community, which aims to detect the emotions expressed by speakers during a conversation. Recently, a growing number of ERC methods have focused on leveraging supervised contrastive learning (SCL) to enhance the robustness and generalizability of learned features. However, current SCL-based…

    Submitted 10 December, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

  36. arXiv:2310.12541  [pdf, other]

    cs.NE cs.AI cs.CL cs.ET

    Large Language Model for Multi-objective Evolutionary Optimization

    Authors: Fei Liu, Xi Lin, Zhenkun Wang, Shunyu Yao, Xialiang Tong, Mingxuan Yuan, Qingfu Zhang

    Abstract: Multiobjective evolutionary algorithms (MOEAs) are major methods for solving multiobjective optimization problems (MOPs). Many MOEAs have been proposed in the past decades, of which the search operators need a carefully handcrafted design with domain knowledge. Recently, some attempts have been made to replace the manually designed operators in MOEAs with learning-based operators (e.g., neural net…

    Submitted 26 March, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  37. arXiv:2310.05837  [pdf, other]

    cs.CV cs.GR

    A Real-time Method for Inserting Virtual Objects into Neural Radiance Fields

    Authors: Keyang Ye, Hongzhi Wu, Xin Tong, Kun Zhou

    Abstract: We present the first real-time method for inserting a rigid virtual object into a neural radiance field, which produces realistic lighting and shadowing effects, as well as allows interactive manipulation of the object. By exploiting the rich information about lighting and geometry in a NeRF, our method overcomes several challenges of object insertion in augmented reality. For lighting estimation,…

    Submitted 9 October, 2023; originally announced October 2023.

  38. arXiv:2309.08121  [pdf, other]

    cs.HC cs.AI cs.CY

    "I'm Not Confident in Debiasing AI Systems Since I Know Too Little": Teaching AI Creators About Gender Bias Through Hands-on Tutorials

    Authors: Kyrie Zhixuan Zhou, Jiaxun Cao, Xiaowen Yuan, Daniel E. Weissglass, Zachary Kilhoffer, Madelyn Rose Sanfilippo, Xin Tong

    Abstract: Gender bias is rampant in AI systems, causing bad user experience, injustices, and mental harm to women. School curricula fail to educate AI creators on this topic, leaving them unprepared to mitigate gender bias in AI. In this paper, we designed hands-on tutorials to raise AI creators' awareness of gender bias in AI and enhance their knowledge of sources of gender bias and debiasing techniques. T…

    Submitted 14 September, 2023; originally announced September 2023.

  39. arXiv:2309.05092  [pdf, other]

    stat.ME cs.LG math.ST

    Adaptive conformal classification with noisy labels

    Authors: Matteo Sesia, Y. X. Rachel Wang, Xin Tong

    Abstract: This paper develops novel conformal prediction methods for classification tasks that can automatically adapt to random label contamination in the calibration sample, leading to more informative prediction sets with stronger coverage guarantees compared to state-of-the-art approaches. This is made possible by a precise characterization of the effective coverage inflation (or deflation) suffered by…

    Submitted 21 February, 2024; v1 submitted 10 September, 2023; originally announced September 2023.

    Comments: 28 pages (127 pages including references and appendices)

  40. arXiv:2309.02186  [pdf, other]

    cs.CV cs.AI cs.GR

    AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections

    Authors: Yue Wu, Sicheng Xu, Jianfeng Xiang, Fangyun Wei, Qifeng Chen, Jiaolong Yang, Xin Tong

    Abstract: Previous animatable 3D-aware GANs for human generation have primarily focused on either the human head or full body. However, head-only videos are relatively uncommon in real life, and full body generation typically does not deal with facial expression control and still has challenges in generating high-quality results. Towards applicable video avatars, we present an animatable 3D-aware GAN that g…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: SIGGRAPH Asia 2023. Project Page: https://yuewuhkust.github.io/AniPortraitGAN/

  41. Relighting Neural Radiance Fields with Shadow and Highlight Hints

    Authors: Chong Zeng, Guojun Chen, Yue Dong, Pieter Peers, Hongzhi Wu, Xin Tong

    Abstract: This paper presents a novel neural implicit radiance representation for free viewpoint relighting from a small set of unstructured photographs of an object lit by a moving point light source different from the view position. We express the shape as a signed distance function modeled by a multi layer perceptron. In contrast to prior relightable implicit neural representations, we do not disentangle…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: Accepted to SIGGRAPH 2023. Author's version. Project page: https://nrhints.github.io/

    Journal ref: ACM SIGGRAPH 2023 Conference Proceedings

  42. On the Mechanics of NFT Valuation: AI Ethics and Social Media

    Authors: Luyao Zhang, Yutong Sun, Yutong Quan, Jiaxun Cao, Xin Tong

    Abstract: As CryptoPunks pioneers the innovation of non-fungible tokens (NFTs) in AI and art, the valuation mechanics of NFTs has become a trending topic. Earlier research identifies the impact of ethics and society on the price prediction of CryptoPunks. Since the booming year of the NFT market in 2021, the discussion of CryptoPunks has propagated on social media. Still, existing literature hasn't consider…

    Submitted 21 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: Presented at ChainScience Conference, 2023 (arXiv:2307.03277v2 [cs.DC] 11 Jul 2023)

    Report number: ChainScience/2023/16

  43. arXiv:2307.10162  [pdf, other]

    cs.HC

    RTVis: Research Trend Visualization Toolkit

    Authors: Xingyu Shen, Yueqian Lin, Zhixian Zhang, Xin Tong

    Abstract: When researchers are about to start a new project or have just entered a new research field, choosing a proper research topic is always challenging. To help them have an overall understanding of the research trend in real-time and find out the research topic they are interested in, we developed the Research Trend Visualization toolkit (RTVis) to analyze and visualize the research paper information…

    Submitted 16 August, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

    Comments: Accepted by IEEE VIS 2023 (Poster). 2 pages, 1 figure. For our demo page, visit https://www.rtvis.design/

  44. arXiv:2307.04356  [pdf, other]

    cs.NE cs.CV

    InfLoR-SNN: Reducing Information Loss for Spiking Neural Networks

    Authors: Yufei Guo, Yuanpei Chen, Liwen Zhang, Xiaode Liu, Xinyi Tong, Yuanyuan Ou, Xuhui Huang, Zhe Ma

    Abstract: The Spiking Neural Network (SNN) has attracted more and more attention recently. It adopts binary spike signals to transmit information. Benefitting from the information passing paradigm of SNNs, the multiplications of activations and weights can be replaced by additions, which are more energy-efficient. However, its "Hard Reset" mechanism for the firing activity would ignore the difference among…

    Submitted 17 August, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

    Comments: Accepted by ECCV 2022

  45. arXiv:2307.00894  [pdf]

    cs.CV physics.soc-ph

    Mega-cities dominate China's urban greening

    Authors: Xiaoxin Zhang, Martin Brandt, Xiaoye Tong, Xiaowei Tong, Wenmin Zhang, Florian Reiner, Sizhuo Li, Feng Tian, Yuemin Yue, Weiqi Zhou, Bin Chen, Xiangming Xiao, Rasmus Fensholt

    Abstract: Trees play a crucial role in urban environments, offering various ecosystem services that contribute to public health and human well-being. China has initiated a range of urban greening policies over the past decades; however, monitoring their impact on urban tree dynamics at a national scale has proven challenging. In this study, we deployed nano-satellites to quantify urban tree coverage in all…

    Submitted 3 July, 2023; originally announced July 2023.

  46. arXiv:2306.11867  [pdf, other]

    cs.LG cs.DC

    Personalized Federated Learning with Feature Alignment and Classifier Collaboration

    Authors: Jian Xu, Xinyi Tong, Shao-Lun Huang

    Abstract: Data heterogeneity is one of the most challenging issues in federated learning, which motivates a variety of approaches to learn personalized models for participating clients. One such approach in deep neural networks based tasks is employing a shared feature representation and learning a customized classifier head for each client. However, previous works do not utilize the global knowledge during…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: ICLR 2023, fix some typos and add the code link

  47. arXiv:2306.04366  [pdf, other]

    cs.SI cs.AI cs.HC cs.LG

    Enhancing Worker Recruitment in Collaborative Mobile Crowdsourcing: A Graph Neural Network Trust Evaluation Approach

    Authors: Zhongwei Zhan, Yingjie Wang, Peiyong Duan, Akshita Maradapu Vera Venkata Sai, Zhaowei Liu, Chaocan Xiang, Xiangrong Tong, Weilong Wang, Zhipeng Cai

    Abstract: Collaborative Mobile Crowdsourcing (CMCS) allows platforms to recruit worker teams to collaboratively execute complex sensing tasks. The efficiency of such collaborations could be influenced by trust relationships among workers. To obtain the asymmetric trust values among all workers in the social network, the Trust Reinforcement Evaluation Framework (TREF) based on Graph Convolutional Neural Netw…

    Submitted 21 March, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: The article has been accepted by IEEE TMC, and its DOI is 10.1109/TMC.2024.3373469

  48. arXiv:2305.04461  [pdf, other]

    cs.CV cs.GR

    Locally Attentional SDF Diffusion for Controllable 3D Shape Generation

    Authors: Xin-Yang Zheng, Hao Pan, Peng-Shuai Wang, Xin Tong, Yang Liu, Heung-Yeung Shum

    Abstract: Although the recent rapid evolution of 3D generative neural networks greatly improves 3D shape generation, it is still not convenient for ordinary users to create 3D shapes and control the local geometry of generated shapes. To address these challenges, we propose a diffusion-based 3D generation framework -- locally attentional SDF diffusion, to model plausible 3D shapes, via 2D sketch image input…

    Submitted 8 May, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: Accepted to SIGGRAPH 2023 (Journal version)

    Journal ref: ACM Transactions on Graphics (SIGGRAPH), 42, 4 (August 2023), 13 pages

  49. arXiv:2304.07511  [pdf, other]

    cs.HC

    Pilgrimage to Pureland: Art, Perception and the Wutai Mural VR Reconstruction

    Authors: Rongxuan Mu, Yuhe Nie, Kent Cao, Ruoxin You, Yinzong Wei, Xin Tong

    Abstract: Virtual reality (VR) supports audiences to engage with cultural heritage proactively. We designed an easy-to-access and guided Pilgrimage To Pureland VR reconstruction of Dunhuang Mogao Grottoes to offer the general public an accessible and engaging way to explore the Dunhuang murals. We put forward an immersive VR reconstruction paradigm that can efficiently convert complex 2D artwork into a VR e…

    Submitted 15 April, 2023; originally announced April 2023.

  50. arXiv:2304.06911  [pdf, other]

    cs.CV

    3D Feature Prediction for Masked-AutoEncoder-Based Point Cloud Pretraining

    Authors: Siming Yan, Yuqi Yang, Yuxiao Guo, Hao Pan, Peng-shuai Wang, Xin Tong, Yang Liu, Qixing Huang

    Abstract: Masked autoencoders (MAE) have recently been introduced to 3D self-supervised pretraining for point clouds due to their great success in NLP and computer vision. Unlike MAEs used in the image domain, where the pretext task is to restore features at the masked pixels, such as colors, the existing 3D MAE works reconstruct the missing geometry only, i.e., the location of the masked points. In contrast…

    Submitted 28 April, 2024; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: Published in ICLR 2024