-
Fish-Vista: A Multi-Purpose Dataset for Understanding & Identification of Traits from Images
Authors:
Kazi Sajeed Mehrab,
M. Maruf,
Arka Daw,
Harish Babu Manogaran,
Abhilash Neog,
Mridul Khurana,
Bahadir Altintas,
Yasin Bakis,
Elizabeth G Campolongo,
Matthew J Thompson,
Xiaojun Wang,
Hilmar Lapp,
Wei-Lun Chao,
Paula M. Mabee,
Henry L. Bart Jr.,
Wasila Dahdul,
Anuj Karpatne
Abstract:
Fishes are integral to both ecological systems and economic sectors, and studying fish traits is crucial for understanding biodiversity patterns and macro-evolution trends. To enable the analysis of visual traits from fish images, we introduce the Fish-Visual Trait Analysis (Fish-Vista) dataset - a large, annotated collection of about 60K fish images spanning 1900 different species, supporting several challenging and biologically relevant tasks including species classification, trait identification, and trait segmentation. These images have been curated through a sophisticated data processing pipeline applied to a cumulative set of images obtained from various museum collections. Fish-Vista provides fine-grained labels of various visual traits present in each image. It also offers pixel-level annotations of 9 different traits for 2427 fish images, facilitating additional trait segmentation and localization tasks. The ultimate goal of Fish-Vista is to provide a clean, carefully curated, high-resolution dataset that can serve as a foundation for accelerating biological discoveries using advances in AI. Finally, we provide a comprehensive analysis of state-of-the-art deep learning techniques on Fish-Vista.
Submitted 10 July, 2024;
originally announced July 2024.
-
GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction
Authors:
Virginia K. Felkner,
Jennifer A. Thompson,
Jonathan May
Abstract:
Social biases in LLMs are usually measured via bias benchmark datasets. Current benchmarks have limitations in scope, grounding, quality, and human effort required. Previous work has shown success with a community-sourced, rather than crowd-sourced, approach to benchmark development. However, this work still required considerable effort from annotators with relevant lived experience. This paper explores whether an LLM (specifically, GPT-3.5-Turbo) can assist with the task of developing a bias benchmark dataset from responses to an open-ended community survey. We also extend the previous work to a new community and set of biases: the Jewish community and antisemitism. Our analysis shows that GPT-3.5-Turbo has poor performance on this annotation task and produces unacceptable quality issues in its output. Thus, we conclude that GPT-3.5-Turbo is not an appropriate substitute for human annotation in sensitive tasks related to social biases, and that its use actually negates many of the benefits of community-sourcing bias benchmarks.
Submitted 24 May, 2024;
originally announced May 2024.
-
Modeling Kinematic Uncertainty of Tendon-Driven Continuum Robots via Mixture Density Networks
Authors:
Jordan Thompson,
Brian Y. Cho,
Daniel S. Brown,
Alan Kuntz
Abstract:
Tendon-driven continuum robot kinematic models are frequently computationally expensive, inaccurate due to unmodeled effects, or both. In particular, unmodeled effects produce uncertainties that arise during the robot's operation that lead to variability in the resulting geometry. We propose a novel solution to these issues through the development of a Gaussian mixture kinematic model. We train a mixture density network to output a Gaussian mixture model representation of the robot geometry given the current tendon displacements. This model computes a probability distribution that is more representative of the true distribution of geometries at a given configuration than a model that outputs a single geometry, while also reducing the computation time. We demonstrate one use of this model through a trajectory optimization method that explicitly reasons about the workspace uncertainty to minimize the probability of collision.
Submitted 5 April, 2024;
originally announced April 2024.
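A mixture density network of the kind described in the abstract above is typically trained by minimizing the negative log-likelihood of the observed output under the predicted Gaussian mixture. The following is a minimal sketch of that objective for a scalar output, not the authors' implementation; the function name `gmm_neg_log_likelihood` and the scalar, per-component standard deviation parameterization are assumptions made for illustration.

```python
import math


def gmm_neg_log_likelihood(x, weights, means, stds):
    """Negative log-likelihood of scalar x under a 1-D Gaussian mixture.

    weights, means, stds are equal-length sequences; weights are assumed
    to be positive and to sum to 1 (e.g. the output of a softmax layer).
    """
    log_terms = []
    for w, mu, sigma in zip(weights, means, stds):
        # log of the Gaussian normalizing constant 1 / (sigma * sqrt(2*pi))
        log_norm = -0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        log_terms.append(math.log(w) + log_norm - 0.5 * ((x - mu) / sigma) ** 2)
    # log-sum-exp keeps the sum stable when individual terms are very negative
    m = max(log_terms)
    return -(m + math.log(sum(math.exp(t - m) for t in log_terms)))
```

In a network, `weights`, `means`, and `stds` would be produced by the output layer for each input (here, the tendon displacements), and the loss would be averaged over the training set.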
-
Accounting for Hysteresis in the Forward Kinematics of Nonlinearly-Routed Tendon-Driven Continuum Robots via a Learned Deep Decoder Network
Authors:
Brian Y. Cho,
Daniel S. Esser,
Jordan Thompson,
Bao Thach,
Robert J. Webster III,
Alan Kuntz
Abstract:
Tendon-driven continuum robots have been gaining popularity in medical applications due to their ability to curve around complex anatomical structures, potentially reducing the invasiveness of surgery. However, accurate modeling is required to plan and control the movements of these flexible robots. Physics-based models have limitations due to unmodeled effects, leading to mismatches between model prediction and actual robot shape. Recently proposed learning-based methods have been shown to overcome some of these limitations but do not account for hysteresis, a significant source of error for these robots. To overcome these challenges, we propose a novel deep decoder neural network that predicts the complete shape of tendon-driven robots using point clouds as the shape representation, conditioned on prior configurations to account for hysteresis. We evaluate our method on a physical tendon-driven robot and show that our network model accurately predicts the robot's shape, significantly outperforming a state-of-the-art physics-based model and a learning-based model that does not account for hysteresis.
Submitted 4 April, 2024;
originally announced April 2024.
-
Stable numerics for finite-strain elasticity
Authors:
Rezgar Shakeri,
Leila Ghaffari,
Jeremy L. Thompson,
Jed Brown
Abstract:
A backward stable numerical calculation of a function with condition number $κ$ will have a relative accuracy of $κε_{\text{machine}}$. Standard formulations and software implementations of finite-strain elastic materials models make use of the deformation gradient $\boldsymbol F = I + \partial \boldsymbol u/\partial \boldsymbol X$ and Cauchy-Green tensors. These formulations are not numerically stable, leading to loss of several digits of accuracy when used in the small strain regime, and often precluding the use of single precision floating point arithmetic. We trace the source of this instability to specific points of numerical cancellation, interpretable as ill-conditioned steps. We show how to compute various strain measures in a stable way and how to transform common constitutive models to their stable representations, formulated in either initial or current configuration. The stable formulations all provide accuracy of order $ε_{\text{machine}}$. In many cases, the stable formulations have elegant representations in terms of appropriate strain measures and offer geometric intuition that is lacking in their standard representation. We show that algorithmic differentiation can stably compute stresses so long as the strain energy is expressed stably, and give principles for stable computation that can be applied to inelastic materials.
Submitted 8 July, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
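The cancellation described in the abstract above can be seen in a one-dimensional analogue. The sketch below is an illustration under stated assumptions, not the paper's formulation: it takes a scalar displacement gradient `du`, forms the stretch F = 1 + du, and compares the Green-Lagrange strain computed as (F*F - 1)/2 with the algebraically identical form du + du*du/2, which never adds 1 to a tiny number. The function names are hypothetical, and exact rational arithmetic from the standard library serves as the reference.

```python
from fractions import Fraction


def strain_naive(du):
    """Green-Lagrange strain via the stretch F = 1 + du (cancels for small du)."""
    F = 1.0 + du
    return 0.5 * (F * F - 1.0)


def strain_stable(du):
    """Same quantity expanded in du, so no large terms are subtracted."""
    return du + 0.5 * du * du


def relative_error(computed, du):
    """Relative error against the exact rational value of du + du^2 / 2."""
    d = Fraction(du)  # exact value of the floating-point input
    exact = d + d * d / 2
    return float(abs(Fraction(computed) - exact) / exact)


du = 1e-12  # small-strain regime
naive_err = relative_error(strain_naive(du), du)
stable_err = relative_error(strain_stable(du), du)
print(f"naive: {naive_err:.3e}  stable: {stable_err:.3e}")
```

In double precision the naive form loses many significant digits at this strain level, while the expanded form stays near machine precision; the same pattern motivates writing tensor strain measures in terms of the displacement gradient rather than the deformation gradient.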
-
BioCLIP: A Vision Foundation Model for the Tree of Life
Authors:
Samuel Stevens,
Jiaman Wu,
Matthew J Thompson,
Elizabeth G Campolongo,
Chan Hee Song,
David Edward Carlyn,
Li Dong,
Wasila M Dahdul,
Charles Stewart,
Tanya Berger-Wolf,
Wei-Lun Chao,
Yu Su
Abstract:
Images of the natural world, collected by a variety of cameras, from drones to individual phones, are increasingly abundant sources of biological information. There is an explosion of computational methods and tools, particularly computer vision, for extracting biologically relevant information from images for science and conservation. Yet most of these are bespoke approaches designed for a specific task and are not easily adaptable or extendable to new questions, contexts, and datasets. A vision model for general organismal biology questions on images is of timely need. To approach this, we curate and release TreeOfLife-10M, the largest and most diverse ML-ready dataset of biology images. We then develop BioCLIP, a foundation model for the tree of life, leveraging the unique properties of biology captured by TreeOfLife-10M, namely the abundance and variety of images of plants, animals, and fungi, together with the availability of rich structured biological knowledge. We rigorously benchmark our approach on diverse fine-grained biology classification tasks and find that BioCLIP consistently and substantially outperforms existing baselines (by 16% to 17% absolute). Intrinsic evaluation reveals that BioCLIP has learned a hierarchical representation conforming to the tree of life, shedding light on its strong generalizability. https://imageomics.github.io/bioclip has models, data and code.
Submitted 14 May, 2024; v1 submitted 30 November, 2023;
originally announced November 2023.
-
Neural-based Compression Scheme for Solar Image Data
Authors:
Ali Zafari,
Atefeh Khoshkhahtinat,
Jeremy A. Grajeda,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Laura E. Boucheron,
Barbara J. Thompson,
Michael S. F. Kirk,
Daniel da Silva
Abstract:
Studying the solar system and especially the Sun relies on the data gathered daily from space missions. These missions are data-intensive, and compressing this data so that it can be transferred efficiently to the ground station involves a two-sided trade-off. Stronger compression methods, by distorting the data, can increase data throughput at the cost of accuracy, which could affect scientific analysis of the data. On the other hand, preserving subtle details in the compressed data requires a high amount of data to be transferred, reducing the desired gains from compression. In this work, we propose a neural network-based lossy compression method to be used in NASA's data-intensive imagery missions. We chose NASA's SDO mission, which transmits 1.4 terabytes of data each day, as a proof of concept for the proposed algorithm. Specifically, we propose an adversarially trained neural network equipped with local and non-local attention modules to capture both the local and global structure of the image, resulting in a better trade-off in rate-distortion (RD) compared to conventional hand-engineered codecs. The RD variational autoencoder used in this work is jointly trained with a channel-dependent entropy model as a shared prior between the analysis and synthesis transforms to make the entropy coding of the latent code more effective. Our neural image compression algorithm outperforms currently-in-use and state-of-the-art codecs such as JPEG and JPEG-2000 in terms of the RD performance when compressing extreme-ultraviolet (EUV) data. As a proof of concept for use of this algorithm in SDO data analysis, we have performed coronal hole (CH) detection using our compressed images, and generated consistent segmentations, even at a compression rate of $\sim0.1$ bits per pixel (compared to 8 bits per pixel on the original data) using EUV data from SDO.
Submitted 5 November, 2023;
originally announced November 2023.
-
Multi-spectral Entropy Constrained Neural Compression of Solar Imagery
Authors:
Ali Zafari,
Atefeh Khoshkhahtinat,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Barbara J. Thompson,
Michael S. F. Kirk,
Daniel da Silva
Abstract:
Missions studying the dynamic behaviour of the Sun are designed to capture multi-spectral images of the Sun and transmit them to the ground station on a daily basis. To make transmission efficient and feasible, image compression systems need to be exploited. Recently, successful end-to-end optimized neural network-based image compression systems have shown great potential to be used in an ad-hoc manner. In this work, we propose a transformer-based multi-spectral neural image compressor to efficiently capture both intra- and inter-wavelength redundancies. To unleash the locality of the window-based self-attention mechanism, we propose an inter-window aggregated token multi-head self-attention. Additionally, to make the neural compressor autoencoder shift invariant, a randomly shifted window attention mechanism is used, which makes the transformer blocks insensitive to translations in their input domain. We demonstrate that the proposed approach not only outperforms conventional compression algorithms but is also able to better decorrelate images along the multiple wavelengths compared to single spectral compression.
Submitted 10 October, 2023; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Context-Aware Neural Video Compression on Solar Dynamics Observatory
Authors:
Atefeh Khoshkhahtinat,
Ali Zafari,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Barbara J. Thompson,
Michael S. F. Kirk,
Daniel da Silva
Abstract:
NASA's Solar Dynamics Observatory (SDO) mission collects large data volumes of the Sun's daily activity. Data compression is crucial for space missions to reduce data storage and video bandwidth requirements by eliminating redundancies in the data. In this paper, we present a novel neural Transformer-based video compression approach specifically designed for the SDO images. Our primary objective is to efficiently exploit the temporal and spatial redundancies inherent in solar images to obtain a high compression ratio. Our proposed architecture benefits from a novel Transformer block called Fused Local-aware Window (FLaWin), which incorporates window-based self-attention modules and an efficient fused local-aware feed-forward (FLaFF) network. This architectural design allows us to simultaneously capture short-range and long-range information while facilitating the extraction of rich and diverse contextual representations. Moreover, this design choice results in reduced computational complexity. Experimental results demonstrate the significant contribution of the FLaWin Transformer block to the compression performance, outperforming conventional hand-engineered video codecs such as H.264 and H.265 in terms of rate-distortion trade-off.
Submitted 19 September, 2023;
originally announced September 2023.
-
Data Formulator: AI-powered Concept-driven Visualization Authoring
Authors:
Chenglong Wang,
John Thompson,
Bongshin Lee
Abstract:
With most modern visualization tools, authors need to transform their data into tidy formats to create the visualizations they want. Because this requires experience with programming or separate data processing tools, data transformation remains a barrier in visualization authoring. To address this challenge, we present a new visualization paradigm, concept binding, that separates high-level visualization intents and low-level data transformation steps, leveraging an AI agent. We realize this paradigm in Data Formulator, an interactive visualization authoring tool. With Data Formulator, authors first define data concepts they plan to visualize using natural language or examples, and then bind them to visual channels. Data Formulator then dispatches its AI agent to automatically transform the input data to surface these concepts and generate desired visualizations. When presenting the results (transformed table and output visualizations) from the AI agent, Data Formulator provides feedback to help authors inspect and understand them. A user study with 10 participants shows that participants could learn and use Data Formulator to create visualizations that involve challenging data transformations, and it points to interesting future research directions.
Submitted 27 October, 2023; v1 submitted 18 September, 2023;
originally announced September 2023.
-
Scalable Label-efficient Footpath Network Generation Using Remote Sensing Data and Self-supervised Learning
Authors:
Xinye Wanyan,
Sachith Seneviratne,
Kerry Nice,
Jason Thompson,
Marcus White,
Nano Langenheim,
Mark Stevenson
Abstract:
Footpath mapping, modeling, and analysis can provide important geospatial insights to many fields of study, including transport, health, environment and urban planning. The availability of robust Geographic Information System (GIS) layers can benefit the management of infrastructure inventories, especially at local government level with urban planners responsible for the deployment and maintenance of such infrastructure. However, many cities still lack real-time information on the location, connectivity, and width of footpaths, and/or employ costly and manual survey means to gather this information. This work designs and implements an automatic pipeline for generating footpath networks based on remote sensing images using machine learning models. The annotation of segmentation tasks, especially labeling remote sensing images with specialized requirements, is very expensive, so we aim to introduce a pipeline requiring less labeled data. Considering supervised methods require large amounts of training data, we use a self-supervised method for feature representation learning to reduce annotation requirements. Then the pre-trained model is used as the encoder of the U-Net for footpath segmentation. Based on the generated masks, the footpath polygons are extracted and converted to footpath networks which can be loaded and visualized by geographic information systems conveniently. Validation results indicate considerable consistency when compared to manually collected GIS layers. The footpath network generation pipeline proposed in this work is low-cost and extensible, and it can be applied where remote sensing images are available. Github: https://github.com/WennyXY/FootpathSeg.
Submitted 17 September, 2023;
originally announced September 2023.
-
WonderFlow: Narration-Centric Design of Animated Data Videos
Authors:
Yun Wang,
Leixian Shen,
Zhengxin You,
Xinhuan Shu,
Bongshin Lee,
John Thompson,
Haidong Zhang,
Dongmei Zhang
Abstract:
Creating an animated data video enriched with audio narration takes a significant amount of time and effort and requires expertise. Users not only need to design complex animations, but also turn written text scripts into audio narrations and synchronize visual changes with the narrations. This paper presents WonderFlow, an interactive authoring tool, that facilitates narration-centric design of animated data videos. WonderFlow allows authors to easily specify a semantic link between text and the corresponding chart elements. Then it automatically generates audio narration by leveraging text-to-speech techniques and aligns the narration with an animation. WonderFlow provides a visualization structure-aware animation library designed to ease chart animation creation, enabling authors to apply pre-designed animation effects to common visualization components. It also allows authors to preview and iteratively refine their data videos in a unified system, without having to switch between different creation tools. To evaluate WonderFlow's effectiveness and usability, we created an example gallery and conducted a user study and expert interviews. The results demonstrated that WonderFlow is easy to use and simplifies the creation of data videos with narration-animation interplay.
Submitted 6 June, 2024; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Mixed-type Distance Shrinkage and Selection for Clustering via Kernel Metric Learning
Authors:
Jesse S. Ghashti,
John R. J. Thompson
Abstract:
Distance-based clustering and classification are widely used in various fields to group mixed numeric and categorical data. In many algorithms, a predefined distance measurement is used to cluster data points based on their dissimilarity. While there exist numerous distance-based measures for data with purely numerical attributes and several ordered and unordered categorical metrics, an efficient and accurate distance for mixed-type data that utilizes the continuous and discrete properties simultaneously is an open problem. Many metrics convert numerical attributes to categorical ones or vice versa. They handle the data points as a single attribute type or calculate a distance between each attribute separately and add them up. We propose a metric called KDSUM that uses mixed kernels to measure dissimilarity, with cross-validated optimal bandwidth selection. We demonstrate that KDSUM is a shrinkage method from existing mixed-type metrics to a uniform dissimilarity metric, and improves clustering accuracy when utilized in existing distance-based clustering algorithms on simulated and real-world datasets containing continuous-only, categorical-only, and mixed-type data.
Submitted 30 August, 2023; v1 submitted 2 June, 2023;
originally announced June 2023.
-
All-path convexity: Combinatorial and complexity aspects
Authors:
Fábio Protti,
João V. C. Thompson
Abstract:
Let $\mathcal{P}$ be any collection of paths of a graph $G=(V,E)$. For $S\subseteq V$, define $I(S)=S\cup\{v\mid v \ \mbox{lies in a path of} \ \mathcal{P}\ \mbox{with endpoints in} \ S\}$. Let $\mathcal{C}$ be the collection of fixed points of the function $I$, that is, $\mathcal{C}=\{S\subseteq V\mid I(S)=S\}$. It is well known that $(V,\mathcal{C})$ is a finite convexity space, where the members of $\mathcal{C}$ are precisely the convex sets. If $\mathcal{P}$ is taken as the collection of all the paths of $G$, then $(V,\mathcal{C})$ is the {\em all-path convexity} with respect to graph $G$. In this work we study how important parameters and problems in graph convexity are solved for the all-path convexity.
Submitted 31 March, 2023;
originally announced March 2023.
-
Channelformer: Attention based Neural Solution for Wireless Channel Estimation and Effective Online Training
Authors:
Dianxin Luan,
John Thompson
Abstract:
In this paper, we propose an encoder-decoder neural architecture (called Channelformer) to achieve improved channel estimation for orthogonal frequency-division multiplexing (OFDM) waveforms in downlink scenarios. The self-attention mechanism is employed to achieve input precoding for the input features before processing them in the decoder. In particular, we implement multi-head attention in the encoder and a residual convolutional neural architecture as the decoder. We also employ a customized weight-level pruning to slim the trained neural network with a fine-tuning process, which reduces the computational complexity significantly to realize a low-complexity, low-latency solution. This enables reductions of up to 70% in the parameters, while maintaining an almost identical performance compared with the complete Channelformer. We also propose an effective online training method based on the fifth generation (5G) new radio (NR) configuration for modern communication systems, which only needs the available information at the receiver for online training. Using industrial standard channel models, the simulations of attention-based solutions show superior estimation performance compared with other candidate neural network methods for channel estimation.
Submitted 8 February, 2023;
originally announced February 2023.
-
Achieving Robust Generalization for Wireless Channel Estimation Neural Networks by Designed Training Data
Authors:
Dianxin Luan,
John Thompson
Abstract:
In this paper, we propose a method to design the training data that can support robust generalization of trained neural networks to unseen channels. The proposed design that improves the generalization is described and analysed. It avoids the requirement of online training for previously unseen channels, as this is a memory and processing intensive solution, especially for battery powered mobile terminals. To prove the validity of the proposed method, we use the channels modelled by different standards and fading modelling for simulation. We also use an attention-based structure and a convolutional neural network to evaluate the generalization results achieved. Simulation results show that the trained neural networks maintain almost identical performance on the unseen channels.
Submitted 4 February, 2023;
originally announced February 2023.
-
NFTrig
Authors:
Jordan Thompson,
Ryan Benac,
Kidus Olana,
Talha Hassan,
Andrew Sward,
Tauheed Khan Mohd
Abstract:
NFTrig is a web-based application created for use as an educational tool to teach trigonometry and blockchain technology. Creation of the application includes front and back end development as well as integration with other outside sources including MetaMask and OpenSea. The primary development languages include HTML, CSS (Bootstrap 5), and JavaScript as well as Solidity for smart contract creation. The application itself is hosted on Moralis utilizing their Web3 API. This technical report describes how the application was created, what the application requires, and smart contract design with security considerations in mind. The NFTrig application has undergone significant testing and validation prior to and after deployment. Future suggestions and recommendations for further development, maintenance, and use in other fields for education are also described.
Submitted 21 December, 2022;
originally announced January 2023.
-
Attention-Based Generative Neural Image Compression on Solar Dynamics Observatory
Authors:
Ali Zafari,
Atefeh Khoshkhahtinat,
Piyush M. Mehta,
Nasser M. Nasrabadi,
Barbara J. Thompson,
Daniel da Silva,
Michael S. F. Kirk
Abstract:
NASA's Solar Dynamics Observatory (SDO) mission gathers 1.4 terabytes of data each day from its geosynchronous orbit in space. SDO data includes images of the Sun captured at different wavelengths, with the primary scientific goal of understanding the dynamic processes governing the Sun. Recently, end-to-end optimized artificial neural networks (ANN) have shown great potential in performing image compression. ANN-based compression schemes have outperformed conventional hand-engineered algorithms for lossy and lossless image compression. We have designed an ad-hoc ANN-based image compression scheme to reduce the amount of data needed to be stored and retrieved on space missions studying solar dynamics. In this work, we propose an attention module to make use of both local and non-local attention mechanisms in an adversarially trained neural image compression network. We have also demonstrated the superior perceptual quality of this neural image compressor. Our proposed algorithm for compressing images downloaded from the SDO spacecraft performs better in rate-distortion trade-off than the popular currently-in-use image compression codecs such as JPEG and JPEG2000. In addition, we have shown that the proposed method outperforms the state-of-the-art lossy transform coding compression codec, i.e., BPG.
Submitted 4 May, 2023; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Urban feature analysis from aerial remote sensing imagery using self-supervised and semi-supervised computer vision
Authors:
Sachith Seneviratne,
Jasper S. Wijnands,
Kerry Nice,
Haifeng Zhao,
Branislava Godic,
Suzanne Mavoa,
Rajith Vidanaarachchi,
Mark Stevenson,
Leandro Garcia,
Ruth F. Hunter,
Jason Thompson
Abstract:
Analysis of overhead imagery using computer vision is a problem that has received considerable attention in academic literature. Most techniques that operate in this space are both highly specialised and require expensive manual annotation of large datasets. These problems are addressed here through the development of a more generic framework, incorporating advances in representation learning which allows for more flexibility in analysing new categories of imagery with limited labeled data. First, a robust representation of an unlabeled aerial imagery dataset was created based on the momentum contrast mechanism. This was subsequently specialised for different tasks by building accurate classifiers with as few as 200 labeled images. The successful low-level detection of urban infrastructure evolution over a 10-year period from 60 million unlabeled images exemplifies the substantial potential of our approach to advance quantitative urban research.
Submitted 16 August, 2022;
originally announced August 2022.
-
DALLE-URBAN: Capturing the urban design expertise of large text to image transformers
Authors:
Sachith Seneviratne,
Damith Senanayake,
Sanka Rasnayaka,
Rajith Vidanaarachchi,
Jason Thompson
Abstract:
Automatically converting text descriptions into images using transformer architectures has recently received considerable attention. Such advances have implications for many applied design disciplines across fashion, art, architecture, urban planning, landscape design and the future tools available to such disciplines. However, a detailed analysis capturing the capabilities of such models, specifically with a focus on the built environment, has not been performed to date. In this work, we investigate the capabilities and biases of such text-to-image methods as they apply to the built environment in detail. We use a systematic grammar to generate queries related to the built environment and evaluate resulting generated images. We generate 1020 different images and find that text to image transformers are robust at generating realistic images across different domains for this use-case. Generated imagery can be found on GitHub: https://github.com/sachith500/DALLEURBAN
Submitted 3 October, 2022; v1 submitted 3 August, 2022;
originally announced August 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
6G Network AI Architecture for Everyone-Centric Customized Services
Authors:
Yang Yang,
Mulei Ma,
Hequan Wu,
Quan Yu,
Ping Zhang,
Xiaohu You,
Jianjun Wu,
Chenghui Peng,
Tak-Shing Peter Yum,
Sherman Shen,
Hamid Aghvami,
Geoffrey Y Li,
Jiangzhou Wang,
Guangyi Liu,
Peng Gao,
Xiongyan Tang,
Chang Cao,
John Thompson,
Kat-Kit Wong,
Shanzhi Chen,
Merouane Debbah,
Schahram Dustdar,
Frank Eliassen,
Tao Chen,
Xiangyang Duan
, et al. (29 additional authors not shown)
Abstract:
Mobile communication standards were developed for enhancing transmission and network performance by using more radio resources and improving spectrum and energy efficiency. How to effectively address diverse user requirements and guarantee everyone's Quality of Experience (QoE) remains an open problem. The Sixth Generation (6G) mobile systems will solve this problem by utilizing heterogeneous network resources and pervasive intelligence to support everyone-centric customized services anywhere and anytime. In this article, we first coin the concept of Service Requirement Zone (SRZ) on the user side to characterize and visualize the integrated service requirements and preferences of specific tasks of individual users. On the system side, we further introduce the concept of User Satisfaction Ratio (USR) to evaluate the system's overall service ability of satisfying a variety of tasks with different SRZs. Then, we propose a network Artificial Intelligence (AI) architecture with integrated network resources and pervasive AI capabilities for supporting customized services with guaranteed QoEs. Finally, extensive simulations show that the proposed network AI architecture can consistently offer a higher USR performance than the cloud AI and edge AI architectures with respect to different task scheduling algorithms, random service requirements, and dynamic network conditions.
Submitted 6 December, 2023; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Attention Based Neural Networks for Wireless Channel Estimation
Authors:
Dianxin Luan,
John Thompson
Abstract:
In this paper, we deploy the self-attention mechanism to achieve improved channel estimation for orthogonal frequency-division multiplexing waveforms in the downlink. Specifically, we propose a new hybrid encoder-decoder structure (called HA02) for the first time which exploits the attention mechanism to focus on the most important input information. In particular, we implement a transformer encoder block as the encoder to achieve the sparsity in the input features and a residual neural network as the decoder respectively, inspired by the success of the attention mechanism. Using 3GPP channel models, our simulations show superior estimation performance compared with other candidate neural network methods for channel estimation.
Submitted 28 April, 2022;
originally announced April 2022.
-
Federated Learning Enables Big Data for Rare Cancer Boundary Detection
Authors:
Sarthak Pati,
Ujjwal Baid,
Brandon Edwards,
Micah Sheller,
Shih-Han Wang,
G Anthony Reina,
Patrick Foley,
Alexey Gruzdev,
Deepthi Karkada,
Christos Davatzikos,
Chiharu Sako,
Satyam Ghodasara,
Michel Bilello,
Suyash Mohan,
Philipp Vollmuth,
Gianluca Brugnara,
Chandrakanth J Preetha,
Felix Sahm,
Klaus Maier-Hein,
Maximilian Zenk,
Martin Bendszus,
Wolfgang Wick,
Evan Calabrese,
Jeffrey Rudie,
Javier Villanueva-Meyer
, et al. (254 additional authors not shown)
Abstract:
Although machine learning (ML) has shown promise in numerous domains, there are concerns about generalizability to out-of-sample data. This is currently addressed by centrally sharing ample, and importantly diverse, data from multiple sites. However, such centralization is challenging to scale (or even not feasible) due to various limitations. Federated ML (FL) provides an alternative to train accurate and generalizable ML models, by only sharing numerical model updates. Here we present findings from the largest FL study to-date, involving data from 71 healthcare institutions across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, utilizing the largest dataset of such patients ever used in the literature (25,256 MRI scans from 6,314 patients). We demonstrate a 33% improvement over a publicly trained model to delineate the surgically targetable tumor, and 23% improvement over the tumor's entire extent. We anticipate our study to: 1) enable more studies in healthcare informed by large and diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further quantitative analyses for glioblastoma via performance optimization of our consensus model for eventual public release, and 3) demonstrate the effectiveness of FL at such scale and task complexity as a paradigm shift for multi-site collaborations, alleviating the need for data sharing.
Submitted 25 April, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Performance Portable Solid Mechanics via Matrix-Free $p$-Multigrid
Authors:
Jed Brown,
Valeria Barra,
Natalie Beams,
Leila Ghaffari,
Matthew Knepley,
William Moses,
Rezgar Shakeri,
Karen Stengel,
Jeremy L. Thompson,
Junchao Zhang
Abstract:
Finite element analysis of solid mechanics is a foundational tool of modern engineering, with low-order finite element methods and assembled sparse matrices representing the industry standard for implicit analysis. We use performance models and numerical experiments to demonstrate that high-order methods greatly reduce the costs to reach engineering tolerances while enabling effective use of GPUs; these data structures also offer up to 2x benefit for linear elements. We demonstrate the reliability, efficiency, and scalability of matrix-free $p$-multigrid methods with algebraic multigrid coarse solvers through large deformation hyperelastic simulations of multiscale structures. We investigate accuracy, cost, and execution time on multi-node CPU and GPU systems for moderate to large models (millions to billions of degrees of freedom) using AMD MI250X (OLCF Crusher), NVIDIA A100 (NERSC Perlmutter), and V100 (LLNL Lassen and OLCF Summit), resulting in order of magnitude efficiency improvements over a broad range of model properties and scales. We discuss efficient matrix-free representation of Jacobians and demonstrate how automatic differentiation enables rapid development of nonlinear material models without impacting debuggability and workflows targeting GPUs. The methods are broadly applicable and amenable to common workflows, presented here via open source libraries that encapsulate all GPU-specific aspects and are accessible to both new and legacy code, allowing application code to be GPU-oblivious without compromising end-to-end performance on GPUs.
Submitted 23 May, 2022; v1 submitted 4 April, 2022;
originally announced April 2022.
-
Self-Supervision, Remote Sensing and Abstraction: Representation Learning Across 3 Million Locations
Authors:
Sachith Seneviratne,
Kerry A. Nice,
Jasper S. Wijnands,
Mark Stevenson,
Jason Thompson
Abstract:
Self-supervision based deep learning classification approaches have received considerable attention in academic literature. However, the performance of such methods on remote sensing imagery domains remains under-explored. In this work, we explore contrastive representation learning methods on the task of imagery-based city classification, an important problem in urban computing. We use satellite and map imagery across 2 domains, 3 million locations and more than 1500 cities. We show that self-supervised methods can build a generalizable representation from as few as 200 cities, with representations achieving over 95% accuracy in unseen cities with minimal additional training. We also find that the performance discrepancy of such methods, when compared to supervised methods, induced by the domain discrepancy between natural imagery and abstract imagery is significant for remote sensing imagery. We compare all analysis against existing supervised models from academic literature and open-source our models for broader usage and further criticism.
Submitted 8 March, 2022;
originally announced March 2022.
-
Low Complexity Channel estimation with Neural Network Solutions
Authors:
Dianxin Luan,
John Thompson
Abstract:
Research on machine learning for channel estimation, especially neural network solutions for wireless communications, is attracting significant current interest. This is because conventional methods cannot meet the present demands of high-speed communication. In this paper, we deploy a general residual convolutional neural network to achieve channel estimation for orthogonal frequency-division multiplexing (OFDM) signals in a downlink scenario. Our method also deploys a simple interpolation layer to replace the transposed convolutional layer used in other networks to reduce the computation cost. The proposed method is more easily adapted to different pilot patterns and packet sizes. Compared with other deep learning methods for channel estimation, our results for 3GPP channel models suggest improved mean squared error performance for our approach.
Submitted 24 January, 2022;
originally announced January 2022.
-
QuALITY: Question Answering with Long Input Texts, Yes!
Authors:
Richard Yuanzhe Pang,
Alicia Parrish,
Nitish Joshi,
Nikita Nangia,
Jason Phang,
Angelica Chen,
Vishakh Padmakumar,
Johnny Ma,
Jana Thompson,
He He,
Samuel R. Bowman
Abstract:
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process. Unlike in prior work with passages, our questions are written and validated by contributors who have read the entire passage, rather than relying on summaries or excerpts. In addition, only half of the questions are answerable by annotators working under tight time constraints, indicating that skimming and simple search are not enough to consistently perform well. Our baseline models perform poorly on this task (55.4%) and significantly lag behind human performance (93.5%).
Submitted 11 May, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
-
BBQ: A Hand-Built Bias Benchmark for Question Answering
Authors:
Alicia Parrish,
Angelica Chen,
Nikita Nangia,
Vishakh Padmakumar,
Jason Phang,
Jana Thompson,
Phu Mon Htut,
Samuel R. Bowman
Abstract:
It is well documented that NLP models learn social biases, but little work has been done on how these biases manifest in model outputs for applied tasks like question answering (QA). We introduce the Bias Benchmark for QA (BBQ), a dataset of question sets constructed by the authors that highlight attested social biases against people belonging to protected classes along nine social dimensions relevant for U.S. English-speaking contexts. Our task evaluates model responses at two levels: (i) given an under-informative context, we test how strongly responses reflect social biases, and (ii) given an adequately informative context, we test whether the model's biases override a correct answer choice. We find that models often rely on stereotypes when the context is under-informative, meaning the model's outputs consistently reproduce harmful biases in this setting. Though models are more accurate when the context provides an informative answer, they still rely on stereotypes and average up to 3.4 percentage points higher accuracy when the correct answer aligns with a social bias than when it conflicts, with this difference widening to over 5 points on examples targeting gender for most models tested.
Submitted 15 March, 2022; v1 submitted 15 October, 2021;
originally announced October 2021.
-
Efficient Exascale Discretizations: High-Order Finite Element Methods
Authors:
Tzanio Kolev,
Paul Fischer,
Misun Min,
Jack Dongarra,
Jed Brown,
Veselin Dobrev,
Tim Warburton,
Stanimire Tomov,
Mark S. Shephard,
Ahmad Abdelfattah,
Valeria Barra,
Natalie Beams,
Jean-Sylvain Camier,
Noel Chalmers,
Yohann Dudouit,
Ali Karakus,
Ian Karlin,
Stefan Kerkemeier,
Yu-Hsiang Lan,
David Medina,
Elia Merzari,
Aleksandr Obabko,
Will Pazner,
Thilina Rathnayake,
Cameron W. Smith
, et al. (5 additional authors not shown)
Abstract:
Efficient exploitation of exascale architectures requires rethinking of the numerical algorithms used in many large-scale applications. These architectures favor algorithms that expose ultra fine-grain parallelism and maximize the ratio of floating point operations to energy intensive data movement. One of the few viable approaches to achieve high efficiency in the area of PDE discretizations on unstructured grids is to use matrix-free/partially-assembled high-order finite element methods, since these methods can increase the accuracy and/or lower the computational time due to reduced data motion. In this paper we provide an overview of the research and development activities in the Center for Efficient Exascale Discretizations (CEED), a co-design center in the Exascale Computing Project that is focused on the development of next-generation discretization software and algorithms to enable a wide range of finite element applications to run efficiently on future hardware. CEED is a research partnership involving more than 30 computational scientists from two US national labs and five universities, including members of the Nek5000, MFEM, MAGMA and PETSc projects. We discuss the CEED co-design activities based on targeted benchmarks, miniapps and discretization libraries and our work on performance optimizations for large-scale GPU architectures. We also provide a broad overview of research and development activities in areas such as unstructured adaptive mesh refinement algorithms, matrix-free linear solvers, high-order data visualization, and list examples of collaborations with several ECP and external applications.
Submitted 10 September, 2021;
originally announced September 2021.
-
An explicit vector algorithm for high-girth MaxCut
Authors:
Jessica K. Thompson,
Ojas Parekh,
Kunal Marwaha
Abstract:
We give an approximation algorithm for MaxCut and provide guarantees on the average fraction of edges cut on $d$-regular graphs of girth $\geq 2k$. For every $d \geq 3$ and $k \geq 4$, our approximation guarantees are better than those of all other classical and quantum algorithms known to the authors. Our algorithm constructs an explicit vector solution to the standard semidefinite relaxation of MaxCut and applies hyperplane rounding. It may be viewed as a simplification of the previously best known technique, which approximates Gaussian wave processes on the infinite $d$-regular tree.
Submitted 27 August, 2021;
originally announced August 2021.
-
Quantum adaptive agents with efficient long-term memories
Authors:
Thomas J. Elliott,
Mile Gu,
Andrew J. P. Garner,
Jayne Thompson
Abstract:
Central to the success of adaptive systems is their ability to interpret signals from their environment and respond accordingly -- they act as agents interacting with their surroundings. Such agents typically perform better when able to execute increasingly complex strategies. This comes with a cost: the more information the agent must recall from its past experiences, the more memory it will need. Here we investigate the power of agents capable of quantum information processing. We uncover the most general form a quantum agent need adopt to maximise memory compression advantages, and provide a systematic means of encoding their memory states. We show these encodings can exhibit extremely favourable scaling advantages relative to memory-minimal classical agents, particularly when information must be retained about events increasingly far into the past.
Submitted 11 January, 2022; v1 submitted 24 August, 2021;
originally announced August 2021.
-
Smartphone Camera Oximetry in an Induced Hypoxemia Study
Authors:
Jason S. Hoffman,
Varun Viswanath,
Xinyi Ding,
Matthew J. Thompson,
Eric C. Larson,
Shwetak N. Patel,
Edward Wang
Abstract:
Hypoxemia, a medical condition that occurs when the blood is not carrying enough oxygen to adequately supply the tissues, is a leading indicator for dangerous complications of respiratory diseases like asthma, COPD, and COVID-19. While purpose-built pulse oximeters can provide accurate blood-oxygen saturation (SpO$_2$) readings that allow for diagnosis of hypoxemia, enabling this capability in unmodified smartphone cameras via a software update could give more people access to important information about their health, as well as improve physicians' ability to remotely diagnose and treat respiratory conditions. In this work, we take a step towards this goal by performing the first clinical development validation on a smartphone-based SpO$_2$ sensing system using a varied fraction of inspired oxygen (FiO$_2$) protocol, creating a clinically relevant validation dataset for solely smartphone-based methods on a wide range of SpO$_2$ values (70%-100%) for the first time. This contrasts with previous studies, which evaluated performance on a far smaller range (85%-100%). We build a deep learning model using this data to demonstrate accurate reporting of SpO$_2$ level with an overall MAE=5.00% SpO$_2$ and identifying positive cases of low SpO$_2$<90% with 81% sensitivity and 79% specificity. We ground our analysis with a summary of recent literature in smartphone-based SpO$_2$ monitoring, and we provide the data from the FiO$_2$ study in open-source format, so that others may build on this work.
Submitted 31 March, 2021;
originally announced April 2021.
-
Identifying safe intersection design through unsupervised feature extraction from satellite imagery
Authors:
Jasper S. Wijnands,
Haifeng Zhao,
Kerry A. Nice,
Jason Thompson,
Katherine Scully,
Jingqiu Guo,
Mark Stevenson
Abstract:
The World Health Organization has listed the design of safer intersections as a key intervention to reduce global road trauma. This article presents the first study to systematically analyze the design of all intersections in a large country, based on aerial imagery and deep learning. Approximately 900,000 satellite images were downloaded for all intersections in Australia and customized computer vision techniques emphasized the road infrastructure. A deep autoencoder extracted high-level features, including the intersection's type, size, shape, lane markings, and complexity, which were used to cluster similar designs. An Australian telematics data set linked infrastructure design to driving behaviors captured during 66 million kilometers of driving. This showed more frequent hard acceleration events (per vehicle) at four- than three-way intersections, relatively low hard deceleration frequencies at T-intersections, and consistently low average speeds on roundabouts. Overall, domain-specific feature extraction enabled the identification of infrastructure improvements that could result in safer driving behaviors, potentially reducing road trauma.
Submitted 28 October, 2020;
originally announced October 2020.
-
Boosting on the shoulders of giants in quantum device calibration
Authors:
Alex Wozniakowski,
Jayne Thompson,
Mile Gu,
Felix Binder
Abstract:
Traditional machine learning applications, such as optical character recognition, arose from the inability to explicitly program a computer to perform a routine task. In this context, learning algorithms usually derive a model exclusively from the evidence present in a massive dataset. Yet in some scientific disciplines, obtaining an abundance of data is an impractical luxury; however, there is an explicit model of the domain based upon previous scientific discoveries. Here we introduce a new approach to machine learning that is able to leverage prior scientific discoveries in order to improve generalizability over a scientific model. We show its efficacy in predicting the entire energy spectrum of a Hamiltonian on a superconducting quantum device, a key task in present quantum computer calibration. Our accuracy surpasses the current state-of-the-art by over $20\%$. Our approach thus demonstrates how artificial intelligence can be further enhanced by "standing on the shoulders of giants."
Submitted 13 May, 2020;
originally announced May 2020.
-
Knowledge Patterns
Authors:
Peter Clark,
John Thompson,
Bruce Porter
Abstract:
This paper describes a new technique, called "knowledge patterns", for helping construct axiom-rich, formal ontologies, based on identifying and explicitly representing recurring patterns of knowledge (theory schemata) in the ontology, and then stating how those patterns map onto domain-specific concepts in the ontology. From a modeling perspective, knowledge patterns provide an important insight into the structure of a formal ontology: rather than viewing a formal ontology simply as a list of terms and axioms, knowledge patterns views it as a collection of abstract, modular theories (the "knowledge patterns") plus a collection of modeling decisions stating how different aspects of the world can be modeled using those theories. Knowledge patterns make both those abstract theories and their mappings to the domain of interest explicit, thus making modeling decisions clear, and avoiding some of the ontological confusion that can otherwise arise. In addition, from a computational perspective, knowledge patterns provide a simple and computationally efficient mechanism for facilitating knowledge reuse. We describe the technique and an application built using them, and then critique its strengths and weaknesses. We conclude that this technique enables us to better explicate both the structure and modeling decisions made when constructing a formal axiom-rich ontology.
Submitted 8 May, 2020;
originally announced May 2020.
-
Deep Learning Framework for Detecting Ground Deformation in the Built Environment using Satellite InSAR data
Authors:
Nantheera Anantrasirichai,
Juliet Biggs,
Krisztina Kelevitz,
Zahra Sadeghi,
Tim Wright,
James Thompson,
Alin Achim,
David Bull
Abstract:
The large volumes of Sentinel-1 data produced over Europe are being used to develop pan-national ground motion services. However, simple analysis techniques like thresholding cannot reliably detect and classify complex deformation signals, which makes it challenging to provide usable information to a broad range of non-expert stakeholders. Here we explore the applicability of deep learning approaches by adapting a pre-trained convolutional neural network (CNN) to detect deformation in a national-scale velocity field. For our proof-of-concept, we focus on the UK where previously identified deformation is associated with coal-mining, ground water withdrawal, landslides and tunnelling. The sparsity of measurement points and the presence of spike noise make this a challenging application for deep learning networks, which involve calculations of the spatial convolution between images. Moreover, insufficient ground truth data exists to construct a balanced training data set, and the deformation signals are slower and more localised than in previous applications. We propose three enhancement methods to tackle these problems: i) spatial interpolation with modified matrix completion, ii) a synthetic training dataset based on the characteristics of real UK velocity maps, and iii) enhanced over-wrapping techniques. Using velocity maps spanning 2015-2019, our framework detects several areas of coal mining subsidence, uplift due to dewatering, slate quarries, landslides and tunnel engineering works. The results demonstrate the potential applicability of the proposed framework to the development of automated ground motion analysis systems.
Submitted 12 May, 2020; v1 submitted 6 May, 2020;
originally announced May 2020.
-
Location-Enabled IoT (LE-IoT): A Survey of Positioning Techniques, Error Sources, and Mitigation
Authors:
You Li,
Yuan Zhuang,
Xin Hu,
Zhouzheng Gao,
Jia Hu,
Long Chen,
Zhe He,
Ling Pei,
Kejie Chen,
Maosong Wang,
Xiaoji Niu,
Ruizhi Chen,
John Thompson,
Fadhel Ghannouchi,
Naser El-Sheimy
Abstract:
The Internet of Things (IoT) has started to empower the future of many industrial and mass-market applications. Localization techniques are becoming key to add location context to IoT data without human perception and intervention. Meanwhile, the newly-emerged Low-Power Wide-Area Network (LPWAN) technologies have advantages such as long-range, low power consumption, low cost, massive connections, and the capability for communication in both indoor and outdoor areas. These features make LPWAN signals strong candidates for mass-market localization applications. However, there are various error sources that have limited localization performance by using such IoT signals. This paper reviews the IoT localization system through the following sequence: IoT localization system review -- localization data sources -- localization algorithms -- localization error sources and mitigation -- localization performance evaluation. Compared to the related surveys, this paper has a more comprehensive and state-of-the-art review on IoT localization methods, an original review on IoT localization error sources and mitigation, an original review on IoT localization performance evaluation, and a more comprehensive review of IoT localization applications, opportunities, and challenges. Thus, this survey provides comprehensive guidance for peers who are interested in enabling localization ability in the existing IoT systems, using IoT systems for localization, or integrating IoT signals with the existing localization sensors.
Submitted 7 April, 2020;
originally announced April 2020.
-
The effect of task and training on intermediate representations in convolutional neural networks revealed with modified RV similarity analysis
Authors:
Jessica A. F. Thompson,
Yoshua Bengio,
Marc Schoenwiesner
Abstract:
Centered Kernel Alignment (CKA) was recently proposed as a similarity metric for comparing activation patterns in deep networks. Here we experiment with the modified RV-coefficient (RV2), which has very similar properties as CKA while being less sensitive to dataset size. We compare the representations of networks that received varying amounts of training on different layers: a standard trained network (all parameters updated at every step), a freeze trained network (layers gradually frozen during training), random networks (only some layers trained), and a completely untrained network. We found that RV2 was able to recover expected similarity patterns and provide interpretable similarity matrices that suggested hypotheses about how representations are affected by different training recipes. We propose that the superior performance achieved by freeze training can be attributed to representational differences in the penultimate layer. Our comparisons of random networks suggest that the inputs and targets serve as anchors on the representations in the lowest and highest layers.
Submitted 4 December, 2019;
originally announced December 2019.
-
Real-time monitoring of driver drowsiness on mobile platforms using 3D neural networks
Authors:
Jasper S. Wijnands,
Jason Thompson,
Kerry A. Nice,
Gideon D. P. A. Aschwanden,
Mark Stevenson
Abstract:
Driver drowsiness increases crash risk, leading to substantial road trauma each year. Drowsiness detection methods have received considerable attention, but few studies have investigated the implementation of a detection approach on a mobile phone. Phone applications reduce the need for specialised hardware and hence, enable a cost-effective roll-out of the technology across the driving population. While it has been shown that three-dimensional (3D) operations are more suitable for spatiotemporal feature learning, current methods for drowsiness detection commonly use frame-based, multi-step approaches. However, computationally expensive techniques that achieve superior results on action recognition benchmarks (e.g. 3D convolutions, optical flow extraction) create bottlenecks for real-time, safety-critical applications on mobile devices. Here, we show how depthwise separable 3D convolutions, combined with an early fusion of spatial and temporal information, can achieve a balance between high prediction accuracy and real-time inference requirements. In particular, increased accuracy is achieved when assessment requires motion information, for example, when sunglasses conceal the eyes. Further, a custom TensorFlow-based smartphone application shows the true impact of various approaches on inference times and demonstrates the effectiveness of real-time monitoring based on out-of-sample data to alert a drowsy driver. Our model is pre-trained on ImageNet and Kinetics and fine-tuned on a publicly available Driver Drowsiness Detection dataset. Fine-tuning on large naturalistic driving datasets could further improve accuracy to obtain robust in-vehicle performance. Overall, our research is a step towards practical deep learning applications, potentially preventing micro-sleeps and reducing road trauma.
Submitted 15 October, 2019;
originally announced October 2019.
-
The 'Paris-end' of town? Urban typology through machine learning
Authors:
Kerry A. Nice,
Jason Thompson,
Jasper S. Wijnands,
Gideon D. P. A. Aschwanden,
Mark Stevenson
Abstract:
The confluence of recent advances in availability of geospatial information, computing power, and artificial intelligence offers new opportunities to understand how and where our cities differ or are alike. Departing from a traditional 'top-down' analysis of urban design features, this project analyses millions of images of urban form (consisting of street view, satellite imagery, and street maps) to find shared characteristics. A (novel) neural network-based framework is trained with imagery from the largest 1692 cities in the world and the resulting models are used to compare within-city locations from Melbourne and Sydney to determine the closest connections between these areas and their international comparators. This work demonstrates a new, consistent, and objective method to begin to understand the relationship between cities and the health, transport, and environmental consequences of their design. The results show specific advantages and disadvantages using each type of imagery. Neural networks trained with map imagery will be highly influenced by the mix of roads, public transport, and green and blue space as well as the structure of these elements. The colours of natural and built features stand out as dominant characteristics in satellite imagery. The use of street view imagery will emphasise the features of a human scaled visual geography of streetscapes. Finally, and perhaps most importantly, this research also answers the age-old question, "Is there really a 'Paris-end' to your city?"
Submitted 8 October, 2019;
originally announced October 2019.
-
The Nature of Human Settlement: Building an understanding of high performance city design
Authors:
Kerry A. Nice,
Gideon D. P. A. Aschwanden,
Jasper S. Wijnands,
Jason Thompson,
Haifeng Zhao,
Mark Stevenson
Abstract:
In an impending urban age where the majority of the world's population will live in cities, it is critical that we improve our understanding of the strengths and limitations of existing city designs to ensure they are safe, clean, can deliver health co-benefits and, importantly, are sustainable into the future. To enable this, a systematic and efficient means of performing inter- and intra-city comparisons based on urban form is required. Until now, methods for comparing cities have been limited by scalability, often reliant upon non-standardised local input data that can be costly and difficult to obtain. To address this, we have developed a unique approach to determine the mix, distribution, and composition of neighbourhood types in cities based on dimensions of block size and regularity, sorted by a self-organising map. We illustrate the utility of the method to provide an understanding of the underlying city morphology by overlaying spatially standardised city metrics such as air pollution and transport activity across a set of 1667 global cities with populations exceeding 300,000. The approach reports associations between specific mixes of neighbourhood typologies and quantities of moving vehicles (r=0.97), impervious surfaces (r=0.86), and air pollution levels (aerosol optical depth r=0.58 and NO$_{2}$ r=0.57). This illustrates that the approach can identify the characteristics and neighbourhood mixes of well-performing urban areas while also producing unique 'city fingerprints' that can be used to provide new metrics, insights, and drive improvements in city design for the future.
Submitted 8 October, 2019;
originally announced October 2019.
-
Sky pixel detection in outdoor imagery using an adaptive algorithm and machine learning
Authors:
Kerry A. Nice,
Jasper S. Wijnands,
Ariane Middel,
Jingcheng Wang,
Yiming Qiu,
Nan Zhao,
Jason Thompson,
Gideon D. P. A. Aschwanden,
Haifeng Zhao,
Mark Stevenson
Abstract:
Computer vision techniques enable automated detection of sky pixels in outdoor imagery. In urban climate, sky detection is an important first step in gathering information about urban morphology and sky view factors. However, obtaining accurate results remains challenging and becomes even more complex using imagery captured under a variety of lighting and weather conditions.
To address this problem, we present a new sky pixel detection system demonstrated to produce accurate results using a wide range of outdoor imagery types. Images are processed using a selection of mean-shift segmentation, K-means clustering, and Sobel filters to mark sky pixels in the scene. The algorithm for a specific image is chosen by a convolutional neural network, trained with 25,000 images from the Skyfinder data set, reaching 82% accuracy for the top three classes. This selection step allows the sky marking to follow an adaptive process and to use different techniques and parameters to best suit a particular image. An evaluation of fourteen different techniques and parameter sets shows that no single technique can perform with high accuracy across varied Skyfinder and Google Street View data sets. However, by using our adaptive process, large increases in accuracy are observed. The resulting system is shown to perform better than other published techniques.
Submitted 9 December, 2019; v1 submitted 7 October, 2019;
originally announced October 2019.
-
Extreme dimensionality reduction with quantum modelling
Authors:
Thomas J. Elliott,
Chengran Yang,
Felix C. Binder,
Andrew J. P. Garner,
Jayne Thompson,
Mile Gu
Abstract:
Effective and efficient forecasting relies on identification of the relevant information contained in past observations -- the predictive features -- and isolating it from the rest. When the future of a process bears a strong dependence on its behaviour far into the past, there are many such features to store, necessitating complex models with extensive memories. Here, we highlight a family of stochastic processes whose minimal classical models must devote unboundedly many bits to tracking the past. For this family, we identify quantum models of equal accuracy that can store all relevant information within a single two-dimensional quantum system (qubit). This represents the ultimate limit of quantum compression and highlights an immense practical advantage of quantum technologies for the forecasting and simulation of complex systems.
Submitted 23 December, 2020; v1 submitted 6 September, 2019;
originally announced September 2019.
-
Critical Reflections on Visualization Authoring Systems
Authors:
Arvind Satyanarayan,
Bongshin Lee,
Donghao Ren,
Jeffrey Heer,
John Stasko,
John Thompson,
Matthew Brehmer,
Zhicheng Liu
Abstract:
An emerging generation of visualization authoring systems support expressive information visualization without textual programming. As they vary in their visualization models, system architectures, and user interfaces, it is challenging to directly compare these systems using traditional evaluative methods. Recognizing the value of contextualizing our decisions in the broader design space, we present critical reflections on three systems we developed -- Lyra, Data Illustrator, and Charticulator. This paper surfaces knowledge that would have been daunting within the constituent papers of these three systems. We compare and contrast their (previously unmentioned) limitations and trade-offs between expressivity and learnability. We also reflect on common assumptions that we made during the development of our systems, thereby informing future research directions in visualization authoring systems.
Submitted 31 July, 2019;
originally announced July 2019.
-
Beam Entropy of 5G Cellular Millimetre-Wave Channels
Authors:
Krishan Kumar Tiwari,
Eckhard Grass,
John S. Thompson,
Rolf Kraemer
Abstract:
In this paper, we obtain and study typical beam entropy values for millimetre wave (mm-wave) channel models using the NYUSIM simulator for frequencies up to 100 GHz for fifth generation (5G) and beyond 5G cellular communication systems. The beam entropy is used to quantify sparse MIMO channel randomness in beamspace. Lower relative beam entropy channels are suitable for memory-assisted statistically-ranked (MarS) and hybrid radio frequency (RF) beam training algorithms. High beam entropies can potentially be advantageous for low overhead secured radio communications by generating cryptographic keys based on channel randomness in beamspace, especially for sparse multiple input multiple output (MIMO) channels. Urban micro (UMi), urban macro (UMa) and rural macro (RMa) cellular scenarios have been investigated in this work for 28, 60, 73 and 100 GHz.
Submitted 15 July, 2019; v1 submitted 17 June, 2019;
originally announced June 2019.
-
Memory-assisted Statistically-ranked RF Beam Training Algorithm for Sparse MIMO
Authors:
Krishan K. Tiwari,
Eckhard Grass,
John S. Thompson,
Rolf Kraemer
Abstract:
This paper presents a novel radio frequency (RF) beam training algorithm for sparse multiple input multiple output (MIMO) channels using unitary RF beamforming codebooks at transmitter (Tx) and receiver (Rx). The algorithm leverages statistical knowledge from past beam data for expedited beam search with statistically-minimal training overheads. Beams are tested in the order of their ranks based on their probabilities for providing a communication link. For low beam entropy scenarios, statistically-ranked beam search performs excellently in reducing the average number of beam tests per Tx-Rx beam pair identification for a communication link. For high beam entropy cases, a hybrid algorithm involving both memory-assisted statistically-ranked (MarS) beam search and multi-level (ML) beam search is also proposed. Savings in training overheads increase with a decrease in beam entropy and an increase in MIMO channel dimensions.
Submitted 18 June, 2019; v1 submitted 4 June, 2019;
originally announced June 2019.
-
Streetscape augmentation using generative adversarial networks: insights related to health and wellbeing
Authors:
Jasper S. Wijnands,
Kerry A. Nice,
Jason Thompson,
Haifeng Zhao,
Mark Stevenson
Abstract:
Deep learning using neural networks has provided advances in image style transfer, merging the content of one image (e.g., a photo) with the style of another (e.g., a painting). Our research shows this concept can be extended to analyse the design of streetscapes in relation to health and wellbeing outcomes. An Australian population health survey (n=34,000) was used to identify the spatial distribution of health and wellbeing outcomes, including general health and social capital. For each outcome, the most and least desirable locations formed two domains. Streetscape design was sampled using around 80,000 Google Street View images per domain. Generative adversarial networks translated these images from one domain to the other, preserving the main structure of the input image, but transforming the 'style' from locations where self-reported health was bad to locations where it was good. These translations indicate that areas in Melbourne with good general health are characterised by sufficient green space and compactness of the urban environment, whilst streetscape imagery related to high social capital contained more and wider footpaths, fewer fences and more grass. Beyond identifying relationships, the method is a first step towards computer-generated design interventions that have the potential to improve population health and wellbeing.
Submitted 13 May, 2019;
originally announced May 2019.
-
Relevant Word Order Vectorization for Improved Natural Language Processing in Electronic Healthcare Records
Authors:
Jeffrey Thompson,
Jinxiang Hu,
Dinesh Pal Mudaranthakam,
David Streeter,
Lisa Neums,
Michele Park,
Devin C. Koestler,
Byron Gajewski,
Matthew S. Mayo
Abstract:
Objective: Electronic health records (EHR) represent a rich resource for conducting observational studies, supporting clinical trials, and more. However, much of the relevant information is stored in an unstructured format that makes it difficult to use. Natural language processing approaches that attempt to automatically classify the data depend on vectorization algorithms that impose structure on the text, but these algorithms were not designed for the unique characteristics of EHR. Here, we propose a new algorithm for structuring so-called free-text that may help researchers make better use of EHR. We call this method Relevant Word Order Vectorization (RWOV).
Materials and Methods: As a proof-of-concept, we attempted to classify the hormone receptor status of breast cancer patients treated at the University of Kansas Medical Center during a recent year, from the unstructured text of pathology reports. Our approach attempts to account for the semi-structured way that healthcare providers often enter information. We compared this approach to the ngrams and word2vec methods.
Results: Our approach resulted in the most consistently high accuracy, as measured by F1 score and area under the receiver operating characteristic curve (AUC).
Discussion: Our results suggest that methods of structuring free text that take into account its context may show better performance, and that our approach is promising.
Conclusion: By using a method that accounts for the fact that healthcare providers tend to use certain key words repetitively and that the order of these key words is important, we showed improved performance over methods that do not.
Submitted 6 December, 2018;
originally announced December 2018.
-
How can deep learning advance computational modeling of sensory information processing?
Authors:
Jessica A. F. Thompson,
Yoshua Bengio,
Elia Formisano,
Marc Schönwiesner
Abstract:
Deep learning, computational neuroscience, and cognitive science have overlapping goals related to understanding intelligence such that perception and behaviour can be simulated in computational systems. In neuroimaging, machine learning methods have been used to test computational models of sensory information processing. Recently, these model comparison techniques have been used to evaluate deep neural networks (DNNs) as models of sensory information processing. However, the interpretation of such model evaluations is muddied by imprecise statistical conclusions. Here, we make explicit the types of conclusions that can be drawn from these existing model comparison techniques and how these conclusions change when the model in question is a DNN. We discuss how DNNs are amenable to new model comparison techniques that allow for stronger conclusions to be made about the computational mechanisms underlying sensory information processing.
Submitted 25 September, 2018;
originally announced October 2018.