-
Unraveling the Geography of Infection Spread: Harnessing Super-Agents for Predictive Modeling
Authors:
Amir Mohammad Esmaieeli Sikaroudi,
Alon Efrat,
Michael Chertkov
Abstract:
Our study presents an intermediate-level modeling approach that bridges the gap between complex Agent-Based Models (ABMs) and traditional compartmental models for infectious diseases. We introduce "super-agents" to simulate infection spread in cities, reducing computational complexity while retaining individual-level interactions. This approach leverages real-world mobility data and strategic geospatial tessellations for efficiency. Voronoi Diagram tessellations, based on specific street network locations, outperform standard Census Block Group tessellations, and a hybrid approach balances accuracy and efficiency. Benchmarking against existing ABMs highlights key optimizations. This research improves disease modeling in urban areas, aiding public health strategies in scenarios requiring geographic specificity and high computational efficiency.
Submitted 9 March, 2024; v1 submitted 13 September, 2023;
originally announced September 2023.
-
ZeroSCROLLS: A Zero-Shot Benchmark for Long Text Understanding
Authors:
Uri Shaham,
Maor Ivgi,
Avia Efrat,
Jonathan Berant,
Omer Levy
Abstract:
We introduce ZeroSCROLLS, a zero-shot benchmark for natural language understanding over long texts, which contains only test and small validation sets, without training data. We adapt six tasks from the SCROLLS benchmark, and add four new datasets, including two novel information fusing tasks, such as aggregating the percentage of positive reviews. Using ZeroSCROLLS, we conduct a comprehensive evaluation of both open-source and closed large language models, finding that Claude outperforms ChatGPT, and that GPT-4 achieves the highest average score. However, there is still room for improvement on multiple open challenges in ZeroSCROLLS, such as aggregation tasks, where models struggle to pass the naive baseline. As the state of the art is a moving target, we invite researchers to evaluate their ideas on the live ZeroSCROLLS leaderboard.
Submitted 17 December, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
LIMA: Less Is More for Alignment
Authors:
Chunting Zhou,
Pengfei Liu,
Puxin Xu,
Srini Iyer,
Jiao Sun,
Yuning Mao,
Xuezhe Ma,
Avia Efrat,
Ping Yu,
Lili Yu,
Susan Zhang,
Gargi Ghosh,
Mike Lewis,
Luke Zettlemoyer,
Omer Levy
Abstract:
Large language models are trained in two stages: (1) unsupervised pretraining from raw text, to learn general-purpose representations, and (2) large scale instruction tuning and reinforcement learning, to better align to end tasks and user preferences. We measure the relative importance of these two stages by training LIMA, a 65B parameter LLaMa language model fine-tuned with the standard supervised loss on only 1,000 carefully curated prompts and responses, without any reinforcement learning or human preference modeling. LIMA demonstrates remarkably strong performance, learning to follow specific response formats from only a handful of examples in the training data, including complex queries that range from planning trip itineraries to speculating about alternate history. Moreover, the model tends to generalize well to unseen tasks that did not appear in the training data. In a controlled human study, responses from LIMA are either equivalent or strictly preferred to GPT-4 in 43% of cases; this statistic is as high as 58% when compared to Bard and 65% versus DaVinci003, which was trained with human feedback. Taken together, these results strongly suggest that almost all knowledge in large language models is learned during pretraining, and only limited instruction tuning data is necessary to teach models to produce high quality output.
Submitted 18 May, 2023;
originally announced May 2023.
-
LMentry: A Language Model Benchmark of Elementary Language Tasks
Authors:
Avia Efrat,
Or Honovich,
Omer Levy
Abstract:
As the performance of large language models rapidly improves, benchmarks are getting larger and more complex as well. We present LMentry, a benchmark that avoids this "arms race" by focusing on a compact set of tasks that are trivial to humans, e.g. writing a sentence containing a specific word, identifying which words in a list belong to a specific category, or choosing which of two words is longer. LMentry is specifically designed to provide quick and interpretable insights into the capabilities and robustness of large language models. Our experiments reveal a wide variety of failure cases that, while immediately obvious to humans, pose a considerable challenge for large language models, including OpenAI's latest 175B-parameter instruction-tuned model, TextDavinci002. LMentry complements contemporary evaluation approaches of large language models, providing a quick, automatic, and easy-to-run "unit test", without resorting to large benchmark suites of complex tasks.
Submitted 19 December, 2022; v1 submitted 3 November, 2022;
originally announced November 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza,
et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
SCROLLS: Standardized CompaRison Over Long Language Sequences
Authors:
Uri Shaham,
Elad Segal,
Maor Ivgi,
Avia Efrat,
Ori Yoran,
Adi Haviv,
Ankit Gupta,
Wenhan Xiong,
Mor Geva,
Jonathan Berant,
Omer Levy
Abstract:
NLP benchmarks have largely focused on short texts, such as sentences and paragraphs, even though long texts comprise a considerable amount of natural language in the wild. We introduce SCROLLS, a suite of tasks that require reasoning over long texts. We examine existing long-text datasets, and handpick ones where the text is naturally long, while prioritizing tasks that involve synthesizing information across the input. SCROLLS contains summarization, question answering, and natural language inference tasks, covering multiple domains, including literature, science, business, and entertainment. Initial baselines, including Longformer Encoder-Decoder, indicate that there is ample room for improvement on SCROLLS. We make all datasets available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.
Submitted 11 October, 2022; v1 submitted 10 January, 2022;
originally announced January 2022.
-
Prediction and Prevention of Pandemics via Graphical Model Inference and Convex Programming
Authors:
Mikhail Krechetov,
Amir Mohammad Esmaieeli Sikaroudi,
Alon Efrat,
Valentin Polishchuk,
Michael Chertkov
Abstract:
Hard-to-predict bursts of the COVID-19 pandemic revealed the significance of statistical modeling that resolves spatio-temporal correlations over geographical areas, for example the spread of the infection over a city with census tract granularity. In this manuscript, we provide algorithmic answers to the following two inter-related public health challenges. (1) Inference Challenge: assuming that there are $N$ census blocks (nodes) in the city, and given an initial infection at any set of nodes, what is the probability for a subset of census blocks to become infected by the time the spread of the infection burst is stabilized? (2) Prevention Challenge: What is the minimal control action one can take to minimize the infected part of the stabilized state footprint? To answer the challenges, we build a Graphical Model of the pandemic of the attractive Ising (pair-wise, binary) type, where each node represents a census tract and each edge factor represents the strength of the pairwise interaction between a pair of nodes. We show that almost all attractive Ising Models on dense graphs result in either of the two modes for the most probable state: either all nodes which were not infected initially become infected, or all the initially uninfected nodes remain uninfected. This bi-modal solution of the Inference Challenge allows us to re-state the Prevention Challenge as the following tractable convex program: for the bare Ising Model with pair-wise and bias factors representing the system without prevention measures, such that the MAP state is fully infected for at least one of the initial infection patterns, find the closest, in $l_1$ norm, set of factors resulting in all the MAP states of the Ising model, with the optimal prevention measures applied, to become safe.
Submitted 26 April, 2022; v1 submitted 9 September, 2021;
originally announced September 2021.
-
How Optimal is Greedy Decoding for Extractive Question Answering?
Authors:
Or Castel,
Ori Ram,
Avia Efrat,
Omer Levy
Abstract:
Fine-tuned language models use greedy decoding to answer reading comprehension questions with relative success. However, this approach does not ensure that the answer is a span in the given passage, nor does it guarantee that it is the most probable one. Does greedy decoding actually perform worse than an algorithm that does adhere to these properties? To study the performance and optimality of greedy decoding, we present exact-extract, a decoding algorithm that efficiently finds the most probable answer span in the context. We compare the performance of T5 with both decoding algorithms on zero-shot and few-shot extractive question answering. When no training examples are available, exact-extract significantly outperforms greedy decoding. However, greedy decoding quickly converges towards the performance of exact-extract with the introduction of a few training examples, becoming more extractive and increasingly likelier to generate the most probable span as the training set grows. We also show that self-supervised training can bias the model towards extractive behavior, increasing performance in the zero-shot setting without resorting to annotated examples. Overall, our results suggest that pretrained language models are so good at adapting to extractive question answering, that it is often enough to fine-tune on a small training set for the greedy algorithm to emulate the optimal decoding strategy.
Submitted 8 November, 2022; v1 submitted 12 August, 2021;
originally announced August 2021.
-
Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language
Authors:
Avia Efrat,
Uri Shaham,
Dan Kilman,
Omer Levy
Abstract:
Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease. We present Cryptonite, a large-scale dataset based on cryptic crosswords, which is both linguistically complex and naturally sourced. Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires disambiguating semantic, syntactic, and phonetic wordplays, as well as world knowledge. Cryptic clues pose a challenge even for experienced solvers, though top-tier experts can solve them with almost 100% accuracy. Cryptonite is a challenging task for current models; fine-tuning T5-Large on 470k cryptic clues achieves only 7.6% accuracy, on par with the accuracy of a rule-based clue solver (8.6%).
Submitted 1 November, 2021; v1 submitted 1 March, 2021;
originally announced March 2021.
-
The Turking Test: Can Language Models Understand Instructions?
Authors:
Avia Efrat,
Omer Levy
Abstract:
Supervised machine learning provides the learner with a set of input-output examples of the target task. Humans, however, can also learn to perform new tasks from instructions in natural language. Can machines learn to understand instructions as well? We present the Turking Test, which examines a model's ability to follow natural language instructions of varying complexity. These range from simple tasks, like retrieving the nth word of a sentence, to ones that require creativity, such as generating examples for SNLI and SQuAD in place of human intelligence workers ("turkers"). Despite our lenient evaluation methodology, we observe that a large pretrained language model performs poorly across all tasks. Analyzing the model's error patterns reveals that the model tends to ignore explicit instructions and often generates outputs that cannot be construed as an attempt to solve the task. While it is not yet clear whether instruction understanding can be captured by traditional language models, the sheer expressivity of instruction understanding makes it an appealing alternative to the rising few-shot inference paradigm.
Submitted 22 October, 2020;
originally announced October 2020.
-
Polygons with Prescribed Angles in 2D and 3D
Authors:
Alon Efrat,
Radoslav Fulek,
Stephen Kobourov,
Csaba D. Tóth
Abstract:
We consider the construction of a polygon $P$ with $n$ vertices whose turning angles at the vertices are given by a sequence $A=(α_0,\ldots, α_{n-1})$, $α_i\in (-π,π)$, for $i\in\{0,\ldots, n-1\}$. The problem of realizing $A$ by a polygon can be seen as that of constructing a straight-line drawing of a graph with prescribed angles at vertices, and hence, it is a special case of the well studied problem of constructing an \emph{angle graph}.
In 2D, we characterize sequences $A$ for which every generic polygon $P\subset \mathbb{R}^2$ realizing $A$ has at least $c$ crossings, for every $c\in \mathbb{N}$, and describe an efficient algorithm that constructs, for a given sequence $A$, a generic polygon $P\subset \mathbb{R}^2$ that realizes $A$ with the minimum number of crossings.
In 3D, we describe an efficient algorithm that tests whether a given sequence $A$ can be realized by a (not necessarily generic) polygon $P\subset \mathbb{R}^3$, and for every realizable sequence the algorithm finds a realization.
Submitted 1 November, 2020; v1 submitted 24 August, 2020;
originally announced August 2020.
-
Data Inference from Encrypted Databases: A Multi-dimensional Order-Preserving Matching Approach
Authors:
Yanjun Pan,
Alon Efrat,
Ming Li,
Boyang Wang,
Hanyu Quan,
Joseph Mitchell,
Jie Gao,
Esther Arkin
Abstract:
Due to increasing concerns of data privacy, databases are being encrypted before they are stored on an untrusted server. To enable search operations on the encrypted data, searchable encryption techniques have been proposed. Representative schemes use order-preserving encryption (OPE) for supporting efficient Boolean queries on encrypted databases. Yet, recent works showed the possibility of inferring plaintext data from OPE-encrypted databases, merely using the order-preserving constraints, or combined with an auxiliary plaintext dataset with similar frequency distribution. So far, the effectiveness of such attacks is limited to single-dimensional dense data (most values from the domain are encrypted), but it remains challenging to achieve it on high-dimensional datasets (e.g., spatial data) which are often sparse in nature. In this paper, for the first time, we study data inference attacks on multi-dimensional encrypted databases (with 2-D as a special case). We formulate it as a 2-D order-preserving matching problem and explore both unweighted and weighted cases, where the former maximizes the number of points matched using only order information and the latter further considers points with similar frequencies. We prove that the problem is NP-hard, and then propose a greedy algorithm, along with a polynomial-time algorithm with approximation guarantees. Experimental results on synthetic and real-world datasets show that the data recovery rate is significantly enhanced compared with the previous 1-D matching algorithm.
Submitted 23 January, 2020;
originally announced January 2020.
-
A Simple and Effective Model for Answering Multi-span Questions
Authors:
Elad Segal,
Avia Efrat,
Mor Shoham,
Amir Globerson,
Jonathan Berant
Abstract:
Models for reading comprehension (RC) commonly restrict their output space to the set of all single contiguous spans from the input, in order to alleviate the learning problem and avoid the need for a model that generates text explicitly. However, forcing an answer to be a single span can be restrictive, and some recent datasets also include multi-span questions, i.e., questions whose answer is a set of non-contiguous spans in the text. Naturally, models that return single spans cannot answer these questions. In this work, we propose a simple architecture for answering multi-span questions by casting the task as a sequence tagging problem, namely, predicting for each input token whether it should be part of the output or not. Our model substantially improves performance on span extraction questions from DROP and Quoref by 9.9 and 5.5 EM points respectively.
Submitted 5 October, 2020; v1 submitted 29 September, 2019;
originally announced September 2019.
-
Euclidean TSP, Motorcycle Graphs, and Other New Applications of Nearest-Neighbor Chains
Authors:
Nil Mamano,
Alon Efrat,
David Eppstein,
Daniel Frishberg,
Michael Goodrich,
Stephen Kobourov,
Pedro Matias,
Valentin Polishchuk
Abstract:
We show new applications of the nearest-neighbor chain algorithm, a technique that originated in agglomerative hierarchical clustering. We apply it to a diverse class of geometric problems: we construct the greedy multi-fragment tour for Euclidean TSP in $O(n\log n)$ time in any fixed dimension and for Steiner TSP in planar graphs in $O(n\sqrt{n}\log n)$ time; we compute motorcycle graphs (which are a central part in straight skeleton algorithms) in $O(n^{4/3+\varepsilon})$ time for any $\varepsilon>0$; we introduce a narcissistic variant of the $k$-attribute stable matching model, and solve it in $O(n^{2-4/(k(1+\varepsilon)+2)})$ time; we give a linear-time $2$-approximation for a 1D geometric set cover problem with applications to radio station placement.
Submitted 2 December, 2019; v1 submitted 18 February, 2019;
originally announced February 2019.
-
Approximation algorithms for the vertex-weighted grade-of-service Steiner tree problem
Authors:
Faryad Darabi Sahneh,
Alon Efrat,
Stephen Kobourov,
Spencer Krieger,
Richard Spence
Abstract:
Given a graph $G = (V,E)$ and a subset $T \subseteq V$ of terminals, a \emph{Steiner tree} of $G$ is a tree that spans $T$. In the vertex-weighted Steiner tree (VST) problem, each vertex is assigned a non-negative weight, and the goal is to compute a minimum weight Steiner tree of $G$.
We study a natural generalization of the VST problem motivated by multi-level graph construction, the \emph{vertex-weighted grade-of-service Steiner tree problem} (V-GSST), which can be stated as follows: given a graph $G$ and terminals $T$, where each terminal $v \in T$ requires a facility of a minimum grade of service $R(v)\in \{1,2,\ldots,\ell\}$, compute a Steiner tree $G'$ by installing facilities on a subset of vertices, such that any two vertices requiring a certain grade of service are connected by a path in $G'$ with the minimum grade of service or better. Facilities of higher grade are more costly than facilities of lower grade. Multi-level variants such as this one can be useful in network design problems where vertices may require facilities of varying priority.
While similar problems have been studied in the edge-weighted case, they have not been studied as well in the more general vertex-weighted case. We first describe a simple heuristic for the V-GSST problem whose approximation ratio depends on $\ell$, the number of grades of service. We then generalize the greedy algorithm of [Klein \& Ravi, 1995] to show that the V-GSST problem admits a $(2 \ln |T|)$-approximation, where $T$ is the set of terminals requiring some facility. This result is surprising, as it shows that the (seemingly harder) multi-grade problem can be approximated as well as the VST problem, and that the approximation ratio does not depend on the number of grades of service.
Submitted 3 May, 2019; v1 submitted 28 November, 2018;
originally announced November 2018.
-
Multi-Level Steiner Trees
Authors:
Reyan Ahmed,
Patrizio Angelini,
Faryad Darabi Sahneh,
Alon Efrat,
David Glickenstein,
Martin Gronemann,
Niklas Heinsohn,
Stephen G. Kobourov,
Richard Spence,
Joseph Watkins,
Alexander Wolff
Abstract:
In the classical Steiner tree problem, given an undirected, connected graph $G=(V,E)$ with non-negative edge costs and a set of \emph{terminals} $T\subseteq V$, the objective is to find a minimum-cost tree $E' \subseteq E$ that spans the terminals. The problem is APX-hard; the best known approximation algorithm has a ratio of $ρ= \ln(4)+\varepsilon < 1.39$. In this paper, we study a natural generalization, the \emph{multi-level Steiner tree} (MLST) problem: given a nested sequence of terminals $T_{\ell} \subset \dots \subset T_1 \subseteq V$, compute nested trees $E_{\ell}\subseteq \dots \subseteq E_1\subseteq E$ that span the corresponding terminal sets with minimum total cost. The MLST problem and variants thereof have been studied under various names including Multi-level Network Design, Quality-of-Service Multicast tree, Grade-of-Service Steiner tree, and Multi-Tier tree. Several approximation results are known. We first present two simple $O(\ell)$-approximation heuristics. Based on these, we introduce a rudimentary composite algorithm that generalizes the above heuristics, and determine its approximation ratio by solving a linear program. We then present a method that guarantees the same approximation ratio using at most $2\ell$ Steiner tree computations. We compare these heuristics experimentally on various instances of up to 500 vertices using three different network generation models. We also present various integer linear programming (ILP) formulations for the MLST problem, and compare their running times on these instances. To our knowledge, the composite algorithm achieves the best approximation ratio for up to $\ell=100$ levels, which is sufficient for most applications such as network visualization or designing multi-level infrastructure.
Submitted 26 November, 2018; v1 submitted 8 April, 2018;
originally announced April 2018.
-
L-Graphs and Monotone L-Graphs
Authors:
Abu Reyan Ahmed,
Felice De Luca,
Sabin Devkota,
Alon Efrat,
Md Iqbal Hossain,
Stephen Kobourov,
Jixian Li,
Sammi Abida Salma,
Eric Welch
Abstract:
In an $\mathsf{L}$-embedding of a graph, each vertex is represented by an $\mathsf{L}$-segment, and two segments intersect each other if and only if the corresponding vertices are adjacent in the graph. If the corner of each $\mathsf{L}$-segment in an $\mathsf{L}$-embedding lies on a straight line, we call it a monotone $\mathsf{L}$-embedding. In this paper we give a full characterization of monotone $\mathsf{L}$-embeddings by introducing a new class of graphs which we call "non-jumping" graphs. We show that a graph admits a monotone $\mathsf{L}$-embedding if and only if the graph is a non-jumping graph. Further, we show that outerplanar graphs, convex bipartite graphs, interval graphs, 3-leaf power graphs, and complete graphs are subclasses of non-jumping graphs. Finally, we show that distance-hereditary graphs and $k$-leaf power graphs ($k\le 4$) admit $\mathsf{L}$-embeddings.
Submitted 4 March, 2017;
originally announced March 2017.
-
Improved Approximation Algorithms for Relay Placement
Authors:
Alon Efrat,
Sándor P. Fekete,
Joseph S. B. Mitchell,
Valentin Polishchuk,
Jukka Suomela
Abstract:
In the relay placement problem the input is a set of sensors and a number $r \ge 1$, the communication range of a relay. In the one-tier version of the problem the objective is to place a minimum number of relays so that between every pair of sensors there is a path through sensors and/or relays such that the consecutive vertices of the path are within distance $r$ if both vertices are relays and within distance 1 otherwise. The two-tier version adds the restriction that the path must go through relays, and not through sensors. We present a 3.11-approximation algorithm for the one-tier version and a PTAS for the two-tier version. We also show that the one-tier version admits no PTAS, assuming P $\ne$ NP.
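The one-tier connectivity condition above can be illustrated with a small feasibility check. This is only a sketch of the problem definition, not of the approximation algorithms in the paper; the function name `one_tier_connected` and the representation of points as coordinate tuples are assumptions made for illustration.

```python
import math

def one_tier_connected(sensors, relays, r):
    """True if every pair of sensors is joined by a path in which consecutive
    vertices are within distance r when both are relays, and within
    distance 1 otherwise. Points are (x, y) tuples."""
    if not sensors:
        return True
    # Tag each point with whether it is a relay.
    nodes = [(p, False) for p in sensors] + [(p, True) for p in relays]

    def linked(a, b):
        (pa, a_is_relay), (pb, b_is_relay) = a, b
        limit = r if (a_is_relay and b_is_relay) else 1.0
        return math.dist(pa, pb) <= limit

    # Depth-first search from the first sensor.
    seen = {0}
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(len(nodes)):
            if j not in seen and linked(nodes[i], nodes[j]):
                seen.add(j)
                stack.append(j)
    # Feasible if all sensors were reached.
    return all(i in seen for i in range(len(sensors)))
```

With sensors at (0, 0) and (4, 0) and $r = 2$, relays at (1, 0) and (3, 0) connect the pair, while the same sensors with no relays do not.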
Submitted 8 November, 2015;
originally announced November 2015.
-
A Mobile Food Recommendation System Based on The Traffic Light Diet
Authors:
Thienne Johnson,
Jorge Vergara,
Chelsea Doll,
Madison Kramer,
Gayathri Sundararaman,
Harsha Rajendran,
Alon Efrat,
Melanie Hingle
Abstract:
Innovative, real-time solutions are needed to address the mismatch between the demand for and supply of critical information to inform and motivate diet and health-related behavior change. Research suggests that interventions using mobile health technologies hold great promise for influencing knowledge, attitudes, and behaviors related to energy balance. The objective of this paper is to present insights related to the development and testing of a mobile food recommendation system targeting fast food restaurants. The system is designed to provide consumers with information about energy density of food options combined with tips for healthier choices when dining out, accessible through a mobile phone.
Submitted 1 September, 2014;
originally announced September 2014.
-
On Channel-Discontinuity-Constraint Routing in Wireless Networks
Authors:
Swaminathan Sankararaman,
Alon Efrat,
Srinivasan Ramasubramanian,
Pankaj K. Agarwal
Abstract:
Multi-channel wireless networks are increasingly being employed as infrastructure networks, e.g. in metro areas. Nodes in these networks frequently employ directional antennas to improve spatial throughput. In such networks, given a source and destination, it is of interest to compute an optimal path and a channel assignment on every link of the path such that the path bandwidth equals the link bandwidth and no two consecutive links on the path are assigned the same channel, a requirement referred to as the "Channel Discontinuity Constraint" (CDC). CDC-paths are also quite useful for TDMA systems, where consecutive links along a path are preferably assigned different time slots.
This paper contains several contributions. We first present an $O(N^{2})$ distributed algorithm for discovering the shortest CDC-path between a given source and destination. This improves the running time of the $O(N^{3})$ centralized algorithm of Ahuja et al. for finding the minimum-weight CDC-path. Our second result is a generalized $t$-spanner for CDC-paths: for any $θ>0$ we show how to construct a sub-network containing only $O(\frac{N}{θ})$ edges, such that the length of shortest CDC-paths between arbitrary sources and destinations increases by a factor of at most $(1-2\sin{\tfrac{θ}{2}})^{-2}$. We propose a novel algorithm to compute the spanner in a distributed manner using only $O(n\log{n})$ messages. An important consequence of this scheme concerns the case where directional antennas are used: it is then enough to consider only the two closest nodes in each cone.
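The Channel Discontinuity Constraint can be illustrated with a centralized sketch: run Dijkstra's algorithm over states (node, channel of the last link used), skipping any link whose channel repeats the previous one. This is not the distributed $O(N^{2})$ algorithm of the paper nor the algorithm of Ahuja et al.; it is a swapped-in textbook technique. It assumes non-negative weights and fixed per-link channels, and it finds a shortest walk, which may revisit nodes. The name `shortest_cdc_path` and the link tuple layout are hypothetical.

```python
import heapq

def shortest_cdc_path(links, source, target):
    """links: iterable of directed (u, v, channel, weight) tuples.
    Returns the minimum total weight of a source-to-target walk in which no
    two consecutive links share a channel, or None if there is none."""
    out = {}
    for u, v, ch, w in links:
        out.setdefault(u, []).append((v, ch, w))

    # State = (node, channel of the arriving link); None at the source.
    best = {(source, None): 0.0}
    heap = [(0.0, 0, source, None)]
    counter = 1  # unique tie-breaker so the heap never compares states
    while heap:
        dist, _, node, last = heapq.heappop(heap)
        if dist > best.get((node, last), float("inf")):
            continue  # stale heap entry
        if node == target:
            return dist
        for nxt, ch, w in out.get(node, []):
            if ch == last:
                continue  # would violate the CDC
            cand = dist + w
            if cand < best.get((nxt, ch), float("inf")):
                best[(nxt, ch)] = cand
                heapq.heappush(heap, (cand, counter, nxt, ch))
                counter += 1
    return None
```

Tracking the arriving channel in the state is what lets a node be reached twice with different channels, which a plain shortest-path search over nodes alone would miss.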
Submitted 26 February, 2010; v1 submitted 21 December, 2009;
originally announced December 2009.
-
Scheduling Sensors for Guaranteed Sparse Coverage
Authors:
Swaminathan Sankararaman,
Alon Efrat,
Srinivasan Ramasubramanian,
Javad Taheri
Abstract:
Sensor networks are particularly applicable to the tracking of objects in motion. For such applications, it may not be necessary that the whole region be covered by sensors as long as the uncovered region is not too large. This notion has been formalized by Balasubramanian et al. as the problem of $κ$-weak coverage. This model of coverage provides guarantees about the regions in which the objects may move undetected. In this paper, we analyse the theoretical aspects of the problem and provide guarantees about the lifetime achievable. We introduce a number of practical algorithms and analyse their significance. The main contribution is a novel linear programming based algorithm which provides near-optimal lifetime. Through extensive experimentation, we analyse the performance of these algorithms based on several parameters defined.
Submitted 26 February, 2010; v1 submitted 23 November, 2009;
originally announced November 2009.
-
Restricted Strip Covering and the Sensor Cover Problem
Authors:
Adam L. Buchsbaum,
Alon Efrat,
Shaili Jain,
Suresh Venkatasubramanian,
Ke Yi
Abstract:
Given a set of objects with durations (jobs) that cover a base region, can we schedule the jobs to maximize the duration the original region remains covered? We call this problem the sensor cover problem. This problem arises in the context of covering a region with sensors. For example, suppose you wish to monitor activity along a fence by sensors placed at various fixed locations. Each sensor has a range and limited battery life. The problem is to schedule when to turn on the sensors so that the fence is fully monitored for as long as possible. This one-dimensional problem involves intervals on the real line. Associating a duration to each yields a set of rectangles in space and time, each specified by a pair of fixed horizontal endpoints and a height. The objective is to assign a position to each rectangle to maximize the height at which the spanning interval is fully covered. We call this one-dimensional problem restricted strip covering. If we replace the covering constraint by a packing constraint, the problem is identical to dynamic storage allocation, a scheduling problem that is a restricted case of the strip packing problem. We show that the restricted strip covering problem is NP-hard and present an O(log log n)-approximation algorithm. We present better approximations or exact algorithms for some special cases. For the uniform-duration case of restricted strip covering we give a polynomial-time, exact algorithm but prove that the uniform-duration case for higher-dimensional regions is NP-hard. Finally, we consider regions that are arbitrary sets, and we present an O(log n)-approximation algorithm.
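The restricted strip covering objective can be made concrete with a sketch that scores a given schedule: for fixed start times, it returns how long the whole fence stays monitored from time 0 on. It evaluates a schedule only and does not implement the approximation algorithms of the paper. The name `covered_duration`, the closed-interval convention, and the input layout are assumptions for illustration, and the fence is assumed to have positive length.

```python
def covered_duration(fence, sensors, starts):
    """fence:   (a, b) with a < b
    sensors: list of (left, right, duration), covering [left, right]
    starts:  start time chosen for each sensor
    Returns the largest T such that every point of the fence is monitored
    throughout [0, T)."""
    a, b = fence
    # Sensor endpoints cut the fence into elementary pieces.
    cuts = sorted({a, b} | {x for l, r, _ in sensors for x in (l, r) if a < x < b})
    best = float("inf")
    for lo, hi in zip(cuts, cuts[1:]):
        # Active time windows of the sensors that span this piece.
        windows = sorted((s, s + d)
                         for (l, r, d), s in zip(sensors, starts)
                         if l <= lo and hi <= r)
        t = 0.0
        for begin, end in windows:
            if begin > t:
                break  # gap in time: this piece goes unmonitored at t
            t = max(t, end)
        best = min(best, t)
    return best
```

For a fence [0, 10] with sensors (0, 6, 3), (4, 10, 3) and (0, 10, 2), starting the first two at time 0 and the third at time 3 keeps the fence covered for 5 time units, whereas starting all three at time 0 yields only 3.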
Submitted 23 May, 2006;
originally announced May 2006.
-
On Simultaneous Graph Embedding
Authors:
C. A. Duncan,
A. Efrat,
C. Erten,
S. Kobourov,
J. S. B. Mitchell
Abstract:
We consider the problem of simultaneous embedding of planar graphs. There are two variants of this problem, one in which the mapping between the vertices of the two graphs is given and another where the mapping is not given. In particular, we show that without mapping, any number of outerplanar graphs can be embedded simultaneously on an $O(n)\times O(n)$ grid, and an outerplanar and general planar graph can be embedded simultaneously on an $O(n^2)\times O(n^3)$ grid. If the mapping is given, we show how to embed two paths on an $n \times n$ grid, a caterpillar and a path on an $n \times 2n$ grid, or two caterpillar graphs on an $O(n^2)\times O(n^3)$ grid. We also show that 5 paths, or 3 caterpillars, or two general planar graphs cannot be simultaneously embedded given the mapping.
Submitted 9 September, 2002; v1 submitted 11 June, 2002;
originally announced June 2002.
-
Computing Homotopic Shortest Paths Efficiently
Authors:
Alon Efrat,
Stephen G. Kobourov,
Anna Lubiw
Abstract:
This paper addresses the problem of finding shortest paths homotopic to a given disjoint set of paths that wind amongst point obstacles in the plane. We present a faster algorithm than previously known.
Submitted 25 April, 2002;
originally announced April 2002.
-
Pattern Matching for sets of segments
Authors:
Alon Efrat,
Piotr Indyk,
Suresh Venkatasubramanian
Abstract:
In this paper we present algorithms for a number of problems in geometric pattern matching where the input consists of a collection of segments in the plane. Our work consists of two main parts. In the first, we address problems and measures that relate to collections of orthogonal line segments in the plane. Such collections arise naturally from problems in mapping buildings and robot exploration.
We propose a new measure of segment similarity called a \emph{coverage measure}, and present efficient algorithms for maximising this measure between sets of axis-parallel segments under translations. Our algorithms run in time $O(n^3 \operatorname{polylog} n)$ in the general case, and run in time $O(n^2 \operatorname{polylog} n)$ for the case when all segments are horizontal. In addition, we show that when restricted to translations that are only vertical, the Hausdorff distance between two sets of horizontal segments can be computed in time roughly $O(n^{3/2} \operatorname{polylog} n)$. These algorithms form significant improvements over the general algorithm of Chew et al. that takes time $O(n^4 \log^2 n)$. In the second part of this paper we address the problem of matching polygonal chains. We study the well-known Fréchet distance, and present the first algorithm for computing the Fréchet distance under general translations. Our methods also yield algorithms for computing a generalization of the Fréchet distance, and we also present a simple approximation algorithm for the Fréchet distance that runs in time $O(n^2 \operatorname{polylog} n)$.
Submitted 22 September, 2000; v1 submitted 19 September, 2000;
originally announced September 2000.