-
A Framework for Efficient Model Evaluation through Stratification, Sampling, and Estimation
Authors:
Riccardo Fogliato,
Pratik Patil,
Mathew Monfort,
Pietro Perona
Abstract:
Model performance evaluation is a critical and expensive task in machine learning and computer vision. Without clear guidelines, practitioners often estimate model accuracy using a one-time completely random selection of the data. However, by employing tailored sampling and estimation strategies, one can obtain more precise estimates and reduce annotation costs. In this paper, we propose a statist…
▽ More
Model performance evaluation is a critical and expensive task in machine learning and computer vision. Without clear guidelines, practitioners often estimate model accuracy using a one-time completely random selection of the data. However, by employing tailored sampling and estimation strategies, one can obtain more precise estimates and reduce annotation costs. In this paper, we propose a statistical framework for model evaluation that includes stratification, sampling, and estimation components. We examine the statistical properties of each component and evaluate their efficiency (precision). One key result of our work is that stratification via k-means clustering based on accurate predictions of model performance yields efficient estimators. Our experiments on computer vision datasets show that this method consistently provides more precise accuracy estimates than the traditional simple random sampling, even with substantial efficiency gains of 10x. We also find that model-assisted estimators, which leverage predictions of model accuracy on the unlabeled portion of the dataset, are generally more efficient than the traditional estimates based solely on the labeled data.
△ Less
Submitted 18 July, 2024; v1 submitted 11 June, 2024;
originally announced June 2024.
-
Reimplementation of Learning to Reweight Examples for Robust Deep Learning
Authors:
Parth Patil,
Ben Boardley,
Jack Gardner,
Emily Loiselle,
Deerajkumar Parthipan
Abstract:
Deep neural networks (DNNs) have been used to create models for many complex analysis problems like image recognition and medical diagnosis. DNNs are a popular tool within machine learning due to their ability to model complex patterns and distributions. However, the performance of these networks is highly dependent on the quality of the data used to train the models. Two characteristics of these…
▽ More
Deep neural networks (DNNs) have been used to create models for many complex analysis problems like image recognition and medical diagnosis. DNNs are a popular tool within machine learning due to their ability to model complex patterns and distributions. However, the performance of these networks is highly dependent on the quality of the data used to train the models. Two characteristics of these sets, noisy labels and training set biases, are known to frequently cause poor generalization performance as a result of overfitting to the training set. This paper aims to solve this problem using the approach proposed by Ren et al. (2018) using meta-training and online weight approximation. We will first implement a toy-problem to crudely verify the claims made by the authors of Ren et al. (2018) and then venture into using the approach to solve a real world problem of Skin-cancer detection using an imbalanced image dataset.
△ Less
Submitted 10 May, 2024;
originally announced May 2024.
-
Optimal Ridge Regularization for Out-of-Distribution Prediction
Authors:
Pratik Patil,
Jin-Hong Du,
Ryan J. Tibshirani
Abstract:
We study the behavior of optimal ridge regularization and optimal ridge risk for out-of-distribution prediction, where the test distribution deviates arbitrarily from the train distribution. We establish general conditions that determine the sign of the optimal regularization level under covariate and regression shifts. These conditions capture the alignment between the covariance and signal struc…
▽ More
We study the behavior of optimal ridge regularization and optimal ridge risk for out-of-distribution prediction, where the test distribution deviates arbitrarily from the train distribution. We establish general conditions that determine the sign of the optimal regularization level under covariate and regression shifts. These conditions capture the alignment between the covariance and signal structures in the train and test data and reveal stark differences compared to the in-distribution setting. For example, a negative regularization level can be optimal under covariate shift or regression shift, even when the training features are isotropic or the design is underparameterized. Furthermore, we prove that the optimally-tuned risk is monotonic in the data aspect ratio, even in the out-of-distribution setting and when optimizing over negative regularization levels. In general, our results do not make any modeling assumptions for the train or the test distributions, except for moment bounds, and allow for arbitrary shifts and the widest possible range of (negative) regularization levels.
△ Less
Submitted 1 April, 2024;
originally announced April 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1092 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…
▽ More
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
△ Less
Submitted 14 June, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Failures and Successes of Cross-Validation for Early-Stopped Gradient Descent
Authors:
Pratik Patil,
Yuchen Wu,
Ryan J. Tibshirani
Abstract:
We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we sh…
▽ More
We analyze the statistical properties of generalized cross-validation (GCV) and leave-one-out cross-validation (LOOCV) applied to early-stopped gradient descent (GD) in high-dimensional least squares regression. We prove that GCV is generically inconsistent as an estimator of the prediction risk of early-stopped GD, even for a well-specified linear model with isotropic features. In contrast, we show that LOOCV converges uniformly along the GD trajectory to the prediction risk. Our theory requires only mild assumptions on the data distribution and does not require the underlying regression function to be linear. Furthermore, by leveraging the individual LOOCV errors, we construct consistent estimators for the entire prediction error distribution along the GD trajectory and consistent estimators for a wide class of error functionals. This in particular enables the construction of pathwise prediction intervals based on GD iterates that have asymptotically correct nominal coverage conditional on the training data.
△ Less
Submitted 26 February, 2024;
originally announced February 2024.
-
Parallel Approximate Maximum Flows in Near-Linear Work and Polylogarithmic Depth
Authors:
Arpit Agarwal,
Sanjeev Khanna,
Huan Li,
Prathamesh Patil,
Chen Wang,
Nathan White,
Peilin Zhong
Abstract:
We present a parallel algorithm for the $(1-ε)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(ε^{-3}\text{polylog} n)$ depth and $O(m ε^{-3} \text{polylog} n)$ work in the PRAM model. Although near-linear time sequential algorithms for this problem have been known for almost a decade, no parallel algorithms that simultaneously achi…
▽ More
We present a parallel algorithm for the $(1-ε)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(ε^{-3}\text{polylog} n)$ depth and $O(m ε^{-3} \text{polylog} n)$ work in the PRAM model. Although near-linear time sequential algorithms for this problem have been known for almost a decade, no parallel algorithms that simultaneously achieved polylogarithmic depth and near-linear work were known.
At the heart of our result is a polylogarithmic depth, near-linear work recursive algorithm for computing congestion approximators. Our algorithm involves a recursive step to obtain a low-quality congestion approximator followed by a "boosting" step to improve its quality which prevents a multiplicative blow-up in error. Similar to Peng [SODA'16], our boosting step builds upon the hierarchical decomposition scheme of Räcke, Shah, and Täubig [SODA'14].
A direct implementation of this approach, however, leads only to an algorithm with $n^{o(1)}$ depth and $m^{1+o(1)}$ work. To get around this, we introduce a new hierarchical decomposition scheme, in which we only need to solve maximum flows on subgraphs obtained by contracting vertices, as opposed to vertex-induced subgraphs used in Räcke, Shah, and Täubig [SODA'14]. In particular, we are able to directly extract congestion approximators for the subgraphs from a congestion approximator for the entire graph, thereby avoiding additional recursion on those subgraphs. Along the way, we also develop a parallel flow-decomposition algorithm that is crucial to achieving polylogarithmic depth and may be of independent interest.
△ Less
Submitted 22 February, 2024;
originally announced February 2024.
-
Autoencoder with Ordered Variance for Nonlinear Model Identification
Authors:
Midhun T. Augustine,
Parag Patil,
Mani Bhushan,
Sharad Bhartiya
Abstract:
This paper presents a novel autoencoder with ordered variance (AEO) in which the loss function is modified with a variance regularization term to enforce order in the latent space. Further, the autoencoder is modified using ResNets, which results in a ResNet AEO (RAEO). The paper also illustrates the effectiveness of AEO and RAEO in extracting nonlinear relationships among input variables in an un…
▽ More
This paper presents a novel autoencoder with ordered variance (AEO) in which the loss function is modified with a variance regularization term to enforce order in the latent space. Further, the autoencoder is modified using ResNets, which results in a ResNet AEO (RAEO). The paper also illustrates the effectiveness of AEO and RAEO in extracting nonlinear relationships among input variables in an unsupervised setting.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
Misalignment, Learning, and Ranking: Harnessing Users Limited Attention
Authors:
Arpit Agarwal,
Rad Niazadeh,
Prathamesh Patil
Abstract:
In digital health and EdTech, recommendation systems face a significant challenge: users often choose impulsively, in ways that conflict with the platform's long-term payoffs. This misalignment makes it difficult to effectively learn to rank items, as it may hinder exploration of items with greater long-term payoffs. Our paper tackles this issue by utilizing users' limited attention spans. We prop…
▽ More
In digital health and EdTech, recommendation systems face a significant challenge: users often choose impulsively, in ways that conflict with the platform's long-term payoffs. This misalignment makes it difficult to effectively learn to rank items, as it may hinder exploration of items with greater long-term payoffs. Our paper tackles this issue by utilizing users' limited attention spans. We propose a model where a platform presents items with unknown payoffs to the platform in a ranked list to $T$ users over time. Each user selects an item by first considering a prefix window of these ranked items and then picking the highest preferred item in that window (and the platform observes its payoff for this item). We study the design of online bandit algorithms that obtain vanishing regret against hindsight optimal benchmarks.
We first consider adversarial window sizes and stochastic iid payoffs. We design an active-elimination-based algorithm that achieves an optimal instance-dependent regret bound of $O(\log(T))$, by showing matching regret upper and lower bounds. The key idea is using the combinatorial structure of the problem to either obtain a large payoff from each item or to explore by getting a sample from that item. This method systematically narrows down the item choices to enhance learning efficiency and payoff.
Second, we consider adversarial payoffs and stochastic iid window sizes. We start from the full-information problem of finding the permutation that maximizes the expected payoff. By a novel combinatorial argument, we characterize the polytope of admissible item selection probabilities by a permutation and show it has a polynomial-size representation. Using this representation, we show how standard algorithms for adversarial online linear optimization in the space of admissible probabilities can be used to obtain a polynomial-time algorithm with $O(\sqrt{T})$ regret.
△ Less
Submitted 21 February, 2024;
originally announced February 2024.
-
Maximum Likelihood Quantum Error Mitigation for Algorithms with a Single Correct Output
Authors:
Dror Baron,
Hrushikesh Pramod Patil,
Huiyang Zhou
Abstract:
Quantum error mitigation is an important technique to reduce the impact of noise in quantum computers. With more and more qubits being supported on quantum computers, there are two emerging fundamental challenges. First, the number of shots required for quantum algorithms with large numbers of qubits needs to increase in order to obtain a meaningful distribution or expected value of an observable.…
▽ More
Quantum error mitigation is an important technique to reduce the impact of noise in quantum computers. With more and more qubits being supported on quantum computers, there are two emerging fundamental challenges. First, the number of shots required for quantum algorithms with large numbers of qubits needs to increase in order to obtain a meaningful distribution or expected value of an observable. Second, although steady progress has been made in improving the fidelity of each qubit, circuits with a large number of qubits are likely to produce erroneous results. This low-shot, high-noise regime calls for highly scalable error mitigation techniques. In this paper, we propose a simple and effective mitigation scheme, qubit-wise majority vote, for quantum algorithms with a single correct output. We show that our scheme produces the maximum likelihood (ML) estimate under certain assumptions, and bound the number of shots required. Our experimental results on real quantum devices confirm that our proposed approach requires fewer shots than existing ones, and can sometimes recover the correct answers even when they are not observed from the measurement results.
△ Less
Submitted 18 February, 2024;
originally announced February 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…
▽ More
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
△ Less
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
Authors:
Avi Singh,
John D. Co-Reyes,
Rishabh Agarwal,
Ankesh Anand,
Piyush Patil,
Xavier Garcia,
Peter J. Liu,
James Harrison,
Jaehoon Lee,
Kelvin Xu,
Aaron Parisi,
Abhishek Kumar,
Alex Alemi,
Alex Rizkowsky,
Azade Nova,
Ben Adlam,
Bernd Bohnet,
Gamaleldin Elsayed,
Hanie Sedghi,
Igor Mordatch,
Isabelle Simpson,
Izzeddin Gur,
Jasper Snoek,
Jeffrey Pennington,
Jiri Hron
, et al. (16 additional authors not shown)
Abstract:
Fine-tuning language models~(LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investig…
▽ More
Fine-tuning language models~(LMs) on human-generated data remains a prevalent practice. However, the performance of such models is often limited by the quantity and diversity of high-quality human data. In this paper, we explore whether we can go beyond human data on tasks where we have access to scalar feedback, for example, on math problems where one can verify correctness. To do so, we investigate a simple self-training method based on expectation-maximization, which we call ReST$^{EM}$, where we (1) generate samples from the model and filter them using binary feedback, (2) fine-tune the model on these samples, and (3) repeat this process a few times. Testing on advanced MATH reasoning and APPS coding benchmarks using PaLM-2 models, we find that ReST$^{EM}$ scales favorably with model size and significantly surpasses fine-tuning only on human data. Overall, our findings suggest self-training with feedback can substantially reduce dependence on human-generated data.
△ Less
Submitted 17 April, 2024; v1 submitted 11 December, 2023;
originally announced December 2023.
-
Enhancing Low Resource NER Using Assisting Language And Transfer Learning
Authors:
Maithili Sabane,
Aparna Ranade,
Onkar Litake,
Parth Patil,
Raviraj Joshi,
Dipali Kadam
Abstract:
Named Entity Recognition (NER) is a fundamental task in NLP that is used to locate the key information in text and is primarily applied in conversational and search systems. In commercial applications, NER or comparable slot-filling methods have been widely deployed for popular languages. NER is used in applications such as human resources, customer service, search engines, content classification,…
▽ More
Named Entity Recognition (NER) is a fundamental task in NLP that is used to locate the key information in text and is primarily applied in conversational and search systems. In commercial applications, NER or comparable slot-filling methods have been widely deployed for popular languages. NER is used in applications such as human resources, customer service, search engines, content classification, and academia. In this paper, we draw focus on identifying name entities for low-resource Indian languages that are closely related, like Hindi and Marathi. We use various adaptations of BERT such as baseBERT, AlBERT, and RoBERTa to train a supervised NER model. We also compare multilingual models with monolingual models and establish a baseline. In this work, we show the assisting capabilities of the Hindi and Marathi languages for the NER task. We show that models trained using multiple languages perform better than a single language. However, we also observe that blind mixing of all datasets doesn't necessarily provide improvements and data selection methods may be required.
△ Less
Submitted 10 June, 2023;
originally announced June 2023.
-
Confidence Intervals for Error Rates in 1:1 Matching Tasks: Critical Statistical Analysis and Recommendations
Authors:
Riccardo Fogliato,
Pratik Patil,
Pietro Perona
Abstract:
Matching algorithms are commonly used to predict matches between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same person. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when data are dependent and error rates are low, two aspects that have been often overlooked in the l…
▽ More
Matching algorithms are commonly used to predict matches between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same person. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when data are dependent and error rates are low, two aspects that have been often overlooked in the literature. In this work, we review methods for constructing confidence intervals for error rates in 1:1 matching tasks. We derive and examine the statistical properties of these methods, demonstrating how coverage and interval width vary with sample size, error rates, and degree of data dependence on both analysis and experiments with synthetic and real-world datasets. Based on our findings, we provide recommendations for best practices for constructing confidence intervals for error rates in 1:1 matching tasks.
△ Less
Submitted 26 April, 2024; v1 submitted 1 June, 2023;
originally announced June 2023.
-
Generalized equivalences between subsampling and ridge regularization
Authors:
Pratik Patil,
Jin-Hong Du
Abstract:
We establish precise structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic functionals of subsample ridge estimators, when fitted with different ridge regularization levels $λ$ and subsample aspect ratios $ψ$, are asymptotically equivalent along specific paths in the $(λ,ψ)$-plane (where $ψ$ is…
▽ More
We establish precise structural and risk equivalences between subsampling and ridge regularization for ensemble ridge estimators. Specifically, we prove that linear and quadratic functionals of subsample ridge estimators, when fitted with different ridge regularization levels $λ$ and subsample aspect ratios $ψ$, are asymptotically equivalent along specific paths in the $(λ,ψ)$-plane (where $ψ$ is the ratio of the feature dimension to the subsample size). Our results only require bounded moment assumptions on feature and response distributions and allow for arbitrary joint distributions. Furthermore, we provide a data-dependent method to determine the equivalent paths of $(λ,ψ)$. An indirect implication of our equivalences is that optimally tuned ridge regression exhibits a monotonic prediction risk in the data aspect ratio. This resolves a recent open problem raised by Nakkiran et al. for general data distributions under proportional asymptotics, assuming a mild regularity condition that maintains regression hardness through linearized signal-to-noise ratios.
△ Less
Submitted 17 October, 2023; v1 submitted 29 May, 2023;
originally announced May 2023.
-
Subsample Ridge Ensembles: Equivalences and Generalized Cross-Validation
Authors:
Jin-Hong Du,
Pratik Patil,
Arun Kumar Kuchibhotla
Abstract:
We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $λ$ and the limiting subsample aspect ratio $φ_s$ (the ratio of the feature size to the subsample size), we…
▽ More
We study subsampling-based ridge ensembles in the proportional asymptotics regime, where the feature size grows proportionally with the sample size such that their ratio converges to a constant. By analyzing the squared prediction risk of ridge ensembles as a function of the explicit penalty $λ$ and the limiting subsample aspect ratio $φ_s$ (the ratio of the feature size to the subsample size), we characterize contours in the $(λ, φ_s)$-plane at any achievable risk. As a consequence, we prove that the risk of the optimal full ridgeless ensemble (fitted on all possible subsamples) matches that of the optimal ridge predictor. In addition, we prove strong uniform consistency of generalized cross-validation (GCV) over the subsample sizes for estimating the prediction risk of ridge ensembles. This allows for GCV-based tuning of full ridgeless ensembles without sample splitting and yields a predictor whose risk matches optimal ridge risk.
△ Less
Submitted 16 July, 2023; v1 submitted 25 April, 2023;
originally announced April 2023.
-
KILDST: Effective Knowledge-Integrated Learning for Dialogue State Tracking using Gazetteer and Speaker Information
Authors:
Hyungtak Choi,
Hyeonmok Ko,
Gurpreet Kaur,
Lohith Ravuru,
Kiranmayi Gandikota,
Manisha Jhawar,
Simma Dharani,
Pranamya Patil
Abstract:
Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention. In addition, it is necessary to define a new problem that can deal with dialogue between users as a step toward the conversational AI that extracts and recommends information from the dialogue between users. So, we introduce a new task - DST from dialogue between users about scheduling an event (DST…
▽ More
Dialogue State Tracking (DST) is core research in dialogue systems and has received much attention. In addition, it is necessary to define a new problem that can deal with dialogue between users as a step toward the conversational AI that extracts and recommends information from the dialogue between users. So, we introduce a new task - DST from dialogue between users about scheduling an event (DST-USERS). The DST-USERS task is much more challenging since it requires the model to understand and track dialogue states in the dialogue between users and to understand who suggested the schedule and who agreed to the proposed schedule. To facilitate DST-USERS research, we develop dialogue datasets between users that plan a schedule. The annotated slot values which need to be extracted in the dialogue are date, time, and location. Previous approaches, such as Machine Reading Comprehension (MRC) and traditional DST techniques, have not achieved good results in our extensive evaluations. By adopting the knowledge-integrated learning method, we achieve exceptional results. The proposed model architecture combines gazetteer features and speaker information efficiently. Our evaluations of the dialogue datasets between users that plan a schedule show that our model outperforms the baseline model.
△ Less
Submitted 18 January, 2023;
originally announced January 2023.
-
Asymptotics of the Sketched Pseudoinverse
Authors:
Daniel LeJeune,
Pratik Patil,
Hamid Javadi,
Richard G. Baraniuk,
Ryan J. Tibshirani
Abstract:
We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise…
▽ More
We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise characterization of the equivalence even under negative regularization, including a precise characterization of the smallest nonzero eigenvalue of the sketched matrix, which may be of independent interest. We then further characterize the second-order equivalence of the sketched pseudoinverse. We also apply our results to the analysis of the sketch-and-project method and to sketched ridge regression. Lastly, we prove that these results generalize to asymptotically free sketching matrices, obtaining the resulting equivalence for orthogonal sketching matrices and comparing our results to several common sketches used in practice.
△ Less
Submitted 6 October, 2023; v1 submitted 7 November, 2022;
originally announced November 2022.
-
Threat Detection In Self-Driving Vehicles Using Computer Vision
Authors:
Umang Goenka,
Aaryan Jagetia,
Param Patil,
Akshay Singh,
Taresh Sharma,
Poonam Saini
Abstract:
On-road obstacle detection is an important field of research that falls in the scope of intelligent transportation infrastructure systems. The use of vision-based approaches results in an accurate and cost-effective solution to such systems. In this research paper, we propose a threat detection mechanism for autonomous self-driving cars using dashcam videos to ensure the presence of any unwanted o…
▽ More
On-road obstacle detection is an important field of research that falls in the scope of intelligent transportation infrastructure systems. The use of vision-based approaches results in an accurate and cost-effective solution to such systems. In this research paper, we propose a threat detection mechanism for autonomous self-driving cars using dashcam videos to ensure the presence of any unwanted obstacle on the road that falls within its visual range. This information can assist the vehicle's program to en route safely. There are four major components, namely, YOLO to identify the objects, advanced lane detection algorithm, multi regression model to measure the distance of the object from the camera, the two-second rule for measuring the safety, and limiting speed. In addition, we have used the Car Crash Dataset(CCD) for calculating the accuracy of the model. The YOLO algorithm gives an accuracy of around 93%. The final accuracy of our proposed Threat Detection Model (TDM) is 82.65%.
△ Less
Submitted 6 September, 2022;
originally announced September 2022.
-
Classification of Electroencephalograms during Mathematical Calculations Using Deep Learning
Authors:
Umang Goenka,
Param Patil,
Kush Gosalia,
Aaryan Jagetia
Abstract:
Classifying Electroencephalogram(EEG) signals helps in understanding Brain-Computer Interface (BCI). EEG signals are vital in studying how the human mind functions. In this paper, we have used an Arithmetic Calculation dataset consisting of Before Calculation Signals (BCS) and During Calculation Signals (DCS). The dataset consisted of 36 participants. In order to understand the functioning of neur…
▽ More
Classifying Electroencephalogram(EEG) signals helps in understanding Brain-Computer Interface (BCI). EEG signals are vital in studying how the human mind functions. In this paper, we have used an Arithmetic Calculation dataset consisting of Before Calculation Signals (BCS) and During Calculation Signals (DCS). The dataset consisted of 36 participants. In order to understand the functioning of neurons in the brain, we classified BCS vs DCS. For this classification, we extracted various features such as Mutual Information (MI), Phase Locking Value (PLV), and Entropy namely Permutation entropy, Spectral entropy, Singular value decomposition entropy, Approximate entropy, Sample entropy. The classification of these features was done using RNN-based classifiers such as LSTM, BLSTM, ConvLSTM, and CNN-LSTM. The model achieved an accuracy of 99.72% when entropy was used as a feature and ConvLSTM as a classifier.
△ Less
Submitted 31 August, 2022;
originally announced September 2022.
-
Multi-Study Boosting: Theoretical Considerations for Merging vs. Ensembling
Authors:
Cathy Shyr,
Pragya Sur,
Giovanni Parmigiani,
Prasad Patil
Abstract:
Cross-study replicability is a powerful model evaluation criterion that emphasizes generalizability of predictions. When training cross-study replicable prediction models, it is critical to decide between merging and treating the studies separately. We study boosting algorithms in the presence of potential heterogeneity in predictor-outcome relationships across studies and compare two multi-study…
▽ More
Cross-study replicability is a powerful model evaluation criterion that emphasizes generalizability of predictions. When training cross-study replicable prediction models, it is critical to decide between merging and treating the studies separately. We study boosting algorithms in the presence of potential heterogeneity in predictor-outcome relationships across studies and compare two multi-study learning strategies: 1) merging all the studies and training a single model, and 2) multi-study ensembling, which involves training a separate model on each study and ensembling the resulting predictions. In the regression setting, we provide theoretical guidelines based on an analytical transition point to determine whether it is more beneficial to merge or to ensemble for boosting with linear learners. In addition, we characterize a bias-variance decomposition of estimation error for boosting with component-wise linear learners. We verify the theoretical transition point result in simulation and illustrate how it can guide the decision on merging vs. ensembling in an application to breast cancer gene expression data.
△ Less
Submitted 12 July, 2022; v1 submitted 10 July, 2022;
originally announced July 2022.
-
Sublinear Algorithms for Hierarchical Clustering
Authors:
Arpit Agarwal,
Sanjeev Khanna,
Huan Li,
Prathamesh Patil
Abstract:
Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require line…
▽ More
Hierarchical clustering over graphs is a fundamental task in data mining and machine learning with applications in domains such as phylogenetics, social network analysis, and information retrieval. Specifically, we consider the recently popularized objective function for hierarchical clustering due to Dasgupta. Previous algorithms for (approximately) minimizing this objective function require linear time/space complexity. In many applications the underlying graph can be massive in size making it computationally challenging to process the graph even using a linear time/space algorithm. As a result, there is a strong interest in designing algorithms that can perform global computation using only sublinear resources. The focus of this work is to study hierarchical clustering for massive graphs under three well-studied models of sublinear computation which focus on space, time, and communication, respectively, as the primary resources to optimize: (1) (dynamic) streaming model where edges are presented as a stream, (2) query model where the graph is queried using neighbor and degree queries, (3) MPC model where the graph edges are partitioned over several machines connected via a communication channel.
We design sublinear algorithms for hierarchical clustering in all three models above. At the heart of our algorithmic results is a view of the objective in terms of cuts in the graph, which allows us to use a relaxed notion of cut sparsifiers to do hierarchical clustering while introducing only a small distortion in the objective function. Our main algorithmic contributions are then to show how cut sparsifiers of the desired form can be efficiently constructed in the query model and the MPC model. We complement our algorithmic results by establishing nearly matching lower bounds that rule out the possibility of designing better algorithms in each of these models.
△ Less
Submitted 15 June, 2022;
originally announced June 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…
▽ More
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
△ Less
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Mitigating multiple descents: A model-agnostic framework for risk monotonization
Authors:
Pratik Patil,
Arun Kumar Kuchibhotla,
Yuting Wei,
Alessandro Rinaldo
Abstract:
Recent empirical and theoretical analyses of several commonly used prediction procedures reveal a peculiar risk behavior in high dimensions, referred to as double/multiple descent, in which the asymptotic risk is a non-monotonic function of the limiting aspect ratio of the number of features or parameters to the sample size. To mitigate this undesirable behavior, we develop a general framework for…
▽ More
Recent empirical and theoretical analyses of several commonly used prediction procedures reveal a peculiar risk behavior in high dimensions, referred to as double/multiple descent, in which the asymptotic risk is a non-monotonic function of the limiting aspect ratio of the number of features or parameters to the sample size. To mitigate this undesirable behavior, we develop a general framework for risk monotonization based on cross-validation that takes as input a generic prediction procedure and returns a modified procedure whose out-of-sample prediction risk is, asymptotically, monotonic in the limiting aspect ratio. As part of our framework, we propose two data-driven methodologies, namely zero- and one-step, that are akin to bagging and boosting, respectively, and show that, under very mild assumptions, they provably achieve monotonic asymptotic risk behavior. Our results are applicable to a broad variety of prediction procedures and loss functions, and do not require a well-specified (parametric) model. We exemplify our framework with concrete analyses of the minimum $\ell_2$, $\ell_1$-norm least squares prediction procedures. As one of the ingredients in our analysis, we also derive novel additive and multiplicative forms of oracle risk inequalities for split cross-validation that are of independent interest.
△ Less
Submitted 25 May, 2022;
originally announced May 2022.
-
A Sharp Memory-Regret Trade-Off for Multi-Pass Streaming Bandits
Authors:
Arpit Agarwal,
Sanjeev Khanna,
Prathamesh Patil
Abstract:
The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice however, the number of arms can be very large resulting in large memory requirements for simultaneously processing them. In this paper we consider a streaming setting where the arms are presented in a stream and the algorith…
▽ More
The stochastic $K$-armed bandit problem has been studied extensively due to its applications in various domains ranging from online advertising to clinical trials. In practice however, the number of arms can be very large resulting in large memory requirements for simultaneously processing them. In this paper we consider a streaming setting where the arms are presented in a stream and the algorithm uses limited memory to process these arms. Here, the goal is not only to minimize regret, but also to do so in minimal memory. Previous algorithms for this problem operate in one of the two settings: they either use $Ω(\log \log T)$ passes over the stream (Rathod, 2021; Chaudhuri and Kalyanakrishnan, 2020; Liau et al., 2018), or just a single pass (Maiti et al., 2021).
In this paper we study the trade-off between memory and regret when $B$ passes over the stream are allowed, for any $B \geq 1$, and establish tight regret upper and lower bounds for any $B$-pass algorithm. Our results uncover a surprising *sharp transition phenomenon*: $O(1)$ memory is sufficient to achieve $\widetildeΘ\Big(T^{\frac{1}{2} + \frac{1}{2^{B+2}-2}}\Big)$ regret in $B$ passes, and increasing the memory to any quantity that is $o(K)$ has almost no impact on further reducing this regret, unless we use $Ω(K)$ memory. Our main technical contribution is our lower bound which requires the use of information-theoretic techniques as well as ideas from round elimination to show that the *residual problem* remains challenging over subsequent passes.
△ Less
Submitted 2 May, 2022;
originally announced May 2022.
-
L3Cube-MahaNER: A Marathi Named Entity Recognition Dataset and BERT models
Authors:
Parth Patil,
Aparna Ranade,
Maithili Sabane,
Onkar Litake,
Raviraj Joshi
Abstract:
Named Entity Recognition (NER) is a basic NLP task and finds major applications in conversational and search systems. It helps us identify key entities in a sentence used for the downstream application. NER or similar slot filling systems for popular languages have been heavily used in commercial applications. In this work, we focus on Marathi, an Indian language, spoken prominently by the people…
▽ More
Named Entity Recognition (NER) is a basic NLP task and finds major applications in conversational and search systems. It helps us identify key entities in a sentence used for the downstream application. NER or similar slot filling systems for popular languages have been heavily used in commercial applications. In this work, we focus on Marathi, an Indian language, spoken prominently by the people of Maharashtra state. Marathi is a low resource language and still lacks useful NER resources. We present L3Cube-MahaNER, the first major gold standard named entity recognition dataset in Marathi. We also describe the manual annotation guidelines followed during the process. In the end, we benchmark the dataset on different CNN, LSTM, and Transformer based models like mBERT, XLM-RoBERTa, IndicBERT, MahaBERT, etc. The MahaBERT provides the best performance among all the models. The data and models are available at https://github.com/l3cube-pune/MarathiNLP .
△ Less
Submitted 12 April, 2022;
originally announced April 2022.
-
Mono vs Multilingual BERT: A Case Study in Hindi and Marathi Named Entity Recognition
Authors:
Onkar Litake,
Maithili Sabane,
Parth Patil,
Aparna Ranade,
Raviraj Joshi
Abstract:
Named entity recognition (NER) is the process of recognising and classifying important information (entities) in text. Proper nouns, such as a person's name, an organization's name, or a location's name, are examples of entities. The NER is one of the important modules in applications like human resources, customer support, search engines, content classification, and academia. In this work, we con…
▽ More
Named entity recognition (NER) is the process of recognising and classifying important information (entities) in text. Proper nouns, such as a person's name, an organization's name, or a location's name, are examples of entities. The NER is one of the important modules in applications like human resources, customer support, search engines, content classification, and academia. In this work, we consider NER for low-resource Indian languages like Hindi and Marathi. The transformer-based models have been widely used for NER tasks. We consider different variations of BERT like base-BERT, RoBERTa, and AlBERT and benchmark them on publicly available Hindi and Marathi NER datasets. We provide an exhaustive comparison of different monolingual and multilingual transformer-based models and establish simple baselines currently missing in the literature. We show that the monolingual MahaRoBERTa model performs the best for Marathi NER whereas the multilingual XLM-RoBERTa performs the best for Hindi NER. We also perform cross-language evaluation and present mixed observations.
△ Less
Submitted 24 March, 2022;
originally announced March 2022.
-
Video Content Classification using Deep Learning
Authors:
Pradyumn Patil,
Vishwajeet Pawar,
Yashraj Pawar,
Shruti Pisal
Abstract:
Video content classification is an important research content in computer vision, which is widely used in many fields, such as image and video retrieval, computer vision. This paper presents a model that is a combination of Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) which develops, trains, and optimizes a deep learning network that can identify the type of video content…
▽ More
Video content classification is an important research content in computer vision, which is widely used in many fields, such as image and video retrieval, computer vision. This paper presents a model that is a combination of Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) which develops, trains, and optimizes a deep learning network that can identify the type of video content and classify them into categories such as "Animation, Gaming, natural content, flat content, etc". To enhance the performance of the model novel keyframe extraction method is included to classify only the keyframes, thereby reducing the overall processing time without sacrificing any significant performance.
△ Less
Submitted 26 November, 2021;
originally announced November 2021.
-
Blockchain-based Security Services for Fog Computing
Authors:
Arvind W. Kiwelekar,
Pramod Patil,
Laxman D. Netak,
Sanjay U Waikar
Abstract:
Fog computing is a paradigm for distributed computing that enables sharing of resources such as computing, storage and network services. Unlike cloud computing, fog computing platforms primarily support {\em non-functional properties} such as location awareness, mobility and reduced latency. This emerging paradigm has many potential applications in domains such as smart grids, smart cities, and tr…
▽ More
Fog computing is a paradigm for distributed computing that enables sharing of resources such as computing, storage and network services. Unlike cloud computing, fog computing platforms primarily support {\em non-functional properties} such as location awareness, mobility and reduced latency. This emerging paradigm has many potential applications in domains such as smart grids, smart cities, and transport management.
Most of these domains collect and monitor personal information through edge devices to offer personalized services. A {\em centralized} server either at the level of cloud or fog, has been found ineffective to provide a high degree of security and privacy-preserving services.
Blockchain technology supports the development of {\em decentralized} applications designed around the principles of immutability, cryptography, consistency preserving consensus protocols and smart contracts. Hence blockchain technology has emerged as a preferred technology in recent times to build trustworthy distributed applications.
The chapter describes the potential of blockchain technology to realize security services such as authentication, secured communication, availability, privacy and trust management to support the development of dependable fog services.
△ Less
Submitted 16 February, 2021;
originally announced February 2021.
-
Align-RUDDER: Learning From Few Demonstrations by Reward Redistribution
Authors:
Vihang P. Patil,
Markus Hofmarcher,
Marius-Constantin Dinu,
Matthias Dorfer,
Patrick M. Blies,
Johannes Brandstetter,
Jose A. Arjona-Medina,
Sepp Hochreiter
Abstract:
Reinforcement learning algorithms require many samples when solving complex hierarchical tasks with sparse and delayed rewards. For such complex tasks, the recently proposed RUDDER uses reward redistribution to leverage steps in the Q-function that are associated with accomplishing sub-tasks. However, often only few episodes with high rewards are available as demonstrations since current explorati…
▽ More
Reinforcement learning algorithms require many samples when solving complex hierarchical tasks with sparse and delayed rewards. For such complex tasks, the recently proposed RUDDER uses reward redistribution to leverage steps in the Q-function that are associated with accomplishing sub-tasks. However, often only few episodes with high rewards are available as demonstrations since current exploration strategies cannot discover them in reasonable time. In this work, we introduce Align-RUDDER, which utilizes a profile model for reward redistribution that is obtained from multiple sequence alignment of demonstrations. Consequently, Align-RUDDER employs reward redistribution effectively and, thereby, drastically improves learning on few demonstrations. Align-RUDDER outperforms competitors on complex artificial tasks with delayed rewards and few demonstrations. On the Minecraft ObtainDiamond task, Align-RUDDER is able to mine a diamond, though not frequently. Code is available at https://github.com/ml-jku/align-rudder. YouTube: https://youtu.be/HO-_8ZUl-UY
△ Less
Submitted 28 June, 2022; v1 submitted 29 September, 2020;
originally announced September 2020.
-
Uplink-Downlink Duality Between Multiple-Access and Broadcast Channels with Compressing Relays
Authors:
Liang Liu,
Ya-Feng Liu,
Pratik Patil,
Wei Yu
Abstract:
Uplink-downlink duality refers to the fact that under a sum-power constraint, the capacity regions of a Gaussian multiple-access channel and a Gaussian broadcast channel with Hermitian transposed channel matrices are identical. This paper generalizes this result to a cooperative cellular network, in which remote access-points are deployed as relays in serving the users under the coordination of a…
▽ More
Uplink-downlink duality refers to the fact that under a sum-power constraint, the capacity regions of a Gaussian multiple-access channel and a Gaussian broadcast channel with Hermitian transposed channel matrices are identical. This paper generalizes this result to a cooperative cellular network, in which remote access-points are deployed as relays in serving the users under the coordination of a central processor (CP). In this model, the users and the relays are connected over noisy wireless links, while the relays and the CP are connected over noiseless but rate-limited fronthaul links. Based on a Lagrangian technique, this paper establishes a duality relationship between such a multiple-access relay channel and broadcast relay channel, under the assumption that the relays use compression-based strategies. Specifically, we show that under the same total transmit power constraint and individual fronthaul rate constraints, the achievable rate regions of the Gaussian multiple-access and broadcast relay channels are identical, when either independent compression or Wyner-Ziv and multivariate compression strategies are used. The key observations are that if the beamforming vectors at the relays are fixed, the sum-power minimization problems under the achievable rate and fronthaul constraints in both the uplink and the downlink can be transformed into either a linear programming or a semidefinite programming problem depending on the compression technique, and that the uplink and downlink problems are Lagrangian duals of each other. Moreover, the dual variables corresponding to the downlink rate constraints become the uplink powers; the dual variables corresponding to the downlink fronthaul constraints become the uplink quantization noises. This duality relationship enables an efficient algorithm for optimizing the downlink transmission and relaying strategies based on the uplink.
△ Less
Submitted 26 August, 2021; v1 submitted 25 August, 2020;
originally announced August 2020.
-
City-Scale Agent-Based Simulators for the Study of Non-Pharmaceutical Interventions in the Context of the COVID-19 Epidemic
Authors:
Shubhada Agrawal,
Siddharth Bhandari,
Anirban Bhattacharjee,
Anand Deo,
Narendra M. Dixit,
Prahladh Harsha,
Sandeep Juneja,
Poonam Kesarwani,
Aditya Krishna Swamy,
Preetam Patil,
Nihesh Rathod,
Ramprasad Saptharishi,
Sharad Shriram,
Piyush Srivastava,
Rajesh Sundaresan,
Nidhin Koshy Vaidhiyan,
Sarath Yasodharan
Abstract:
We highlight the usefulness of city-scale agent-based simulators in studying various non-pharmaceutical interventions to manage an evolving pandemic. We ground our studies in the context of the COVID-19 pandemic and demonstrate the power of the simulator via several exploratory case studies in two metropolises, Bengaluru and Mumbai. Such tools become common-place in any city administration's tool…
▽ More
We highlight the usefulness of city-scale agent-based simulators in studying various non-pharmaceutical interventions to manage an evolving pandemic. We ground our studies in the context of the COVID-19 pandemic and demonstrate the power of the simulator via several exploratory case studies in two metropolises, Bengaluru and Mumbai. Such tools become common-place in any city administration's tool kit in our march towards digital health.
△ Less
Submitted 11 August, 2020;
originally announced August 2020.
-
Representation via Representations: Domain Generalization via Adversarially Learned Invariant Representations
Authors:
Zhun Deng,
Frances Ding,
Cynthia Dwork,
Rachel Hong,
Giovanni Parmigiani,
Prasad Patil,
Pragya Sur
Abstract:
We investigate the power of censoring techniques, first developed for learning {\em fair representations}, to address domain generalization. We examine {\em adversarial} censoring techniques for learning invariant representations from multiple "studies" (or domains), where each study is drawn according to a distribution on domains. The mapping is used at test time to classify instances from a new…
▽ More
We investigate the power of censoring techniques, first developed for learning {\em fair representations}, to address domain generalization. We examine {\em adversarial} censoring techniques for learning invariant representations from multiple "studies" (or domains), where each study is drawn according to a distribution on domains. The mapping is used at test time to classify instances from a new domain. In many contexts, such as medical forecasting, domain generalization from studies in populous areas (where data are plentiful), to geographically remote populations (for which no training data exist) provides fairness of a different flavor, not anticipated in previous work on algorithmic fairness.
We study an adversarial loss function for $k$ domains and precisely characterize its limiting behavior as $k$ grows, formalizing and proving the intuition, backed by experiments, that observing data from a larger number of domains helps. The limiting results are accompanied by non-asymptotic learning-theoretic bounds. Furthermore, we obtain sufficient conditions for good worst-case prediction performance of our algorithm on previously unseen domains. Finally, we decompose our mappings into two components and provide a complete characterization of invariance in terms of this decomposition. To our knowledge, our results provide the first formal guarantees of these kinds for adversarial invariant domain generalization.
△ Less
Submitted 19 June, 2020;
originally announced June 2020.
-
COVID-19 Epidemic Study II: Phased Emergence From the Lockdown in Mumbai
Authors:
Prahladh Harsha,
Sandeep Juneja,
Preetam Patil,
Nihesh Rathod,
Ramprasad Saptharishi,
A. Y. Sarath,
Sharad Sriram,
Piyush Srivastava,
Rajesh Sundaresan,
Nidhin Koshy Vaidhiyan
Abstract:
The nation-wide lockdown starting 25 March 2020, aimed at suppressing the spread of the COVID-19 disease, was extended until 31 May 2020 in three subsequent orders by the Government of India. The extended lockdown has had significant social and economic consequences and `lockdown fatigue' has likely set in. Phased reopening began from 01 June 2020 onwards. Mumbai, one of the most crowded cities in…
▽ More
The nation-wide lockdown starting 25 March 2020, aimed at suppressing the spread of the COVID-19 disease, was extended until 31 May 2020 in three subsequent orders by the Government of India. The extended lockdown has had significant social and economic consequences and `lockdown fatigue' has likely set in. Phased reopening began from 01 June 2020 onwards. Mumbai, one of the most crowded cities in the world, has witnessed both the largest number of cases and deaths among all the cities in India (41986 positive cases and 1368 deaths as of 02 June 2020). Many tough decisions are going to be made on re-opening in the next few days. In an earlier IISc-TIFR Report, we presented an agent-based city-scale simulator(ABCS) to model the progression and spread of the infection in large metropolises like Mumbai and Bengaluru. As discussed in IISc-TIFR Report 1, ABCS is a useful tool to model interactions of city residents at an individual level and to capture the impact of non-pharmaceutical interventions on the infection spread. In this report we focus on Mumbai. Using our simulator, we consider some plausible scenarios for phased emergence of Mumbai from the lockdown, 01 June 2020 onwards. These include phased and gradual opening of the industry, partial opening of public transportation (modelling of infection spread in suburban trains), impact of containment zones on controlling infections, and the role of compliance with respect to various intervention measures including use of masks, case isolation, home quarantine, etc. The main takeaway of our simulation results is that a phased opening of workplaces, say at a conservative attendance level of 20 to 33\%, is a good way to restart economic activity while ensuring that the city's medical care capacity remains adequate to handle the possible rise in the number of COVID-19 patients in June and July.
△ Less
Submitted 5 June, 2020;
originally announced June 2020.
-
Network Clustering Via Kernel-ARMA Modeling and the Grassmannian The Brain-Network Case
Authors:
Cong Ye,
Konstantinos Slavakis,
Pratik V. Patil,
Johan Nakuci,
Sarah F. Muldoon,
John Medaglia
Abstract:
This paper introduces a clustering framework for networks with nodes annotated with time-series data. The framework addresses all types of network-clustering problems: State clustering, node clustering within states (a.k.a. topology identification or community detection), and even subnetwork-state-sequence identification/tracking. Via a bottom-up approach, features are first extracted from the raw…
▽ More
This paper introduces a clustering framework for networks with nodes annotated with time-series data. The framework addresses all types of network-clustering problems: State clustering, node clustering within states (a.k.a. topology identification or community detection), and even subnetwork-state-sequence identification/tracking. Via a bottom-up approach, features are first extracted from the raw nodal time-series data by kernel autoregressive-moving-average modeling to reveal non-linear dependencies and low-rank representations, and then mapped onto the Grassmann manifold (Grassmannian). All clustering tasks are performed by leveraging the underlying Riemannian geometry of the Grassmannian in a novel way. To validate the proposed framework, brain-network clustering is considered, where extensive numerical tests on synthetic and real functional magnetic resonance imaging (fMRI) data demonstrate that the advocated learning framework compares favorably versus several state-of-the-art clustering schemes.
△ Less
Submitted 18 February, 2020;
originally announced February 2020.
-
Brain-Network Clustering via Kernel-ARMA Modeling and the Grassmannian
Authors:
Cong Ye,
Konstantinos Slavakis,
Pratik V. Patil,
Sarah F. Muldoon,
John Medaglia
Abstract:
Recent advances in neuroscience and in the technology of functional magnetic resonance imaging (fMRI) and electro-encephalography (EEG) have propelled a growing interest in brain-network clustering via time-series analysis. Notwithstanding, most of the brain-network clustering methods revolve around state clustering and/or node clustering (a.k.a. community detection or topology inference) within s…
▽ More
Recent advances in neuroscience and in the technology of functional magnetic resonance imaging (fMRI) and electro-encephalography (EEG) have propelled a growing interest in brain-network clustering via time-series analysis. Notwithstanding, most of the brain-network clustering methods revolve around state clustering and/or node clustering (a.k.a. community detection or topology inference) within states. This work answers first the need of capturing non-linear nodal dependencies by bringing forth a novel feature-extraction mechanism via kernel autoregressive-moving-average modeling. The extracted features are mapped to the Grassmann manifold (Grassmannian), which consists of all linear subspaces of a fixed rank. By virtue of the Riemannian geometry of the Grassmannian, a unifying clustering framework is offered to tackle all possible clustering problems in a network: Cluster multiple states, detect communities within states, and even identify/track subnetwork state sequences. The effectiveness of the proposed approach is underlined by extensive numerical tests on synthetic and real fMRI/EEG data which demonstrate that the advocated learning method compares favorably versus several state-of-the-art clustering schemes.
△ Less
Submitted 5 June, 2019;
originally announced June 2019.
-
Merging versus Ensembling in Multi-Study Prediction: Theoretical Insight from Random Effects
Authors:
Zoe Guan,
Giovanni Parmigiani,
Prasad Patil
Abstract:
A critical decision point when training predictors using multiple studies is whether these studies should be combined or treated separately. We compare two multi-study learning approaches in the presence of potential heterogeneity in predictor-outcome relationships across datasets. We consider 1) merging all of the datasets and training a single learner, and 2) multi-study ensembling, which involv…
▽ More
A critical decision point when training predictors using multiple studies is whether these studies should be combined or treated separately. We compare two multi-study learning approaches in the presence of potential heterogeneity in predictor-outcome relationships across datasets. We consider 1) merging all of the datasets and training a single learner, and 2) multi-study ensembling, which involves training a separate learner on each dataset and combining the predictions resulting from each learner. In a linear regression setting, we show analytically and confirm via simulation that merging yields lower prediction error than ensembling when the predictor-outcome relationships are relatively homogeneous across studies. However, as cross-study heterogeneity increases, there exists a transition point beyond which ensembling outperforms merging. We provide analytic expressions for the transition point in various scenarios, study asymptotic properties, and illustrate how transition point theory can be used for deciding when studies should be combined with an application from metabolomics.
△ Less
Submitted 17 June, 2021; v1 submitted 17 May, 2019;
originally announced May 2019.
-
Hybrid Data-Sharing and Compression Strategy for Downlink Cloud Radio Access Network
Authors:
Pratik Patil,
Binbin Dai,
Wei Yu
Abstract:
This paper studies transmission strategies for the downlink of a cloud radio access network, in which the base stations are connected to a centralized cloud-computing based processor with digital fronthaul or backhaul links. We provide a system-level performance comparison of two fundamentally different strategies, namely the data-sharing strategy and the compression strategy, that differ in the w…
▽ More
This paper studies transmission strategies for the downlink of a cloud radio access network, in which the base stations are connected to a centralized cloud-computing based processor with digital fronthaul or backhaul links. We provide a system-level performance comparison of two fundamentally different strategies, namely the data-sharing strategy and the compression strategy, that differ in the way the fronthaul/backhaul is utilized. It is observed that the performance of both strategies depends crucially on the available fronthaul or backhaul capacity. When the fronthaul/backhaul capacity is low, the datasharing strategy performs better, while under moderate-to-high fronthaul/backhaul capacity, the compression strategy is superior. Using insights from such a comparison, we propose a novel hybrid strategy, combining the data-sharing and compression strategies, that allows for better control over the fronthaul/backhaul capacity utilization. An optimization framework for the hybrid strategy is proposed. Numerical evidence demonstrates the performance gain of the hybrid strategy.
△ Less
Submitted 2 June, 2018;
originally announced June 2018.
-
Channel Diagonalization for Cloud Radio Access
Authors:
Liang Liu,
Pratik Patil,
Wei Yu
Abstract:
The diagonalization of a conventional multiple-input multiple-output (MIMO) channel into parallel and independent subchannels via singular value decomposition (SVD) is a fundamental strategy that allows the MIMO channel capacity to be achieved using scalar channel codes. This letter establishes a similar channel diagonalization result for the uplink and the downlink of a cloud radio access network…
▽ More
The diagonalization of a conventional multiple-input multiple-output (MIMO) channel into parallel and independent subchannels via singular value decomposition (SVD) is a fundamental strategy that allows the MIMO channel capacity to be achieved using scalar channel codes. This letter establishes a similar channel diagonalization result for the uplink and the downlink of a cloud radio access network (C-RAN), in which a central processor (CP) is connected to a remote radio head (RRH) serving a single user via rate-limit digital fronthaul carrying the compressed baseband signal. Specifically, we show that the diagonalization of the MIMO channel between the RRH and the user via SVD and the subsequent independent and parallel quantization of scalar signals and channel coding in each of the subchannels is optimal. This letter establishes this fact using the majorization theory. Further, an uplink-downlink duality for the multiple-antenna C-RAN is identified for this single-user case.
△ Less
Submitted 25 February, 2018; v1 submitted 6 February, 2018;
originally announced February 2018.
-
Generalized Compression Strategy for the Downlink Cloud Radio Access Network
Authors:
Pratik Patil,
Wei Yu
Abstract:
This paper studies the downlink of a cloud radio access network (C-RAN) in which a centralized processor (CP) communicates with mobile users through base stations (BSs) that are connected to the CP via finite-capacity fronthaul links. Information theoretically, the downlink of a C-RAN is modeled as a two-hop broadcast-relay network. Among the various transmission and relaying strategies for such m…
▽ More
This paper studies the downlink of a cloud radio access network (C-RAN) in which a centralized processor (CP) communicates with mobile users through base stations (BSs) that are connected to the CP via finite-capacity fronthaul links. Information theoretically, the downlink of a C-RAN is modeled as a two-hop broadcast-relay network. Among the various transmission and relaying strategies for such model, this paper focuses on the compression strategy, in which the CP centrally encodes the signals to be broadcasted jointly by the BSs, then compresses and sends these signals to the BSs through the fronthaul links. This paper characterizes an achievable rate region for a generalized compression strategy with Marton's multicoding for broadcasting and multivariate compression for fronthaul transmission. We then compare this rate region with the distributed decode-forward (DDF) scheme, which achieves the capacity of the general relay networks to within a constant gap, and show that the difference lies in that DDF performs Marton's multicoding and multivariate compression jointly as opposed to successively as in the compression strategy. A main result of this paper is that under the assumption that the fronthaul links are subject to a \emph{sum} capacity constraint this difference is immaterial, so the successive encoding based compression strategy can already achieve the capacity region of the C-RAN to within a constant gap, where the gap is independent of the channel parameters and the power constraints at the BSs. For the special case of the Gaussian network, we further establish that under individual fronthaul constraints, the compression strategy achieves to within a constant gap to the \emph{sum} capacity of the C-RAN.
△ Less
Submitted 22 July, 2019; v1 submitted 31 December, 2017;
originally announced January 2018.
-
An uplink-downlink duality for cloud radio access network
Authors:
Liang Liu,
Pratik Patil,
Wei Yu
Abstract:
Uplink-downlink duality refers to the fact that the Gaussian broadcast channel has the same capacity region as the dual Gaussian multiple-access channel under the same sumpower constraint. This paper investigates a similar duality relationship between the uplink and downlink of a cloud radio access network (C-RAN), where a central processor (CP) cooperatively serves multiple mobile users through m…
▽ More
Uplink-downlink duality refers to the fact that the Gaussian broadcast channel has the same capacity region as the dual Gaussian multiple-access channel under the same sumpower constraint. This paper investigates a similar duality relationship between the uplink and downlink of a cloud radio access network (C-RAN), where a central processor (CP) cooperatively serves multiple mobile users through multiple remote radio heads (RRHs) connected to the CP with finite-capacity fronthaul links. The uplink of such a C-RAN model corresponds to a multipleaccess relay channel; the downlink corresponds to a broadcast relay channel. This paper considers compression-based relay strategies in both uplink and downlink C-RAN, where the quantization noise levels are functions of the fronthaul link capacities. If the fronthaul capacities are infinite, the conventional uplinkdownlink duality applies. The main result of this paper is that even when the fronthaul capacities are finite, duality continues to hold for the case where independent compression is applied across each RRH in the sense that when the transmission and compression designs are jointly optimized, the achievable rate regions of the uplink and downlink remain identical under the same sum-power and individual fronthaul capacity constraints. As an application of the duality result, the power minimization problem in downlink C-RAN can be efficiently solved based on its uplink counterpart.
△ Less
Submitted 29 June, 2016;
originally announced June 2016.
-
A Processing In-Memory Realization Using QCA: Proposal and Implementation
Authors:
P. P. Chougule,
B. Sen,
R. Mukherjee,
V. C. Karade,
P. S. Patil,
T. D. Dongale,
R. K. Kamat
Abstract:
Processing in Memory (PIM) is a computing paradigm that promises enormous gain in processing speed by eradicating latencies in the typical von Neumann architecture. It has gained popularity owing to its throughput by embedding storage and computation of data in a single unit. We portray implementation of Akers array architecture endowed with PIM computation using Quantum-dot Cellular Automata (QCA…
▽ More
Processing in Memory (PIM) is a computing paradigm that promises enormous gain in processing speed by eradicating latencies in the typical von Neumann architecture. It has gained popularity owing to its throughput by embedding storage and computation of data in a single unit. We portray implementation of Akers array architecture endowed with PIM computation using Quantum-dot Cellular Automata (QCA). We present the proof of concept of PIM with its realization in the QCA designer paradigm. We illustrate implementation of Ex-OR gate with the help of QCA based Akers Array and put forth many interesting potential possibilities.
△ Less
Submitted 6 February, 2016;
originally announced February 2016.
-
Investigating Reliability Aspects of Memristor based RRAM with Reference to Write Voltage and Frequency
Authors:
T. D. Dongale,
K. V. Khot,
S. V. Mohite,
N. K. Desai,
S. S. Shinde,
A. V. Moholkar,
K. Y. Rajpure,
P. N. Bhosale,
P. S. Patil,
P. K. Gaikwad,
R. K. Kamat
Abstract:
In this paper, we report the effect of write voltage and frequency on memristor based Resistive Random Access Memory (RRAM). The above said parameters have been investigated on the linear drift model of memristor. With a variation of write voltage from 0.2V to 1.2V and a subsequent frequency modulation from 1, 2, 4, 10, 100 and 200 Hz the corresponding effects on memory window, Low Resistance Stat…
▽ More
In this paper, we report the effect of write voltage and frequency on memristor based Resistive Random Access Memory (RRAM). The above said parameters have been investigated on the linear drift model of memristor. With a variation of write voltage from 0.2V to 1.2V and a subsequent frequency modulation from 1, 2, 4, 10, 100 and 200 Hz the corresponding effects on memory window, Low Resistance State (LRS) and High Resistance State (HRS) have been reported. Thus the lifetime (τ) reliability analysis of memristor based RRAM is carried out using above results. It is found that, the HRS is independent of write voltage, whereas LRS shows dependency on write voltage and frequency. The simulation results showcase that the memristor possess higher memory window and lifetime (τ) in the higher voltage with lower frequency region, which has been attributed to the fewer data losses in the memory architecture.
△ Less
Submitted 5 February, 2016;
originally announced February 2016.
-
TiO2 based Nanostructured Memristor for RRAM and Neuromorphic Applications: A Simulation Approach
Authors:
T. D. Dongale,
P. J. Patil,
N. K. Desai,
P. P. Chougule,
S. M. Kumbhar,
P. P. Waifalkar,
P. B. Patil,
R. S. Vhatkar,
M. V. Takale,
P. K. Gaikwad,
R. K. Kamat
Abstract:
We report simulation of nanostructured memristor device using piecewise linear and nonlinear window functions for RRAM and neuromorphic applications. The linear drift model of memristor has been exploited for the simulation purpose with the linear and non-linear window function as the mathematical and scripting basis. The results evidences that the piecewise linear window function can aptly simula…
▽ More
We report simulation of nanostructured memristor device using piecewise linear and nonlinear window functions for RRAM and neuromorphic applications. The linear drift model of memristor has been exploited for the simulation purpose with the linear and non-linear window function as the mathematical and scripting basis. The results evidences that the piecewise linear window function can aptly simulate the memristor characteristics pertaining to RRAM application. However, the nonlinear window function could exhibit the nonlinear phenomenon in simulation only at the lower magnitude of control parameter. This has motivated us to propose a new nonlinear window function for emulating the simulation model of the memristor. Interestingly, the proposed window function is scalable up to f(x)=1 and exhibits the nonlinear behavior at higher magnitude of control parameter. Moreover, the simulation results of proposed nonlinear window function are encouraging and reveals the smooth nonlinear change from LRS to HRS and vice versa and therefore useful for the neuromorphic applications.
△ Less
Submitted 25 January, 2016;
originally announced January 2016.
-
Integrated ERP System for Improving the Functional efficiency of the organization by Customized Architecture
Authors:
Suryakant B. Patil,
Vijay S. Suryawanshi,
Dipali V. Suryawanshi,
Preeti Patil
Abstract:
An ERP is a kind of package which consist front end and backend as DBMS like a collection of DBMSs. You can create DBMS to manage one aspect of your business. For example, a publishing house has a database of books that keeps information about books such as Author Name, Title, Translator Name, etc. But this database app only helps enter books data and search them. It doesn't help them, for example…
▽ More
An ERP is a kind of package which consist front end and backend as DBMS like a collection of DBMSs. You can create DBMS to manage one aspect of your business. For example, a publishing house has a database of books that keeps information about books such as Author Name, Title, Translator Name, etc. But this database app only helps enter books data and search them. It doesn't help them, for example, sell books. They get or develop another DBMS database that has all the Books data plus prices, discount formulas, names of common clients, etc. Now they connect the Books database to Sales database and maybe also the inventory database. Now its DBMS slowly turning into an ERP. They may add payroll database and connect it to this ERP. They may develop sales staff and commissions database and connect it to this ERP and so on. In the traditional Database management system the different databases are used for the various Campuses of the JSPM Group of Education like Wagholi Campus, Tathwade Campus, Narhe Campus, Hadpsar Campuses, Bhavdhan Campus as well as Corporate office at Katraj of same organization so it is not possible to keep different databases for the same so in this paper proposed the use of Integrated Database for the Entire organization using ERP system. The Proposed ERP system applied on the existing Architecture of the JSPM Group; the marginal difference observed in the Databases need to be accessed to generate the same number of Reports when use the Traditional DBMS which end up with improvement in the Functional efficiency of Organizational Architecture.
△ Less
Submitted 29 July, 2014;
originally announced July 2014.
-
Layered Constructions for Low-Delay Streaming Codes
Authors:
Ahmed Badr,
Pratik Patil,
Ashish Khisti,
Wai-Tian Tan,
John Apostolopoulos
Abstract:
We propose a new class of error correction codes for low-delay streaming communication. We consider an online setup where a source packet arrives at the encoder every $M$ channel uses, and needs to be decoded with a maximum delay of $T$ packets. We consider a sliding-window erasure channel --- $\cC(N,B,W)$ --- which introduces either up to $N$ erasures in arbitrary positions, or $B$ erasures in a…
▽ More
We propose a new class of error correction codes for low-delay streaming communication. We consider an online setup where a source packet arrives at the encoder every $M$ channel uses, and needs to be decoded with a maximum delay of $T$ packets. We consider a sliding-window erasure channel --- $\cC(N,B,W)$ --- which introduces either up to $N$ erasures in arbitrary positions, or $B$ erasures in a single burst, in any window of length $W$. When $M=1$, the case where source-arrival and channel-transmission rates are equal, we propose a class of codes --- MiDAS codes --- that achieve a near optimal rate. Our construction is based on a {\em layered} approach. We first construct an optimal code for the $\cC(N=1,B,W)$ channel, and then concatenate an additional layer of parity-check symbols to deal with $N>1$. When $M > 1$, the case where source-arrival and channel-transmission rates are unequal, we characterize the capacity when $N=1$ and $W \ge M(T+1),$ and for $N>1$, we propose a construction based on a layered approach. Numerical simulations over Gilbert-Elliott and Fritchman channel models indicate significant gains in the residual loss probability over baseline schemes. We also discuss the connection between the error correction properties of the MiDAS codes and their underlying column distance and column span.
△ Less
Submitted 17 August, 2013;
originally announced August 2013.
-
Performance Evaluation of on demand and Table driven Protocol for Wireless Ad hoc Network
Authors:
V. P. Patil
Abstract:
Mobile Adhoc Network is a wireless network without infrastructure.It is a kind of wireless adhoc network,and is a self configuring network of mobile routers connected by wireless links.The routers are free to move randomly and organize themselves arbitrarily,thus the network's wireless topology may change rapidly and unpredictably. Such a network may operate in a standalone fashion,or may be conne…
▽ More
Mobile Adhoc Network is a wireless network without infrastructure.It is a kind of wireless adhoc network,and is a self configuring network of mobile routers connected by wireless links.The routers are free to move randomly and organize themselves arbitrarily,thus the network's wireless topology may change rapidly and unpredictably. Such a network may operate in a standalone fashion,or may be connected to the larger Internet.There are various routing protocols available for MANET.The most popular ones are DSR,AODV and DSDV.This paper examines two routing protocols for mobile ad hoc networks :the Destination Sequenced Distance Vector,the table driven protocol and the Ad hoc On Demand Distance Vector routing,an On Demand protocol and evaluates both protocols based on packet delivery fraction, average end to end delay,throughput and routing overhead while varying pause time.The performance evaluation has been done by using simulation tool NS2 which is the main simulator, NAM(Network Animator)and excel graph which is used for preparing the graphs from the trace files.Simulation revealed that although DSDV perfectly scales to small networks with low node speeds,AODV is preferred due to its more efficient use of bandwidth.
△ Less
Submitted 8 September, 2012;
originally announced September 2012.
-
Experimentation for Packet Loss on MSP430 and nRF24L01 Based Wireless Sensor Network
Authors:
S. S. Sonavane,
B. P. Patil,
V. Kumar
Abstract:
In this paper,a new design of wireless sensor network (WSN)node is discussed which is based on components with ultra low power.We ha e de eloped a Low cost and low power WSN Node using MSP430 and nRF24L01.The architectural circuit details are presented.This architecture fulfils the requirements like low cost,low power,compact size and self organization.Various tests are carried out to test the per…
▽ More
In this paper,a new design of wireless sensor network (WSN)node is discussed which is based on components with ultra low power.We ha e de eloped a Low cost and low power WSN Node using MSP430 and nRF24L01.The architectural circuit details are presented.This architecture fulfils the requirements like low cost,low power,compact size and self organization.Various tests are carried out to test the performance of the nRF24L01 module.The packet loss,free Space loss (FSL)and battery lifetime calculations are described.These test results will help the researchers to build new applications using abo e node and to work efficiently with nRF24L01.
△ Less
Submitted 14 June, 2010;
originally announced June 2010.
-
Convergence Time Evaluation of Algorithms in MANETs
Authors:
Annapurna P Patil,
Narmada Sambaturu,
Krittaya Chunhaviriyakul
Abstract:
Since the advent of wireless communication, the need for mobile ad hoc networks has been growing exponentially. This has opened up a Pandoras Box of algorithms for dealing with mobile ad hoc networks, or MANETs, as they are generally referred to. Most attempts made at evaluating these algorithms so far have focused on parameters such as throughput, packet delivery ratio, overhead etc. An analysi…
▽ More
Since the advent of wireless communication, the need for mobile ad hoc networks has been growing exponentially. This has opened up a Pandoras Box of algorithms for dealing with mobile ad hoc networks, or MANETs, as they are generally referred to. Most attempts made at evaluating these algorithms so far have focused on parameters such as throughput, packet delivery ratio, overhead etc. An analysis of the convergence times of these algorithms is still an open issue. The work carried out fills this gap by evaluating the algorithms on the basis of convergence time. Algorithms for MANETs can be classified into three categories: reactive, proactive, and hybrid protocols. In this project, we compare the convergence times of representative algorithms in each category, namely Ad hoc On Demand Distance Vector (AODV) reactive, Destination Sequence Distance Vector protocol (DSDV) proactive, and Temporally Ordered Routing Algorithm (TORA) hybrid. The algorithm performances are compared by simulating them in ns2. Tcl is used to conduct the simulations, while perl is used to extract data from the simulation output and calculate convergence time. The design of the experiments carried on is documented using Unified modeling Language. Also, a user interface is created using perl, which enables the user to either run a desired simulation and measure convergence time, or measure the convergence time of a simulation that has been run earlier.
△ Less
Submitted 8 October, 2009;
originally announced October 2009.
-
An Autonomous Distributed Admission Control Scheme for IEEE 802.11 DCF
Authors:
Preetam Patil,
Varsha Apte
Abstract:
Admission control as a mechanism for providing QoS requires an accurate description of the requested flow as well as already admitted flows. Since 802.11 WLAN capacity is shared between flows belonging to all stations, admission control requires knowledge of all flows in the WLAN. Further, estimation of the load-dependent WLAN capacity through analytical model requires inputs about channel data…
▽ More
Admission control as a mechanism for providing QoS requires an accurate description of the requested flow as well as already admitted flows. Since 802.11 WLAN capacity is shared between flows belonging to all stations, admission control requires knowledge of all flows in the WLAN. Further, estimation of the load-dependent WLAN capacity through analytical model requires inputs about channel data rate, payload size and the number of stations. These factors combined point to a centralized admission control whereas for 802.11 DCF it is ideally performed in a distributed manner. The use of measurements from the channel avoids explicit inputs about the state of the channel described above. BUFFET, a model based measurement-assisted distributed admission control scheme for DCF proposed in this paper relies on measurements to derive model inputs and predict WLAN saturation, thereby maintaining average delay within acceptable limits. Being measurement based, it adapts to a combination of data rates and payload sizes, making it completely autonomous and distributed. Performance analysis using OPNET simulations suggests that BUFFET is able to ensure average delay under 7ms at a near-optimal throughput.
△ Less
Submitted 19 May, 2007;
originally announced May 2007.