Showing 1–8 of 8 results for author: Hamdan, S

  1. arXiv:2311.14079  [pdf, other]

    cs.LG stat.ML

    Empirical Comparison between Cross-Validation and Mutation-Validation in Model Selection

    Authors: Jinyang Yu, Sami Hamdan, Leonard Sasse, Abigail Morrison, Kaustubh R. Patil

    Abstract: Mutation validation (MV) is a recently proposed approach for model selection, garnering significant interest due to its unique characteristics and potential benefits compared to the widely used cross-validation (CV) method. In this study, we empirically compared MV and $k$-fold CV using benchmark and real-world datasets. By employing Bayesian tests, we compared generalization estimates yielding th…

    Submitted 15 February, 2024; v1 submitted 23 November, 2023; originally announced November 2023.

  2. arXiv:2311.04179  [pdf]

    cs.LG cs.AI

    On Leakage in Machine Learning Pipelines

    Authors: Leonard Sasse, Eliana Nicolaisen-Sobesky, Juergen Dukart, Simon B. Eickhoff, Michael Götz, Sami Hamdan, Vera Komeyer, Abhijit Kulkarni, Juha Lahnakoski, Bradley C. Love, Federico Raimondo, Kaustubh R. Patil

    Abstract: Machine learning (ML) provides powerful tools for predictive modeling. ML's popularity stems from the promise of sample-level prediction with applications across a variety of fields from physics and marketing to healthcare. However, if not properly implemented and evaluated, ML pipelines may contain leakage typically resulting in overoptimistic performance estimates and failure to generalize to ne…

    Submitted 5 March, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: second draft

  3. arXiv:2310.12568  [pdf, other]

    cs.LG q-bio.NC

    Julearn: an easy-to-use library for leakage-free evaluation and inspection of ML models

    Authors: Sami Hamdan, Shammi More, Leonard Sasse, Vera Komeyer, Kaustubh R. Patil, Federico Raimondo

    Abstract: The fast-paced development of machine learning (ML) methods coupled with its increasing adoption in research poses challenges for researchers without extensive training in ML. In neuroscience, for example, ML can help understand brain-behavior relationships, diagnose diseases, and develop biomarkers using various data sources like magnetic resonance imaging and electroencephalography. The primary…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 13 pages, 5 figures

  4. arXiv:2210.09232  [pdf, other]

    cs.LG cs.AI stat.ML

    Confound-leakage: Confound Removal in Machine Learning Leads to Leakage

    Authors: Sami Hamdan, Bradley C. Love, Georg G. von Polier, Susanne Weis, Holger Schwender, Simon B. Eickhoff, Kaustubh R. Patil

    Abstract: Machine learning (ML) approaches to data analysis are now widely adopted in many fields including epidemiology and medicine. To apply these approaches, confounds must first be removed as is commonly done by featurewise removal of their variance by linear regression before applying ML. Here, we show this common approach to confound removal biases ML models, leading to misleading results. Specifical…

    Submitted 27 October, 2022; v1 submitted 17 October, 2022; originally announced October 2022.

    Comments: Revised Introduction, added CoI, results unchanged

  5. arXiv:2209.07999  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT eess.IV

    Self-Supervised Learning with an Information Maximization Criterion

    Authors: Serdar Ozsoy, Shadi Hamdan, Sercan Ö. Arik, Deniz Yuret, Alper T. Erdogan

    Abstract: Self-supervised learning allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem to many self-supervised learning approaches, making self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In…

    Submitted 16 September, 2022; originally announced September 2022.

    ACM Class: I.2; I.4; I.5

  6. arXiv:2206.04615  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  7. arXiv:2007.13203  [pdf, other]

    cs.DC

    A containerized proof-of-concept implementation of LightChain system

    Authors: Yahya Hassanzadeh-Nazarabadi, Nazir Nayal, Shadi Sameh Hamdan, Öznur Özkasap, Alptekin Küpçü

    Abstract: LightChain is the first Distributed Hash Table (DHT)-based blockchain with a logarithmic asymptotic message and memory complexity. In this demo paper, we present the software architecture of our open-source implementation of LightChain, as well as a novel deployment scenario of the entire LightChain system on a single machine aiming at results reproducibility.

    Submitted 26 July, 2020; originally announced July 2020.

  8. Detecting Sybil Attacks in Vehicular Ad Hoc Networks

    Authors: Salam Hamdan, Amjad Hudaib, Arafat Awajan

    Abstract: Ad hoc networks are vulnerable to numerous attacks due to their infrastructure-less nature; one of these attacks is the Sybil attack. The Sybil attack is a severe attack on vehicular ad hoc networks (VANETs) in which the intruder maliciously claims or steals multiple identities and uses these identities to disturb the functionality of the VANET by disseminating false identities. Many solu…

    Submitted 9 May, 2019; originally announced May 2019.