-
GPT-4 Understands Discourse at Least as Well as Humans Do
Authors:
Thomas Shultz,
Jamie Wise,
Ardavan Salehi Nobandegani
Abstract:
We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief stories and then answer eight yes/no questions probing their comprehension of the story. The questions are formatted to assess the separate impacts of directness (stated vs. implied) and salience (main idea vs. details). G…
▽ More
We test whether a leading AI system GPT-4 understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief stories and then answer eight yes/no questions probing their comprehension of the story. The questions are formatted to assess the separate impacts of directness (stated vs. implied) and salience (main idea vs. details). GPT-4 performs slightly, but not statistically significantly, better than humans given the very high level of human performance. Both GPT-4 and humans exhibit a strong ability to make inferences about information that is not explicitly stated in a story, a critical test of understanding.
△ Less
Submitted 25 March, 2024;
originally announced March 2024.
-
Towards Machines that Trust: AI Agents Learn to Trust in the Trust Game
Authors:
Ardavan S. Nobandegani,
Irina Rish,
Thomas R. Shultz
Abstract:
Widely considered a cornerstone of human morality, trust shapes many aspects of human social interactions. In this work, we present a theoretical analysis of the $\textit{trust game}$, the canonical task for studying trust in behavioral and brain sciences, along with simulation results supporting our analysis. Specifically, leveraging reinforcement learning (RL) to train our AI agents, we systemat…
▽ More
Widely considered a cornerstone of human morality, trust shapes many aspects of human social interactions. In this work, we present a theoretical analysis of the $\textit{trust game}$, the canonical task for studying trust in behavioral and brain sciences, along with simulation results supporting our analysis. Specifically, leveraging reinforcement learning (RL) to train our AI agents, we systematically investigate learning trust under various parameterizations of this task. Our theoretical analysis, corroborated by the simulations results presented, provides a mathematical basis for the emergence of trust in the trust game.
△ Less
Submitted 20 December, 2023;
originally announced December 2023.
-
Cognitive Models as Simulators: The Case of Moral Decision-Making
Authors:
Ardavan S. Nobandegani,
Thomas R. Shultz,
Irina Rish
Abstract:
To achieve desirable performance, current AI systems often require huge amounts of training data. This is especially problematic in domains where collecting data is both expensive and time-consuming, e.g., where AI systems require having numerous interactions with humans, collecting feedback from them. In this work, we substantiate the idea of $\textit{cognitive models as simulators}$, which is to…
▽ More
To achieve desirable performance, current AI systems often require huge amounts of training data. This is especially problematic in domains where collecting data is both expensive and time-consuming, e.g., where AI systems require having numerous interactions with humans, collecting feedback from them. In this work, we substantiate the idea of $\textit{cognitive models as simulators}$, which is to have AI systems interact with, and collect feedback from, cognitive models instead of humans, thereby making their training process both less costly and faster. Here, we leverage this idea in the context of moral decision-making, by having reinforcement learning (RL) agents learn about fairness through interacting with a cognitive model of the Ultimatum Game (UG), a canonical task in behavioral and brain sciences for studying fairness. Interestingly, these RL agents learn to rationally adapt their behavior depending on the emotional state of their simulated UG responder. Our work suggests that using cognitive models as simulators of humans is an effective approach for training AI systems, presenting an important way for computational cognitive science to make contributions to AI.
△ Less
Submitted 8 October, 2022;
originally announced October 2022.
-
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Authors:
Aarohi Srivastava,
Abhinav Rastogi,
Abhishek Rao,
Abu Awal Md Shoeb,
Abubakar Abid,
Adam Fisch,
Adam R. Brown,
Adam Santoro,
Aditya Gupta,
Adrià Garriga-Alonso,
Agnieszka Kluska,
Aitor Lewkowycz,
Akshat Agarwal,
Alethea Power,
Alex Ray,
Alex Warstadt,
Alexander W. Kocurek,
Ali Safaya,
Ali Tazarv,
Alice Xiang,
Alicia Parrish,
Allen Nie,
Aman Hussain,
Amanda Askell,
Amanda Dsouza
, et al. (426 additional authors not shown)
Abstract:
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…
▽ More
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 450 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
△ Less
Submitted 12 June, 2023; v1 submitted 9 June, 2022;
originally announced June 2022.
-
Bringing Order to the Cognitive Fallacy Zoo
Authors:
Ardavan S. Nobandegani,
William Campoli,
Thomas R. Shultz
Abstract:
In the eyes of a rationalist like Descartes or Spinoza, human reasoning is flawless, marching toward uncovering ultimate truth. A few centuries later, however, culminating in the work of Kahneman and Tversky, human reasoning was portrayed as anything but flawless, filled with numerous misjudgments, biases, and cognitive fallacies. With further investigations, new cognitive fallacies continually em…
▽ More
In the eyes of a rationalist like Descartes or Spinoza, human reasoning is flawless, marching toward uncovering ultimate truth. A few centuries later, however, culminating in the work of Kahneman and Tversky, human reasoning was portrayed as anything but flawless, filled with numerous misjudgments, biases, and cognitive fallacies. With further investigations, new cognitive fallacies continually emerged, leading to a state of affairs which can fairly be characterized as the cognitive fallacy zoo! In this largely methodological work, we formally present a principled way to bring order to this zoo. We introduce the idea of establishing implication relationships (IRs) between cognitive fallacies, formally characterizing how one fallacy implies another. IR is analogous to, and partly inspired by, the fundamental concept of reduction in computational complexity theory. We present several examples of IRs involving experimentally well-documented cognitive fallacies: base-rate neglect, availability bias, conjunction fallacy, decoy effect, framing effect, and Allais paradox. We conclude by discussing how our work: (i) allows for identifying those pivotal cognitive fallacies whose investigation would be the most rewarding research agenda, and importantly (ii) permits a systematized, guided research program on cognitive fallacies, motivating influential theoretical as well as experimental avenues of future research.
△ Less
Submitted 15 October, 2018;
originally announced October 2018.
-
Over-representation of Extreme Events in Decision-Making: A Rational Metacognitive Account
Authors:
Ardavan S. Nobandegani,
Kevin da Silva Castanheira,
A. Ross Otto,
Thomas R. Shultz
Abstract:
The Availability bias, manifested in the over-representation of extreme eventualities in decision-making, is a well-known cognitive bias, and is generally taken as evidence of human irrationality. In this work, we present the first rational, metacognitive account of the Availability bias, formally articulated at Marr's algorithmic level of analysis. Concretely, we present a normative, metacognitiv…
▽ More
The Availability bias, manifested in the over-representation of extreme eventualities in decision-making, is a well-known cognitive bias, and is generally taken as evidence of human irrationality. In this work, we present the first rational, metacognitive account of the Availability bias, formally articulated at Marr's algorithmic level of analysis. Concretely, we present a normative, metacognitive model of how a cognitive system should over-represent extreme eventualities, depending on the amount of time available at its disposal for decision-making. Our model also accounts for two well-known framing effects in human decision-making under risk---the fourfold pattern of risk preferences in outcome probability (Tversky & Kahneman, 1992) and in outcome magnitude (Markovitz, 1952)---thereby providing the first metacognitively-rational basis for those effects. Empirical evidence, furthermore, confirms an important prediction of our model. Surprisingly, our model is unimaginably robust with respect to its focal parameter. We discuss the implications of our work for studies on human decision-making, and conclude by presenting a counterintuitive prediction of our model, which, if confirmed, would have intriguing implications for human decision-making under risk. To our knowledge, our model is the first metacognitive, resource-rational process model of cognitive biases in decision-making.
△ Less
Submitted 29 January, 2018;
originally announced January 2018.
-
Converting Cascade-Correlation Neural Nets into Probabilistic Generative Models
Authors:
Ardavan Salehi Nobandegani,
Thomas R. Shultz
Abstract:
Humans are not only adept in recognizing what class an input instance belongs to (i.e., classification task), but perhaps more remarkably, they can imagine (i.e., generate) plausible instances of a desired class with ease, when prompted. Inspired by this, we propose a framework which allows transforming Cascade-Correlation Neural Networks (CCNNs) into probabilistic generative models, thereby enabl…
▽ More
Humans are not only adept in recognizing what class an input instance belongs to (i.e., classification task), but perhaps more remarkably, they can imagine (i.e., generate) plausible instances of a desired class with ease, when prompted. Inspired by this, we propose a framework which allows transforming Cascade-Correlation Neural Networks (CCNNs) into probabilistic generative models, thereby enabling CCNNs to generate samples from a category of interest. CCNNs are a well-known class of deterministic, discriminative NNs, which autonomously construct their topology, and have been successful in giving accounts for a variety of psychological phenomena. Our proposed framework is based on a Markov Chain Monte Carlo (MCMC) method, called the Metropolis-adjusted Langevin algorithm, which capitalizes on the gradient information of the target distribution to direct its explorations towards regions of high probability, thereby achieving good mixing properties. Through extensive simulations, we demonstrate the efficacy of our proposed framework.
△ Less
Submitted 18 January, 2017;
originally announced January 2017.
-
Neural Implementation of Probabilistic Models of Cognition
Authors:
Milad Kharratzadeh,
Thomas R. Shultz
Abstract:
Bayesian models of cognition hypothesize that human brains make sense of data by representing probability distributions and applying Bayes' rule to find the best explanation for available data. Understanding the neural mechanisms underlying probabilistic models remains important because Bayesian models provide a computational framework, rather than specifying mechanistic processes. Here, we propos…
▽ More
Bayesian models of cognition hypothesize that human brains make sense of data by representing probability distributions and applying Bayes' rule to find the best explanation for available data. Understanding the neural mechanisms underlying probabilistic models remains important because Bayesian models provide a computational framework, rather than specifying mechanistic processes. Here, we propose a deterministic neural-network model which estimates and represents probability distributions from observable events --- a phenomenon related to the concept of probability matching. Our model learns to represent probabilities without receiving any representation of them from the external world, but rather by experiencing the occurrence patterns of individual events. Our neural implementation of probability matching is paired with a neural module applying Bayes' rule, forming a comprehensive neural scheme to simulate human Bayesian learning and inference. Our model also provides novel explanations of base-rate neglect, a notable deviation from Bayes.
△ Less
Submitted 14 April, 2016; v1 submitted 13 January, 2015;
originally announced January 2015.
-
Managing Uncertainty in Rule Based Cognitive Models
Authors:
Thomas R. Shultz
Abstract:
An experiment replicated and extended recent findings on psychologically realistic ways of modeling propagation of uncertainty in rule based reasoning. Within a single production rule, the antecedent evidence can be summarized by taking the maximum of disjunctively connected antecedents and the minimum of conjunctively connected antecedents. The maximum certainty factor attached to each of the r…
▽ More
An experiment replicated and extended recent findings on psychologically realistic ways of modeling propagation of uncertainty in rule based reasoning. Within a single production rule, the antecedent evidence can be summarized by taking the maximum of disjunctively connected antecedents and the minimum of conjunctively connected antecedents. The maximum certainty factor attached to each of the rule's conclusions can be sealed down by multiplication with this summarized antecedent certainty. Heckerman's modified certainty factor technique can be used to combine certainties for common conclusions across production rules.
△ Less
Submitted 27 March, 2013;
originally announced April 2013.