"The work has really just begun." New Research From Anthropic (The Maker of Claude) - They have the first detailed look inside a modern large language model. -A subfield of AI research: Mechanistic Interpretability Aims to understand how these models work by examining their internal mechanisms Or “Reverse engineer neural networks” - Anthropic -For the first time, Anthropic made significant strides in interpreting AI models, specifically Claude 3 Sonnet, using a technique called "dictionary learning." -Finding Patterns: They identified approximately 10 million patterns, or "features," that represent different concepts within the model. -When these features are triggered they change model output. -This is the first step in understanding models and tracing LLMs from training data to final output.
More Relevant Posts
-
Exploring the Intricacies and Future of AI Systems 🤖✨
1. Understanding AI Complexity: David Bau notes that unlike traditional software, modern AI systems, including those using machine learning and neural networks, often remain elusive even to experts. Their pattern-recognition capabilities far exceed straightforward comprehension.
2. Pioneering Explainable AI (XAI): Efforts are underway to make AI's decisions more transparent through methods like image-part highlighting and decision trees. These tools aim to demystify AI's underlying processes and improve trustworthiness.
3. Decoding Large Language Models (LLMs): With their extensive parameters, LLMs pose interpretability challenges despite their crucial roles in tasks from medical advice to news summarization. Concerns about their reliability and potential biases underline the need for scrutiny.
4. The Drive for LLM Explainability: Researchers strive to enhance AI safety and efficacy by elucidating how LLMs function, thereby meeting user and regulatory demands for dependable outputs.
5. Unraveling LLM Oddities: Dubbed "stochastic parrots," LLMs generate content by remixing past inputs, sometimes displaying unexpected, seemingly intelligent behaviors that puzzle researchers.
6. Dialogues with LLMs: By conversing with these models, scientists draw parallels to psychological methods, uncovering sophisticated responses under specific conditions.
7. Chain-of-Thought Prompting: This technique guides chatbots toward accurate responses by modeling clear reasoning paths, significantly boosting performance (see the toy example after this list).
8. Addressing Social Bias in LLMs: Studies reveal that LLMs can mirror human-like biases, affecting their outputs and decision-making. Users are urged to approach interactions with caution.
9. Neuroscientific Insights into AI: Leveraging neuroscience, researchers can probe LLMs' "neural" activities to better understand and refine their honesty and functionality.
10. AI Explainability and Regulation: Regulatory frameworks like the EU's AI Act insist on transparency, particularly for high-risk AI, pushing companies toward responsible innovation.
#AI #MachineLearning #ExplainableAI #Technology #Regulation #Neuroscience
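As a concrete illustration of point 7, here is a toy example of chain-of-thought prompting. The questions and wording are invented for illustration and not drawn from any particular paper.

```python
# Toy illustration of chain-of-thought prompting: the same question asked
# directly vs. with a worked reasoning path shown as an example.
question = "A train travels 120 km in 2 hours. How far does it go in 5 hours?"

direct_prompt = f"Q: {question}\nA:"

cot_prompt = (
    f"Q: {question}\n"
    "A: Let's think step by step. "
    "First find the speed: 120 km / 2 h = 60 km/h. "
    "Then multiply by the time: 60 km/h * 5 h = 300 km. "
    "So the answer is 300 km.\n\n"
    "Q: A factory makes 45 toys per hour. How many toys does it make in 8 hours?\n"
    "A: Let's think step by step."
)
# Feeding cot_prompt to an LLM tends to elicit the same step-by-step
# reasoning style, which is what boosts accuracy on multi-step problems.
```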
-
Part 2 😘 Continued:

Ann was prompted to attempt to say certain text or to perform a specified action while her brain activity was recorded. Instead of training the AI deep-learning models to identify whole words, the UCSF and UC Berkeley researchers trained the algorithms to decode words from the smallest units of sound in a language, called phonemes. For example, the word "went" has four phonemes: the w, e, n, and t sound units. Using this approach, their AI algorithm was able to decode any word in the English language from just 39 phonemes.

During training of the AI decoders, the researchers used a Connectionist Temporal Classification (CTC) loss function to infer sequences when the exact time alignment between a letter or speech sound (phone) and the speech waveform is not known. Connectionist Temporal Classification is often used for automatic speech recognition tasks: it is a scoring function for neural network output where there may not be an alignment at every timestep between the input and output sequences. "We used CTC loss during training of the text, speech, and articulatory decoding models to enable prediction of phone probabilities, discrete speech-sound units, and discrete articulator movements, respectively, from the ECoG signals," the researchers reported.

According to the researchers, their neurotechnology solution can decode brain signals into text in real time at a median rate of 78 words per minute (WPM). This performance greatly exceeds the 14 WPM of Ann's current assistive device, which requires her to select letters, a much slower process.
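For readers unfamiliar with CTC, here is a minimal PyTorch sketch of the loss being described. The shapes, class indices, and sizes are placeholders, not the study's actual decoder configuration.

```python
# Minimal CTC loss sketch: score a sequence of per-timestep phone
# probabilities against a target phoneme sequence without knowing the
# timestep-level alignment. All sizes here are illustrative placeholders.
import torch
import torch.nn as nn

T, N, C = 50, 1, 40  # timesteps, batch size, classes (39 phonemes + 1 CTC blank)
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=2)  # stand-in decoder output

# Target: the four phonemes of "went" as arbitrary class indices.
targets = torch.tensor([[12, 5, 22, 30]])
input_lengths = torch.tensor([T])
target_lengths = torch.tensor([4])

ctc = nn.CTCLoss(blank=0)  # class 0 reserved for the CTC "blank" token
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()  # gradients flow back into the decoding network
```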
-
NEW STATE-OF-THE-ART ML METHOD FROM Artefact UK You need to carefully choose where to start from when you train #artificialneuralnetworks, as this will impact how well and how fast your model learns its task. Working with Autoencoders (a type of Artificial Neural Network used mainly for anomaly detection and dimensionality reduction) we have demonstrated via experiments that our method - the Straddled Matrix Initialiser - provides better results faster. It not only improves accuracy but also saves time and $$$ spent on training AI systems. Full description of the R&D Marcel Marais, Mate Hartstein, and I have done is available here: https://lnkd.in/dD4U4n6Z #datascience #ai #machinelearning Artefact
Using linear initialisation to improve speed of convergence and fully-trained error in Autoencoders
arxiv.org
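For context on what "choosing where to start from" means in code, here is a hypothetical PyTorch sketch of applying a custom, identity-like linear initialisation to an autoencoder. This is a generic stand-in to show the mechanics, not the Straddled Matrix Initialiser itself; the paper linked above defines the actual method.

```python
# Sketch of applying a custom weight initialiser to an autoencoder.
# The identity-like "linear" start below is a generic illustration,
# NOT the paper's Straddled Matrix Initialiser.
import torch
import torch.nn as nn

def linear_init(layer: nn.Linear) -> None:
    """Start the layer close to a linear (identity-like) map."""
    with torch.no_grad():
        nn.init.eye_(layer.weight)  # identity on the square part, zeros elsewhere
        layer.bias.zero_()

autoencoder = nn.Sequential(
    nn.Linear(64, 16), nn.ReLU(),  # encoder
    nn.Linear(16, 64),             # decoder
)
for m in autoencoder.modules():
    if isinstance(m, nn.Linear):
        linear_init(m)  # replaces PyTorch's default Kaiming-uniform start
```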
-
AI terms explained: Artificial neuron

An artificial neuron, often referred to as a "perceptron," is a fundamental computational unit inspired by (but much simpler than) the way biological neurons work in the human brain. It takes multiple input values, applies individual weights to them, sums up these weighted inputs, and then passes the result through an activation function. The process can be broken down into several steps:

1. Input values: An artificial neuron receives input values, which represent various features or characteristics of the data being processed.
2. Weights: Each input is associated with a weight, which indicates the significance or influence of that input on the neuron's output.
3. Weighted sum: The neuron calculates the weighted sum of its inputs by multiplying each input by its corresponding weight and then summing up these products.
4. Activation function: The weighted sum is then passed through an activation function. This function introduces non-linearity into the neuron's behavior and determines whether the neuron "fires" (activates) or remains dormant based on the calculated result.

Artificial neurons are the foundation of neural networks, which consist of layers of interconnected neurons. The connections between neurons, characterized by the weights, are learned during training using optimization techniques. Through this process, neural networks can learn to perform tasks such as image recognition and natural language processing by adjusting the weights to minimize prediction errors. A minimal numeric sketch of the four steps follows below.

#ai #neuralnetworks #ann
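Here is a minimal numeric sketch of the four steps above; the inputs, weights, and sigmoid activation are arbitrary illustrative choices, and it adds a bias term, which most formulations include in the weighted sum.

```python
# Toy artificial neuron: weighted sum of inputs (plus a bias term),
# passed through a sigmoid activation. All numbers are arbitrary.
import math

def neuron(inputs, weights, bias):
    # Steps 1-3: multiply each input by its weight and sum the products
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Step 4: activation function; sigmoid squashes z into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

output = neuron(inputs=[0.5, 0.3, 0.9], weights=[0.4, -0.2, 0.7], bias=0.1)
print(f"neuron output: {output:.3f}")  # close to 1 when z is large and positive
```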
-
AI-generated data will train new AI models. Huang explores the concept of AI training itself by generating its own data, drawing parallels to how humans learn through self-reflection and problem-solving. Just as neuroscientists emphasize the importance of sleep in neural development, the idea is to facilitate AI's growth by allowing it to learn from its own experiences. This approach could potentially lead to AI models that are more adaptive, resilient, and capable of solving complex problems autonomously. Through this self-training mechanism, AI could accelerate its own evolution, marking a significant milestone in the trajectory of artificial intelligence.
-
Another sense in which AI is being trained to be better than humans. While it may make it harder to hide in an AI apocalypse, it may also make finding the perfect fruit or flower easier...

The Rundown AI: Researchers have created an AI model that can predict how a chemical will smell just from its molecular structure, matching human testers.

Key points:
- A neural network successfully inferred the scents of 400 mystery chemicals, also generating odor profiles for 500,000 hypothetical molecules.
- Perceived smells reportedly vary greatly, but the model learned to link atomic features to smells from training data and got closer to 'human averages' than any participant in the group.
- The findings could speed searches for better-smelling consumer products, though the researchers note the next frontier for study is discerning 'mixtures of molecules' found in the real world.

https://lnkd.in/emBMwyVy
A principal odor map unifies diverse tasks in olfactory perception
science.org
-
To assess physical systems, we use CAT scans. To assess deep neural networks, this paper presents the LAT scan. LAT = Linear Artificial Tomography.

It seems reasonable that to interpret deep neural networks we should look at network properties as a whole. This paper builds a model to represent arbitrary internal states of a transformer in order to assess or interpret the characteristics of its output. It goes through three steps:
1) Design a specific task/stimulus
2) Collect activation data
3) Build a linear model relating activation data to task/output pairs

For example, if the task is to state a specific fact, the output may be true or a hallucination. In each case, activations are collected and composed into linear models of "honesty" or "deception". This can be done for an arbitrary number of tasks or features.

To validate that the models are useful interpretations, the authors perform a series of steps:
1) Determine the correlation of the model with the outputs
2) Manipulate the model by adding/subtracting activations during inference to modulate the output
3) Remove the activations (e.g. to eliminate "deception" as a feature) and recover the behavior by adding them back in

Overall, this paper demonstrates a mechanism for both modelling and potentially arbitrarily modifying the alignment of a transformer neural network toward any arbitrary bias or behavior (a toy sketch of the core idea follows the link below). It still somewhat anthropomorphizes this behavior, which seems incorrect. The most important next step is to determine what the desired values or attributes are, design a battery of standards to assess any given network, and be able to assess how networks may be artificially manipulated after training. #ai #aisafety
Representation Engineering: A Top-Down Approach to AI Transparency
arxiv.org
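To make the three-step recipe concrete, here is a toy NumPy sketch of extracting a linear "honesty" direction from activations and using it to steer a hidden state. All data is synthetic, and the paper's actual pipeline differs in detail (it uses, among other techniques, PCA on paired activation differences rather than a simple difference of means).

```python
# Toy "reading vector" sketch: collect activations under two contrasting
# conditions, fit a linear direction separating them, then add that
# direction at inference to steer behaviour. Synthetic data throughout.
import numpy as np

rng = np.random.default_rng(0)
d = 256  # hidden-state dimensionality (placeholder)

# Step 2: activations collected under "honest" vs "deceptive" stimuli (synthetic)
honest = rng.normal(0.0, 1.0, (100, d)) + 0.5
deceptive = rng.normal(0.0, 1.0, (100, d)) - 0.5

# Step 3: a simple linear model -- the difference of class means -- as
# the "honesty" direction
direction = honest.mean(axis=0) - deceptive.mean(axis=0)
direction /= np.linalg.norm(direction)

# Validation 1: correlate projections onto the direction with the labels
scores = np.concatenate([honest, deceptive]) @ direction
labels = np.array([1] * 100 + [0] * 100)
print("separation accuracy:", ((scores > 0) == labels).mean())

# Validation 2: steer a new hidden state by adding the direction back in
h = rng.normal(0.0, 1.0, d)
h_steered = h + 2.0 * direction  # push the state toward "honest"
```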
-
AI seems capable of self-learning and self-improvement. A point worth considering.
Brain Inspired AI Learns Like Humans A new AI model, inspired by the human brain, enables real-time learning and adjustment. This innovation could revolutionize AI efficiency and accessibility.
Brain Inspired AI Learns Like Humans - Neuroscience News
https://neurosciencenews.com
-
Weight initialisation strategies can dramatically reduce training cost in neural networks. Interesting paper based on R&D done at Artefact speeding up the training of Autoencoders. See link below!
Using linear initialisation to improve speed of convergence and fully-trained error in Autoencoders
arxiv.org
-
Decomposing Language Models Into Understandable Components | Careers | Communications of the ACM: Researchers have presented examples of the dataset inputs that activate entries in the learned dictionaries, and the effects those entries have. Neural networks are trained on data to improve their performance, but it is still unclear why certain behaviors emerge. The paper suggests that there may be better units of analysis than individual neurons: "features," which correspond to patterns of neuron activations (a toy sketch follows the link below). - Artificial Intelligence topics! #ai #artificialintelligence #intelligenzaartificiale
Decomposing Language Models Into Understandable Components
cacm.acm.org
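As a rough illustration of treating features rather than neurons as the unit of analysis, here is a toy NumPy sketch that scores an activation vector against a learned dictionary of feature directions. Everything here is synthetic and hypothetical.

```python
# Toy sketch: given a learned dictionary (one direction per feature),
# find the features that best explain an activation vector. Synthetic
# data only; real dictionaries come from training, e.g. a sparse autoencoder.
import numpy as np

rng = np.random.default_rng(1)
n_features, d_model = 1000, 128
dictionary = rng.normal(size=(n_features, d_model))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)  # unit feature directions

activation = rng.normal(size=d_model)  # stand-in for one activation vector
scores = dictionary @ activation       # how strongly each feature direction fires
top = np.argsort(scores)[-5:][::-1]
print("top features:", top, scores[top])  # candidates for human inspection and labelling
```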
Digital Product Designer, Entrepreneur
I've been fascinated by the work of the Anthropic team, specifically their focus on introspection. I avoid using the term "Mechanistic Interpretability," as it tends to confuse people. Instead, I explain that the creators of LLMs largely don't understand how the neural networks function, but they do have some insights. They are developing tools to observe how an LLM connects information and generates a response, similar to an MRI machine for an LLM. While people often find this intriguing and ask further complex questions, I always attempt to provide simple answers. It's exciting to see Anthropic making progress in this area.