Sign in to view Mitchell’s full profile
San Francisco, California, United States
Contact Info
2K followers
500+ connections
Experience & Education
- Phantom
  ****** ******** ********
- ******
  **-******* & *** (******** ** *********)
- ******
  ******** ********
- * **********
  ****** ***** (****)
- ********** ** ********
  ********'* ****** ************, ********, *** ********** ***********
Honors & Awards
- Governor General's Award (Rideau Hall)
  Awarded to the student with the highest overall average in his/her secondary school graduating class.
- Canadian Engineering Competition Champion
  First overall at the national level (CEC) for junior design
- Sandford Fleming Foundation Senior Design Competition Award (Sandford Fleming Foundation & University of Waterloo)
Other similar profiles
- Brandon Millman (San Francisco, CA)
- Francesco Agosti (San Francisco, CA)
- Kevin D. Kim (San Francisco Bay Area)
- Anvit Mangal (Málaga)
- Julian Royal (Alpharetta, GA)
- Martin Lundberg (Sweden)
- Laamia Islam (Miami, FL)
- David Ferris (Toronto, ON)
- Crystal Billy (San Francisco, CA)
- Nicolas Cruz (United States)
- Bernard Ma (Greater Los Angeles, CA)
- Ash Bhimasani (New York, NY)
- Gaby (Eliason) Zwerling (San Francisco Bay Area)
- Maxim G. (Antwerp Metropolitan Area)
- Jorge Valdeiglesias (San Francisco, CA)
- Jennifer Yen (San Francisco, CA)
- Carlo Licciardi (Miami Beach, FL)
- Ludovic Landry (San Francisco, CA)
- Kash Pourdeilami (Toronto, ON)
- Fazleem Baig (Halifax, NS)
Explore more posts
-
Srimouli B.
🚀 Weekend Read: Delving into Transformer Reasoning Capabilities with Graph Algorithms 🚀 Looking for a thought-provoking read this weekend? I’m excited to dive into "Understanding Transformer Reasoning Capabilities via Graph Algorithms" by the talented team at Google Research and Columbia University. 🔍 What’s Inside: Transformer Scaling Regimes: Discover how transformers handle different classes of algorithmic problems, especially graph reasoning tasks. Theoretical Foundations: Gain insights into the theoretical underpinnings of transformers’ representational capabilities based on their depth, width, and additional tokens. Empirical Validation: See how transformers perform on graph reasoning tasks using the GraphQA benchmark compared to specialized graph neural networks (GNNs). 📊 Why It’s a Good Read: Innovative Framework: The paper introduces a novel representational hierarchy categorizing graph reasoning tasks by complexity and transformer capabilities. Performance Insights: Learn about the superior performance of transformers in tasks requiring long-range dependency analysis. Practical Impact: Explore the potential real-world applications of these insights in AI, from language modeling to computer vision. Check out the paper in the first comment. Let’s make this weekend one of insightful reading and excellent research! #AI #MachineLearning #Transformers #GraphAlgorithms #WeekendRead #ResearchDiscussion
-
Manuel Romero
📊 Exciting News: TransformerFAM is Revolutionizing Sequence Modeling! 🤖 There's a powerful new architecture on the block that's taking transformer models to the next level. Introducing TransformerFAM - the game-changing approach that incorporates a feedback attention mechanism (FAM) to supercharge transformers with working memory capabilities. 🔑 Key Highlights: - FAM enables transformers to process sequences of unlimited length by attending to their own latent representations - Prompt tuning allows efficient learning of FAM weights with minimal additional parameters - Memory-efficient implementations make FAM computationally feasible even for large-scale models - Experiments show significant performance gains on long-range context and coherence tasks Curious to dive deeper? Check out the groundbreaking research paper linked below: https://lnkd.in/dmhXUuZZ #DataScience #MachineLearning #DeepLearning #TransformerFAM #NLP
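The feedback-attention idea described above (a small working memory that the model attends to, updated as each block is processed) can be illustrated with a toy NumPy sketch. This is my own simplified illustration of block-wise attention with feedback memory, not the paper's actual architecture; all names and shapes here are made up for the example:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(q, k, v):
    """Plain scaled dot-product attention."""
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores) @ v

def fam_forward(x, block_size=4, n_mem=2):
    """Process a long sequence block by block with a feedback memory.

    Each block attends over [current block + memory tokens]; the memory then
    attends over the block's output to carry a compressed summary forward.
    The working memory keeps per-block cost constant no matter how far back
    the context reaches, which is the key property claimed for FAM.
    """
    d = x.shape[-1]
    mem = np.zeros((n_mem, d))                 # feedback working memory
    outputs = []
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        ctx = np.concatenate([block, mem])     # block + memory as keys/values
        out = attend(block, ctx, ctx)
        mem = attend(mem, out, out)            # feedback: memory updates itself
        outputs.append(out)
    return np.concatenate(outputs)

rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))
y = fam_forward(x)    # same length as the input, processed in 4-token blocks
```

The point of the sketch is only the data flow: information from early blocks can reach later blocks solely through the memory tokens.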
-
Shivam K.
🔥🚀 Exciting News from MIT: Introducing FlowMap for Deep Learning-based Structure from Motion! 🔥🚀 MIT unveils FlowMap, a groundbreaking self-supervised, end-to-end differentiable SfM method delivering COLMAP-level accuracy for 360° scenes. FlowMap revolutionizes the process by precisely solving camera poses, intrinsics, and dense depth of video sequences without relying on unique image features or manual matching. This innovation outperforms prior gradient-descent based methods and rivals COLMAP, marking a significant advancement in SfM technology. 👉 Explore the project page, paper, and code: https://lnkd.in/gyyinuTu #Reconstruction #SfM #AI #DeepLearning #Innovation
-
Adam Sullivan
The coding capability of Claude v3 Opus is absolutely insane. I am working on a small project to create a POC app, and in about 4 hours iterating with Opus I have generated nearly 1300 lines of code and 4 different apps/packages. Opus is making calls back to itself for GenAI tasks. I have edited maybe 30 lines of code in total. I am iterating the same way I do when developing... start small and continue to add functionality with each new feature I want. I will take the code base, feed it into Opus, and ask it to add the functionality I want. I'm completely blown away by its ability to generate code and also fix code when I encounter issues/bugs during runtime.
-
Luis Serrano
Happy to finish this series of videos on Reinforcement Learning, geared towards fine-tuning LLMs! Here they are, check them out! Video 0 (optional): Deep Reinforcement Learning and Policy Gradients Video 1: Proximal Policy Optimization (PPO) Video 2: Reinforcement Learning with Human Feedback (RLHF) Video 3: Direct Preference Optimization (DPO) https://lnkd.in/gRdyzAKX
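For readers who want the formula behind the PPO video: the heart of PPO is the clipped surrogate objective. A minimal sketch in plain NumPy (my own illustration, not taken from the videos):

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate objective from the PPO paper.

    ratio     = pi_new(a|s) / pi_old(a|s)  (probability ratio per sample)
    advantage = estimated advantage A(s, a) per sample
    Returns the negative clipped objective, averaged over samples (a loss
    to minimize).
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -np.mean(np.minimum(unclipped, clipped))

# With a positive advantage, a ratio inside [1-eps, 1+eps] passes through,
# but a ratio far above 1+eps is clipped: pushing the policy further in the
# same direction yields no additional objective value.
loss_inside = ppo_clip_loss(np.array([1.1]), np.array([1.0]))   # uses 1.1
loss_outside = ppo_clip_loss(np.array([2.0]), np.array([1.0]))  # clipped at 1.2
```

The clipping is what makes PPO's updates "proximal": it removes the incentive to move the new policy far from the old one in a single step.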
-
Luis Serrano
Somehow I can never say “Proximal Policy Optimization” and “Direct Preference Optimization” without messing up at least one of them. 🙃 Nonetheless, I think🤞🏾I got the math right, so check out these tutorials and learn Principal Policy Optimization and Deterministic Proxy Organisomething. They help computers talk or something like that… 🤣 https://lnkd.in/gRdyzAKX
-
Thierry Moreau
Replacing an expensive generalist LLM like GPT-4 with a cheaper, more specialized model like a fine-tuned Mistral 7B is a tried and true approach to really drive your GenAI expenses down. We've got a handy guide on how to start your LLM cost optimization journey with OpenPipe for fine-tuning and OctoAI for model serving https://lnkd.in/e8udh8de
-
Ramsri Goutham Golla
People ask me what they should attempt next when training Indic LLMs, since there are already a lot of Llama 2/3 and Gemma based models. Two things: 1. Train with DoRA (not plain LoRA). LoRA decomposes the additional trainable parameters into two low-rank matrices (A and B). DoRA goes one step further and adds extra parameters: a magnitude vector and a direction component. The direction component is decomposed into two low-rank trainable matrices (A and B). So in DoRA you have a LoRA-type configuration plus an additional trainable magnitude vector. You get a superior training setup, moving closer to full fine-tuning, with just the magnitude vector's parameters as the overhead. 2. Train with ORPO. The current Indic models released have one big problem: most of them are just SFT (instruction-finetuned) models. So if you ask a question like "how to make a bomb" or "how to kidnap", they don't hesitate to provide an answer. Doing DPO or RLHF is cumbersome given data constraints. ORPO eliminates the need by combining the SFT and DPO/RLHF steps into one. Find a good ORPO dataset, translate it into a specific Indic language (Hindi, Tamil, Telugu, etc.), and finetune so that you combine the instruction and preference alignment tasks into one. #llms #artificialintelligence #generativeai #ai
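The LoRA vs. DoRA parameterizations described above can be sketched in a few lines of NumPy. The shapes and initializations here are toy values for illustration, not a real model:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2               # toy dimensions; real models use d >> r
W0 = rng.normal(size=(d_out, d_in))    # frozen pretrained weight

# LoRA: W = W0 + B @ A, with only the low-rank factors A, B trainable.
A = rng.normal(size=(r, d_in)) * 0.01
B = np.zeros((d_out, r))               # B starts at zero, as in LoRA
W_lora = W0 + B @ A

# DoRA: decompose the weight into magnitude and direction. The direction
# gets the LoRA-style low-rank update, and a per-column magnitude vector m
# is trained on top (the only extra parameters over plain LoRA).
m = np.linalg.norm(W0, axis=0)         # trainable magnitude, initialized from W0
V = W0 + B @ A                         # direction component with low-rank update
W_dora = m * (V / np.linalg.norm(V, axis=0))  # renormalize columns, rescale by m

# With B initialized to zero, both parameterizations start exactly at the
# pretrained weight W0 and diverge only as A, B (and m, for DoRA) are trained.
```

The takeaway matches the post: DoRA costs only one extra vector of parameters over LoRA, but lets magnitude and direction be adapted separately.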
-
Taddeus Buica
So, is it as BIG as they say? 🤭 🚀 Excited to share a new LLM benchmark! RULER 😏 🔍 What happens to model performance as context length increases? RULER by NVIDIA dives deep, testing long-context capabilities in language models. 📉 Many models boast large context sizes, but how many hold up under pressure? How does it work? 🤔 RULER tests models' performance degradation as context length increases and provides a more nuanced understanding of each model's capabilities. It reveals that while many models claim large context sizes, only a few maintain satisfactory performance when tested with RULER's comprehensive and challenging tasks. 🔗 Dive into the full paper: https://lnkd.in/dihNR7ZR 💡 Consider this benchmark when choosing a model for a task that requires a long context length. I've seen Gemini 1.5 Pro's responses with a 1M-token context length degrade after a while. Sometimes using a vector DB with similarity search on indexed content and a better model could lead to better results. Developed by @NVIDIA 👏 Big thanks to the NVIDIA team for pushing the boundaries of AI research! #AI #MachineLearning #NLP #LanguageModels #Innovation
-
Tsung-Hsien Wen
🎙️ Dive into the world of Generative Voice Assistants with us! In our latest episode, we break down the fascinating anatomy of these complex systems and reveal the magic behind the scenes. Curious about the science? Follow us and join the journey as we peel back the layers, one episode at a time! 🔍✨ #VoiceAssistants #TechTalk #BehindTheScenes
-
Danilo T.
if you know, you know NVIDIA annual shareholder meeting today: june 26 2024 09:00 pst in about an hour we'll have a chance to listen to new developments. it's breath-taking. the speed of not only NVIDIA's ability to navigate (well i guess they must be using their own (ai) to decide what where how to do and then some. huh?) more importantly just as Apple Microsoft defined the pervasiveness of computers. their applications. we believe NVIDIA holds that position currently. they have been handsomely rewarded in the markets. a valuation as the most valuable in the world. 3.3 trillys. and going. so every and all announcements are truly where the centre of the universe of chip design, manufacturing, exists. it's more than simply offering state of the art chip design for (ai) modelling. where Alphabet Inc. organizes the world's information. NVIDIA makes sense of it. and everyone else applies it, to their subject matter expertise. this is the change that is afoot. at this moment. where it was a race to offer advertisers of services/products a better understanding of their clientele, now the lens is sharper. there is no need to suggest age/sex/ethnicity/generational differences. there are behaviours. these behaviours are independent of what data was used to point adverts to. to get these people to buy more sh^t. that's the base of the u.s. economy. it's apparently consumer driven. and so in this post pandemic world where it literally stopped. and we were forced to sit still. wait it out as the storm clouds passed. in hopes no one. not us. or loved ones would perish. that time was a generational shift. we all felt. it. no matter what class you were. NVIDIA is the fulcrum Google is the waterfall or Google is the head NVIDIA the neck wherever the neck turns that is where the head has their attention. the neck keeps the attention of the head there. so it appears as if in many ways Google has in essence offered the search part of its business up for grabs.
this is why Perplexity exists, and others. a new dynamic pricing auction model is about to emerge. it will as it is by nature adversarial. let's not fool ourselves. we are designed to be adversarial. it's our nature. and in coveting $$$$$ as the prime objective. money for the sake of money this will destroy a lot of thoughtful, mindful, intentionally beneficial innovation. why? well. it's not playing according to the whims of a recommendation machine that rewards those that are part of its invisible army. today is a crowning. it's a crown. upon the head of jensen. this is akin to king charles and his crowning. we wonder who will be at the crowning. to recognize this eventful day. it's good to be king. ;) < . > source: https://lnkd.in/eumX4num
-
Olivier Mills
Great article by Andrew Ng on context windows and prompting. https://lnkd.in/gMv7hNPt This reminded me of the often-mentioned fact that we have completely underutilized the power of most models out there (I think Matt Shumer is big on this). Inefficient prompting, just defaulting to the high-parameter model to “save our butts” with a predictable outcome, when we just needed to fine-tune our prompts better or bring 2 smaller models together as an agent. The speed at which newer models are coming out is making many devs avoid the work of taking a breath and a moment to deep-dive into existing models. I know this because I fall into this myself.
-
Ivan Nikolaichuk
Language models are revolutionizing how we interact with technology. Let's explore NVIDIA's Nemotron-4 340B, a tool that's been making significant strides in the AI community. Nemotron-4 340B is an open-source language model optimized for instruction-following, reasoning, and mathematical tasks. Its versatility makes it ideal for applications like chatbots and personal assistants. One of its standout features is performance. Recent evaluations show that Nemotron-4 340B outperforms Llama-3-70B, positioning itself as one of the top open models available. Plus, it comes with a permissive license, allowing free use for both commercial and non-commercial purposes. Powered by TensorRT LLM, this model ensures efficient execution of complex computational tasks while maintaining high accuracy levels. Developers can easily integrate it into their projects via the API catalog provided by upstageai. NVIDIA has also shared hyperparameters for base/instruct/reward configurations in the Nemotron-4 340B model. These details offer valuable insights into its workings and allow users to tweak the model to meet specific needs. In summary, Nemotron-4 340B combines high performance, versatility, and user-friendliness. It's worth exploring whether you're an experienced developer or new to AI. #NVIDIA #LanguageModel #AI
-
Baris Aksoy
NVIDIA's Jensen said "ChatGPT democratized computing, Llama 2 democratized generative AI" ...and now Llama 3 is the next level 🔥 It's fascinating to watch Meta's strategic moves. With Llama 3, they prioritized training on a massive 15T-token dataset to pack it all into a lean 70B-param model, instead of building a massive model. This allows Llama 3 to match trillion+ parameter models like GPT-4, but at 1/10th the compute, storage and inference costs! 💰 This technique was published by Google DeepMind a few years ago https://lnkd.in/gx7VU8aA Meta is not a dark horse in AI anymore. They might be the top dog. https://lnkd.in/gRRJYuxt #llama2 #llama3 #llm #chatgpt #gpt4 #ai #ml
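The DeepMind result alluded to here (Chinchilla scaling) is often summarized by the rule of thumb of roughly 20 training tokens per model parameter. A back-of-the-envelope check against the numbers in the post, treating the 20x figure as an approximation rather than an exact law:

```python
# Chinchilla-style rule of thumb: compute-optimal training uses roughly
# 20 training tokens per model parameter (an approximation, not an exact law).
params = 70e9                        # Llama 3 70B
optimal_tokens = 20 * params         # ~1.4e12 tokens would be compute-optimal
actual_tokens = 15e12                # the 15T-token dataset from the post

# Llama 3 70B was trained far past the compute-optimal point:
overtrain_factor = actual_tokens / optimal_tokens   # roughly 10x
```

Training well past compute-optimal trades extra training compute for a smaller model that is much cheaper to store and serve, which is exactly the trade the post describes.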
-
Nick Black
Last night's Ascend Dinner - putting GenerativeAI into production - was another great one. Here are the topics we deep-dived into. 🤖 Agentic Workflows - involve using multiple prompts or GPTs (agents) to review and refine each other's work. Examples like Nvidia's AI agent Voyager have shown how, by passing the output of one LLM prompt, such as Python code, to another LLM prompted to review or enhance it, LLMs can achieve goals with significant complexity. (See https://lnkd.in/eH6fS2Qp and https://lnkd.in/eCxhjKxT) 🐪 Hallucinations - feature or bug? Perhaps hallucinations - when the LLM "makes up" an answer to a question that seems true but isn't - are a feature of an intelligence that knows nothing about facts and thinks only in terms of similarities? What if we could harness LLMs' ability to build connections between seemingly disparate ideas to 20x our creativity? 🏰 Defensibility - of business models in the age of AI has been part of nearly every conversation I've had with founders and investors recently. My conclusion is that AI does little to change the fundamentals. IP (as in copyright and patents on code) has been a weak defense for the last 20 years in nearly all cases. Instead, tech startups have shown how counter positioning (Netflix), network effects (Facebook, WhatsApp, Instagram, Apple, Google), switching costs (Salesforce, Adobe, Nvidia) and branding (Apple, Nvidia) have led to dominance. 🧪 Synthetic Biology - one of my new areas of interest, the combination of LLMs with synth bio lets biologists write genetic sequences like you'd write prose with ChatGPT, then bring those sequences to life inside real living cells. What level of accuracy is needed in this domain to lead to a 5x or 10x improvement in food production, medicine, agriculture?
Thanks to my wonderful guests Alexander Walsh, CFA, Glenn Smith, Adam Martin, Charles Wahab and Jean-Michel Chalayer who shared their expertise and ignorance in equal measure :-) Wondering about the photo? We got so lost in conversation I forgot to take a photo, so I fed ChatGPT this prompt, along with 6 images of my guests and one of the restaurant we were in. "Please help me write a prompt for Dall-E that features these 6 people having dinner in a private dining room in a restaurant like the one in the image." #ai #generativeai #founders #productstrategy
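The generate-then-review loop described under "Agentic Workflows" can be sketched as two prompts wired in sequence. In the sketch below, `call_llm` is a hypothetical stand-in for any chat-completion API, stubbed out so the example runs; the prompts and helper names are made up for illustration:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real chat-completion API call (stubbed)."""
    if "Review" in prompt:
        # The "reviewer" agent: pretend to return an improved version.
        return "REVISED: " + prompt.split("---")[-1].strip()
    # The "author" agent: pretend to draft a solution.
    return "def add(a, b): return a + b"

def agentic_refine(task: str, rounds: int = 2) -> str:
    # Agent 1: draft a solution to the task.
    draft = call_llm(f"Write Python code for: {task}")
    # Agent 2: repeatedly review and refine the previous output --
    # the feedback loop that makes the workflow "agentic".
    for _ in range(rounds):
        draft = call_llm(
            "Review the code below, fix any bugs, and return an improved "
            f"version.\n---\n{draft}"
        )
    return draft

result = agentic_refine("add two numbers")
```

With a real API behind `call_llm`, the same two-role loop is all that is needed to reproduce the draft/review pattern the dinner discussion described.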
-
Aritra Roy Gosthipaty
A small win. We (w/ Ritwik Raha) had wanted to read about diffusion models for some time. After the NeRF tutorial series on PyImageSearch, we were thinking of something out of our comfort zone. We came across DDPM and were excited to see how we could break the paper down for the community. Well, our poor formal math education prohibited us from understanding the paper. We quickly moved on to covering other things, and the idea got shelved. We came back to the idea of diffusion models early this year. There were far more examples from the community to ease our way into it. We decided on taking two approaches: I would go the math route, while Ritwik Raha takes the code route. How do I know that I am on the right track? I run an experiment on myself, note what I feel, and then run the same experiment later. This time I read Lilian Weng's blog post on diffusion models. Notes on Feb 3 2024: need to understand probability theory. Notes on Apr 25 2024: - This is so thorough. - Is the forward process close to CLT? - Equating the reverse process steps to a latent variable model is amazing! I am genuinely happy that I understand something from scratch and can communicate my ideas with people in the field without having to search for words. Expect something from PyImageSearch very soon.
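The forward process the notes refer to has a convenient closed form in the DDPM paper: x_t can be sampled directly from x_0 as sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise. A minimal NumPy sketch with a toy linear schedule (illustrative values, not the paper's exact hyperparameters):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)     # linear noise schedule (toy values)
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)         # cumulative product: alpha_bar_t

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in one shot using the closed form."""
    eps = rng.normal(size=x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = np.ones(4)
x_early = q_sample(x0, 0, rng)      # barely noised: almost exactly x0
x_late = q_sample(x0, T - 1, rng)   # nearly pure Gaussian noise
# alpha_bar decays monotonically towards 0, so the signal term vanishes
# as t -> T; that decay is what the reverse process learns to undo.
```

This closed form is why training never needs to simulate the chain step by step: any timestep can be sampled directly.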
-
Jacob Choi
Qwen2: The latest open source model that surpasses Llama-3! The Qwen Team has just unveiled their new Qwen2 model, licensed under Apache 2.0, allowing for free use. After trying the demo available on Hugging Face, I found that it performs exceptionally well with Korean and Japanese! Qwen2 has outperformed previous open-source state-of-the-art models like Llama3 and Mixtral 8x22B across multiple renowned benchmarks, including MMLU and GPQA. Impressively, it has been trained in 28 languages. This achievement highlights the potential of open source in driving innovation! For more details, visit https://lnkd.in/gmpXeaAh