Mountain View, California, United States
3K followers
500+ connections
About
Activity
-
Vijay Krishnamurthy and I are thrilled to introduce the founding engineers at alloan.ai. Each of them brings a unique set of skills, experiences, and…
Liked by Yanqi Zhou
-
Super thrilled to share our latest work, AlphaGeometry from Google DeepMind, the first AI system ever approaching the IMO gold medalists in solving…
Liked by Yanqi Zhou
-
It took us ~3.5 years to finally publish this paper on Nature, which started in January 2020 as an effort to fight COVID. https://lnkd.in/gg8Axfjm…
Liked by Yanqi Zhou
Experience & Education
Volunteer Experience
-
Volunteer
Shanghai Science and Technology Museum
- 2 months
Social Services
Publications
-
CASH: Supporting IaaS Customers with a Sub-core Configurable Architecture
ACM/IEEE ISCA
CASH is a sub-core configurable architecture co-designed with a control-theory-based runtime. It lets IaaS customers configure their virtual cores to meet QoS targets while minimizing cost.
-
MITTS: Memory Inter-arrival Time Traffic Shaping
ACM/IEEE ISCA
MITTS is a distributed hardware mechanism that shapes memory transaction inter-arrival times into a pre-determined distribution on a per-core or per-thread basis. Like conventional memory scheduling algorithms, MITTS improves system throughput and fairness. It also enables fine-grained memory bandwidth provisioning in an IaaS cloud, which improves economic efficiency.
-
The Sharing Architecture: Sub-core Configurability for IaaS Clouds
ACM ASPLOS
We design a configurable architecture on a general-purpose fabric. The Sharing Architecture lets us configure a virtual core with varying numbers of ALUs and amounts of cache. Unlike conventional composable architectures, the Sharing Architecture does not rely on compiler support or a new ISA. A full chip is composed of hundreds of Slices and L2 cache banks.
-
Transferable Graph Optimizers for ML Compilers
NeurIPS 2020
Most compilers for machine learning (ML) frameworks need to solve many correlated optimization problems to generate efficient machine code. Current ML compilers rely on heuristic-based algorithms to solve these optimization problems one at a time. However, this approach is not only hard to maintain but often leads to sub-optimal solutions, especially for newer model architectures. Existing learning-based approaches in the literature are sample inefficient, tackle a single optimization problem, and do not generalize to unseen graphs, making them infeasible to deploy in practice. To address these limitations, we propose an end-to-end, transferable deep reinforcement learning method for computational graph optimization (GO), based on a scalable sequential attention mechanism over an inductive graph neural network. GO generates decisions on the entire graph rather than on each individual node autoregressively, drastically speeding up the search compared to prior methods. Moreover, we propose recurrent attention layers to jointly optimize dependent graph optimization tasks and demonstrate 33%-60% speedup on three graph optimization tasks compared to TensorFlow default optimization. On a diverse set of representative graphs consisting of up to 80,000 nodes, including Inception-v3, Transformer-XL, and WaveNet, GO achieves on average 21% improvement over human experts and 18% improvement over the prior state of the art with 15x faster convergence, on a device placement task evaluated in real systems.
Courses
-
Analog Circuits
-
Big Data Analytics
-
Compiler
-
Computer Architecture
-
Computer Network
-
Computer Organizations
-
Data Structures and Algorithms
-
Digital Circuits
-
Finance and Investment
-
German
-
Introduction to Programming
-
Linear Algebra
-
Mathematics of Finance
-
Microeconomics
-
Numerical Analysis
-
Operating System
-
Parallel Computing
-
Semiconductor and Devices
-
Theory of Algorithm
-
VLSI
Honors & Awards
-
Princeton Wu Fellowship
Princeton University
-
Microsoft PhD Fellow
Microsoft Research
I was selected as a Microsoft PhD Fellow in 2014.
Languages
-
English
Full professional proficiency
-
Chinese
Native or bilingual proficiency
-
German
Elementary proficiency
More activity by Yanqi
-
I started my journey with Facebook/Meta in December, 2009. After 14 years, I made a hard decision to leave Meta in Dec 2023. I joined Facebook/Meta…
Liked by Yanqi Zhou
-
We are hiring interns and fulltime researchers to join the LLM and Foundation Model research team. Details below, please feel free to ping me if you…
Liked by Yanqi Zhou
-
ECE assistant professor and Y. T. Lo Faculty Fellow Jian Huang has won the inaugural ACM SIGMICRO Early Career Award for pioneering contributions to…
Liked by Yanqi Zhou
-
I am super excited to share that we are introducing a new addition to the Firefly family of models - Generative Fill. After months of intense…
Liked by Yanqi Zhou
-
A NYT article on the debate around whether LLM base models should be closed or open. Meta argues for openness, starting with the release of LLaMA…
Liked by Yanqi Zhou
-
Grad student Grigory Chirkov and his adviser, David Wentzlaff, have developed an open-source platform to make high-tech chips easier and cheaper to…
Liked by Yanqi Zhou
-
Had a wonderful time giving a talk on "The Processor: A Journey to the Brain Behind Our Technology" at my daughters' elementary school last…
Liked by Yanqi Zhou
-
Congrats to my friend Sharon Zhou and Greg Diamos for launching Lamini, the LLM engine that gives every developer the superpowers that took the world…
Liked by Yanqi Zhou
-
I was honored to be invited to the CMU Crossroads Seminar series and share my thoughts on how AMD Versal ACAP architecture with #FPGA + #asics works…
Liked by Yanqi Zhou
-
After 10+ years, I have decided to leave Microsoft and start something new. I will reveal what's next later when we are ready, but I wanted to take…
Liked by Yanqi Zhou
-
Introducing WebLLM, an open-source chatbot that brings language models (LLMs) directly onto web browsers. We can now run instruction fine-tuned…
Liked by Yanqi Zhou
-
New PaLM API launched! 🔥 This was an incredible team effort and I was glad to contribute by helping to co-lead the LLM modeling (architecture and…
Liked by Yanqi Zhou