The Data Scientist Show

Daliana Liu
A deep dive into data scientists' day-to-day work, the tools and models they use, how they tackle problems, and their career journeys. This podcast helps you grow a successful career in data science. Listening to an episode is like having lunch with an experienced mentor. Guests are data science practitioners from various industries, AI researchers, economists, and CTOs of AI companies. Host: Daliana Liu, an ex-Amazon senior data scientist with 180k followers on LinkedIn.
Join 20k subscribers at www.dalianaliu.com to learn more about data science, career, and this show. Twitter @DalianaLiu.

All Episodes

Daliana interviewed 6 data scientists from her meetup in New York City. It's a unique episode where you get to hear the real frustrations of data scientists. We talked about struggles working in healthcare, finance, data quality and AI, how to advocate for yourself, and how to align with your managers. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/

Apr 17

42 min 22 sec

Most experiments fail. Kristi Angel shares her expertise on scaling experimentation and avoiding common A/B testing pitfalls. Learn five things that can help you boost test velocity, design impactful experiments, and leverage knowledge repos. (Chapters below) Kristi Angel’s LinkedIn: https://www.linkedin.com/in/kristiangel/ Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Intro (00:01:26) Why do most experiments fail? (00:07:05) Mistakes in choosing metrics (00:10:05) Is revenue a good metric? (00:13:18) Split metrics in three ways (00:15:10) Daliana's story with too many category breakdowns (00:16:59) What makes the best data science team? (00:19:24) Data scientists working in a silo vs in a data science team (00:21:15) Building a knowledge center (00:23:40) Example of knowledge center; nuance of experimentation (00:26:09) How many metrics and variants? (00:30:56) How to reduce noise - CUPED (00:33:01) Future of A/B testing (00:38:33) Q&A: Low statistical power

Apr 8

43 min 46 sec

Julia Silge is an engineering manager at Posit PBC, formerly known as RStudio, where she leads a team of developers building open source software for MLOps. Before Posit, she finished a PhD in astrophysics, worked for several years in the nonprofit space, and was a data scientist at Stack Overflow where some of her most public work involved the annual developer survey. We talked about MLOps tools, challenges in survey data, text analysis, and balancing her interests in data science and engineering. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:00:56) Getting into data science (00:04:50) Transition from data centers to engineering manager (00:14:04) Common challenges in tool development (00:17:38) Challenges with survey data (00:26:47) Engineering skills for data scientists (00:28:59) Balancing roles (00:34:49) Developing skills in Exploratory Data Analysis (EDA) (00:39:19) Python vs. R for data analysis (00:44:40) Exciting aspects in career and personal life

Mar 30

46 min 18 sec

Wes McKinney is the co-creator of the pandas library and the cofounder of Voltron Data. Currently he is a Principal Architect at Posit and an investor in data systems. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Wes' LinkedIn: https://www.linkedin.com/in/wesmckinn/ (00:00:00) Introduction (00:00:44) How Pandas Started (00:06:40) Voltron Data (00:10:03) Benefits of Easy-to-Use Data Tools (00:13:20) The Rise of New Data Tools (00:18:07) Choosing Tools: Vertical or Flexible? (00:23:01) Big Models and Data Tools (00:29:29) Challenges in Building a Product (00:31:28) Becoming a Top Architect (00:34:55) Missed Aspects of Previous Roles (00:39:04) A Busy Week: Advising, Designing, Investing (00:43:42) Improving Open Source (00:45:24) How to Decide What to Work On (00:46:28) What he’s learning now (00:47:56) Excitement in Career and Life (00:48:29) Using ChatGPT for Learning (00:50:27) Future Impact Goals

Mar 22

52 min 24 sec

Christopher Fricker is a senior director in analytics and BI at Renaissance Learning. He started his career in finance and later became a data science consultant working with Meta, Netflix, and pre-IPO tech companies doing analytics. We talked about the mental models that helped him grow from a finance analyst to an analytics leader. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Chris’ LinkedIn: https://www.linkedin.com/in/christopherfricker/ Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:01:46) How to get promoted quickly (00:08:40) Power vs authority (00:11:21) First principles thinking (00:32:34) ROI of a data team (00:40:29) How to be persuasive (00:54:52) All data is wrong (00:56:22) How he audits the data (01:00:52) How to make someone help you at work

Mar 15

1 hr 13 min

I interviewed Geoffrey Angus, ML team lead @Predibase, to talk about why adapter-based training is a game changer. We started with an overview of fine-tuning and then discussed five reasons why adapters are the future of LLMs. Later we also shared a demo and answered questions from the live audience. Try fine-tuning for free: https://pbase.ai/GetStarted Geoffrey’s LinkedIn: https://www.linkedin.com/in/geoffreyangus Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Intro (00:01:19) What is Fine-tuning? (00:08:18) Utilizing Adapters for Fine-tuning Enhancement (00:09:50) 5 reasons why adapters are the future of LLMs (00:26:34) Common Mistakes in Adapter Usage (00:28:34) Training Your Own Adapter (00:32:23) Behind the Scenes of the Adapter Training Process (00:37:51) Config File Guidance for Fine-Tuning (00:39:41) Debugging Strategies for Suboptimal Fine-Tuning Results (00:42:23) User Queries: Creating a LoRA Adapter and Future Support (00:51:06) Key Takeaways and Recap

Mar 8

52 min 45 sec

Jay Feng created a viral project using Seattle crime data and later got into data science. He then founded "Interview Query", which helps data scientists get jobs. We'll talk about how he landed his data science job through his blog, and his journey from data scientist to founder. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/ Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay (00:00:00) Introduction (00:01:11) From engineer to data scientist (00:03:10) Got a job through a project (00:05:35) Daliana's portfolio project with Zillow (00:09:13) From data scientist to entrepreneur (00:13:19) "Tinder" for job (00:15:01) How he chose companies to work for (00:15:56) Why he became an entrepreneur (00:17:37) How many hours does he work (00:18:54) Challenges when building "Interview Query" (00:20:18) Speed vs scale (00:22:11) Growth hacks he used (00:24:22) YouTube vs newsletter (00:27:21) Lessons he learned as a CEO (00:29:16) How to grow from tech employee to founder (00:31:59) How he defines success (00:34:38) If you have a business idea for Jay

Feb 29

35 min 41 sec

Erik Gafni builds AI systems and teams. He founded Eventum AI (https://bit.ly/eventum-ai), an ML consulting company working with high-growth startups. We talked about GenAI projects he worked on, how he built production ML systems, how to scale ML teams, and his journey from biologist to ML researcher. Interested in working with Erik: https://bit.ly/erik-consulting Erik's LinkedIn: https://bit.ly/erik-gafni-LI (00:00:00) Introduction (00:01:59) Is GenAI overhyped? (00:04:28) Ascent translation with AI (00:11:58) Social media app with AI (00:14:00) Stable diffusion model evaluation (00:15:57) "Consult-to-hire" model (00:17:35) AI in biotech (00:22:46) Self-supervised learning (00:31:22) How he hires people (00:33:19) Research vs production (00:35:57) Is AGI coming? (00:37:30) New trends in GenAI (00:41:45) Data quality in GenAI (00:42:58) Philosophy in LLMs (00:49:48) OpenAI vs Open Source (00:53:58) Mistakes he made (00:57:41) How did he get into ML

Feb 24

1 hr 3 min

Jay Feng is the CEO of Interview Query, a service that helps data scientists get jobs. Previously he worked as a data scientist at Nextdoor and Monster. We talked about the data science job market, the rise of AI engineering, and the soft skills people overlook during interviews. Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career. Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ Jay Feng's LinkedIn: https://www.linkedin.com/in/jay-feng-ab66b049/ Jay Feng's YouTube: https://www.youtube.com/c/DataScienceJay 00:00:00 Introduction 00:01:11 Data science job market in 2024 00:09:13 Build projects with AI 00:16:19 Soft skills in interviews 00:23:18 Daliana's story on "socializing ideas" 00:28:38 Common mistakes in interviews 00:35:30 Product DS vs ML interviews 00:36:27 Product analytics interview questions 00:39:18 Career transition in DS 00:43:04 Jay's career journey 00:45:38 Is there a principal data analyst? 00:51:52 AI engineer 00:54:28 New roles vs obsolete roles in DS 01:04:46 Is data science dead?

Feb 16

1 hr 7 min

We are joined by two data scientists who have firsthand experience with layoffs. We’ll talk about how to negotiate severance packages, how to handle stress, strategies for job hunting post-layoff, and how to reduce risks in full-time employment. Working with Daliana on personal branding: https://forms.gle/heNuZzaHjaAMQwLu6 Her email: daliana@dalianaliu.com Guests: Susan Shu Chang: Linkedin: https://www.linkedin.com/in/susan-shu-chang/ Newsletter: susanshu.substack.com Sundar Swaminathan Linkedin: https://www.linkedin.com/in/sswamina3/ Website: https://www.sundarswaminathan.com/⁠ (00:00:00) Introduction (00:06:13) Severance Negotiation (00:20:29) Identity crisis (00:26:22) Job search after layoff (00:30:21) Networking (00:35:23) Risk at pre-seed startups (00:37:03) How should data scientists pick companies (00:40:43) What to ask hiring managers (00:45:01) Does GenAI change interview processes? (00:47:17) Are data science teams getting leaner? (00:48:56) Future of data science roles (00:50:37) Full time employment and job security (00:53:46) Benefits of full time jobs (00:58:14) Reduce risk of being laid off (01:00:43) How to sell yourself (01:02:43) How to plan your finances (01:05:09) How to become an independent consultant

Feb 9

1 hr 6 min

Jenny Wu is a data analyst turned sales engineer for data products at Hex. We talked about sales engineer vs data analyst, how to design a career based on your personality, and how to transition into a customer-facing role. Jenny’s LinkedIn: https://www.linkedin.com/in/jenny-wu-... Daliana's Twitter: https://twitter.com/DalianaLiu Daliana’s LinkedIn: https://www.linkedin.com/in/dalianaliu/ (00:00:00) Introduction (00:01:34) What is a Sales Engineer? (00:09:35) Sales Engineering Day-to-Day (00:13:09) Challenge in sales (00:21:37) Traits of Successful Salespeople (00:30:32) Stakeholder Engagement (00:36:24) Getting into customer-facing roles (00:43:55) Quitting her job to travel the world (00:48:05) Advice on Career Breaks (00:50:39) Embedding Career and Personal Goals (00:51:57) How do you achieve happiness?

Feb 1

57 min 26 sec

Barry McCardel is the cofounder and CEO of Hex (free trial: hex.tech/dsshow), a collaborative data workspace. Their customers include Fivetran, Notion, and Anthropic. We talked about what the future of data teams looks like, how to tackle the challenges of data team collaboration, and how to leverage AI in data science workflows. 60-day Free Trial: hex.tech/dsshow Barry’s LinkedIn: https://www.linkedin.com/in/barrymccardel (00:00:00) Introduction (00:01:25) Is AI replacing data scientists? (00:06:08) Are data science teams getting smaller? (00:09:54) What is Hex? (00:11:24) How to communicate with stakeholders (00:24:29) Should data scientists be full stack? (00:31:23) How data teams measure ROI (00:33:35) Quantitative vs qualitative analysis (00:35:33) When you shouldn't use data? Data vs product intuition (00:41:39) How to hire your first data team? (00:48:59) Is the modern data stack dead? (00:53:55) GenAI in data science workflows (00:59:03) Future of data scientists (01:02:30) New features in Hex

Jan 21

1 hr 4 min