Sam Davidson

Seattle, Washington, United States

160 followers 156 connections

Amazon Web Services (AWS)

University of California, Davis

Activity

I’m excited to share that Duolingo has acquired Hobbes, a world-class animation and motion design studio based in Detroit. This acquisition will do…

Liked by Sam Davidson
What is the biggest barrier to build open-source AI coding agents? It’s *data*, there are no high-quality, large, and open agentic coding datasets…

Liked by Sam Davidson
𝕎𝕙𝕪 #𝔸𝕀 𝕗𝕠𝕝𝕜𝕤 𝕟𝕖𝕖𝕕 𝕒 𝕓𝕣𝕠𝕒𝕕 𝕓𝕒𝕤𝕖𝕕 𝕀𝕟𝕥𝕣𝕠 𝕥𝕠 #𝔸𝕀 👉 As I go around giving talks/tutorials on the planning and…

Liked by Sam Davidson

Experience & Education

Amazon Web Services (AWS)

********** ** **********, *****

*** *********
**** *******

*** **** ******* ******
********** ** **********, *****

****** ** ********** - *** ***********

2018 - 2024
********** ** **********, *****

******** ** ****, **** ******* ****** *********** & ******** *******

2016 - 2018

Publications

Developing a New Classifier for Automated Identification of Incivility in Social Media

Proceedings of the 4th Workshop on Online Abuse and Harms (co-located with EMNLP 2020), November 2020

Incivility is not only prevalent on online social media platforms, but also has concrete effects on individual users, online groups, and the platforms themselves. Given the prevalence and effects of online incivility, and the challenges involved in human-based incivility detection, it is urgent to develop validated and versatile automatic approaches to identifying uncivil posts and comments. This project advances both a neural, BERT-based classifier and a logistic regression classifier to identify uncivil comments. The classifier is trained on a dataset of Reddit posts, which are annotated for incivility, and further expanded using a combination of labeled data from Reddit and Twitter. Our best performing model achieves an F1 of 0.802 on our Reddit test set. The final model is not only applicable across social media platforms and their distinct data structures, but also computationally versatile and, as such, ready to be used on vast volumes of online data. All trained models and annotated data are made available to the research community.

Developing NLP Tools with a New Corpus of Learner Spanish

Proceedings of the 12th Language Resources and Evaluation Conference (LREC), 2020

The development of effective NLP tools for the L2 classroom depends largely on the availability of large annotated corpora of language learner text. While annotated learner corpora of English are widely available, large learner corpora of Spanish are less common. Those Spanish corpora that are available do not contain the annotations needed to facilitate the development of tools beneficial to language learners, such as grammatical error correction. As a result, the field has seen little research in NLP tools designed to benefit Spanish language learners and teachers. We introduce COWS-L2H, a freely available corpus of Spanish learner data which includes error annotations and parallel corrected text to help researchers better understand L2 development, to examine teaching practices empirically, and to develop NLP tools to better serve the Spanish teaching community. We demonstrate the utility of this corpus by developing a neural-network based grammatical error correction system for Spanish learner writing.

Dependency Parsing for Spoken Dialog Systems

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP), November 5, 2019

Dependency parsing of conversational input can play an important role in language understanding for dialog systems by identifying the relationships between entities extracted from user utterances. Additionally, effective dependency parsing can elucidate differences in language structure and usage for discourse analysis of human-human versus human-machine dialogs. However, models trained on datasets based on news articles and web data do not perform well on spoken human-machine dialog, and currently available annotation schemes do not adapt well to dialog data. Therefore, we propose the Spoken Conversation Universal Dependencies (SCUD) annotation scheme that extends the Universal Dependencies (UD) (Nivre et al., 2016) guidelines to spoken human-machine dialogs. We also provide ConvBank, a conversation dataset between humans and an open-domain conversational dialog system with SCUD annotation. Finally, to demonstrate the utility of the dataset, we train a dependency parser on the ConvBank dataset. We demonstrate that by pre-training a dependency parser on a set of larger public datasets and fine-tuning on ConvBank data, we achieved the best result, 85.05% unlabeled and 77.82% labeled attachment accuracy.

Gunrock: A Social Bot for Complex and Engaging Long Conversations

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, November 5, 2019

Gunrock is the winner of the 2018 Amazon Alexa Prize, as evaluated by coherence and engagement from both real users and Amazon-selected expert conversationalists. We focus on understanding complex sentences and having in-depth conversations in open domains. In this paper, we introduce some innovative system designs and related validation analysis. Overall, we found that users produce longer sentences to Gunrock, which are directly related to users' engagement (e.g., ratings, number of turns). Additionally, users' backstory queries about Gunrock are positively correlated to user satisfaction. Finally, we found dialog flows that interleave facts and personal opinions and stories lead to better user satisfaction.

Languages

English

Native or bilingual proficiency
French

Professional working proficiency

More activity by Sam

It was great to share thoughts on pros and cons of big and smaller models

Liked by Sam Davidson
Did Open Science just beat OpenAI? 🤯 Kyutai just announced Moshi, a real-time native multimodal foundation model that can listen and speak, similar…

Liked by Sam Davidson
Exciting life update! 🎓 Last week I defended my dissertation and completed my PhD from UC Riverside! It was truly a life changing journey and I want…

Liked by Sam Davidson
Announcement: Robert Brennan, Xingyao Wang, and I have formed a company! Our name is All Hands AI 🙌 Site: https://www.all-hands.dev/ Our mission…

Liked by Sam Davidson
A few weeks ago, I got to travel to Ann Arbor, MI to the Annual Human Sentence Processing conference to share my recent work on how native Arabic…

Liked by Sam Davidson
I'm excited to announce that I have started a new position as an Applied Scientist II on the Amazon Web Services Next Generation Developer Experience…

Posted by Sam Davidson
It was great to be at NAACL 2024 in Mexico City. I really enjoyed the conference and big congrats to my group and our MSR collaborators for winning…

Liked by Sam Davidson
Soon after OpenAI released GPT-4o on Monday, May 13, some Chinese speakers started to notice that something seemed off about this newest version of…

Liked by Sam Davidson
Attending #NAACL this week for presenting our Amazon Science paper "FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs". Work done in…

Liked by Sam Davidson
AI is not some sort of natural phenomenon that will just emerge and become dangerous. *WE* design it and *WE* build it. I can imagine thousands of…

Liked by Sam Davidson
Hiring Applied Scientists for Amazon Q science team. If you are interested in building conversational AI assistants for enterprises (with expertise…

Liked by Sam Davidson
My new paper «Your Transformer is Secretly Linear» has been accepted at ACL! 🎉 We have discovered that most layers of language models are 99%…

Liked by Sam Davidson
Turns out that everything they told you about scaling laws was specific to web data, because that’s what people used in their experiments, and doing…

Liked by Sam Davidson
Thanks Saab, great working with everyone last summer. Looking forward to sharing our work with the wider NLP community!

Liked by Sam Davidson

Other similar profiles

Elizabeth Conrad

Shamik Roy

Caterina Keri

Aly Butler

Caroline Glabik

Nicholas Villarreal

Skyler Reese

Ben Wiebe

Christian Ridmark

Jules Vonessen
