Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Berkeley, CA, we are backed by leading investors including Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures.
The Role
At Profluent, data is our lifeline. Our generative models learn the blueprint of life by modeling large-scale evolutionary data. We are seeking a Bioinformatics Scientist / Engineer to perform genomic data mining to expand the known universe of biological sequences.
This is an excellent opportunity to shape the future of AI-driven protein design and to work cross-functionally with a diverse team of experts across machine learning, protein engineering, cell biology, and gene editing.
Responsibilities
Perform large-scale metagenome assembly across the NCBI Sequence Read Archive
Maintain and expand the world’s largest database of protein sequences
Deploy cloud-based pipelines to process large-scale genomic datasets
Build cloud databases for scalable storage and fast retrieval of terabases of genomic data, including genomes, genes, proteins, and structures
Clearly document code and communicate outcomes to colleagues
Qualifications
BS, MS, or PhD in Bioinformatics, Genomics, Computer Science, or a related quantitative bioscience field
3+ years of industry or postdoc experience
Experience working with Google Cloud Platform (GCP) or other cloud-based compute services (e.g. AWS)
Experience building cloud pipelines with workflow managers (Snakemake, Nextflow) and containerized applications (Docker)
Experience with highly parallelized cloud-based computing platforms (Batch or Kubernetes)
Experience with scalable databases (Bigtable, BigQuery, Snowflake) and proficiency in database programming (SQL)
Fluent in Python data analysis tools (NumPy, pandas, Jupyter Notebook, Biopython)
Experience with Linux environments and version control (Git)
Detail-oriented, highly organized, and able to excel in a fast-paced work environment
Preferred (but not required)
Experience with bioinformatics sequence analysis and alignment tools
Experience working with next-generation sequencing data
Familiarity with public repositories like UniProt, EBI, JGI, NCBI, and SRA
Familiarity with concepts in molecular biology, biochemistry, and structural biology
Knowledge of prokaryotic gene and genome structure
Publications in major scientific journals or conferences
What we offer at Profluent (aside from a fulfilling journey with awesome people)…
A high-growth opportunity with meaningful impact
Competitive compensation package
Health insurance (health/dental/vision)
Generous paid time off (PTO) policy
Commitment to physical and mental well-being
More benefits and perks to be announced shortly!
Profluent Bio is an equal opportunity employer promoting diversity and inclusion in the workplace. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical conditions, veteran status, sexual orientation, gender (including gender identity and gender expression), sex (which includes pregnancy, childbirth, and breastfeeding), genetic information, taking or requesting statutorily protected leave, or any other basis protected by law.
Legal authorization to work in the United States is required. In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.
Employment type
Full-time