Senior Quantitative Scientist (ML/NLP)
Verana Health, a digital health company that delivers quality drug lifecycle and medical practice insights from an exclusive real-world data network, recently secured a $150 million Series E led by Johnson & Johnson Innovation – JJDC, Inc. (JJDC) and Novo Growth, the growth-stage investment arm of Novo Holdings.
Existing Verana Health investors GV (formerly Google Ventures), Casdin Capital, and Brook Byers also joined the round, as well as notable new investors, including the Merck Global Health Innovation Fund, THVC, and Breyer Capital.
We are driven to create quality real-world data in ophthalmology, neurology and urology to accelerate quality insights across the drug lifecycle and within medical practices. Additionally, we are driven to advance the quality of care and quality of life for patients. DRIVE defines our internal purpose and is the galvanizing force that helps ground us in a shared corporate culture. DRIVE is: Diversity, Responsibility, Integrity, Voice-of-Customer and End-Results.
Our headquarters are located in San Francisco, and we have additional offices in Knoxville, TN and New York City, with employees working remotely in AZ, CA, CO, CT, FL, GA, IL, LA, MA, NC, NJ, NY, OH, OR, PA, TN, TX, UT, VA, WA, and WI. All employees are required to reside in one of these states. Candidates who are willing to relocate are also encouraged to apply.
Job Title: Senior Quantitative Scientist
We are looking for a Senior Quantitative Scientist with data science, machine learning, and natural language processing (NLP) expertise who will work closely with structured clinical data and unstructured clinical text derived from electronic health records. The role supports Verana Qdata® development for internally driven research areas and commercial projects, with day-to-day activities including algorithm development, protocol creation, code implementation, modeling, inference, and interpretation.
This role will report directly to the Manager within the Quantitative Sciences Data Development team. Built on the values of continuous learning, cross-functional collaboration, and rigorous scientific research, the Quantitative Sciences team strives to improve patient care by innovating at the intersection of real-world data, clinical context, and methodology with our partners to ensure all available data is being used in the most efficient, data-driven way possible.
Job Duties and Responsibilities:
- Develop and leverage state-of-the-art advances in natural language processing using pre-trained large language models (LLMs) for analyzing and reasoning over clinical notes and other unstructured data in the context of clinical problems.
- Drive cutting-edge research on language modeling with emphasis on scientific accuracy and explainability.
- Communicate analysis results via presentations to a multi-disciplinary audience using clear, intuitive visualizations.
- Establish and maintain best practices for data exploration, the end-to-end model development and deployment lifecycle, and data/code/documentation management.
- Work on Qdata development and commercial projects leveraging real-world data through responsibilities such as creation of study plans, implementation of analyses, development of algorithms, and/or writing of publications.
- Collaborate cross-functionally with teams (e.g., Commercial, Product, Medical, Engineering/Technology) to translate clinical investigation questions into detailed data analytics requirements for internal and external projects.
- Provide mentorship and knowledge sharing to team members in standardizing machine learning/natural language processing best practices.
Qualifications:
- Master’s or doctorate in a quantitative discipline (e.g., data science, computer science, machine learning, biostatistics, health economics) or equivalent practical experience.
- 5+ years of hands-on experience with messy data (e.g., electronic health records, outcomes data) and analytical methodologies.
- 3+ years of hands-on experience with machine learning model implementation and deployment.
- 3+ years of hands-on experience with state-of-the-art large language models (e.g., BERT, Longformer, RoBERTa) applied to use cases such as named entity recognition (NER), text classification, and entity relation extraction.
- Strong familiarity with programming languages, especially Python, PySpark, R, and SQL.
- Strong familiarity with coding platforms, especially Databricks, Amazon SageMaker, and Visual Studio Code.
- Strong familiarity with unstructured text processing techniques.
- Familiarity with clinical datasets and coding systems such as ICD, CPT, and RxNorm.
- Ability to work effectively with cross-functional teams.
- Clear communication skills and the ability to deliver internal and external presentations.
- Ability to prioritize and manage multiple projects with high attention to detail.
- Direct machine learning/natural language processing experience using EHR datasets in the healthcare domain.
Benefits:
- We provide health, vision, and dental coverage for employees
- Verana pays 100% of employee insurance coverage and 70% of family coverage
- Plus an additional monthly $100 individual / $200 HSA contribution with HDHP
- Spring Health mental health support
- Flexible vacation plans
- A generous parental leave policy and family building support through the Carrot app
- $500 learning and development budget
- $25/week in DoorDash credit
- Headspace meditation app - unlimited access
- Gympass - 3 free live classes per week + monthly discounts for gyms like SoulCycle
You do not need to match every listed expectation to apply for this position. Here at Verana, we know that diverse perspectives foster the innovation we need to be successful, and we are committed to building a team that encompasses a variety of backgrounds, experiences, and skills.
Please note pay ranges for major metropolitan areas may be different.