Senior/Principal Data Scientist - NLP (Remote) - United Kingdom

Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster. As one of the fastest-growing SaaS companies in history, we surpassed $2B in revenue in our last fiscal year with extensive growth potential ahead.

At the heart of Veeva are our values: Do the Right Thing, Customer Success, Employee Success, and Speed. We're not just any public company – we made history in 2021 by becoming a public benefit corporation (PBC), legally bound to balance the interests of customers, employees, society, and investors.

As a Work Anywhere company, we support your flexibility to work from home or in the office, so you can thrive in your ideal environment.

Join us in transforming the life sciences industry and making a positive impact on its customers, employees, and communities.

The Role

Link is a key part of Veeva Systems, aimed at "Connecting life sciences and key people to improve research and care." Our product provides real-time academic, social, and medical data to create comprehensive profiles, helping life-science partners find experts to speed up the development and adoption of new therapeutics.

As part of the AI team, your main task will be to develop LLM-based agents specialized in extracting detailed information about Key Opinion Leaders (KOLs) in healthcare. You’ll build an end-to-end pipeline that analyzes unstructured websites and medical documents and enables semantic search over KOL data across various languages.

Using cloud infrastructure, you'll create models for information extraction and collaborate with software developers to deploy them. We aim to set new industry standards by training ML models with input from 2000+ curators, ensuring quality and scalability across different regions, languages, and specialties.

You can work remotely from anywhere in Portugal, the UK, or Spain, but must reside in one of these countries and be legally authorized to work there without visa or relocation support from Veeva. If you don't meet this requirement but consider yourself an exceptional candidate, please include a separate note for consideration.

What You'll Do

    • Adopt the latest NLP technologies and trends
    • Develop LLM-based agents capable of performing function calls and using tools like browsers for enhanced data retrieval
    • Design and implement end-to-end pipelines to extract predefined information from large-scale, unstructured, multi-domain, and multilingual data
    • Use and develop techniques like named entity recognition, entity-linking, slot-filling, few-shot learning, active learning, question answering, and dense passage retrieval for information extraction
    • Collaborate with data quality teams to define annotation metrics and conduct qualitative and quantitative evaluations

Requirements

    • 4+ years of data science experience (or 2+ years with a Ph.D.)
    • Master’s or Ph.D. in Computer Science, AI, Computational Linguistics, or a related field
    • Strong foundation in Natural Language Processing (NLP) and Machine Learning (ML)
    • Experience with preference-based fine-tuning of LLMs, such as Reinforcement Learning from Human Feedback (RLHF) with Proximal Policy Optimization (PPO) or Direct Preference Optimization (DPO)
    • Hands-on experience with large language models and transformers (e.g., GPT, BERT)
    • Proficient in Python and NLP libraries (e.g., NLTK, SpaCy, Hugging Face)
    • Familiarity with Big Data frameworks (e.g., Ray, Spark) and Deep Learning frameworks (e.g., PyTorch, JAX)
    • Experience with cloud infrastructure and containerization (e.g., Docker, Kubernetes)
    • Excellent collaboration and communication skills for cross-functional teamwork

Nice to Have

    • AWS experience
    • Experience in the life/health sciences industry, especially pharma
    • Published work in the AI field
    • Production-level development skills
    • Leadership abilities with a strong network for team growth and hiring
    • Familiarity with model registry tools (e.g., MLflow)
    • Knowledge of distributed computing platforms (e.g., Ray, Spark)

Perks & Benefits

    • Work anywhere
    • Personal development budget 
    • Veeva charitable giving program
    • Fitness reimbursement
    • Life insurance + pension fund

Veeva’s headquarters is located in the San Francisco Bay Area with offices in more than 15 countries around the world.

As an equal opportunity employer, Veeva is committed to fostering a culture of inclusion and growing a diverse workforce. Diversity makes us stronger. It comes in many forms. Gender, race, ethnicity, religion, politics, sexual orientation, age, disability and life experience shape us all into unique individuals. We value people for the individuals they are and the contributions they can bring to our teams.

If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us at talent_accommodations@veeva.com.
