Data Scientist

Pittsburgh, PA | Work from home flexibility

Posted: 02/03/2023 | Job Category: Database | Job Number: 6583

Job Description

Job Summary
The successful candidate will support the company's data analysis efforts by developing and applying cutting-edge machine learning methods for the analysis of multi-omic biological tumor
datasets, generating insights that help identify novel therapeutic cancer targets and treatments.

• Support leadership of the systems immunology and computational biology teams to analyze and prepare biological datasets.
• Develop and implement appropriate machine learning (ML) models and deploy them at scale.
• Implement metrics to verify model and algorithm effectiveness.
• Automate model training, testing and deployment and ensure proper code documentation.
• Collaborate with the computational team to develop an ontology-based NLP platform to support in-house target discovery efforts.
• Must be willing to work flexible hours as necessary; work beyond 40 hours per week is likely to be required.
• Report to the leadership of the systems immunology and computational biology teams.
• Document code, produce reports, and communicate results to the broader TTMS team.
• Support the computational biology and systems immunology teams in developing analytical tools that meet operational requirements associated with interrogation and management of clinical and non-clinical datasets.
• Ensure adherence to the standards required for HIPAA-compliant data management and sharing.
• Keep up with emerging trends in ML, Deep Learning (DL) and Natural Language Processing (NLP).
• Perform in accordance with system-wide competencies/behaviors.
• Perform other duties as assigned.

Educational and Knowledge Requirements
Education in computational science, data science, statistics, mathematics, physics, or a related quantitative field.
Experience in developing and applying ML algorithms to high-dimensional datasets.
Strong understanding of statistics, ML fundamentals, and modern ML, DL, and NLP libraries. Proficient in Python and R scripting.
Expertise in ML/DL frameworks such as TensorFlow, Keras, and scikit-learn/caret, and in NLP models and libraries such as BERT, BioBERT, and NLTK.

Demonstrated ability to write high-quality, production-ready code.
Experience with version control systems such as Git.
Self-motivated, organized, goal-oriented team player focused on a career in biotech.
Demonstrated ability to adhere to defined timelines, milestones, and objectives.
Able to deal with uncertainty and solve problems creatively and independently with sound judgment.
Experience with multi-omic biological datasets (e.g., RNA-seq, exome-seq, next-gen sequencing) is highly desirable.
Experience with cloud computing (AWS) and distributed architectures such as SageMaker and Spark preferred.
Experience with relational databases and SQL preferred.
Knowledge of cancer biology and immunology preferred.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information, or any other characteristic protected by law.

