Research and implement cutting-edge techniques (fine-tuning, RLHF) for aligning generative models to specific problem domains.
Build the necessary tooling for data acquisition, data cleaning, data augmentation, model training and visualization.
Evaluate and implement ML, deep learning, and GenAI models
Optimize models for production use and help productize generation scenarios.
Required Qualifications, Capabilities, and Skills:
Master's degree in a STEM field such as Statistics, Math, Engineering, or Information Systems, or a relevant degree in Data Science.
4+ years of industry experience working as a Data Scientist on large-scale data science projects, with a proven track record of delivering business value.
Proficiency in Python or R
Expertise in statistical concepts and experience with traditional ML libraries such as scikit-learn, statsmodels, and pandas
Experience optimizing and scaling ML solutions for real-world business use cases.
Extensive experience developing and serving large-scale deep learning models across different data domains.
Proficiency with at least one deep learning library (PyTorch, TensorFlow, or Keras), including building and deploying DNN models in production.
Expertise in NLP, Transformers, large language models (LLMs), and the Hugging Face library, including optimizations for LLM training and serving.
Experience with production operations, good practices for shipping quality code to production, and troubleshooting issues when they arise
Ability to take initiative and take responsibility for delivering complex software by working effectively with the team and other stakeholders
Ability to communicate technical ideas clearly, both verbally and in writing (technical proposals, design specs, architecture diagrams, and presentations)
Preferred Qualifications:
Master’s degree in Data Science/ML/AI
Certification in cloud platforms such as AWS, GCP, and/or Azure.