Design and develop a platform with services for data ingestion, orchestration, and analytics that accelerates and automates data engineering, data preparation, and analysis
Develop analytic solutions for industry verticals and cross-enterprise product research using a variety of methodologies including ETL, data mining, and machine learning
Design and prototype ETL pipelines that integrate data from disparate sources and formats
Build flexible, insightful tools, dashboards, and reports that enable better decision making
Contribute to and maintain a codebase that facilitates rapid iteration and prototyping across the analytics team
Present analytics solutions and results to senior stakeholders and clients
Work with the Engineering team to implement final analytics solutions
Required Qualifications:
Proficiency in at least one programming language and familiarity with others (e.g., Python, Java, Scala)
Experience working in a Big Data ecosystem (Apache Spark, Hive; GCP: Dataproc, BigQuery; AWS: EMR, Athena; etc.)
Experience with machine learning software such as pandas/scikit-learn, R, and Spark MLlib
Experience working with BI tools like Tableau as well as visualization libraries like D3, Matplotlib, and ggplot2
Experience working with RESTful services and microservices
Skilled in using data science solutions to tackle complex business problems
Flexible, self-motivated individual with a desire to work in a dynamic, fast-paced start-up environment
Preferred Qualifications:
Good understanding of distributed systems and parallel computing paradigms
Strong understanding of SQL and NoSQL databases and development paradigms