Data Engineer
Innowatts
Software Engineering, Data Science
Posted on Thursday, June 11, 2020 by Innowatts
Summary:
As a Data Engineer, you will develop and maintain scalable data pipelines while collaborating with analytical and business teams to improve data models.
Responsibilities:
- Implements processes and systems to monitor data quality, ensuring production data is always accurate and available for key stakeholders and business processes that depend on it.
- Writes unit/integration tests, contributes to the engineering wiki, and documents work.
- Performs the data analysis required to troubleshoot data-related issues and assists in resolving them.
- Works closely with a team of frontend and backend engineers, product managers, and analysts.
- Defines company data assets (data models) and the jobs that populate them.
- Designs data integrations and the data quality framework.
- Designs and evaluates open-source and vendor tools for data lineage.
- Works closely with all business units and engineering teams to develop a strategy for the long-term data platform architecture.
Minimum Qualifications:
- A degree in an analytical field (e.g. Computer Science, Mathematics, Statistics, Engineering, Operations Research, Management Science) is preferred, along with 4+ years of professional experience.
- At least 4 years of data analytics experience in a distributed computing environment, including:
  - Maintaining databases
  - Building and analyzing dashboards and reports
  - Evaluating and defining metrics and performing exploratory analysis
  - Monitoring key product metrics and understanding root causes of changes in metrics
  - Empowering and assisting operations and product teams by building key data sets and making data-based recommendations
  - Automating analyses and authoring pipelines via a SQL/Python-based ETL framework
- Superb SQL programming skills.
- Understanding of ETL tools and database architecture.
- Advanced knowledge of data warehousing.
- Strong knowledge of code and programming concepts. Experience with Python.
- Experience with Kubernetes deployments and a DevOps approach
- Highly motivated self-starter who is flexible and goal-oriented
- Strong Python Knowledge
  - Data Models
  - Object-Oriented Programming
  - Testing (Unit / Regression)
- Database Experience
  - Window Functions
  - Partitioning/Indexes
  - Relational and Non-Relational
- Big Data Experience
  - Hadoop
  - Spark
  - DataFrame API
  - Performance Benchmarking
  - Cluster Configuration/Optimization
  - Spark Optimization
- Version Control, CI/CD
  - Git
  - Jenkins, Drone
- Some Cloud Experience
  - AWS (primary), Azure, Google Cloud
Nice to Haves:
- Data Science Experience (either direct or from working closely with a DS team)
  - scikit-learn, TensorFlow, Spark MLlib, General Algebra & Algorithms
- Airflow (Scheduling Tools)
- Container Experience
  - Docker, Kubernetes
- Streaming Experience
  - Kafka, Spark Streaming, Flink
Benefits and Additional Perks:
- Fast-paced, collaborative, and fun environment
- Work with data and the latest technology to transform the industry
- Competitive salary and bonus
- Medical, dental, vision, 401k, life and long-term disability insurance
- Paid Time Off
- Hybrid working