Data Engineer

Innowatts

Software Engineering, Data Science
Posted 6+ months ago

About Innowatts

Innowatts is an energy technology company based in Houston, TX that is transforming the way
energy is bought, sold, managed and consumed. We are a leading provider of AMI-enabled predictive
analytics and AI-based solutions for utilities, energy retailers, emerging retailers, and smart energy
communities. To date, the Innowatts eUtility™ technology platform has given over 52 million
energy consumers and their energy providers access to lower energy costs and a more reliable
and personalized energy experience. Innowatts is backed by Energy Impact Partners, Shell Ventures,
Iberdrola Energy Ventures, Veronorte and Energy and Environment Investment (Japan).

Summary

As a Data Engineer, you will develop and maintain scalable data pipelines while collaborating with
analytical and business teams to improve data models.

Responsibilities

  • Implements processes and systems to monitor data quality, ensuring production data is
    always accurate and available for key stakeholders and business processes that depend on it.
  • Writes unit/integration tests, contributes to engineering wiki, and documents work.
  • Performs the data analysis required to troubleshoot and help resolve data-related issues.
  • Works closely with a team of frontend and backend engineers, product managers, and
    analysts.
  • Defines company data assets (data models) and the jobs that populate them.
  • Designs data integrations and data quality framework.
  • Designs and evaluates open-source and vendor tools for data lineage.
  • Works closely with all business units and engineering teams to develop a strategy for
    long-term data platform architecture.

Minimum Qualifications

  • A degree in an analytical field (e.g. Computer Science, Mathematics, Statistics,
    Engineering, Operations Research, Management Science) is preferred, along with 4+ years
    of professional experience.
  • At least 4 years of data analytics experience in a distributed computing environment
  • Database maintenance
  • Building and analyzing dashboards and reports
  • Evaluating and defining metrics and performing exploratory analysis
  • Monitoring key product metrics and understanding root causes of changes in metrics
  • Empowering and assisting operations and product teams by building key data sets and
    making data-based recommendations
  • Automating analyses and authoring pipelines via SQL/Python-based ETL frameworks
  • Superb SQL programming skills
  • Understanding of ETL tools and database architecture
  • Advanced knowledge of data warehousing.
  • Strong knowledge of code and programming concepts. Experience with Python.
  • Experience with Kubernetes deployments and DevOps approach
  • Highly motivated self-starter who is flexible and goal oriented
  • Strong Python Knowledge
    • Data Models
    • Object-Oriented Programming
    • Testing (Unit / Regression)
  • Database Experience
    • Window Functions
    • Partitioning/Indexes
    • Relational and Non-Relational
  • Big Data Experience
    • Hadoop
    • Spark
    • DataFrame API
  • Performance Benchmarking
    • Cluster Configuration/Optimization
    • Spark Optimization
  • Version Control, CI/CD
    • Git
    • Jenkins, Drone
  • Some Cloud Experience
    • AWS (primary), Azure, Google Cloud.
  • Nice to have
    • Data Science Experience (Either direct or from working closely with a DS team)
    • Scikit-Learn, TensorFlow, Spark MLlib, General Algebra & Algorithms
    • Airflow (Scheduling Tools)
    • Container Experience: Docker, Kubernetes
    • Streaming Experience: Kafka, Spark Streaming, Flink

Benefits and Additional Perks

  • Fast-paced, collaborative, and fun environment
  • Work with data and the latest technology to transform the industry
  • Competitive salary and bonus
  • Medical, dental, vision, 401k, life and long-term disability insurance
  • Paid Time Off
  • Hybrid working.