Data Engineer (Python)

  • Location:

    New York

  • Sector:

    Software Engineering, Data Science

  • Salary:

    Circa $200,000

  • Contact:

    Jason Rumney

  • Contact email:


  • Job ref:


  • Consultant:

    Jason Rumney

The Role:

By market definition, a data engineer is responsible for creating, maintaining and understanding data and the infrastructure that delivers it. Data engineers are the connection between smart business users and not-so-smart data repositories. Through a solid command of scripting languages such as Python, R and SQL, they can take any source of data and perform EVL(ST): Extract, Validate, Load, Standardize, Transform, delivering it to the correct data store in the form agreed upon by the data engineer and the end user. Data engineers are often responsible for the efficacy, quality and elegance of their solutions. They are business savvy and understand, to an extent, the importance of the data they are piping in.
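
For candidates who want a concrete picture, the EVL(ST) flow described above could be sketched roughly as follows. This is only an illustrative sketch, not our platform's code: the sample CSV, the table names (staging, payments) and every function name are hypothetical, and it uses only the Python standard library with an in-memory SQLite database standing in for the data store.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: one clean row, one bad row, one row needing cleanup.
RAW_CSV = "id,name,amount\n1, Alice ,10.50\n2,bob,not-a-number\n3,Carol,7\n"


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def validate(rows):
    """Validate: keep only rows with an integer id and a numeric amount."""
    valid = []
    for row in rows:
        try:
            int(row["id"])
            float(row["amount"])
        except ValueError:
            continue  # drop rows that fail the checks
        valid.append(row)
    return valid


def load(conn, rows):
    """Load: store validated rows unchanged in a staging table."""
    conn.execute("CREATE TABLE staging (id TEXT, name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (:id, :name, :amount)", rows)


def standardize_and_transform(conn):
    """Standardize names (trim, upper-case) and transform amounts to cents."""
    conn.execute(
        """
        CREATE TABLE payments AS
        SELECT CAST(id AS INTEGER) AS id,
               UPPER(TRIM(name)) AS name,
               CAST(ROUND(CAST(amount AS REAL) * 100) AS INTEGER) AS amount_cents
        FROM staging
        """
    )


def run_pipeline(text):
    """Run all EVL(ST) steps and return the final rows ordered by id."""
    conn = sqlite3.connect(":memory:")
    load(conn, validate(extract(text)))
    standardize_and_transform(conn)
    return conn.execute(
        "SELECT id, name, amount_cents FROM payments ORDER BY id"
    ).fetchall()
```

Calling run_pipeline(RAW_CSV) drops the row with the non-numeric amount and returns the two remaining rows in the standardized form. Loading before standardizing keeps a raw copy in staging, so the later steps can be rerun without re-extracting.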



  • Contribute to the design and development of our Python data workflow management platform

  • Design and develop tools to wrangle small and large volumes of data into cleaned, normalized, and enriched datasets

  • Build and enhance a large, scalable Big Data platform (Spark, Hadoop)

  • Refine processes for normalization and performance-tuning analytics



  • You love building elegant solutions that scale

  • You bring deep experience in the architecture and development of quality backend production systems, specifically in Python

  • You love working on high-performing teams, collaborating with team members, and improving our ability to deliver delightful experiences to our clients

  • You are excited by the opportunity to solve challenging technical problems, and you find learning about data fascinating

  • You understand server, network, and hosting environments, RESTful and other common APIs, common data distribution, and hosted storage solutions

Must Have:

  • 5+ years of full-time experience in a professional environment

  • Expertise in Python

  • Experience with ETL and/or other big data processes

  • Experience with at least 2 popular big data / distributed computing frameworks, e.g. Spark, Hive, Kafka, MapReduce, Flink

  • Experience working independently, or with minimal guidance

  • Strong problem solving and troubleshooting skills

  • Ability to exercise judgment to make sound decisions

  • Proficiency in multiple programming languages

  • Strong communication skills, interpersonal skills, and a sense of humor