

Lead Data Engineer - NYC or Boston

Job Title: Lead Data Engineer - NYC or Boston
Contract Type: Permanent
Location: New York City
Start Date: ASAP
Contact Name: Rebaz
Contact Email:
Job Published: 30 days ago

Job Description

Lead Data Engineer – Leadership track

  • Work at the new Soho, NYC or Boston HQ of a leading healthcare firm, with a market-leading package, as it goes through an unprecedented merger
  • The Advanced Analytics teams include 100+ of the best Data Scientists, Data Engineers and Consultants in the country


  • Manages and is responsible for the successful delivery of large-scale data structures, pipelines and efficient ETL workflows.
  • Designs, implements and builds ETL pipelines that deliver data with measurable quality.
  • Serves as data engineering team lead for large and complex projects involving multiple resources and tasks, providing individual mentoring in support of company objectives.

Day to day

  • Designs and develops complex, large-scale data structures and pipelines to organize, collect and standardize data, generate insights and address reporting needs.
  • Builds real-time data pipelines using Redshift, S3, Kinesis, Spark Structured Streaming, Akka Streams and similar stacks on leading cloud platforms.
  • Writes complex ETL (Extract / Transform / Load) processes, designs database systems and develops tools for real-time and offline analytic processing.
  • Develops frameworks, standards & reference material for architecture and associated products.
  • Designs data marts and data models to support Data Science and other internal customers.
  • Acts as a mentor to junior team members, providing technical advice.
  • Applies knowledge of systems and products to consult and advise on additional efforts across multiple domains spanning the broader enterprise.
  • Collaborates with the data science team to transform data and integrate algorithms and models into highly available production systems.
  • Uses in-depth knowledge of Hadoop architecture and HDFS commands, along with experience designing & optimizing queries, to build scalable, modular and efficient data pipelines.
  • Uses advanced programming skills in Python, Java or any of the major languages to build robust data pipelines and dynamic systems.
  • Integrates data from a variety of sources, assuring that they adhere to data quality and accessibility standards.
  • Experiments with available tools and advises on new tools in order to determine the optimal solution given the
    requirements dictated by the model/use case.

What will help you get this role

  • Master's or PhD in Computer Science preferred
  • 8 or more years of progressively complex related experience
  • In-depth knowledge of large-scale search applications and building high volume data pipelines
  • Experience building and implementing data transformation and processing solutions
  • Advanced knowledge of Hadoop or Spark architecture and HDFS commands, and experience designing & optimizing queries against data in the HDFS environment
  • Advanced knowledge of Java, Python, Hive, Cassandra, Pig, MySQL, NoSQL or similar
  • Experience with bash shell scripts, UNIX utilities & UNIX commands
  • Ability to understand and build complex systems and solve challenging analytical problems
  • Ability to leverage multiple tools and programming languages to analyze and manipulate large data sets from disparate data sources
  • Proven ability to create innovative solutions to highly complex technical problems
  • Ability to communicate technical ideas and results to non-technical clients in written and verbal form

What’s here

  • A tight-knit team of passionate people and a tech-first business
  • Autonomy and end-to-end ownership
  • Very competitive pay, equity, full medical, dental & vision benefits and more
  • Opportunity for fast growth & promotion
  • Opportunity to work in one of our other offices