
Senior Data Engineer

  • Location: United States
  • Sector: Software Engineering, Data Science
  • Job type: Permanent
  • Salary: Circa $150k + Bonuses
  • Contact: Jennifer Beltran
  • Contact email: jennifer@intelletec.com
  • Job ref: JB - DAT0853
  • Consultant: Jennifer Beltran

Locations: CT-Hartford, MA-Wellesley, NY-New York

Intelletec has partnered with one of the nation's premier health innovators to help people on their path to better health. They are building a new health care model that is easier to use, less expensive, and puts the consumer at the center of their care.


Description:

Leads and participates in the design, build, and management of large-scale data structures and pipelines and of efficient Extract/Transform/Load (ETL) workflows.

  • Key skills: data pipelines, ETL (Extract, Transform, Load), data warehousing, data architecture, databases
  • Key tools & technologies: programming (Python), data warehousing (Hadoop, MapReduce, Hive, Pig, Apache Spark, Kafka), and databases (SQL and NoSQL)
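
To make the stack above concrete, here is a minimal batch ETL sketch in PySpark, the kind of workflow this role owns. The landing path, column names, and table name are illustrative assumptions, not details of the employer's actual systems.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, to_date, trim

    # Hive support lets the job register the result as a warehouse table.
    spark = (SparkSession.builder
             .appName("etl-sketch")
             .enableHiveSupport()
             .getOrCreate())

    # Extract: pull a raw landing-zone feed (path and columns are assumed).
    raw = spark.read.option("header", True).csv("/landing/visits/*.csv")

    # Transform: standardize names and types so every downstream consumer
    # sees one schema, then drop duplicate member/day rows.
    clean = (raw.select(
                 trim(col("member_id")).alias("member_id"),
                 to_date(col("visit_date"), "yyyy-MM-dd").alias("visit_date"),
                 col("charge").cast("double").alias("charge_usd"))
             .dropDuplicates(["member_id", "visit_date"]))

    # Load: persist a date-partitioned Hive table for BI and data science.
    (clean.write.mode("overwrite")
          .partitionBy("visit_date")
          .saveAsTable("warehouse.member_visits"))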

Fundamental Components:

  • Develops large-scale data structures and pipelines to organize, collect, and standardize data that help generate insights and address reporting needs.
  • Writes ETL (Extract/Transform/Load) processes, designs database systems, and develops tools for real-time and offline analytic processing (see the streaming sketch after this list).
  • Collaborates with data science team to transform data and integrate algorithms and models into automated processes.
  • Applies knowledge of Hadoop architecture and HDFS commands, along with experience designing and optimizing queries, to build data pipelines.
  • Uses strong programming skills in Python, Java, or another major language to build robust data pipelines and dynamic systems.
  • Builds data marts and data models to support Data Science and other internal customers.
  • Integrates data from a variety of sources, assuring that they adhere to data quality and accessibility standards.
  • Analyzes current information technology environments to identify and assess critical capabilities and recommend solutions.
  • Experiments with available tools and advises on new tools in order to determine optimal solution given the requirements dictated by the model/use case.
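
As referenced above, here is a hedged sketch of the real-time side of these duties: Spark Structured Streaming reading a Kafka topic, decoding JSON events, and landing them in the warehouse. The broker address, topic name, event schema, and paths are assumptions for illustration, and the spark-sql-kafka connector package must be on the Spark classpath.

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructField, StructType

    spark = SparkSession.builder.appName("claims-stream-sketch").getOrCreate()

    # Assumed shape of each JSON event on the topic.
    schema = StructType([
        StructField("member_id", StringType()),
        StructField("claim_amount", DoubleType()),
    ])

    # Subscribe to a Kafka topic (broker and topic names are assumptions).
    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "claims-events")
              .load())

    # Kafka delivers raw bytes; decode the JSON payload into typed columns.
    claims = (events
              .select(from_json(col("value").cast("string"), schema).alias("c"))
              .select("c.*"))

    # Land the stream as Parquet with checkpointing, so the same data is
    # available to offline analytics alongside batch loads.
    query = (claims.writeStream.format("parquet")
             .option("path", "/warehouse/claims_stream")
             .option("checkpointLocation", "/checkpoints/claims_stream")
             .start())
    query.awaitTermination()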

Background Experience:

  • 5 or more years of progressively complex related experience.
  • Ability to leverage multiple tools and programming languages to analyze and manipulate data sets from disparate data sources.
  • Ability to understand complex systems and solve challenging analytical problems.
  • Experience with bash shell scripts, UNIX utilities, and UNIX commands (a small wrapper sketch follows this list).
  • Knowledge of Java, Python, Hive, Cassandra, Pig, MySQL, NoSQL, or similar.
  • Knowledge in Hadoop architecture, HDFS commands and experience designing & optimizing queries against data in the HDFS environment.
  • Experience building data transformation and processing solutions.
  • Strong knowledge of large-scale search applications and experience building high-volume data pipelines.
  • Bachelor's degree or equivalent work experience in Computer Science, Engineering, Machine Learning, or a related discipline.
  • Master's degree or PhD preferred.
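
As a small illustration of the UNIX/HDFS fluency called out above, here is a Python wrapper around the standard 'hdfs dfs -du -h' command. It assumes the Hadoop client CLI is installed and on PATH, and the queried path is hypothetical.

    import subprocess

    def hdfs_du(path: str) -> str:
        """Human-readable sizes for each entry under an HDFS path.

        Thin wrapper over the standard 'hdfs dfs -du -h' command; assumes
        the Hadoop client is installed and on PATH.
        """
        result = subprocess.run(
            ["hdfs", "dfs", "-du", "-h", path],
            capture_output=True, text=True, check=True,
        )
        return result.stdout

    if __name__ == "__main__":
        print(hdfs_du("/warehouse/member_visits"))  # hypothetical path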