Locations: CT-Hartford, MA-Wellesley, NY-New York
Intelletec has partnered with one of the nation's premier health innovators to help people on their path to better health. They are building a new health care model that is easier to use, less expensive, and puts the consumer at the center of their care!
Leads and participates in the design, build, and management of large-scale data structures and pipelines and efficient Extract/Transform/Load (ETL) workflows.
- Key skills: data pipelines, ETL (Extract, Transform, Load), data warehousing, data architecture, databases
- Key tools & technologies: programming (Python), data warehousing (Hadoop, MapReduce, Hive, Pig, Apache Spark, Kafka), and databases (SQL and NoSQL)
- Develops large-scale data structures and pipelines to organize, collect, and standardize data, helping generate insights and address reporting needs.
- Writes ETL (Extract / Transform / Load) processes, designs database systems and develops tools for real-time and offline analytic processing.
- Collaborates with data science team to transform data and integrate algorithms and models into automated processes.
- Applies knowledge of Hadoop architecture and HDFS commands, along with experience designing and optimizing queries, to build data pipelines.
- Uses strong programming skills in Python, Java, or another major language to build robust data pipelines and dynamic systems.
- Builds data marts and data models to support Data Science and other internal customers.
- Integrates data from a variety of sources, ensuring adherence to data quality and accessibility standards.
- Analyzes current information technology environments to identify and assess critical capabilities and recommend solutions.
- Experiments with available tools and advises on new tools to determine the optimal solution for the requirements dictated by the model or use case.
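To make the ETL responsibilities above concrete, here is a minimal sketch of an Extract/Transform/Load flow in plain Python. It is purely illustrative: the record fields (member_id, visit_date, cost), the delimiter, and the validation rules are hypothetical, not the team's actual pipeline or stack.

```python
# Illustrative ETL sketch: field names and rules are hypothetical.

def extract(raw_rows):
    """Extract: parse raw pipe-delimited strings into dict records."""
    fields = ("member_id", "visit_date", "cost")
    return [dict(zip(fields, row.split("|"))) for row in raw_rows]

def transform(records):
    """Transform: standardize types and drop rows that fail validation."""
    out = []
    for r in records:
        try:
            out.append({
                "member_id": r["member_id"].strip(),
                "visit_date": r["visit_date"].strip(),
                "cost": round(float(r["cost"]), 2),
            })
        except (KeyError, ValueError):
            continue  # data-quality rule: skip malformed rows
    return out

def load(records, store):
    """Load: upsert records into a store keyed by member_id."""
    for r in records:
        store[r["member_id"]] = r
    return store

raw = ["A1|2023-01-05|120.50", "A2|2023-01-06|bad-cost", "A3|2023-01-07|88.0"]
warehouse = load(transform(extract(raw)), {})
```

In a production setting the same extract/transform/load stages would typically be expressed as Spark jobs or Hive queries over HDFS rather than in-memory Python, but the shape of the workflow is the same.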
- 5 or more years of progressively complex related experience.
- Ability to leverage multiple tools and programming languages to analyze and manipulate data sets from disparate data sources.
- Ability to understand complex systems and solve challenging analytical problems.
- Experience with bash shell scripts, UNIX utilities, and UNIX commands.
- Knowledge of Java, Python, Hive, Cassandra, Pig, MySQL, NoSQL, or similar.
- Knowledge of Hadoop architecture and HDFS commands, and experience designing and optimizing queries against data in the HDFS environment.
- Experience building data transformation and processing solutions.
- Strong knowledge of large-scale search applications and experience building high-volume data pipelines.
- Master’s degree or PhD preferred.
- Bachelor's degree or equivalent work experience in Computer Science, Engineering, Machine Learning, or related discipline.