Intelletec has partnered with an AI-based document-processing company that turns documents into structured data in seconds with high accuracy.
The company is looking for a talented data engineer to build out and scale its analytics platform and the corresponding data pipelines. You will be responsible for building and scaling a robust platform that delivers ML/AI-driven insights, and for coordinating with the data visualization team to create engaging and insightful content.
- Build data engineering components, applications, and services that enable self-service access to our big data
- Develop and implement ETL best practices for data movement, data quality, and data cleansing
- Optimize and tune ETL processes using reusability, parameterization, workflow design, caching, parallel processing, and other performance-tuning techniques
- Knowledgeable about data engineering best practices and comfortable working in a fast-paced startup environment
- Experience with data warehousing, streaming data, and supporting architectures: pub/sub, stream processing/data aggregation, real-time analytics, data lakes, and cluster computing frameworks
- Mastery of the components needed to architect solutions for complex data platforms and large-scale CI/CD data pipelines using a variety of technologies (REST APIs, advanced SQL, Amazon S3, Apache Kafka, data lakes, etc.) and data stores, ranging from relational SQL databases (e.g., MySQL, Postgres) and newer NoSQL databases (e.g., MongoDB, Neo4j) to in-memory caches (e.g., Redis, Memcached)
- Working knowledge of distributed computing and data modeling principles
- Experience with object-oriented design, coding, and testing patterns, including experience engineering software platforms and data infrastructure
- Experience with big data, PySpark, and streaming data
- Knowledge of data management standards, data governance practices, and data quality dimensions
- Experience with UNIX systems, shell scripting, and Python programming
- Hands-on Python experience with libraries such as NumPy, Pandas, and PySpark
Reach out to firstname.lastname@example.org for more info!