A world-leading technology company is expanding its SRE teams.
You will provide the infrastructure that supports core data services, including the data science and machine learning platform, search infrastructure, and various others. The challenges span both software and hardware, and the scale the team works at is massive.
The team is trusted to build the systems that run the company's newest cutting-edge platforms. They are also depended upon to manage the configuration, deployment, and operation of those systems. You'll have the ability to truly innovate and invent, helping define the technical foundations of groundbreaking systems. Using containerization and Kubernetes on top of leading-edge hardware, including GPUs and DNN-specific hardware, the team has built systems that rival supercomputing platforms across the world.
You will:
- Design, build, and automate new solutions centered around the Kubernetes container orchestration platform and its ecosystem of projects
- Be responsible for solutions that maintain the configuration and robustness of systems
- Analyze performance, place and interpret metrics, and plan capacity
- Troubleshoot and debug runtime issues with software and hardware
- Perform OS- and hardware-level optimizations
- Interact with platform developers to understand and validate their workflows, requirements, application performance, and application resilience
What’s in it for you:
- An opportunity to make key technical decisions which help define the future of data and analytics infrastructure platforms
- The chance to apply your existing experience while gaining cutting-edge experience in Kubernetes, containerization, GPUs, data science, and distributed database systems
- Your solutions will drive new functionality within the Bloomberg Terminal and other client interfaces, which are direct drivers of financial decisions around the world
What you'll need:
- 2+ years of systems configuration and automation experience (e.g. Ansible, Chef, Puppet, SaltStack), including error handling, idempotency, and configuration management
- 2+ years of Linux systems experience (Ubuntu or Debian preferred; ideally conversant in Unix networking and C system calls)
- Proven experience in a programming and/or scripting language (e.g. Python, Go, Java, Ruby)
- Strong familiarity with Continuous Integration and Continuous Deployment methodologies, ChatOps, etc.
- Proven experience building and scaling out mission-critical, elastic, load-distributed, high-throughput systems