Position Overview:
As a Cloud Infrastructure Engineer at Circadia Health, you will play a critical role in ensuring the stability, scalability, and efficiency of the technical infrastructure that powers our innovative predictive clinical analytics solutions. Reporting directly to the CTO, you will work collaboratively with the engineering and product teams to maintain and enhance the systems and processes that underpin our mission to save lives through advanced healthcare technology.
You will manage key technical systems and workflows and oversee infrastructure, pipeline optimization, and the seamless integration of external data sources. This hands-on role requires deep technical expertise, problem-solving abilities, and a passion for healthcare technology.
This role demands a deep passion for improving patient care through healthcare technology and a hands-on approach to solving challenges. Your ultimate goal will be to deliver an exceptional customer experience while building the foundation for Circadia's future success. This is a mission-critical role with very high reliability standards: you will be directly responsible for the health of the software architecture that serves the 30k+ patients monitored every day by our Circadia Contactless Monitor (IoT devices), growing to 100k+ over the next 2 to 3 years.
Key Responsibilities:
- Maintain and enhance AWS infrastructure instrumentation and observability tools (e.g., Grafana, alarms) to ensure system reliability.
- Oversee Circadia's CI/CD pipelines (Jenkins) to enable efficient and seamless code deployment.
- Manage and maintain a fully separated staging environment for testing and development.
- Monitor AWS infrastructure for cost efficiency, identifying and implementing improvements.
- Optimize Snowflake ETL pipelines to reduce costs while maintaining performance and reliability.
- Manage GPT pipelines in Azure to ensure performance and cost-efficiency.
- Develop and maintain data pipelines for integrating external electronic health record (EHR) system data.
- Monitor and maintain MySQL databases to guarantee optimal performance and reliability.
- Collaborate with the backend team to design and implement APIs supporting Circadia's suite of products.
- Architecture: Design, deploy, and manage AWS infrastructure solutions to support various applications and services.
- Design scalable systems for storage and processing of large amounts of medical data.
- Manage databases (e.g., MySQL, MongoDB), optimizing for performance, scalability, and cost-efficiency.
- Manage compute clusters (e.g., ECS), serving various internal and customer-facing products and services.
- Utilize Terraform to efficiently manage cloud infrastructure.
- Ensure high availability, scalability, and reliability of the cloud environment.
- Security: Collaborate with development, operations, and security teams to ensure seamless integration and delivery of applications.
- Manage cloud infrastructure roles, permissions, and access credentials.
- Oversee regular and thorough rotation of access credentials and keys.
- Reliability: Troubleshoot and resolve infrastructure-related issues promptly and effectively.
- Maintain comprehensive and actionable runbooks for dealing with incidents and infrastructure outages.
- Create detailed post-mortems in case of significant outages.
- Implement automated alerting and incident response systems to identify and resolve issues quickly.
- Documentation: Create and maintain comprehensive documentation for cloud infrastructure and processes.
- Maintain documentation at a level required for a cloud infrastructure powering a SaMD (Software as a Medical Device) product.
- Instrumentation: Develop and maintain instrumentation infrastructure to ensure system health.
- Build instrumentation systems to provide timely system health checks and alerts using Prometheus and Grafana.
- Implement and maintain automated alerting and incident response systems for quick issue identification and resolution.
- DevOps Support: Automate routine tasks and processes to improve efficiency and reduce manual intervention.
- Implement and maintain CI/CD pipelines (Jenkins, CircleCI, or similar) to manage the deployment of Circadia’s services and products (backend services, Android, iOS, React apps).
- Write clean, testable code with a commitment to maintaining high coding standards through comprehensive testing (Jest, PyTest, JUnit, etc.).
Attributes:
- Need to Haves:
- Advanced knowledge of Python and related frameworks (FastAPI, NumPy, Pandas, Pydantic) including multithreading and parallel design principles.
- Understanding of AWS, including knowledge of Cognito, Pinpoint, IoT, MSK, and other services.
- Expertise in JavaScript and frameworks such as ReactJS.
- Deep understanding of user-centered design principles, design thinking methodologies, and usability best practices.
- Knowledge of HTTP(S) as a protocol.
- Proficient in using and maintaining Docker containers.
- Strong understanding of RESTful API design principles and best practices.
- Experience with TDD and testing frameworks such as PyTest.
- Nice to Haves:
- Experience with Azure services for managing GPT pipelines and multi-cloud infrastructure.
- Familiarity with big data technologies such as Apache Spark, Kafka, and MSK for large-scale data processing.
- Advanced experience in cost optimization strategies for cloud infrastructure and database performance tuning.
- Technical acumen: Advanced knowledge of all AWS systems and services.
- Detail oriented: Responsible for mission-critical healthcare products and services.
- Communications and Trust: Phenomenal communication skills with the ability to maintain the highest levels of confidentiality on a consistent basis.
- Organisation and Getting Stuff Done: Juggling multiple projects and timelines. Prioritising. Keen eye for detail in all tasks and projects. Exceptional at making lists and maintaining organisation.
- Growth Mindset: Your ability to learn from mistakes, reflect on them, and avoid repeating them. Being curious, asking questions, and showing resilience in the face of setbacks.
Benefits:
- Join an energetic, diverse team dedicated to working towards the challenge of improving and saving patient lives
- Private health insurance with Vitality Health for you and your family, including discounted gym memberships, wellness retreats, fitness devices, and lots more
- 28 days paid annual leave during each holiday year (including bank holidays)
- Fully financed learning and personal development courses to help you grow in your role
- Opportunity to attend conferences and acquire certifications, paid for by the company
- New laptop of your choice for you to work on, either at home or at Circadia’s London Bridge office
- Flexible / hybrid working to suit your personal circumstances and allow you to be productive wherever you are most comfortable working
- Participate in and help plan regular team events, lunches and dinners
£100,000 - £200,000 a year