We are seeking a talented Site Reliability Engineer to join our team in Israel.
Our SRE team is responsible for operating and maintaining cloud applications and deployment and monitoring frameworks, and for leading FinOps activities.
This is an amazing opportunity to work with a world-class technical team and to hold responsibility for designing, creating, and provisioning infrastructure as well as deploying and maintaining applications with a focus on infrastructure as code and monitoring tools.
We aim to adopt the best solutions to fulfill our requirements, and you will be encouraged to bring your ideas and new thinking.
So, if you want to grow with us and work for a world-leading company, now is the right time to join.
What You'll Do:
Our mission is to protect, provide for, and progress the software and systems behind all of our Crossix services
Ensure high uptime and reliability of Crossix's production environments on AWS
Take responsibility for managing the production environment, security, change management, deployment, architecture, and tools
Perform root cause analysis for complex failures and offer modern solutions and tools
Develop effective dashboards that provide key insights into system performance
Analyze performance and stability issues and create in-house automation tooling
Work closely with DevOps, R&D, product, and integration managers to enable automated CI/CD methods
Analyze cloud infrastructure and application costs and propose cost-saving ideas
Design, develop, and drive troubleshooting and mitigation tools as part of a self-healing agenda
Continuously improve the technology stack to support data growth
Work in a Big Data company with cutting-edge technologies
Requirements:
3+ years of experience as an SRE / DevOps engineer in a production environment
2+ years of experience with scripting languages
2+ years of hands-on experience with cloud services
2+ years of experience with Infrastructure as Code (IaC) tools such as AWS CloudFormation, Terraform, or similar
2+ years of experience with CI/CD systems such as GitHub Actions and Jenkins
1+ years of experience with containerization and managing Kubernetes clusters
1+ years of experience with Linux
1+ years of experience with common networking, firewall, and load-balancing protocols
1+ years of experience with SQL and relational database administration
Team player
Nice to Have:
Experience managing NoSQL databases such as Elasticsearch and MongoDB
Experience working with Big Data tools such as Spark
Understanding of application security in cloud environments
This position is open to all candidates.