Description

Responsibilities include:
• Support the operation and maintenance of Linux servers: ensure availability and performance, conduct health checks, manage software upgrades and patching (including testing and implementation), and handle system optimization and administration
• Monitor server health and performance to identify issues, bugs, or potential improvements
• Adhere strictly to change management processes so that changes are properly planned, documented, and deployed
• Develop, review, and update operational documentation (SOPs, application checklists, playbooks, etc.)
• Provide after-hours on-call technical support
• Collaborate with the Security Operations Center (SOC) team on process optimization, tool tuning and integration, information sharing, playbook development, and incident response
• Implement automated near real-time monitoring of all tools to ensure proper operation and collection of pertinent data
• Carry out incident and problem management, both during and after incidents, including root cause analysis
• Provide application support, issue management, and escalation
• Perform incident investigation, diagnosis, and resolution
• Perform system monitoring and remediation
The successful candidate will meet the following qualifications:
• 7+ years of experience installing, administering, and maintaining Oracle or Red Hat Linux-based servers
• 5+ years of experience designing and implementing redundant systems, including data backup and recovery, high availability, load balancing, and disaster recovery
• 5+ years of experience designing, analyzing, and repairing large-scale distributed systems
• Experience deploying and maintaining AWS and on-premises Linux servers
• Experience in application deployment automation, modern DevOps practices, and infrastructure as code
• Experience with IT automation tools such as Ansible Automation Platform, Chef, Puppet, or Terraform
• Knowledge of core IT infrastructure technologies, including virtualization, networking, and storage management
• Technical documentation skills
• Comfortable interacting professionally with management at various levels
• Takes ownership of areas of responsibility and makes recommendations and decisions on the improvement and operation of those areas
• Strong organizational skills
 

Education

Any Graduate