Description

Responsibilities:

  • Focused and attentive to business-critical issues.
  • Highly proactive; takes the initiative to identify problem areas and develop solutions.
  • Excellent analytical and problem-solving skills.
  • Responsible for incident management for defined IT applications, ensuring 100% uptime of systems and applications.
  • Perform daily system monitoring (through monitoring tools and script-based notifications), verifying the integrity and availability of all hardware, server resources, and key processes. Review system and application logs and verify completion of scheduled jobs such as application start-up processes and backups.
  • Manage and administer applications running on Linux/AIX/Solaris/Windows-based and cloud-native systems, including configuration, troubleshooting, and automation.
  • Identify the root causes of incidents to prevent recurrence.
  • Act as the main point of contact for coordinating, resolving, and discussing application, interface, and integration problems with vendors and other teams.
  • Perform ongoing application performance tuning, application upgrades, and resource optimization as required.
  • Responsible for deployment and movement of application code, bug fixes, and data patches, and for release management in Production, SIT, and UAT environments, in coordination with the Development Team and vendors.
  • Manage execution of OS and DB patching for the defined IT applications.
  • Conduct regular bug/issue tracker review meetings and follow-ups with the respective teams for quick problem resolution.
  • Assist the team in achieving the best possible design solution for functional requirements.
  • Attend Change Control Board meetings and plan production movements.
  • Prepare application process documents and KB documents. Periodically review KB documents and publish them to the L1 Team.
  • Collaborate with development teams to ensure smooth integration and deployment of applications in a cloud-native environment.
  • Collaborate with security teams to implement and maintain secure cloud environments.
  • Must be ready to work in a 24x7 support environment.

Requirements:

  • Prior L1 & L2 Production support experience.
  • At least 2-3 years of relevant experience on Linux/UNIX/Windows platforms, preferably in the BFSI segment, in a high-volume or critical production application environment.
  • Hands-on experience in installation, configuration, monitoring, and management of applications and databases.
  • Strong Unix/Linux skills: well versed in tasks like file editing, system resource monitoring, running and scheduling processes, and troubleshooting system issues.
  • Hands-on experience in RDBMS (Oracle/MySQL/MSSQL) and SQL queries.
  • Exposure to web and application servers (WebLogic, WebSphere, JBoss, Apache Tomcat, IIS).
  • Hands-on experience in writing and executing SQL queries with JOINS.
  • Hands-on experience in any scripting language such as Shell, Python, or Java is a plus.
  • Good exposure to alerting and monitoring tools.
  • Exposure to scheduling products such as AutoSys or crontab.
  • Understanding of various cloud platforms, such as AWS, GCP, and Azure, as well as expertise in service mesh, Kubernetes, networking, and infrastructure automation.
  • Experience deploying, configuring, and maintaining Kubernetes clusters (GKE, EKS, AKS) using tools like Helm charts, Ansible, CloudFormation, and Terraform.
  • Experience supporting database technologies such as Cloud SQL, Amazon RDS, Oracle, and PostgreSQL.
  • In-depth knowledge of networking concepts, including CIDR, load balancing, VPCs, and transit gateways.
  • Experience in application server log analysis, troubleshooting, and problem solving.
  • Exposure to Incident, Problem, Capacity, Change, and Release Management processes.
  • Working knowledge of hardware and networking technologies such as load balancers, DNS, firewalls, etc.
  • Hands-on experience with APIs and microservices.
  • Exposure to maintaining and operating CI/CD pipelines.

Education

Any Graduate