Site Reliability Engineer Resume Alphare...

Candidate Information
Title Site Reliability Engineer
Target Location US-GA-Alpharetta

Candidate's Name
EMAIL AVAILABLE PHONE NUMBER AVAILABLE Alpharetta, GA

SUMMARY

Seasoned Site Reliability Engineer with 13 years of experience, including recent roles at Last9, Morgan Stanley, and SIEMENS, seeking to leverage expertise in SRE practices, observability, and automation. Proficient in configuring monitoring/alerting with PromQL, automating deployments with GitHub, and managing AWS infrastructure using Terraform. Adept at collaborating with DevOps teams, evangelizing SRE principles, and defining SLIs/SLOs to enhance system reliability and performance.

WORK EXPERIENCE

Last9 (US)
Site Reliability Engineer, Jul 2022 - Present
- Assisted customers with the configuration and utilization of SRE-focused products to meet their Service Level Objectives (SLOs).
- Collaborated with clients to identify high cardinality metrics and implemented streaming aggregation strategies to address them effectively.
- Developed automated solutions, including Bash and Python scripting, GitHub-based deployment pipelines, and infrastructure provisioning with Terraform for EKS deployments.

Morgan Stanley (US)
Consulting Site Reliability Engineer, Jan 2022 - Jul 2022
- Enhanced observability and fine-tuned alerting mechanisms using an APM tool, leading to improved system monitoring and performance insights.
- Automated key manual processes, resulting in a 20% reduction in repetitive workload for the team.
- Developed and maintained Splunk dashboards for real-time monitoring, enabling proactive alerts on critical application transactions.

SIEMENS (US)
Consulting Site Reliability Engineer, Sep 2021 - Apr 2022
- Implemented Site Reliability Engineering methodologies to enhance the reliability and performance of cloud-native applications, resulting in more robust and scalable systems.
- Collaborated with cross-functional teams to automate infrastructure upgrades and developed comprehensive KPI dashboards, improving operational efficiency and visibility across the CI/CD pipeline.
- Established and refined monitoring and alerting systems, utilizing tools like Prometheus and Datadog, to proactively detect and resolve issues in critical applications and infrastructure components.

FICO (Mexico)
Site Reliability Engineer, Oct 2020 - Sep 2021
- Enhanced software development infrastructure by implementing build scripts, integrating continuous deployment tools, and refining continuous integration processes, facilitating efficient global engineering operations.
- Managed Kubernetes applications, ensuring robust data processing through Python scripts and maintaining a complex multi-master, multi-worker environment with extensive cron job scheduling and data management.
- Strengthened system security and reliability by automating task execution using Jenkins, fortifying environments with MFA and SSH tunneling, and conducting rigorous SSL/TLS certificate management and application support.

BMC Software (Mexico)
Sr. Technical Support Analyst, Oct 2016 - Oct 2020
- Managed incident resolution for server automation tools, ensuring adherence to SLAs and maintaining high levels of customer satisfaction.
- Facilitated customer migrations to AWS, including setup of VPCs, RDS, Security Groups, and EC2 instances, enhancing infrastructure scalability and efficiency.
- Diagnosed and resolved complex network and application server connectivity issues, and implemented AD and LDAP authentication for multiple clients, bolstering system security.
- Delivered expert analysis of Java errors and thread dumps, identifying root causes of execution errors or memory leaks, and provided tailored solutions or escalated bug reports.

Hewlett Packard Enterprise (Mexico)
SAN Storage Engineer, Feb 2015 - Oct 2016
- Configured LUNs and managed SAN connections on Brocade switches to efficiently allocate new storage capacity for applications, ensuring high availability through redundant connections and data replication across geographically dispersed data centers.
- Monitored storage array utilization to prevent exceeding an 85% threshold, conducted migration planning for applications at 80% capacity, and executed storage reclamation to optimize resource allocation.
- Performed troubleshooting for server connectivity issues with HBA and Brocade switches, and contributed to the infrastructure upgrade by transitioning to 3PAR storage arrays, enhancing system performance and reliability.

WebApps Administrator, Nov 2011 - Feb 2015
- Ensured optimal functionality and performance of over 400 enterprise applications by monitoring system health, troubleshooting errors, and executing necessary modifications to prevent outages and enhance efficiency.
- Developed and implemented automation scripts for routine maintenance tasks, contributing to system reliability and data integrity by resolving ETL errors that directly affected business intelligence reporting for executive decision-making.
- Collaborated with application engineering teams to identify and rectify software bugs and infrastructure issues, improving application stability and performance while managing web server configurations across multiple platforms.

Tata Consultancy Services (Mexico)
Technical Support Analyst, Apr 2010 - Nov 2011
- Delivered comprehensive technical support for ERP systems, resolving issues across Order Management, Inventory, Purchase Orders, and General Ledger modules.
- Managed web application administration on WebLogic servers, conducting routine maintenance tasks such as disk cleanup and server recycling.
- Executed weekly updates of trade compliance rules, ensuring accurate database bulk loads and effective coordination with database administrators and stakeholders.

EDUCATION

Universidad de Guadalajara
Bachelor's in Communications and Electronics Engineering, 2009

SKILLS

Site Reliability Engineering, Incident Management, Configuration Management, Change Management, Build Management, Release Management, Version Control System, SAN Storage, Release Engineering, Grafana, Prometheus, PromQL, Zabbix, AppDynamics, ThousandEyes, Datadog, Dynatrace, AWS Services, EC2, VPC, Route 53, S3, Lambda, CloudFormation Templates, Load Balancers, CloudWatch, Security Groups, EBS, IAM, EKS, Ansible, Docker, Kubernetes, OpenShift, SSL/TLS, Automation, GitHub, Python, Bash Shell, NSH, Blcli, Apache, Tomcat, JBoss, Java, High-Availability, Disaster Recovery, RHEL/CentOS, Ubuntu, SUSE, Debian, IBM AIX, Puppet, Jenkins, Terraform, GitLab, Sus Foundation V3, AWS Certified Cloud Practitioner, Network Protocols and Services, HTTP, SMTP, TCP/IP, LDAP, NFS, NIS, DNS, DHCP, SSH, TLS, Samba, FTP, Amazon Web Services (AWS), Oracle, SQL Server, PSSQL, Apache Tomcat, JBoss, Shell Script, Bash, Python, PowerShell, PowerCLI, NSH, Blcli, ServiceNow, Jira, HPSM, Salesforce, Docker, Kubernetes, OpenShift, AppDynamics, Zabbix, Splunk, Prometheus, Grafana, Datadog, ThousandEyes, Dynatrace
