Candidate's Name
Data Engineer/Scientist
EMAIL AVAILABLE PHONE NUMBER AVAILABLE LINKEDIN LINK AVAILABLE
CAREER SUMMARY
A Data Engineer with expertise in developing, implementing, and optimizing data pipeline systems and
ETL processes. Skilled in Big Data technologies including Hadoop, Impala, Sqoop, Pig, ZooKeeper, and
Hive, and in cloud platforms such as AWS, GCP, and Azure. Proficient in creating data pipelines with
AWS services for storage, analytics, and modeling. Service-oriented programmer proficient in
JavaScript, Python, PySpark, NodeJS, R, Java, and SQL. Strong in data cleaning, reshaping, ETL
pipeline building, and generating subsets with Databricks, data warehouses, data lakes, NumPy, Pandas,
and PySpark. Experienced with DevOps practices and tools including Jenkins, Docker, and Splunk.
PROFESSIONAL EXPERIENCE
Seacoast Bank, Senior Data Engineer Feb 2023 – Present | Tampa, FL
Built a data lake in Amazon S3 using Python, R, and PySpark for the client; imported data from
Snowflake tables, a CRM Postgres DB, Salesforce, MySQL Server, and Amazon RDS, and integrated
data from these multiple sources (illustrative sketch below).
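For illustration, a minimal PySpark ingestion sketch of the kind described above; the connection details, table name, and bucket path are hypothetical placeholders, and a Postgres JDBC driver is assumed to be on the Spark classpath:

    # Illustrative sketch only; names and credentials are placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("s3-data-lake-ingest").getOrCreate()

    # Pull a table from a CRM Postgres source over JDBC (credentials would
    # normally come from configuration, not be hard-coded).
    crm_accounts = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://crm-host:5432/crm")
        .option("dbtable", "public.accounts")
        .option("user", "etl_user")
        .option("password", "********")
        .load()
    )

    # Stamp each row with its load date, then land the raw extract in the
    # S3 data lake as date-partitioned Parquet.
    crm_accounts = crm_accounts.withColumn("ingest_date", F.current_date())
    (crm_accounts
        .write.mode("overwrite")
        .partitionBy("ingest_date")
        .parquet("s3a://example-data-lake/raw/crm/accounts/"))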
Developed automated regression scripts to validate ETL processes across multiple databases,
including AWS Redshift, Oracle, MongoDB, T-SQL, and SQL Server, using Python and PySpark (sketch below).
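A minimal sketch of such a validation check, assuming both ends of the pipeline are readable as Spark DataFrames; the paths and column names are placeholders:

    # Compare source and target ends of an ETL run; illustrative only.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("etl-regression-check").getOrCreate()

    source = spark.read.parquet("s3a://example-lake/raw/orders/")
    target = spark.read.parquet("s3a://example-lake/curated/orders/")

    # Row-count parity between the two ends of the pipeline.
    assert source.count() == target.count(), "row count mismatch"

    # Cheap content check: compare an aggregate fingerprint of a key column.
    src_sum = source.agg(F.sum("order_total")).first()[0]
    tgt_sum = target.agg(F.sum("order_total")).first()[0]
    assert src_sum == tgt_sum, "order_total checksum mismatch"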
Transformed vast sets of financial temporal data into actionable insights by leveraging AWS S3, EMR,
Athena, HDFS, Databricks, and Apache Spark, resulting in a 30% increase in data processing
efficiency.
Using Amazon Aurora databases and DynamoDB, increased storage capacity by 50% while reducing
latency by 250%; implemented real-time data processing solutions using AWS Kinesis and AWS
Lambda (handler sketch below).
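A skeleton of the kind of Kinesis-triggered Lambda handler described above, with a hypothetical DynamoDB table as the sink:

    # Sketch of a Kinesis-triggered Lambda; the table name is a placeholder.
    import base64
    import json

    import boto3

    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("example-events")

    def handler(event, context):
        # Kinesis delivers records base64-encoded inside the event payload.
        for record in event["Records"]:
            payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
            table.put_item(Item=payload)
        return {"processed": len(event["Records"])}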
Optimized the performance of AWS-hosted applications using CloudWatch monitoring, reducing
error rates by 10%, and migrated the company's entire workload to the AWS cloud using EC2 and S3
for efficient scaling, improving efficiency by 40%.
Evaluated business requirements, performed data segmentation, integrated customer data into
emails, enforced compliance approvals, and analyzed customer engagement using Databricks,
Snowflake, PySpark, and AWS S3.
Developed and maintained cloud-based data manipulation pipelines utilizing SQL, Python scripting,
and dbt, ensuring efficient data transformation and integration across banking applications.
Implemented Data Mining and parsing techniques to extract valuable insights from large datasets,
improving decision-making processes.
Created and optimized SQL queries for robust data reporting and visualization in Looker, enhancing
the accuracy and accessibility of financial reports for stakeholders.
Aligned OLTP and OLAP processes by collaborating with cross-functional teams; achieved a 20%
increase in data consistency, enhancing strategic planning and accelerating decision-making
across the organization by 25%.
Played a vital role in implementing CI/CD pipelines using Git, Jenkins, and Maven, streamlining
development processes.
Exported analyzed data to relational databases using Amazon EMR for visualization and report
generation with Power BI and Tableau. Used Apache Airflow to automate DAGs, their dependencies,
and log handling (DAG sketch below).
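An illustrative Airflow DAG of this shape; the task callables and schedule are assumptions, not the production pipeline:

    # Minimal Airflow 2.x DAG sketch with three dependent tasks.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        ...  # pull source data

    def transform():
        ...  # clean and reshape

    def load():
        ...  # write to the warehouse

    with DAG(
        dag_id="example_etl",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        t1 = PythonOperator(task_id="extract", python_callable=extract)
        t2 = PythonOperator(task_id="transform", python_callable=transform)
        t3 = PythonOperator(task_id="load", python_callable=load)
        t1 >> t2 >> t3  # declare task dependencies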
Thryve Digital Health LLP, Senior Data Analyst / Engineer Jun 2019 – Aug 2021 | Hyderabad, India
Developed predictive models using regression in collaboration with the healthcare analytics team,
using Python, AWS SageMaker, EC2, and S3 to analyze total charges and length of stay for patients
with COVID-19 and mental illness (regression sketch below).
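A hedged sketch of such a regression, shown with scikit-learn locally rather than SageMaker; the dataset path and feature columns are hypothetical:

    # Illustrative length-of-stay regression; inputs are placeholders.
    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("patient_encounters.csv")
    X = df[["age", "num_diagnoses", "is_covid", "is_mental_health"]]
    y = df["length_of_stay"]

    # Hold out a test split to sanity-check generalization.
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
    model = LinearRegression().fit(X_train, y_train)
    print("R^2 on held-out data:", model.score(X_test, y_test))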
Contributed to the implementation of a medical records filing system which helped decrease
outpatient wait time by 13.2%, adhering to Agile principles and delivering projects on time.
Analyzed and transformed patients' time-series data by running batch-processing jobs in data
warehouses using HDFS, Azure Databricks, Azure Data Factory, and Apache Spark.
Managed on-premises data infrastructure including Apache Kafka, Apache Spark, Elasticsearch,
Redis, and MongoDB, using Docker containers.
Constructed MapReduce jobs to validate, clean, and access data, and built Sqoop jobs with
incremental load to populate Hive external tables (mapper sketch below).
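As one concrete form such a job can take, a Hadoop Streaming-style mapper in Python for the validate-and-clean step; the field layout is an assumption:

    # Streaming mapper: reads raw CSV lines on stdin, drops malformed rows,
    # and emits a cleaned record keyed for the shuffle phase.
    import sys

    for line in sys.stdin:
        fields = line.rstrip("\n").split(",")
        # Drop malformed rows (wrong column count or empty ID).
        if len(fields) != 5 or not fields[0]:
            continue
        # Normalize whitespace and emit record_id as the shuffle key.
        cleaned = [f.strip() for f in fields]
        print("\t".join([cleaned[0], ",".join(cleaned[1:])]))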
Leveraged statistical modeling techniques including decision trees and generalized linear models
(GLM) using SAS and MATLAB for predictive modeling in the insurance industry.
Conducted comprehensive data analysis and transformation, ensuring compliance with industry
regulations.
Experienced with data warehousing techniques including star schema, snowflake schema,
normalization, denormalization, transformations, and aggregation.
Authored detailed technical documentation to facilitate knowledge transfer and maintain high
standards of information technology practices.
Utilized advanced data analysis and risk assessment methodologies to enhance annuity and insurance
risk predictions, resulting in a 15% improvement in accuracy.
Integrated results into database applications and generated reports using spreadsheets and VBA,
facilitating data-driven decision-making.
Designed and created dashboards using ggplot, Python matplotlib, Power BI, and Tableau to analyze
important features and model performance.
Implemented version control using Git/GitHub for storing data on medical projects, ensuring better
collaboration and code management within the team.
Cliff.AI, Software Engineer – Data Crawling & Analytics Dec 2018 – May 2019 | Hyderabad, India
Implemented web scraping scripts to gather data from various websites, ensuring ethical and efficient
data extraction practices, using Python libraries, Java, R, and SQL (scraper sketch below).
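A minimal scraping sketch using the widely used requests and BeautifulSoup libraries; the URL and CSS selector are placeholders, and any real target's robots.txt and terms of service should be checked first:

    # Illustrative scraper; endpoint and selector are hypothetical.
    import requests
    from bs4 import BeautifulSoup

    resp = requests.get("https://example.com/listings", timeout=10)
    resp.raise_for_status()

    # Parse the page and pull the text of each listing title.
    soup = BeautifulSoup(resp.text, "html.parser")
    rows = [tag.get_text(strip=True) for tag in soup.select("div.listing-title")]
    print(rows[:5])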
Utilized Pandas and NumPy to transform raw data into a usable format, improving the quality of
datasets for further analysis.
Automated ETL processes across billions of rows of data, which reduced manual workload by 29%
monthly.
Ingested data from disparate data sources using a combination of SQL, Google Analytics API, and
Salesforce API using Python to create data views to be used in BI tools like Tableau.
Designed and maintained robust data pipelines for annuity computations and risk analysis using SQL,
JSON, and XML formats.
Automated data extraction and processing workflows using VBA, ensuring seamless integration of
diverse data sources into predictive modeling frameworks.
Collaborated with cross-functional teams to develop and deploy database applications and statistical
models, leveraging skills in HTML and teamwork to enhance data analysis and decision support
systems.
Led the implementation of RESTful API integrations for seamless data exchange between domain
systems, leveraging JSON and XML formats for data transfer (request sketch below).
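An illustrative REST pull of this kind; the endpoint, token, and payload schema are assumptions:

    # Sketch of a JSON pull from a hypothetical REST endpoint.
    import requests

    API = "https://api.example.com/v1/records"
    headers = {"Authorization": "Bearer <token>", "Accept": "application/json"}

    resp = requests.get(API, headers=headers,
                        params={"since": "2019-01-01"}, timeout=30)
    resp.raise_for_status()
    for record in resp.json()["records"]:
        ...  # map fields into the downstream domain system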
Engineered data repositories and automated ETL processes to support complex data requirements in
the banking sector.
Contributed to developing BI tooling solutions, integrating Power BI dashboards with existing data
infrastructure, using Tableau to create and maintain data visualizations.
Collaborated with cross-functional teams, including data scientists & application developers to guide
the development and implementation of Cloud applications, systems, and processes using DevOps
methodologies.
EDUCATION
Master's in Computer Technology, Eastern Illinois University Aug 2021 – May 2023 | Chicago, IL
Bachelor's in Computer Science, JNTUH Apr 2015 – May 2019 | Hyderabad, India
SKILLS
Programming languages: Python, Java, SQL, PySpark, Scala, Bash
Data: Hadoop, S3, Redshift, Hive, Elasticsearch, Redis, PostgreSQL, MongoDB, MySQL
Distributed systems: Apache Spark, Databricks, Kubernetes, Kafka
AWS/Azure cloud: S3, EC2, EMR, Airflow, Lambda, Athena, Glue, IAM, Redshift, DynamoDB,
CloudWatch, SageMaker, Kinesis, Azure SQL Database, Azure Load Balancer, DevOps tool integrations
Other: Docker, Git, Kibana, Flask, PyTorch, Salesforce, Tableau, Power BI, MetaTrader 4,
Jira, Pandas, NumPy, OpenStreetMap, Terraform, Grafana