AWS Data Engineer Resume - Irving, TX

Suraj P
Contact: PHONE NUMBER AVAILABLE | Email: EMAIL AVAILABLE
LinkedIn: LINKEDIN LINK AVAILABLE

OBJECTIVE:
Enthusiastic Data Engineer with around 3 years of data engineering experience and 4+ years in IT overall, with the skills to create, build, install, test, and maintain highly scalable data management systems, ensure that systems satisfy business needs, create high-performance algorithms, and develop data set processes. Seeking opportunities to apply my knowledge and abilities to developing innovative solutions for corporate clients.

SUMMARY:
- Deployed instances, provisioned EC2 and S3 buckets, configured security groups, and set up the Hadoop ecosystem for Cloudera in AWS. Experienced with distributed computing architectures built on AWS products (e.g., EC2), migrating raw data to Amazon S3, and performing refined data processing.
- Used the Snowflake cloud data warehouse and AWS S3 buckets to integrate data from multiple sources, including loading nested JSON-formatted data into Snowflake tables (see the sketch after this list).
- Created AWS Lambda functions, provisioned EC2 instances in the AWS environment, implemented security groups, and administered Amazon VPCs.
- Built environments for different applications on AWS by provisioning EC2 instances using Docker, Bash, and Terraform.
- Performed analysis of large, complex data sets and assisted with data management processes in academic research using Jupyter Notebooks with Apache Spark, Python, R, SAS, and the Snowflake data warehouse on AWS.
- Developed and maintained data pipelines on the Azure analytics platform using Azure Databricks, PySpark, Python, and the Pandas and NumPy libraries.
- Proficient in designing and developing dashboards and reports with Tableau visualizations such as bar graphs, line diagrams, Pareto charts, funnel charts, scatter plots, pie charts, donut charts, bubble charts, heat maps, and tree maps according to end-user requirements.
- Designed and developed Power BI graphical and visualization solutions from business requirement documents and plans for creating interactive dashboards.
- Good understanding of Spark architecture with Databricks and Structured Streaming.
- Set up Databricks on AWS and Microsoft Azure, configured Databricks workspaces for business analytics, and managed Databricks clusters.
- Designed and set up an enterprise data lake supporting use cases including analytics, processing, storage, and reporting of voluminous, rapidly changing data.
- Researched and analyzed data sources in support of data discovery for OLAP cubes. Experienced with data modeling, schema design, and SQL development.
- Built data pipelines in Airflow on GCP for ETL jobs using different Airflow operators.
- Experienced with Google Cloud components, Google container builders, and GCP client libraries.
- Created databases on RDS and loaded data from AWS S3 into RDS SQL Server. Experienced in creating APIs and security groups for accessing the APIs externally.
- Converted code from Scala to PySpark in DHF (Data Harmonization Framework) and migrated the code and DHF_UI from DHF 1.0 to DHF 2.1.
- Analyzed requirements and framed the business logic for the ETL process.
- Identified and designed process flows to transform data and populate the target databases.
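The JSON-to-Snowflake loading mentioned above could look roughly like the following. This is a minimal, hypothetical sketch using the snowflake-connector-python package; the connection settings, stage name, and table name are illustrative placeholders, not details taken from this resume.

```python
# Minimal sketch: load nested JSON staged in S3 into a Snowflake table.
# All identifiers (credentials, stage, table) are illustrative placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    user="<user>",
    password="<password>",
    account="<account>",
    warehouse="<warehouse>",
    database="<database>",
    schema="<schema>",
)
cur = conn.cursor()
# Land the raw JSON in a VARIANT column; nested fields can later be
# queried with Snowflake's dot notation or LATERAL FLATTEN.
cur.execute("CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT)")
cur.execute("""
    COPY INTO raw_events
    FROM @my_s3_stage/events/           -- external stage over the S3 bucket
    FILE_FORMAT = (TYPE = 'JSON')
""")
cur.close()
conn.close()
```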
TECHNICAL SKILLS:
Cloud Technologies: Azure, Azure Data Lake, Databricks, Blob Storage, Data Factory, Synapse, Amazon EC2, Amazon S3, Amazon RDS, AWS Lambda, Amazon EMR, AWS Glue, BigQuery, Dataproc, GCS, GCP
Code Repository Tools: Git, GitHub, Bitbucket
Databases: MySQL, SQL Server Management Studio 18, MySQL Workbench, Oracle Database 11g Release 1, Amazon Redshift, PostgreSQL, Snowflake
End User Analytics: Power BI, Tableau
NoSQL Databases: HBase, Cassandra, MongoDB, DynamoDB
Languages: Python, SQL, PySpark, PL/SQL, UNIX Shell Script, Java, C, C++
ETL: Snowflake, AWS Glue
Operating Systems: Windows 10/7/XP/2000/NT/98/95, UNIX, Linux, DOS

PROFESSIONAL EXPERIENCE:

Data Engineer
Capital One, McLean, VA	May 2022 - Present

Responsibilities:
- Designed and developed a security framework providing fine-grained access to objects in AWS S3 using AWS Lambda and DynamoDB.
- Implemented Lambda to configure the DynamoDB autoscaling feature and built a data access layer for AWS DynamoDB data.
- Loaded data into S3 buckets using AWS Glue and PySpark. Filtered data stored in S3 buckets using Elasticsearch and loaded data into Hive external tables.
- Migrated existing databases from on-premises to AWS Redshift using various AWS services.
- Developed the PySpark code for AWS Glue jobs (see the sketch after this section).
- Worked with CSV, JSON, flat, and Parquet files to load data from sources into raw tables. Implemented triggers to schedule pipelines.
- Migrated an existing on-premises application to AWS, using services such as EC2 and S3 for data set processing and storage.
- Performed configuration, deployment, and support of cloud services in Amazon Web Services (AWS).
- Created on-demand tables over S3 files using Lambda functions and AWS Glue with Python and PySpark.
- Transformed data using AWS Glue dynamic frames with PySpark, cataloged the transformed data using crawlers, and scheduled the job and crawler using the Glue workflow feature.
- Designed, developed, and tested ETL processes in AWS Glue to migrate campaign data from external sources such as S3 (ORC, Parquet, and text files) into AWS Redshift.
- Used a SQL approach to create notebooks and DHF_UI in DHF 2.1.
- Experienced in designing, building, deploying, and using most of the AWS stack (including EC2 and S3), focusing on high availability, fault tolerance, and auto-scaling.
- Extracted structured data from multiple relational data sources as DataFrames in Spark SQL on Databricks.
- Loaded data from internal servers and the Snowflake data warehouse into S3 buckets.
- Estimated cluster sizes and monitored and troubleshot the Spark Databricks cluster.
- Created Databricks notebooks to streamline and curate data for various business use cases.
- Migrated existing on-premises applications to AWS Redshift, using AWS services such as EC2 and S3 for processing and storage.
- Assisted with the development and review of technical and end-user documentation, including ETL workflows, research, and data analysis.
- Used Snowflake extensively for ETL operations, importing data from Snowflake to S3 and from S3 to Snowflake.
- Good knowledge of Snowflake architecture and concepts.
- Scheduled all jobs with Airflow scripts in Python, adding different tasks (including Lambda invocations) to DAGs.
- Created database tables, indexes, constraints, and triggers for data integrity.

Environment: AWS, Snowflake, Python, SQL, PostgreSQL, PySpark, PL/SQL, UNIX Shell Script, EC2, Spark, Databricks, AWS Glue, Redshift, ETL.
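A Glue job of the kind described above might look roughly like this. It is a hypothetical sketch built on the standard awsglue job skeleton; the catalog database, table name, column mappings, and S3 output path are placeholders, not details from this resume.

```python
# Hypothetical sketch of an AWS Glue PySpark job: read a crawler-cataloged
# table as a DynamicFrame, transform it, and write Parquet back to S3.
import sys
from awsglue.transforms import ApplyMapping
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glue_context = GlueContext(sc)
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a source table that a Glue crawler has already cataloged.
source = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="campaign_events"  # placeholder names
)

# Rename/cast columns with ApplyMapping, a typical dynamic-frame transform.
mapped = ApplyMapping.apply(
    frame=source,
    mappings=[
        ("event_id", "string", "event_id", "string"),
        ("amount", "string", "amount", "double"),
    ],
)

# Write the transformed data back to S3 as Parquet.
glue_context.write_dynamic_frame.from_options(
    frame=mapped,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/refined/"},
    format="parquet",
)
job.commit()
```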
Data Engineer
T-Mobile, Bellevue, WA	June 2020 - Apr 2022

Responsibilities:
- Wrote AWS Lambda functions in Python that invoke Python scripts to perform various transformations on large data sets in EMR clusters (see the sketch after this section).
- Designed the ETL process from various sources into Hadoop/HDFS for analysis and further processing of data modules.
- Worked with Amazon Web Services (AWS) to integrate EMR with Spark, S3 storage, and Snowflake.
- Performed various transformations and actions in Spark; the resulting data was saved back to HDFS and from there to the target Snowflake database.
- Designed, developed, tested, implemented, and supported data warehousing ETL using Talend and Hadoop technologies.
- Worked extensively with MySQL to identify the tables and views required for export into HDFS.
- Engaged directly with IT to understand their key challenges and to demonstrate and price solutions fitting their needs for PaaS- and IaaS-based solutions.
- Created Hive tables on top of HDFS and developed Hive queries to analyze the data. Staged data by persisting it to Hive, connected Tableau to the Spark cluster, and developed dashboards.
- Implemented UDFs, UDAFs, and UDTFs in Java for Hive to process data that can't be handled with Hive's built-in functions. Used Hive to analyze partitioned, bucketed data and compute various metrics for reporting.
- Created builds and releases for multiple projects (modules) in the production environment using Visual Studio Team Services (VSTS).
- Knowledgeable about StreamSets pipelines used for injecting data into the raw layer from an Oracle source.
- Used Terraform scripts to automate instances that had previously been launched manually.

Environment: AWS, Hadoop, Hive, Snowflake, Python, SQL, PostgreSQL, PySpark, PL/SQL, UNIX Shell Script, EC2, Spark, Databricks, EMR, Redshift, ETL.
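The Lambda-triggers-EMR pattern in the first bullet could be sketched as follows. This is a hypothetical illustration using boto3's EMR client; the cluster ID and script path are placeholders.

```python
# Hypothetical sketch: an AWS Lambda handler that submits a Spark step
# to an existing EMR cluster via spark-submit and command-runner.jar.
import boto3

emr = boto3.client("emr")

def handler(event, context):
    # EMR runs the PySpark script stored in S3 against the cluster.
    response = emr.add_job_flow_steps(
        JobFlowId="j-EXAMPLE123",  # placeholder EMR cluster ID
        Steps=[
            {
                "Name": "transform-large-dataset",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "spark-submit",
                        "s3://example-bucket/scripts/transform.py",
                    ],
                },
            }
        ],
    )
    return response["StepIds"]
```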
Jr. Python Developer
Rite Software Solutions, Hyderabad, India	June 2018 - Dec 2019

Responsibilities:
- Wrote and executed various MySQL database queries from Python using the Python MySQL connector and the MySQLdb package (see the sketch after this section). Worked with tools like Jenkins to implement build automation.
- Developed the company's internal CI system, providing a comprehensive API for CI/CD.
- Participated in various phases of the software development life cycle (SDLC), including requirement gathering, design, analysis, and code development.
- Worked with a team of developers on Python applications for risk management.
- Generated Python Django forms to record online user data. Used Python and Django for graphics, XML processing, data exchange, and business logic implementation.
- Designed and developed the UI of the website using HTML, XHTML, AJAX, CSS, and JavaScript. Developed and tested many dashboard features using Python, Java, Bootstrap, CSS, JavaScript, and jQuery.
- Developed custom Jenkins jobs/pipelines containing Bash shell scripts that use the AWS CLI to automate infrastructure provisioning.
- Experienced in writing subqueries, stored procedures, triggers, cursors, and functions on MySQL and PostgreSQL databases.
- Used the Selenium library to write a fully functioning test automation process that simulated submitting different web requests from multiple browsers to the web application.
- Performed compatibility testing of applications for dynamic and static browser content using HTML IDs and XPath in Selenium.
- Worked on integrating Selenium RC/WebDriver with the existing API to test the framework.
- Cleaned data and processed third-party spending data into maneuverable deliverables in specific formats using Excel macros and Python libraries. Used the TDD (test-driven development) methodology.

Environment: Python, Django, MongoDB, Selenium, Pandas, Java, jQuery, ZooKeeper, Bootstrap, MySQL, Linux, AJAX, JavaScript, Apache, JIRA, Cassandra, HTML5, CSS, AngularJS, Backbone.js.
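A query of the kind described in the first bullet could be run roughly like this. It is a minimal sketch using the mysql-connector package; the connection settings, table, and column names are placeholders.

```python
# Minimal sketch: run a parameterized MySQL query from Python.
# Connection settings and the table/columns are illustrative only.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost",
    user="<user>",
    password="<password>",
    database="<database>",
)
cur = conn.cursor()
# Parameterized query (%s placeholders) avoids SQL injection.
cur.execute("SELECT id, name FROM users WHERE created_at >= %s", ("2019-01-01",))
for row in cur.fetchall():
    print(row)
cur.close()
conn.close()
```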
EDUCATIONAL DETAILS:

Master's in Computer and Information Systems	January 2020 - December 2021
Cleveland State University, Cleveland, Ohio, USA
- Two-time recipient of the prestigious AHUJA Scholarship, in 2020 and 2021, based on academics and the projects I've worked on.

Bachelor's in Electronics and Communication Engineering	August 2015 - April 2019
Jawaharlal Nehru Technological University, Hyderabad, Telangana, India
- Enthusiastic student who participated in and volunteered at technical fests and competitions across multiple years.
- Member of the Computer Society of India.
