Candidate's Name
Austin, TX | Contact: PHONE NUMBER AVAILABLE | Email: EMAIL AVAILABLE

OBJECTIVE:
Big Data Engineer with 5+ years of experience and a strong background in end-to-end enterprise data warehousing and big data projects. A creative and enthusiastic professional seeking full-time opportunities as a Data Engineer to apply my knowledge and skills to developing innovative solutions for corporate customers.

Experience working with MySQL and NoSQL database technologies, including MongoDB.
Experience in writing complex SQL queries and creating reports and dashboards.
Developed Spark/Scala and Python code for regular expression (regex) projects in the Hadoop/Hive environment on Linux/Windows for big data resources.
Good experience working with Python for data manipulation, data wrangling, and data analysis using libraries such as Pandas, NumPy, Scikit-Learn, and Matplotlib.
Designed, developed, and implemented performant ETL pipelines using the Python API of Apache Spark (PySpark) on Databricks.
Experience working with AWS (Amazon Web Services): Elastic MapReduce (EMR), S3 storage, EC2 instances, AWS Glue, Lambda, Step Functions, and data warehousing.
Hands-on experience designing and building data models and data pipelines for data warehouses and data lakes. Experience across the project life cycle, including data acquisition, data cleaning, data manipulation, data validation, data mining, algorithms, and visualization.
Experience in creating visual reports, graphical analysis, and dashboard reports using Tableau.
Experience in migrating reports with the Tableau migration tool.
Working experience with version control tools such as Git, GitHub, and Bitbucket.

TECHNICAL SKILLS
Programming Languages: Python, PySpark, SQL, C, C++
Database Management: PostgreSQL, MySQL, SQL Server, Oracle, MS Access, AWS
NoSQL Databases: Introduction to MongoDB (NoSQL)
Tools and IDEs: Eclipse, Spyder, Code::Blocks, GitHub, MATLAB, SAP, data analysis, Microsoft Office
Cloud Technologies: AWS (S3), GCP (Google Cloud Platform), Azure
Visualization & ETL Tools: Tableau, Power BI, Talend, DataStage, Tableau migration server
Operating Systems: Unix, Linux, Windows, macOS
Concepts: Cloud big data, BigQuery, data mining and business intelligence, data warehousing, data visualization, database development and administration, machine learning, E-R and integrity diagrams, SQL queries, system analysis methods, introduction to the Hadoop ecosystem

PROFESSIONAL EXPERIENCE

Client: AJLA (America's Job Link Alliance), Kansas
Role: Data Engineer, Apr 22 - Present
Generated reports on predictive analytics using SQL and Tableau, including visualizing model performance and prediction results.
Created reports in Tableau to visualize the data sets created, and tested SQL connectors.
Developed storytelling dashboards in Tableau Desktop and published them to Tableau Server, allowing end users to understand the data on the fly using quick filters for on-demand information.
Expertise in writing complex SQL queries; made use of indexing, aggregation, and materialized views to optimize query performance.
Involved in converting SQL queries into Spark transformations using Spark DataFrames, Scala, and Python (see the sketch after this role).
Used Python for processing data and converting it into time-series data.
Extracted, transformed, and loaded data sources to generate CSV data files with Python and SQL queries.
Integrated large amounts of data from multiple data sources.
Imported data into Power BI using different ETL methods, DirectQuery, and RESTful APIs.
Worked on building stream-processing systems using solutions such as Spark Streaming.
Extensive knowledge of various reporting objects in Tableau, such as facts, attributes, hierarchies, transformations, filters, prompts, calculated fields, sets, groups, and parameters; experience working with Flume and NiFi for loading log files into Hadoop.
Worked with the Tableau migration tool to migrate all developed reports onto each state's Tableau servers.
Also worked as an internal tester to ensure good-quality reports and make deployment easier.
Environment: Hadoop, HDFS, Spark, Kafka, Azure, Power BI, Azure Data Lake, Data Factory, Data Storage, Databricks, MapReduce, Scala, Python, Hive, HBase, Pig, Zookeeper, Oozie, Sqoop, PL/SQL, Oracle 12c, MS SQL, MongoDB, JSP, GitHub.
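As a minimal, illustrative sketch of the kind of SQL-to-Spark conversion described in this role (not the actual project code), the example below expresses a simple SQL aggregation as PySpark DataFrame transformations; the table and column names (orders, order_date, region, amount) are hypothetical placeholders.

    # Minimal sketch: rewriting a SQL aggregation as Spark DataFrame
    # transformations. Table and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("sql-to-dataframe-sketch").getOrCreate()

    # Equivalent SQL:
    #   SELECT region, SUM(amount) AS total_amount
    #   FROM orders
    #   WHERE order_date >= '2022-01-01'
    #   GROUP BY region
    orders = spark.table("orders")  # assumes a registered table or view named "orders"

    totals = (
        orders
        .filter(F.col("order_date") >= "2022-01-01")
        .groupBy("region")
        .agg(F.sum("amount").alias("total_amount"))
    )

    totals.show()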
Client: Artificial Inventions, Charlotte, NC
Role: Data Engineer, Dec 21 - Apr 22
Used AWS Athena extensively to ingest structured data from S3 into other systems such as Redshift, or to produce reports.
Performed end-to-end architecture and implementation assessments of various AWS services such as Amazon EMR, Redshift, S3, Athena, Glue, and Kinesis.
Designed and deployed ETL pipelines over S3 parquet files in a data lake using AWS Glue (see the sketch after this role).
Created data pipelines using Python, PySpark, and EMR services on AWS.
Maintained AWS Data Pipeline as a web service to process and move data between Amazon S3, Amazon EMR, and Amazon RDS resources.
Worked on migrating a quality-monitoring program from AWS EC2 to AWS Lambda, as well as creating logical datasets to administer quality monitoring on Snowflake warehouses.
Responsible for creating on-demand tables on S3 files using Lambda functions and AWS Glue with Python and PySpark.
Involved in developing a shell script that collects and stores logs created by users in AWS S3 (Simple Storage Service) buckets. This provides a record of all user actions and is a useful security indicator for detecting cluster termination and safeguarding data integrity.
Environment: HDFS, Spark, Kafka, AWS (EC2, Lambda, S3, IAM, CloudWatch, CloudFormation, Redshift), MapReduce, Sqoop, Scala, Python, Hadoop, Hive, Impala, HBase, Pig, Zookeeper, Oozie, PL/SQL, Oracle 12c, MongoDB, T-SQL, Git.
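The sketch below illustrates, under stated assumptions, the general shape of a PySpark ETL step over parquet files in S3, the kind of job that can run on AWS Glue or EMR. It is not the actual pipeline; the bucket paths and column names (event_id, event_ts, event_type) are hypothetical placeholders, and a real Glue job would typically also use Glue-specific constructs such as GlueContext.

    # Minimal sketch of a PySpark ETL step over parquet data in S3.
    # All paths and column names below are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("s3-parquet-etl-sketch").getOrCreate()

    # Read raw parquet files from a (hypothetical) landing bucket.
    raw = spark.read.parquet("s3://example-raw-bucket/events/")

    cleaned = (
        raw
        .dropDuplicates(["event_id"])                     # hypothetical key column
        .withColumn("event_date", F.to_date("event_ts"))  # derive a partition column
        .filter(F.col("event_type").isNotNull())
    )

    # Write curated, partitioned parquet back to a (hypothetical) curated bucket.
    (cleaned
     .write
     .mode("overwrite")
     .partitionBy("event_date")
     .parquet("s3://example-curated-bucket/events/"))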
Data Engineer - Graduate Teaching Assistant, Jan 20 - Apr 20
Courses: Data Structures, Data Mining
Worked as a teaching assistant under Dr. Phillips for Data Structures and Dr. Boettcher for Data Mining Tools and Techniques, helping them manage the workload while tutoring and mentoring the class. Provided solutions to data structures problems in Java and assisted students with their worksheets and assignments.
Using GDB_GP, analyzed the data sets and tuned performance to reach the ideal fitness value.
Analyzed different datasets and posed different questions after characterizing each data set (regression, classification).
Used Talend Data Fabric, which specializes in incorporating ETL into a larger framework for managing big data sets.
Used Talend's large range of connectors for integrating with sources such as database servers, Salesforce, and SAP.
Used DataStage, an ETL tool that extracts data, transforms it, applies different business rules, and loads it to a specific target.
Used DataStage to provide high-quality data that aids in gathering business insight.
Used DataStage to design, build, and evaluate the data, perform analysis, gather analysis requirements, and set up DataStage projects.

Data Engineer - Intern, Skill bit, India, Jun 19 - Dec 19
Knowledge of ETL methods for data extraction, transformation, and loading in corporate-wide ETL solutions and data warehouse tools for reporting and data analysis.
Developed ETL procedures to ensure conformity and compliance with minimal redundancy, and translated business rules and functionality requirements into ETL procedures.
Used the Spark DataFrame API with HiveContext to speed up analysis and handed the data over to the machine learning analytics team as required (see the sketch at the end of this experience section).
Performed data analysis and developed analytic solutions; investigated data to discover correlations and trends.

Assistant Site Engineer, CBRE South Asia Pvt Ltd, India, May 18 - Jul 18
Inspected facilities, analyzed operational data, and estimated technical and material requirements for project development.
Organized and monitored projects and provided technical assistance.
Worked with the Scrum team to deliver agreed user stories on time for every sprint. Implemented UNIX scripts to define the use-case workflow, process the data files, and automate the jobs.
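As referenced in the internship above, the following is a minimal sketch of querying a Hive table through the Spark DataFrame API. HiveContext was the pre-Spark-2.0 entry point; a SparkSession with Hive support enabled is the modern equivalent used here. The database and table names are hypothetical and not taken from the internship.

    # Minimal sketch: reading a Hive table as a Spark DataFrame and handing
    # off a filtered subset. Database/table names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("hive-dataframe-sketch")
        .enableHiveSupport()
        .getOrCreate()
    )

    # Load a Hive table, filter it, and persist the result as a new Hive table.
    df = spark.table("analytics.customer_events")
    recent = df.filter(df.event_date >= "2019-06-01")
    recent.write.mode("overwrite").saveAsTable("analytics.customer_events_recent")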
ACADEMIC PROJECTS

Capstone Project (Master's)
Title: Growers Basket, Aug 21 - Dec 21
Developed an application in MS Access that helps farmers (customers) buy seeds and pesticides from the comfort of their homes. Used Tables to build the application, plus Queries and Reports to generate daily/monthly sales reports that are easy for both customers and admins to understand.
Used Macros to navigate from the home page to all available pages of the application.
Layout: Developed a customer registry form using tables, queries, forms, and reports. The admin can make changes to a member record, such as deactivating it, deleting it, or editing its information.
The customer has to log in with their credentials to see all the products available in the store and shop.

Research project: A mini research study on how important databases are to Walmart (under Dr. Jean Kofi)
Walmart, as the biggest retailer in the world, acquires vast amounts of data every minute; it reportedly generates upwards of 2.5 petabytes of data per hour.
Walmart utilizes NoSQL, specifically a graph-focused database, to monitor its shoppers and confidential information.
NoSQL databases concentrate on data processing and are mainly used for commercial websites, internet and cloud applications, and data processing.
"A relational database really wasn't matching our criteria for performance and simplicity, due to the nature of our queries," Marcos noted. To overcome this, Marcos' team opted to employ Neo4j, a graph database, which is currently the market leader in this area.

Designing and implementing end-to-end data solutions (storage, integration, processing, visualization) in Azure.
Creating pipelines in ADF using Linked Services, Datasets, and Pipelines to extract, transform, and load data from different sources such as Azure SQL, Blob Storage, Azure SQL Data Warehouse, and the write-back tool, and back again.

CERTIFICATIONS AND LICENSES

Data Science on Google Cloud Platform: Designing Data Warehouses
Learned about the many storage choices available in GCP for files, relational data, documents, and big data, such as Cloud SQL, Cloud Bigtable, and BigQuery.
Knowledge of AWS services such as S3, EMR, Lambda, CloudWatch, RDS, Auto Scaling, CloudFormation, DynamoDB, Route 53, Glue, etc.
Learned how to collect data with Spark Streaming from an AWS S3 bucket in near real time and perform the necessary transformations.

Python (2020)
Learned how to deal with dates and times, read and write files, and interpret HTML, JSON, and XML data from the internet.
Learned how to generate JSON scripts and write UNIX shell scripts to call Sqoop import/export.
Experienced in using Spark/Scala and Python for regular expression (regex) projects in the Hadoop/Hive environment on Linux/Windows for big data resources.

EDUCATION

Master of Science in Computers and Information Systems, Jan 20 - Dec 21
University of Mary Hardin-Baylor
GPA: 3.3/4.0

Bachelor's in Civil Engineering, Aug 15 - May 19
Vidya Jyothi Institute of Technology, Hyderabad, India
GPA: 3.2/4.0
