Candidate's Name
Data Engineer/Tableau Developer
EMAIL AVAILABLE | PHONE NUMBER AVAILABLE | LINKEDIN LINK AVAILABLE

Professional Summary

- Big Data Developer and Data Engineer with years of technical expertise across all phases of the software development life cycle (SDLC) in sales, marketing, and enterprise business domains, specializing in big data and cloud computing.
- Experience in cloud computing (Azure and AWS) and big data analytics tools including Hadoop, HDFS, MapReduce, Hive, HBase, Spark, Spark Streaming, Amazon EC2, DynamoDB, Amazon S3, Kafka, Flume, Avro, Sqoop, and PySpark.
- Experience building data pipelines for real-time streaming data and analytics using Azure components such as Azure Data Factory, HDInsight (Spark clusters), Azure ML Studio, Azure Stream Analytics, Azure Blob Storage, Microsoft SQL Database, and Neo4j (graph database).
- Hands-on experience with Spark using Scala and PySpark.
- Experience working with SQL Server and MySQL databases; solid experience working with Parquet files and with parsing and validating JSON files (a minimal sketch follows the Technical Skills section below).
- Worked with NoSQL databases such as MongoDB and DocumentDB, and graph databases such as Neo4j.
- Designed REST APIs in Python using the Flask framework.
- Experience building pipelines to engineer machine learning models using Azure ML Studio.
- Consumed RESTful web services and invoked them using Postman.
- Experience building microservices using AWS Lambda.
- Proficient in Agile software development methodologies.
- Experience developing ETL jobs using Spark against SQL and NoSQL database systems.
- Hands-on experience with version control using GitHub.
- Experience with continuous integration and continuous deployment using Jenkins.

Technical Skills

Big Data Ecosystem: HDFS, YARN, MapReduce, Spark, Hive, Airflow, StreamSets, HBase
Hadoop Distributions: Apache Hadoop 2.x/1.x, Cloudera CDP, Hortonworks HDP, Amazon AWS (EMR, EC2, EBS, S3, Athena, Glue, Elasticsearch, SQS, DynamoDB, Redshift, ECS, Kinesis), Microsoft Azure (Databricks, Data Lake, Blob Storage, Azure Data Factory, SQL Database, SQL Data Warehouse, Cosmos DB, Azure Active Directory)
Scripting Languages: Python, Scala, HiveQL
Cloud Environment: Amazon Web Services (AWS), Microsoft Azure
NoSQL Databases: HBase, DynamoDB
Databases: MySQL, Oracle, Teradata, MS SQL Server, PostgreSQL, DB2
ETL/BI: Snowflake, Redshift, Tableau Desktop, Tableau Prep, Tableau Server, Power BI
Version Control: Git, Bitbucket
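The summary above mentions parsing and validating JSON files with Spark and working with Parquet. A minimal PySpark sketch of that pattern follows; the bucket paths, toy schema, and event_id column are illustrative placeholders, not details from any actual engagement.

```python
# Minimal PySpark sketch: validate newline-delimited JSON and land it as Parquet.
# Paths, schema, and the event_id column are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.sql.types import StringType, StructField, StructType

spark = SparkSession.builder.appName("json-to-parquet-etl").getOrCreate()

# Explicit schema with a corrupt-record column: malformed lines are captured
# there instead of failing the whole job.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("payload", StringType()),
    StructField("_corrupt_record", StringType()),
])

raw = (spark.read
       .schema(schema)
       .option("mode", "PERMISSIVE")
       .option("columnNameOfCorruptRecord", "_corrupt_record")
       .json("s3://example-bucket/input/events.json"))

raw.cache()  # Spark 2.3+ requires caching before filtering on the corrupt-record column

# Keep only rows that parsed cleanly and carry the required key, then persist as Parquet.
valid = (raw.filter(col("_corrupt_record").isNull())
            .filter(col("event_id").isNotNull())
            .drop("_corrupt_record"))

valid.write.mode("overwrite").parquet("s3://example-bucket/output/events/")
```

PERMISSIVE mode with an explicit schema is what keeps the validation step cheap: bad lines become filterable rows rather than job failures.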
Work Experience

Fannie Mae, Herndon, VA
Jan 2023 - May 2024
Data Engineer/Admin

Responsibilities:

- Designing, developing, and maintaining data pipelines to ingest data from various sources into the AWS ecosystem (Amazon S3, Redshift, Glue), extracting, transforming, and loading data with Apache Spark, Python libraries, and AWS Glue.
- Implementing data processing and transformation logic to clean, enrich, and aggregate data per business requirements, leveraging Spark jobs, Python scripts, and Databricks notebooks to perform complex transformations efficiently.
- Designing and implementing data models and data warehouses on AWS services such as Amazon Redshift, Amazon Athena, and Amazon RDS, including schema design, optimization, and performance tuning to support analytics.
- Optimizing data processing and analysis tasks for performance and scalability, leveraging AWS services such as Amazon EMR (Elastic MapReduce) and Databricks clusters.
- Writing efficient Python code and optimizing Spark jobs for better performance on distributed computing platforms.
- Monitoring data pipelines, job execution, and system performance using AWS CloudWatch, AWS CloudTrail, and Databricks monitoring tools.
- Troubleshooting issues and optimizing workflows to maintain data quality, reliability, and availability.
- Building real-time data processing pipelines using AWS Kinesis, Apache Kafka, and Databricks Structured Streaming for streaming data ingestion and processing (see the streaming sketch after this section).
- Developing Python applications to consume, process, and analyze streaming data in near real time.
- Leveraging distributed computing frameworks such as Apache Spark on AWS EMR and Databricks clusters for big data analytics and processing.
- Developing and optimizing Spark jobs in Python to analyze large datasets efficiently and derive insights for business decision-making.
- Automating data engineering workflows using AWS Lambda functions, Step Functions, and Databricks Jobs for scheduling and orchestrating ETL jobs.
- Writing Python scripts and using workflow management tools to automate repetitive tasks and streamline data pipeline operations.
- Using Git for version control of code and configurations related to data engineering projects.
- Collaborating with cross-functional teams, including data scientists, analysts, and software engineers, to develop and deploy data solutions effectively.
- Establishing CI/CD pipelines to automate the deployment and testing of data engineering artifacts using AWS CodePipeline and similar tools.
- Integrating unit tests, integration tests, and deployment scripts into CI/CD pipelines to ensure the reliability and consistency of data engineering processes.
- Planning and provisioning infrastructure resources on AWS to meet the growing demands of data processing workloads.
- Implementing auto-scaling policies and monitoring resource utilization to ensure optimal performance and cost efficiency.

Environment: Apache Spark, AWS Glue, Amazon S3, Amazon Redshift, Apache Kafka, PySpark, AWS Lambda, AWS Kinesis, Databricks, AWS EMR, MySQL, RDS, Athena, ETL, Python, Git, Azure Data Factory, AWS CloudWatch.
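The streaming bullets above reference AWS Kinesis, Apache Kafka, and Structured Streaming. A minimal PySpark Structured Streaming sketch of Kafka ingestion follows; it assumes the spark-sql-kafka connector package is on the classpath, and the broker address, topic, schema, and S3 paths are hypothetical.

```python
# Minimal Spark Structured Streaming sketch: consume a Kafka topic and land
# micro-batches as Parquet. Broker, topic, schema, and paths are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

spark = SparkSession.builder.appName("kafka-stream-ingest").getOrCreate()

schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_ts", TimestampType()),
    StructField("payload", StringType()),
])

# Read the raw Kafka stream; the value column arrives as bytes and is cast to string.
stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "events")
          .load()
          .select(from_json(col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

# Write each micro-batch to Parquet; the checkpoint enables recovery after restarts.
query = (stream.writeStream.format("parquet")
         .option("path", "s3://example-bucket/stream-output/")
         .option("checkpointLocation", "s3://example-bucket/checkpoints/")
         .start())
query.awaitTermination()
```

The checkpoint location is what lets a restarted query resume from its last committed Kafka offsets instead of reprocessing the whole topic.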
AIM Health Care, Franklin, TN
May 2022 - Dec 2022
Tableau Developer

Responsibilities:

- Designed and developed Tableau reports and dashboards to provide actionable insights for business stakeholders.
- Provided operational support for existing Tableau reports and dashboards, ensuring smooth functioning and optimized performance.
- Leveraged subject matter expertise to design Tableau workbooks catering to diverse reporting requirements across business groups.
- Participated in the analysis, design, development, testing, deployment, and support of dashboards and reports.
- Conducted impact analysis and validation of production reports and universes after application and database upgrades.
- Interfaced with business analysts and users to clarify requirements and ensure alignment with reporting objectives.
- Independently troubleshot dashboard and report problems, identifying root causes and implementing effective solutions.
- Demonstrated proficiency in understanding reporting database schemas and models, performing data analysis, and optimizing data queries.
- Developed reports and dashboards using BusinessObjects and Tableau, adhering to industry BI standards and best practices.
- Utilized advanced SQL to extract and manipulate data for analysis and reporting.
- Implemented data management and automation technologies, including Python, to streamline processes and enhance efficiency.
- Collaborated with cross-functional teams to proactively identify problems, issues, and risks, providing innovative solutions to drive continuous improvement.
- Provided regular status updates and met aggressive delivery timelines, ensuring timely project completion.

Environment: Tableau (Desktop, Server), Oracle, SQL Server, Teradata, SAP, Excel, CSV files, Enterprise Management Reporting Portal, AWS, NoSQL, Power BI, Python, Cognos, Informatica, QlikView.

Cyient Ltd., India
Jan 2020 - Dec 2021
SQL Server Developer

Responsibilities:

- Performed ETL by collecting, exporting, merging, and massaging data from multiple sources and platforms, including SSIS (SQL Server Integration Services) and SSRS (SQL Server Reporting Services) in SQL Server.
- Worked with cross-functional teams (including the data engineering team) to extract and rapidly query data from MongoDB through the MongoDB connector.
- Worked with tools such as SQL Workbench/J, pgAdmin, DBHawk, and SQuirreL SQL.
- Performed data cleaning and feature selection using the scikit-learn package in Python.
- Partitioned the data into 100 clusters using k-means clustering with scikit-learn (see the sketch at the end of this resume).
- Selected the most accurate prediction model based on its accuracy rate.
- Applied text mining to customer reviews to identify customer concentrations.
- Delivered result analyses to the support team for hotel and travel recommendations.
- Designed Tableau bar graphs, scatter plots, and geographical maps to create detail-level summary reports and dashboards.
- Developed a hybrid model to improve the accuracy rate.
- Involved in the ETL phase of the project; designed and analyzed the SQL Server database and gathered user requirements.
- Created packages based on client requirements using SQL Server Integration Services and SQL Server 2008.
- Created complex stored procedures and processed large volumes of data from the raw to the transformed layer.
- Created reports and dashboards using MicroStrategy and published them to leadership.

Environment: MSBI, SSIS, SSRS, SQL Server, Oracle, MicroStrategy, SharePoint, ETL, MongoDB, Python, Tableau.
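The Cyient section above describes partitioning data into 100 clusters with k-means in scikit-learn. A minimal sketch of that step follows; the synthetic feature matrix stands in for the project's real data.

```python
# Minimal scikit-learn sketch of the k-means step described above: scale
# features, then partition records into 100 clusters. Synthetic data only.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X = rng.normal(size=(10_000, 8))              # placeholder feature matrix

X_scaled = StandardScaler().fit_transform(X)  # k-means assumes comparable feature scales

kmeans = KMeans(n_clusters=100, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_scaled)

# Each row now carries a cluster id that can feed a downstream (hybrid) model.
print(labels[:10], kmeans.inertia_)
```

Standardizing the features first matters because k-means assigns points by Euclidean distance, so unscaled columns with large ranges would otherwise dominate the clustering.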