Candidate's Name
PHONE NUMBER AVAILABLE
EMAIL AVAILABLE
LINKEDIN LINK AVAILABLE
Senior Data Engineer

PROFESSIONAL SUMMARY:
• Experienced Senior Data Engineer with over 9 years in the field, specializing in data engineering and analytics.
• Proven proficiency in designing and implementing scalable data solutions across diverse domains, using Azure services such as Azure Data Factory, Azure SQL Database, Azure Databricks, and Snowflake, along with a strong background in AWS.
• Expertise spans data warehousing, ETL processes, and data pipeline development, using tools such as Azure Data Factory, AWS Glue, Apache Airflow, and Talend.
• Well-versed in multiple database technologies, including Azure SQL Database, PostgreSQL, Oracle Database, AWS Redshift, and Snowflake.
• Deep understanding of big data technologies, including Azure Databricks, Apache Hadoop, and Apache Kafka.
• Proficient in data streaming and processing using Azure Stream Analytics, Apache Flink, Apache NiFi, and Snowflake.
• Experienced with cloud platforms and services such as Azure and AWS.
• Skilled in developing and deploying data solutions using Python, SQL, R, and other programming languages, with a primary focus on Azure ML tools.
• Expert in data visualization and reporting tools such as Power BI, Tableau, and QlikView, specializing in Power BI on Azure.
• Demonstrated expertise in implementing data security and compliance measures, particularly in Azure environments.
• Thorough understanding of agile methodologies and DevOps practices, including continuous integration and deployment using Azure DevOps.
• Capable leader and mentor in fast-paced environments, ensuring the timely delivery of high-quality data solutions.
• Exceptional problem-solving skills and meticulous attention to detail.
• Proven track record of delivering high-quality data solutions on time and within budget.
• Strong communication and collaboration skills, with the ability to work effectively with cross-functional teams in both Azure and AWS environments.
• Continuous learner with up-to-date knowledge of the latest trends and technologies in data engineering.
• Experience in the financial services and healthcare domains, providing domain-specific data solutions.
• Demonstrated ability to manage large datasets and handle complex data integration tasks.
• Proficient in automated data pipeline development using tools such as Azure Data Factory, Apache Airflow, Talend, and Snowflake.
• Adept in machine learning operations and analytics using MLflow and other ML tools, with a focus on Azure Machine Learning.

TECHNICAL SKILLS:
• Programming Languages: Python, SQL, R, PL/SQL, Scala
• Big Data Technologies: Apache Hadoop, Apache Kafka, Apache Flink, Apache NiFi, Databricks
• Database Technologies: PostgreSQL, Oracle Database, AWS Redshift, DynamoDB, Azure Cosmos DB, Cassandra
• Data Warehousing: AWS Redshift, Azure Synapse Analytics, Apache Hive
• ETL Tools: Apache Airflow, AWS Data Pipeline, Informatica PowerCenter, AWS Glue, Talend, Azure Data Factory, Salesforce, Snowflake, Spark
• Cloud Platforms: AWS, Azure, Docker, Kubernetes
• Data Lake and Storage: Azure Data Lake Store Gen2, Amazon S3
• Data Visualization: Tableau, QlikView, Power BI, Platfora
• DevOps Tools: GitLab, Jenkins, Terraform, Azure DevOps, Ansible
• Machine Learning Operations: MLflow, Azure Machine Learning
• Other Technologies: Apache Drill, Apache Sentry, Sqoop, AWS Lambda, AWS X-Ray, AWS Glue, Amazon EMR, Amazon SNS, SQS, Elasticsearch, EC2, IAM, RDS, CloudWatch, SAM
• Agile Methodologies and Project Management


PROFESSIONAL EXPERIENCE:

Client: Computomic, New Jersey, NJ.                                                      Apr 2022 to Present
Role: Senior Data Engineer
Project Description: Led a proficient data engineering team in the conception and execution of scalable solutions,
harnessing the capabilities of Azure, Snowflake, Salesforce, Talend, and Power BI. Demonstrated expertise in
fine-tuning ETL processes and refining data pipelines within the Azure ecosystem, with a specialized emphasis on
enhancing data warehousing functionalities.

Roles & Responsibilities:
• Developed and maintained an efficient data pipeline architecture within Microsoft Azure, employing tools such as Data Factory and Azure Databricks.
• Developed architectural solutions that incorporated Talend for robust ETL processes and Power BI for advanced reporting, tailored to meet specific requirements in Chevron's use case.
• Crafted user-friendly technical solutions, ensuring clarity and acceptance among stakeholders.
• Conducted client education sessions on the advantages and drawbacks of various Azure PaaS and SaaS solutions, with a focus on prioritizing cost-effective approaches.
• Implemented self-service reporting in Azure Data Lake Store Gen2 through an ELT approach, optimizing data processing efficiency.
• Applied Spark vectorized pandas user-defined functions via Talend for intricate data manipulation and wrangling (a sketch of this pattern appears at the end of this section).
• Executed a staged data transfer approach, systematically moving data from the System of Record through raw, refined, and produced zones to facilitate efficient translation and denormalization.
• Established Azure infrastructure components, including storage accounts, integration runtimes, service principal IDs, and app registrations, to ensure scalable and optimized support for analytical requirements.
• Wrote PySpark and Spark SQL transformations in Azure Databricks for intricate business rule implementations, seamlessly integrating Talend for enhanced capabilities (see the transformation sketch at the end of this section).
• Developed Data Factory pipelines proficiently for bulk copying multiple tables from relational databases to Azure Data Lake Gen2.
• Engineered a custom logging framework for ELT pipeline logging in Data Factory using Append variables.
• Enabled monitoring and employed Azure Log Analytics to proactively alert support teams about the usage and statistics of daily runs.
• Spearheaded proof-of-concept projects from ideation to production pipelines, delivering tangible business value leveraging Azure Data Factory, Talend, and Power BI.
• Ensured secure data separation across national boundaries through multiple data centers and regions.
• Applied continuous integration/continuous deployment best practices using Azure DevOps, incorporating code versioning and deploying with Ansible playbooks.
• Delivered denormalized data for Power BI consumers from the produced layer in the Data Lake, enriching modeling and visualization experiences.
• Collaborated seamlessly in a SAFe (Scaled Agile Framework) team, actively participating in daily stand-ups, sprint planning, and quarterly planning sessions.
Environment: Azure Data Factory, Azure Databricks, Azure Data Lake Store Gen2, Azure Log Analytics, Azure DevOps,
Talend, Power BI, PySpark and Spark SQL, Spark vectorized pandas user-defined functions, Data Factory pipelines,
Ansible playbooks, Scaled Agile Framework (SAFe).
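
The vectorized pandas UDF work referenced above typically follows the pattern sketched below. This is a minimal, hypothetical illustration: the column names, the normalization rule, and the sample rows are placeholders, not details from the engagement.

    # Minimal sketch of a vectorized (pandas) UDF in PySpark.
    # Column names and the normalization rule are hypothetical placeholders.
    import pandas as pd
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.functions import pandas_udf
    from pyspark.sql.types import DoubleType

    spark = SparkSession.builder.appName("vectorized-udf-sketch").getOrCreate()

    @pandas_udf(DoubleType())
    def normalize_amount(amount: pd.Series) -> pd.Series:
        # Works on whole pandas Series batches instead of row-by-row Python calls;
        # the per-batch z-score here is for illustration only.
        return (amount - amount.mean()) / amount.std()

    df = spark.createDataFrame([(1, 120.0), (2, 80.0), (3, 100.0)], ["id", "amount"])
    df.withColumn("amount_norm", normalize_amount(F.col("amount"))).show()

Because vectorized UDFs exchange data with the JVM in Arrow-backed column batches, they generally outperform row-at-a-time Python UDFs for this kind of wrangling.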
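
The PySpark/Spark SQL business-rule transformations mentioned above can be pictured as the following sketch. Storage paths, table names, and the rule itself are hypothetical; the point is the shape of a Databricks-style read-transform-write step against Azure Data Lake Storage Gen2.

    # Sketch of a Databricks-style PySpark/Spark SQL transformation.
    # The abfss:// paths, table names, and business rule are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("business-rules-sketch").getOrCreate()

    raw = spark.read.parquet("abfss://raw@examplelake.dfs.core.windows.net/orders/")
    raw.createOrReplaceTempView("orders_raw")

    refined = spark.sql("""
        SELECT order_id,
               customer_id,
               CAST(order_ts AS DATE)            AS order_date,
               ROUND(amount * (1 - discount), 2) AS net_amount
        FROM orders_raw
        WHERE status = 'COMPLETED'
    """)

    (refined.write
            .mode("overwrite")
            .partitionBy("order_date")
            .parquet("abfss://refined@examplelake.dfs.core.windows.net/orders/"))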

Client: Medtronic, Minneapolis, MN                                                     Sep 2020 to Apr 2022
Role: Big Data Engineer
Project Description: Led a pioneering healthcare data engineering initiative at Medtronic, leveraging Azure cloud
services to orchestrate efficient pipelines with a focus on Hadoop, Databricks, and Apache Flink. Managed Azure
Databricks clusters to ensure robust ETL processes, maintained data integrity in PostgreSQL, and automated
workflows with Apache Airflow. Azure was integral to incorporating Snowflake for advanced data warehousing
and analytics capabilities.

Roles & Responsibilities:
• Developed and deployed a versatile ETL framework for efficient data extraction from diverse sources using Spark, with a specific focus on Azure Databricks clusters.
• Utilized Platfora for data visualization on Hadoop, creating Lens and Viz boards for real-time insights, while leveraging Azure services for seamless integration.
• Executed data queries and analyses in Cassandra, applying various data modeling techniques tailored for Cassandra databases, with considerations for Azure compatibility.
• Used Spark and Scala for joining multiple tables in Cassandra, enabling seamless analytics on consolidated datasets within an Azure environment.
• Engaged in enterprise-wide upgrades, troubleshooting, and performance tuning for Hadoop clusters, including those hosted on Azure.
• Configured Apache Drill on Hadoop for seamless data integration across SQL and NoSQL databases, taking advantage of Azure capabilities for enhanced connectivity.
• Orchestrated data ingestion into Hadoop and Cassandra using Kafka from diverse sources, ensuring compatibility with Azure data storage solutions.
• Utilized the Tidal enterprise scheduler and Oozie Operational Services for effective cluster coordination and workflow scheduling, with considerations for Azure cloud infrastructure.
• Implemented Spark streaming for real-time data transformation, considering Azure services for optimal scalability and performance (a streaming sketch appears at the end of this section).
• Designed and created Tableau dashboards to address diverse business needs, incorporating Azure connectors for seamless data access.
• Installed and configured Hive, wrote Hive UDFs, and utilized the Piggybank repository for Pig Latin, ensuring compatibility with Azure-based data ecosystems.
• Implemented partitioning, dynamic partitions, and buckets in Hive to enhance data access efficiency, considering Azure storage optimization (see the partitioning sketch at the end of this section).
• Employed Sqoop to export analyzed data to relational databases for BI team visualization in Tableau, considering Azure database services for integration.
• Implemented a Composite server for data virtualization needs, creating multiple views with restricted data access through a REST API, considering Azure API services.
• Led the conception and implementation of the next-generation architecture, optimizing data ingestion and processing efficiency, with a focus on Azure cloud services.
• Developed and implemented various shell scripts for job automation, considering Azure automation tools and scripts.
• Implemented Apache Sentry to restrict Hive table access at the group level, ensuring compatibility with Azure security protocols.
• Utilized the AVRO format for comprehensive data ingestion, enhancing operational speed and minimizing space utilization, with considerations for Azure storage efficiency.
• Proficiently managed and reviewed Hadoop log files, incorporating Azure monitoring and logging solutions.
• Operated in an Agile environment, utilizing the Rally tool for maintaining user stories and tasks, with integration capabilities for Azure DevOps.
• Collaborated with enterprise data support teams on Hadoop updates, patches, and version upgrades, ensuring seamless integration with Azure services.
• Implemented test scripts for test-driven development and continuous integration, with Azure-compatible testing frameworks.
• Leveraged Spark for parallel data processing, achieving enhanced performance outcomes, with considerations for Azure parallel processing capabilities.
Environment: SQL, NoSQL, PostgreSQL, Apache Spark, Azure Databricks, Platfora, Tableau, Apache Cassandra, Scala,
Hadoop clusters, Apache Drill, Azure Blob Storage, Azure DevOps, Apache Sentry, REST API, Azure API services, Sqoop,
Apache Kafka, Hadoop.
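
The Spark streaming work described above broadly follows the Structured Streaming pattern below. The broker address, topic, schema, and output paths are hypothetical, and the spark-sql-kafka connector is assumed to be on the Spark classpath.

    # Sketch of a Spark Structured Streaming job reading from Kafka.
    # Broker, topic, schema, and paths are hypothetical placeholders.
    # Requires the spark-sql-kafka-0-10 connector on the classpath.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

    spark = SparkSession.builder.appName("kafka-streaming-sketch").getOrCreate()

    event_schema = StructType([
        StructField("device_id", StringType()),
        StructField("reading", DoubleType()),
        StructField("event_ts", TimestampType()),
    ])

    raw = (spark.readStream
                .format("kafka")
                .option("kafka.bootstrap.servers", "broker:9092")
                .option("subscribe", "device-events")
                .load())

    # Parse the JSON payload, then compute windowed averages per device.
    events = (raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
                 .select("e.*")
                 .withWatermark("event_ts", "10 minutes")
                 .groupBy(F.window("event_ts", "5 minutes"), "device_id")
                 .agg(F.avg("reading").alias("avg_reading")))

    query = (events.writeStream
                   .outputMode("append")
                   .format("parquet")
                   .option("path", "/mnt/refined/device_metrics")
                   .option("checkpointLocation", "/mnt/checkpoints/device_metrics")
                   .start())
    query.awaitTermination()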
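
The Hive partitioning and bucketing mentioned above follows the HiveQL pattern sketched here; it is issued through Spark SQL only to keep these examples in one language. The table, columns, bucket count, and staging table are hypothetical, and exact bucketing semantics depend on the engine executing the DDL.

    # Sketch of Hive partitioning/dynamic-partitioning/bucketing, via Spark SQL
    # with Hive support. Table and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partitioning-sketch")
             .enableHiveSupport()
             .getOrCreate())

    spark.sql("""
        CREATE TABLE IF NOT EXISTS patient_readings (
            patient_id STRING,
            reading    DOUBLE,
            event_ts   TIMESTAMP
        )
        PARTITIONED BY (event_date DATE)
        CLUSTERED BY (patient_id) INTO 32 BUCKETS
        STORED AS ORC
    """)

    # Dynamic partitioning lets the insert derive event_date from the data itself.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")
    spark.sql("""
        INSERT INTO TABLE patient_readings PARTITION (event_date)
        SELECT patient_id, reading, event_ts, CAST(event_ts AS DATE) AS event_date
        FROM staging_readings
    """)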

Client: Edward Jones, St. Louis, MO                                                      Oct 2017 to Aug 2020
Role: AWS Data Engineer
Project Description: As an AWS-centric Data Engineer, I played a key role in leading a pioneering AWS initiative. I
specialized in building and operating extensive pipelines in the AWS landscape, leveraging the capabilities of
Apache Airflow and AWS Data Pipeline. My duties encompassed coordinating ETL processes and optimizing data
flows in both AWS Redshift and Apache Hadoop. Of particular note is the seamless integration of Salesforce, adding
value to a comprehensive and unified approach to data integration.

Roles & Responsibilities:
• Implemented a serverless architecture using AWS components, including API Gateway, Lambda, and DynamoDB, facilitating seamless deployment of AWS Lambda code from Amazon S3 buckets.
• Designed and configured Lambda functions to receive events from S3 buckets, while creating robust data models for data-intensive AWS Lambda applications; these applications performed complex analyses and generated analytical reports, ensuring end-to-end traceability and defining key business elements from Aurora (a handler sketch appears at the end of this section).
• Wrote optimized code to enhance the performance of AWS services, addressing the needs of application teams, and ensured code-level application security for clients by implementing IAM roles, credentials, and encryption strategies.
• Developed AWS Lambda functions using Python for efficient deployment management within the AWS ecosystem, and designed and implemented public-facing websites on Amazon Web Services, seamlessly integrating them with other applications' infrastructure.
• Created diverse AWS Lambda functions and API Gateways, enabling data submission through API Gateway accessible via Lambda functions.
• Led the creation of CloudFormation templates for various AWS services, including SNS, SQS, Elasticsearch, DynamoDB, Lambda, EC2, VPC, RDS, S3, IAM, and CloudWatch, ensuring seamless integration with Service Catalog.
• Conducted regular monitoring activities on Unix/Linux servers, ensuring application availability and performance; monitored logs, CPU usage, memory, load, and disk space using CloudWatch and AWS X-Ray.
• Implemented the AWS X-Ray service within Confidential for visualizing node and edge latency distribution directly from the service map.
• Designed and developed ETL processes in AWS Glue, facilitating the migration of data from external sources like S3 (ORC/Parquet/Text files) into AWS Redshift (see the Glue job sketch at the end of this section).
• Utilized Python libraries, including Boto3 and NumPy, for AWS operations, and employed Amazon EMR for MapReduce jobs, testing locally using Jenkins.
• Created external tables with partitions using Hive, AWS Athena, and Redshift, and developed PySpark code for AWS Glue jobs and for EMR.
• Demonstrated proficiency in other AWS services such as S3, EC2, IAM, and RDS; experienced in orchestration and data pipelines using AWS Step Functions, AWS Data Pipeline, and Glue.
• Wrote SAM templates to deploy serverless applications on the AWS cloud.
 Environment: API Gateway, Lambda, DynamoDB, Amazon S3, Aurora, AWS X-Ray, SNS, SQS, Elasticsearch,
 CloudWatch, AWS Glue, AWS Redshift, Boto3, NumPy, Amazon EMR, Hive, AWS Athena, PySpark, AWS SAM
 (Serverless Application Model)
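
The S3-event-driven Lambda functions mentioned above generally take the shape below. The bucket layout, the DynamoDB audit table, and the stored attributes are hypothetical placeholders, not the project's actual resources.

    # Sketch of an S3-event-triggered AWS Lambda handler in Python.
    # Bucket names, the DynamoDB table, and the record layout are hypothetical.
    import json
    import urllib.parse

    import boto3

    s3 = boto3.client("s3")
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("ingestion-audit")  # hypothetical audit table

    def lambda_handler(event, context):
        # Each record describes one object-created event delivered by S3.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

            head = s3.head_object(Bucket=bucket, Key=key)
            table.put_item(Item={
                "object_key": key,
                "bucket": bucket,
                "size_bytes": head["ContentLength"],
                "etag": head["ETag"],
            })

        return {"statusCode": 200,
                "body": json.dumps({"processed": len(event["Records"])})}

The handler is deployed with an S3 ObjectCreated notification as its trigger, so each uploaded object produces one invocation carrying the bucket and key.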
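
The Glue-to-Redshift ETL described above can be sketched as the job script below. The Data Catalog database and table, the Glue connection name, and the S3 staging path are hypothetical, and the script only runs inside the AWS Glue job environment where the awsglue libraries are available.

    # Sketch of an AWS Glue PySpark job moving catalogued S3 data into Redshift.
    # Catalog database/table, connection name, and S3 paths are hypothetical.
    import sys

    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Source: Parquet files previously crawled into the Glue Data Catalog.
    trades = glue_context.create_dynamic_frame.from_catalog(
        database="finance_raw", table_name="trades_parquet"
    )

    # Target: a Redshift table reached through a Glue connection, staged via S3.
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=trades,
        catalog_connection="redshift-conn",
        connection_options={"dbtable": "analytics.trades", "database": "dw"},
        redshift_tmp_dir="s3://example-glue-temp/redshift/",
    )

    job.commit()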

Client: Synechron Technologies Pvt. Ltd, Hyderabad, India                           Jan 2016 to May 2017
Role: SQL Developer
Project Description: Led comprehensive SQL development initiatives, mastering Oracle Database and PL/SQL for
optimal database management and integrity. Developed sophisticated data integration solutions with Informatica
PowerCenter and facilitated real-time processing using Apache Kafka.
Roles & Responsibilities:
• Mastered Oracle Database and PL/SQL for efficient management and optimization, and implemented performance tuning strategies that improved query response times.
• Developed complex solutions using Informatica PowerCenter for seamless data flow, ensuring scalability and maintainability of data integration workflows.
• Implemented Snowflake data warehousing solutions, optimizing storage and retrieval of extensive datasets.
• Led real-time processing initiatives using Apache Kafka for dynamic data streams, integrating Kafka into existing architectures to enable real-time analytics (a consumer sketch appears at the end of this section).
• Utilized Jenkins for CI/CD, establishing automated workflows for faster delivery, and implemented version control strategies for a reliable development process.
• Created advanced dashboards in Tableau for interactive data representations, leveraging Tableau features for complex data analysis and trend identification.
• Managed Docker containers for efficient deployment across environments, orchestrating containerized solutions that reduced setup and configuration times.
• Implemented agile practices, fostering adaptive and collaborative workflows with an emphasis on regular sprints, feedback loops, and continuous improvement.
• Maintained comprehensive documentation for clarity and knowledge transfer, and advocated for data security and privacy best practices.
• Contributed to strategic planning for future data initiatives, focusing on fostering a culture of innovation and continuous learning.
Environment: Oracle DB, PL/SQL, Informatica PowerCenter, Apache Kafka, Snowflake, Jenkins, Tableau, Docker, SQL.
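
A minimal sketch of the kind of real-time Kafka consumption described above, written with the kafka-python client; the topic, broker address, and message fields are hypothetical.

    # Sketch of a small real-time consumer using the kafka-python client.
    # Topic name, broker address, and the message fields are hypothetical.
    import json

    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        "trade-events",                      # hypothetical topic
        bootstrap_servers="broker:9092",
        group_id="realtime-analytics",
        auto_offset_reset="earliest",
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    # Maintain a running notional total per instrument as events arrive.
    running_totals = {}
    for message in consumer:
        event = message.value
        symbol = event["symbol"]
        running_totals[symbol] = running_totals.get(symbol, 0.0) + event["notional"]
        print(symbol, running_totals[symbol])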

Client: Eclerx Services Ltd, Hyderabad, India                                   Jun 2014 to Dec 2015
Role: Data Analyst
Project Description: Led data analysis and reporting initiatives, leveraging Microsoft Excel for advanced analysis
and PostgreSQL databases for efficient storage. Developed Python scripts for complex processing, implemented data
warehousing with Apache Hive, and created interactive Tableau dashboards for insightful analytics.
Roles & Responsibilities:
• Conducted advanced data analysis and reporting using Microsoft Excel, optimizing insights for decision-making.
• Managed and optimized PostgreSQL databases, ensuring efficient data storage for large datasets.
• Developed Python scripts (pandas, NumPy) for complex data processing tasks, enhancing automation (a representative sketch appears at the end of this section).
• Implemented data warehousing solutions using Apache Hive, accommodating the storage needs of extensive datasets.
• Created interactive dashboards and reports in Tableau, providing insightful analytics for stakeholders.
• Utilized GitLab for version control, ensuring effective tracking of changes in data projects.
• Advocated for and ensured high data quality and accuracy in all analysis projects.
• Collaborated with business teams to understand and meet data requirements, aligning solutions with organizational goals.
• Maintained comprehensive documentation for all data processes and systems, promoting knowledge sharing.
• Advocated for data-driven decision-making within the organization, fostering a culture of data-driven insights.
• Fostered a collaborative environment for data analysis and reporting, encouraging innovation and best practices.
 Environment: PostgreSQL, Excel, Python, Hive, Tableau, GitLab, QlikView.
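
A representative sketch of the pandas/NumPy processing scripts described above; the input file, column names, and the outlier rule are hypothetical.

    # Sketch of a pandas/NumPy data-cleaning and reporting script.
    # CSV layout, column names, and thresholds are hypothetical placeholders.
    import numpy as np
    import pandas as pd

    # Load raw exported data and normalize the obvious quality issues.
    df = pd.read_csv("monthly_transactions.csv", parse_dates=["txn_date"])
    df = df.drop_duplicates(subset="txn_id")
    df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
    df = df.dropna(subset=["amount"])

    # Flag outliers with a simple z-score rule, then aggregate for reporting.
    df["zscore"] = (df["amount"] - df["amount"].mean()) / df["amount"].std()
    df["is_outlier"] = np.abs(df["zscore"]) > 3

    summary = (df.groupby(df["txn_date"].dt.to_period("M"))
                 .agg(total_amount=("amount", "sum"),
                      txn_count=("txn_id", "count"),
                      outliers=("is_outlier", "sum")))
    summary.to_csv("monthly_summary.csv")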

EDUCATION:
• Bachelor of Technology (B.Tech) in Information Technology, JNTUH University, Hyderabad, Telangana, India - 2014
