
Sr Data Engineer Resume Baltimore, MD

Candidate Information
Title Sr. Data Engineer
Target Location US-MD-Baltimore
                             Candidate's Name VB
                                       PHONE NUMBER AVAILABLE | EMAIL AVAILABLE




Summary
Accomplished Data and Cloud Engineer with over 8 years of experience in designing, implementing, and optimizing
data solutions across cloud and on-premises environments. Expertise in leading large-scale migrations, building real-time
data processing pipelines, and ensuring data integrity and security in diverse and complex systems. Proficient in a wide
range of technologies, including Azure and AWS cloud services, big data frameworks, and modern data warehousing
solutions like Snowflake. Adept at automating data workflows, enhancing ETL processes, and integrating machine
learning models into production environments. Strong background in data governance, compliance, and performance
tuning, with a proven track record of delivering actionable insights through advanced reporting and analytics tools.
Committed to driving innovation and efficiency through collaboration, problem-solving, and continuous learning.




Skills
    Build Tools: Azure Data Factory, Apache Maven, Gradle, DBT, Informatica, SSIS
    Continuous Integration Tools: Jenkins, Azure DevOps, GitLab CI/CD
    Cloud: Azure Data Lake Storage, Azure Data Factory (ADF), Azure Synapse, Azure SQL DB, Azure Cosmos DB,
    Azure Analysis Services, Azure Databricks, Azure IoT Hub, Azure Event Hubs, AWS S3, EC2, Lambda, RDS,
    Redshift, Glue
    Databases: Snowflake, Microsoft SQL Server, MySQL, PostgreSQL, Oracle, MongoDB, Hive, HDFS
    Reporting Tools: Power BI, Looker, Tableau
    Languages & Scripting: Python, Spark, PySpark, Scala, Shell Scripting (Unix/Linux), JavaScript, T-SQL, PL/SQL,
    SQL, HTML
    IDE Tools: SSIS, SSRS, SSAS, Visual Studio Code, PyCharm, Jupyter Notebook, IntelliJ IDEA, Eclipse
    Management Utilities: JIRA, Agile, Scrum, Git/GitHub/GitLab, Control-M, Terraform, Docker, Kubernetes,
    Nagios, Bugzilla




Experience
SR. DATA/CLOUD ENGINEER | 09/2023 - Current
AccessOne - SC
    Led the migration of critical on-premises Hadoop data pipelines to Azure, leveraging Azure Data Factory and
    Databricks for efficient ETL processing and real-time analytics.
    Implemented solutions using PySpark, ensuring seamless data integration across multiple sources.
    Designed and implemented data ingestion workflows using Azure IoT Hub to capture real-time data streams from IoT
    devices.
    Processed data using Azure Event Hubs and Kafka before storing it in Azure Data Lake and Snowflake for analytics.
    Played a key role in migrating financial and operational datasets from on-premises SQL databases to Snowflake on
    Azure.
    Designed and optimized ETL processes using DBT and PySpark to transform and load data into Snowflake, enhancing
    query performance and scalability (an illustrative PySpark sketch follows this list).
    Collaborated with the team to build real-time data processing and analytics solutions using Fabric Real-Time
    Analytics integrated with Kafka.
    Developed streaming pipelines for processing and analyzing data from IoT devices and external sources, providing
    actionable insights.
    Enhanced legacy ETL processes by migrating them to cloud-based solutions, including IBM DataStage and Azure
    Data Factory.
    Used Python, PySpark, and SQL to build robust data transformation pipelines that improved data accuracy and
    reduced processing time.
    Automated the scheduling and monitoring of data jobs using Control-M and Unix scripts.
    Implemented performance tuning and error-handling mechanisms to ensure the timely and accurate processing of
    critical data across diverse environments.
    Assisted in setting up data governance frameworks and ensuring compliance with industry standards during the
    migration.
    Worked with Azure Synapse and Snowflake to establish role-based access controls and encryption protocols to
    secure sensitive data.
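
The PySpark-to-Snowflake loading described above is only named, not shown; the following is a minimal,
illustrative sketch of that kind of job. Every path, credential, column, and table name is a placeholder, and it
assumes the spark-snowflake connector is available on the cluster:

# Read raw events from Azure Data Lake Storage, cleanse them, and load Snowflake.
# All connection values, paths, and names below are illustrative placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("events-to-snowflake").getOrCreate()

raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/events/")

# Basic transformation: deduplicate, normalize timestamps, drop malformed rows.
events = (
    raw.dropDuplicates(["event_id"])
       .withColumn("event_ts", F.to_timestamp("event_ts"))
       .filter(F.col("event_ts").isNotNull())
)

# Load into Snowflake via the spark-snowflake connector (placeholder options).
sf_options = {
    "sfURL": "example_account.snowflakecomputing.com",
    "sfUser": "etl_user",
    "sfPassword": "***",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "PUBLIC",
    "sfWarehouse": "ETL_WH",
}
(events.write.format("net.snowflake.spark.snowflake")
       .options(**sf_options)
       .option("dbtable", "EVENTS")
       .mode("append")
       .save())

In practice a job like this would be parameterized and triggered from Azure Data Factory or Control-M rather
than run ad hoc.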


AZURE CLOUD ENGINEER | 09/2019 - 08/2023
TD Bank - NJ
    Designed and implemented data storage systems using Azure services, such as Azure SQL Database, Azure Data
    Lake Storage, and Azure Synapse, ensuring scalability, performance, and cost-effectiveness.
    Developed and implemented data integration processes using Azure Data Factory, extracting data from various
    sources, transforming it, and loading it into data warehouses or data lakes.
    Utilized big data technologies, such as Apache Spark, to create data processing workflows and pipelines, supporting
    data analytics and machine learning applications.
    Collaborated with data scientists to operationalize machine learning models, implementing MLOps practices to
    automate the deployment, monitoring, and management of models in production (a tooling sketch follows this list).
    Ensured seamless integration of AI models with existing data pipelines and infrastructure, enhancing system
    performance and efficiency.
    Monitored and optimized data pipelines and database performance to ensure data processing efficiency,
    troubleshooting, and resolving data-related issues to minimize downtime and maintain data integrity.
    Worked hands-on with Azure Data Factory, Azure Databricks, Azure SQL Database, Azure Blob Storage, Python,
    PySpark, and Kafka, applying AI and machine learning concepts through day-to-day MLOps work.
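
The MLOps bullet above names no specific tooling; MLflow is one common choice, so the sketch below uses it
purely as an illustration. The toy model, metric, and registry name are hypothetical:

# Train a toy model, log it to MLflow, and register it for deployment.
# MLflow is an assumption here; the resume does not name a specific MLOps stack.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=10, random_state=42)
model = LogisticRegression(max_iter=200).fit(X, y)

with mlflow.start_run() as run:
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    mlflow.sklearn.log_model(model, artifact_path="model")
    # Registering the run's model makes it visible to an automated
    # deployment pipeline (the model name is hypothetical).
    mlflow.register_model(f"runs:/{run.info.run_id}/model", "example-risk-model")

From there, a CI/CD job (e.g., in Azure DevOps, as listed under Skills) can promote registered versions through
staging to production.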


DATA ENGINEER | 02/2016 - 07/2018
NeftX
    Developed, tested, and maintained Python scripts for data extraction, transformation, and loading (ETL) processes to
    ensure efficient data flow and integration.
    Implemented AWS services, including S3, EC2, RDS, and Lambda, to manage and deploy scalable data solutions,
    enhancing system performance and reliability (an illustrative boto3 sketch follows this list).
    Utilized Oracle databases and SQL/PL-SQL for data modeling, performance tuning, and ensuring data integrity
    across various applications.
    Designed and maintained Hadoop clusters, leveraging HDFS for distributed data storage and Spark for real-time
    data processing and analytics.
    Employed Sqoop for efficient data transfer between Hadoop and relational databases, ensuring seamless data
    integration and accessibility.
    Developed and automated data pipelines using Informatica, ensuring timely and accurate data movement across
    the data architecture.
    Created and optimized data visualizations and dashboards using Looker, providing stakeholders with actionable
    insights and improved decision-making.
    Managed data storage and processing using Hive, ensuring efficient querying and data retrieval from large
    datasets.
    Utilized Bugzilla for tracking and resolving data-related issues, ensuring high data quality and system reliability.
    Conducted version control and collaborative development using Git, ensuring code integrity and facilitating
    cross-functional collaboration.
    Implemented system performance monitoring and alerting with Nagios, proactively addressing potential issues and
    ensuring system uptime.
    Coordinated project management and issue tracking using JIRA, streamlining workflows and ensuring timely delivery.
    Developed data governance policies and procedures, ensuring regulatory compliance and data security across the
    organization.
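
As a concrete illustration of the S3-plus-Lambda pattern mentioned above, here is a minimal boto3 sketch. The
bucket, key, and function name are hypothetical, and AWS credentials are assumed to be configured in the
environment:

import json
import boto3

s3 = boto3.client("s3")
lam = boto3.client("lambda")

# Land an extracted file in S3 (bucket and key are placeholders).
s3.upload_file("daily_extract.csv", "example-etl-bucket", "raw/daily_extract.csv")

# Asynchronously invoke a downstream transform Lambda with the object location.
response = lam.invoke(
    FunctionName="example-transform-fn",
    InvocationType="Event",  # fire-and-forget; the Lambda runs the transform
    Payload=json.dumps(
        {"bucket": "example-etl-bucket", "key": "raw/daily_extract.csv"}
    ).encode("utf-8"),
)
print(response["StatusCode"])  # 202 indicates the async invocation was accepted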


KAFKA ENGINEER | 09/2014 - 01/2016
Accenture
Developed and implemented real-time data streaming solutions using Apache Kafka for a financial client, ensuring high
throughput and low-latency data processing. Worked closely with cross-functional teams to build scalable, fault-tolerant
Kafka clusters that powered critical transaction and fraud detection systems.
    Designed and deployed a Kafka-based real-time fraud detection system, handling over 500,000 transactions daily
    with sub-millisecond latency.
    Implemented Kafka Connect to integrate with various data sources, including SQL Server and MongoDB, enabling
    real-time data ingestion and processing.
    Utilized KSQL to perform stream processing, filtering and transforming data in real time to generate alerts for
    suspicious transactions (a Python analogue of this filter follows the Environments list).
    Managed and optimized Kafka clusters, ensuring high availability through replication, partitioning, and consumer
    group management.
    Automated infrastructure provisioning and scaling using Terraform and Ansible, reducing manual intervention by
    30%.
    Deployed Kafka clusters on AWS using EC2, S3, and RDS for data storage, ensuring a highly reliable and scalable
    environment.
    Configured and optimized Kafka broker, producer, and consumer settings for improved performance, reducing
    message delivery latency by 20%.
    Developed monitoring and alerting systems using Prometheus and Grafana to track Kafka cluster health and
    consumer lag, ensuring system uptime of 99.9%.
    Collaborated with the DevOps team to containerize Kafka components using Docker and deploy them on
    Kubernetes clusters for enhanced scalability and reliability.
Environments:
Kafka, Kafka Connect, KSQL, AWS (EC2, S3, RDS), Terraform, Ansible, Docker, Kubernetes, Prometheus, Grafana, SQL
Server, MongoDB, Python, Shell Scripting, Linux (Ubuntu)
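
The alerting logic described in the KSQL bullet can be pictured as a simple consume-filter-produce loop. The
sketch below uses the kafka-python client rather than KSQL, purely as a Python analogue; the topic names,
broker address, and amount threshold are all placeholders:

import json
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer(
    "transactions",                      # hypothetical source topic
    bootstrap_servers="localhost:9092",  # placeholder broker address
    group_id="fraud-filter",
    value_deserializer=lambda v: json.loads(v.decode("utf-8")),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Forward unusually large transactions to an alerts topic, analogous to a
# KSQL filter over the transaction stream (threshold is illustrative).
for msg in consumer:
    txn = msg.value
    if txn.get("amount", 0) > 10_000:
        producer.send("suspicious-transactions", txn)

KSQL expresses the same thing declaratively (CREATE STREAM ... AS SELECT ... WHERE amount > 10000), which is
closer to what the bullet describes.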



Education and Training
University of Maryland - Baltimore County - Baltimore, MD | Master of Science
Information Science


Vellore Institute of Technology - Vellore | Bachelor of Technology
Electrical, Electronics and Communications Engineering
