
Data Engineer Azure Resume - Irving, TX

Candidate's Name
Irving, TX | LinkedIn | PHONE NUMBER AVAILABLE | EMAIL AVAILABLE

PROFESSIONAL SUMMARY
Dynamic and results-driven Data Engineer with over 6 years of experience in designing and implementing complex data solutions.
Extensive expertise in Azure Data Factory (ADF), Apache Spark, and Databricks for developing intricate data pipelines and real-time data processing solutions.
Skilled in integrating data from diverse on-premises and cloud-based sources, ensuring data accuracy, consistency, and availability.
Proven ability to implement large-scale data processing solutions, optimize ETL workflows, and manage secure and scalable data storage with Azure Data Lake Storage (ADLS).
Proficient in developing real-time data streaming solutions using Azure Stream Analytics, Event Hubs, and Kafka, enabling near real-time data analytics.
Adept at transforming and integrating data from various sources such as SQL Server, Cosmos DB, and Blob Storage, applying advanced data transformation techniques for high-quality analytics.
Experience developing Spark SQL applications and ETL solutions in Azure Databricks for data extraction, transformation, and aggregation across multiple file formats, uncovering insights into customer usage patterns.
Experience in data migration from various sources to the Azure cloud using Azure Data Factory.
Experience with distributed computing frameworks such as Apache Spark for processing large-scale datasets, optimizing PySpark (the Python API for Apache Spark) jobs for performance and scalability.
Experience in debugging and troubleshooting PySpark applications, identifying and resolving performance bottlenecks, and ensuring data quality and accuracy in ETL processes.
Experienced in designing scalable data architectures using Azure Synapse Analytics and implementing CI/CD pipelines with Azure DevOps for efficient data workflows.
Demonstrated success in optimizing database performance, enhancing query efficiency, and ensuring data integrity and security across platforms.
Strong background in developing interactive dashboards and reports using Tableau and Power BI, facilitating data-driven decision-making.
Expertise in machine learning model development, achieving significant accuracy improvements.
Committed to data governance and security best practices, managing data access controls and encryption to protect sensitive information.
Excellent collaboration skills, working closely with data scientists, analysts, and business stakeholders to define data requirements and deliver actionable insights.
Experienced in managing cloud resources and budgets, ensuring cost-effective use of Azure services and implementing autoscaling for high availability.
Developed scalable and efficient real-time and batch data processing workflows using Google Cloud Dataflow and BigQuery (GCP).
Dedicated to continuous learning and professional growth, holding certifications in Generative AI Fundamentals, Python for Data Science, and SQL and Relational Databases, and staying abreast of emerging technologies and best practices in software development and cloud computing.
Strong communication and documentation skills, demonstrated by comprehensive technical guides, runbooks, and knowledge base articles that convey complex technical concepts clearly and facilitate knowledge sharing, team collaboration, and project success.

TECHNICAL SKILLS
Programming Languages : Python, SQL, Java, JavaScript, PL/SQL, Shell Scripting
Operating Systems : Windows, Linux/Unix
Web Technologies : HTML, CSS, JavaScript
Databases : Oracle, SQL Server, MySQL, MongoDB, Greenplum
Version Control : Git, Bitbucket
Cloud and DevOps : AWS, Azure, GCP, Docker, Jenkins
IDEs : Jupyter Notebook, VS Code
Design Methodologies : Agile (Scrum, Kanban), Waterfall
Data Analysis & Visualization : Pandas, NumPy, Matplotlib, Seaborn, Tableau, Power BI

EDUCATION
Northwest Missouri State University, Maryville, MO
MS in Applied Computer Science

CERTIFICATIONS & ACTIVITIES
Azure Data Engineer Associate - Microsoft Certified
Academy Accreditation - Generative AI Fundamentals - Databricks
Python for Data Science - IBM
SQL and Relational Databases - cognitiveclass.ai, powered by IBM Developer Skills Network
Participated in Aagama 16, a national-level student technical paper and working model contest conducted by DVR & Dr. Hima Sekhar MIC College of Technology.

WORK EXPERIENCE
Vitosha Inc., King of Prussia, PA
Data Engineer, Jul 2023 - Present
Developed intricate data pipelines leveraging Azure Data Factory (ADF) to seamlessly integrate data from a variety of on-premises and cloud-based sources, designing workflows to ensure data accuracy, consistency, and availability.
Implemented large-scale data processing solutions using Apache Spark and Databricks; developed Spark jobs to process and analyze data in real time, improving data throughput and reducing latency.
Automated ETL workflows in ADF with a combination of time-based, event-based, and custom triggers, integrating error handling and logging mechanisms to ensure efficient and reliable data processing.
Managed Azure Data Lake Storage (ADLS) for secure and scalable data storage, implementing data lifecycle policies to optimize storage costs and performance.
Developed real-time data streaming solutions using Azure Stream Analytics, Event Hubs, and Kafka, enabling near real-time analytics and monitoring for critical business applications.
Used Azure Data Factory and Azure Databricks to transform and integrate data from sources such as SQL Server, Cosmos DB, and Blob Storage, applying data transformation techniques to ensure data quality and readiness for analytics (see the sketch below).
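A minimal PySpark sketch of the kind of Databricks batch transformation described above; the ADLS paths, file layout, and column names are hypothetical and purely illustrative:

    # Minimal PySpark sketch (hypothetical paths and columns): read raw files
    # from ADLS, apply Spark SQL-style transformations, write curated output.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("curate_orders").getOrCreate()

    # Raw zone: JSON and CSV drops landed by ADF copy activities (illustrative paths).
    orders = spark.read.json("abfss://raw@datalake.dfs.core.windows.net/orders/")
    customers = (spark.read.option("header", "true")
                 .csv("abfss://raw@datalake.dfs.core.windows.net/customers/"))

    # Join, clean, and aggregate.
    curated = (orders.join(customers, "customer_id", "left")
               .withColumn("order_date", F.to_date("order_ts"))
               .groupBy("order_date", "region")
               .agg(F.sum("amount").alias("daily_revenue"),
                    F.countDistinct("order_id").alias("order_count")))

    # Curated zone: partitioned Parquet for downstream Synapse / Power BI use.
    (curated.write.mode("overwrite")
     .partitionBy("order_date")
     .parquet("abfss://curated@datalake.dfs.core.windows.net/daily_revenue/"))

In practice a job like this would typically run as a Databricks notebook or job task orchestrated from an ADF pipeline.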
Designed and implemented scalable data architectures using Azure Synapse Analytics for large-scale data warehousing, optimizing data ingestion and query performance through partitioning, indexing, and caching strategies.
Implemented continuous integration and deployment (CI/CD) pipelines for data workflows using Azure DevOps, automating testing and deployment processes to ensure rapid and reliable delivery of data solutions.
Enforced data governance policies and security best practices across all data engineering activities, managing data access controls and encryption to protect sensitive information.
Collaborated with data scientists, analysts, and business stakeholders to define data requirements and deliver insights.
Conducted performance tuning of ETL processes and Spark jobs to reduce execution time and resource usage, using monitoring tools to identify bottlenecks and optimize resource allocation.
Contributed to the development of a banking application in Python, implementing features such as transaction processing, account management, and real-time fraud detection algorithms.
Managed cloud resources and budgets to ensure cost-effective use of Azure services, implementing autoscaling and load balancing for high availability and performance.
Led data migration projects from on-premises databases to Azure, ensuring minimal downtime and data integrity.
Integrated Power BI with Azure Synapse and Databricks for advanced analytics and reporting, developing interactive dashboards and reports to support data-driven decision-making.
Utilized Azure Database Migration Service (DMS) for seamless and efficient data transfer.
Implemented a metadata management system to track data lineage and improve data discoverability, using Azure Data Catalog as the centralized metadata repository.
Utilized Python's pandas and matplotlib libraries to analyze financial data, identifying trends and generating visualizations to aid decision-making.
Established data quality frameworks and validation rules to ensure high-quality data across the pipeline, conducting regular data audits and implementing automated validation checks.
Tech Stack: Azure Data Factory (ADF), Azure Data Lake Storage (ADLS), Azure Synapse Analytics, Azure Stream Analytics, Azure Event Hubs, Azure DevOps, Azure Database Migration Service (DMS), Azure Data Catalog, Apache Spark, Databricks, SQL Server, Cosmos DB, Blob Storage, Kafka, Power BI, Hadoop

Northwest Missouri State University, Maryville, MO
Data Engineer, Feb 2022 - May 2023
Spearheaded the collection and preprocessing of data from diverse sources, including databases, APIs, and external datasets, ensuring high accuracy and consistency.
Led the optimization of university databases, enhancing query performance by 30% and ensuring data integrity and security across all platforms.
Worked in a cross-functional team to develop an enterprise-grade web application, ensuring adherence to coding standards, security, and scalability.
Designed and implemented the web application architecture using Python, FastAPI, Docker, Linux, and cloud services.
Integrated Canvas LMS and other third-party APIs into the web application, enhancing functionality and enabling seamless data synchronization, which improved the user experience.
Conducted in-depth analysis of student data, uncovering key insights and trends that informed strategic decisions (a representative analysis sketch follows below).
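A short pandas/matplotlib sketch of the kind of exploratory analysis described in these roles; the input file and column names are hypothetical:

    # Hypothetical exploratory analysis: load a CSV, compute a trend, and plot it.
    import pandas as pd
    import matplotlib.pyplot as plt

    # Illustrative input; the actual sources were institutional databases and APIs.
    df = pd.read_csv("course_outcomes.csv", parse_dates=["term_start"])

    # Average final grade by quarter, used as a simple trend indicator.
    trend = (df.groupby(df["term_start"].dt.to_period("Q"))["final_grade"]
               .mean()
               .rename("avg_final_grade"))

    ax = trend.plot(kind="line", marker="o", title="Average final grade by term")
    ax.set_xlabel("Term")
    ax.set_ylabel("Average grade")
    plt.tight_layout()
    plt.savefig("grade_trend.png")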
Utilized advanced visualization tools, including Tableau, to present findings to stakeholders, enhancing data-driven decision-making.
Developed interactive dashboards in Tableau to track student performance metrics, providing real-time insights to faculty and administrators.
Developed and implemented machine learning models to predict student academic performance and graduation rates, achieving a 15% improvement in model accuracy through iterative enhancements.
Designed and executed data pipelines to automate ETL processes, reducing manual data handling time by 40% and ensuring smooth data flow between systems.
Implemented stringent data governance policies, ensuring adherence to GDPR and FERPA regulations, and maintained comprehensive documentation and metadata for datasets.
Fostered strong relationships with faculty, administrators, and IT staff to align data projects with university priorities, delivering clear and actionable insights that enhanced data literacy across the institution.
Demonstrated a commitment to professional growth by mastering programming languages (Python, SQL), database systems (MySQL, PostgreSQL), and data processing frameworks (Apache Spark), while balancing professional responsibilities with coursework.
Developed and maintained Tableau dashboards for various departments, enabling data visualization and reporting capabilities that improved decision-making processes.
Trained faculty and staff on Tableau best practices, enhancing their ability to generate and interpret visual reports independently.
Tech Stack: Python, SQL, Apache Spark, PySpark, TensorFlow, Keras, Scikit-learn, MySQL, PostgreSQL, Oracle, Tableau, FastAPI, RESTful APIs, Docker, Linux, AWS, Azure, GDPR, FERPA

Acuity Software Technologies Ltd., Hyderabad, India
Software Engineer, Jul 2021 - Dec 2021
Utilized Scala, Python, and PySpark to develop and optimize data pipelines, implementing data cleansing, transformation, and enrichment processes to ensure high data quality for business applications.
Leveraged Spark MLlib and TensorFlow to build and deploy scalable machine learning models, using Keras for deep learning model development to enhance predictive analytics for bespoke applications.
Implemented real-time data streaming solutions using Spark Streaming and Kafka, enabling near real-time analytics for integrations with third-party accounting software and CRM applications (see the streaming sketch below).
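A minimal sketch of the kind of Kafka-based streaming pipeline described above, shown here with Spark's Structured Streaming API; the broker address, topic, and event schema are hypothetical, and the spark-sql-kafka connector must be on the Spark classpath:

    # Hypothetical Kafka-to-console streaming aggregation.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("invoice_stream").getOrCreate()

    schema = StructType([
        StructField("invoice_id", StringType()),
        StructField("account", StringType()),
        StructField("amount", DoubleType()),
    ])

    # Read raw events from Kafka (broker and topic names are illustrative).
    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "invoices")
              .load())

    # Kafka values arrive as bytes; parse the JSON payload into typed columns.
    parsed = (events.selectExpr("CAST(value AS STRING) AS json")
              .select(F.from_json("json", schema).alias("e"))
              .select("e.*"))

    # Running total per account, written to the console for demonstration.
    query = (parsed.groupBy("account")
             .agg(F.sum("amount").alias("total_amount"))
             .writeStream.outputMode("complete")
             .format("console")
             .start())
    query.awaitTermination()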
Developed data processing workflows using Google Cloud Dataflow (GCP) for real-time and batch processing, enhancing the scalability and efficiency of data pipelines.
Utilized BigQuery (GCP) for data warehousing and analytics, executing complex SQL queries to derive business insights and optimize data retrieval.
Conducted complex data analyses and visualizations using Spark SQL and Power BI, developing interactive dashboards that provided insight into data trends and enhanced business decision-making.
Automated ETL workflows using PySpark, reducing manual intervention and increasing efficiency; implemented error handling and logging mechanisms to ensure robust and reliable data processing.
Integrated data from various sources, including QuickBooks and GoldMine Contact Manager, into centralized data platforms, ensuring seamless data synchronization and availability for business applications.
Collaborated with cross-functional teams to design and develop custom applications tailored to client-specific requirements, leveraging advanced data processing and machine learning techniques to deliver high-performance solutions.
Applied machine learning algorithms and Python libraries to analyze user behavior and preferences, enabling personalized recommendations tailored to individual user profiles; continuous optimization and refinement increased user engagement and customer loyalty, improving conversion rates and revenue growth.
Automated tasks with Cloud Functions and created advanced dashboards with Looker, ensuring data security and compliance.
Developed and maintained interactive dashboards in Power BI, providing real-time insights into key business metrics and improving data visibility and accessibility for stakeholders.
Conducted performance tuning of Spark jobs and machine learning models to reduce execution time and optimize resource usage, using monitoring tools to identify and resolve bottlenecks.
Worked closely with data scientists, analysts, and business stakeholders to define data requirements and deliver actionable insights, facilitating clear communication and documentation to ensure project alignment and success.
Used Google Cloud Storage (GCP) for storing and managing large datasets, ensuring high availability and durability of data.
Implemented data governance policies and best practices to ensure data security and compliance within the GCP environment, managing data access controls and encryption to protect sensitive information.
Applied innovative approaches to solve complex data challenges, delivering reliable, flexible, and easy-to-use software solutions while embracing new technologies to drive continuous improvement and customer satisfaction.
Tech Stack: Scala, Python, GCP, PySpark, Spark MLlib, Spark SQL, TensorFlow, NumPy, Keras, Power BI, Kafka

Capgemini, Bangalore, India
Software Engineer, May 2018 - Jul 2021
Implemented advanced security protocols on Linux servers, reducing system vulnerabilities and increasing overall network uptime.
Streamlined server maintenance through automation with Bash scripting, decreasing system downtime.
Collaborated with cross-functional teams to troubleshoot and resolve complex Linux server issues, increasing system performance and enhancing the overall user experience.
Managed the deployment of scripts to pre-production and production servers based on change and incident management tickets in ServiceNow, ensuring adherence to change management processes and minimizing disruption to critical systems.
Proactively managed database resources by expanding tablespace allocations upon user request, ensuring optimal performance and accommodating growing data needs while adhering to organizational resource allocation policies.
Configured Oracle and Greenplum database systems for development, UAT, and production environments.
Optimized database performance by implementing advanced indexing techniques in Greenplum and Oracle systems.
Performed routine table maintenance in Greenplum (ANALYZE, reindexing, vacuuming, and scheduled maintenance based on database activity, data growth, and workload patterns) to ensure optimal performance and data integrity (a maintenance sketch follows this group of bullets).
Implemented data backup and recovery strategies for critical databases, increasing system reliability and reducing database downtime.
Developed automated monitoring tools for database health checks, reducing manual intervention time.
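A minimal Python sketch of the kind of automated Greenplum maintenance and health-check routine described above; connection details and table names are hypothetical, and psycopg2 is used because Greenplum speaks the PostgreSQL wire protocol:

    # Hypothetical maintenance job: VACUUM ANALYZE a list of tables and log
    # basic health metrics.
    import psycopg2

    TABLES = ["sales.orders", "sales.order_items"]  # illustrative table names

    conn = psycopg2.connect(host="gp-master", dbname="analytics",
                            user="maint_user", password="***")
    conn.autocommit = True  # VACUUM cannot run inside a transaction block

    with conn.cursor() as cur:
        for table in TABLES:
            # Refresh planner statistics and reclaim dead space.
            cur.execute(f"VACUUM ANALYZE {table};")

            # Simple health check: row count and on-disk size.
            cur.execute(f"SELECT count(*), "
                        f"pg_size_pretty(pg_total_relation_size(%s)) "
                        f"FROM {table};", (table,))
            rows, size = cur.fetchone()
            print(f"{table}: {rows} rows, {size}")

    conn.close()

A routine like this would typically be run from cron or an enterprise scheduler during low-activity windows.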
Designed and modified PL/SQL stored procedures, triggers, and sequences to meet business requirements and enhance database performance.
Performed backup and restoration, implemented recovery procedures, managed performance tuning, and conducted regular system backups.
Created and maintained user accounts, privileges, roles, and profiles; provided technical and functional support to clients, troubleshooting issues and developing and executing SQL queries.
Developed robust, reusable shell scripts to automate complex data migration processes, enabling seamless migration of data from on-premises Greenplum databases to AWS while optimizing performance and minimizing manual intervention.
Developed Python scripts to automate repetitive tasks across various applications, including data extraction from diverse sources, manipulation and transformation, and generation of comprehensive reports for stakeholders, streamlining workflows and reducing manual effort across domains (a small automation sketch follows the tech stack below).
Conveyed complex technical concepts clearly and concisely through well-structured documentation, creating comprehensive runbooks, technical guides, and knowledge base articles that facilitated knowledge sharing and collaboration among team members and improved the efficiency of migration initiatives.
Tech Stack: Linux servers, Oracle, Greenplum, PL/SQL, PostgreSQL, Python, Shell scripting, ServiceNow, AWS, PuTTY, GPCC, Oracle Enterprise Manager, TOAD, pgAdmin III, DBeaver, Aginity Workbench
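A small Python sketch of the kind of report-generation automation described above; the export locations and column names are hypothetical:

    # Hypothetical automation: combine nightly CSV exports from several
    # applications and produce a summary report for stakeholders.
    import glob
    import pandas as pd

    # Illustrative locations of nightly exports.
    frames = [pd.read_csv(path) for path in glob.glob("exports/*/tickets_*.csv")]
    tickets = pd.concat(frames, ignore_index=True)

    # Summarise ticket volume and average resolution time per application.
    report = (tickets.groupby("application")
              .agg(ticket_count=("ticket_id", "count"),
                   avg_resolution_hours=("resolution_hours", "mean"))
              .round(2))

    report.to_csv("weekly_ticket_report.csv")
    print(report)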
