Azure Data Engineer Resume Allen, TX
Vinith Siripuram
Azure Data Engineer
PHONE NUMBER AVAILABLE
EMAIL AVAILABLE

PROFESSIONAL SUMMARY
- Experienced Azure Data Engineer and Big Data Engineer with over 10 years of expertise in the IT industry, specializing in Microsoft Azure Cloud and Big Data technologies. Proficient in Azure components, including Data Factory, Data Lake Gen2, Blob Storage, Databricks, Synapse, Event Hubs, Logic Apps, Function Apps, Key Vault, and Power Query for on-premises to Azure cloud solutions, and in Cosmos DB, HDFS, SQL, PL/SQL, Spark, Scala, Python, Kafka, Pig, Hive, Sqoop, and Oozie for big data.
- Proficient in designing and orchestrating scalable data workflows, automating complex processes using Azure Data Factory, and in utilizing Azure Blob Storage for effective data management, storage, and retrieval within the Azure ecosystem.
- Streamlined data analytics workflows by using Databricks for distributed data processing and transformations. Experienced in efficiently storing and managing large-scale data with Azure Data Lake Gen2, ensuring optimized data retrieval (a brief sketch follows the technical skills section below).
- Capable of designing and implementing workflows that integrate applications and services seamlessly using Azure Logic Apps. Proficient in developing and deploying serverless functions using Azure Function Apps, enabling efficient execution of code in response to events.
- Expert in designing cloud-based data warehouse solutions using Azure Synapse, optimizing schemas, tables, and views for enhanced data storage and retrieval efficiency. Competent in securing and managing sensitive information with Azure Key Vault, ensuring robust key management and safeguarding credentials.
- Proficient in optimizing Spark jobs for distributed data processing, leveraging Apache Spark for enhanced data analytics. Experienced in scripting and developing applications using Scala, a versatile language for building scalable and robust systems.
- Skilled in optimizing schemas, tables, and views in Snowflake, leveraging features such as Clone and Time Travel. Experienced in automating data loading with Snowpipe, executing SQL queries and managing databases with SnowSQL, and scheduling routine work with Snowflake Tasks for efficient database management.
- Proficient in implementing streaming and processing architectures using Apache Kafka, ensuring efficient and scalable event-driven architectures.
- Adept at utilizing Kafka's scalable and distributed messaging system to handle large data volumes, ensuring low-latency and fault-tolerant communication within complex data processing pipelines.
- Expert in utilizing HDFS for distributed storage and processing of large volumes of data within the Hadoop ecosystem.
- Proficient in working within the Hadoop ecosystem, encompassing tools and frameworks such as Hive, HDFS, YARN, Oozie, Zookeeper, and Sqoop for comprehensive big data processing solutions.
- Experienced in configuring workflows and scheduling jobs using Apache Oozie, ensuring efficient coordination of Hadoop jobs within the Hadoop ecosystem.
- Worked with Apache Zookeeper for distributed coordination and management within the Hadoop ecosystem, ensuring reliability and synchronization across distributed systems. Proficient in utilizing Apache Sqoop for importing and exporting data between HDFS, Hive, and relational databases, facilitating seamless data transfer within the Hadoop ecosystem.
- Experienced in handling diverse file formats, including CSV, Parquet, ORC, and binary, adapting to varied data storage and processing needs.
- Adept at utilizing Git, GitHub, and Azure DevOps for version control, ensuring collaboration, tracking changes, and maintaining codebase integrity in data engineering projects.
- Highly proficient in Agile methodologies, utilizing tools such as JIRA for effective project management, reporting, and adaptation to evolving project requirements in dynamic IT environments.

TECHNICAL SKILLS
Azure Services: Azure Data Factory, Azure Databricks, ADLS, Blob Storage, Event Hubs, Logic Apps, Function Apps, Azure Key Vault, Azure Synapse Analytics, Azure DevOps, Power BI
Big Data Technologies: Hadoop, Hive, Python, Spark, Scala, Kafka, Spark Streaming, Oozie, Sqoop, Zookeeper
Languages: SQL, PL/SQL, Python, HiveQL, Scala, C#
Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS
Version Control: Git, GitHub
IDE & Build Tools: Visual Studio Code, PyCharm
Databases & Data Warehouses: MS SQL Server 2016/2014/2012, Azure SQL DB, Azure Synapse, MS Excel, MS Access, Oracle 11g/12c, Cosmos DB, Snowflake
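The following is a minimal, hedged PySpark sketch of the Databricks and Data Lake Gen2 pattern summarized above, reading raw CSV files from ADLS Gen2 and writing curated Parquet back to the lake. The storage account, container, and column names are hypothetical placeholders, not details taken from this resume.

    # Minimal PySpark sketch (illustrative only). Assumes a Databricks or
    # Spark environment already configured with access to ADLS Gen2.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("adls_parquet_sketch").getOrCreate()

    raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/sales/*.csv"        # hypothetical path
    curated_path = "abfss://curated@examplestorage.dfs.core.windows.net/sales/"     # hypothetical path

    df = (spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv(raw_path))

    # Basic cleansing: drop fully empty rows and standardize a date column
    clean = (df.dropna(how="all")
               .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd")))    # hypothetical column

    # Write partitioned Parquet back to the data lake
    clean.write.mode("overwrite").partitionBy("order_date").parquet(curated_path)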
EDUCATION DETAILS
Master's: Auburn University, 2012
Bachelor's: Sri Indu College of Engineering and Technology, 2011

CERTIFICATIONS (Udemy)
- DP-203: Data Engineering on Microsoft Azure (2024)
- DP-900: Microsoft Azure Data Fundamentals (2024)

WORK EXPERIENCE

Role: Azure Data Engineer | Jan 2022 - Present
Client: Optum
Responsibilities:
- Designed and implemented scalable data ingestion pipelines using Azure Data Factory, ingesting data from various sources such as SQL databases, CSV files, and REST APIs.
- Developed data processing workflows using Azure Databricks, leveraging Spark for distributed data processing and transformation tasks.
- Ensured data quality and integrity by performing data validation, cleansing, and transformation operations using Azure Data Factory and Databricks.
- Designed and implemented a scalable data warehouse solution, integrating data processing and analytics capabilities for comprehensive insights. Streamlined CI/CD workflows, automating data pipeline deployments and implementing version control practices using Git for efficient collaboration and continuous integration.
- Managed and processed data in Azure Data Lake Gen2 for scalable and cost-effective storage, supporting the diverse data processing needs of the project.
- Designed and implemented workflows that integrate applications and services seamlessly using Azure Logic Apps, and developed and deployed serverless functions using Azure Function Apps, enabling efficient execution of code in response to events.
- Designed cloud-based data warehouse solutions using Azure Synapse, optimizing schemas, tables, and views for enhanced data storage and retrieval efficiency.
- Effectively utilized Azure DevOps to streamline CI/CD workflows, ensuring automated data pipeline deployments and efficient version control practices.
- Implemented real-time data streaming using Apache Kafka, managing Kafka clusters and topics to facilitate efficient and scalable event-driven architectures.
- Leveraged Cosmos DB's global distribution for high availability and low latency across regions, while benefiting from its multi-model support for flexible data handling.
- Conducted thorough performance tuning and capacity planning exercises to ensure the scalability and efficiency of the data infrastructure. Identified and resolved performance bottlenecks in the data processing and storage layers.
- Designed and implemented different pipeline types, including incremental, full-load, and historical data processing and retrieval pipelines, and orchestrated complex workflows to handle diverse data integration scenarios.
- Designed and optimized database schemas for OLTP systems to ensure data integrity and performance, and extracted, transformed, and loaded (ETL) data from multiple sources into OLAP systems for comprehensive analysis. Implemented watermark columns for tracking data changes and managing incremental loads (see the sketch below). Established comprehensive logging mechanisms to capture pipeline activities and errors, configured email alert notifications for critical events, and integrated monitoring solutions to track data flow and transformations.
- Worked with various file formats, including CSV, Parquet, ORC, Avro, and binary, to optimize data storage and retrieval based on specific analytics requirements.
Environment: Azure Databricks, Data Factory, Snowflake, Logic Apps, Function Apps, Azure Synapse Analytics, MS SQL, Oracle, Spark, Hive, SQL, Python, Scala, shell scripting, Kafka, ADF pipelines.
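A hedged sketch of the watermark-driven incremental load mentioned above, expressed here in PySpark; the JDBC URL, source table, and watermark bookkeeping table are hypothetical placeholders, and the actual pipelines were orchestrated through Azure Data Factory rather than exactly as shown.

    # Illustrative watermark-based incremental load (PySpark). Requires the
    # SQL Server JDBC driver on the cluster; all names are placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("incremental_load_sketch").getOrCreate()

    jdbc_url = "jdbc:sqlserver://example-server.database.windows.net:1433;database=SalesDB"   # hypothetical

    # 1. Look up the last successfully loaded watermark value
    last_wm = (spark.read.format("jdbc")
               .option("url", jdbc_url)
               .option("dbtable", "dbo.WatermarkTable")            # hypothetical bookkeeping table
               .load()
               .agg(F.max("WatermarkValue"))
               .collect()[0][0])

    # 2. Pull only the rows changed since that watermark
    delta_query = f"(SELECT * FROM dbo.Orders WHERE ModifiedDate > '{last_wm}') AS delta"      # hypothetical source
    delta = (spark.read.format("jdbc")
             .option("url", jdbc_url)
             .option("dbtable", delta_query)
             .load())

    # 3. Append the delta to the lake; a follow-up activity would advance the watermark
    delta.write.mode("append").parquet(
        "abfss://curated@examplestorage.dfs.core.windows.net/orders/")              # hypothetical path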
Role: Azure Data Engineer | Sep 2019 - Dec 2021
Client: State of Texas (Remote)
Responsibilities:
- Implemented end-to-end data pipelines using Azure Data Factory to extract, transform, and load (ETL) data from diverse sources.
- Implemented data governance practices and data quality checks using Azure Data Factory and Snowflake, ensuring data accuracy and consistency.
- Utilized Azure Blob Storage for efficient storage and retrieval of data files, implementing compression and encryption techniques to optimize storage costs and data security.
- Designed and implemented data processing workflows using Azure Databricks, leveraging Spark for large-scale data transformations, and optimized data pipelines and Spark jobs for improved performance by tuning Spark configurations, caching, and utilizing data partitioning techniques.
- Developed data ingestion pipelines using Azure Event Hubs and Azure Functions to enable real-time data streaming into Snowflake (see the sketch below).
- Leveraged Azure Data Lake Storage Gen2 as a data lake for storing raw and processed data, implementing data partitioning and data retention strategies.
- Integrated Azure Data Factory with Azure Logic Apps to orchestrate complex data workflows and trigger actions based on specific events.
- Designed and implemented a cloud-based data warehouse solution using Snowflake on Azure, capitalizing on its scalability and performance. Optimized schemas, tables, and views in Snowflake, leveraging features such as Clone and Time Travel; automated data loading with Snowpipe; executed SQL queries and managed databases with SnowSQL; and scheduled routine tasks with Snowflake Tasks for efficient database management.
- Effectively utilized Azure DevOps to streamline CI/CD workflows, ensuring automated data pipeline deployments and efficient version control practices, using Git for collaboration and continuous integration.
- Designed and developed OLAP cubes/models to support complex analytical queries and reporting requirements, and implemented and maintained OLTP databases to efficiently handle real-time transactional data.
- Designed and optimized database schemas for OLTP systems to ensure data integrity and performance, and extracted, transformed, and loaded (ETL) data from multiple sources into OLAP systems for comprehensive analysis.
- Collaborated with cross-functional teams, including data scientists, data analysts, and business stakeholders, to understand data requirements and deliver scalable and reliable data solutions.
Environment: Azure Databricks, Data Factory, Logic Apps, Snowflake, Function Apps, MS SQL, Oracle, Spark, SQL, Python, Git, GitHub, Kafka, ADF pipelines, Power BI.
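A hedged Spark Structured Streaming sketch of the real-time ingestion described above, reading an Azure Event Hub through its Kafka-compatible endpoint and landing raw events in the lake for downstream loading (for example by Snowpipe). The namespace, hub name, and connection string are placeholders, and the actual solution also involved Azure Functions.

    # Illustrative Event Hubs (Kafka endpoint) -> ADLS streaming job.
    # Requires the spark-sql-kafka package; all names and secrets are placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("eventhub_stream_sketch").getOrCreate()

    bootstrap = "example-namespace.servicebus.windows.net:9093"                    # hypothetical namespace
    conn_str = "Endpoint=sb://example-namespace.servicebus.windows.net/;..."       # placeholder; keep in Key Vault

    sasl_jaas = ("org.apache.kafka.common.security.plain.PlainLoginModule required "
                 f'username="$ConnectionString" password="{conn_str}";')

    events = (spark.readStream.format("kafka")
              .option("kafka.bootstrap.servers", bootstrap)
              .option("kafka.security.protocol", "SASL_SSL")
              .option("kafka.sasl.mechanism", "PLAIN")
              .option("kafka.sasl.jaas.config", sasl_jaas)
              .option("subscribe", "sales-events")                                 # hypothetical event hub
              .load()
              .select(F.col("value").cast("string").alias("payload")))

    # Land the raw stream in ADLS; a downstream process loads it into Snowflake
    (events.writeStream
           .format("parquet")
           .option("path", "abfss://raw@examplestorage.dfs.core.windows.net/events/")
           .option("checkpointLocation",
                   "abfss://raw@examplestorage.dfs.core.windows.net/_checkpoints/events/")
           .start())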
Role: Big Data Developer | Oct 2017 - Aug 2019
Client: Kroger Inc., Blue Ash, OH
Responsibilities:
- Prepared and implemented an ETL framework using Sqoop, Pig, and Hive to extract, transform, and load data from diverse sources, ensuring availability for consumption.
- Extracted data from various sources into HDFS using Sqoop, facilitated data import, and performed transformations using Hive and MapReduce.
- Processed HDFS data and created external tables using Hive, developing scripts for table ingestion and repair for reuse across the project.
- Developed ETL jobs using Spark and Scala for migrating data from Oracle to new MySQL tables, utilizing Spark features such as RDDs, DataFrames, and Spark SQL (see the sketch below).
- Implemented a Spark Streaming application for real-time sales analytics, utilizing Spark's streaming capabilities for immediate insights.
- Analyzed source data, handled data type modifications, and generated Power BI ad-hoc reports using Excel sheets, flat files, and CSV files.
- Analyzed SQL scripts and designed solutions using Spark, contributing to effective data processing and analysis.
- Extracted data from MySQL into HDFS using Sqoop, ensuring seamless data transfer and processing.
- Implemented data classification algorithms using MapReduce design patterns, contributing to effective data processing and analysis.
- Implemented automation for deployments using YAML scripts, streamlining builds and releases for efficient development processes.
- Worked with a variety of big data technologies, including Apache Hive, Apache Pig, HBase, Apache Spark, Zookeeper, Kafka, and Sqoop.
- Worked extensively on creating combiners, partitioning, and distributed cache to enhance the performance of MapReduce jobs, ensuring optimal data processing capabilities.
Environment: Hadoop, Hive, Spark, Sqoop, Spark SQL, shell scripting, Cassandra, YAML, ETL.
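A hedged sketch of the Oracle-to-MySQL migration pattern described above; the original jobs were written in Scala, and this PySpark version only illustrates the same DataFrame/JDBC approach with hypothetical connection details and table names.

    # Illustrative JDBC migration job (PySpark). Requires the Oracle and MySQL
    # JDBC drivers on the cluster; all connection details are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("oracle_to_mysql_sketch").getOrCreate()

    src = (spark.read.format("jdbc")
           .option("url", "jdbc:oracle:thin:@//example-host:1521/ORCLPDB1")        # hypothetical
           .option("dbtable", "SALES.ORDERS")                                      # hypothetical
           .option("user", "etl_user")
           .option("password", "<from-secret-store>")
           .load())

    # Example DataFrame transformation before the load step
    curated = src.dropDuplicates(["ORDER_ID"]).filter("STATUS IS NOT NULL")

    (curated.write.format("jdbc")
            .option("url", "jdbc:mysql://example-host:3306/analytics")             # hypothetical
            .option("dbtable", "orders")
            .option("user", "etl_user")
            .option("password", "<from-secret-store>")
            .mode("append")
            .save())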
Role: Data Warehouse Developer | Feb 2015 - Apr 2017
Client: Mayo Clinic, Rochester, MN
Responsibilities:
- Worked in Agile Scrum methodology, participating in daily stand-up meetings, and used Visual SourceSafe with Visual Studio 2010 for version control.
- Generated drill-through and drill-down reports in Power BI, incorporating features such as drop-down menu options, data sorting, and defined subtotals.
- Used the data warehouse to develop data marts feeding downstream reports, and developed a user access tool enabling users to create ad-hoc reports and run queries against the proposed cube.
- Deployed SSIS packages and created jobs for the efficient running of packages, ensuring smooth ETL processes.
- Created ETL packages using SSIS to extract data from heterogeneous databases and transform and load it into the data mart.
- Conducted regular regression testing to ensure ETL processes function correctly after system upgrades or changes (see the sketch below).
- Set up and configured test environments, ensuring data consistency and replication of production scenarios for accurate testing.
- Created SSIS jobs to automate report generation and cube refresh packages.
- Used SQL Server Reporting Services (SSRS) to author, manage, and deliver both paper-based and interactive web-based reports.
- Developed stored procedures and triggers to facilitate consistent data entry into the database.
- Shared data externally using Snowflake data sharing, quickly setting up shares without the need for extensive data transfers or pipeline development.
Environment: Windows Server, MS SQL Server 2014, SSIS, SSAS, SSRS, SQL Profiler, Power BI, PerformancePoint Server, MS Office, SharePoint.
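A hedged Python sketch of the kind of ETL regression check referenced above, reconciling row counts between a staging table and its target; the server, database, and table names are hypothetical, and the production checks were built around SSIS rather than exactly as shown.

    # Illustrative reconciliation check: source vs. target row counts.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=example-sql-server;DATABASE=SalesDW;Trusted_Connection=yes;"       # hypothetical
    )
    cur = conn.cursor()

    def row_count(table):
        # COUNT(*) keeps the check cheap and schema-agnostic
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        return cur.fetchone()[0]

    staged = row_count("stg.Orders")     # hypothetical staging table
    loaded = row_count("fact.Orders")    # hypothetical fact table

    if staged != loaded:
        raise AssertionError(f"Row count mismatch: staged={staged}, loaded={loaded}")
    print("Reconciliation passed")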
Role: Data Warehouse Developer | Feb 2013 - Jan 2015
Client: JP Morgan Chase, New Haven, CT
Responsibilities:
- Developed complex stored procedures, efficient triggers, and required functions to optimize performance.
- Monitored SQL Server performance and implemented tuning strategies.
- Designed ETL data flows using SSIS, creating mappings and workflows for data extraction and transformation from SQL Server, Access, and Excel sheets.
- Built cubes and dimensions with different architectures and data sources for business intelligence.
- Developed and executed comprehensive data validation tests for accuracy and integrity during the extraction and transformation stages.
- Designed and documented test cases and scenarios based on ETL specifications and business requirements.
- Conducted performance testing to assess the efficiency and scalability of ETL processes.
- Developed and maintained ETL testing scripts using automation tools such as Data Validation Option, QuerySurge, and Apache JMeter.
- Developed dimensional data models for data mart design, identifying facts and dimensions.
- Used Slowly Changing Dimensions (SCD) when creating fact and dimension tables (see the sketch below).
- Possessed thorough knowledge of the features, structure, attributes, hierarchies, and star and snowflake schemas of data marts. Developed SSAS cubes, aggregations, KPIs, measures, cube partitioning, and data mining models, and deployed and processed SSAS objects.
- Developed ad-hoc reports with complex formulas, querying the database for business intelligence. Demonstrated flexibility, enthusiasm, and a project-oriented, team-player attitude with excellent written and verbal communication and leadership skills.
Environment: MS SQL Server 2016, Visual Studio 2017/2019, SSIS, SharePoint, MS Access, Team Foundation Server.
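A hedged, simplified illustration of the Type 2 Slowly Changing Dimension pattern mentioned above; the original implementation used SSIS, and this small pandas example only shows the expire-and-insert logic on hypothetical data.

    # Illustrative SCD Type 2 handling: expire the old version, insert the new one.
    import pandas as pd

    dim = pd.DataFrame([
        {"customer_id": 1, "city": "Austin", "current_flag": True,
         "valid_from": "2014-01-01", "valid_to": None},
    ])
    incoming = pd.DataFrame([
        {"customer_id": 1, "city": "Dallas"},   # changed attribute triggers a new version
    ])

    load_date = "2015-01-01"

    # Compare incoming rows against the current dimension rows
    current = dim[dim["current_flag"]]
    merged = incoming.merge(current, on="customer_id", how="left", suffixes=("", "_dim"))
    changed = merged[merged["city"] != merged["city_dim"]]

    # Expire the superseded versions...
    expire_mask = dim["customer_id"].isin(changed["customer_id"]) & dim["current_flag"]
    dim.loc[expire_mask, "current_flag"] = False
    dim.loc[expire_mask, "valid_to"] = load_date

    # ...and append the new current rows
    new_rows = changed[["customer_id", "city"]].assign(
        current_flag=True, valid_from=load_date, valid_to=None
    )
    dim = pd.concat([dim, new_rows], ignore_index=True)
    print(dim)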
