Candidate's Name
Email: EMAIL AVAILABLE

Profile Summary:
- 9+ years of experience in the IT industry as an Azure Data Engineer, Power BI developer, and Python/PySpark developer, covering business requirements gathering, design, analysis, development, and enhancement of client/server business systems such as Business Intelligence and data warehousing.
- Over 5 years of experience in data munging, data cleaning, data analytics, and data visualization on Azure data ecosystems using Azure Data Lake Gen2, Azure Data Factory, Azure Synapse Analytics, Azure Databricks, Azure Blob Storage, Azure SQL Server, Power BI, and SSRS.
- Experience working in architectures built on Azure data platform capabilities such as ADLS, Azure Data Factory, Azure SQL, and Azure SQL DW.
- Experience developing Spark applications using Spark SQL/PySpark in Databricks for ETL from multiple file formats (Parquet, CSV), analyzing and transforming data to identify customer usage patterns.
- Orchestrated data ingestion processes using Azure Data Factory (ADF), integrating data from various sources into Azure Data Lake Storage (ADLS) to create robust analytical data pipelines.
- Leveraged Python libraries such as Pandas, scikit-learn, NumPy, Matplotlib, and Seaborn for data wrangling, analysis, and visualization, enhancing data insights and decision-making.
- Extensive experience with Azure Data Factory, Azure Databricks, Delta Lake, and Lakehouse architectures for batch and streaming data solutions.
- Experience integrating Snowflake with Azure data services to optimize data warehousing and analytics solutions.
- Designed and implemented data pipelines with Azure Databricks for data cleaning, transformation, and loading into Azure Synapse Analytics, ensuring efficient data processing and analysis.
- Conducted data wrangling and exploratory data analysis (EDA) to uncover key metrics and actionable insights, supporting data-driven decision-making.
- Experience building data governance solutions using Unity Catalog and Azure Purview.
- Developed, integrated, and maintained data catalog and metadata systems in the cloud, ensuring efficient data organization and accessibility.
- Executed batch and near-real-time data integration processes and led data migration projects, ensuring seamless data transfer and consistency.
- Developed and maintained data pipelines, wrote complex queries, and built data visualization reports, ensuring efficient data flow and addressing business cases.
- Designed and implemented Data Vault modeling in Azure Synapse to ensure scalable and auditable data integration, providing a robust foundation for enterprise data warehousing.
- Experience migrating on-premises ETL processes to the cloud.
- Implemented security layers in Delta Lake and massively parallel processing layers in Spark SQL and PySpark.
- Utilized version control tools like Git and built CI/CD pipelines to streamline the deployment and maintenance of data and ML models.
- Utilized Kafka and Azure Event Hubs for seamless real-time data streaming and integration.
- Experience implementing ETL pipelines using Azure services such as ADF, PySpark, and Databricks.
- Working experience building pipelines in ADF using Linked Services/Datasets/Pipelines to extract and load data from different sources such as Azure SQL, ADLS, Blob Storage, and Azure SQL Data Warehouse.
- Worked with ADF control flow activities such as ForEach, Lookup, Until, Web, Wait, and If Condition.
- Expert in RDBMS management and optimization, utilizing Azure SQL Server to enhance data storage, retrieval, and performance for large-scale business applications.
- Configured Auto Loader in Azure Databricks notebooks to ingest data into Delta tables (see the sketch after this list).
- Hands-on experience incrementally processing new data files as they arrive in Data Lake Gen2 using Auto Loader, without any additional setup.
- Good understanding of visualization and reporting tools including Power BI Desktop, Power BI Service, Power Query Editor, and MS Excel.
- Implemented Delta Lake features such as time travel, schema enforcement, and merge functionality.
- Scheduled notebooks using ADF V2.
- Monitored and troubleshot ADF jobs in production.
- Experienced in writing stored procedures in databases, with very good knowledge of joins.
- Experience with the Always On high availability feature in the Azure cloud, including Load Balancer.
- Good working knowledge of project management activities in Agile/JIRA.
- Good knowledge of integrating Databricks with different storage services and databases.
- Hands-on experience with the different cluster types in Databricks and their uses, with good troubleshooting skills for cluster issues.
- Created numerous simple to complex queries involving self joins, correlated subqueries, CTEs, and XML techniques for diverse business requirements; tuned and optimized queries by altering database design, analyzing query options, and applying indexing strategies.
- Experience developing interactive dashboards with NLP integration to deliver dynamic reports that help clients make data-driven decisions.
- Solid understanding of DAX, M language, and VBA macros.
- Experience developing ETL applications on large volumes of data using tools such as PySpark, Azure Data Factory, and Azure SQL Server.
- Experience with various databases such as MySQL, SQL Server, and Azure SQL Server.
- Experience in continuous integration and deployment (CI/CD) using build tools like Jenkins, Maven, and Ant.
- Experience processing large datasets using PySpark.
- Exposure to Microsoft Logic Apps on the Azure Marketplace to automate custom email generation.
- Expertise in creating, debugging, scheduling, and monitoring ADF pipelines for continuous, automated ETL processing into a Snowflake data warehouse.
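The Auto Loader bullets above refer to Databricks' incremental file-ingestion source. A minimal sketch of that pattern follows; the ADLS paths and target table names are hypothetical, and `spark` is assumed to be the session a Databricks notebook provides:

```python
# Minimal Auto Loader sketch (Databricks notebook context).
# All paths, containers, and table names below are illustrative placeholders.
raw_path = "abfss://landing@mydatalake.dfs.core.windows.net/sales/"

stream = (
    spark.readStream.format("cloudFiles")          # Auto Loader source
    .option("cloudFiles.format", "csv")            # incoming file format
    .option("cloudFiles.schemaLocation",           # where inferred schema is tracked
            "abfss://meta@mydatalake.dfs.core.windows.net/schemas/sales/")
    .option("header", "true")
    .load(raw_path)
)

(
    stream.writeStream.format("delta")
    .option("checkpointLocation",                  # enables incremental, exactly-once processing
            "abfss://meta@mydatalake.dfs.core.windows.net/checkpoints/sales/")
    .trigger(availableNow=True)                    # process newly arrived files, then stop
    .toTable("bronze.sales_raw")                   # hypothetical Delta target table
)
```

The checkpoint location is what lets Auto Loader pick up only new files on each run, matching the "without any additional setup" bullet above.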
Technical Skills:
Repository/Version/Release Control: Azure Repos, Azure Pipelines, GitHub
Project Management: JIRA, Quality Center, Rally
Visualization and Reporting Tools: MS Excel, SSRS, Power BI
Programming Languages/Frameworks/Tools: Python, SQL, DAX, M Language, VBA Macros
ML Frameworks: Pandas, PySpark, NumPy, Matplotlib
Cloud Technologies: Azure and AWS
Scripting Languages: Shell Scripting, VBScript
Operating Systems: Windows, Ubuntu Linux, macOS
Domain: Healthcare and Auto Insurance

Educational Qualifications:
- Diploma in Cloud Technologies, Blockchain and IoT - IIT-M (Great Learning), 2021
- Master's in Pharmaceuticals & Regulatory - Andhra University, India, 2015
- Bachelor of Pharmaceutical Sciences - Andhra University, India, 2012

Software Certifications:
- Microsoft - Azure Fundamentals, 2023
- Microsoft - Certified Power BI Data Analyst, 2023

Professional Experience:

Client Name: Optum Global Solutions, Remote (Feb 2021 - Present)
Role: Lead Azure Data Engineer
Responsibilities:
- Design and development, unit testing, integration, deployment packaging and checkout, and scheduling of various components in Azure Data Factory and Azure Databricks through several SDLCs, implementing ETL/ELT processes for a cloud-based very large data warehouse and ODS integrated with data for subject areas from disparate data sources, with processes supporting downstream data marts.
- Performed analysis of ETL/ELT applications and database systems, data structure analysis, and data profiling for application design and defect resolution.
- Converted legacy ETL processes into an Azure ADF-compatible architecture.
- Performed process enhancements such as capacity improvement and tuning/optimization.
- Performed extensive metadata gathering and analysis of database and metadata repositories to understand data lineage, data flow, process design, and process statistics, and to identify storage elements and data access using JSON, XML, and CSV.
- Partnered with DBAs/SAs to optimize database tables/indexes, partitioning, and parallelization, and to implement database/client configurations supporting pipeline partitioning and multi-threaded processing.
- Migrated data to Snowflake for enhanced data warehousing capabilities, ensuring seamless data integration and improved query performance.
- Implemented Delta Lake and Lakehouse architecture in Databricks to ensure reliable data lake storage, enabling ACID transactions, scalable data processing, and seamless integration with downstream analytics platforms.
- Implemented data governance solutions to ensure data quality, compliance, and consistency across the Azure environment.
- Utilized Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn for data wrangling, analysis, and visualization, ensuring clean and structured data for analytics.
- Leveraged Azure Databricks Auto Loader to simplify and accelerate data ingestion from files stored in Azure Data Lake Gen2.
- Used Auto Loader with Delta Live Tables for incremental data ingestion, fault tolerance, and efficient file processing.
- Utilized Data Vault to streamline data lineage and historical tracking, enhancing the accuracy and traceability of enterprise data in the Azure environment.
- Coordinated the team's work and provided daily status updates to the client.
- Created Linked Services/Datasets/Pipelines for different data sources such as File System and Data Lake Gen2.
- Handled dependent pipeline runs using ADF activities such as the Validation activity.
- Created reusable pipelines for pipeline logs.
- Created parameterized stored procedures to update pipeline logs.
- Created restartable pipelines for handling failure scenarios.
- Implemented alerts in ADF pipelines to trigger notifications when a pipeline fails.
- Wrote PySpark code in Python notebooks in Databricks for data cleansing.
- Experience creating a scope for the Azure Key Vault service from Databricks.
- Experience creating Spark transformations such as merge functionality and window functions (see the sketch after this section).
- Performance tuning: enhanced and troubleshot ETL process runtime contingencies by applying database tuning, query tuning, and application tuning solutions.
- Design and architecture of ETL and data application systems and frameworks: performed requirements analysis, systems analysis, data analysis, and designed application systems and framework architecture incorporating functional and non-functional specifications governed by enterprise business rules, security policies, and environmental and application frameworks, delivered in both agile and non-agile SDLC programs.
- Performed service requests, access requests, deployment requests, role/user requests, operations requests, etc., and related workflow management with ServiceNow.
- ADLS: performed data ingestion into ADLS (storing complete files in ADLS and their paths in a database table) for space optimization.
- Developed complex SQL queries using stored procedures, including Common Table Expressions, to support Power BI and SSRS reports.
- Employed DAX formulas extensively to construct measures that produce visuals aligned with client specifications. Demonstrated efficiency in M language and Power Query Editor for data transformation, merging, and loading. Gained deep hands-on experience creating custom columns and performing joins in Power Query Editor.
Environment: Azure Synapse Analytics, Azure Hyperscale, Azure Data Factory, Azure Databricks, Azure Synapse Studio, SQL Server, SSMS, SSAS, Power BI, Azure DevOps.
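The merge-and-window-function bullet above corresponds to a common Delta Lake upsert pattern. A minimal sketch, assuming Databricks (so `spark` exists) and hypothetical table, key, and column names:

```python
# Hedged sketch: deduplicate a batch with a window function, then MERGE into Delta.
# "bronze.member_updates", "silver.members", and the columns are placeholders.
from delta.tables import DeltaTable
from pyspark.sql import Window
from pyspark.sql import functions as F

updates_df = spark.table("bronze.member_updates")  # assumed staging table

# Keep only the latest record per business key.
w = Window.partitionBy("member_id").orderBy(F.col("updated_at").desc())
latest = (
    updates_df.withColumn("rn", F.row_number().over(w))
    .filter("rn = 1")
    .drop("rn")
)

# Upsert the deduplicated batch into the target Delta table.
target = DeltaTable.forName(spark, "silver.members")
(
    target.alias("t")
    .merge(latest.alias("s"), "t.member_id = s.member_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

Deduplicating before the MERGE matters because Delta rejects merges where multiple source rows match one target row.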
Client Name: Cigna (Jan 2020 - Feb 2021)
Role: Azure Data Engineer
Responsibilities:
- Worked with NumPy, PySpark, Pandas, and Jupyter Notebooks in Python at various stages of developing, maintaining, and optimizing machine learning models.
- Developed pipelines in Azure Data Factory for processing files received from SFTP locations.
- Working experience with the Azure Databricks cloud, organizing data into notebooks and making it easy to visualize using dashboards.
- Performed ETL on data from different source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
- Managed Spark clusters in Azure Databricks through proper troubleshooting, estimation, and monitoring of the clusters.
- Implemented data ingestion from various source systems using PySpark.
- Hands-on experience implementing parameterized ADF pipelines and monitoring ADF pipeline runs.
- Created a custom alert system for pipelines and ADF data flows using Logic Apps.
- Performed data aggregation and validation on Azure HDInsight using Spark scripts written in Python.
- Expertise in creating triggers in Azure ADF to automate pipeline runs.
- Involved in extraction, transformation, and loading of data directly from different source systems (flat files/Excel/Oracle/SQL) using SAS/SQL and SAS macros.
- Generated PL/SQL scripts for data manipulation and validation, and materialized views for remote instances.
- Created large datasets by combining individual datasets using various inner and outer joins in Azure SQL Server.
- Extensively worked with different ADF pipeline activities such as ForEach, If Condition, Lookup, and Get Metadata.
- Wrote Python scripts to parse XML documents and load the data into the database (see the sketch after this section).
- Responsible for translating business and data requirements into logical data models in support of enterprise data models, ODS, OLAP, OLTP, and operational data structures.
- Created SSIS packages to migrate data from heterogeneous sources such as MS Excel, flat files, and CSV files.
- Implemented a proof of concept deploying the product on Azure Blob Storage and Snowflake.
- Utilized Azure services with a focus on big data architecture, analytics, enterprise data warehouse, and business intelligence solutions to ensure optimal architecture, scalability, flexibility, and availability.
- Built data pipelines in Airflow on Azure for ETL-related jobs using different Airflow operators.
- Evaluated the accuracy and precision of algorithms using a variety of validation techniques.
- Implemented machine learning models in Spark using PySpark.
- Used PySpark for extracting, filtering, and transforming data in data pipelines.
- Designed and developed Spark workflows using Python to pull data from Azure Blob Storage and Snowflake, applying transformations on it.
- Used Vault to set up Terraform Enterprise, and to seal and unseal keys and sensitive secrets within the enterprise workspace.
- Assisted in the data cleansing process, improving dataset quality by 10%.
Environment: Azure Synapse Analytics, Azure Hyperscale, Azure Data Factory, Azure Databricks, Azure Synapse Studio, SQL Server, SSMS, SSAS, Power BI, Azure DevOps.
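The XML-parsing bullet above maps to a small, common Python pattern. A minimal sketch, where the file name, element structure, staging table, and connection string are all hypothetical placeholders:

```python
# Hedged sketch: parse an XML document and bulk-insert rows into SQL Server.
import xml.etree.ElementTree as ET

import pyodbc

tree = ET.parse("claims.xml")                       # assumed input document
rows = [
    (el.findtext("claim_id"), el.findtext("amount"))
    for el in tree.getroot().iter("claim")          # assumed element layout
]

conn = pyodbc.connect("DSN=AzureSql;UID=etl_user;PWD=...")  # placeholder credentials
with conn:  # pyodbc's connection context manager commits on success
    cur = conn.cursor()
    cur.executemany(
        "INSERT INTO stg.claims (claim_id, amount) VALUES (?, ?)", rows
    )
```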
Client Name: Geico, MD (Oct 2017 - Dec 2019)
Role: Azure Data Engineer
Responsibilities:
- Design and development, unit testing, integration, deployment packaging and checkout, and scheduling of various components in Azure Data Factory and Azure Databricks through several SDLCs, implementing ETL/ELT processes for a cloud-based very large data warehouse and ODS integrated with data for subject areas from disparate data sources, with processes supporting downstream data marts.
- Performed analysis of ETL/ELT applications and database systems, data structure analysis, and data profiling for application design and defect resolution.
- Performed process enhancements such as capacity improvement, tuning/optimization, and performance improvement of Azure SQL Server.
- Configured storage event triggers and schedule-based triggers on ADF pipelines for a seamless ETL/ELT process.
- Streamlined SDLC processes by automating data sampling and data comparison for testing and analysis of source and target data during ETL application development, QA, and UAT (see the sketch after this section).
- Performance tuning: enhanced and troubleshot ETL process runtime contingencies by applying database tuning, query tuning, and application tuning solutions.
- Created databases and blob storage, and configured Azure storage resources.
- Developed views and stored procedures for generating Power BI dashboards for quick data insights.
- Performed service requests, access requests, deployment requests, role/user requests, operations requests, etc., and related workflow management with ServiceNow.
Environment: Azure Data Lake Gen2, Azure Blob Storage, MS SQL, SQL Server, Azure SQL, Azure Data Factory, Azure Databricks, PySpark, Pandas, Power BI.
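The automated source-vs-target comparison bullet above can be expressed as a short PySpark reconciliation. A minimal sketch, assuming a Spark session `spark` and hypothetical table and column names:

```python
# Hedged sketch: reconcile source and target tables for ETL testing.
# "src.policies", "dw.policies", and the "premium" column are placeholders.
from pyspark.sql import functions as F

source = spark.table("src.policies")
target = spark.table("dw.policies")

# Rows present on one side but not the other (distinct full-row comparison).
missing_in_target = source.subtract(target)
unexpected_in_target = target.subtract(source)

# Row-count and sum-based reconciliation on a numeric column.
summary = source.agg(
    F.count("*").alias("src_rows"), F.sum("premium").alias("src_premium")
).crossJoin(
    target.agg(F.count("*").alias("tgt_rows"), F.sum("premium").alias("tgt_premium"))
)

summary.show()
print("rows only in source:", missing_in_target.count())
print("rows only in target:", unexpected_in_target.count())
```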
Client Name: Progressive, OH (Apr 2015 - Oct 2017)
Role: Data Engineer
Responsibilities:
- Wrote ETL scripts in Azure Databricks for extracting and validating data (see the sketch after this section).
- Used DataFrames to read files and store them in SQL Server.
- Utilized various enterprise design patterns to develop business modules based on the required functionality.
- Experienced in star schema and snowflake schema modeling.
- Interacted with business analysts and other end users to resolve user requirements issues.
- Extensively used ForEach loop and If Condition iteration activities in ADF pipelines.
- Good exposure to writing SQL and stored procedures in Azure data flows.
- Actively participated in development sessions and continuously interacted with the client for requirements gathering and analysis.
- Exposed to the Agile project methodology.
Environment: SQL Server, Azure ADF, Azure Synapse Analytics, Blob Storage, SFTP, SSRS, Azure Databricks.
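The extract-validate-load bullets above follow a standard PySpark shape. A minimal sketch, assuming a Databricks-provided `spark` session with the SQL Server JDBC driver available on the cluster; the source path, schema, JDBC URL, and credentials are hypothetical:

```python
# Hedged sketch: read raw files, apply basic validation, write to SQL Server.
from pyspark.sql import functions as F

df = spark.read.option("header", "true").csv(
    "abfss://landing@storageacct.dfs.core.windows.net/quotes/"  # assumed source path
)

# Basic validation: drop rows missing mandatory fields, keep parseable amounts.
clean = (
    df.dropna(subset=["quote_id", "quote_date"])
    .filter(F.col("premium").cast("double").isNotNull())
)

# Load the validated frame into SQL Server over JDBC (placeholder connection).
(
    clean.write.format("jdbc")
    .option("url", "jdbc:sqlserver://myserver.database.windows.net;database=quotes")
    .option("dbtable", "dbo.quotes_stg")
    .option("user", "etl_user")
    .option("password", "...")
    .mode("append")
    .save()
)
```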