|
|
Candidate's Name
Data Engineer
Email: EMAIL AVAILABLE | Contact: PHONE NUMBER AVAILABLE | LinkedIn: LINKEDIN LINK AVAILABLE

PROFESSIONAL SUMMARY:
- Over 3 years of experience in Information Technology with expertise in data analytics and the design, development, implementation, testing, and deployment of software applications.
- Extensive experience across all project phases: initiation, requirement and specification gathering, system design, coding, testing, and debugging of existing client-server applications.
- Experience with Azure cloud services such as Azure Data Factory, Azure storage accounts, Synapse, Key Vaults, ADLS Gen2, Azure App Insights, Azure Databricks, ARM templates, Azure DevOps, integration runtimes (IRs), and Azure data flows.
- Extensively involved throughout the Software Development Life Cycle (SDLC), from initial planning through implementation.
- Proficient in applying the SDLC software development process to establish a business analysis methodology.
- Led data mapping, exploratory data analysis, and logical data modeling between source and destination systems.
- Designed Azure architecture, cloud migration, DynamoDB, and event processing using Azure Data Factory.
- Experience managing and securing custom AMIs and Azure account access using IAM.
- Focused individual with excellent analytical and persuasion capabilities.
- Knowledge of data flow modeling, object modeling, use case analysis, and functional decomposition analysis.
- Hands-on experience designing end-to-end ETL strategies using SSIS and authored numerous SSIS packages for data migration.
- Experience in object-oriented analysis, including implementation, design, and programming.
- Excellent understanding of Quality Assurance processes and the SDLC.
- Proficient in designing data pipelines that capture both streaming web data and RDBMS source data.
- Built ETL architectures and source-to-target mappings to load data into the data warehouse.
- Experience manipulating data for data loads, extracts, statistical analysis, and modeling.
- Good knowledge of data modeling techniques such as Data Vault 2.0, star schema, and Kimball modeling.
- Good exposure to Azure Data Factory activities such as Copy, ForEach, Lookup, Switch, If Condition, Execute Pipeline, Web, and Set Variable for implementing ETL pipelines.
- Orchestrated several ETL pipelines in Azure Data Factory and Synapse to load data from sources such as SQL Server, Oracle, host servers, SharePoint paths, and Azure storage accounts into targets such as Snowflake DW and Synapse Analytics.
- Good knowledge of datasets, integration runtimes, and linked services in Azure Data Factory for providing connectivity and access to data across different resources.
- Implemented sandbox Azure Data Factory pipelines, referenced from existing projects, by migrating ARM template files.
- Designed and implemented end-to-end data pipelines to extract, cleanse, process, and analyze large volumes of behavioral and log data.
- Able to architect solutions that meet business and IT needs, create data platform roadmaps, and scale the data platform to support additional use cases across Azure Data Lake, Azure Databricks, and Azure Data Factory.
- Strong knowledge of designing and building data models for the Snowflake cloud data warehouse.
- Experience developing production-ready Spark applications using Spark RDD APIs, DataFrames, Spark SQL, and the Spark Streaming API.
- Worked on data processing, transformations, and actions in Spark using Python (PySpark).
- Exposure to SQL implementations such as complex queries, views, CTEs, stored procedures, and window functions.
- Experienced with Waterfall, Agile, and Scrum software development frameworks.
- Created reports and dashboards in Power BI from sources such as ADLS Gen2, SharePoint paths, PostgreSQL, SQL Server, Salesforce objects, Power BI shared datasets, and SSAS cubes.
- Good exposure to visuals such as Table, Bar, Card, Funnel, Line, Gauge, Doughnut, Waterfall, Slicer, and Matrix.
- Created calculated columns and measures using DAX expressions, including filter, aggregate, mathematical, and time intelligence functions.
- Implemented datasets in Power BI based on mapping sheets, following star schema modeling.
- Comfortable using the selection and bookmark panes to hide and display visuals according to the required selections.
- Worked on scheduled refresh in the Power BI service based on the timings provided for all source datasets.
- Implemented role-based security in Power BI for report consumers.
- Experience using multi-table join statements to retrieve data.
- Created transformation logic using split columns, conditional columns, and append and merge queries in the Power Query editor.
- Active team player with excellent interpersonal skills; a keen learner with self-commitment and innovation.
- Strong communication, decision-making, and organizational skills, along with analytical and problem-solving abilities to take on challenging work.
- Guided the team by scheduling training calls on the skills required for the project.
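Illustrative sketch only (paths, table, and column names are hypothetical, not taken from any project): a minimal PySpark example of the DataFrame, Spark SQL, and window-function work described in the summary above.

```python
from pyspark.sql import SparkSession, Window, functions as F

# Hypothetical behavioral/log data pipeline: read, transform, and aggregate.
spark = SparkSession.builder.appName("behavioral-log-etl").getOrCreate()

events = (
    spark.read.parquet("/mnt/raw/events/")            # placeholder ADLS mount path
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("event_type").isNotNull())
)

# Window function: keep only the latest event per user.
w = Window.partitionBy("user_id").orderBy(F.col("event_ts").desc())
latest = events.withColumn("rn", F.row_number().over(w)).filter("rn = 1").drop("rn")

# Spark SQL over a temporary view for the daily aggregate.
latest.createOrReplaceTempView("latest_events")
daily_counts = spark.sql("""
    SELECT event_date, event_type, COUNT(*) AS event_count
    FROM latest_events
    GROUP BY event_date, event_type
""")

daily_counts.write.mode("overwrite").parquet("/mnt/processed/daily_counts/")
```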
TECHNICAL SKILLS:
Big Data / Hadoop: Spark, Spark SQL, Azure, Spark Streaming, PySpark
Languages: SQL and Python
Dev Tools: Microsoft SQL Studio, Azure Data Factory, and Azure Databricks
Cloud: Azure portal, Azure DevOps, and Azure resources
Build Tools: Azure Data Factory, SQL Developer, and SSMS
Reporting Tools: MS Office (Word/Excel/PowerPoint/Visio/Outlook), Power BI, and Tableau
Databases: MS SQL Server, MySQL, Oracle
Operating Systems: Windows

WORK EXPERIENCE:

Ericsson, Plano, TX                                      Mar 2022 - Present
Data Engineer
Responsibilities:
- Participated in daily stand-up meetings to update project status with the internal development team.
- Designed and deployed Azure solutions using Azure Data Factory, Synapse, Azure Databricks, Azure data flows, and storage accounts.
- Set up and built Azure infrastructure across various resources, including Key Vaults, Azure SQL DB, IAM, and Synapse.
- Developed, deployed, and managed event-driven and scheduled ADF pipelines triggered in response to events on various Azure sources, including logging.
- Deployed code using Azure DevOps by merging feature branches to master, raising pull requests, and executing CI/CD pipelines.
- Wrote Python scripts to load data from Web APIs into a staging database (see the sketch below).
- Developed test cases in Excel and conducted manual testing and execution.
- Presented automation test result analysis during daily Agile stand-up meetings.
- Reverse engineered existing data models to incorporate new changes using Erwin.
- Developed artifacts consumed by the data engineering team, such as source-to-target mappings, data quality rules, data transformation rules, and joins.
- Implemented exploratory data analysis using simple machine learning algorithms.
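Illustrative sketch of the Web API-to-staging-database loading noted above, using requests, pandas, and SQLAlchemy; the endpoint, connection string, and table name are hypothetical.

```python
import requests
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical API endpoint and staging connection string.
API_URL = "https://api.example.com/v1/orders"
STAGING_CONN = "mssql+pyodbc://user:password@staging-sql/StagingDB?driver=ODBC+Driver+17+for+SQL+Server"

def load_api_to_staging() -> int:
    """Pull records from a Web API and append them to a staging table."""
    response = requests.get(API_URL, timeout=30)
    response.raise_for_status()
    records = response.json()                # assumes the API returns a JSON array

    df = pd.DataFrame(records)
    df["load_ts"] = pd.Timestamp.utcnow()    # simple audit column

    engine = create_engine(STAGING_CONN)
    df.to_sql("stg_orders", engine, if_exists="append", index=False)
    return len(df)

if __name__ == "__main__":
    print(f"Loaded {load_api_to_staging()} rows into staging.")
```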
- Performed data analysis and profiling of source data to better understand the sources.
- Initiated data modeling sessions to design and extend appropriate data mart models supporting the reporting needs of applications.
- Created complex stored procedures to perform index maintenance and data profiling for loading data marts and generating datasets for reports.
- Facilitated data collection sessions and analyzed data processes, scenarios, and information flow.
- Resolved multiple data governance issues to support data consistency at the enterprise level.
- Worked on data profiling and data validation to ensure the accuracy of data between the warehouse and source systems.
- Managed multiple activities in Azure Data Factory.
- Worked heavily with ADF and its infrastructure, including the Copy activity, Get Metadata, Web activity, Execute Pipeline, Azure data flows, IRs, dataset and linked service implementation, IAM, triggers, and Synapse.
- Created Azure SQL DB, managed its policies, and used Azure SQL DB for storage and backup on Azure; extensive knowledge of migrating applications from an internal data center to Azure.
- Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and OLAP.
- Handled structured and unstructured datasets.
- Built high-quality, reliable, and consistent systems that are aligned with and scale with the business's data needs.
- Implemented organization-level management of multiple Azure accounts, including consolidated billing and policy-based restrictions.
- Created the technical design and strategy for ETL, data warehouse design, reporting, requirement specifications, business rules, data mapping, key decisions, and metadata management.
- Designed ETL strategies to populate the data warehouse/data mart with facts and dimensions.
- Created solutions utilizing data warehousing concepts and dimensional/cube data modeling techniques via DAX scripting.
- Refreshed reports and dashboards daily using data gateways in the Power BI service.
- Ingested files using ADF, loading data from files into processed and formatted layers and finally into the Snowflake data warehouse.
- Implemented an UPSERT pipeline for Change Data Capture using audit fields such as AUD_INSRT_TMSTP and AUD_UPDT_TMSTP to update and insert the latest data from source to target.
- Created merge scripts and SQL stored procedures required for the UPSERT pipeline.
- Created notebooks in Azure Databricks for transformations such as adding headers to files that do not contain them, splitting file data into header, detail, and trailer records before loading into Snowflake tables, and removing trailing empty and unnecessary records from source files (see the sketch below).
- Developed production-ready Spark applications using Spark RDD APIs, DataFrames, Spark SQL, and the Spark Streaming API.
- Worked on data processing, transformations, and actions in Spark using Python (PySpark).
- Implemented Spark Streaming, Spark SQL, and other Spark components such as accumulators, broadcast variables, different levels of caching, and optimization techniques for Spark jobs.
- Orchestrated Azure Data Factory pipelines to migrate data from files in formats such as Parquet, ORC, JSON, CSV, and fixed width.
- Used the Web activity to call API URLs and Azure Functions from Azure Data Factory while executing pipelines.
- Used activities such as Lookup, Stored Procedure, Copy, ForEach, and Switch to migrate data from different types of sources into the staged layer of Snowflake.
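A simplified sketch of the Databricks notebook logic described above for files carrying header, detail, and trailer records; the record-type codes, delimiter, paths, and column layout are assumptions.

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical delimited source where the first character of each line marks
# the record type: H = header, D = detail, T = trailer.
spark = SparkSession.builder.getOrCreate()

raw = spark.read.text("/mnt/landing/vendor_feed/file_without_header.txt")
raw = raw.filter(F.trim(F.col("value")) != "")            # drop trailing empty records

header  = raw.filter(F.col("value").startswith("H"))
trailer = raw.filter(F.col("value").startswith("T"))
detail  = raw.filter(F.col("value").startswith("D"))

# Split the pipe-delimited detail records and attach explicit column names,
# since the source file does not carry its own header row.
columns = ["record_type", "order_id", "order_date", "amount"]   # assumed layout
parts = F.split(F.col("value"), r"\|")
detail_df = detail.select(*[parts.getItem(i).alias(c) for i, c in enumerate(columns)])

# Land the cleaned detail records for the downstream load into Snowflake tables.
detail_df.write.mode("overwrite").parquet("/mnt/processed/vendor_feed/detail/")
```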
- Created COPY INTO queries in the Snowflake data warehouse to insert data from Parquet files stored in Azure storage accounts.
- Created parameterized pipelines using parameters and variables in Azure Data Factory for incremental, UPSERT, and truncate-load strategies.
- Created views and complex SQL scripts for pipeline implementation and developed several metadata scripts for inserting the metadata used in pipeline execution.
- Created alerts in Azure Data Factory that allow end users or groups to identify pipeline execution status and errors.
- Created stored procedures for inserting, updating, and extracting data, and executed them through pipeline configuration.
- Created linked services and datasets for resources such as Oracle, Azure SQL DB, and Snowflake DW.
- Guided the team in orchestrating pipelines that load data into data warehouses such as Snowflake and Synapse Analytics.
- After the first phase of development, deployed the full codebase to production and monitored pipelines with the help of the available audit data.
- Resolved several production issues, such as pipeline failures due to data type mismatches, metadata implementation issues, file configuration issues (quote characters, zip/deflate functionality in the Copy activity), and Spark SQL issues while executing notebooks in Azure Databricks.
- Deployed Azure Data Factory code and SQL scripts for inserting metadata from one environment to another using Azure DevOps.
- Worked on creating case management reports using Azure Data Lake Gen2 as the data source.
- Created the data model relating the different entities required for all reports.
- Connected Power BI Desktop to sources using live connections to Power BI shared datasets.
- Used Table, Treemap, Bar, Card, Doughnut, and Slicer visualizations.
- Used the filter pane to display visuals according to the required conditions.
- Created new calculated columns and measures using DAX expressions.
- Published reports to the Power BI service.
- Configured scheduled refresh for the reports.
- Transformed data using merge and append queries in the Edit Queries section of Power BI Desktop.
- Created bookmarks so reports can be viewed in the current scenario.
- Implemented gateway connections in the Power BI service to allow data refresh from on-premises sources.
- Implemented DAX expressions for MTD and YTD based on slicer selection.
- Used the bookmark and selection panes to hide and display visuals based on selection.
- Configured row-level security (RLS) for specified users in the Power BI service.
- Worked with DAX functions such as EOMONTH, DATEDIFF, CALCULATE, COUNT, SUM, FILTER, and ALL.
- Implemented datasets using data modeling by connecting all the tables to satisfy the required conditions.
- Worked on all filter types: page-level, visual-level, report-level, and drill-through filters.
- Used visual functionality in Power BI such as drill up, drill down, conditional formatting, and tooltips.
- Displayed images in visuals using web URLs from source tables.
- Implemented sorting techniques in the visual display of reports.
- Configured the Power BI service account after publishing reports.
Environment: Azure Databricks, SSMS, and Azure portal
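An illustrative sketch of the Snowflake COPY INTO and audit-field-driven UPSERT (MERGE) pattern referenced in the responsibilities above, using the Snowflake Python connector; the account, stage, table, and column names are hypothetical.

```python
import snowflake.connector

# Hypothetical connection details; in practice these would come from Key Vault.
conn = snowflake.connector.connect(
    account="my_account", user="etl_user", password="********",
    warehouse="LOAD_WH", database="EDW", schema="STAGE",
)

COPY_SQL = """
COPY INTO STAGE.ORDERS_STG
FROM @AZURE_STAGE/orders/            -- external stage over the Azure storage account
FILE_FORMAT = (TYPE = PARQUET)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
"""

MERGE_SQL = """
MERGE INTO EDW.CORE.ORDERS AS tgt
USING STAGE.ORDERS_STG AS src
  ON tgt.ORDER_ID = src.ORDER_ID
WHEN MATCHED AND src.AUD_UPDT_TMSTP > tgt.AUD_UPDT_TMSTP THEN UPDATE SET
  tgt.AMOUNT = src.AMOUNT,
  tgt.STATUS = src.STATUS,
  tgt.AUD_UPDT_TMSTP = src.AUD_UPDT_TMSTP
WHEN NOT MATCHED THEN INSERT (ORDER_ID, AMOUNT, STATUS, AUD_INSRT_TMSTP, AUD_UPDT_TMSTP)
  VALUES (src.ORDER_ID, src.AMOUNT, src.STATUS, src.AUD_INSRT_TMSTP, src.AUD_UPDT_TMSTP);
"""

cur = conn.cursor()
cur.execute(COPY_SQL)    # stage the parquet files landed by the ADF copy activity
cur.execute(MERGE_SQL)   # UPSERT the latest records into the target table
cur.close()
conn.close()
```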
DXC Technology, Hyderabad, India                         May 2019 - Aug 2021
Data Engineer
Responsibilities:
- Designed robust, reusable, and scalable data-driven solutions and data pipeline frameworks to automate the ingestion, processing, and delivery of structured and unstructured batch and real-time streaming data using Python.
- Built data warehouse structures, creating facts, dimensions, and aggregate tables through dimensional modeling.
- Followed Agile/Scrum methodology to accommodate frequent changes to client requirements, with parallel development and testing.
- Performed feasibility analysis for applying clustered columnstore indexes (CCI) on multiple fact tables in the EDW by analyzing table sizes, CCI segment misalignment, and related factors.
- Responsible for gathering requirements for new projects and creating the data flow model of the business requirements.
- Used Python scripts to update content in the database and manipulate files.
- Created data mapping, data governance, transformation, and cleansing rules involving OLTP and OLAP.
- Analyzed the source data and worked with business users and developers to develop the data model.
- Generated periodic reports based on statistical analysis of the data across various time frames and divisions.
- Developed the logical data model based on the requirements using Erwin.
- Built logical and physical data models that capture current-state and future-state data elements and data flows.
- Automated the landing zone and processed zone layers on Azure storage accounts by orchestrating Azure Data Factory pipelines.
- Conceptualized, developed, and maintained the data architecture, data models, and standards for various data integration and data warehouse projects.
- Identified the dimensions, measures, and facts on top of the OLTP source.
- Created data integration processes for ETL, involving the access, manipulation, analysis, interpretation, and presentation of information from both internal and secondary data sources for the business.
- Performed quality testing of converted data, identified root causes of issues, designed and documented proposed solutions, developed the logical and physical data models, and designed the data flow.
- Led and improved the existing Enterprise Data Warehouse.
- Authored numerous ETL packages to extract data from heterogeneous OLTP sources such as flat files, Excel files, and SQL Server tables to populate the dimensional data mart.
- Developed pipelines in Azure Data Factory to migrate data from SQL Server to Synapse tables.
- Worked on different load strategies, such as SCD Type 0, Type 1, and Type 2, for loading data from different sources into Synapse tables (see the sketch below).
- Handled different file types in Azure Data Factory, including JSON, fixed width, text files with and without headers, Excel files, and various delimited formats.
- Implemented notebooks in Azure Databricks for transforming complex files that cannot be handled in Azure Data Factory.
- Analyzed all file types and the processing and transformation each requires.
- Orchestrated the master pipeline using parameters and metadata-driven solutions, with control tables in Azure SQL DB storing the file details that drive the pipeline.
- Loaded audit details such as pipeline parameters, run ID, timestamp, and pipeline details into audit tables using pipeline system variables and activity outputs in ADF.
- Maintained code on the Azure Databricks platform using Spark SQL and PySpark.
- Deployed ADF code and SQL scripts from one environment to another using Azure DevOps.
- Published reports to the Power BI service, created dashboards, and managed them in workspaces.
- Checked in project-related documents and scripts on Team Foundation Server.
- Developed and implemented data cleansing, data security, data profiling, and data monitoring processes.
- Determined data ownership, resolved data conflicts, and aligned enterprise data with an emphasis on maturing data governance practices, improving data integrity, and reducing operational risk due to data quality issues.
- Worked closely with the Enterprise Data Warehouse team and the Business Intelligence Architecture team to understand the repository objects that support the business requirements and processes.
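A condensed, illustrative PySpark sketch of an SCD Type 2 load strategy of the kind referenced above; the dimension name, business key, tracked attributes, and paths are assumptions, and the real pipelines loaded Synapse tables via ADF.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
load_ts = F.current_timestamp()

# Hypothetical customer dimension: existing rows plus an incoming change batch.
dim = spark.read.parquet("/mnt/dw/dim_customer/")           # existing dimension
src = spark.read.parquet("/mnt/staged/customer_changes/")   # incoming batch
tracked = ["name", "city", "segment"]                        # assumed SCD2 attributes

history = dim.filter("is_current = 0")
current = dim.filter("is_current = 1")

# Business keys whose tracked attributes changed in this batch.
changed_keys = (
    current.alias("d").join(src.alias("s"), "customer_id")
    .filter(" OR ".join(f"d.{c} <> s.{c}" for c in tracked))
    .select("customer_id")
)

# Expire the current versions of changed keys; keep unchanged rows as-is.
expired = (
    current.join(changed_keys, "customer_id", "left_semi")
    .withColumn("is_current", F.lit(0))
    .withColumn("end_date", load_ts)
)
unchanged = current.join(changed_keys, "customer_id", "left_anti")

# New versions for changed keys, plus rows for brand-new keys.
new_rows = (
    src.join(changed_keys, "customer_id", "left_semi")
    .unionByName(src.join(current.select("customer_id"), "customer_id", "left_anti"))
    .withColumn("is_current", F.lit(1))
    .withColumn("start_date", load_ts)
    .withColumn("end_date", F.lit(None).cast("timestamp"))
)

result = (
    history.unionByName(unchanged)
    .unionByName(expired)
    .unionByName(new_rows, allowMissingColumns=True)
)
result.write.mode("overwrite").parquet("/mnt/dw/dim_customer_scd2/")
```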
Environment: Python, Spark, Flume, HDFS, HBase, Hive, Pig, Sqoop, Zookeeper, EC2, EMR, S3, DataStage, AWS, CloudWatch

Education Details:
Master's in Business Analytics, University of the Pacific, California
Bachelor's in Computer Science, GITAM University, Hyderabad