Saikrishna
PHONE NUMBER AVAILABLE | EMAIL AVAILABLE

Professional Summary
- 7+ years of IT experience in software design, development, implementation, and support of business applications for the health and insurance industries.
- Developed Spark SQL statements for processing data.
- Supported development with application architecture for both real-time and batch big data processing.
- Developed schedulers that communicated with cloud-based services (AWS) to retrieve data.
- Debugged and tested software applications in a client-server environment, covering object-oriented technology and web-based applications.
- Strong analytical and problem-solving skills; highly motivated team player with very good communication and interpersonal skills.
- Optimized and tuned ETL processes and SQL queries for better performance in a cost-effective and efficient environment.
- Experience with core Java concepts including object-oriented design and Java components such as the collections framework, exception handling, and the I/O system.
- Knowledge of integrating data from various sources such as RDBMS, spreadsheets, and text files.
- Provided production support, including root-cause analysis, bug fixing, and promptly updating business users on day-to-day production issues.

Technical Skills:
- Big Data/Hadoop Technologies: Spark, Spark SQL, Azure, Spark Streaming, Kafka, PySpark, Hue
- Languages: XML, JSON, Java, Scala, Python, Shell Scripting
- NoSQL Databases: Cassandra, MongoDB, MariaDB
- Development Tools: Microsoft SQL Studio, IntelliJ, Azure Databricks, Eclipse, NetBeans
- Public Cloud: EC2, IAM, S3, CloudWatch, EMR, Redshift
- Development Methodologies: Agile/Scrum, Waterfall
- Build Tools: Jenkins, Toad, SQL Loader, PostgreSQL, Talend, Maven, Hue, SOAP UI
- Reporting Tools: Power BI, SSRS
- Databases: Microsoft SQL Server 2008/2010/2012, MySQL 4.x/5.x, Oracle 11g/12c, DB2, Teradata

PROFESSIONAL EXPERIENCE:

Client: Bank of America, Charlotte, NC    June 2023 - April 2024
Role: Data Engineer
Responsibilities:
- Worked with the SCRUM team to deliver agreed user stories on time for every sprint.
- Implemented partitioning, dynamic partitions, and buckets in Hive.
- Developed ETL pipelines from multiple RDBMS sources to Hadoop HDFS using Sqoop import, performed transformations in Hive, and moved the data to the staging area and DB2 using Sqoop export.
- Implemented solutions for ingesting data from various sources and processing data-at-rest using big data technologies such as Hadoop, MapReduce, HBase, and Hive.
- Experience with the complete SDLC process: staging, code reviews, source code management, and the build process.
- Implemented big data platforms as data storage, retrieval, and processing systems.
- Applied transformations on data loaded into Spark DataFrames and performed in-memory computation to generate the output response.
- Developed solutions for import/export of data from Teradata and Oracle to HDFS and to Snowflake.
- Worked on Microsoft Azure services such as HDInsight clusters, Blob storage, ADLS, Data Factory, and Logic Apps; also completed a POC on Azure Databricks.
- Good knowledge of troubleshooting and tuning Spark applications and Hive scripts to achieve optimal performance.
- Experience building an enhanced data integration platform that performs efficiently under growing data volumes using the snowflake schema and Teradata.
- Created and modified existing data ingestion pipelines using Kafka and Sqoop to ingest database tables and streaming data into HDFS for analysis.
- Performed advanced analytical applications using Spark with Hive, SQL, Oracle, and Snowflake.
- Extensive knowledge of and hands-on experience in architecting and designing data warehouses/databases, modeling, and building SQL objects such as tables, views, user-defined/table-valued functions, stored procedures, triggers, and indexes.
- Worked with multiple storage formats (Avro, Parquet) and databases (Hive, Impala).
- After running ETL queries, performed validation checks and reported to the client at every stage of the project.
- Developed Spark applications in Python (PySpark) on a distributed environment to load a large number of CSV files with different schemas into Hive ORC tables (an illustrative sketch follows this section).
- Experience in Python (libraries used: Beautiful Soup, NumPy, SciPy, matplotlib, python-twitter, Pandas, DataFrame, network, urllib2, MySQLdb for database connectivity) and IDEs such as Sublime Text.
- Developed testing scripts in Python, prepared test procedures, analyzed test result data, and suggested improvements to the system and software.

Environment: Scrum, Git, SQL, Python, XML, Kafka, Spark, Hive
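Illustrative only: a minimal PySpark sketch of the CSV-to-Hive-ORC load described above, assuming Spark 3.1+ with Hive support enabled on the cluster; the file paths, database, and table name are hypothetical.

    from pyspark.sql import SparkSession

    # Hive-enabled session so saveAsTable writes through the Hive metastore.
    spark = (SparkSession.builder
             .appName("csv_to_hive_orc")            # hypothetical job name
             .enableHiveSupport()
             .getOrCreate())

    # Hypothetical input files; in practice the list would come from an HDFS listing.
    paths = ["hdfs:///staging/input/feed_a.csv", "hdfs:///staging/input/feed_b.csv"]
    frames = [spark.read.option("header", True).csv(p) for p in paths]

    # unionByName with allowMissingColumns (Spark 3.1+) reconciles files whose
    # column sets differ, filling the absent columns with nulls.
    combined = frames[0]
    for df in frames[1:]:
        combined = combined.unionByName(df, allowMissingColumns=True)

    # Append into a Hive-managed ORC table.
    combined.write.mode("append").format("orc").saveAsTable("staging_db.transactions_orc")

Reading each file separately keeps per-file schema drift explicit instead of relying on a single inferred schema across the whole glob.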
Client: Reinsurance Group of America, St. Louis, MO    Dec 2021 - April 2023
Role: Data Engineer
Responsibilities:
- Transformed business problems into big data solutions and defined the big data strategy and roadmap.
- Installed, configured, and maintained data pipelines.
- Designed the business requirement collection approach based on the project scope and SDLC methodology.
- Used Amazon Airflow for complex workflow automation; process automation was done by wrapping scripts through shell scripting.
- Served as the Snowflake database administrator responsible for leading data model design and database migration deployment for production releases, ensuring our database objects and corresponding metadata were successfully implemented in the production platform environments on AWS Cloud (Snowflake).
- Designed and implemented data warehouses and data marts using components of the Kimball methodology, such as the data warehouse bus, slowly changing dimensions, surrogate keys, star schema, and snowflake schema.
- Created users, roles, and groups for securing resources using local operating system authentication in Azure.
- Worked on ad hoc queries, indexing, replication, load balancing, and aggregation in MongoDB.
- Performed troubleshooting and diagnosis of hardware/software/network failures and provided resolutions using Azure.
- Performed advanced analytical applications using Spark with Hive, SQL, Oracle, and Snowflake.
- Monitored Spark and Hadoop jobs on AWS CloudWatch.
- Worked on AWS cloud services such as S3, EMR, Redshift, Athena, and the Glue metastore.
- Worked extensively on fine-tuning Spark applications and provided production support to various pipelines running in production.
- Worked closely with business and data science teams to ensure all requirements were translated accurately into our data pipelines.

Environment: AWS, S3, Kafka, Scrum, Git, SQL, Python, XML, Unix

Client: Burwood Group Inc, Chicago, IL    May 2021 - Nov 2021
Role: Data Engineer
Responsibilities:
- Wrote Python scripts to process semi-structured data in formats like JSON.
- Designed and developed ETL processes in AWS Glue to migrate campaign data from external sources such as S3 (ORC/Parquet/text files) into AWS Redshift (an illustrative sketch follows this section).
- After running ETL queries, ran validation checks and reported to the client at every stage of the project.
- Used Jira for bug tracking and Bitbucket to check in and check out code changes.
- Responsible for generating actionable insights from complex data to drive real business results for various application teams; worked extensively on Agile methodology projects.
- Architected and implemented medium- to large-scale BI solutions on Azure using Azure Data Platform services (Azure Data Lake, Data Factory, Data Lake Analytics, Stream Analytics, Azure SQL DW, HDInsight/Databricks, NoSQL databases).
- Developed entire frontend and backend modules using Python on the Django web framework.
- Developed database management systems for easy access, storage, and retrieval of data.
- Performed DB activities such as indexing, performance tuning, and backup and restore.

Environment: Python, Apache Kafka, Hive, Spark, AWS, Redshift, Jenkins, Maven, JIRA, Bitbucket, JSON
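Illustrative only: a minimal AWS Glue (PySpark) job sketch of the S3-to-Redshift migration described in the Burwood Group role, assuming a Glue catalog connection to Redshift already exists; the bucket, column mappings, connection name, and target table are hypothetical.

    import sys
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.transforms import ApplyMapping
    from awsglue.utils import getResolvedOptions

    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read campaign files from S3 (Parquet shown; ORC or CSV sources read the same way).
    source = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options={"paths": ["s3://campaign-data/raw/"]},   # hypothetical bucket
        format="parquet",
    )

    # Rename/retype columns on the way in; this mapping is illustrative.
    mapped = ApplyMapping.apply(
        frame=source,
        mappings=[
            ("campaign_id", "string", "campaign_id", "string"),
            ("spend", "double", "spend", "double"),
        ],
    )

    # Load into Redshift through a pre-defined Glue connection.
    glue_context.write_dynamic_frame.from_jdbc_conf(
        frame=mapped,
        catalog_connection="redshift-conn",                          # hypothetical connection
        connection_options={"dbtable": "analytics.campaign_facts",   # hypothetical table
                            "database": "warehouse"},
        redshift_tmp_dir="s3://campaign-data/tmp/",
    )
    job.commit()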
Client: Global Atlantic Financial Group, Indianapolis, IN    April 2020 - April 2021
Role: Data Analyst/Modeler
Responsibilities:
- Good knowledge of Spark platform parameters such as memory, cores, and executors.
- Experienced with batch processing of data sources using Apache Spark and Elasticsearch.
- Implemented monitoring and established best practices around the use of Elasticsearch.
- Used Python and SAS to extract, transform, and load source data from transaction systems, and generated reports, insights, and key conclusions.
- Profound knowledge of database design, modeling, performance tuning, and loading large data sets into AWS Redshift.
- Manipulated and summarized data to maximize possible outcomes efficiently.
- Implemented real-time, data-driven, secured REST APIs for data consumption using AWS (Lambda, API Gateway, Route 53, Certificate Manager, CloudWatch, Kinesis), Swagger, Okta, and Snowflake.
- Developed storytelling dashboards in Tableau Desktop and published them to Tableau Server, allowing end users to understand the data on the fly with quick filters for on-demand information.
- Worked on Microsoft Azure services such as HDInsight clusters, Blob storage, ADLS, Data Factory, and Logic Apps; also completed a POC on Azure Databricks.
- Worked on dimensional and relational data modeling using star and snowflake schemas, OLTP/OLAP systems, and conceptual, logical, and physical data modeling using Erwin.
- Analyzed and recommended improvements for better data consistency and efficiency.
- Designed and developed data mapping procedures for ETL: data extraction, data analysis, and loading processes for integrating data using R programming.
- Effectively communicated plans, project status, project risks, and project metrics to the project team, and planned test strategies in accordance with project scope.
- Performed data cleaning, pre-processing, and modeling using Spark and Python.
- Wrote data ingestion systems to pull data from traditional RDBMS platforms such as Oracle and Teradata and store it in NoSQL databases such as MongoDB.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data, and analyzed them by running Hive queries.

Environment: JavaScript, Django, SQL, MySQL, LAMP, jQuery, Adobe Dreamweaver, Apache web server, NoSQL, Spark, Python

Client: Mercury Gate International, Cary, NC    January 2018 - April 2020
Role: Data Analyst/Modeler
Responsibilities:
- Performed ETL processes on business data and created a Spark pipeline that can efficiently perform the ETL process.
- Strong exposure to Unix scripting and good hands-on experience with shell scripting.
- Wrote Python scripts to process semi-structured data in formats like JSON (an illustrative sketch follows this section).
- Involved in building data models and dimensional modeling with snowflake schema and star schema for OLAP applications.
- Developed and maintained global data models for delivery process optimization and fleet telematics use cases and applications using Python, Azure AKS, Application Insights, and Blob storage.
- Involved in loading and transforming large sets of structured, semi-structured, and unstructured data.
- Constructed AWS data pipelines using VPC, EC2, S3, Auto Scaling Groups (ASG), EBS, Snowflake, IAM, CloudFormation, Route 53, CloudWatch, CloudFront, and CloudTrail.
- Involved in the development of agile, iterative, and proven data modeling patterns that provide flexibility.
- Worked with data modelers to understand the financial data model and provided suggestions for the logical and physical data models.
- Developed Spark jobs using Scala on top of MRv2 for interactive and batch analysis.

Environment: PySpark, AWS, S3, Kafka, Scrum, Git, Informatica, SQL, Python, Spark, Unix
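Illustrative only: a minimal PySpark sketch of the semi-structured JSON processing mentioned in the Mercury Gate role, assuming newline-delimited JSON with a nested array of shipment events; the paths and field names are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("json_flatten").getOrCreate()   # hypothetical job name

    # Newline-delimited JSON; Spark infers the nested schema on read.
    shipments = spark.read.json("s3://telematics-feed/raw/*.json")       # hypothetical path

    # Explode the nested events array so each event becomes its own row,
    # then keep a flat set of columns for downstream modeling.
    flat = (shipments
            .withColumn("event", F.explode("events"))                    # hypothetical nested field
            .select("shipment_id",
                    F.col("event.timestamp").alias("event_ts"),
                    F.col("event.status").alias("status")))

    # Write the flattened result as Parquet for analytics.
    flat.write.mode("overwrite").parquet("s3://telematics-feed/curated/shipment_events/")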
Client: Sonata Software, Hyderabad, India    May 2014 - April 2015
Role: Data Analyst
Responsibilities:
- Converted raw data to sequence data formats, such as Avro and Parquet, to reduce data processing time and increase data transfer efficiency over the network.
- Worked on normalization and de-normalization techniques for optimum performance in relational and dimensional database environments.
- Designed, developed, and tested Extract, Transform, Load (ETL) applications with different types of sources.
- Explored Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs.
- Experience with PySpark, using Spark libraries through Python scripting for data analysis.
- Worked on building custom ETL workflows using Spark to perform data cleaning and mapping.
- Implemented Kafka custom encoders for custom input formats to load data into Kafka partitions (an illustrative sketch follows the Education section).
- Supported the cluster and topics on the Kafka manager.
- CloudFormation scripting, security, and resource automation.

Environment: Python, Kafka, HQL, Spark, ETL, Web Services, Linux Red Hat, Unix

Education:
Bachelor's in Computer Science, JNTU, 2013
Master's in Information Systems, Wilmington University, 2017
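Illustrative only: a minimal sketch of the Kafka custom-encoder pattern described in the Sonata Software role above, assuming the kafka-python client; the broker address, topic, and record fields are hypothetical.

    import json
    from kafka import KafkaProducer

    # Custom encoder: turn a Python dict into the byte layout the topic expects.
    def encode_record(record):
        return json.dumps(record, sort_keys=True).encode("utf-8")

    producer = KafkaProducer(
        bootstrap_servers=["localhost:9092"],         # hypothetical broker
        key_serializer=lambda k: k.encode("utf-8"),
        value_serializer=encode_record,
    )

    # Keying by account_id keeps related records in the same partition.
    record = {"account_id": "A100", "amount": 250.0, "currency": "INR"}    # hypothetical fields
    producer.send("transactions", key=record["account_id"], value=record)  # hypothetical topic
    producer.flush()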