Candidate's Name
DATA ENGINEER
Baltimore, USA | PHONE NUMBER AVAILABLE | EMAIL AVAILABLE | LinkedIn
SUMMARY
Data Engineer with almost 4 years of experience in designing, implementing, and optimizing large-scale data solutions
across healthcare and financial sectors.
Proven expertise in cloud-based data architecture, particularly with Microsoft Azure services including Azure Data
Factory, Databricks, and Stream Analytics.
Strong background in data visualization and reporting using tools like Tableau and Power BI to drive business insights
and support decision-making.
Strong proficiency in big data technologies, including Apache Spark, Kafka, and Hadoop, with experience processing
and analyzing datasets exceeding 200 TB.
Skilled in data warehouse design and optimization, utilizing dimensional modeling techniques, star/snowflake schemas,
and slowly changing dimensions (SCD) implementations.
Experienced in developing robust ETL/ELT pipelines, integrating diverse data sources and APIs, and ensuring data
quality and governance.
Proficient in multiple programming languages and tools, including Python, SQL, PySpark, and dbt, for efficient data
transformation and analysis.
Adept at collaborating with cross-functional teams, documenting data processes, and driving internal improvements to
enhance overall data management efficiency.
EXPERIENCE
Data Engineer | Optum, MD Aug 2023 - Present
Designed and implemented scalable ETL processes that leveraged Apache Spark for distributed data processing and
Python for data transformation, reducing data processing times by 40% and increasing overall system efficiency.
Extracted and consolidated healthcare data from diverse sources like Electronic Health Record (EHR) systems and
claims databases, ensuring efficient storage and retrieval in AWS S3 for further analysis and reporting.
Designed and developed visually compelling, interactive dashboards in Tableau to enable stakeholders to monitor key
performance indicators (KPIs) in real-time, driving data-driven decision-making.
Implemented and fine-tuned machine learning models for patient risk stratification, increasing the predictive accuracy
by 20%, which improved decision-making in clinical and operational contexts.
Developed a star schema with a central fact table for patient events (claims, medical visits) and multiple dimension
tables (patient demographics, provider information) to streamline reporting and analysis.
Identified key entities such as Patients, Healthcare Providers, Claims, EHR Records, Facilities, and Medications,
ensuring that core healthcare data is well-represented in the ER diagram.
Utilized MapReduce to process large-scale healthcare datasets (exceeding 10TB) from Electronic Health Records
(EHR), claims, and patient data efficiently, enabling distributed parallel processing across multiple nodes.
Applied role-based security controls in the data warehouse to ensure sensitive healthcare data was accessible only to
authorized users, adhering to HIPAA regulations for data privacy and protection.
Worked alongside data science teams to deploy predictive models using Scikit-learn and TensorFlow, enhancing the
platform's capabilities for predictive analytics and improving risk stratification for patients.
Conducted daily stand-up meetings to track progress, address blockers, and align on priorities, ensuring the team
stayed on track with healthcare data processing and reporting tasks.
Data Engineer | Zensar Technology, India Dec 2019 - Aug 2022
Implemented real-time data streaming solutions using Kafka to enhance transaction monitoring and fraud detection
capabilities, reducing risk exposure by 25% by providing up-to-date insights and alerts.
Leveraged MongoDB to store unstructured and semi-structured financial data, such as transaction logs and audit trails,
allowing flexible schema designs and scalable storage solutions for large volumes of data.
Integrated Hive into ETL workflows to perform batch data processing and transformation tasks, enabling efficient
extraction, transformation, and loading of financial data from various sources into a centralized repository.
Applied query optimization techniques in Snowflake, including result caching and automatic clustering, to enhance
query performance and reduce execution times for complex financial queries.
Implemented data quality checks and cleansing procedures to ensure the accuracy and consistency of financial data
before it was loaded into the data warehouse, enhancing data integrity.
Developed and optimized ETL pipelines using PySpark in Azure Databricks, automating the extraction,
transformation, and loading of financial data into the data warehouse, and reducing ETL processing time by 30%.
Automated regulatory reporting processes with Apache Airflow, creating reliable workflows that ensured timely and
accurate compliance reporting. This automation cut down manual reporting efforts by 40% and minimized errors.
Used Hadoop Distributed File System (HDFS) to store and manage large volumes of financial data across a distributed
cluster, providing scalable and reliable storage solutions.
Employed Python for data transformation and cleaning tasks, leveraging Pandas to handle large datasets, remove
duplicates, fill missing values, and perform data normalization to ensure data quality.
Utilized advanced SQL joins and aggregation functions to combine and analyze data from multiple financial tables,
such as calculating total transaction volumes and analyzing spending patterns.
Leveraged Azure Blob Storage for storing unstructured data, such as financial reports and transaction logs, providing
scalable and cost-effective storage solutions.
Developed predictive models to forecast financial metrics such as revenue trends, credit risk, and market movements,
using machine learning algorithms to enhance decision-making and strategic planning.
Used Excel for in-depth data analysis and financial reporting, leveraging features like pivot tables and charts to
summarize and visualize financial data effectively.
SKILLS
Methodologies: SDLC, Agile, Waterfall
Programming Languages: Python, SQL, R
Packages: NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, TensorFlow, Seaborn
Visualization Tools: Tableau, Power BI, Advanced Excel (Pivot Tables, VLOOKUP), QuickSight
IDEs: Visual Studio Code, PyCharm, Jupyter Notebook, IntelliJ
Databases: MySQL, PL/SQL, MSSQL, PostgreSQL, MongoDB, SQL Server
Data Engineering Concepts: Apache Spark, Apache Hadoop, Apache Kafka, Apache Beam, ETL/ELT, PySQL, PySpark
Cloud Platforms: Microsoft Azure (Azure Blobs, Databricks, Data Lake), Amazon Web Services (AWS)
Other Technical Skills: SSIS, SSRS, SSAS, Maven, Docker, Kubernetes, Jenkins, Terraform, Informatica, Talend,
Snowflake, Google BigQuery, Data Quality and Governance, Machine Learning Algorithms,
Natural Language Processing, Big Data, Advanced Analytics, Statistical Methods, Data Mining,
Data Visualization, Data Warehousing, Data Transformation, Critical Thinking, Communication Skills,
Presentation Skills, Problem-Solving
Version Control Tools: Git, GitHub
Operating Systems: Windows, Linux, Mac OS
EDUCATION
Master of Professional Studies in Data Science | University of Maryland, Baltimore County, Maryland, USA
Bachelor of Technology in EEE | Chaitanya Bharathi Institute of Technology, Gandipet, Hyderabad, Telangana, India