Candidate's Name
West New York, NJ
PHONE NUMBER AVAILABLE
EMAIL AVAILABLE
https://github.com/usadhana025
https://LINKEDIN LINK AVAILABLE

SUMMARY:
Experienced Data Scientist with strong expertise in model development and OpenAI summarizer implementation. Proficient in Python and libraries/tools such as Plotly, NLTK, NumPy, Pandas, scikit-learn, Flask, and FastAPI. Skilled in implementing various machine learning algorithms and building ETL pipelines on Snowflake. Experienced in cloud computing, particularly AWS, with expanding skills in Azure. Committed to delivering high-quality data science solutions and adapting to evolving technologies.

EDUCATION:
Master of Science, Data Science, Stevens Institute of Technology, NJ, GPA: 3.6, 2019-2021
Bachelor of Engineering, Information Technology, GLA University, India, GPA: 3.3, 2012-2016

TECHNICAL SKILLS:
Languages: Python, R, SQL, Java, PySpark
Machine Learning: NLP, Clustering, Linear Models, Tree-Based Models, Classification
Visualizations: Plotly, Seaborn, Matplotlib, Tableau, Wordcloud, Folium, Streamlit Applications
Statistics: Descriptive, Inferential, Hypothesis testing, A/B testing
Cloud technologies: AWS, Azure, Palantir
Databases: MySQL, MongoDB
OS: Ubuntu, Windows
Industry tools / ETL tools: Jupyter Notebook, VS Code, PyCharm, RStudio, Tableau, Snowflake, SonarQube

PROFESSIONAL EXPERIENCE:
Data Scientist at Tata Consultancy Services, New York, USA, July 2021 - Current
Technologies used: Python, OpenAI API, SciPy, scikit-learn, Azure Cloud, Snowflake, Palantir Cloud, PySpark, AWS
Developed a natural language processing (NLP) model to classify and summarize large volumes of text data, improving information retrieval efficiency for internal stakeholders.
Implemented a chat completion project using GPT-4 with 32k tokens, configuring payload parameters to optimize model performance and enhance conversational fluency.
Conducted feature engineering and model optimization using Python libraries such as scikit-learn and SciPy to improve model accuracy and performance.
Developed an email system integrating SQLAlchemy for data retrieval from a MySQL database, pandas for data manipulation, and smtplib with email.mime for sending email.
Implemented functionality to attach files to email messages and notify designated managers via email and SMS.
Leveraged FastAPI and Jinja templates for email message customization.
Designed a request form utilizing Flask API and Starlette exporter to capture infrastructure specifications such as CPU and memory for servers.
Developed a Prometheus middleware to streamline infrastructure allocation by semi-automating the process based on user input.
Developed Streamlit applications for efficient retrieval and summarization of MySQL database data, enabling stakeholders to monitor server costs via dynamic dashboards, backed by a FastAPI service for data transmission.
Contributed to the creation of data ingestion pipelines on an Azure HDInsight Spark cluster using Azure Data Factory and Spark SQL.
Created data pipelines in the Palantir cloud to maintain KPIs for engine production plants.
Wrote complex SQL scripts in the Snowflake cloud data warehouse, using data from AWS S3 buckets for business analysis and reporting.
Collaborated with cross-functional teams to integrate machine learning models into production systems, ensuring scalability and robustness.

Statistical Data Mining Consultant at Technical Consulting and Research, Weston, CT, June 2020 - December 2020
Technologies used: Jupyter Notebook, Python, NLP, R, Tableau, Machine Learning
Implemented Python scripts for web scraping cybersecurity data from the EDGAR website.
Developed classification models using Random Forest, Decision Tree, and Gradient Boosting (XGBoost) for cybersecurity risk assessment.
Designed an interactive map using Tableau and R Shiny to visualize active COVID-19 clinical trials.

Software Developer at Tata Consultancy Services, Mumbai, India, February 2017 - July 2019
Technologies used: VS Code, Python
Revamped a monolithic application into a microservices architecture (Python Flask + Angular), increasing concurrency and reducing processing time by 75%.

ACADEMIC PROJECTS:
Movie Recommendation System
Technologies used: Python, Pandas, Matplotlib, scikit-learn, seaborn, NumPy, SciPy
Implemented a recommendation model to predict movie success rate based on casting team selection using machine learning algorithms, including SVM, decision tree, and linear regression, and evaluated performance using MSE and MAE.
Designed a content-based filtering recommendation system that suggests similar movies based on watch history, using TF-IDF and the KNN algorithm.
Created a collaborative filtering recommendation system that recommends movies based on the preferences of similar users, using matrix factorization techniques.

ASHRAE - Great Energy Prediction III
Technologies used: Python, Pandas, Matplotlib, scikit-learn, NumPy, Statsmodels, LightGBM
Developed accurate models of metered building energy usage for chilled water, electric, hot water, and steam meters using ensemble learning methods with LightGBM and SARIMA (an ARIMA-based model).
Conducted feature engineering using PCA and model optimization to improve predictive accuracy.
Utilized the SARIMA model for n-step-ahead prediction on time-series data, accounting for seasonality and trends.