This web application lets users choose among several machine learning models to predict diabetes and tune each model's parameters to find the configuration with the best accuracy. Built with Streamlit, the app includes popular classifiers such as SVM, Logistic Regression, Random Forest, K-Nearest Neighbors, Decision Tree, XGBoost, and LightGBM.
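The select-a-model, set-its-parameters flow could be sketched roughly as below. This is a minimal illustration, not the app's actual code: it assumes scikit-learn is installed, the `MODELS` registry and `evaluate` helper are hypothetical names, the data is synthetic rather than the diabetes dataset, and XGBoost and LightGBM are left out to keep the dependencies small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical registry: display name -> classifier class.
MODELS = {
    "SVM": SVC,
    "Logistic Regression": LogisticRegression,
    "Random Forest": RandomForestClassifier,
    "K-Nearest Neighbors": KNeighborsClassifier,
    "Decision Tree": DecisionTreeClassifier,
}


def evaluate(model_name, params, X, y, seed=0):
    """Train the chosen classifier with the given parameters; return test accuracy."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    model = MODELS[model_name](**params)
    model.fit(X_train, y_train)
    return accuracy_score(y_test, model.predict(X_test))


# Synthetic stand-in data (assumption): 8 numeric features, binary label.
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
acc = evaluate("K-Nearest Neighbors", {"n_neighbors": 5}, X, y)
print(f"accuracy: {acc:.3f}")
```

In a Streamlit front end, the model name would typically come from a widget such as `st.selectbox` and the parameter values from widgets such as `st.slider`, with the resulting accuracy displayed after each change.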
This project focuses on classifying cyberbullying in Indonesian tweets using Natural Language Processing (NLP) and machine learning. It involves comprehensive text preprocessing techniques, including tokenization, stemming, lemmatization, and vectorization, to prepare the data for analysis. Various classification models, such as Logistic Regression, Random Forest, and Support Vector Machine (SVM), are compared for their effectiveness in identifying instances of cyberbullying.
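The tokenization and vectorization steps can be illustrated with a small standard-library sketch. It is an assumption-laden simplification, not the project's pipeline: the function names are hypothetical, the example tweets are invented, and stemming and lemmatization are omitted because they need a language-specific tool for Indonesian.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase, drop URLs and @mentions, then keep runs of letters."""
    text = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    return re.findall(r"[a-z]+", text)


def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocab):
    """Return a bag-of-words count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocab]


# Invented example tweets (not from the project's dataset).
tweets = [
    "kamu bodoh banget @user http://t.co/x",
    "Selamat pagi, kamu hebat!",
]
vocab = build_vocabulary(tweets)
features = [vectorize(t, vocab) for t in tweets]
print(vocab)
print(features)
```

Count vectors like these are what a Logistic Regression, Random Forest, or SVM classifier would then be trained on; a TF-IDF weighting is a common alternative to raw counts.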
This project aims to analyze customer satisfaction in the airline industry by examining various factors that influence passenger experiences. The analysis includes identifying key independent variables that affect customer satisfaction and utilizing machine learning models to predict satisfaction levels. The primary objective is to understand how different service aspects impact customer perceptions and to identify areas for improvement.
A leading healthcare organization seeks to predict stroke risk using patient medical history and demographic data. As a data scientist, I built and validated a prediction model using the Random Forest algorithm. The work covered data cleaning, preprocessing, analysis, visualization, and deployment for clinical use. The model reached 95% accuracy and is intended to help reduce stroke incidence and improve patient outcomes.
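A build-and-validate step with a Random Forest might look like the sketch below. Everything specific here is an assumption for illustration: the data is synthetic, the 5% positive rate, the hyperparameters, and the `class_weight="balanced"` setting are not taken from the project, and scikit-learn is assumed to be installed. Recall is reported next to accuracy because, when positive cases are rare, it shows how many of them the model actually catches.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): about 5% positive cases.
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=42
)

# A stratified split keeps the positive rate similar in train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = RandomForestClassifier(
    n_estimators=200, class_weight="balanced", random_state=42
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)  # share of true positives that were found
print(f"accuracy: {accuracy:.3f}, recall: {recall:.3f}")
```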