Predictive Failure Modes
This project tested my machine learning skills as well as my grit to see it through to the finish.
            Project Details
            
After acquiring a dataset, I embarked on a thorough exploration to identify which features were most meaningful, both generally and for predicting asset failure. I employed various methods, including statistical analysis, decision trees, visualization, and Principal Component Analysis (PCA). This multifaceted approach yielded numerous valuable insights.
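As an illustration of that exploration step, here is a minimal sketch combining PCA with a shallow decision tree. It assumes a pandas DataFrame loaded from a CSV with numeric sensor columns and a binary failure label; the file name and column names are placeholders, not the project's actual schema.

```python
# Minimal feature-exploration sketch: PCA for structure, a shallow tree for importances.
# "asset_telemetry.csv" and the "failure" column are hypothetical placeholders.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("asset_telemetry.csv")
features = df.drop(columns=["failure"])
labels = df["failure"]

# Standardize before PCA so high-variance columns don't dominate the components.
scaled = StandardScaler().fit_transform(features)
pca = PCA(n_components=2)
components = pca.fit_transform(scaled)
print("Explained variance ratio:", pca.explained_variance_ratio_)

# A shallow decision tree gives a quick read on which features split failures well.
tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(features, labels)
for name, importance in zip(features.columns, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")
```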
One major finding was how scarce the specific failure modes were, along with the detailed operating conditions (speeds, temperatures, and times) associated with each failure mode. By filtering the data down to the failure events themselves, I was able to produce clear and impactful visuals. This process revealed significant patterns that had previously been obscured by the non-failure data, providing a much clearer understanding of the factors leading to asset failure.
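The filtering step might look something like the sketch below, which keeps only the failure rows and plots them by failure mode. The column names ("failure", "failure_mode", "rotational_speed", "air_temperature") are assumptions made for illustration.

```python
# Restrict the plots to failure events so their patterns aren't drowned out
# by the far more numerous healthy records. Column names are hypothetical.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("asset_telemetry.csv")        # same hypothetical dataset as above
failures = df[df["failure"] == 1]              # keep only the failure events

fig, ax = plt.subplots()
for mode, group in failures.groupby("failure_mode"):
    ax.scatter(group["rotational_speed"], group["air_temperature"],
               label=str(mode), alpha=0.7)
ax.set_xlabel("Rotational speed")
ax.set_ylabel("Air temperature")
ax.legend(title="Failure mode")
plt.show()
```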
Once I had a thorough understanding of the dataset, I began building a predictive model. Initially, I encountered concerning results: my model's recall dropped over successive epochs. This decline was due to the unbalanced nature of the dataset and the equal weighting given to failure and non-failure events.
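As a rough sketch of how that drop shows up, the snippet below tracks recall as a Keras metric on deliberately imbalanced placeholder data; the architecture and feature count are illustrative, not the project's actual model.

```python
import numpy as np
import tensorflow as tf

# Placeholder data standing in for the real features/labels (~3% failures).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(5000, 6)).astype("float32")
y_train = (rng.random(5000) < 0.03).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X_train.shape[1],)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.Recall(name="recall")],
)

# With equal weighting, the loss is dominated by the non-failure class, so
# recall on the rare failure class can stagnate or drop across epochs.
model.fit(X_train, y_train, epochs=5, validation_split=0.2)
```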
To address this issue, I first attempted to adjust the class weights to account for the imbalance, but this approach felt "hacky" and involved manually determining the weights through a formula, which I found unsatisfactory. Consequently, I decided to bootstrap the dataset using a tabular Generative Adversarial Network (GAN). By generating synthetic data similar to my existing dataset, I was able to extend the dataset and improve its balance.
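The sketch below contrasts the two options under stated assumptions: a hand-derived class-weight formula of the kind I abandoned, and rebalancing with a tabular GAN. The original write-up doesn't name the GAN library, so the `ctgan` package stands in here, and the file and column names remain placeholders.

```python
import pandas as pd
from ctgan import CTGAN

df = pd.read_csv("asset_telemetry.csv")          # hypothetical dataset from earlier steps

# The rejected approach: hand-derived class weights, e.g. the common
# total / (2 * class_count) formula, passed to model.fit(class_weight=...).
counts = df["failure"].value_counts()
class_weight = {cls: len(df) / (2 * n) for cls, n in counts.items()}

# The approach taken instead: train a tabular GAN on the rare failure records and
# sample synthetic rows to rebalance the dataset. The `ctgan` package is one common
# implementation and is used here purely as a sketch.
failure_rows = df[df["failure"] == 1]
gan = CTGAN(epochs=300)
gan.fit(failure_rows, discrete_columns=["failure", "failure_mode"])

n_needed = counts[0] - counts[1]                 # bring the classes roughly to parity
synthetic_failures = gan.sample(n_needed)
balanced = pd.concat([df, synthetic_failures], ignore_index=True)
```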
Using Keras Tuner, I fine-tuned the model and ultimately achieved a recall of approximately 95%. This approach not only improved the model's performance but also provided a more robust solution to the dataset imbalance problem.
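A hedged sketch of that tuning setup is below, using Keras Tuner's RandomSearch with validation recall as the objective. The search space, placeholder data, and trial count are illustrative rather than the exact configuration used.

```python
import numpy as np
import keras_tuner as kt
import tensorflow as tf

# Placeholder arrays standing in for the rebalanced dataset from the previous step.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(4000, 6)).astype("float32")
y_train = rng.integers(0, 2, size=4000).astype("float32")

def build_model(hp):
    # Illustrative search space: layer width, dropout, and learning rate.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(X_train.shape[1],)),
        tf.keras.layers.Dense(hp.Int("units", 32, 256, step=32), activation="relu"),
        tf.keras.layers.Dropout(hp.Float("dropout", 0.0, 0.5, step=0.1)),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])),
        loss="binary_crossentropy",
        metrics=[tf.keras.metrics.Recall(name="recall")],
    )
    return model

# Maximize validation recall, since missed failures are the costly errors here.
tuner = kt.RandomSearch(
    build_model,
    objective=kt.Objective("val_recall", direction="max"),
    max_trials=10,
    directory="tuning",
    project_name="failure_prediction",
)
tuner.search(X_train, y_train, epochs=10, validation_split=0.2)
best_model = tuner.get_best_models(num_models=1)[0]
```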