Dong, Sihan
2023-09-20
2023-09-07
http://hdl.handle.net/10012/19894

Nanoparticles (NPs) have become a promising drug delivery system in pharmaceutics over the past few decades because of their versatility in encapsulating different types of drugs, including proteins/peptides, nucleic acids, and small-molecule drugs, for the treatment of a variety of diseases. NP-based drug delivery to cancer cells has been a major focus because NPs can deliver effective treatment while keeping side effects low. Pursuing a given research goal often requires a series of chemical and biological assays. However, the NP fabrication process is time-consuming and costly, consisting of material selection, formation, purification, and characterization. Because NP composition choices directly influence NP physicochemical properties and biological behaviors, it is crucial to find the optimal combination efficiently to achieve better NP performance. To ease the burden of conducting experiments manually, artificial intelligence (AI) techniques are a promising choice. Machine learning (ML), a subfield of AI, has been a popular tool in many pharmaceutical science studies, such as prediction of protein molecular structures, drug discovery, high-throughput screening, and prediction of drug formulation compositions. Researchers have shown great interest in applying this emerging technique to a variety of tasks to speed up pharmaceutical development.
In this study, we formulated 32 doxorubicin (DOX)- or docetaxel (DTX)-loaded NPs to train and test ML-based Gaussian Process (GP) models that estimate the underlying relationships between four NP composition physicochemical properties, namely poly(lactic-co-glycolic acid) (PLGA) molecular weight (MW), PLGA lactic acid:glycolic acid (LA/GA) ratio, PLGA:drug weight ratio, and drug lipophilicity, and the corresponding drug encapsulation efficiency (EE%) and therapeutic efficacy in ovarian cancer cells. No universal relationship between the predictor and response variables could be concluded. Three GP models, the EE% model, the DOX NP IC50 model, and the DTX NP IC50 model, were evaluated for prediction accuracy, measured by normalized RMSE on the testing sets. The normalized RMSE values are 0.187, 0.296, and 0.206, respectively. The EE% model has the highest prediction accuracy, which may be attributed to its larger training dataset compared to the other two models. Furthermore, a simplified Bayesian Optimization (BO) model was built to output a set of predictor (x) variable values that can potentially help to find formulations that optimize NP EE% and therapeutic efficacy. In the EE% model, the suggested formulation is 2 mg of drug with a lipophilicity of 2.12 loaded in 94 mg of 20001 Da, 1.17:1 (LA/GA) PLGA NP. In the DOX NP IC50 model, the suggested formulation is 2 mg of DOX loaded in 68 mg of 39997 Da, 1.53:1 (LA/GA) PLGA NP. In the DTX NP IC50 model, the suggested formulation is 2 mg of DTX loaded in 90 mg of 20008 Da, 1.70:1 (LA/GA) PLGA NP.

en
Machine learning modelling in predicting and optimizing PLGA nanoparticle encapsulation efficiency and therapeutic efficacy
Master Thesis
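
The normalized RMSE used in the abstract to compare the three GP models can be illustrated with a minimal sketch. The abstract does not state which normalization was used, so dividing the RMSE by the range of the observed values is an assumption here (normalizing by the mean or standard deviation is also common). The function name `normalized_rmse` and the sample values are hypothetical and are not data from the thesis.

```python
import math


def normalized_rmse(observed, predicted):
    """Return RMSE divided by the range of the observed values.

    Assumption: range normalization. The thesis abstract does not
    specify which normalization convention was applied.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    value_range = max(observed) - min(observed)
    if value_range == 0:
        raise ValueError("observed values have zero range; cannot normalize")
    return rmse / value_range


# Hypothetical EE% values for illustration only (not from the thesis).
observed_ee = [10.0, 20.0, 30.0, 40.0]
predicted_ee = [12.0, 18.0, 33.0, 37.0]
print(f"normalized RMSE: {normalized_rmse(observed_ee, predicted_ee):.3f}")
```

Because the score is scale-free under this convention, it allows a rough comparison between models whose responses have different units, such as EE% and IC50.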