Deanship of Graduate Studies | Researches | A STATISTICAL STUDY ABOUT PRINCIPAL COMPONENTS ANALYSIS OF MIXTURE MODELS

Deanship of Graduate Studies

Document Details

Document Type	:	Thesis
Document Title	:	A STATISTICAL STUDY ABOUT PRINCIPAL COMPONENTS ANALYSIS OF MIXTURE MODELS دراسة احصائية حول تحليل المركبات الرئيسية للنماذج المختلطة
Subject	:	Faculty of Science
Document Language	:	Arabic
Abstract	:	Data scientists use a variety of machine learning algorithms to find patterns in large datasets that lead to practical insights. To treat such data properly, we first examine whether it can be represented in a low-dimensional space. We then fit different mixture models to the reduced data to identify a suitable one, that is, a statistical model whose estimated parameters describe the original data as closely as possible. In this research, we use principal component analysis (PCA) to map the data from a high-dimensional to a low-dimensional space and to express it in a way that highlights similarities and differences. We propose two scenarios. The first treats the reduced data as a single Gaussian mixture model whose parameters are estimated with the expectation-maximization (EM) algorithm: a clustering method is applied to the reduced data, and the cluster means are used as initial values for the means of the mixture model. The second scenario treats each variable of the reduced data individually, fitting first a Gaussian mixture model and then a Cauchy mixture model to each variable. The benefit of the Cauchy mixture model lies in its ability to handle heterogeneity and outliers. The model parameters are estimated with the EM algorithm. The effectiveness of the methods is demonstrated through a simulation study and on real datasets. We also discuss principal component analysis of mixed data (PCAMIX) and demonstrate its usefulness for today's real-world data. Most databases today contain mixed data, meaning a combination of numerical and categorical variables.
The PCAMIX method handles this type of database and allows statistical information to be collected about the studied population. The efficiency of PCAMIX is investigated using a dataset available in the R package and using simulated data.
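The pipeline described in the abstract (reduce the data with PCA, cluster the reduced data, then fit a Gaussian mixture by EM starting from the cluster means) can be illustrated with a minimal sketch. This is not the thesis's code: it is plain Python using only the standard library, it keeps just the first principal component, and every function name, the simulated data, and the iteration counts are assumptions made for illustration.

```python
import math
import random

def first_pc(rows, iters=100):
    """Leading principal direction (by power iteration) and the 1-D scores."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores

def kmeans_1d(xs, k, iters=20):
    """Plain Lloyd iterations in one dimension, started from quantiles."""
    s = sorted(xs)
    centers = [s[(2 * c + 1) * len(s) // (2 * k)] for c in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            groups[nearest].append(x)
        centers = [sum(g) / len(g) if g else centers[c]
                   for c, g in enumerate(groups)]
    return centers

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, init_means, iters=50):
    """EM for a univariate Gaussian mixture with user-supplied initial means."""
    n, k = len(xs), len(init_means)
    means = list(init_means)
    grand_mean = sum(xs) / n
    variances = [sum((x - grand_mean) ** 2 for x in xs) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            p = [weights[c] * normal_pdf(x, means[c], variances[c])
                 for c in range(k)]
            total = sum(p)
            resp.append([pc / total for pc in p])
        # M-step: weighted updates of weights, means, and variances.
        for c in range(k):
            nc = sum(r[c] for r in resp)
            weights[c] = nc / n
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nc
            variances[c] = max(
                sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nc,
                1e-6)  # floor keeps a component from collapsing
    return weights, means, variances

# Simulated 3-D data (made up for this sketch): two unit-variance clusters.
random.seed(0)
true_centers = [(0.0, 0.0, 0.0), (6.0, 6.0, 0.0)]
rows = [[random.gauss(m, 1.0) for m in center]
        for center in true_centers for _ in range(200)]

direction, scores = first_pc(rows)            # PCA step
init_means = kmeans_1d(scores, k=2)           # clustering step
weights, means, variances = em_gmm_1d(scores, init_means)  # EM step
print("weights:", weights)
print("means:", means)
print("variances:", variances)
```

Replacing `normal_pdf` with a Cauchy density would not by itself give the Cauchy mixture fit of the second scenario, because the Cauchy location and scale have no closed-form weighted estimates and the M-step would need a numerical update, so that variant is not sketched here.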
Supervisor	:	Dr. Zakiah Ibrahim Kalantan
Thesis Type	:	Master Thesis
Publishing Year	:	1441 AH / 2020 AD
Added Date	:	Thursday, August 6, 2020

Researchers

Researcher Name (Arabic)	Researcher Name (English)	Researcher Type	Dr Grade	Email
ندى عائض القحطاني	Alqahtani, Nada Ayed	Researcher	Master

Files

File Name	Type	Description
46658.pdf	pdf
