Deanship of Graduate Studies
Document Details
Document Type: Thesis
Document Title: A STATISTICAL STUDY ABOUT PRINCIPAL COMPONENTS ANALYSIS OF MIXTURE MODELS
دراسة احصائية حول تحليل المركبات الرئيسية للنماذج المختلطة
Subject: Faculty of Science
Document Language: Arabic
Abstract:
Data scientists use a variety of machine learning algorithms to find patterns in large datasets that lead to practical insights. To treat such data properly, we need to examine whether it can be interpreted in a low-dimensional space. In addition, we fit the reduced data with different mixture models to find a suitable one. This step yields a statistical model whose estimated parameters describe the original data as closely as possible. In this research, we use principal component analysis to map the data from a high-dimensional to a low-dimensional space and to express it in a way that highlights similarities and differences. We propose two scenarios. The first treats the reduced data as a single Gaussian mixture model, whose parameters are estimated with the expectation-maximization algorithm: a clustering method is applied to the reduced data, and the mixture model is then fitted using the cluster means as initial values for the component means. The second scenario treats each variable of the reduced data individually, fitting a Gaussian mixture model to each variable in one case and a Cauchy mixture model to each variable in the other. The benefit of the Cauchy mixture model lies in its ability to handle heterogeneity and outliers. The model parameters are estimated with the expectation-maximization algorithm. The effectiveness of the discussed methods is demonstrated through a simulation study and on real datasets.
We also discuss principal component analysis of mixed data (PCAMIX) and demonstrate its usefulness for today's real-world data. Most databases today contain mixed data, that is, a combination of numerical and categorical variables. The PCAMIX method handles this type of data and allows statistical information to be collected over the studied population. The efficiency of PCAMIX is investigated using a dataset available in an R package and using simulated data.
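The first scenario described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`pca_reduce`, `kmeans_means`, `fit_gmm_em`), the synthetic two-cluster data, the choice of two components, and the use of Python with NumPy are all assumptions made for illustration. The sketch reduces the data with PCA, runs a simple k-means to obtain cluster means, and uses those means to initialize an EM fit of a Gaussian mixture on the reduced data.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project centered data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def kmeans_means(Z, k, n_iter=20, seed=0):
    """Plain Lloyd iterations; returns cluster means used to initialize EM."""
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return centers

def gaussian_pdf(Z, mean, cov):
    """Multivariate normal density evaluated at each row of Z."""
    d = Z.shape[1]
    diff = Z - mean
    quad = (diff * np.linalg.solve(cov, diff.T).T).sum(axis=1)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * quad) / norm

def fit_gmm_em(Z, init_means, n_iter=100, reg=1e-6):
    """EM for a Gaussian mixture, started from the given component means."""
    n, d = Z.shape
    k = len(init_means)
    means = np.array(init_means, dtype=float)
    # Assumption: every component starts from the overall covariance.
    base_cov = np.atleast_2d(np.cov(Z.T)) + reg * np.eye(d)
    covs = np.array([base_cov.copy() for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(Z, means[j], covs[j]) for j in range(k)]
        )
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ Z) / nk[:, None]
        for j in range(k):
            diff = Z - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return weights, means, covs

# Synthetic data (hypothetical): two well-separated clusters in 5 dimensions.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(150, 5)),
    rng.normal(6.0, 1.0, size=(150, 5)),
])

Z = pca_reduce(X, n_components=2)          # high-dimensional -> 2-D
init = kmeans_means(Z, k=2)                # cluster means as EM starting values
weights, means, covs = fit_gmm_em(Z, init)
print("mixing weights:", weights)
print("component means:", means)
```

Starting EM from cluster means rather than random values tends to reduce the risk of a poor local optimum, which is presumably the motivation for the initialization described in the abstract. The second scenario would apply a one-dimensional mixture fit (Gaussian or Cauchy) to each column of `Z` separately.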
Supervisor: Dr. Zakiah Ibrahim Kalantan
Thesis Type: Master Thesis
Publishing Year: 1441 AH (2020 AD)
Added Date: Thursday, August 6, 2020
Researchers
Researcher Name (Arabic): ندى عائض القحطاني
Researcher Name (English): Alqahtani, Nada Ayed
Researcher Type: Researcher
Dr Grade: Master
Files
File Name: 46658.pdf
Type: pdf