Multivariate Data Analysis, 7th Edition, Chapter 1
Essay by Yutian Zhou • November 16, 2017 • Coursework
CH1 Q1-Q8
Q1. What is multivariate analysis?
In my own definition, multivariate analysis has three crucial components. First, there must be multiple variables involved in random interactions with one another. Second, these interactions must happen simultaneously. Third, the interactions must ultimately produce some effect, solution, or outcome.
For example, multivariate analysis is like an army formation. The soldiers are the multiple variables, and the arrangements among them, such as standing in a line or a square, are the interactions. In the end they form a uniform army formation to beat the enemy, and this is its function.
Q2. What are the most important factors contributing to the increased application of multivariate analysis?
- Developments in computer hardware and software. Computing equipment has become faster, and user-friendly software packages have emerged.
- An explosion of information, but a lack of knowledge drawn from it, in this new era.
Q3. What are the multivariate data analysis techniques, with examples?
- Principal components and common factor analysis
They are both forms of factor analysis and ways to simplify data. A very common example is a professor rating or course survey: you rate the professor on various aspects, such as whether the professor is responsible about homework or humorous in class. In the end, the professor receives an integrated grade based on your ratings.
- Multiple regression
As an example of multiple regression, if I want to know a person’s income, I can predict it from his or her frequency of dining out, shopping amount, living expenditure, etc.
- Multiple discriminant analysis and logistic regression
Logistic regression is a “combination” of multiple discriminant analysis and multiple regression. When we do a multiple discriminant analysis, we group the data by a categorical variable that has several levels or choices. For instance, we distinguish a number of people by age group: infants, teenagers, and so on. These groups can then be used for further data processing. As mentioned, the example for logistic regression can be adapted from the example for multiple regression: if I want to know a person’s income class, I can predict it from his or her frequency of dining out, shopping amount, living expenditure, etc. We have only changed the dependent variable from metric to nonmetric.
- Canonical correlation
I think canonical correlation is very common in companies and in government when doing performance appraisals. Several key appraisal elements are included in a performance appraisal, and each element contains several questions. After the evaluation, each element gets a score, and we usually benchmark a person’s score against the standard score. This is a typical example of canonical correlation.
- Multivariate analysis of variance and covariance
In my opinion, this kind of technique is widely used in psychological experiments. It can analyze controlled trials. Assume that a company has made T-shirts in two colors, black and white, with the same design and material. The company gives the two kinds of T-shirts to two groups of people and asks them to complete a survey with several dimensions of questions about the product.
- Conjoint analysis
Here is a simple example of conjoint analysis. If we want the dining hall to operate efficiently, we can survey students on their favorite combinations of dishes: rice or potato, broccoli or spinach, fish or chicken. Once we have the answers, we can run a conjoint analysis to learn which dishes we need to make more of and which two or three of them we should offer together in the dining hall.
- Cluster analysis
The most important difference between cluster analysis and ordinary classification is that the groups are not predefined. The computer needs to find what the data have in common. For example, in China we have an online C2C shopping website called Taobao. We can submit comments on a product after we have purchased and used it. The comments are always classified by key words: for clothing these might be “soft” or “comfortable”, while for stationery someone might say “good to write with” or “nice paper”. I think the way Taobao’s system handles the comments is similar to cluster analysis. It finds the common parts and does the clustering itself.
- Perceptual mapping
I don’t think the example in the book for perceptual mapping is very clear. I searched for the meaning of perceptual mapping online, and I can give an example about promotion. Suppose a manager wants to promote one of two people, A and B, who are very similar. We can plot their evaluation indexes on a perceptual map to see who is closer to the manager’s requirements.
- Correspondence analysis
This analysis method is more sophisticated but useful. Assume a cosmetics corporation has several sub-brands A, B, C, and D, and its customers can be categorized by age. The corporation can use correspondence analysis to draw a correspondence map showing which group of people likes to use which brand, and we can draw other conclusions from the map.
- Structural equation modeling and confirmatory factor analysis
I really can’t think of a proper example for this analysis technique. It’s like a group of people who all have connections with each other. However, in structural equation modeling the relationships are all expressed as equations and formulas.
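To make the multiple regression example from Q3 more concrete, here is a minimal Python sketch. It assumes made-up figures (money amounts in hundreds of currency units) that were constructed to follow an exact linear rule, so the fit is easy to check. Real survey data would not fit this cleanly. The function names fit_multiple_regression and predict are illustrative and do not come from the textbook. The coefficients are estimated by ordinary least squares through the normal equations.

```python
# Made-up illustrative data (an assumption, not from the textbook or the essay).
# Each row: (dining-out frequency per month, shopping amount, living expenditure),
# with money amounts in hundreds of currency units.
predictors = [
    (2, 1.0, 8.0),
    (4, 1.5, 9.0),
    (6, 3.0, 12.0),
    (8, 2.5, 15.0),
    (10, 5.0, 16.0),
    (12, 4.5, 20.0),
]
# Income (in hundreds), constructed as 5 + 0.5*dining + 2*shopping + 1*living.
income = [16.0, 19.0, 26.0, 29.0, 36.0, 40.0]


def fit_multiple_regression(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    Returns [intercept, b1, b2, ...].
    """
    # Add a leading 1.0 to every row so the first coefficient is the intercept.
    X = [[1.0, *row] for row in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coefficients, row):
    """Predicted value of the metric dependent variable for one person."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


coefficients = fit_multiple_regression(predictors, income)
print("intercept and slopes:", [round(c, 3) for c in coefficients])
print("predicted income:", round(predict(coefficients, (7, 3.0, 14.0)), 3))
```

Replacing the metric income with a nonmetric income class (for example low, middle, high) as the dependent variable is what would turn this setting into the logistic regression or discriminant analysis case described above; that needs a different estimation method and is not shown here.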
Q4. Why and how can the various multivariate methods be viewed as a family of techniques?
The textbook poses three questions for classifying data analysis techniques. First, can the variables be divided into independent and dependent variables? Second, if they can, how many of them are treated as dependent in a single analysis? Third, are the variables, both dependent and independent, metric or nonmetric? With these three questions, we can identify which analysis technique we should use to handle our data. The techniques can be classified into dependence techniques or interdependence techniques.
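The three questions can be read as a small decision procedure. The Python sketch below is only an illustration: the function name suggest_techniques, its parameters, and the mapping from answers to techniques are assumptions based on the usual textbook classification, not something spelled out in the answer above. Structural equation modeling, which handles several relationships at once, is left out to keep the sketch short.

```python
from typing import List, Optional


def suggest_techniques(
    has_dependent: bool,
    n_dependent: int = 0,
    dependent_metric: Optional[bool] = None,
    independent_metric: Optional[bool] = None,
) -> List[str]:
    """Walk through the three classification questions in order.

    The mapping to technique names is an assumed, simplified version
    of the textbook's classification.
    """
    # Question 1: can the variables be split into dependent and independent?
    if not has_dependent:
        # No split possible: interdependence techniques.
        return [
            "factor analysis",
            "cluster analysis",
            "perceptual mapping",
            "correspondence analysis",
        ]
    if n_dependent < 1 or dependent_metric is None:
        raise ValueError(
            "a dependence technique needs n_dependent >= 1 "
            "and dependent_metric set to True or False"
        )
    # Question 2: how many dependent variables in a single analysis?
    if n_dependent == 1:
        # Question 3: is the dependent variable metric or nonmetric?
        if dependent_metric:
            return ["multiple regression", "conjoint analysis"]
        return ["multiple discriminant analysis", "logistic regression"]
    # Several dependent variables in one relationship.
    if not dependent_metric:
        return ["canonical correlation with dummy variables"]
    if independent_metric is None:
        raise ValueError("independent_metric must be True or False here")
    if independent_metric:
        return ["canonical correlation"]
    return ["multivariate analysis of variance"]


# Income as a metric amount versus income as a nonmetric class.
print(suggest_techniques(True, n_dependent=1, dependent_metric=True))
print(suggest_techniques(True, n_dependent=1, dependent_metric=False))
```

The two calls mirror the income example from Q3: a metric income points toward multiple regression, while a nonmetric income class points toward discriminant analysis or logistic regression.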
...
...