Check the attachments

Please read the instructions and questions carefully in the “Assignment_5_2024_Fall.pdf” file and use “Auto.csv” to finish the assignment. You should submit both 1) your R code and 2) a PDF report with answers through the link “Submit Assignment 5 Here”.

Guidelines:

· Use only R for this assignment

· Submit both R code and Report on findings

· Work is to be done individually for this assignment

1. In this problem, you will generate simulated data, and then perform K-means clustering on the data.

1.1 Generate a simulated data set with 30 observations in each of two classes (i.e. 60 observations in total), and 2 variables.

Code Hint: The first four lines of code should be:

set.seed(2)
x=matrix(rnorm(60*2), ncol=2)
x[1:30,1]=x[1:30,1]+3
x[1:30,2]=x[1:30,2]-4

1.2 Perform K-means clustering of the observations with K = 2, using nstart=20. Plot the data with each observation colored according to its cluster assignment. Take a screenshot of your plot. What is the total within-cluster sum of squares?
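One possible way to approach 1.2 is sketched below. It assumes the matrix x is built exactly as in the code hint from 1.1; the object name km2 and the plot labels are illustrative choices, not requirements of the assignment.

```r
# Simulated data from the code hint in 1.1
set.seed(2)
x <- matrix(rnorm(60 * 2), ncol = 2)
x[1:30, 1] <- x[1:30, 1] + 3
x[1:30, 2] <- x[1:30, 2] - 4

# K-means with K = 2 and 20 random starts (km2 is a hypothetical name)
km2 <- kmeans(x, centers = 2, nstart = 20)

# Colour each observation by its assigned cluster
plot(x, col = km2$cluster + 1, pch = 20, cex = 2,
     main = "K-means clustering, K = 2", xlab = "X1", ylab = "X2")

# Total within-cluster sum of squares reported by kmeans()
km2$tot.withinss
```

The tot.withinss component is the sum of the per-cluster values in km2$withinss, so either can be used to answer the question.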

1.3 Perform K-means clustering with K = 3, using nstart=20. Plot the data with each observation colored according to its cluster assignment. Take a screenshot of your plot. What is the total within-cluster sum of squares?

1.4 Now perform K-means clustering with K = 4, using nstart=20. Plot the data with each observation colored according to its cluster assignment. Take a screenshot of your plot. What is the total within-cluster sum of squares?
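Since the K = 2, 3 and 4 runs differ only in the number of centers, they could be fitted in one loop and compared side by side. The sketch below assumes the same simulated x as in the code hint; the names ks, fits and wss are hypothetical helpers.

```r
# Simulated data from the code hint in 1.1
set.seed(2)
x <- matrix(rnorm(60 * 2), ncol = 2)
x[1:30, 1] <- x[1:30, 1] + 3
x[1:30, 2] <- x[1:30, 2] - 4

# Fit K-means for K = 2, 3, 4, each with 20 random starts
ks <- 2:4
fits <- lapply(ks, function(k) kmeans(x, centers = k, nstart = 20))

# Collect the total within-cluster sum of squares for each K
wss <- sapply(fits, function(f) f$tot.withinss)
names(wss) <- paste0("K=", ks)
wss

# One panel per K, observations coloured by cluster assignment
op <- par(mfrow = c(1, 3))
for (i in seq_along(ks)) {
  plot(x, col = fits[[i]]$cluster + 1, pch = 20,
       main = paste("K =", ks[i]), xlab = "X1", ylab = "X2")
}
par(op)
```

When comparing the values, keep in mind that the best achievable total within-cluster sum of squares cannot increase as K grows, so a smaller value at K = 4 does not by itself mean four clusters describe the data better than two.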

1.5 Using the scale() function, perform K-means clustering with K = 2 on the data after scaling each variable to have standard deviation one. Take a screenshot of your plot. What is the total within-cluster sum of squares now? How do these results compare to those obtained in 1.2?
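A minimal sketch of the scaled version follows, again assuming the simulated x from the code hint. Note that scale() with its default arguments also centers each column at zero; centering shifts every point by the same amount and so does not change which points are grouped together. The names x_scaled and km2_scaled are illustrative.

```r
# Simulated data from the code hint in 1.1
set.seed(2)
x <- matrix(rnorm(60 * 2), ncol = 2)
x[1:30, 1] <- x[1:30, 1] + 3
x[1:30, 2] <- x[1:30, 2] - 4

# Standardise each column (default: center = TRUE, scale = TRUE)
x_scaled <- scale(x)
apply(x_scaled, 2, sd)  # each column should now have standard deviation 1

# K-means with K = 2 on the scaled data
km2_scaled <- kmeans(x_scaled, centers = 2, nstart = 20)

plot(x_scaled, col = km2_scaled$cluster + 1, pch = 20, cex = 2,
     main = "K-means on scaled data, K = 2",
     xlab = "X1 (scaled)", ylab = "X2 (scaled)")

km2_scaled$tot.withinss
```

Because the scaled variables are in different units from the raw ones, the two sums of squares are not directly comparable in magnitude; comparing the cluster assignments themselves is usually more informative.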

2. Consider the USArrests data. We will now perform hierarchical clustering on the states. The USArrests dataset ships with base R (in the datasets package), so you do not need to load any libraries.

2.1 Plot the hierarchical clustering dendrogram using complete linkage clustering with Euclidean distance as the dissimilarity measure. Take a screenshot of your plot.
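As a rough sketch, the dendrogram in 2.1 could be produced with dist() and hclust() from base R. The object names d and hc_complete and the plot title are illustrative assumptions.

```r
# Euclidean distances between states (rows of USArrests)
d <- dist(USArrests, method = "euclidean")

# Hierarchical clustering with complete linkage
hc_complete <- hclust(d, method = "complete")

# Dendrogram; cex shrinks the state labels so they stay legible
plot(hc_complete, main = "Complete linkage, Euclidean distance",
     xlab = "", sub = "", cex = 0.7)
```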

2.2 Cut the dendrogram at a height that results in three distinct clusters. Which states belong to which clusters? You need to provide state names for each cluster (e.g. Cluster 1 has Alabama, Alaska,…).
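One way to obtain the three clusters is cutree(), either by asking for k = 3 directly or by supplying a cut height. The sketch below derives a height from the tree itself rather than hard-coding a number; the names clusters, merge_heights and h_cut are hypothetical.

```r
hc_complete <- hclust(dist(USArrests), method = "complete")

# Option A: ask directly for three clusters
clusters <- cutree(hc_complete, k = 3)

# Option B: cut at an explicit height, halfway between the merge that
# goes from 4 to 3 clusters and the merge that goes from 3 to 2 clusters
merge_heights <- sort(hc_complete$height, decreasing = TRUE)
h_cut <- mean(merge_heights[2:3])
clusters_by_height <- cutree(hc_complete, h = h_cut)

# Show the cut on the dendrogram
plot(hc_complete, main = "Complete linkage, cut into three clusters",
     xlab = "", sub = "", cex = 0.7)
abline(h = h_cut, col = "red", lty = 2)

# State names listed per cluster, and cluster sizes
split(names(clusters), clusters)
table(clusters)
```

cutree() numbers clusters in the order in which they first appear in the data, so the first state in USArrests (Alabama) always falls in cluster 1.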

2.3 Hierarchically cluster the states using complete linkage and Euclidean distance, after scaling the variables to have standard deviation one.

a) Take a screenshot of your plot.

b) What effect does scaling the variables have on the hierarchical clustering obtained?

c) In your opinion, should the variables be scaled before the inter-observation dissimilarities are computed? Provide a justification for your answer.
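To explore 2.3, the scaled and unscaled trees could be built side by side and their three-cluster cuts cross-tabulated. This is only a sketch; the names usa_scaled, hc_raw, hc_scaled and comparison are illustrative.

```r
# Standardise each variable to standard deviation one
usa_scaled <- scale(USArrests)

# Complete linkage on raw and on scaled data
hc_raw    <- hclust(dist(USArrests),  method = "complete")
hc_scaled <- hclust(dist(usa_scaled), method = "complete")

plot(hc_scaled, main = "Complete linkage, scaled variables",
     xlab = "", sub = "", cex = 0.7)

# Cross-tabulate the three-cluster assignments from both trees
comparison <- table(raw = cutree(hc_raw, 3), scaled = cutree(hc_scaled, 3))
comparison

# Spread of each variable before scaling
sapply(USArrests, sd)
```

The standard deviations are a useful input for part c): the variables are recorded in different units (arrests per 100,000 residents versus percent urban population), and Assault has the largest spread, so it tends to dominate unscaled Euclidean distances.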

2.4 After scaling the variables to have standard deviation one, plot the hierarchical clustering dendrogram using average linkage clustering with Euclidean distance as the dissimilarity measure. Take a screenshot of your plot.

2.5 After scaling the variables to have standard deviation one, plot the hierarchical clustering dendrogram using single linkage clustering with Euclidean distance as the dissimilarity measure. Take a screenshot of your plot.
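For 2.4 and 2.5 only the method argument of hclust() changes. A minimal sketch, with illustrative object names and plot titles:

```r
# Euclidean distances on the standardised variables
d_scaled <- dist(scale(USArrests))

# Average and single linkage on the same distance matrix
hc_average <- hclust(d_scaled, method = "average")
hc_single  <- hclust(d_scaled, method = "single")

# Draw the two dendrograms next to each other
op <- par(mfrow = c(1, 2))
plot(hc_average, main = "Average linkage (scaled)",
     xlab = "", sub = "", cex = 0.6)
plot(hc_single, main = "Single linkage (scaled)",
     xlab = "", sub = "", cex = 0.6)
par(op)
```

If separate screenshots are needed for the report, the par(mfrow = ...) lines can be dropped and each plot() call run on its own.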

What to submit:

1. R code.

a. Should include all the code to accomplish the tasks.

b. Clear and concise comments to indicate what part of the assignment each code chunk pertains to.

c. Code should be easily readable.

d. Filename should be in the format of: LastnameFirstname_A5.R

2. Report.

a. Take screenshots of your outputs in RStudio and answer all the questions.

b. Submit in PDF format.

c. Answer questions clearly and concisely.

d. Include appropriate plots. Make sure the plots are properly labeled.

e. The assignment will be graded on the correctness of the answers, the comprehensiveness of the analysis, the clarity of the presentation of results, and the neatness of the report.
