(Solved): wpc 300 : Assignment 8: Clustering Analysis...


WPC300

Assignment 8: Clustering Analysis
Started: Oct 

Quiz Instructions
•    Read the Assignment Instructions.
•    Start with datafile: US Demographics.jmp
•    Change the file name to: YourFirstName_YourLastName_Assignment8.jmp
•    You must use JMP to answer the following multiple-choice questions. 
•    Note: When you are asked to submit a screen capture, you need to make sure that your name is part of the capture. 
 

Question 1                          2.5 pts
Run a hierarchical clustering analysis to segment the US states based on the attributes mentioned in the instructions. Submit a parallel plot showing the number of clusters obtained from the analysis. Make sure to use a different color for each cluster.
Hint: put "State" into "Label."
Upload (attached) 

 
Question 2                           2.5 pts
Based on the hierarchical clustering analysis, which cluster of states has low population and low physical activity compared with the other clusters?

  • Alabama, Indiana, Kentucky
  • California, Texas, New York
  • Kansas, Idaho, Michigan
  • Tennessee, Alaska, Ohio

 
Question 3                                 2.5 pts
For a wellness campaign whose goal is to raise awareness of people’s health and weight-related chronic illnesses (such as obesity), which cluster (based on the hierarchical clustering analysis) would be an ideal target for an initial launch?

  • Cluster #1
  • Cluster #3
  • Cluster #2
  • Cluster #4

 
Question 4                       2.5 pts
Based on the hierarchical clustering analysis, submit a screenshot of the appropriate US map with colored clusters of the States. 
Upload (attached) 
 
Question 5                       2.5 pts
Which of the following statements is true about cluster #4 from the hierarchical clustering analysis?

  • High household income and high physical activity
  • High population and high obesity
  • Low household income & low obesity
  • Low population and very high obesity

 
Question 6                       2.5 pts
Execute a k-means clustering analysis using the same data attributes with a cluster range between 3 and 15. How many distinct clusters did you find from your analysis?

  • 10
  • 6
  • 4
  • 8

 
Question 7                     2.5 pts
Based on the K-means clustering analysis in Q6, which of the following pairs of states do not cluster together?

  • Kansas & New Mexico
  • California & Utah
  • Texas & Florida
  • Colorado & Minnesota

 
Question 8                        2.5 pts
Based on the K-means clustering analysis, which two states would be appropriate for launching your campaign?

  • Colorado & Minnesota
  • New York and Texas
  • Alabama & Mississippi
  • Maryland & New Jersey

 
Question 9             2.5 pts
Based on the K-means clustering, in which of the following states would you definitely not choose to run your wellness campaign?

  • Louisiana
  • West Virginia
  • Oklahoma
  • Vermont

 
Question 10                   2.5 pts
Submit the JMP file with saved scripts for hierarchical and k-means clustering.

(provided)



(Solved): WPC300 Practice Test (Practical)  march 2021...


WPC300

Practice Test (Practical) 

  • Due Mar 7 

Instructions

PRACTICE Practical Exam Instructions

This exam has a total of two sections. You are required to use two sets of data to answer all the questions. Please see the individual instructions for each section. The total time allowed for this exam is 75 minutes. You must complete the exam in one sitting. You will need JMP Pro and Excel software to analyze data and answer questions.

Note: You are expected to work individually to complete this exam before the due date. Getting help from outside resources other than what is made available to you via the Canvas course site is considered a violation of the code of academic integrity, and you will be held liable for the consequences.

 

Attempt History


Submitted Mar 

Score for this attempt: 20 out of 20

 Data background

The sample includes various demographic and blood test responses for 442 diabetes patients (respondents). The response variable Y is a quantitative measure of disease progression one year after baseline measurements were taken. The ten variables measured at baseline time are age, gender (1 = male, 2 = female), body mass index (BMI), average blood pressure (BP), and six blood serum measurements (Total Cholesterol, LDL, HDL, TCH, LTG, & Glucose). The response Y Binary is constructed from the response Y and defined as high if Y is above 200 or low otherwise.

Section A:

Instructions: 

  • Use the following data file for this section: SampleDiabetes.xlsx  
  • Remember the honor code.
  • Use Excel to prepare your responses to the questions in this section
  • Note that sometimes numbers have been rounded.

Create a new column using a vlookup() function to categorize the age variable into age categories as follows:

Age      Category
70+      1
60-69    2
50-59    3
40-49    4
30-39    5
19-29    6
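
To make the lookup logic concrete, here is a minimal Python sketch of the same banding. It mirrors how an approximate-match lookup works on a table sorted by the lower bound of each band: find the largest lower bound that does not exceed the age, then return its category. The names `LOWER_BOUNDS`, `CATEGORIES`, and `age_category` are illustrative assumptions, not part of the exam file.

```python
from bisect import bisect_right

# Lower bound of each age band in ascending order (taken from the table above),
# paired with the category assigned to that band.
LOWER_BOUNDS = [19, 30, 40, 50, 60, 70]
CATEGORIES = [6, 5, 4, 3, 2, 1]


def age_category(age):
    """Return the category whose band contains `age`."""
    if age < LOWER_BOUNDS[0]:
        raise ValueError("age is below the lowest band in the table")
    # Index of the largest lower bound that is <= age.
    idx = bisect_right(LOWER_BOUNDS, age) - 1
    return CATEGORIES[idx]


if __name__ == "__main__":
    for age in (19, 29, 30, 45, 59, 60, 70, 85):
        print(age, age_category(age))
```

In a spreadsheet, the equivalent lookup table would also be sorted by the lower bound of each band so that an approximate match returns the right category.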

 

Question 1

1 / 1 pts

Using a pivot table, determine which of the following statements is incorrect.

  •   Category 4 has 97 respondents
  •   Category 3 has 54 respondents
  •   Category 5 has 73 respondents
  •   Category 2 has 90 respondents

Question 2

1 / 1 pts

Using a pivot table, determine which of the following statements is incorrect about the average age of respondents in each age category.

  •   Category 3 average age is 54.0 years
  •   Category 2 average age is 63.8 years
  •   Category 4 average age is 44.9 years
  •   Category 1 average age is 71.2 years

Question 3

1 / 1 pts

Create a pivot table pie chart for people aged 40 or older using the same age categories as before, and determine which of the following statements is correct.

  •   Category 3 has 28% of the respondents
  •   Category 2 has 20% of the respondents
  •   Category 2 has 28% of the respondents
  •   Category 4 has 22% of the respondents

Section B

Instructions

  • Use the following JMP data file for this section [Diabetes.JMP]
  • Remember the honor code.
  • Use JMP Pro to prepare your responses to the questions in this section
  • Note that sometimes numbers have been rounded.

 Question 4

1 / 1 pts

Which of the following statements is not correct based on the sample data provided?

  •   The mean for LDL is 115
  •   The upper limit of the 95% confidence interval for BP is 95.9
  •   The median for Total Cholesterol is 186
  •   The standard deviation for HDL is 0.6152

Question 5

1 / 1 pts

Looking at the distribution of BMI, you observe that the data centrality is measured as:

  •   n = 442 
  •   Standard Error = 0.21
  •   Standard deviation = 4.41
  •   Mean = 26.4

Question 6

1 / 1 pts

Looking at the distribution of Glucose, you observe that the distribution spread is measured as:

  •   Mean is 91.3
  •   95% confidence interval is 90.2 to 92.3
  •   Standard error is 11.5
  •   Interquartile range is 15

Question 7

1 / 1 pts

It is generally believed that the average population age is 50. You claim that the population average age is less than 50. Perform a statistical test on the sample to see if the average age for the sample is consistent with your hypothesis (use a margin of error of 5%). What is the p-value from the test?

  •   0.05
  •   0.0089 
  •   0.9911 
  •   0.0179
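
The relationship between the "less than", "greater than", and two-sided p-values in Questions 7 and 8 can be sketched in Python. This is a rough illustration only: JMP reports a t-test, whereas the sketch uses a large-sample normal approximation (reasonable for a sample as large as n = 442), and the ages below are made-up values, not the exam data. The function name `one_sample_z_test` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def one_sample_z_test(sample, mu0):
    """Large-sample test of the mean against the hypothesized value mu0.

    Returns the test statistic and three p-values:
    H1: mean < mu0, H1: mean > mu0, and H1: mean != mu0.
    """
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = (mean(sample) - mu0) / standard_error
    p_less = NormalDist().cdf(z)          # evidence for "mean < mu0"
    p_greater = 1.0 - p_less              # evidence for "mean > mu0"
    p_two_sided = 2.0 * min(p_less, p_greater)
    return z, p_less, p_greater, p_two_sided


if __name__ == "__main__":
    # Hypothetical ages, for illustration only.
    ages = [42, 55, 47, 61, 38, 50, 44, 53, 49, 46]
    z, p_less, p_greater, p_two = one_sample_z_test(ages, mu0=50)
    print(f"z={z:.3f} p_less={p_less:.4f} p_greater={p_greater:.4f} p_two={p_two:.4f}")
```

The two one-sided p-values always sum to 1, and the two-sided p-value is twice the smaller of them, which is a useful sanity check when reading the test output.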

 Question 8

1 / 1 pts

It is generally believed that the average population age is 50. You claim that the population average age is more than 50. Perform a statistical test on the sample to see if the average age for the sample is consistent with your hypothesis (use a margin of error of 5%). What can you conclude?

  •   We fail to reject the null hypothesis
  •   We accept the null hypothesis
  •   We do not have enough information to make a judgement on the null hypothesis
  •   We reject the null hypothesis

Question 9

1 / 1 pts

A pairwise correlation analysis of the variables Y, age, BMI, BP, Total Cholesterol, LDL, HDL, TCH, LTG, & Glucose in the sample suggests that:

  •   The population has a significant negative correlation between TCH and Total Cholesterol
  •   The population has no correlation between Total Cholesterol and LDL
  •   The population has a significant negative correlation between HDL and TCH
  •   The population has a significant negative correlation between Y and BMI

Question 10

1 / 1 pts

If we are interested in determining a possible cause and effect relationship where BMI and Age are causing disease progression (Y), _____ is the independent variable and ____ is the dependent variable?

  •   Age, BMI respectively
  •   BMI, Y respectively
  •   BMI, Age respectively
  •   Y, BMI respectively

Question 11

1 / 1 pts

Perform a simple linear regression to predict Y using respondents’ BMI. What is the correct equation for the regression line?

  •   Y = 10.2*BMI
  •   BMI = -118 + 10.2*Y
  •   Y = -118 + 10.2*BMI
  •   BMI = 21 + 0.034*Y
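
As a reminder of how a simple regression line is written, the dependent variable goes on the left and the fitted line has the form Y = intercept + slope * X. The following Python sketch computes the least-squares intercept and slope by hand. The data are synthetic and `fit_line` is a hypothetical helper name; nothing here reproduces the Diabetes.JMP results.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor.

    Returns (intercept, slope) of the line y = intercept + slope * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the ratio of the x-y co-variation to the variation in x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Synthetic points lying exactly on y = 2 + 3x.
    xs = [1, 2, 3, 4]
    ys = [5, 8, 11, 14]
    intercept, slope = fit_line(xs, ys)
    print(f"y = {intercept:.2f} + {slope:.2f} * x")
```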

Question 12

1 / 1 pts

Perform a multiple regression analysis (with a margin of error of 5%) that examines all of the variables in the sample (excluding Y binary) as potential predictors of Y. Which of the following conclusions can be made based on the analysis without removing any of the predictor variables?

  •   LDL is a significant predictor in the model, LTG is not.
  •   TCH is a significant predictor in the model, Glucose is not.
  •   BMI is a significant predictor in the model, HDL is not.
  •   Age is a significant predictor in the model, Total Cholesterol is not.

Question 13

1 / 1 pts

After performing model building by applying backward deletion to the model described in Q12, which of the following conclusions is valid based on the final model?

  •   Glucose is not a significant predictor, but Gender is
  •   Total Cholesterol is not a significant predictor, but BP is
  •   HDL is not a significant predictor, but LTG is
  •   Age is not a significant predictor, but LDL is

 Question 14

1 / 1 pts

Based on the final model developed in Q13, which is the strongest predictor in the model?

  •   Intercept
  •   BMI
  •   Total Cholesterol 
  •   Gender

Question 15

1 / 1 pts

Based on the final model developed in Q13, which is the weakest predictor in the model?

  •   Gender
  •   Total Cholesterol
  •   Intercept
  •   BMI

Question 16

1 / 1 pts

How much of the variation in the dependent variable can be explained by the final regression model developed in Q13?

  •   50.8%
  •   51.5%
  •   <.0001
  •   We cannot determine this quantity

Question 17

1 / 1 pts

Is there a multicollinearity concern for the final model developed in Q13?

  •   There is a multicollinearity problem in the final model and we should delete the LTG variable
  •   There is a multicollinearity problem in the final model and we should delete the Total Cholesterol variable
  •   There is no multicollinearity problem in the final model
  •   There is a multicollinearity problem in the final model and we should delete the Gender variable

Question 18

1 / 1 pts

The Y Binary variable was developed to categorize respondents into high and low development of Diabetes over the year since their baseline measurements were taken. What proportion of high development respondents are female?

  •   75.3%
  •   69.6%
  •   24.7%
  •   30.4%

Question 19

1 / 1 pts

In an initial logistic regression analysis attempting to establish if all of the variables (excluding Y) in the sample can predict (with a margin of error of 5%) the level (high/low) of the disease, it can be concluded that:

  •   Some of the predictors are not significant and can be deleted from the model
  •   The overall model is significant in predicting the level of development of the disease
  •   The model accuracy can be determined by the confusion matrix
  •   All the other answer choices are correct

Question 20

1 / 1 pts

In the final logistic regression model to predict/classify Y binary, which of the following statements is true:

  •   77 respondents were correctly classified by the model as high disease development
  •   291 respondents were correctly classified by the model as low disease development
  •   44 respondents were incorrectly classified by the model as low disease development
  •   All of the other answer choices are correct
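
The statements in Question 20 are all read off a confusion matrix. A minimal Python sketch of how such a table is tallied from actual and predicted labels follows. The labels are invented for illustration, and `confusion_counts` and `accuracy` are hypothetical helper names rather than JMP functions.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="high"):
    """Tally the four cells of a 2x2 confusion matrix."""
    counts = Counter(TP=0, FN=0, FP=0, TN=0)
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1   # high, correctly classified as high
        elif a == positive:
            counts["FN"] += 1   # high, incorrectly classified as low
        elif p == positive:
            counts["FP"] += 1   # low, incorrectly classified as high
        else:
            counts["TN"] += 1   # low, correctly classified as low
    return counts


def accuracy(counts):
    """Share of all cases that were classified correctly."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    actual = ["high", "high", "low", "low", "low", "high"]
    predicted = ["high", "low", "low", "high", "low", "high"]
    cells = confusion_counts(actual, predicted)
    print(dict(cells), round(accuracy(cells), 3))
```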

 

 



(Solved): wpc300 Lab 8: Clustering march 2021...


wpc300 Lab 8: Clustering
Started: March

Quiz Instructions
•    Start with datafile: Presidential Elections.jmp
•    Change the file names to: YourFirstName_YourLastName_presidential.jmp
•    Review what we did in the lab.
•    You must use JMP to answer the following multiple-choice questions. 
•    Note: When you are asked to submit a screen capture, you need to make sure that your name is part of the capture. 
 
Question 1                    2 pts
Based on a hierarchical clustering on the election results, how many optimal clusters of States did you get from your analysis?

  • 4
  • 6
  • 5
  • 8

 
Question 2                  2 pts
Which cluster of States has the lowest proportion of Democratic voters in the Presidential elections?

  • Alaska, Nebraska, Idaho, Wyoming, Utah
  • Iowa, Wisconsin, Pennsylvania, Minnesota
  • Virginia, Texas, Indiana
  • Arizona, Nevada, New Hampshire, Colorado, Florida

 
Question 3                  2 pts
If you are an advisor for a Republican presidential nominee, for which States would you not run many campaign ads against your opponent who is a Democratic presidential nominee?

  • Iowa, Wisconsin, Pennsylvania, Minnesota
  • Alaska, Nebraska, Idaho, Wyoming, Utah
  • Arizona, Nevada, New Hampshire, Colorado, Florida
  • Virginia, Texas, Indiana

 
Question 4                 2 pts
Show a parallel plot to demonstrate how different states are clustered together based on the hierarchical clustering analysis. Submit a screenshot.
Upload 
  

Question 5             2 pts
Perform a K-means clustering with the number of clusters varying between 3 and 10. What optimal number of clusters did you get from this analysis?

  • 8
  • 4
  • 5
  • 6


Question 6               2 pts
Show a parallel plot to demonstrate how different states are clustered together based on the K-means clustering analysis. Submit a screenshot.
Upload 

Question 7                    2 pts
Based on the K-means clustering, show how states are grouped together using a color-coded US map. Submit a screenshot.
Upload 
  

Question 8                    2 pts
Upon evaluation of cluster means obtained from K-means clustering, which cluster has the lowest overall percentage of Democratic voters over the 9 election cycles?

  • 2
  • 4
  • 8
  • 3


Question 9                      2 pts
Upon evaluation of cluster means obtained from K-means clustering, which cluster has the highest overall percentage of Democratic voters over the 9 election cycles?

  • 8
  • 4
  • 3
  • 2


Question 10                     2 pts
Which cluster depicts swing states, where the percentages for the Democratic and Republican nominees have been similar in the last couple of election years?

  • 7
  • 2
  • 6
  • 8


(Solved): CIS 360 assignment :Excel Pivots and Charts...


CIS 360 assignment: Excel Pivots and Charts

Pivot Tables and Charts
Introduction
Excel allows you to create cross-tab (or cross-tabulation) analysis in what it calls a Pivot Table.
Pivot tables can be one, two or three-dimensional. You can use multiple statistical analysis and summary options tools. You can include data from multiple worksheets, and you can modify them dynamically.
Creating a Pivot Table
1. Open ExcelPivotsWorksheetCIS360.xlsx, found on Canvas.
2. Click on the Expenses tab.
3. Click on cell A2.
4. On the Insert tab click on the [PivotTable] button.
5. The Create Pivot Table box opens:



(Solved): WPC 300 : Assignment 7: Logistic Regression |SchoolAdmission...


Assignment 7: Logistic Regression

Quiz Instructions

  • Start with datafile: SchoolAdmission.jmp
  • Change the file name to: YourFirstName_YourLastName_Assignment7.jmp
  • You must use JMP to answer the following multiple-choice questions. 
  • Note: When you are asked to submit a screen capture, you need to make sure that your name is part of the capture. 

 Question 1        2.5 pts

Since we are interested in understanding ADMITTED students, change the Value Order in "Column Properties" of the "ADMIT" column so that (admitted) appears first in graphics and analyses. Show the screenshot to provide the evidence of this step.

Question 2        2.5 pts

Execute a contingency analysis to find out the relationship between ‘Admit’ and ‘Rank’ of the students. Which of the following statements is correct based on your analysis?

Group of answer choices

  • 76.9% of Rank-2 students didn’t get admitted
  • 45.9% of Rank-1 students didn’t get admitted.
  • 35.8% of Rank-4 students didn’t get admitted.
  • 17.9% of Rank-3 students got admitted

 Question 3        2.5 pts

Execute a simple logistic regression analysis to test if ‘Admit’ is significantly related to ‘GPA’.  Based on your analysis, which of the following statements is correct? [save the script to the datafile]

Group of answer choices

  • ‘Admit’ is significantly related to the independent variable ‘GPA’ because the p-value is less than 0.05
  • ‘Admit’ is not related with ‘survived’ because one of them is categorical variable
  • ‘Admit’ is not related to the dependent variable ‘GPA’ because the p-value is more than 0.05
  • As ‘GPA’ increases, ‘Admit’ decreases.

Question 4        2.5 pts

Share an appropriate screenshot to support your answer in Question 3. Make sure you use different colors and different markers for students who got admitted and who didn’t.

Question 5        2.5 pts

Develop a simple logistic regression analysis to test if ‘Admit’ is significantly related to ‘GRE’. Based on your analysis, which of the following statements is correct?

Group of answer choices

  • GRE is significantly related to the dependent variable ‘Admitted’
  • GRE is not related to the dependent variable ‘Admit’
  • As GRE increases, the Admit decreases.
  • GRE is related to ‘Admit’ but it is not significant

 Question 6        2.5 pts

Share a screenshot of the Logistic Fit Plot to support your answer in Q5. Make sure you use different colors and different markers for students who got admitted and who didn’t.

 Question 7        2.5 pts

Use Analyze-> Fit Model to perform a 'stepwise' fit for “ADMIT” using all relevant predictors in the data. Use Minimum BIC (as stopping rule), Forward (as direction), and Whole Effects (as Rules) in the model. Run the logistic regression model using the selected variables in the previous step. Which of the following variables is/are significant in the model to predict when students get admitted?  [save the script to the data file]

Group of answer choices

  • GPA
  • None of the variables are significant
  • GRE
  • Both GRE and GPA

 Question 8        2.5 pts

Submit the parameter estimate table from the above model in Question 7.

Upload 

 Question 9        2.5 pts

Submit the final equation of the model (the resulting logit of the probability model, saved as Lin[1] in the data table). [save script to the data file]

Upload 

 Question 10      2.5 pts

Submit the JMP data file with saved script for all the analysis to answer questions from this assignment.



(Solved): WPC300 Quiz 7: Clustering UPDATED: Feb  2021...


WPC300 Quiz 7: Clustering

Started: Feb  2021

Quiz Instructions

Question 1        1 pts

Which of the following is a step of agglomerative hierarchical clustering?

Group of answer choices

  • By separating a cluster into two finer groups
  • By joining two clusters that are not at a Euclidean distance
  • By joining two clusters that are closest to each other
  • By joining two clusters farthest away from each other 
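
For readers who want to see the mechanics behind Question 1, here is a small Python sketch of agglomerative clustering on one-dimensional data: every point starts as its own cluster, and each step merges the two clusters that are currently closest. Single linkage is used as the distance between clusters purely as an assumption for the illustration; the data and the function names `single_link` and `agglomerate` are hypothetical.

```python
from itertools import combinations


def single_link(a, b):
    """Distance between two clusters: the closest pair of points across them."""
    return min(abs(x - y) for x in a for y in b)


def agglomerate(points, target_clusters):
    """Merge the two closest clusters repeatedly until `target_clusters` remain."""
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > target_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: single_link(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is still valid
    return clusters


if __name__ == "__main__":
    data = [1.0, 1.2, 5.0, 5.1, 9.0]  # made-up values
    print(agglomerate(data, 3))
    print(agglomerate(data, 2))
```

Recording the order of the merges is what produces the tree diagram (dendrogram) shown by hierarchical clustering tools.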

Question 2        1 pts

Which of the following is true about k-means clustering?

Group of answer choices

  • The cluster analysis will give us an optimum value for k
  • We choose the value for k before doing the clustering analysis
  • A tree diagram is used to illustrate the steps in the clustering analysis
  • It is a type of hierarchical clustering
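
A bare-bones Python sketch of the k-means procedure may help with Question 2. The caller supplies `k`, the points are assigned to the nearest centroid, and the centroids are recomputed until they stop moving. Taking the first `k` points as starting centroids is a simplifying assumption made here for reproducibility (real tools usually pick starting points differently), and the data and the name `k_means` are illustrative only.

```python
from math import dist


def k_means(points, k, iterations=20):
    """Return (centroids, groups) for points given as tuples of coordinates."""
    centroids = [points[i] for i in range(k)]  # naive initialization (assumption)
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # Update step: each centroid moves to the mean of its group.
        new_centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, groups


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]  # made-up points
    centers, members = k_means(pts, k=2)
    print(centers)
    print(members)
```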

Question 3        1 pts

Which of the following is true of hierarchical clustering?

Group of answer choices

  • All clusters must have the same number of data
  • All clusters must have more than one object in it
  • No single cluster can have all data
  • The data partition does not occur in a single step

Question 4        1 pts

In a cluster analysis, the distance between the clusters should be:

Group of answer choices

  • Minimized
  • Even
  • Maximized
  • Zero

 Question 5        1 pts

Which of the following is not an application of clustering analysis?

Group of answer choices

  • Crime prediction analysis
  • Web click stream analysis
  • Market segmentation analysis
  • Collaborative filtering analysis

 Question 6        1 pts

In the Target story discussed in the lecture, why did Target send the teen daughter maternity ads?

Group of answer choices

  • Target's analytics model confused her with an older woman with a similar name
  • Target was using a special promotion that targeted all teens in her geographical area
  • Target's analytics model suggested she was pregnant based on her buying habits
  • Target was sending ads to all women in a particular neighborhood

Question 7        1 pts

Which of the following categories of data mining would you use for spam filtering of emails?

Group of answer choices

  • Supervised
  • Unsupervised
  • Heuristics
  • Both supervised and unsupervised

Question 8        1 pts

Which of the following statements is false about supervised/unsupervised data analysis?

Group of answer choices

  • The data is not labeled for unsupervised
  • Data is not labeled for supervised analysis
  • For unsupervised analysis, the goal is to find cases that are similar to each other
  • The data is labeled for supervised analysis

Question 9        1 pts

Which of the following is a false statement?

Group of answer choices

  • The k-means algorithm is a method for doing partitional clustering
  • Reducing SSE (sum of squared error) within cluster increases cohesion
  • To predict sales from transactional data one should perform clustering analysis.
  • In cluster analysis, the objects within clusters should exhibit a high degree of similarity

Question 10     1 pts

Which of the following is a definition of distance between two clusters in a complete linkage clustering?

Group of answer choices

  • The sum of square of the distance between clusters
  • The distance between the least distant pair of objects, one from each group
  • The distance between the most distant pair of objects, one from each group
  • The average of distance between all pairs of objects, where each pair is made up of one object from each group
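
The linkage definitions listed in Question 10 differ only in how the pairwise distances between two groups are summarized. A short Python sketch, using made-up two-dimensional points and hypothetical function names, shows the three most common summaries side by side.

```python
from itertools import product
from math import dist


def pair_distances(group_a, group_b):
    """All Euclidean distances between one point from each group."""
    return [dist(p, q) for p, q in product(group_a, group_b)]


def single_linkage(group_a, group_b):
    """Distance between the least distant pair of points."""
    return min(pair_distances(group_a, group_b))


def complete_linkage(group_a, group_b):
    """Distance between the most distant pair of points."""
    return max(pair_distances(group_a, group_b))


def average_linkage(group_a, group_b):
    """Mean distance over all cross-group pairs."""
    d = pair_distances(group_a, group_b)
    return sum(d) / len(d)


if __name__ == "__main__":
    a = [(0, 0), (1, 0)]  # made-up points
    b = [(4, 0), (6, 0)]
    print(single_linkage(a, b), complete_linkage(a, b), average_linkage(a, b))
```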

 



(Solved): WPC 300 : Lab 7: Logistic 2021 UPDATES...


Lab 7: Logistic
Started: Feb

Quiz Instructions

  • Start with datafile: titanicpassengers-bbm.jmp
  • Change the file names to: YourFirstName_YourLastName_titanicpassengers.jmp
  • Review what we did in the lab.
  • You must use JMP to answer the following multiple-choice questions. 
  • Note: When you are asked to submit a screen capture, you need to make sure that your name is part of the capture. 

 

Question 1        2 pts

Based on the available data sample, what percentage of the 3rd class passengers on board survived the accident?

Group of answer choices

  • 74.5%
  • 25.5%
  • 43%
  • 62%

Question 2        2 pts

What percentage of males on board did not survive the accident?

Group of answer choices

  • 72.3%
  • 19.1%
  • 27.3%
  • 80.9%

 Question 3        2 pts

Develop a simple logistic regression model to test if ‘Age’ is a significant predictor of the outcome ‘survived’.  Based on your analysis, which of the following statements is correct?

Group of answer choices

  • As 'Age' increases, the survival rate also increases.
  • Age is not a significant predictor of the dependent variable ‘survived’
  • Age is a significant predictor of the dependent variable ‘survived’
  • As 'Age' decreases, the survival rate decreases.

 Question 4        2 pts

Share a screenshot to support your answer in Q3. Make sure you use different colors and different markers for passengers who survived and who didn’t.

Upload 

 Question 5        2 pts

Develop a simple logistic regression model to test if ‘Fare’ is a significant predictor of the outcome ‘survived’.  Based on your analysis, which of the following statements is correct?

Group of answer choices

  • Fare is not correlated with the dependent variable ‘survived’ because the p-value is more than 0.05
  • Fare is not correlated with ‘survived’ because one of them is a categorical variable
  • As Fare increases, the survival rate decreases
  • Fare is a significant predictor of the dependent variable ‘survived’ because the p-value is less than 0.05

 Question 6        2 pts

Share a screenshot to support your answer in Q5. Make sure you use different colors and different markers for passengers who survived and who didn’t.

Upload 

 Question 7        2 pts

As shown in the lab, develop a multiple logistic regression model (use stepwise regression) to predict survival. Which of the following variables is most significant in predicting survival?

Group of answer choices

  • Age
  • Port
  • Sex
  • Passenger class

 Question 8        2 pts

As shown in the lab, develop a multiple logistic regression model (use stepwise regression) to predict survival. Which of the following variables is not significant to predict survival?

Group of answer choices

  • Parents and children
  • Age
  • Port
  • Passenger class

 Question 9        2 pts

Provide a screenshot of the appropriate table to support your answers in Q7 & Q8.

Upload 

 Question 10   2 pts

Provide the screenshot of the equation of the final model (the resulting logit of the probability model, saved as Lin[Yes] in the data table).

 



(Solved): WPC 300 : Quiz 6: Logistic...


Quiz 6: Logistic

Started: Feb

Quiz Instructions

 

Question 1        1 pts

In classification analysis, we are determining the probability of an observation ________.

Group of answer choices

  • To be undefined
  • To be one
  • To be part of a certain class or not
  • To be zero

 

Question 2        1 pts

A loan officer wants to know if the next customer is likely to default or not on a loan. How can she assess the risk of extending the loan to that customer?

Group of answer choices

  • By utilizing a simple linear regression model developed by an in-house analyst
  • By asking her colleague if he knows the person
  • By asking the customer if he is planning to default on the loan or not
  • By utilizing a multiple logistic regression model developed by an in-house analyst

 

Question 3       1 pts

In classification analysis, we typically split the data into two mutually exclusive sets, known as ________, to investigate the strength of the developed model.

Group of answer choices

  • Training and Binary
  • Training and validation/testing
  • Binary and numeric
  • Testing and validation

Question 4       1 pts

Odds ratio is defined as ________, where p is the probability of success.

Group of answer choices

  • p/(1-p)
  • 1/(p-1)
  • p/(p-1)
  • 1/(1-p)
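
As a quick numeric illustration of the quantity in Question 4, the following Python sketch converts a probability of success into odds and back. Strictly speaking, p/(1-p) is usually called the odds, while an odds ratio compares the odds of two groups. The function names are illustrative assumptions.

```python
def odds(p):
    """Odds of success for a probability p, where 0 <= p < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)


def probability_from_odds(o):
    """Invert the odds back to a probability."""
    return o / (1 + o)


if __name__ == "__main__":
    for p in (0.2, 0.5, 0.8):
        print(p, odds(p), probability_from_odds(odds(p)))
```

A probability of 0.5 corresponds to odds of 1 (an even chance), and probabilities above 0.5 give odds greater than 1.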

 

Question 5         1 pts

The ________ is often used to describe the performance of a classification model applied to a set of test data for which the true outcomes are known.

Group of answer choices

  • Effect summary table
  • ANOVA table
  • Parameter estimates table
  • Confusion matrix

 

Question 6       1 pts

If you want to find out if body weight, calorie intake, fat intake and age have an influence on the probability of having a heart attack (yes or no), which of the following kind of analysis will help determine the answer?

Group of answer choices

  • Multiple logistic regression
  • Simple logistic regression
  • Simple linear regression
  • Multiple linear regression

 

Question 7          1 pts

In classification problems, the primary source for accuracy estimation of the model is ________.

Group of answer choices

  • Probability of success
  • Logit
  • Confusion matrix
  • Odds ratio

 

Question 8     1 pts

In logistic regression analysis, instead of Y as a dependent variable, we use a function of Y called ________.

Group of answer choices

  • Logit
  • Odds
  • Odds ratio
  • Log of Y

 

Question 9       1 pts

Logistic regression is a specialized type of regression analysis that is designed to predict ________ variables.

Group of answer choices

  • independent
  • numeric dependent
  • a binary numeric
  • a binary categorical

 

Question 10        1 pts

In logistic regression, the dependent variable y is defined as:

Group of answer choices

  • Log(p/(1-p))
  • Log(1/p)
  • Log(1/(1-p))
  • Log(1-p)
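
The link between a probability and the scale on which logistic regression is fitted can be sketched in a few lines of Python. The logit maps a probability to the whole real line, and the logistic function maps it back. Natural logarithms are assumed here, and the names `logit` and `inverse_logit` are illustrative.

```python
from math import exp, log


def logit(p):
    """Log-odds of a probability p, where 0 < p < 1."""
    return log(p / (1 - p))


def inverse_logit(z):
    """Logistic function: turns a log-odds value back into a probability."""
    return 1 / (1 + exp(-z))


if __name__ == "__main__":
    for p in (0.1, 0.5, 0.9):
        z = logit(p)
        print(p, round(z, 4), round(inverse_logit(z), 4))
```

Because the fitted model is linear on the log-odds scale, a saved linear predictor column has to be passed through the logistic function to obtain a predicted probability.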

 



(Solved): WPC 300 : Assignment 6: Linear Regression-CarsData.jmp feb-2...


Assignment 6: Linear Regression

Started: Feb

Quiz Instructions

  • Start with datafile: CarsData.jmp
  • Change the file name to: YourFirstName_YourLastName_Assignment6.jmp
  • You must use JMP to answer the following multiple-choice questions. 
  • Note: When you are asked to submit a screen capture, you need to make sure that your name is part of the capture. 

 

Question 1        2.5 pts

You have been asked to test if 'city mileage' of a car can be predicted based on the 'Fuel Tank capacity'. Which of the following statements is correct?

Group of answer choices

  • Dependent variable ‘Fuel Tank Capacity’ is negatively correlated with independent variable ‘City Mileage of Car’
  • The test shows that the variables are significantly correlated. Dependent variable ‘City mileage car’ is predicted as City Mileage (MPG) = 45.587166 - 1.3934743*Fuel Tank Capacity
  • The test shows that both variables are correlated. The dependent variable ‘Fuel Tank Capacity’ is predicted as Fuel Tank Capacity = 45.587166 + 1.3934743*City Mileage of Car.
  • The test shows that both variables are not correlated. Dependent variable City mileage car is predicted as City Mileage (MPG) = 45.587166 + 1.3934743*Fuel Tank Capacity

 

Question 2         2.5 pts

What is the coefficient of determination between 'City Mileage' and 'Weight'? [Save the script to the data file]

Group of answer choices

  • 47.05
  • 0.71
  • -0.008
  • 0.66
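
For a regression with a single predictor, the coefficient of determination equals the square of the Pearson correlation coefficient, so it always lies between 0 and 1. The Python sketch below computes both on made-up numbers; `pearson_r` and `r_squared` are hypothetical helper names and the values are not taken from CarsData.jmp.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def r_squared(xs, ys):
    """Coefficient of determination for a one-predictor linear regression."""
    return pearson_r(xs, ys) ** 2


if __name__ == "__main__":
    # Made-up values with a perfectly linear, negative relationship.
    xs = [1, 2, 3, 4, 5]
    ys = [10, 8, 6, 4, 2]
    print(pearson_r(xs, ys), r_squared(xs, ys))
```

Note that a strong negative correlation still yields a coefficient of determination close to 1, because the sign is lost when squaring.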

 

Question 3        2.5 pts

The team wants to find out if there are any other variables that are significantly correlated, with a correlation coefficient greater than +0.8 or less than -0.8. Execute an appropriate analysis to answer this question. Which of the following combinations of variables satisfy this condition? [Save the script to the data file]

Group of answer choices

  • Luggage Capacity and weight
  • Fuel Tank Capacity & Weight
  • Weight & Maximum Horsepower
  • Maximum Horsepower & Engine Size

 

Question 4       2.5 pts

Use the least squares method to develop a linear model to predict the city mileage of a car using all other variables in the data file as independent variables, except ‘Model’ & ‘Manufacturer’. Remove the insignificant parameters from the model one by one by checking the log(worth) of each parameter and removing the least important parameter first. Which of the following variables remain significant in the final model? [Save Script to the datafile]

Group of answer choices

  • Maximum Horsepower & Fuel Tank Capacity
  • Vehicle category & Weight
  • Luggage capacity & Weight
  • Rear seat room & Weight
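
JMP's LogWorth is -log10(p-value), so a LogWorth of 2 corresponds to a p-value of 0.01, and the least important term is the one with the smallest LogWorth. The sketch below shows the calculation and how the first term to remove would be chosen. The p-values are hypothetical and are not results from the data file.

```python
import math

def logworth(p_value):
    """LogWorth of a term: -log10 of its p-value."""
    return -math.log10(p_value)

# Hypothetical p-values for illustration only (not results from the data file).
p_values = {"Weight": 0.0001, "Width": 0.03, "Length": 0.40}

# The least important term has the smallest LogWorth (the largest p-value).
least_important = min(p_values, key=lambda name: logworth(p_values[name]))

for name, p in p_values.items():
    print(name, round(logworth(p), 3))
print("Remove first:", least_important)
```

After removing that term, the model is refit and the LogWorth values are checked again, because they change whenever a term leaves the model.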

 

Question 5        2.5 pts

What is the coefficient of determination of the model obtained in Question 4?

Group of answer choices

  • 0.43
  • 0.64
  • 0.78
  • 0.80

 

Question 6           2.5 pts

The team also asked you to check for any multi-collinearity effects in your model obtained from Question 4. After testing for any multi-collinearity effects (using VIF), what did you find out?

Group of answer choices

  • ‘Passenger capacity’ and ‘Length’ variables show multi-collinearity effects in the model
  • ‘Passenger capacity’ and ‘Weight’ variables show multi-collinearity effects in the model
  • ‘Fuel Tank Capacity’, ‘Width’ and ‘Weight’ variables show multi-collinearity effects in the model
  • ‘Vehicle Category’ and ‘Engine Size’ variables show multi-collinearity effects in the model

 

Question 7        2.5 pts

If you discovered multi-collinearity effects in the model, remove the variables in question one at a time (starting from the highest VIF) from the model and then stop when you don’t need to remove any further variable(s) from the model based on accepted VIF and p-values. After this process, submit the screenshot of the ‘Effect Summary’ of the final model. 

Upload 
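
The variance inflation factor of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. Equivalently, the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The sketch below computes them that way in plain Python. The data are hypothetical, not the assignment's, and the accepted VIF cutoff for the course is not stated here (5 and 10 are common rules of thumb).

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF of each predictor: the diagonal of the inverse correlation matrix."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)] for i in range(k)]
    inv = invert(corr)
    return [inv[i][i] for i in range(k)]

# Hypothetical predictors (not the assignment data): width tracks weight closely.
weight = [2200.0, 2600.0, 3000.0, 3400.0, 3800.0, 4200.0]
width = [66.0, 68.5, 69.0, 72.0, 73.5, 76.0]
passengers = [4.0, 5.0, 4.0, 6.0, 5.0, 5.0]

vifs = vif([weight, width, passengers])
print([round(v, 2) for v in vifs])
```

Dropping the predictor with the highest VIF and recomputing mirrors the one-at-a-time removal described in the question, since the remaining VIFs change after each removal.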

Question 8            2.5 pts

Based on the model in Question 7, what are the strongest and weakest variables in predicting the city mileage of a car?

Group of answer choices

  • Passenger Capacity (strongest) and Weight (weakest)
  • Vehicle Category (strongest) and Passenger Capacity (weakest)
  • Weight (strongest) and ‘Length’ (weakest)
  • Fuel Tank Capacity (strongest) and Passenger Capacity (weakest)

 

Question 9          2.5 pts

What is the model equation for the final model in Question 7 to predict City mileage of a car? [Save script] Submit a screenshot.

Question 10       2.5 pts

Submit the JMP data file with the saved scripts for all the analyses used to answer the questions in this assignment.

Upload

 

 



(Solved): WPC 300 : Lab 6: Linear Reg 2021...


WPC300

Lab 6: Linear Reg

Quiz Instructions

  • Start with datafile:
  • Change the file name to: YourFirstName_YourLastName_housingprices.jmp
  • Review what we did in the lab.
  • You must use JMP to answer the following multiple-choice questions. 
  • Note: When you are asked to submit a screen capture, you need to make sure that your name is part of the capture. 

 

Question 1   2 pts

In a simple linear regression model that predicts home price based on the number of bedrooms, the coefficient of determination is:

Group of answer choices

  • None of the given answers is correct
  • 0.44
  • 0.98
  • 0.46

 

Question 2       2 pts

Is the linear model obtained in the previous question significant? Support your answer with an appropriate screenshot from JMP analysis.

Upload 

 

Question 3         2 pts

When you develop a prediction model for home-price, based on the beds, baths, and square feet as independent variables, which, if any, of the following independent variables is significant in the model?

Group of answer choices

  • All of the other answers are correct
  • Square footage
  • Baths
  • Beds

 

Question 4           2 pts

In a final model (after removing all the non-significant predictors) to predict the home price based on other variables available in the data file, the most important predictor that contributes significantly is:

Group of answer choices

  • Square feet
  • Acres
  • Baths
  • Miles to base

 

Question 5       2 pts

What is the model equation based on multiple regression analysis to predict home price? (reference Q4). Provide a screenshot from JMP output to show the model equation.

Upload 

 

Question 6      2 pts

What is the coefficient of determination for the final model in the previous question?

Group of answer choices

  • 0.61
  • 0.78
  • -0.34
  • 0.80

 

Question 7       2 pts

In the final model (reference Q4), do you have any multicollinearity issue to be addressed?

Group of answer choices

  • Need additional information such as ‘residual’ to answer this question
  • Need more data to answer this question
  • No
  • Yes

 

Question 8       2 pts

Share a screenshot of the appropriate table to support your answer in Q7. 

Upload 

Question 9        2 pts

Based on the final regression equation, if all other variables remain the same, for an additional bath in the house, the home price will:

Group of answer choices

  • Increase by $197 K
  • Increase by $59.2 K
  • Decrease by $3.8 K
  • Increase by $5.00 K

Question 10        2 pts

Create a frequency distribution (with summary statistics) of the residual (Observed value - Predicted value) to show how well the model predicts the actual home price. Share the screenshot of this plot.
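
As a rough sketch of what the residual summary contains, the Python snippet below computes residuals as observed minus predicted and then their basic summary statistics. The prices are hypothetical values in $K, chosen so the residuals average to zero, as they do for a least-squares fit with an intercept. They are not taken from the housing data file.

```python
import statistics

# Hypothetical observed and predicted home prices in $K (illustration only).
observed = [250.0, 310.0, 198.0, 420.0, 275.0]
predicted = [262.0, 301.0, 205.0, 405.0, 280.0]

# Residual = observed value - predicted value.
residuals = [o - p for o, p in zip(observed, predicted)]

# Summary statistics that describe how far predictions fall from actual prices.
summary = {
    "mean": statistics.mean(residuals),
    "std_dev": statistics.stdev(residuals),
    "min": min(residuals),
    "max": max(residuals),
}
print(residuals)
print(summary)
```

A residual distribution centered on zero with a small spread indicates that the model's predictions are close to the observed prices.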

 

