SAS Institute A00-240 - SAS Statistical Business Analysis SAS9: Regression and Model

Total 99 questions

An analyst knows that the categorical predictor store_id is an important predictor of the target.

However, store_id has too many levels to be a feasible predictor in the model. The analyst wants to combine stores and treat them as members of the same class level.

What are the two most effective ways to address the problem? (Choose two.)

A. Eliminate store_id as a predictor in the model because it has too many levels to be feasible.

B. Cluster by using Greenacre's method to combine stores that are similar.

C. Use subject matter expertise to combine stores that are similar.

D. Randomly combine the stores into five groups to keep the stochastic variation among the observations intact.

When mean imputation is performed on data after the data is partitioned for honest assessment, what is the most appropriate method for handling the mean imputation?

A. The sample means from the validation data set are applied to the training and test data sets.

B. The sample means from the training data set are applied to the validation and test data sets.

C. The sample means from the test data set are applied to the training and validation data sets.

D. The sample means from each partition of the data are applied to their own partition.
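
As a minimal sketch of the idea behind this question, the Python below computes imputation means on one partition and reuses them on the others. It follows the common honest-assessment convention of estimating every preprocessing parameter on the training data only, so that validation and test data do not leak into model building. The partitions, the `income` column, and the helper names `fit_means` and `impute` are hypothetical and not part of the exam material.

```python
from statistics import mean

def fit_means(rows, columns):
    """Compute per-column means from the non-missing values in `rows`."""
    return {c: mean(r[c] for r in rows if r[c] is not None) for c in columns}

def impute(rows, means):
    """Return copies of `rows` with missing values replaced by the given means."""
    return [
        {c: (means[c] if v is None and c in means else v) for c, v in r.items()}
        for r in rows
    ]

# Hypothetical toy partitions; None marks a missing value.
train = [{"income": 40.0}, {"income": 60.0}, {"income": None}]
valid = [{"income": None}, {"income": 90.0}]
test = [{"income": None}]

# The means are estimated once, on the training partition only.
train_means = fit_means(train, ["income"])

# The same training means are then applied to every partition.
train_imputed = impute(train, train_means)
valid_imputed = impute(valid, train_means)
test_imputed = impute(test, train_means)
```

In this sketch the validation value of 90.0 never influences the imputed value, which stays at the training mean of 50.0 in all three partitions.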

A confusion matrix is created for data that were oversampled due to a rare target.

What values are not affected by this oversampling?

A. Sensitivity and PV+

B. Specificity and PV-

C. PV+ and PV-

D. Sensitivity and Specificity
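
One way to explore this question is to compute all four measures on a confusion matrix before and after the non-events are thinned out. The sketch below uses made-up counts and assumes oversampling keeps every event and a fixed fraction of non-events, with the classifier behaving the same on the retained cases. Sensitivity and specificity are each computed within a single actual class, while PV+ and PV- mix counts from both classes and therefore depend on the class proportions.

```python
def rates(tp, fn, fp, tn):
    """Return the four standard rates derived from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # within actual positives
        "specificity": tn / (tn + fp),  # within actual negatives
        "pv_plus": tp / (tp + fp),      # within predicted positives
        "pv_minus": tn / (tn + fn),     # within predicted negatives
    }

# Hypothetical population counts: 100 events, 9000 non-events.
population = rates(tp=80, fn=20, fp=900, tn=8100)

# Hypothetical oversampled data: all events kept, 1 in 10 non-events kept.
oversampled = rates(tp=80, fn=20, fp=90, tn=810)

for name in population:
    changed = population[name] != oversampled[name]
    print(f"{name}: population={population[name]:.4f} "
          f"oversampled={oversampled[name]:.4f} changed={changed}")
```

Comparing the two result sets shows which measures move when only the class mix changes.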

Given the following SAS data set TEST:

Which SAS program is NOT a correct way to create dummy variables?

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Refer to the exhibit.

Based on the control plot, which conclusion is justified regarding the means of the response?

A. All groups are significantly different from each other.

B. 2XL is significantly different from all other groups.

C. Only XL and 2XL are not significantly different from each other.

D. No groups are significantly different from each other.

An analyst compares the mean salaries of men and women working at a company.

The SAS data set SALARY contains variables:

    Gender (M or F)

    Pay (dollars per year)

Which SAS programs can be used to find the p-value for comparing men's salaries with women's salaries? (Choose two.)

A.

Option A

B.

Option B

C.

Option C

D.

Option D

This question will ask you to provide a missing option.

Given the following SAS program:

What option must be added to the program to obtain a data set containing Spearman statistics?

A. OUTCORR=estimates

B. OUTS=estimates

C. OUT=estimates

D. OUTPUT=estimates

Refer to the confusion matrix:

Calculate the accuracy and error rate (0 = negative outcome, 1 = positive outcome).

A. Accuracy = 58/102, Error Rate = 23/48

B. Accuracy = 83/102, Error Rate = 67/102

C. Accuracy = 25/150, Error Rate = 44/150

D. Accuracy = 83/150, Error Rate = 67/150
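
The definitions this question relies on can be sketched in a few lines of Python. Accuracy is the share of correctly classified cases (true negatives plus true positives) out of all cases, and the error rate is its complement. The counts below are hypothetical and are not the values from the exhibit, which is not reproduced here.

```python
from fractions import Fraction

def accuracy_and_error(tn, fp, fn, tp):
    """Return (accuracy, error rate) as exact fractions from 2x2 counts."""
    total = tn + fp + fn + tp
    accuracy = Fraction(tn + tp, total)  # correct predictions / all cases
    error_rate = 1 - accuracy            # misclassified cases / all cases
    return accuracy, error_rate

# Hypothetical counts, for illustration only.
acc, err = accuracy_and_error(tn=50, fp=10, fn=15, tp=25)
print(acc, err)  # prints "3/4 1/4"
```

Because both rates share the same denominator (the total number of cases), they always sum to 1.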

There are missing values in the input variables for a regression application.

Which SAS procedure provides a viable solution?

A. GLM

B. VARCLUS

C. STDIZE

D. CLUSTER

A financial services manager wants to assess the probability that certain clients will default on their Home Equity Line of Credit (HELOC). A former employee left the code listed below.

The training data set is named HELOC, while a similar data set of more recent clients is named RECENT_HELOC.

Which SAS data steps will calculate the predicted probability of default on recent clients? (Choose two.)

A.

Option A

B.

Option B

C.

Option C

D.

Option D