Summer Sale Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: ecus65

EMC D-DS-FN-23 - Dell Data Science Foundations

Page: 1 / 2
Total 59 questions

You have been given a task to improve sales force compensation of your organization. As a result of a study, your team decides to classify personnel as follows:

● Did not meet quota

● Met quota

● Exceeded 150% of quota

In which data analytics lifecycle phase should you define these categories for analysis purposes?

A.

Model building

B.

Communicate results

C.

Operationalize

D.

Model planning

You build a decision tree to classify five different types of customers based on their browsing history from a sample of 500. The resulting decision tree has 17 layers. One of the leaf nodes has only three customers.

What do you conclude?

A.

The decision tree needs to be rebuilt without the three customers

B.

The decision tree needs to be rebuilt to see if the results change

C.

The sample size is too small, so the classes may not be accurate

D.

Due to large number of layers, there may be an overfitting problem

Which chart type is intended to display time series data?

A.

Bar chart

B.

Pie chart

C.

Line chart

D.

[Histogram

In time series analysis, what function is examined to identify the order of the autoregressive component of an ARIMA model?

A.

Logistic function

B.

Lognormal distribution function

C.

Partial autocorrelation function

D.

Normal distribution function

What is the similarity between the matrix and array data structures in R?

A.

Both structures can contain only integers

B.

Both structures can only contain one data type

C.

Both structures can store multiple data types

D.

Both structures must be 2-dimensional

When should you consider using multinomial logistic regression over binary logistic regression?

A.

Dependent variable is continuous or dichotomous

B.

Dependent variable is continuous or categorical

C.

Dependent variable has more than two categories

D.

Dependent variable is continuous only

What is part of the model output for a linear regression?

A.

The assignment of each input datum to a cluster

B.

Coefficients indicating relative impact of the input variables on the outcome

C.

The set of all rules X -> Y with minimum support and confidence

D.

Probability score for each possible class label

What are three built-in data types in the R programming language?

A.

Boolean, integer, and character

B.

Boolean, table, and character

C.

Boolean, table, and integer

D.

List, array, and integer

Which analytic technique would be appropriate to estimate home sale price in U.S. dollars as a function of square footage, number of bedrooms, and lot size?

A.

Time series analysis

B.

Linear regression

C.

Naive Bayesian classification

D.

K-means clustering

A logistic regression model is built to determine the probability of a credit card borrower defaulting on a credit loan. A threshold value of 0.3 is selected. Which statement can be used to predict a borrower will default?

A.

If probability > 0.1, then predict the borrower will default

B.

If probability < 0.1, then predict the borrower will default

C.

If probability > 0.3, then predict the borrower will default

D.

If probability < 0.3, then predict the borrower will default