Databricks Databricks-Certified-Professional-Data-Scientist - Databricks Certified Professional Data Scientist Exam

Total 138 questions

Question # 11

What are the advantages of mutual information over the Pearson correlation for text classification problems?

The mutual information has a meaningful test for statistical significance.

The mutual information can signal non-linear relationships between the dependent and independent variables.

The mutual information is easier to parallelize.

The mutual information doesn't assume that the variables are normally distributed.
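The contrast between the two measures can be sketched on a toy data set. The values below are hypothetical and chosen only so that y depends on x through a purely non-linear rule (y = x squared); the helper names `pearson` and `mutual_information` are illustrative, not from any particular library.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def mutual_information(xs, ys):
    """Mutual information in bits, estimated from empirical frequencies."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x, count_y = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in joint.items()
    )

# Hypothetical data: y is fully determined by x, but not linearly.
xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]

r = pearson(xs, ys)
mi = mutual_information(xs, ys)
print(f"Pearson r = {r:.3f}, mutual information = {mi:.3f} bits")
```

On this data the linear correlation vanishes while the mutual information is clearly positive, which is the kind of dependence a correlation-based feature score would miss.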

Question # 12

Consider the following confusion matrix for a data set with 600 out of 11,100 instances positive:

In this case, Precision = 50%, Recall = 83%, Specificity = 95%, and Accuracy = 95%.

Select the correct statement.

Precision is low, which means the classifier is predicting positives best.

Precision is low, which means the classifier is predicting positives poorly.

The problem domain has a major impact on the measures that should be used to evaluate a classifier within it.

1 and 3

2 and 3
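The confusion matrix itself is not reproduced in this question, so the sketch below uses assumed cell counts (TP = 500, FN = 100, FP = 500, TN = 10,000). These counts are an assumption, chosen because they match the stated totals (600 positives out of 11,100 instances) and reproduce the quoted percentages to within rounding.

```python
# Assumed cell counts; the original matrix is not shown in the question.
tp, fn = 500, 100      # actual positives: 600
fp, tn = 500, 10_000   # actual negatives: 10,500

precision = tp / (tp + fp)              # share of predicted positives that are correct
recall = tp / (tp + fn)                 # share of actual positives that are found
specificity = tn / (tn + fp)            # share of actual negatives that are found
accuracy = (tp + tn) / (tp + fn + fp + tn)

for name, value in [
    ("Precision", precision),
    ("Recall", recall),
    ("Specificity", specificity),
    ("Accuracy", accuracy),
]:
    print(f"{name}: {value:.1%}")
```

With classes this imbalanced, accuracy and specificity stay high even though half of the predicted positives are wrong, which is why the choice of metric depends on the problem domain.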

Question # 13

A fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of the

Presence of the other features.

Absence of the other features.

Presence or absence of the other features.

None of the above

Question # 14

A problem statement is given below:

Hospital records show that of patients suffering from a certain disease, 75% die of it. What is the probability that of 6 randomly selected patients, 4 will recover?

Which of the following models would you use to solve it?

Binomial

Poisson

Normal

Any of the above
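For reference, the arithmetic for this problem can be sketched under a binomial model, assuming each patient recovers independently with probability 0.25 (the complement of the 75% death rate). The function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p_recover = 1 - 0.75   # 75% die, so 25% recover
prob_four_recover = binomial_pmf(4, 6, p_recover)
print(f"P(exactly 4 of 6 recover) = {prob_four_recover:.4f}")
```

Under these assumptions the result is 15 × 0.25^4 × 0.75^2, or roughly 0.033.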

Question # 15

Which analytical method is considered unsupervised?


Naive Bayesian classifier

Decision tree

Linear regression

K-means clustering

Question # 16

Suppose A, B, and C are events. The probability of A given B, relative to P(·|C), is the same as the probability of A given B and C (relative to P). That is,

P(A,B|C) / P(B|C) = P(A|B,C)

P(A,B|C) / P(B|C) = P(B|A,C)

P(A,B|C) / P(B|C) = P(C|B,C)

P(A,B|C) / P(B|C) = P(A|C,B)
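The relationship between conditioning within P(·|C) and conditioning on both B and C can be checked numerically. The sketch below uses a made-up joint distribution over three binary events (the weights are hypothetical) and exact fractions, so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution over (a, b, c); the weights 1/36 .. 8/36 sum to 1.
joint = {
    outcome: Fraction(i + 1, 36)
    for i, outcome in enumerate(product([0, 1], repeat=3))
}

def prob(event):
    """Total probability of the outcomes (a, b, c) for which event holds."""
    return sum(p for outcome, p in joint.items() if event(*outcome))

p_c = prob(lambda a, b, c: c == 1)
p_ab_given_c = prob(lambda a, b, c: a == 1 and b == 1 and c == 1) / p_c
p_b_given_c = prob(lambda a, b, c: b == 1 and c == 1) / p_c

# Conditioning on B inside the measure P(.|C) ...
lhs = p_ab_given_c / p_b_given_c
# ... versus conditioning on B and C directly under P.
rhs = prob(lambda a, b, c: a == 1 and b == 1 and c == 1) / prob(
    lambda a, b, c: b == 1 and c == 1
)
print(lhs, rhs)
```

The two quantities agree because P(C) cancels in the ratio.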

Question # 17

In machine learning, feature hashing, also known as the hashing trick (by analogy to the kernel trick), is a fast and space-efficient way of vectorizing features (such as the words in a language), i.e., turning arbitrary features into indices in a vector or matrix. It works by applying a hash function to the features and using their hash values modulo the number of features as indices directly, rather than looking the indices up in an associative array. What is the primary reason for using the hashing trick when building classifiers?

It creates smaller models

It requires less memory to store the coefficients for the model

It removes non-significant features, e.g. punctuation

Noisy features are removed
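The mechanism described in the question can be sketched in a few lines. This is a minimal illustration, not a production vectorizer: the function name `hashed_features`, the bucket count of 16, and the use of MD5 as the hash function are all assumptions made for the example.

```python
import hashlib

def hashed_features(tokens, n_buckets=16):
    """Map tokens to a fixed-length count vector using hash(token) mod n_buckets."""
    vector = [0] * n_buckets
    for token in tokens:
        # A stable hash is used because Python's built-in hash() for strings
        # is randomized between interpreter runs.
        digest = hashlib.md5(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % n_buckets
        vector[index] += 1
    return vector

tokens = "the quick brown fox jumps over the lazy dog".split()
vec = hashed_features(tokens)
print(vec)
```

The vector length is fixed up front and no vocabulary dictionary is kept, so the number of coefficients a model needs is bounded by `n_buckets` regardless of how many distinct tokens appear. The trade-off is that distinct tokens can collide in the same bucket.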

Question # 18

Which of the following skills does a data scientist require?

Web design skills, to present the results of an algorithm visually.

Should be creative

Should possess good programming skills

Should be very good at mathematics and statistics

Should possess database administration skills.

Question # 19

Which of the following is not a correct application of classification?

credit scoring

tumor detection

image recognition

drug discovery

Question # 20

Google AdWords studies the number of men and women who click an advertisement on the search engine during one hour at midnight each day. Google finds that the number of men who click can be modeled as a random variable with distribution Poisson(X), and likewise the number of women who click as Poisson(Y).

What is likely to be the best model of the total number of advertisement clicks during that midnight hour?

Binomial(X+Y,X+Y)

Poisson(X/Y)

Normal(X+Y, (X+Y)^1/2)

Poisson(X+Y)
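The behavior of a sum of two Poisson counts can be checked numerically. The sketch below assumes the two counts are independent and uses hypothetical rates of 2.0 and 3.0 clicks per hour in place of X and Y; it convolves the two distributions and compares the result with a single Poisson distribution whose rate is the sum.

```python
import math

def poisson_pmf(k, rate):
    """P(N = k) for a Poisson random variable with the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

rate_men, rate_women = 2.0, 3.0   # hypothetical stand-ins for X and Y

def total_clicks_pmf(k):
    """P(men + women = k), by convolving the two independent distributions."""
    return sum(
        poisson_pmf(i, rate_men) * poisson_pmf(k - i, rate_women)
        for i in range(k + 1)
    )

max_gap = max(
    abs(total_clicks_pmf(k) - poisson_pmf(k, rate_men + rate_women))
    for k in range(21)
)
print(f"largest difference from Poisson(X+Y) over k = 0..20: {max_gap:.2e}")
```

The difference is at the level of floating-point rounding, consistent with the sum of independent Poisson variables being Poisson with the summed rate.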
