Databricks Databricks-Certified-Professional-Data-Scientist - Databricks Certified Professional Data Scientist Exam

You are working on a data science project, and you have been given the responsibility of interviewing all the stakeholders in the project. Which phase of the project are you in?

A.

Discovery

B.

Data Preparations

C.

Creating Models

D.

Executing Models

E.

Creating visuals from the outcome

F.

Operationalize the models

Select the correct sequence of steps for developing a machine learning application

A) Analyze the input data

B) Prepare the input data

C) Collect data

D) Train the algorithm

E) Test the algorithm

F) Use It

A.

A, B, C, D, E, F

B.

C, B, A, D, E, F

C.

C, A, B, D, E, F

D.

C, B, A, D, E, F

Assume some output variable "y" is a linear combination of some independent input variables "X" plus some independent noise "e". The way the independent variables are combined is defined by a parameter vector B: y = XB + e, where X is an m x n matrix, B is a vector of n unknowns, and y is a vector of m values. Assuming that m is not equal to n and the columns of X are linearly independent, which expression correctly solves for B?

A.

Option A

B.

Option B

C.

Option C

D.

Option D
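As a worked illustration of the setup in this question: when X has more rows than columns and its columns are linearly independent, the standard least-squares estimate is B = (X^T X)^(-1) X^T y, obtained by solving the normal equations (X^T X) B = X^T y. The sketch below is a minimal pure-Python version under those assumptions; the helper names (transpose, matmul, solve, least_squares) and the sample data are hypothetical and chosen only for illustration.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot row.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def least_squares(X, y):
    """Solve the normal equations (X^T X) B = X^T y for B."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Hypothetical data: m = 4 observations, n = 2 unknowns (intercept and slope).
# The y values follow y = 1 + 2x with no noise.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
y = [1, 3, 5, 7]
B = least_squares(X, y)
print(B)
```

Solving the normal equations directly is fine for a small example like this one; numerical libraries usually prefer a QR or SVD based solver because X^T X can be poorly conditioned.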

Stories appear on the front page of Digg as they are "voted up" (rated positively) by the community. As the community becomes larger and more diverse, the promoted stories can better reflect the average interest of the community members. Which of the following techniques is used to build such a recommendation engine?

A.

Naive Bayes classifier

B.

Collaborative filtering

C.

Logistic Regression

D.

Content-based filtering
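To make the idea of recommending from community votes concrete, here is a minimal sketch of user-based collaborative filtering: stories are scored for a user by summing the similarity of other users who voted for them. The vote data, the user and story names, and the functions cosine and recommend are all hypothetical; this is an illustration of the general technique, not a description of how Digg works.

```python
from math import sqrt

# Hypothetical vote data: user -> {story: 1 if the user voted it up}.
votes = {
    "ann": {"s1": 1, "s2": 1, "s3": 1},
    "bob": {"s1": 1, "s2": 1},
    "cat": {"s4": 1},
}


def cosine(u, v):
    """Cosine similarity between two sparse vote vectors."""
    shared = set(u) & set(v)
    dot = sum(u[i] * v[i] for i in shared)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(user, votes):
    """Rank stories the user has not voted on by the similarity of the users who did."""
    scores = {}
    for other, their_votes in votes.items():
        if other == user:
            continue
        sim = cosine(votes[user], their_votes)
        for story in their_votes:
            if story not in votes[user]:
                scores[story] = scores.get(story, 0.0) + sim
    return sorted(scores, key=scores.get, reverse=True)


ranking = recommend("bob", votes)
print(ranking)
```

In this toy data, "bob" shares two votes with "ann" and none with "cat", so the story only "ann" voted for ranks ahead of the one only "cat" voted for.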

Select the correct statement which applies to K-Nearest Neighbors

A.

No Assumption about the data

B.

Computationally expensive

C.

Requires less memory

D.

Works with Numeric Values
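The properties listed above are easier to judge with the algorithm in view. The following is a minimal sketch of a k-nearest-neighbors classifier, assuming numeric feature tuples and Euclidean distance; the function name knn_predict and the training data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Every prediction computes the distance to every stored training point.
    neighbors = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    labels = Counter(label for _, label in neighbors)
    return labels.most_common(1)[0][0]


# Hypothetical training set: (features, label) pairs.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

prediction = knn_predict(train, (1.1, 1.0))
print(prediction)
```

Note that there is no training step: the whole training set is kept and scanned at prediction time, and the distance function only makes sense for numeric features.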

Classification and regression are examples of ___________.

A.

supervised learning

B.

unsupervised learning

C.

Clustering

D.

Density estimation

Your company has organized an online campaign for feedback on product quality, and you have all the responses for the product reviews. The response form has check boxes as well as a text field. Responses in which the text field is left empty or contains non-dictionary words are not considered valid feedback. Responses in which the text field is filled in with proper English words are considered valid. Which of the following methods should you not use to identify whether a response is valid?

A.

Naive Bayes

B.

Logistic Regression

C.

Random Decision Forests

D.

Any one of the above

Spam filtering of the emails is an example of

A.

Supervised learning

B.

Unsupervised learning

C.

Clustering

D.

A and C are correct

E.

B and C are correct

You are creating a model for recommending books at Amazon.com. Which of the following recommender systems would you use so that you do not have a cold-start problem?

A.

Naive Bayes classifier

B.

Item-based collaborative filtering

C.

User-based collaborative filtering

D.

Content-based filtering

You are using k-means clustering to classify heart patients for a hospital. You have chosen Patient Sex, Height, Weight, Age and Income as measures and have used 3 clusters. When you create a pair-wise plot of the clusters, you notice that there is significant overlap between the clusters. What should you do?

A.

Identify additional measures to add to the analysis

B.

Remove one of the measures

C.

Decrease the number of clusters

D.

Increase the number of clusters
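For reference, the k-means procedure this question refers to alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below is a minimal version of that loop, assuming numeric features, Euclidean distance, and caller-supplied starting centroids; the function name kmeans and the data are hypothetical.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of assign/update rounds and return (centroids, clusters)."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-feature data with two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

Because assignment depends only on distance across all measures, a measure on a much larger scale than the others (income versus height, for example) can dominate the distance unless the features are rescaled first.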