Databricks Databricks-Machine-Learning-Professional - Databricks Certified Machine Learning Professional

Total 60 questions

Question # 1

In a continuous integration and continuous deployment (CI/CD) process for machine learning pipelines, which of the following events commonly triggers the execution of automated testing?

The launch of a new cost-efficient SQL endpoint

CI/CD pipelines are not needed for machine learning pipelines

The arrival of a new feature table in the Feature Store

The launch of a new cost-efficient job cluster

The arrival of a new model version in the MLflow Model Registry

Question # 2

Which of the following is a reason for using Jensen-Shannon (JS) distance over a Kolmogorov-Smirnov (KS) test for numeric feature drift detection?

All of these reasons

JS is not normalized or smoothed

None of these reasons

JS is more robust when working with large datasets

JS does not require any manual threshold or cutoff determinations
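For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of a Jensen-Shannon distance between two binned numeric samples. The bin range, bin count, sample sizes, and helper names are illustrative assumptions, not part of the exam material. With base-2 logarithms the distance is bounded in [0, 1], and it yields a distance rather than a p-value, so a drift cutoff still has to be chosen by the practitioner.

```python
import math
import random


def binned_proportions(values, lo, hi, n_bins):
    """Share of `values` falling in each of `n_bins` equal-width bins on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)  # clamp outliers into the edge bins
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_distance(p, q):
    """Jensen-Shannon distance (square root of the JS divergence, base-2 logs)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing to the KL divergence.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    divergence = (kl(p, m) + kl(q, m)) / 2
    return math.sqrt(max(0.0, divergence))  # guard against tiny negative rounding


# Illustrative data: a baseline sample, a fresh sample from the same
# distribution, and a sample whose mean has shifted.
random.seed(0)
baseline = [random.gauss(0.0, 1.0) for _ in range(5000)]
same = [random.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [random.gauss(1.0, 1.0) for _ in range(5000)]

p = binned_proportions(baseline, -5, 6, 22)
d_same = js_distance(p, binned_proportions(same, -5, 6, 22))
d_shifted = js_distance(p, binned_proportions(shifted, -5, 6, 22))
print(f"same distribution: {d_same:.3f}, shifted mean: {d_shifted:.3f}")
```

Because the result depends only on the binned proportions, the distance does not grow more sensitive simply because more rows are collected, which is the usual concern with p-value-based tests on very large samples.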

Question # 3

Which of the following machine learning model deployment paradigms is the most common for machine learning projects?

On-device

Streaming

Real-time

Batch

None of these deployments

Question # 4

A machine learning engineer is in the process of implementing a concept drift monitoring solution. They are planning to use the following steps:

1. Deploy a model to production and compute predicted values

2. Obtain the observed (actual) label values

3. _____

4. Run a statistical test to determine if there are changes over time

Which of the following should be completed as Step #3?

Obtain the observed (actual) feature values

Measure the latency of the prediction time

Retrain the model

None of these should be completed as Step #3

Compute the evaluation metric using the observed and predicted values
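The monitoring workflow in this question can be sketched in a few lines of pure Python. This is an illustration under stated assumptions: accuracy stands in for the evaluation metric, a two-proportion z-test stands in for the statistical test, and the data and function names are invented for the example. None of these choices come from the exam material.

```python
import math


def accuracy(predicted, observed):
    """Evaluation metric computed from predicted and observed labels."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    hits = sum(1 for p, o in zip(predicted, observed) if p == o)
    return hits / len(observed)


def two_proportion_p_value(acc_a, n_a, acc_b, n_b):
    """Two-sided p-value of a two-proportion z-test on two accuracy values."""
    pooled = (acc_a * n_a + acc_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # both windows are all-correct or all-wrong: no evidence of change
    z = (acc_a - acc_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


# Steps 1 and 2 (illustrative data): predictions from the deployed model and
# the labels observed later, for an earlier window and a recent window.
observed_baseline = [1] * 200
predicted_baseline = [1] * 180 + [0] * 20
observed_current = [1] * 200
predicted_current = [1] * 140 + [0] * 60

# Step 3: compute the evaluation metric per time window.
baseline_acc = accuracy(predicted_baseline, observed_baseline)
current_acc = accuracy(predicted_current, observed_current)

# Step 4: test whether the metric changed between windows.
p_value = two_proportion_p_value(baseline_acc, 200, current_acc, 200)
print(f"baseline={baseline_acc:.2f} current={current_acc:.2f} p={p_value:.2g}")
```

A small p-value here suggests that the relationship between features and labels may have changed, which is the signal a concept drift monitor is looking for.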

Question # 5

A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on customer-level Spark DataFrame spark_df, but it is missing a few of the static features that were used when training the model. The customer_id column is the primary key of spark_df and the training set used when training and logging the model.

Which of the following code blocks can be used to compute predictions for spark_df when the missing feature values can be found in the Feature Store by searching for features by customer_id?

df = fs.get_missing_features(spark_df, model_uri)
fs.score_model(model_uri, df)

fs.score_model(model_uri, spark_df)

df = fs.get_missing_features(spark_df, model_uri)
fs.score_batch(model_uri, df)

df = fs.get_missing_features(spark_df)
fs.score_batch(model_uri, df)

fs.score_batch(model_uri, spark_df)

Explanation:

The correct choice is the single call fs.score_batch(model_uri, spark_df):

Python

# spark_df only needs the lookup key (customer_id) plus any features not held in the Feature Store

predictions_df = fs.score_batch(model_uri, spark_df)

When a model is logged through the Feature Store client with a training set built from feature lookups, it is packaged with metadata recording which feature tables and lookup keys it was trained on. The fs.score_batch method takes a model URI and a Spark DataFrame as arguments. It uses that metadata to retrieve any features missing from the DataFrame by joining the feature tables on the lookup key (here customer_id), applies the model, and returns a new Spark DataFrame that contains the input columns, the retrieved feature values, and a prediction column. The model URI must point to a model that was logged with the Feature Store client.

The other options are incorrect because:

The three options that call fs.get_missing_features: FeatureStoreClient has no get_missing_features method, and no separate lookup step is needed because fs.score_batch performs the lookup itself.

The two options that call fs.score_model: FeatureStoreClient has no score_model method. The batch scoring method is fs.score_batch.

References: Score batch
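The key-based feature lookup that the Feature Store performs during batch scoring can be illustrated without a Databricks workspace. The sketch below is a toy pure-Python analogy, not the FeatureStoreClient API: the dictionary feature table, the column names other than customer_id, the toy model, and the function score_with_lookup are all hypothetical.

```python
# Hypothetical stand-in for a feature table whose primary key is customer_id.
feature_table = {
    1: {"tenure_years": 5.0},
    2: {"tenure_years": 1.0},
}


def toy_model(row):
    """Hypothetical model that needs both a stored and a supplied feature."""
    return 10.0 * row["tenure_years"] + row["basket_size"]


def score_with_lookup(rows, table, key, model):
    """Join stored features onto each row by `key`, then predict."""
    scored = []
    for row in rows:
        stored = table[row[key]]
        # In this sketch, values already present in the input row take
        # precedence over values found in the feature table.
        full = {**stored, **row}
        scored.append({**full, "prediction": model(full)})
    return scored


# Input rows carry the key and one feature, but not tenure_years.
rows = [
    {"customer_id": 1, "basket_size": 3.0},
    {"customer_id": 2, "basket_size": 7.0},
]
scored = score_with_lookup(rows, feature_table, "customer_id", toy_model)
for r in scored:
    print(r)
```

The caller supplies only the primary key and whatever features it already has; the join and the prediction happen in one step.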

Question # 6

Which of the following MLflow operations can be used to automatically calculate and log a Shapley feature importance plot?

mlflow.shap.log_explanation

None of these operations can accomplish the task.

mlflow.shap

mlflow.log_figure

client.log_artifact

Question # 7

A data scientist set up a machine learning pipeline to automatically log a data visualization with each run. They now want to view the visualizations in Databricks.

Which of the following locations in Databricks will show these data visualizations?

The MLflow Model Registry Model page

The Artifacts section of the MLflow Experiment page

Logged data visualizations cannot be viewed in Databricks

The Artifacts section of the MLflow Run page

The Figures section of the MLflow Run page

Question # 8

A machine learning engineer is manually refreshing a model in an existing machine learning pipeline. The pipeline uses the MLflow Model Registry model "project". The machine learning engineer would like to add a new version of the model to "project".

Which of the following MLflow operations can the machine learning engineer use to accomplish this task?

mlflow.register_model

MlflowClient.update_registered_model

mlflow.add_model_version

MlflowClient.get_model_version

The machine learning engineer needs to create an entirely new MLflow Model Registry model

Question # 9

A machine learning engineer wants to move their model version model_version for the MLflow Model Registry model model from the Staging stage to the Production stage using MLflow Client client.

Which of the following code blocks can they use to accomplish the task?