Google Professional-Machine-Learning-Engineer - Google Professional Machine Learning Engineer

Google Professional-Machine-Learning-Engineer Premium Access Download Demo

Page: 3 / 9
Total 285 questions

You developed a Transformer model in TensorFlow to translate text Your training data includes millions of documents in a Cloud Storage bucket. You plan to use distributed training to reduce training time. You need to configure the training job while minimizing the effort required to modify code and to manage the clusters configuration. What should you do?

Create a Vertex Al custom training job with GPU accelerators for the second worker pool Use tf .distribute.MultiWorkerMirroredStrategy for distribution.

Create a Vertex Al custom distributed training job with Reduction Server Use N1 high-memory machine type instances for the first and second pools, and use N1 high-CPU machine type instances for the third worker pool.

Create a training job that uses Cloud TPU VMs Use tf.distribute.TPUStrategy for distribution.

Create a Vertex Al custom training job with a single worker pool of A2 GPU machine type instances Use tf .distribute.MirroredStraregy for distribution.

Question # 22

You have recently used TensorFlow to train a classification model on tabular data You have created a Dataflow pipeline that can transform several terabytes of data into training or prediction datasets consisting of TFRecords. You now need to productionize the model, and you want the predictions to be automatically uploaded to a BigQuery table on a weekly schedule. What should you do?

Import the model into Vertex Al and deploy it to a Vertex Al endpoint On Vertex Al Pipelines create a pipeline that uses the Dataf lowPythonJobop and the Mcdei3archPredictoc components.

Import the model into Vertex Al and deploy it to a Vertex Al endpoint Create a Dataflow pipeline that reuses the data processing logic sends requests to the endpoint and then uploads predictions to a BigQuery table.

Import the model into Vertex Al On Vertex Al Pipelines, create a pipeline that uses the DatafIowPythonJobOp and the ModelBatchPredictOp components.

Import the model into BigQuery Implement the data processing logic in a SQL query On Vertex Al Pipelines create a pipeline that uses the BigqueryQueryJobop and the EigqueryPredictModejobOp components.

Explanation:

Â Vertex AI is a service that allows you to create and train ML models using Google Cloud technologies. You can use Vertex AI to import the model that you trained with TensorFlow and store it in the Vertex AI Model Registry. The Vertex AI Model Registry is a service that allows you to store and manage your ML models on Google Cloud. You can then use Vertex AI Pipelines to create a pipeline that uses the DataflowPythonJobOp and the ModelBatchPredictOp components. The DataflowPythonJobOp component is a component that allows you to run a Dataflow job using a Python script. Dataflow is a service that allows you to create and run scalable and portable data processing pipelines on Google Cloud. You can use the DataflowPythonJobOp component to reuse the data processing logic that you created for transforming the data into TFRecords. The ModelBatchPredictOp component is a component that allows you to run a batch prediction job using a model from the Vertex AI Model Registry. Batch prediction is a type of prediction that provides high-throughput responses to large batches of input data. You can use the ModelBatchPredictOp component to make predictions using the TFRecords from the DataflowPythonJobOp component and the model from the Vertex AI Model Registry. You can also configure the ModelBatchPredictOp component to automatically upload the predictions to a BigQuery table. BigQuery is a service that allows you to store and query large amounts of data in a scalable and cost-effective way. You can use BigQuery to store and analyze the predictions from your model. You can also schedule the pipeline to run on a weekly basis, so that the predictions are updated regularly. By using Vertex AI, Vertex AI Pipelines, Dataflow, and BigQuery, you can productionize the model and upload the predictions to a BigQuery table on a weekly schedule.Â References:

Vertex AI documentation

Vertex AI Pipelines documentation

Dataflow documentation

BigQuery documentation

Preparing for Google Cloud Certification: Machine Learning Engineer Professional Certificate

Question # 23

You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members on your team prefer a codeless interface for building Extract, Transform, Load (ETL) process. Which service should you use?

Dataflow

Dataprep

Apache Flink

Cloud Data Fusion

Question # 24

You developed a custom model by using Vertex Al to forecast the sales of your company s products based on historical transactional data You anticipate changes in the feature distributions and the correlations between the features in the near future You also expect to receive a large volume of prediction requests You plan to use Vertex Al Model Monitoring for drift detection and you want to minimize the cost. What should you do?

Use the features for monitoring Set a monitoring- frequency value that is higher than the default.

Use the features for monitoring Set a prediction-sampling-rare value that is closer to 1 than 0.

Use the features and the feature attributions for monitoring. Set a monitoring-frequency value that is lower than the default.

Use the features and the feature attributions for monitoring Set a prediction-sampling-rate value that is closer to 0 than 1.

Explanation:

The best option for using Vertex AI Model Monitoring for drift detection and minimizing the cost is to use the features and the feature attributions for monitoring, and set a prediction-sampling-rate value that is closer to 0 than 1. This option allows you to leverage the power and flexibility of Google Cloud to detect feature drift in the input predict requests for custom models, and reduce the storage and computation costs of the model monitoring job. Vertex AI Model Monitoring is a service that can track and compare the results of multiple machine learning runs. Vertex AI Model Monitoring can monitor the modelâ€™s prediction input data for feature skew and drift. Feature drift occurs when the feature data distribution in production changes over time. If the original training data is not available, you can enable drift detection to monitor your models for feature drift. Vertex AI Model Monitoring uses TensorFlow Data Validation (TFDV) to calculate the distributions and distance scores for each feature, and compares them with a baseline distribution. The baseline distribution is the statistical distribution of the featureâ€™s values in the training data. If the training data is not available, the baseline distribution is calculated from the first 1000 prediction requests that the model receives. If the distance score for a feature exceeds an alerting threshold that you set, Vertex AI Model Monitoring sends you an email alert. However, if you use a custom model, you can also enable feature attribution monitoring, which can provide more insights into the feature drift. Feature attribution monitoring analyzes the feature attributions, which are the contributions of each feature to the prediction output. Feature attribution monitoring can help you identify the features that have the most impact on the model performance, and the features that have the most significant drift over time.Â Feature attribution monitoring can also help you understand the relationship between the features and the prediction output, and the correlation between the features1. The prediction-sampling-rate is a parameter that determines the percentage of prediction requests that are logged and analyzed by the model monitoring job. Using a lower prediction-sampling-rate can reduce the storage and computation costs of the model monitoring job, but also the quality and validity of the data. Using a lower prediction-sampling-rate can introduce sampling bias and noise into the data, and make the model monitoring job miss some important features or patterns of the data. However, using a higher prediction-sampling-rate can increase the storage and computation costs of the model monitoring job, and also the amount of data that needs to be processed and analyzed.Â Therefore, there is a trade-off between the prediction-sampling-rate and the cost and accuracy of the model monitoring job, and the optimal prediction-sampling-rate depends on the business objective and the data characteristics2. By using the features and the feature attributions for monitoring, and setting a prediction-sampling-rate value that is closer to 0 than 1, you can use Vertex AI Model Monitoring for drift detection and minimize the cost.

The other options are not as good as option D, for the following reasons:

Option A: Using the features for monitoring and setting a monitoring-frequency value that is higher than the default would not enable feature attribution monitoring, and could increase the cost of the model monitoring job. The monitoring-frequency is a parameter that determines how often the model monitoring job analyzes the logged prediction requests and calculates the distributions and distance scores for each feature. Using a higher monitoring-frequency can increase the frequency and timeliness of the model monitoring job, but also the computation costs of the model monitoring job.Â Moreover, using the features for monitoring would not enable feature attribution monitoring, which can provide more insights into the feature drift and the model performance1.

Option B: Using the features for monitoring and setting a prediction-sampling-rate value that is closer to 1 than 0 would not enable feature attribution monitoring, and could increase the cost of the model monitoring job. The prediction-sampling-rate is a parameter that determines the percentage of prediction requests that are logged and analyzed by the model monitoring job. Using a higher prediction-sampling-rate can increase the quality and validity of the data, but also the storage and computation costs of the model monitoring job.Â Moreover, using the features for monitoring would not enable feature attribution monitoring, which can provide more insights into the feature drift and the model performance12.

Option C: Using the features and the feature attributions for monitoring and setting a monitoring-frequency value that is lower than the default would enable feature attribution monitoring, but could reduce the frequency and timeliness of the model monitoring job. The monitoring-frequency is a parameter that determines how often the model monitoring job analyzes the logged prediction requests and calculates the distributions and distance scores for each feature. Using a lower monitoring-frequency can reduce the computation costs of the model monitoring job, but also the frequency and timeliness of the model monitoring job.Â This can make the model monitoring job less responsive and effective in detecting and alerting the feature drift1.

References:

Preparing for Google Cloud Certification: Machine Learning Engineer, Course 3: Production ML Systems, Week 4: Evaluation

Google Cloud Professional Machine Learning Engineer Exam Guide, Section 3: Scaling ML models in production, 3.3 Monitoring ML models in production

Official Google Cloud Certified Professional Machine Learning Engineer Study Guide, Chapter 6: Production ML Systems, Section 6.3: Monitoring ML Models

Using Model Monitoring

Understanding the score threshold slider

Question # 25

You are developing a recommendation engine for an online clothing store. The historical customer transaction data is stored in BigQuery and Cloud Storage. You need to perform exploratory data analysis (EDA), preprocessing and model training. You plan to rerun these EDA, preprocessing, and training steps as you experiment with different types of algorithms. You want to minimize the cost and development effort of running these steps as you experiment. How should you configure the environment?

Create a Vertex Al Workbench user-managed notebook using the default VM instance, and use the %%bigquery magic commands in Jupyter to query the tables.

Create a Vertex Al Workbench managed notebook to browse and query the tables directly from the JupyterLab interface.

Create a Vertex Al Workbench user-managed notebook on a Dataproc Hub. and use the %%bigquery magic commands in Jupyter to query the tables.

Create a Vertex Al Workbench managed notebook on a Dataproc cluster, and use the spark-bigquery-connector to access the tables.

Question # 26

You are training a TensorFlow model on a structured data set with 100 billion records stored in several CSV files. You need to improve the input/output execution performance. What should you do?

Load the data into BigQuery and read the data from BigQuery.

Load the data into Cloud Bigtable, and read the data from Bigtable

Convert the CSV files into shards of TFRecords, and store the data in Cloud Storage

Convert the CSV files into shards of TFRecords, and store the data in the Hadoop Distributed File System (HDFS)

Explanation:

Â The input/output execution performance of a TensorFlow model depends on how efficiently the model can read and process the data from the data source. Reading and processing data from CSV files can be slow and inefficient, especially if the data is large and distributed. Therefore, to improve the input/output execution performance, one should use a more suitable data format and storage system.

One of the best options for improving the input/output execution performance is to convert the CSV files into shards of TFRecords, and store the data in Cloud Storage. TFRecord is a binary data format that can store a sequence of serialized TensorFlow examples. TFRecord has several advantages over CSV, such as:

Faster data loading: TFRecord can be read and processed faster than CSV, as it avoids the overhead of parsing and decoding the text data.Â TFRecord also supports compression and checksums, which can reduce the data size and ensure data integrity1

Better performance: TFRecord can improve the performance of the model, as it allows the model to access the data in a sequential and streaming manner, and leverage the tf.data API to build efficient data pipelines.Â TFRecord also supports sharding and interleaving, which can increase the parallelism and throughput of the data processing2

Easier integration: TFRecord can integrate seamlessly with TensorFlow, as it is the native data format for TensorFlow.Â TFRecord also supports various types of data, such as images, text, audio, and video, and can store the data schema and metadata along with the data3

Cloud Storage is a scalable and reliable object storage service that can store any amount of data. Cloud Storage has several advantages over other storage systems, such as:

High availability: Cloud Storage can provide high availability and durability for the data, as it replicates the data across multiple regions and zones, and supports versioning and lifecycle management.Â Cloud Storage also offers various storage classes, such as Standard, Nearline, Coldline, and Archive, to meet different performance and cost requirements4

Low latency: Cloud Storage can provide low latency and high bandwidth for the data, as it supports HTTP and HTTPS protocols, and integrates with other Google Cloud services, such as AI Platform, Dataflow, and BigQuery.Â Cloud Storage also supports resumable uploads and downloads, and parallel composite uploads, which can improve the data transfer speed and reliability5

Easy access: Cloud Storage can provide easy access and management for the data, as it supports various tools and libraries, such as gsutil, Cloud Console, and Cloud Storage Client Libraries. Cloud Storage also supports fine-grained access control and encryption, which can ensure the data security and privacy.

The other options are not as effective or feasible. Loading the data into BigQuery and reading the data from BigQuery is not recommended, as BigQuery is mainly designed for analytical queries on large-scale data, and does not support streaming or real-time data processing. Loading the data into Cloud Bigtable and reading the data from Bigtable is not ideal, as Cloud Bigtable is mainly designed for low-latency and high-throughput key-value operations on sparse and wide tables, and does not support complex data types or schemas. Converting the CSV files into shards of TFRecords and storing the data in the Hadoop Distributed File System (HDFS) is not optimal, as HDFS is not natively supported by TensorFlow, and requires additional configuration and dependencies, such as Hadoop, Spark, or Beam.

References:Â 1:Â TFRecord and tf.ExampleÂ 2:Â Better performance with the tf.data APIÂ 3:Â TensorFlow Data ValidationÂ 4:Â Cloud Storage overviewÂ 5:Â PerformanceÂ : [How-to guides]

Question # 27

You trained a model, packaged it with a custom Docker container for serving, and deployed it to Vertex Al Model Registry. When you submit a batch prediction job, it fails with this error "Error model server never became ready Please validate that your model file or container configuration are valid. There are no additional errors in the logs What should you do?

Add a logging configuration to your application to emit logs to Cloud Logging.

Change the HTTP port in your model's configuration to the default value of 8080

Change the health Route value in your models configuration to /heal thcheck.

Pull the Docker image locally and use the decker run command to launch it locally. Use the docker logs command to explore the error logs.

Question # 28

Your company stores a large number of audio files of phone calls made to your customer call center in an on-premises database. Each audio file is in wav format and is approximately 5 minutes long. You need to analyze these audio files for customer sentiment. You plan to use the Speech-to-Text API. You want to use the most efficient approach. What should you do?

1 Upload the audio files to Cloud Storage

2. Call the speech: Iongrunningrecognize API endpoint to generate transcriptions

3. Call the predict method of an AutoML sentiment analysis model to analyze the transcriptions

1 Upload the audio files to Cloud Storage

2 Call the speech: Iongrunningrecognize API endpoint to generate transcriptions.

3 Create a Cloud Function that calls the Natural Language API by using the analyzesentiment method

1 Iterate over your local Tiles in Python

2. Use the Speech-to-Text Python library to create a speech.RecognitionAudio object and set the content to the audio file data

3. Call the speech: recognize API endpoint to generate transcriptions

4. Call the predict method of an AutoML sentiment analysis model to analyze the transcriptions

1 Iterate over your local files in Python

2 Use the Speech-to-Text Python Library to create a speech.RecognitionAudio object, and set the content to the audio file data

3. Call the speech: lengrunningrecognize API endpoint to generate transcriptions

4 Call the Natural Language API by using the analyzesenriment method

Question # 29

You have deployed a scikit-learn model to a Vertex Al endpoint using a custom model server. You enabled auto scaling; however, the deployed model fails to scale beyond one replica, which led to dropped requests. You notice that CPU utilization remains low even during periods of high load. What should you do?

Attach a GPU to the prediction nodes.

Increase the number of workers in your model server.

Schedule scaling of the nodes to match expected demand.

Increase the minReplicaCount in your DeployedModel configuration.

Explanation:

Auto scaling is a feature that allows you to automatically adjust the number of prediction nodes based on the traffic and load of your deployed model1.Â However, auto scaling depends on the CPU utilization of your prediction nodes, which is the percentage of CPU resources used by your model server1.Â If your CPU utilization is low, even during periods of high load, it means that your model server is not fully utilizing the available CPU resources, and thus auto scaling will not trigger more replicas2.

One possible reason for low CPU utilization is that your model server is using a single worker process to handle prediction requests3.Â A worker process is a subprocess that runs your model code and handles prediction requests3.Â If you have only one worker process, it can only handle one request at a time, which can lead to dropped requests when the traffic is high3.Â To increase the CPU utilization and the throughput of your model server, you can increase the number of worker processes, which will allow your model server to handle multiple requests in parallel3.

To increase the number of workers in your model server, you need to modify your custom model server code and use theÂ --workersÂ flag to specify the number of worker processes you want to use3. For example, if you are using a Gunicorn server, you can use the following command to start your model server with four worker processes:

gunicorn --bind :$PORT --workers 4 --threads 1 --timeout 60 main:app

By increasing the number of workers in your model server, you can increase the CPU utilization of your prediction nodes, and thus enable auto scaling to scale beyond one replica.

The other options are not suitable for your scenario, because they either do not address the root cause of low CPU utilization, such as attaching a GPU or scheduling scaling, or they do not enable auto scaling, such as increasing the minReplicaCount, which is a fixed number of nodes that will always run regardless of the traffic1.

References:

Scaling prediction nodes | Vertex AI | Google Cloud

Troubleshooting | Vertex AI | Google Cloud

Using a custom prediction routine with online prediction | Vertex AI | Google Cloud

Question # 30

You are building an ML model to detect anomalies in real-time sensor data. You will use Pub/Sub to handle incoming requests. You want to store the results for analytics and visualization. How should you configure the pipeline?

1 = Dataflow, 2 - Al Platform, 3 = BigQuery

1 = DataProc, 2 = AutoML, 3 = Cloud Bigtable

1 = BigQuery, 2 = AutoML, 3 = Cloud Functions

1 = BigQuery, 2 = Al Platform, 3 = Cloud Storage

Weekend Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: xmas50

Google Professional-Machine-Learning-Engineer - Google Professional Machine Learning Engineer

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation:

The Answer Is:

Explanation: