Amazon Web Services MLA-C01 - AWS Certified Machine Learning Engineer - Associate

Total 241 questions

An ML engineer needs to use an ML model to predict the price of apartments in a specific location.

Which metric should the ML engineer use to evaluate the model's performance?

A. Accuracy
B. Area Under the ROC Curve (AUC)
C. F1 score
D. Mean absolute error (MAE)
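
For reference on the metrics listed above: price prediction is a regression task, and MAE is the listed metric that is defined on continuous values, while accuracy, AUC, and F1 score apply to classification. A minimal sketch of the MAE calculation follows, assuming made-up apartment prices; the function name and the data are hypothetical and serve only as an illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical apartment prices in thousands; not taken from any real dataset.
actual_prices = [250, 310, 180, 420]
predicted_prices = [240, 330, 200, 400]

mae = mean_absolute_error(actual_prices, predicted_prices)
print(mae)  # absolute errors are 10, 20, 20, 20, so the mean is 17.5
```

The result is expressed in the same unit as the target (here, thousands of currency units), which makes it straightforward to interpret.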

A company is building a near real-time data analytics application to detect anomalies and failures for industrial equipment. The company has thousands of IoT sensors that send data every 60 seconds. When new versions of the application are released, the company wants to ensure that application code bugs do not prevent the application from running.

Which solution will meet these requirements?

A. Use Amazon Managed Service for Apache Flink with the system rollback capability enabled to build the data analytics application.
B. Use Amazon Managed Service for Apache Flink with manual rollback when an error occurs to build the data analytics application.
C. Use Amazon Data Firehose to deliver real-time streaming data programmatically for the data analytics application. Pause the stream when a new version of the application is released and resume the stream after the application is deployed.
D. Use Amazon Data Firehose to deliver data to Amazon EC2 instances across two Availability Zones for the data analytics application.

An ML engineer is using Amazon SageMaker Canvas to build a custom ML model from an imported dataset. The model must make continuous numeric predictions based on 10 years of data.

Which metric should the ML engineer use to evaluate the model’s performance?

A. Accuracy
B. InferenceLatency
C. Area Under the ROC Curve (AUC)
D. Root Mean Square Error (RMSE)
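
As background for the options above, RMSE is the square root of the mean squared difference between predicted and actual numeric values, so large individual errors weigh more heavily than small ones. The sketch below assumes a small hypothetical series with one large miss; the function name and values are illustrative only.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean of squared differences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical values: three exact predictions and one that is off by 8.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [10.0, 20.0, 30.0, 48.0]

rmse = root_mean_squared_error(actual, predicted)
print(rmse)  # squared errors sum to 64, mean is 16, square root is 4.0
```

The average absolute error for the same data is 2.0, so the single large miss roughly doubles the reported error under RMSE.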

A healthcare analytics company wants to segment patients into groups that have similar risk factors to develop personalized treatment plans. The company has a dataset that includes patient health records, medication history, and lifestyle changes. The company must identify the appropriate algorithm to determine the number of groups by using hyperparameters.

Which solution will meet these requirements?

A. Use the Amazon SageMaker AI XGBoost algorithm. Set max_depth to control tree complexity for risk groups.
B. Use the Amazon SageMaker k-means clustering algorithm. Set k to specify the number of clusters.
C. Use the Amazon SageMaker AI DeepAR algorithm. Set epochs to determine the number of training iterations for risk groups.
D. Use the Amazon SageMaker AI Random Cut Forest (RCF) algorithm. Set a contamination hyperparameter for risk anomaly detection.
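
To illustrate what it means for a hyperparameter to fix the number of groups, the sketch below runs a plain k-means loop (Lloyd's algorithm) on one-dimensional data. It is not the SageMaker implementation: the function name, the deterministic initialization, and the risk scores are all assumptions made for this example.

```python
def k_means_1d(points, k, iterations=10):
    """Cluster 1-D points into k groups; k is the number of clusters."""
    ordered = sorted(points)
    # Assumption: deterministic start from evenly spaced sorted points.
    # Production implementations typically use random or k-means++ seeding.
    step = max(k - 1, 1)
    centroids = [ordered[i * (len(ordered) - 1) // step] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical patient risk scores forming three visible groups.
scores = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 15.0, 15.5, 16.0]
centroids, clusters = k_means_1d(scores, k=3)
print(centroids)  # [1.5, 8.5, 15.5]
```

Changing k changes how many groups come back, which is the behavior the question's wording about determining the number of groups through hyperparameters refers to.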

A company has used Amazon SageMaker to deploy a predictive ML model in production. The company is using SageMaker Model Monitor on the model. After a model update, an ML engineer notices data quality issues in the Model Monitor checks.

What should the ML engineer do to mitigate the data quality issues that Model Monitor has identified?

A. Adjust the model's parameters and hyperparameters.
B. Initiate a manual Model Monitor job that uses the most recent production data.
C. Create a new baseline from the latest dataset. Update Model Monitor to use the new baseline for evaluations.
D. Include additional data in the existing training set for the model. Retrain and redeploy the model.

A company is using Amazon SageMaker AI to develop a credit risk assessment model. During model validation, the company finds that the model achieves 82% accuracy on the validation data. However, the model achieved 99% accuracy on the training data. The company needs to address the model accuracy issue before deployment.

Which solution will meet this requirement?

A. Add more dense layers to increase model complexity. Implement batch normalization. Use early stopping during training.
B. Implement dropout layers. Use L1 or L2 regularization. Perform k-fold cross-validation.
C. Use principal component analysis (PCA) to reduce the feature dimensionality. Decrease model layers. Implement cross-entropy loss functions.
D. Augment the training dataset. Remove duplicate records from the training dataset. Implement stratified sampling.
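
A 17-point gap between training accuracy (99%) and validation accuracy (82%) is the usual signature of overfitting. L2 regularization, one of the techniques named in the options, adds a penalty on weight magnitude to the training loss. The sketch below shows that effect on a one-feature linear model rather than a neural network; the function name and data are hypothetical.

```python
def ridge_weight(xs, ys, l2_penalty):
    """Closed-form weight w for y ~ w * x that minimizes
    sum((y - w * x) ** 2) + l2_penalty * w ** 2."""
    numerator = sum(x * y for x, y in zip(xs, ys))
    denominator = sum(x * x for x in xs) + l2_penalty
    return numerator / denominator


# Hypothetical data that lies exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unregularized = ridge_weight(xs, ys, l2_penalty=0.0)
regularized = ridge_weight(xs, ys, l2_penalty=14.0)
print(unregularized, regularized)  # 2.0 1.0
```

The penalty pulls the weight toward zero, trading some fit on the training data for a simpler model that is less prone to memorizing noise.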

A company's ML engineer has deployed an ML model for sentiment analysis to an Amazon SageMaker endpoint. The ML engineer needs to explain to company stakeholders how the model makes predictions.

Which solution will provide an explanation for the model's predictions?

A. Use SageMaker Model Monitor on the deployed model.
B. Use SageMaker Clarify on the deployed model.
C. Show the distribution of inferences from A/B testing in Amazon CloudWatch.
D. Add a shadow endpoint. Analyze prediction differences on samples.

A company collects customer data every day. The company stores the data as compressed files in an Amazon S3 bucket that is partitioned by date. Every month, analysts download the data, process the data to check the data quality, and then upload the data to Amazon QuickSight dashboards.

An ML engineer needs to implement a solution to automatically check the data quality before the data is sent to QuickSight.

Which solution will meet these requirements with the LEAST operational overhead?

A. Run an AWS Glue crawler every month to update the AWS Glue Data Catalog. Use AWS Glue Data Quality rules to check the data quality.
B. Use an AWS Glue trigger to run an AWS Glue crawler every month to update the AWS Glue Data Catalog. Create an AWS Glue job that loads the data into a PySpark DataFrame. Configure the job to apply custom functions and to evaluate the data quality.
C. Run Python scripts on an AWS Lambda function every month to evaluate data quality. Configure the S3 bucket to invoke the Lambda function when objects are added to the S3 bucket.
D. Configure the S3 bucket to send event notifications to an Amazon Simple Queue Service (Amazon SQS) queue when objects are uploaded. Use Amazon CloudWatch insights every month for the SQS queue to evaluate the data quality.

A company runs an ML model on Amazon SageMaker AI. The company uses an automatic process that makes API calls to create training jobs for the model. The company has new compliance rules that prohibit the collection of aggregated metadata from training jobs.

Which solution will prevent SageMaker AI from collecting metadata from the training jobs?

A. Opt out of metadata tracking for any training job that is submitted.
B. Ensure that training jobs are running in a private subnet in a custom VPC.
C. Encrypt the training data with an AWS Key Management Service (AWS KMS) customer managed key.
D. Reconfigure the training jobs to use only AWS Nitro instances.

An ML engineer is setting up an Amazon SageMaker AI pipeline for an ML model. The pipeline must automatically initiate a retraining job if any data drift is detected.

How should the ML engineer set up the pipeline to meet this requirement?

A. Use an AWS Glue crawler and an AWS Glue ETL job to detect data drift. Use AWS Glue triggers to automate the retraining job.
B. Use Amazon Managed Service for Apache Flink to detect data drift. Use an AWS Lambda function to automate the retraining job.
C. Use SageMaker Model Monitor to detect data drift. Use an AWS Lambda function to automate the retraining job.
D. Use Amazon QuickSight anomaly detection to detect data drift. Use an AWS Step Functions workflow to automate the retraining job.