Amazon Web Services MLA-C01 - AWS Certified Machine Learning Engineer - Associate

A company has deployed an XGBoost prediction model in production to predict if a customer is likely to cancel a subscription. The company uses Amazon SageMaker Model Monitor to detect deviations in the F1 score.

During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After several months of no change, the model's F1 score decreases significantly.

What could be the reason for the reduced F1 score?

A.

Concept drift occurred in the underlying customer data that was used for predictions.

B.

The model was not sufficiently complex to capture all the patterns in the original baseline data.

C.

The original baseline data had a data quality issue of missing values.

D.

Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.
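
For reference, the F1 score in this question is the harmonic mean of precision and recall. The sketch below is a minimal illustration of how a recent F1 value could be compared against a recorded baseline threshold. It is not how SageMaker Model Monitor is implemented; the labels, the predictions, and the 0.8 threshold are hypothetical values chosen for the example.

```python
def f1_score(y_true, y_pred, positive=1):
    """Compute F1 for the positive class from paired labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def violates_baseline(current_f1, baseline_threshold):
    """Return True when the current F1 has fallen below the baseline threshold."""
    return current_f1 < baseline_threshold


# Hypothetical churn labels (1 = cancelled) and two sets of predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
baseline_pred = [1, 1, 1, 0, 0, 0, 1, 0]   # predictions at baseline time
recent_pred = [1, 0, 0, 0, 1, 0, 1, 1]     # predictions months later

BASELINE_THRESHOLD = 0.8  # assumed value, not taken from the question

baseline_f1 = f1_score(y_true, baseline_pred)
recent_f1 = f1_score(y_true, recent_pred)
alert = violates_baseline(recent_f1, BASELINE_THRESHOLD)
```

Because the model itself has not changed between the two measurements, a drop like this points at the data the model sees rather than at the model's original capacity.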

A company uses a batching solution to process daily analytics. The company wants to provide near real-time updates, use open-source technology, and avoid managing or scaling infrastructure.

Which solution will meet these requirements?

A.

Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless clusters.

B.

Create Amazon MSK Provisioned clusters.

C.

Create Amazon Kinesis Data Streams with Application Auto Scaling.

D.

Create self-hosted Apache Flink applications on Amazon EC2.

A company uses Amazon SageMaker for its ML workloads. The company's ML engineer receives a 50 MB Apache Parquet data file to build a fraud detection model. The file includes several correlated columns that are not required.

What should the ML engineer do to drop the unnecessary columns in the file with the LEAST effort?

A.

Download the file to a local workstation. Perform one-hot encoding by using a custom Python script.

B.

Create an Apache Spark job that uses a custom processing script on Amazon EMR.

C.

Create a SageMaker processing job by calling the SageMaker Python SDK.

D.

Create a data flow in SageMaker Data Wrangler. Configure a transform step.

Case Study

A company is building a web-based AI application by using Amazon SageMaker. The application will provide the following capabilities and features: ML experimentation, training, a central model registry, model deployment, and model monitoring.

The application must ensure secure and isolated use of training data during the ML lifecycle. The training data is stored in Amazon S3.

The company must implement a manual approval-based workflow to ensure that only approved models can be deployed to production endpoints.

Which solution will meet this requirement?

A.

Use SageMaker Experiments to facilitate the approval process during model registration.

B.

Use SageMaker ML Lineage Tracking on the central model registry. Create tracking entities for the approval process.

C.

Use SageMaker Model Monitor to evaluate the performance of the model and to manage the approval.

D.

Use SageMaker Pipelines. When a model version is registered, use the AWS SDK to change the approval status to "Approved."
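
Option D mentions changing the approval status through the AWS SDK. As a rough sketch of what that call might look like in Python, the function below passes the `ModelPackageArn`, `ModelApprovalStatus`, and `ApprovalDescription` parameters to the SageMaker `update_model_package` operation. The ARN, the approval note, and the `RecordingClient` stand-in are hypothetical; the stand-in exists only so the example runs without AWS credentials, and in practice the client would come from `boto3.client("sagemaker")`.

```python
def approve_model_version(sm_client, model_package_arn,
                          note="Approved after manual review"):
    """Mark a registered model version as Approved in the model registry."""
    return sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
        ApprovalDescription=note,
    )


class RecordingClient:
    """Hypothetical offline stand-in for boto3.client("sagemaker")."""

    def __init__(self):
        self.calls = []

    def update_model_package(self, **kwargs):
        # Record the request instead of calling AWS.
        self.calls.append(kwargs)
        return {"ModelPackageArn": kwargs["ModelPackageArn"]}


# Hypothetical ARN used purely for illustration.
arn = "arn:aws:sagemaker:us-east-1:111122223333:model-package/example-group/1"
client = RecordingClient()
response = approve_model_version(client, arn)
```

A deployment step can then be gated on the status, so that only versions a reviewer has moved to "Approved" reach a production endpoint.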

An ML engineer needs to process thousands of existing CSV objects and new CSV objects that are uploaded. The CSV objects are stored in a central Amazon S3 bucket and have the same number of columns. One of the columns is a transaction date. The ML engineer must query the data based on the transaction date.

Which solution will meet these requirements with the LEAST operational overhead?

A.

Use an Amazon Athena CREATE TABLE AS SELECT (CTAS) statement to create a table based on the transaction date from data in the central S3 bucket. Query the objects from the table.

B.

Create a new S3 bucket for processed data. Set up S3 replication from the central S3 bucket to the new S3 bucket. Use S3 Object Lambda to query the objects based on transaction date.

C.

Create a new S3 bucket for processed data. Use AWS Glue for Apache Spark to create a job to query the CSV objects based on transaction date. Configure the job to store the results in the new S3 bucket. Query the objects from the new S3 bucket.

D.

Create a new S3 bucket for processed data. Use Amazon Data Firehose to transfer the data from the central S3 bucket to the new S3 bucket. Configure Firehose to run an AWS Lambda function to query the data based on transaction date.

A company needs to host a custom ML model to perform forecast analysis. The forecast analysis will occur with predictable and sustained load during the same 2-hour period every day.

Multiple invocations during the analysis period will require quick responses. The company needs AWS to manage the underlying infrastructure and any auto scaling activities.

Which solution will meet these requirements?

A.

Schedule an Amazon SageMaker batch transform job by using AWS Lambda.

B.

Configure an Auto Scaling group of Amazon EC2 instances to use scheduled scaling.

C.

Use Amazon SageMaker Serverless Inference with provisioned concurrency.

D.

Run the model on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster on Amazon EC2 with pod auto scaling.

A company has an ML model that generates text descriptions based on images that customers upload to the company's website. The images can be up to 50 MB in total size.

An ML engineer decides to store the images in an Amazon S3 bucket. The ML engineer must implement a processing solution that can scale to accommodate changes in demand.

Which solution will meet these requirements with the LEAST operational overhead?

A.

Create an Amazon SageMaker batch transform job to process all the images in the S3 bucket.

B.

Create an Amazon SageMaker Asynchronous Inference endpoint and a scaling policy. Run a script to make an inference request for each image.

C.

Create an Amazon Elastic Kubernetes Service (Amazon EKS) cluster that uses Karpenter for auto scaling. Host the model on the EKS cluster. Run a script to make an inference request for each image.

D.

Create an AWS Batch job that uses an Amazon Elastic Container Service (Amazon ECS) cluster. Specify a list of images to process for each AWS Batch job.

A company has an application that uses different APIs to generate embeddings for input text. The company needs to implement a solution to automatically rotate the API tokens every 3 months.

Which solution will meet this requirement?

A.

Store the tokens in AWS Secrets Manager. Create an AWS Lambda function to perform the rotation.

B.

Store the tokens in AWS Systems Manager Parameter Store. Create an AWS Lambda function to perform the rotation.

C.

Store the tokens in AWS Key Management Service (AWS KMS). Use an AWS managed key to perform the rotation.

D.

Store the tokens in AWS Key Management Service (AWS KMS). Use an AWS owned key to perform the rotation.
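
To make the rotation options more concrete, here is a hedged sketch of how a rotation schedule could be attached to a secret with the Secrets Manager `rotate_secret` operation, which accepts a rotation Lambda ARN and a `RotationRules` interval in days. The 90-day interval is an assumption that approximates "every 3 months"; the secret name, the Lambda ARN, and the `RecordingSecretsClient` stand-in are hypothetical and exist only so the example runs offline.

```python
def schedule_rotation(secrets_client, secret_id, rotation_lambda_arn, days=90):
    """Attach a rotation Lambda and an automatic rotation interval to a secret."""
    return secrets_client.rotate_secret(
        SecretId=secret_id,
        RotationLambdaARN=rotation_lambda_arn,
        RotationRules={"AutomaticallyAfterDays": days},
    )


class RecordingSecretsClient:
    """Hypothetical offline stand-in for boto3.client("secretsmanager")."""

    def __init__(self):
        self.calls = []

    def rotate_secret(self, **kwargs):
        # Record the request instead of calling AWS.
        self.calls.append(kwargs)
        return {"Name": kwargs["SecretId"]}


# Hypothetical identifiers used purely for illustration.
secret_name = "embedding-api-token"
lambda_arn = "arn:aws:lambda:us-east-1:111122223333:function:rotate-api-token"

secrets = RecordingSecretsClient()
result = schedule_rotation(secrets, secret_name, lambda_arn)
```

The rotation logic for a third-party API token lives in the Lambda function itself, since the service cannot know how to request a new token from an external provider.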

An ML engineer at a credit card company built and deployed an ML model by using Amazon SageMaker AI. The model was trained on transaction data that contained very few fraudulent transactions. After deployment, the model is underperforming.

What should the ML engineer do to improve the model’s performance?

A.

Retrain the model with a different SageMaker built-in algorithm.

B.

Use random undersampling to reduce the majority class and retrain the model.

C.

Use Synthetic Minority Oversampling Technique (SMOTE) to generate synthetic minority samples and retrain the model.

D.

Use random oversampling to duplicate minority samples and retrain the model.
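
The difference between options C and D can be illustrated with a small pure-Python sketch. Random oversampling copies existing minority rows, while a SMOTE-style method creates new points by interpolating between a minority sample and one of its nearest minority neighbours. This is a simplified illustration under stated assumptions, not the reference SMOTE implementation; the feature values, `k`, and the seed are hypothetical.

```python
import random


def random_oversample(minority, n_new, seed=0):
    """Duplicate existing minority samples chosen at random."""
    rng = random.Random(seed)
    return [rng.choice(minority) for _ in range(n_new)]


def smote_like(minority, n_new, k=2, seed=0):
    """Create synthetic samples on the segment between a minority point
    and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-feature fraud samples (the minority class).
fraud = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]

duplicates = random_oversample(fraud, n_new=6)
synthetic = smote_like(fraud, n_new=6)
```

Every element of `duplicates` is an exact copy of an original row, whereas the points in `synthetic` fall between existing fraud samples, which gives the retrained model more varied minority examples to learn from.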

An ML model is deployed in production. The model has performed well and has met its metric thresholds for months.

An ML engineer who is monitoring the model observes a sudden degradation. The performance metrics of the model are now below the thresholds.

What could be the cause of the performance degradation?

A.

Lack of training data

B.

Drift in production data distribution

C.

Compute resource constraints

D.

Model overfitting
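
One common way to quantify a shift in production data distribution is the Population Stability Index (PSI), which compares the share of values falling into each bin at training time and in production. The sketch below is a generic illustration of that technique and is not taken from the question or from any AWS service; the feature values and bin edges are hypothetical.

```python
import math


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between two samples over shared bin edges."""

    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of edges the value has reached or passed.
            idx = sum(1 for e in edges if v >= e)
            counts[idx] += 1
        total = len(values)
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / total, eps) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical values of one feature at training time and in production.
training = [10, 12, 15, 18, 20, 22, 25, 28, 30, 35]
production = [30, 32, 35, 38, 40, 42, 45, 48, 50, 55]
edges = [20, 30, 40]

no_shift = psi(training, training, edges)
shift = psi(training, production, edges)
```

A PSI of zero means the binned distributions match, and larger values indicate a larger shift. A model that met its thresholds for months and then degrades suddenly is a typical situation in which a check like this would be run on the input features.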