Cloud Tech Exam

#11 Single Choice

Case study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from
an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies.
The algorithm is not capturing all the desired underlying patterns in the data.
After the data is aggregated, the ML engineer must implement a solution to automatically detect anomalies in the data and to visualize the result.
Which solution will meet these requirements?

A.

Use Amazon Athena to automatically detect the anomalies and to visualize the result.

B.

Use Amazon Redshift Spectrum to automatically detect the anomalies. Use Amazon QuickSight to visualize the result.

C.

Use Amazon SageMaker Data Wrangler to automatically detect the anomalies and to visualize the result. (Most Voted)

D.

Use AWS Batch to automatically detect the anomalies. Use Amazon QuickSight to visualize the result.

#12 Single Choice

Case study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from
an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies.
The algorithm is not capturing all the desired underlying patterns in the data.
The training dataset includes categorical data and numerical data. The ML engineer must prepare the training dataset to maximize the accuracy
of the model.
Which action will meet this requirement with the LEAST operational overhead?

A.

Use AWS Glue to transform the categorical data into numerical data.

B.

Use AWS Glue to transform the numerical data into categorical data.

C.

Use Amazon SageMaker Data Wrangler to transform the categorical data into numerical data. (Most Voted)

D.

Use Amazon SageMaker Data Wrangler to transform the numerical data into categorical data.
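
Note: as a rough illustration of what "transform the categorical data into numerical data" can mean, the sketch below one-hot encodes a single column in plain Python. It is not how SageMaker Data Wrangler or AWS Glue implement the transform; the function name `one_hot_encode` and the sample payment-channel values are hypothetical.

```python
def one_hot_encode(values):
    """Map each categorical value to a 0/1 vector with one slot per category."""
    categories = sorted(set(values))                  # stable column order
    index = {cat: i for i, cat in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1                         # mark the matching category
        encoded.append(row)
    return categories, encoded


# Hypothetical categorical column from a transaction log
channels = ["card", "wire", "card", "cash"]
categories, encoded = one_hot_encode(channels)
```

Each row of `encoded` has exactly one 1, so an algorithm that expects numeric input can consume the column without treating the categories as ordered.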

#13 Single Choice

Case study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from
an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies.
The algorithm is not capturing all the desired underlying patterns in the data.
Before the ML engineer trains the model, the ML engineer must resolve the issue of the imbalanced data.
Which solution will meet this requirement with the LEAST operational effort?

A.

Use Amazon Athena to identify patterns that contribute to the imbalance. Adjust the dataset accordingly.

B.

Use Amazon SageMaker Studio Classic built-in algorithms to process the imbalanced dataset.

C.

Use AWS Glue DataBrew built-in features to oversample the minority class.

D.

Use the Amazon SageMaker Data Wrangler balance data operation to oversample the minority class. (Most Voted)
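
Note: oversampling the minority class can be pictured with the minimal sketch below, which duplicates randomly chosen minority rows until every class is as large as the biggest one. This is plain random oversampling written for illustration only, not the Data Wrangler implementation; the function name `random_oversample`, the `is_fraud` field, and the sample rows are assumptions.

```python
import random
from collections import Counter


def random_oversample(rows, label_key, seed=0):
    """Duplicate random rows of smaller classes until all classes match the largest."""
    rng = random.Random(seed)                  # fixed seed for repeatable output
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap (k is 0 for the largest class)
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical imbalanced dataset: 8 legitimate rows, 2 fraudulent rows
rows = [{"amount": 10 * i, "is_fraud": 0} for i in range(8)]
rows += [{"amount": 900, "is_fraud": 1}, {"amount": 950, "is_fraud": 1}]

balanced = random_oversample(rows, "is_fraud")
counts = Counter(row["is_fraud"] for row in balanced)
```

After the call, `counts` holds the same number of rows for both classes. Because the added rows are exact duplicates, this approach adds no new information, which is why synthetic methods are often considered as an alternative.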

#14 Single Choice

Case study -
An ML engineer is developing a fraud detection model on AWS. The training dataset includes transaction logs, customer profiles, and tables from
an on-premises MySQL database. The transaction logs and customer profiles are stored in Amazon S3.
The dataset has a class imbalance that affects the learning of the model's algorithm. Additionally, many of the features have interdependencies.
The algorithm is not capturing all the desired underlying patterns in the data.
The ML engineer needs to use an Amazon SageMaker built-in algorithm to train the model.
Which algorithm should the ML engineer use to meet this requirement?

A.

LightGBM (Most Voted)

B.

Linear learner

C.

K-means clustering

D.

Neural Topic Model (NTM)

#15 Single Choice

A company has deployed an XGBoost prediction model in production to predict if a customer is likely to cancel a subscription. The company uses
Amazon SageMaker Model Monitor to detect deviations in the F1 score.
During a baseline analysis of model quality, the company recorded a threshold for the F1 score. After several months of no change, the model's F1
score decreases significantly.
What could be the reason for the reduced F1 score?

A.

Concept drift occurred in the underlying customer data that was used for predictions. (Most Voted)

B.

The model was not sufficiently complex to capture all the patterns in the original baseline data.

C.

The original baseline data had a data quality issue of missing values.

D.

Incorrect ground truth labels were provided to Model Monitor during the calculation of the baseline.
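
Note: for reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from binary labels and compares the result with a recorded baseline threshold. It only illustrates the metric and is not the SageMaker Model Monitor API; the function name `f1_score`, the sample labels, and the threshold value are hypothetical.

```python
def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0                      # no true positives: precision and recall are 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical threshold recorded during the baseline analysis
baseline_threshold = 0.8

# Hypothetical ground truth and predictions collected months later
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]

current_f1 = f1_score(y_true, y_pred)
violates_baseline = current_f1 < baseline_threshold
```

If the relationship between the input features and the label changes over time, predictions that were once correct start to miss, so the score computed on recent data falls below the baseline even though the model itself has not changed.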

AWS Certified Machine Learning Engineer - Associate MLA-C01
