
Google Professional-Data-Engineer - Google Professional Data Engineer Exam


What is the recommended action to switch between SSD and HDD storage for your Google Cloud Bigtable instance?

A.

Create a third instance and sync the data from the two storage types via batch jobs

B.

Export the data from the existing instance and import the data into a new instance

C.

Run parallel instances where one is HDD and the other is SSD

D.

The selection is final and you must continue using the same storage type
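For context, a Cloud Bigtable cluster's storage type (SSD or HDD) is fixed when the instance is created, so switching storage types means standing up a new instance and moving the data into it. Below is a minimal sketch of creating the replacement instance with HDD storage using the google-cloud-bigtable Python client; the project, instance, cluster, and zone names are placeholders, not values from the question.

```python
# Sketch: creating a replacement Cloud Bigtable instance with HDD storage.
# Project, instance, cluster, and zone names are placeholders.
from google.cloud import bigtable
from google.cloud.bigtable import enums

client = bigtable.Client(project="my-project", admin=True)

instance = client.instance("analytics-hdd", display_name="Analytics (HDD)")
cluster = instance.cluster(
    "analytics-hdd-c1",
    location_id="us-central1-b",
    serve_nodes=3,
    default_storage_type=enums.StorageType.HDD,  # storage type is fixed per cluster
)

# Long-running operation; the data then has to be exported from the old
# instance and imported into this one (e.g. via export/import jobs).
operation = instance.create(clusters=[cluster])
operation.result(timeout=300)
```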

If you're running a performance test that depends on Cloud Bigtable, all but one of the choices below are recommended steps. Which one is NOT a recommended step to follow?

A.

Do not use a production instance.

B.

Run your test for at least 10 minutes.

C.

Before you test, run a heavy pre-test for several minutes.

D.

Use at least 300 GB of data.

Which of the following job types are supported by Cloud Dataproc? (Select 3 answers)

A.

Hive

B.

Pig

C.

YARN

D.

Spark
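For reference, Cloud Dataproc accepts Hadoop-ecosystem job types such as Hive, Pig, and Spark (plus Hadoop, PySpark, and Spark SQL); YARN is the cluster's resource manager, not a submittable job type. A minimal sketch of submitting a PySpark job with the google-cloud-dataproc Python client follows; the project, region, cluster, and Cloud Storage paths are placeholders.

```python
# Sketch: submitting a PySpark job to an existing Dataproc cluster.
# hive_job / pig_job / spark_job fields on the job message work analogously.
from google.cloud import dataproc_v1

region = "us-central1"
job_client = dataproc_v1.JobControllerClient(
    client_options={"api_endpoint": f"{region}-dataproc.googleapis.com:443"}
)

job = {
    "placement": {"cluster_name": "my-cluster"},
    "pyspark_job": {"main_python_file_uri": "gs://my-bucket/wordcount.py"},
}

operation = job_client.submit_job_as_operation(
    request={"project_id": "my-project", "region": region, "job": job}
)
result = operation.result()  # blocks until the job finishes
print(result.status.state)
```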

Which of the following is NOT a valid use case to select HDD (hard disk drives) as the storage for Google Cloud Bigtable?

A.

You expect to store at least 10 TB of data.

B.

You will mostly run batch workloads with scans and writes, rather than frequently executing random reads of a small number of rows.

C.

You need to integrate with Google BigQuery.

D.

You will not use the data to back a user-facing or latency-sensitive application.

Which Cloud Dataflow / Beam feature should you use to aggregate data in an unbounded data source every hour based on the time when the data entered the pipeline?

A.

An hourly watermark

B.

An event time trigger

C.

The withAllowedLateness method

D.

A processing time trigger
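A processing-time trigger fires based on when elements arrive in the pipeline, independent of their event time. Below is a hedged Apache Beam (Python) sketch of an hourly processing-time aggregation using a global window with a repeated AfterProcessingTime trigger; the bounded Create source and element values are stand-ins for an unbounded source.

```python
# Sketch: hourly aggregation driven by processing time (when elements enter
# the pipeline) rather than event time. Values and step names are illustrative.
import apache_beam as beam
from apache_beam.transforms import window
from apache_beam.transforms.trigger import (
    AccumulationMode,
    AfterProcessingTime,
    Repeatedly,
)

with beam.Pipeline() as p:
    (
        p
        | "Read" >> beam.Create([("device-1", 1), ("device-2", 1)])  # stand-in for an unbounded source
        | "ProcessingTimeWindow" >> beam.WindowInto(
            window.GlobalWindows(),
            trigger=Repeatedly(AfterProcessingTime(60 * 60)),  # fire every hour of processing time
            accumulation_mode=AccumulationMode.DISCARDING,
        )
        | "SumPerKey" >> beam.CombinePerKey(sum)
        | "Print" >> beam.Map(print)
    )
```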

Which of the following IAM roles does your Compute Engine account require to be able to run pipeline jobs?

A.

dataflow.worker

B.

dataflow.compute

C.

dataflow.developer

D.

dataflow.viewer
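The roles/dataflow.worker role is what allows a worker service account (typically the Compute Engine default service account) to pick up and execute work units for a Dataflow pipeline. A hedged sketch of granting that role at the project level with the Resource Manager Python client follows; the project ID and service-account address are placeholders.

```python
# Sketch: granting roles/dataflow.worker to a Compute Engine default service
# account. Project ID and service-account address are placeholders.
from google.cloud import resourcemanager_v3

project = "projects/my-project"
member = "serviceAccount:123456789012-compute@developer.gserviceaccount.com"

client = resourcemanager_v3.ProjectsClient()
policy = client.get_iam_policy(request={"resource": project})

# Append a binding for the Dataflow worker role and write the policy back.
policy.bindings.add(role="roles/dataflow.worker", members=[member])
client.set_iam_policy(request={"resource": project, "policy": policy})
```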

When you store data in Cloud Bigtable, what is the recommended minimum amount of stored data?

A.

500 TB

B.

1 GB

C.

1 TB

D.

500 GB

Which of the following are feature engineering techniques? (Select 2 answers)

A.

Hidden feature layers

B.

Feature prioritization

C.

Crossed feature columns

D.

Bucketization of a continuous feature
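Of the options above, bucketization of a continuous feature and crossed feature columns are standard feature-engineering transforms. A small sketch using TensorFlow feature columns follows; the feature names, vocabulary, and bucket boundaries are made up.

```python
# Sketch: two classic feature-engineering transforms with TensorFlow feature
# columns. Feature names, vocabulary, and bucket boundaries are made up.
import tensorflow as tf

# Bucketize a continuous feature into ranges.
age = tf.feature_column.numeric_column("age")
age_buckets = tf.feature_column.bucketized_column(
    age, boundaries=[18, 25, 35, 50, 65]
)

# Cross two categorical/bucketized features to capture their interaction.
city = tf.feature_column.categorical_column_with_vocabulary_list(
    "city", ["NYC", "SF", "LA"]
)
city_x_age = tf.feature_column.crossed_column(
    [city, age_buckets], hash_bucket_size=1000
)
```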

MJTelco is building a custom interface to share data. They have these requirements:

    They need to do aggregations over their petabyte-scale datasets.

    They need to scan specific time range rows with a very fast response time (milliseconds).

Which combination of Google Cloud Platform products should you recommend?

A.

Cloud Datastore and Cloud Bigtable

B.

Cloud Bigtable and Cloud SQL

C.

BigQuery and Cloud Bigtable

D.

BigQuery and Cloud Storage
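A common split for requirements like these is BigQuery for large-scale analytical aggregations and Cloud Bigtable for millisecond time-range row scans. The snippet below sketches only the aggregation side with the BigQuery Python client; the project, dataset, and table names are placeholders.

```python
# Sketch: a SQL rollup over a large dataset with the BigQuery client.
# Project/dataset/table names are placeholders; low-latency row scans
# would be served from Cloud Bigtable instead.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")
query = """
    SELECT device_id, AVG(metric_value) AS avg_value
    FROM `my-project.telemetry.readings`
    GROUP BY device_id
"""
for row in client.query(query).result():
    print(row.device_id, row.avg_value)
```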

MJTelco needs you to create a schema in Google Cloud Bigtable that will allow for the historical analysis of the last 2 years of records. Each device sends a record every 15 minutes that contains the device's unique identifier and a data record. The most common query is for all the data for a given device for a given day. Which schema should you use?

A.

Rowkey: date#device_id
Column data: data_point

B.

Rowkey: date
Column data: device_id, data_point

C.

Rowkey: device_id
Column data: date, data_point

D.

Rowkey: data_point
Column data: device_id, date

E.

Rowkey: date#data_point
Column data: device_id
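Whichever option you pick, the point of a composite row key is that the most common query becomes a single contiguous range scan. The sketch below illustrates this with a hypothetical date#device_id key layout and the google-cloud-bigtable Python client; the instance, table, and key values are made up.

```python
# Sketch: how a composite row key supports the "one device, one day" query as
# a single contiguous range scan. Key layout, IDs, and table names are
# illustrative only.
from google.cloud import bigtable

client = bigtable.Client(project="my-project")
table = client.instance("telemetry").table("device_records")

# With keys shaped like "<date>#<device_id>#<timestamp>", all of one device's
# records for one day share a common prefix and can be read in one range scan.
prefix = b"20240101#device-42#"
rows = table.read_rows(start_key=prefix, end_key=prefix + b"\xff")
for row in rows:
    print(row.row_key)
```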