Google Professional-Data-Engineer - Google Professional Data Engineer Exam

Total 387 questions

What are two methods that can be used to denormalize tables in BigQuery?

A. 1) Split table into multiple tables; 2) Use a partitioned table

B. 1) Join tables into one table; 2) Use nested repeated fields

C. 1) Use a partitioned table; 2) Join tables into one table

D. 1) Use nested repeated fields; 2) Use a partitioned table

Which is the preferred method to use to avoid hotspotting in time series data in Bigtable?

A. Field promotion

B. Randomization

C. Salting

D. Hashing

Which of these statements about exporting data from BigQuery is false?

A. To export more than 1 GB of data, you need to put a wildcard in the destination filename.

B. The only supported export destination is Google Cloud Storage.

C. Data can only be exported in JSON or Avro format.

D. The only compression option available is GZIP.

From which of these sources can you not load data into BigQuery?

A. File upload

B. Google Drive

C. Google Cloud Storage

D. Google Cloud SQL

Which of the following is NOT one of the three main types of triggers that Dataflow supports?

A. Trigger based on element size in bytes

B. Trigger that is a combination of other triggers

C. Trigger based on element count

D. Trigger based on time

Which row keys are likely to cause a disproportionate number of reads and/or writes on a particular node in a Bigtable cluster (select 2 answers)?

A. A sequential numeric ID

B. A timestamp followed by a stock symbol

C. A non-sequential numeric ID

D. A stock symbol followed by a timestamp
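
Bigtable stores rows sorted lexicographically by row key, so the layout of the key decides which writes end up next to each other. The sketch below is only an illustration in plain Python, with no Bigtable client involved. The symbols, the timestamps, the `#` delimiter, and the `make_key` helper are all made up for the example.

```python
# Illustrative only: plain Python strings stand in for Bigtable row keys.
# Bigtable keeps rows sorted lexicographically by key, which sorted() mimics here.

def make_key(*parts):
    """Join key components with '#' (a hypothetical delimiter choice)."""
    return "#".join(str(p) for p in parts)

# Hypothetical writes arriving over three consecutive seconds.
events = [
    ("GOOG", 1700000000),
    ("AAPL", 1700000000),
    ("GOOG", 1700000001),
    ("AAPL", 1700000001),
    ("GOOG", 1700000002),
    ("AAPL", 1700000002),
]

timestamp_first = sorted(make_key(ts, sym) for sym, ts in events)
symbol_first = sorted(make_key(sym, ts) for sym, ts in events)

# With the timestamp first, the newest writes always sort to the end of the key space.
print(timestamp_first[-2:])

# With the symbol first, each symbol's newest write sits in its own part of the key space.
print(symbol_first)
```

Comparing the two printed lists shows where consecutive writes would land relative to one another under each key layout.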

Why do you need to split a machine learning dataset into training data and test data?

A. So you can try two different sets of features

B. To make sure your model is generalized for more than just the training data

C. To allow you to create unit tests in your code

D. So you can use one dataset for a wide model and one for a deep model
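
For readers unfamiliar with the mechanics, a holdout split can be sketched in a few lines of standard-library Python. The `holdout_split` helper, the 20% test fraction, and the fixed seed are assumptions chosen for illustration and are not taken from any particular library.

```python
import random

def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and set aside test_fraction of them as test data."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_fraction)
    # First n_test shuffled rows become the test set; the rest are training data.
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))  # stand-in for 100 labeled examples
train, test = holdout_split(rows)
print(len(train), len(test))
```

The two returned lists never share a row, so a model fitted on `train` is evaluated on examples it has not seen.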

To run a TensorFlow training job on your own computer using Cloud Machine Learning Engine, what would your command start with?

A. gcloud ml-engine local train

B. gcloud ml-engine jobs submit training

C. gcloud ml-engine jobs submit training local

D. You can't run a TensorFlow program on your own computer using Cloud ML Engine.

Google Cloud Bigtable indexes a single value in each row. This value is called the _______.

A. primary key

B. unique key

C. row key

D. master key

Which of the following is NOT true about Dataflow pipelines?

A. Dataflow pipelines are tied to Dataflow, and cannot be run on any other runner

B. Dataflow pipelines can consume data from other Google Cloud services

C. Dataflow pipelines can be programmed in Java

D. Dataflow pipelines use a unified programming model, so they can work with both streaming and batch data sources