
Cloudera CCA-500 - Cloudera Certified Administrator for Apache Hadoop (CCAH)


On a cluster running CDH 5.0 or above, you use the hadoop fs -put command to write a 300 MB file into a previously empty directory, using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of the file, what would another user see when they look in the directory?

A.

The directory will appear to be empty until the entire file write is completed on the cluster

B.

They will see the file with a ._COPYING_ extension on its name. If they view the file, they will see the contents of the file up to the last completed block (as each 64 MB block is written, that block becomes available)

C.

They will see the file with a ._COPYING_ extension on its name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster

D.

They will see the file with its original name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster
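
For the scenario above, a minimal shell sketch of what the second user observes (the file name and target directory are made up for illustration):

  # Start writing a 300 MB file into an empty HDFS directory
  hadoop fs -put bigfile.dat /data/incoming/

  # From another session, while the copy is still in progress:
  hadoop fs -ls /data/incoming/
  # The in-flight file is listed as /data/incoming/bigfile.dat._COPYING_ until the write completes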

You suspect that your NameNode is incorrectly configured, and is swapping memory to disk. Which Linux commands help you to identify whether swapping is occurring? (Select all that apply)

A.

free

B.

df

C.

memcat

D.

top

E.

jps

F.

vmstat

G.

swapinfo
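
A quick sketch of how these standard Linux tools expose swap activity (nothing here is Hadoop-specific):

  free -m      # the Swap line shows total, used, and free swap space
  vmstat 5     # non-zero si/so columns mean pages are swapping in/out of memory
  top          # the header reports swap in use; press Shift+M to sort processes by memory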

You have a Hadoop cluster running HDFS, and a gateway machine external to the cluster from which clients submit jobs. What do you need to do in order to run Impala on the cluster and submit jobs from the command line of the gateway machine?

A.

Install the impalad daemon, the statestored daemon, and the catalogd daemon on each machine in the cluster, and the impala shell on your gateway machine

B.

Install the impalad daemon, the statestored daemon, the catalogd daemon, and the impala shell on your gateway machine

C.

Install the impalad daemon and the impala shell on your gateway machine, and the statestored daemon and catalogd daemon on one of the nodes in the cluster

D.

Install the impalad daemon on each machine in the cluster, the statestored daemon and catalogd daemon on one machine in the cluster, and the impala shell on your gateway machine

E.

Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster and on the gateway node
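
Whichever layout is chosen, the client-side piece is just the shell on the gateway; a minimal sketch, assuming a hypothetical worker host running impalad on its default port:

  # From the gateway machine, point the Impala shell at an impalad on a cluster node
  impala-shell -i worker01.example.com:21000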

Why should you run the HDFS balancer periodically? (Choose three)

A.

To ensure that there is capacity in HDFS for additional data

B.

To ensure that all blocks in the cluster are 128MB in size

C.

To help HDFS deliver consistent performance under heavy loads

D.

To ensure that there is consistent disk utilization across the DataNodes

E.

To improve data locality for MapReduce
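
For reference, a minimal sketch of invoking the balancer by hand (the threshold value is only illustrative):

  # Move blocks until no DataNode's utilization deviates more than 10% from the cluster average
  hdfs balancer -threshold 10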

Which is the default scheduler in YARN?

A.

YARN doesn’t configure a default scheduler; you must first assign an appropriate scheduler class in yarn-site.xml

B.

Capacity Scheduler

C.

Fair Scheduler

D.

FIFO Scheduler
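
One way to see which scheduler a given cluster is actually using is to inspect the ResourceManager configuration; a sketch, assuming the usual CDH config path:

  # yarn.resourcemanager.scheduler.class names the scheduler in use;
  # if the property is absent, the packaged default applies
  grep -A 1 'yarn.resourcemanager.scheduler.class' /etc/hadoop/conf/yarn-site.xml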

You decide to create a cluster which runs HDFS in High Availability mode with automatic failover, using Quorum Storage. What is the purpose of ZooKeeper in such a configuration?

A.

It only keeps track of which NameNode is Active at any given time

B.

It monitors an NFS mount point and reports if the mount point disappears

C.

It both keeps track of which NameNode is Active at any given time and manages the Edits file, which is a log of changes to the HDFS filesystem

D.

It only manages the Edits file, which is a log of changes to the HDFS filesystem

E.

Clients connect to ZooKeeper to determine which NameNode is Active
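
As a practical aside, with automatic failover each NameNode runs a ZKFC process that uses ZooKeeper for leader election, and you can query the current state from the command line (nn1 and nn2 stand in for the logical NameNode IDs defined by dfs.ha.namenodes):

  hdfs haadmin -getServiceState nn1
  hdfs haadmin -getServiceState nn2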

You are working on a project where you need to chain together MapReduce and Pig jobs. You also need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to perform these actions?

A.

Oozie

B.

ZooKeeper

C.

HBase

D.

Sqoop

E.

HUE
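
For orientation, running such a chained workflow typically means submitting a workflow.xml (which may contain fork, join, and decision nodes) through the Oozie CLI; a sketch with a placeholder server URL and properties file:

  oozie job -oozie http://oozieserver:11000/oozie -config job.properties -run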

Your company stores user profile records in an OLTP database. You want to join these records with web server logs you have already ingested into the Hadoop file system. What is the best way to obtain and ingest these user records?

A.

Ingest with Hadoop streaming

B.

Ingest using Hive’s LOAD DATA command

C.

Ingest with sqoop import

D.

Ingest with Pig’s LOAD command

E.

Ingest using the HDFS put command
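
A minimal sketch of a table import from a relational database into HDFS (connection string, credentials, table, and target directory are all placeholders):

  sqoop import \
    --connect jdbc:mysql://dbhost/crm \
    --username dbuser -P \
    --table user_profiles \
    --target-dir /data/user_profiles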

Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a reasonable time without starting long-running jobs?

A.

Complexity Fair Scheduler (CFS)

B.

Capacity Scheduler

C.

Fair Scheduler

D.

FIFO Scheduler
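
For context, the scheduler is selected through a single ResourceManager property; a sketch of checking it, again assuming the usual CDH config path:

  # The Fair Scheduler corresponds to the class
  # org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
  grep -A 2 'yarn.resourcemanager.scheduler.class' /etc/hadoop/conf/yarn-site.xml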

You’re upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128 MB for all new files written to the cluster after the upgrade. What should you do?

A.

You cannot enforce this, since client code can always override this value

B.

Set dfs.block.size to 128M on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final

C.

Set dfs.block.size to 128M on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode

D.

Set dfs.block.size to 134217728 on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final

E.

Set dfs.block.size to 134217728 on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode
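
Two details worth verifying here; a sketch (the config lookup assumes a client machine with the cluster configuration deployed, and dfs.blocksize is the current name of the deprecated dfs.block.size key):

  # 128 MB expressed in bytes, the form the byte-valued setting expects
  echo $((128 * 1024 * 1024))          # prints 134217728

  # Ask the client configuration what block size new files will get
  hdfs getconf -confKey dfs.blocksize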