
Cloudera CCA-500 - Cloudera Certified Administrator for Apache Hadoop (CCAH)


On a cluster running CDH 5.0 or above, you use the hadoop fs -put command to write a 300 MB file into a previously empty directory, using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of the file, what would another user see when they look in the directory?

A.

The directory will appear to be empty until the entire file write is completed on the cluster

B.

They will see the file with a ._COPYING_ extension on its name. If they view the file, they will see the contents of the file up to the last completed block (as each 64 MB block is written, that block becomes available)

C.

They will see the file with a ._COPYING_ extension on its name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster

D.

They will see the file with its original name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster
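
For the scenario above, a minimal shell sketch of what the second user observes (the file name and target directory are made up for illustration):

  # Start writing a 300 MB file into an empty HDFS directory
  hadoop fs -put bigfile.dat /data/incoming/

  # From another session, while the copy is still in progress:
  hadoop fs -ls /data/incoming/
  # The in-flight file is listed as /data/incoming/bigfile.dat._COPYING_ until the write completes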

You suspect that your NameNode is incorrectly configured, and is swapping memory to disk. Which Linux commands help you to identify whether swapping is occurring? (Select all that apply)

A.

free

B.

df

C.

memcat

D.

top

E.

jps

F.

vmstat

G.

swapinfo
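
A quick sketch of how these standard Linux tools expose swap activity (nothing here is Hadoop-specific):

  free -m      # the Swap line shows total, used, and free swap space
  vmstat 5     # non-zero si/so columns mean pages are swapping in/out of memory
  top          # the header reports swap in use; press Shift+M to sort processes by memory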

You have a Hadoop cluster running HDFS, and a gateway machine external to the cluster from which clients submit jobs. What do you need to do in order to run Impala on the cluster and submit jobs from the command line of the gateway machine?

A.

Install the impalad daemon, the statestored daemon, and the catalogd daemon on each machine in the cluster, and the impala shell on your gateway machine

B.

Install the impalad daemon, the statestored daemon, the catalogd daemon, and the impala shell on your gateway machine

C.

Install the impalad daemon and the impala shell on your gateway machine, and the statestored daemon and catalogd daemon on one of the nodes in the cluster

D.

Install the impalad daemon on each machine in the cluster, the statestored daemon and catalogd daemon on one machine in the cluster, and the impala shell on your gateway machine

E.

Install the impalad daemon, statestored daemon, and catalogd daemon on each machine in the cluster and on the gateway node
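
Whichever layout is chosen, the client-side piece is just the shell on the gateway; a minimal sketch, assuming a hypothetical worker host running impalad on its default port:

  # From the gateway machine, point the Impala shell at an impalad on a cluster node
  impala-shell -i worker01.example.com:21000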

Why should you run the HDFS balancer periodically? (Choose three)

A.

To ensure that there is capacity in HDFS for additional data

B.

To ensure that all blocks in the cluster are 128MB in size

C.

To help HDFS deliver consistent performance under heavy loads

D.

To ensure that there is consistent disk utilization across the DataNodes

E.

To improve data locality for MapReduce
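
For reference, a minimal sketch of invoking the balancer by hand (the threshold value is only illustrative):

  # Move blocks until no DataNode's utilization deviates more than 10% from the cluster average
  hdfs balancer -threshold 10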

Which is the default scheduler in YARN?

A.

YARN doesn’t configure a default scheduler; you must first assign an appropriate scheduler class in yarn-site.xml

B.

Capacity Scheduler

C.

Fair Scheduler

D.

FIFO Scheduler
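
One way to see which scheduler a given cluster is actually using is to inspect the ResourceManager configuration; a sketch, assuming the usual CDH config path:

  # yarn.resourcemanager.scheduler.class names the scheduler in use;
  # if the property is absent, the packaged default applies
  grep -A 1 'yarn.resourcemanager.scheduler.class' /etc/hadoop/conf/yarn-site.xml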

You decide to create a cluster which runs HDFS in High Availability mode with automatic failover, using Quorum Storage. What is the purpose of ZooKeeper in such a configuration?

A.

It only keeps track of which NameNode is Active at any given time

B.

It monitors an NFS mount point and reports if the mount point disappears

C.

It both keeps track of which NameNode is Active at any given time and manages the Edits file, which is a log of changes to the HDFS filesystem

D.

It only manages the Edits file, which is a log of changes to the HDFS filesystem

E.

Clients connect to ZooKeeper to determine which NameNode is Active
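
As a practical aside, with automatic failover each NameNode runs a ZKFC process that uses ZooKeeper for leader election, and you can query the current state from the command line (nn1 and nn2 stand in for the logical NameNode IDs defined by dfs.ha.namenodes):

  hdfs haadmin -getServiceState nn1
  hdfs haadmin -getServiceState nn2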

You are working on a project where you need to chain together MapReduce and Pig jobs. You also need the ability to use forks, decision points, and path joins. Which ecosystem project should you use to perform these actions?

A.

Oozie

B.

ZooKeeper

C.

HBase

D.

Sqoop

E.

HUE
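
For orientation, running such a chained workflow typically means submitting a workflow.xml (which may contain fork, join, and decision nodes) through the Oozie CLI; a sketch with a placeholder server URL and properties file:

  oozie job -oozie http://oozieserver:11000/oozie -config job.properties -run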

Your company stores user profile records in an OLTP database. You want to join these records with web server logs you have already ingested into the Hadoop file system. What is the best way to obtain and ingest these user records?

A.

Ingest with Hadoop streaming

B.

Ingest using Hive’s LOAD DATA command

C.

Ingest with sqoop import

D.

Ingest with Pig’s LOAD command

E.

Ingest using the HDFS put command
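
A minimal sketch of a table import from a relational database into HDFS (connection string, credentials, table, and target directory are all placeholders):

  sqoop import \
    --connect jdbc:mysql://dbhost/crm \
    --username dbuser -P \
    --table user_profiles \
    --target-dir /data/user_profiles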

Which scheduler would you deploy to ensure that your cluster allows short jobs to finish within a reasonable time without starting long-running jobs?

A.

Complexity Fair Scheduler (CFS)

B.

Capacity Scheduler

C.

Fair Scheduler

D.

FIFO Scheduler
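
For context, the scheduler is selected through a single ResourceManager property; a sketch of checking it, again assuming the usual CDH config path:

  # The Fair Scheduler corresponds to the class
  # org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
  grep -A 2 'yarn.resourcemanager.scheduler.class' /etc/hadoop/conf/yarn-site.xml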

You’re upgrading a Hadoop cluster from HDFS and MapReduce version 1 (MRv1) to one running HDFS and MapReduce version 2 (MRv2) on YARN. You want to set and enforce a block size of 128 MB for all new files written to the cluster after the upgrade. What should you do?

A.

You cannot enforce this, since client code can always override this value

B.

Set dfs.block.size to 128M on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final

C.

Set dfs.block.size to 128M on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode

D.

Set dfs.block.size to 134217728 on all the worker nodes, on all client machines, and on the NameNode, and set the parameter to final

E.

Set dfs.block.size to 134217728 on all the worker nodes and client machines, and set the parameter to final. You do not need to set this value on the NameNode
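
Two details worth verifying here; a sketch (the config lookup assumes a client machine with the cluster configuration deployed, and dfs.blocksize is the current name of the deprecated dfs.block.size key):

  # 128 MB expressed in bytes, the form the byte-valued setting expects
  echo $((128 * 1024 * 1024))          # prints 134217728

  # Ask the client configuration what block size new files will get
  hdfs getconf -confKey dfs.blocksize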