Review the following 'data' file and Pig code.
Which one of the following statements is true?
A. The output of the DUMP D command is (M,{(M,62,95102),(M,38,95111)})
B. The output of the DUMP D command is (M,{(38,95111),(62,95102)})
C. The code executes successfully but there is no output because the D relation is empty
D. The code does not execute successfully because D is not a valid relation
Analyze each scenario below and identify which best describes the behavior of the default partitioner.
A. The default partitioner assigns key-value pairs to reducers based on an internal random number generator.
B. The default partitioner implements a round-robin strategy, shuffling the key-value pairs to each reducer in turn. This ensures an even partition of the key space.
C. The default partitioner computes the hash of the key. Hash values between specific ranges are associated with different buckets, and each bucket is assigned to a specific reducer.
D. The default partitioner computes the hash of the key and takes that value modulo the number of reducers. The result determines the reducer assigned to process the key-value pair.
E. The default partitioner computes the hash of the value and takes the mod of that value with the number of reducers. The result determines the reducer assigned to process the key-value pair.
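For reference, hash-modulo partitioning can be sketched in a few lines. This is a minimal illustration, not Hadoop's own code: the function names `java_string_hash` and `hash_partition` are hypothetical, and the hash mimics Java's `String.hashCode()` for ASCII keys only so that results are deterministic. Hadoop's `HashPartitioner` applies the same idea, `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, to whatever hash the key type defines.

```python
def java_string_hash(s: str) -> int:
    """Mimic Java's String.hashCode() as a signed 32-bit value (ASCII keys assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # wrap to 32 bits like Java int arithmetic
    return h - 0x100000000 if h >= 0x80000000 else h


def hash_partition(key: str, num_reducers: int) -> int:
    """Hypothetical helper: clear the sign bit, then take the hash modulo the reducer count."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


if __name__ == "__main__":
    for key in ["the", "fox", "faster", "than", "dog"]:
        print(key, "-> reducer", hash_partition(key, 3))
```

Because the partition depends only on the key, every pair with the same key lands on the same reducer, regardless of its value.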
You have the following key-value pairs as output from your Map task:
(the, 1)
(fox, 1)
(faster, 1)
(than, 1)
(the, 1)
(dog, 1)
How many keys will be passed to the Reducer's reduce method?
A. Six
B. Five
C. Four
D. Two
E. One
F. Three
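The reduce method is invoked once per distinct key, so the count can be checked by collapsing the map output into a set of keys. This is a small sketch that assumes all of the pairs above reach a single reducer; the variable names are illustrative.

```python
# Map output from the question, as (key, value) pairs.
map_output = [
    ("the", 1),
    ("fox", 1),
    ("faster", 1),
    ("than", 1),
    ("the", 1),
    ("dog", 1),
]

# Duplicate keys are grouped together, so only distinct keys matter.
distinct_keys = {key for key, _ in map_output}

print(sorted(distinct_keys))  # ['dog', 'faster', 'fox', 'than', 'the']
print(len(distinct_keys))     # 5
```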
Consider the following two relations, A and B.
Which Pig statement combines A by its first field and B by its second field?
A. C = JOIN B BY a1, A by b2;
B. C = JOIN A by a1, B by b2;
C. C = JOIN A a1, B b2;
D. C = JOIN A $0, B $1;
In Hadoop 2.0, which TWO of the following processes work together to provide automatic failover of the NameNode? Choose 2 answers
A. ZKFailoverController
B. ZooKeeper
C. QuorumManager
D. JournalNode
Your cluster's HDFS block size is 64MB. You have a directory containing 100 plain text files, each of which is 100MB in size. The InputFormat for your job is TextInputFormat.
How many Mappers will run?
A. 64
B. 100
C. 200
D. 640
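The split arithmetic can be worked through with a short sketch. It assumes the split size equals the 64MB block size, that the plain text files are uncompressed and therefore splittable, and that one mapper runs per input split. The function name `count_splits` is hypothetical; the 1.1 tolerance mirrors the `SPLIT_SLOP` constant that `FileInputFormat` uses before cutting another split.

```python
SPLIT_SLOP = 1.1  # a final chunk up to 10% larger than the split size is not split again
MB = 1024 * 1024


def count_splits(file_size: int, split_size: int) -> int:
    """Hypothetical helper: count the input splits produced for one splittable file."""
    splits = 0
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # the leftover tail becomes the last split
    return splits


per_file = count_splits(100 * MB, 64 * MB)  # 64MB + 36MB
total_mappers = per_file * 100              # 100 files, splits never span files

print(per_file)       # 2
print(total_mappers)  # 200
```

Splits are computed per file, so the 36MB tail of one file is never merged with the start of the next.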
A client application creates an HDFS file named foo.txt with a replication factor of 3. Identify which best describes the file access rules in HDFS if the file has a single block that is stored on data nodes A, B and C?
A. The file will be marked as corrupted if data node B fails during the creation of the file.
B. Each data node locks the local file to prohibit concurrent readers and writers of the file.
C. Each data node stores a copy of the file in the local file system with the same name as the HDFS file.
D. The file can be accessed if at least one of the data nodes storing the file is available.
What is a SequenceFile?
A. A SequenceFile contains a binary encoding of an arbitrary number of homogeneous writable objects.
B. A SequenceFile contains a binary encoding of an arbitrary number of heterogeneous writable objects.
C. A SequenceFile contains a binary encoding of an arbitrary number of WritableComparable objects, in sorted order.
D. A SequenceFile contains a binary encoding of an arbitrary number of key-value pairs. Each key must be the same type. Each value must be the same type.
How are keys and values presented and passed to the reducers during a standard sort and shuffle phase of MapReduce?
A. Keys are presented to a reducer in sorted order; values for a given key are not sorted.
B. Keys are presented to a reducer in sorted order; values for a given key are sorted in ascending order.
C. Keys are presented to a reducer in random order; values for a given key are not sorted.
D. Keys are presented to a reducer in random order; values for a given key are sorted in ascending order.
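The grouping behavior of a standard shuffle can be imitated in memory. This is only a sketch under stated assumptions: `shuffle_and_sort` is a hypothetical function, it models a single reducer, and it keeps values in arrival order, whereas a real job gives no guarantee about value order unless a secondary sort is configured.

```python
from collections import defaultdict


def shuffle_and_sort(pairs):
    """Hypothetical model: group values by key, then present keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)  # values are appended as they arrive; never sorted
    return [(key, grouped[key]) for key in sorted(grouped)]


pairs = [("fox", 3), ("dog", 9), ("fox", 1), ("the", 2), ("dog", 4)]

for key, values in shuffle_and_sort(pairs):
    print(key, values)
# dog [9, 4]
# fox [3, 1]
# the [2]
```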
When is the earliest point at which the reduce method of a given Reducer can be called?
A. As soon as at least one mapper has finished processing its input split.
B. As soon as a mapper has emitted at least one record.
C. Not until all mappers have finished processing all records.
D. It depends on the InputFormat used for the job.