Your company has 3 different sales teams. Each team's sales manager has developed incentive offers to increase the size of each sales transaction. Any sales manager whose incentive program can be shown to increase the size of the average sales transaction will receive a bonus. Data are available for the number and average sale amount for transactions offering one of the incentives as well as transactions offering no incentive. The VP of Sales has asked you to determine analytically if any of the incentive programs has resulted in a demonstrable increase in the average sale amount. Which analytical technique would be appropriate in this situation?
A. One-way ANOVA
B. Multi-way ANOVA
C. Student's t-test
D. Wilcoxon Rank Sum Test
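As background for the options above: a one-way ANOVA compares the means of several groups defined by a single factor (here, which incentive, if any, was offered). The sketch below computes the F statistic by hand in plain Python. The function name `one_way_anova_f` and the sale amounts are hypothetical, chosen only for illustration; a real analysis would also derive a p-value from the F distribution.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical sale amounts: no incentive plus three incentive programs.
sales = [
    [100, 104, 98, 101],   # no incentive
    [103, 107, 105, 102],  # incentive from team 1
    [99, 101, 100, 97],    # incentive from team 2
    [110, 108, 112, 109],  # incentive from team 3
]
print("F statistic:", one_way_anova_f(sales))
```

A large F value suggests that at least one group mean differs from the others; identifying which one requires a follow-up comparison.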
If your intention is to show trends over time, which chart type is the most appropriate way to depict the data?
A. Line chart
B. Bar chart
C. Stacked bar chart
D. Histogram
What is the format of the output from the Map function of MapReduce?
A. Key-value pairs
B. Binary representation of keys concatenated with structured data
C. Compressed index
D. Unique key record and separate records of all possible values
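To make the Map step concrete, here is a minimal single-process sketch of a word-count mapper in plain Python. It is not Hadoop code; the function name `map_words` and the sample input are assumptions for illustration. Each input record is turned into zero or more (key, value) pairs.

```python
def map_words(line):
    """Emit one (word, 1) pair for every word in the input line."""
    for word in line.lower().split():
        yield (word, 1)

# Hypothetical input record.
pairs = list(map_words("The cat saw the dog"))
print(pairs)
```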
Which characteristic applies only to Business Intelligence as opposed to Data Science?
A. Uses only structured data
B. Supports solving "what if" scenarios
C. Uses large data sets
D. Uses predictive modeling techniques
How does Pig's use of a schema differ from that of a traditional RDBMS?
A. Pig's schema is optional
B. Pig's schema requires that the data is physically present when the schema is defined
C. Pig's schema is required for ETL
D. Pig's schema supports a single data type
When would you use a Wilcoxon Rank Sum test?
A. When you cannot make an assumption about the distribution of the populations
B. When the data can easily be sorted
C. When the populations represent the sums of other values
D. When the data cannot easily be sorted
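For context, the Wilcoxon rank sum test works on the ranks of the pooled observations rather than on their raw values, so it does not rely on a particular distribution shape. The sketch below computes only the rank-sum statistic for the first sample, using average ranks for ties. The function names and the data are hypothetical; turning the statistic into a p-value is out of scope here.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_statistic(sample_a, sample_b):
    """Sum of the pooled ranks that belong to sample_a."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    return sum(ranks[:len(sample_a)])

# Hypothetical samples from two populations.
print(rank_sum_statistic([12, 15, 11], [18, 20, 15]))
```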
In the MapReduce framework, what is the purpose of the Reduce function?
A. It aggregates the results of the Map function and generates processed output
B. It distributes the input to multiple nodes for processing
C. It writes the output of the Map function to storage
D. It breaks the input into smaller components and distributes to other nodes in the cluster
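As an illustration of the Reduce step, the following single-process sketch groups mapper output by key and then combines each key's values into one result. It is plain Python, not Hadoop code; `shuffle`, `reduce_counts`, and the sample pairs are hypothetical names and data chosen for this example.

```python
from collections import defaultdict

def shuffle(pairs):
    """Group the values of (key, value) pairs by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_counts(key, values):
    """Combine all values seen for one key into a single output record."""
    return (key, sum(values))

# Hypothetical output of a word-count mapper.
mapped = [("the", 1), ("cat", 1), ("the", 1)]
results = dict(reduce_counts(k, v) for k, v in shuffle(mapped).items())
print(results)
```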
Which of the following is an example of quasi-structured data?
A. OLAP
B. OLTP
C. Customer record table
D. Clickstream data
Refer to the exhibit. Consider the training data set shown in the exhibit. What are the classification (Y = 0 or 1) and the probability of the classification for the tuple X(0, 0, 1) using a Naive Bayesian classifier?
A. Classification Y = 1, Probability = 4/54
B. Classification Y = 0, Probability = 1/54
C. Classification Y = 1, Probability = 1/54
D. Classification Y = 0, Probability = 4/54