Databricks Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 - Databricks Certified Associate Developer for Apache Spark 3.0 Exam
Total 180 questions
Which of the following is a viable way to improve Spark's performance when dealing with large amounts of data, given that there is only a single application running on the cluster?
The code block displayed below contains an error. The code block should return DataFrame transactionsDf, but with the column storeId renamed to storeNumber. Find the error.
Code block:
transactionsDf.withColumn("storeNumber", "storeId")
Which of the following describes the role of tasks in the Spark execution hierarchy?
The code block shown below should return all rows of DataFrame itemsDf that have at least 3 items in column itemNameElements. Choose the answer that correctly fills the blanks in the code block to accomplish this.
Example of DataFrame itemsDf:
+------+----------------------------------+-------------------+------------------------------------------+
|itemId|itemName                          |supplier           |itemNameElements                          |
+------+----------------------------------+-------------------+------------------------------------------+
|1     |Thick Coat for Walking in the Snow|Sports Company Inc.|[Thick, Coat, for, Walking, in, the, Snow]|
|2     |Elegant Outdoors Summer Dress     |YetiX              |[Elegant, Outdoors, Summer, Dress]        |
|3     |Outdoors Backpack                 |Sports Company Inc.|[Outdoors, Backpack]                      |
+------+----------------------------------+-------------------+------------------------------------------+
Code block:
itemsDf.__1__(__2__(__3__)__4__)
Which of the following code blocks returns all unique values of column storeId in DataFrame transactionsDf?
The code block displayed below contains an error. The code block should read the CSV file located at path data/transactions.csv into DataFrame transactionsDf, using the first row as the column header and casting the columns to the most appropriate type. Find the error.
First 3 rows of transactions.csv:
transactionId;storeId;productId;name
1;23;12;green grass
2;35;31;yellow sun
3;23;12;green grass
Code block:
transactionsDf = spark.read.load("data/transactions.csv", sep=";", format="csv", header=True)
Which of the following are valid execution modes?
Which of the following code blocks returns a copy of DataFrame transactionsDf where the column storeId has been converted to string type?
Which of the following describes slots?
Which of the following describes Spark's way of managing memory?