
Databricks Databricks-Generative-AI-Engineer-Associate - Databricks Certified Generative AI Engineer Associate

A Generative AI Engineer is helping a cinema extend its website's chatbot so it can respond to questions about specific showtimes for movies currently playing at the user's local theater. They already have the user's location, provided by location services to their agent, and a Delta table which is continually updated with the latest showtime information by location. They want to implement this new capability in their RAG application.

Which option will do this with the least effort and in the most performant way?

A.

Create a Feature Serving Endpoint from a FeatureSpec that references an online store synced from the Delta table. Query the Feature Serving Endpoint as part of the agent logic / tool implementation.

B.

Query the Delta table directly via a SQL query constructed from the user's input using a text-to-SQL LLM in the agent logic / tool implementation.

C.

Write the Delta table contents to a text column, then embed those texts using an embedding model and store these in the vector index. Look up the information based on the embedding as part of the agent logic / tool implementation.

D.

Set up a task in Databricks Workflows to write the information in the Delta table periodically to an external database such as MySQL and query the information from there as part of the agent logic / tool implementation.
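
A minimal sketch of how Option A might look inside the agent's tool logic, assuming a Feature Serving Endpoint backed by an online store already exists; the endpoint name, environment variable names, and lookup key below are illustrative, not taken from the question:

```python
import os
import requests

# Illustrative values -- replace with your workspace URL and endpoint name.
WORKSPACE_URL = os.environ["DATABRICKS_HOST"]   # e.g. https://<workspace>.cloud.databricks.com
ENDPOINT_NAME = "showtimes-endpoint"            # hypothetical Feature Serving Endpoint
TOKEN = os.environ["DATABRICKS_TOKEN"]

def lookup_showtimes(location: str) -> dict:
    """Tool used by the agent: fetch the latest showtimes for the user's
    location from the Feature Serving Endpoint synced from the Delta table."""
    response = requests.post(
        f"{WORKSPACE_URL}/serving-endpoints/{ENDPOINT_NAME}/invocations",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"dataframe_records": [{"location": location}]},  # lookup key assumed to be 'location'
    )
    response.raise_for_status()
    return response.json()
```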

A Generative AI Engineer would like an LLM to generate formatted JSON from emails. This will require parsing and extracting the following information: order ID, date, and sender email. Here’s a sample email:

They will need to write a prompt that will extract the relevant information in JSON format with the highest level of output accuracy.

Which prompt will do that?

A.

You will receive customer emails and need to extract date, sender email, and order ID. You should return the date, sender email, and order ID information in JSON format.

B.

You will receive customer emails and need to extract date, sender email, and order ID. Return the extracted information in JSON format.

Here’s an example: {“date”: “April 16, 2024”, “sender_email”: “sarah.lee925@gmail.com”, “order_id”: “RE987D”}

C.

You will receive customer emails and need to extract date, sender email, and order ID. Return the extracted information in a human-readable format.

D.

You will receive customer emails and need to extract date, sender email, and order ID. Return the extracted information in JSON format.
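
For reference, a minimal sketch of how the prompt in Option B could be used; the `call_llm` parameter is a placeholder for whatever model client is in use, and the example JSON is taken from the option text:

```python
import json

EXTRACTION_PROMPT = """You will receive customer emails and need to extract date, sender email, and order ID.
Return the extracted information in JSON format.

Here's an example:
{"date": "April 16, 2024", "sender_email": "sarah.lee925@gmail.com", "order_id": "RE987D"}
"""

def extract_order_info(email_body: str, call_llm) -> dict:
    """call_llm is a placeholder: it takes a prompt string and returns the
    model's text completion."""
    raw = call_llm(EXTRACTION_PROMPT + "\n\nEmail:\n" + email_body)
    return json.loads(raw)  # fails loudly if the model did not return valid JSON
```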

A Generative AI Engineer is ready to deploy an LLM application written using Foundation Model APIs. They want to follow security best practices for production scenarios.

Which authentication method should they choose?

A.

Use an access token belonging to service principals

B.

Use a frequently rotated access token belonging to either a workspace user or a service principal

C.

Use OAuth machine-to-machine authentication

D.

Use an access token belonging to any workspace user
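
A minimal sketch of OAuth machine-to-machine authentication (Option C) using the Databricks SDK for Python, assuming a service principal with an OAuth client ID and secret has already been provisioned; the environment variable names are illustrative:

```python
import os
from databricks.sdk import WorkspaceClient

# Service principal OAuth credentials (illustrative environment variable names).
w = WorkspaceClient(
    host=os.environ["DATABRICKS_HOST"],
    client_id=os.environ["DATABRICKS_CLIENT_ID"],
    client_secret=os.environ["DATABRICKS_CLIENT_SECRET"],
)

# The client now authenticates with short-lived OAuth tokens issued to the
# service principal, rather than a long-lived personal access token.
print(w.current_user.me().user_name)
```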

A Generative AI Engineer is building a production-ready LLM system that replies directly to customers. The solution makes use of the Foundation Model API via provisioned throughput. They are concerned that the LLM could potentially respond in a toxic or otherwise unsafe way. They also wish to do this with the least amount of effort.

Which approach will do this?

A.

Host Llama Guard on Foundation Model API and use it to detect unsafe responses

B.

Add some LLM calls to their chain to detect unsafe content before returning text

C.

Add a regular expression check on inputs and outputs to detect unsafe responses.

D.

Ask users to report unsafe responses
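
A rough sketch of Option A: screening each candidate reply with a Llama Guard model hosted on Model Serving before returning it. The endpoint name is hypothetical, and the prompt and verdict parsing are simplified; the real Llama Guard input format should be taken from its model card:

```python
from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
GUARD_ENDPOINT = "llama-guard"  # hypothetical name of the hosted Llama Guard endpoint

def is_safe(candidate_reply: str) -> bool:
    """Ask Llama Guard to classify the candidate reply; treat anything that is
    not explicitly labeled 'safe' as unsafe."""
    result = client.predict(
        endpoint=GUARD_ENDPOINT,
        inputs={"messages": [{"role": "user", "content": candidate_reply}]},
    )
    verdict = result["choices"][0]["message"]["content"]  # response shape assumed chat-style
    return verdict.strip().lower().startswith("safe")
```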

Which indicator should be considered to evaluate the safety of the LLM outputs when qualitatively assessing LLM responses for a translation use case?

A.

The ability to generate responses in code

B.

The similarity to the previous language

C.

The latency of the response and the length of text generated

D.

The accuracy and relevance of the responses

A Generative AI Engineer is setting up a Databricks Vector Search index that will look up news articles by topic within 10 days of a specified date. An example query might be "Tell me about monster truck news around January 5th, 1992". They want to do this with the least amount of effort.

How can they set up their Vector Search index to support this use case?

A.

Split articles into 10-day blocks and return the block closest to the query.

B.

Include metadata columns for article date and topic to support metadata filtering.

C.

Pass the query directly to the vector search index and return the best articles.

D.

Create separate indexes by topic and add a classifier model to appropriately pick the best index.
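
A minimal sketch of Option B with the Databricks Vector Search client, assuming the index already carries `article_date` and `topic` metadata columns. The endpoint and index names, the date window, and the exact filter syntax are illustrative:

```python
from databricks.vector_search.client import VectorSearchClient

vsc = VectorSearchClient()
index = vsc.get_index(
    endpoint_name="news-endpoint",           # hypothetical Vector Search endpoint
    index_name="main.news.articles_index",   # hypothetical index name
)

# "monster truck news around January 5th, 1992" -> topic query plus a +/- 10-day date window
results = index.similarity_search(
    query_text="monster truck",
    columns=["title", "article_date", "topic"],
    filters={"article_date >=": "1991-12-26", "article_date <=": "1992-01-15"},
    num_results=5,
)
```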

A Generative AI Engineer is creating an LLM system that will retrieve news articles from the year 1918 related to a user's query and summarize them. The engineer has noticed that the summaries are generated well but often also include an explanation of how the summary was generated, which is undesirable.

Which change could the Generative Al Engineer perform to mitigate this issue?

A.

Split the LLM output by newline characters to truncate away the summarization explanation.

B.

Tune the chunk size of news articles or experiment with different embedding models.

C.

Revisit their document ingestion logic, ensuring that the news articles are being ingested properly.

D.

Provide few-shot examples of the desired output format in the system and/or user prompt.
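
A short sketch of Option D: a system prompt whose few-shot examples show only the desired output (a bare summary with no explanation of how it was produced). The example article descriptions and summaries are invented for illustration:

```python
SYSTEM_PROMPT = """You summarize news articles from 1918. Reply with the summary only.
Do not describe how the summary was produced.

Example input:
<article about a local harvest festival>
Example output:
The town held its annual harvest festival despite wartime shortages.

Example input:
<article about a new rail line opening>
Example output:
A new rail line connecting the two counties opened to passengers this week.
"""
```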

A Generative AI Engineer has developed an LLM application to answer questions about internal company policies. The Generative AI Engineer must ensure that the application doesn’t hallucinate or leak confidential data.

Which approach should NOT be used to mitigate hallucination or confidential data leakage?

A.

Add guardrails to filter outputs from the LLM before it is shown to the user

B.

Fine-tune the model on your data, hoping it will learn what is appropriate and what is not

C.

Limit the data available based on the user’s access level

D.

Use a strong system prompt to ensure the model aligns with your needs.

A Generative AI Engineer is testing a simple prompt template in LangChain using the code below, but is getting an error.

Assuming the API key was properly defined, what change does the Generative AI Engineer need to make to fix their chain?

A)

B)

C)

D)

A.

Option A

B.

Option B

C.

Option C

D.

Option D
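
For orientation, a minimal sketch of a working LangChain prompt-template chain is shown below; the model class (`ChatOpenAI`), the template text, and the input variable names are assumptions for illustration, not the code from the question:

```python
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI  # assumed model class; any chat model works

prompt = PromptTemplate(
    template="Answer the question about {topic}: {question}",
    input_variables=["topic", "question"],  # must match every placeholder in the template
)
llm = ChatOpenAI(model="gpt-4o-mini")       # relies on the API key already being defined

chain = prompt | llm                         # LangChain Expression Language (LCEL) chain
print(chain.invoke({"topic": "movies", "question": "What is showing tonight?"}).content)
```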

A Generative AI Engineer needs to design an LLM pipeline to conduct multi-stage reasoning that leverages external tools. To be effective at this, the LLM will need to plan and adapt actions while performing complex reasoning tasks.

Which approach will do this?

A.

Train the LLM to generate a single, comprehensive response without interacting with any external tools, relying solely on its pre-trained knowledge.

B.

Implement a framework like ReAct which allows the LLM to generate reasoning traces and perform task-specific actions that leverage external tools if necessary.

C.

Encourage the LLM to make multiple API calls in sequence without planning or structuring the calls, allowing the LLM to decide when and how to use external tools spontaneously.

D.

Use a Chain-of-Thought (CoT) prompting technique to guide the LLM through a series of reasoning steps, then manually input the results from external tools for the final answer.
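
A stripped-down sketch of the ReAct pattern described in Option B: the LLM alternates reasoning traces with tool calls until it can answer. The `call_llm` function and the tool registry are placeholders, and a production system would typically use an agent framework rather than this hand-rolled loop:

```python
import re

# Placeholder tools; real implementations would call actual services.
TOOLS = {
    "search": lambda q: f"(search results for {q!r})",
    "showtimes": lambda loc: f"(showtimes near {loc})",
}

REACT_PROMPT = """Answer the question. Alternate between:
Thought: your reasoning
Action: tool_name[input]
Observation: (filled in by the system)
Finish with: Final Answer: <answer>
"""

def react_agent(question: str, call_llm, max_steps: int = 5) -> str:
    """call_llm is a placeholder: it takes the running transcript and returns
    the model's next Thought/Action step or its Final Answer."""
    transcript = REACT_PROMPT + f"\nQuestion: {question}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action:\s*(\w+)\[(.*)\]", step)
        if match:
            tool, tool_input = match.group(1), match.group(2)
            observation = TOOLS.get(tool, lambda _: "unknown tool")(tool_input)
            transcript += f"Observation: {observation}\n"  # feed the result back to the LLM
    return "No answer within step limit."
```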