Databricks Databricks-Generative-AI-Engineer-Associate - Databricks Certified Generative AI Engineer Associate

Total 61 questions

A Generative AI Engineer is developing a RAG application and would like to experiment with different embedding models to improve the application's performance.

Which strategy for picking an embedding model should they choose?

Pick an embedding model trained on related domain knowledge

Pick the most recent and most performant open LLM released at the time

Pick the embedding model ranked highest on the Massive Text Embedding Benchmark (MTEB) leaderboard hosted by HuggingFace

Pick an embedding model with multilingual support to support potential multilingual user questions

Explanation:

The task involves improving a Retrieval-Augmented Generation (RAG) application’s performance by experimenting with embedding models. The choice of embedding model impacts retrieval accuracy, which is critical for RAG systems. Let’s evaluate the options based on Databricks Generative AI Engineer best practices.

Option A: Pick an embedding model trained on related domain knowledge

Embedding models trained on domain-specific data (e.g., industry-specific corpora) produce vectors that better capture the semantics of the application’s context, improving retrieval relevance. For RAG, this is a key strategy to enhance performance.

Databricks Reference: "For optimal retrieval in RAG systems, select embedding models aligned with the domain of your data" ("Building LLM Applications with Databricks," 2023).

Option B: Pick the most recent and most performant open LLM released at the time

LLMs are not embedding models; they generate text rather than the embeddings used for retrieval. A recent LLM may perform well at generation, but that doesn’t address the embedding step in RAG. This option misidentifies the component being selected.

Databricks Reference: Embedding models and LLMs are distinct in RAG workflows: "Embedding models convert text to vectors, while LLMs generate responses" ("Generative AI Cookbook").

Option C: Pick the embedding model ranked highest on the Massive Text Embedding Benchmark (MTEB) leaderboard hosted by HuggingFace

The MTEB leaderboard ranks models across general tasks, but high overall performance doesn’t guarantee suitability for a specific domain. A top-ranked model might excel in generic contexts but underperform on the engineer’s unique data.

Databricks Reference: General performance is less critical than domain fit: "Benchmark rankings provide a starting point, but domain-specific evaluation is recommended" ("Databricks Generative AI Engineer Guide").

Option D: Pick an embedding model with multilingual support to support potential multilingual user questions

Multilingual support is useful only if the application explicitly requires it. Without evidence of multilingual needs, this adds complexity without guaranteed performance gains for the current use case.

Databricks Reference: "Choose features like multilingual support based on application requirements" ("Building LLM-Powered Applications").

Conclusion: Option A is the best strategy because it prioritizes domain relevance, which directly improves retrieval accuracy in a RAG system and aligns with Databricks’ emphasis on tailoring models to specific use cases.
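The domain-specific evaluation mentioned under Option C can be illustrated with a small experiment: score each candidate embedding model by recall@k on a handful of labeled queries drawn from the application's own data. The sketch below is a minimal illustration, not a Databricks API. The two embedders (`generic_embed`, `domain_embed`), the vocabulary, the documents, and the queries are all hypothetical toy stand-ins; in practice each embedder would wrap a call to a real embedding model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recall_at_k(embed, docs, labeled_queries, k=1):
    """Fraction of queries whose known-relevant doc id is in the top-k results."""
    doc_vecs = {doc_id: embed(text) for doc_id, text in docs.items()}
    hits = 0
    for query, relevant_id in labeled_queries:
        q = embed(query)
        ranked = sorted(doc_vecs, key=lambda d: cosine(q, doc_vecs[d]), reverse=True)
        hits += relevant_id in ranked[:k]
    return hits / len(labeled_queries)

# Hypothetical stand-in for a general-purpose model: ignores domain terms.
def generic_embed(text):
    return [float(len(text.split())), float(len(text))]

# Hypothetical stand-in for a domain-trained model: counts domain terms.
DOMAIN_VOCAB = ["battery", "warranty", "firmware", "charger"]

def domain_embed(text):
    tokens = text.lower().split()
    return [float(tokens.count(term)) for term in DOMAIN_VOCAB]

# Toy evaluation set (assumed data, for illustration only).
docs = {
    "d1": "battery replacement steps for the handheld scanner",
    "d2": "warranty terms for the handheld scanner",
    "d3": "firmware update steps for the handheld scanner",
}
labeled_queries = [
    ("how do I swap the battery", "d1"),
    ("what does the warranty cover", "d2"),
    ("install new firmware", "d3"),
]

candidates = {"generic": generic_embed, "domain_vocab": domain_embed}
scores = {name: recall_at_k(fn, docs, labeled_queries, k=1)
          for name, fn in candidates.items()}

for name, score in scores.items():
    print(f"{name}: recall@1 = {score:.2f}")
```

The same harness works for any number of candidates; only the `candidates` mapping changes, so the comparison stays tied to the application's own queries rather than to a general leaderboard.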

Question # 12

A Generative AI Engineer at an electronics company just deployed a RAG application for customers to ask questions about products that the company carries. However, they received feedback that the RAG response often returns information about an irrelevant product.

What can the engineer do to improve the relevance of the RAG’s response?

Assess the quality of the retrieved context

Implement caching for frequently asked questions

Use a different LLM to improve the generated response

Use a different semantic similarity search algorithm

Question # 13

A Generative AI Engineer is tasked with developing an application that is based on an open source large language model (LLM). They need a foundation LLM with a large context window.

Which model fits this need?

DistilBERT

MPT-30B

Llama2-70B

DBRX

Question # 14

A Generative AI Engineer has been asked to design an LLM-based application that accomplishes the following business objective: answer employee HR questions using HR PDF documentation.

Which set of high level tasks should the Generative AI Engineer's system perform?

Calculate averaged embeddings for each HR document, compare embeddings to user query to find the best document. Pass the best document with the user query into an LLM with a large context window to generate a response to the employee.

Use an LLM to summarize HR documentation. Provide summaries of documentation and user query into an LLM with a large context window to generate a response to the user.

Create an interaction matrix of historical employee questions and HR documentation. Use ALS to factorize the matrix and create embeddings. Calculate the embeddings of new queries and use them to find the best HR documentation. Use an LLM to generate a response to the employee question based upon the documentation retrieved.

Split HR documentation into chunks and embed into a vector store. Use the employee question to retrieve best matched chunks of documentation, and use the LLM to generate a response to the employee based upon the documentation retrieved.
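For reference, the flow described in the last option (split into chunks, embed into a store, retrieve the best matches, generate from them) can be sketched as follows. This is a minimal sketch under stated assumptions: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for a vector store, the HR text is invented sample data, and the final LLM call is left out, so the sketch stops at building the prompt.

```python
import math
from collections import Counter

def chunk_text(text, chunk_size=7):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(w.strip(".,?!").lower() for w in text.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(question, store, top_k=1):
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

def build_prompt(question, chunks):
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(chunks)
    return (f"Answer using only this HR documentation:\n{context}\n\n"
            f"Question: {question}")

# Invented sample HR text, for illustration only.
hr_text = ("Employees accrue fifteen vacation days per year. "
           "Expense reports must be submitted within thirty days.")

# In-memory stand-in for a vector store: (chunk, embedding) pairs.
store = [(chunk, embed(chunk)) for chunk in chunk_text(hr_text)]

question = "How many vacation days do employees accrue?"
top_chunks = retrieve(question, store)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

Fixed-size word chunks are the simplest splitting choice; real PDF documentation would usually call for parsing plus chunk boundaries that respect sections or paragraphs.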

Question # 15

What is the most suitable library for building a multi-step LLM-based workflow?

Pandas

TensorFlow

PySpark

LangChain

Question # 16

A Generative AI Engineer is building an LLM to generate article summaries in the form of a type of poem, such as a haiku, given the article content. However, the initial output from the LLM does not match the desired tone or style.

Which approach will NOT improve the LLM’s response to achieve the desired response?

Provide the LLM with a prompt that explicitly instructs it to generate text in the desired tone and style

Use a neutralizer to normalize the tone and style of the underlying documents

Include few-shot examples in the prompt to the LLM

Fine-tune the LLM on a dataset of desired tone and style

Question # 17

A Generative AI Engineer is developing an LLM application that users can use to generate personalized birthday poems based on their names.

Which technique would be most effective in safeguarding the application, given the potential for malicious user inputs?

Implement a safety filter that detects any harmful inputs and ask the LLM to respond that it is unable to assist

Reduce the time that the users can interact with the LLM

Ask the LLM to remind the user that the input is malicious but continue the conversation with the user

Increase the amount of compute that powers the LLM to process input faster

Question # 18

A Generative AI Engineer is building a system which will answer questions on the latest stock news articles.

Which will NOT help with ensuring the outputs are relevant to financial news?

Implement a comprehensive guardrail framework that includes policies for content filters tailored to the finance sector.

Increase the compute to improve processing speed of questions to allow greater relevancy analysis

Implement a profanity filter to screen out offensive language

Incorporate manual reviews to correct any problematic outputs prior to sending to the users
