Introduction
This guide covers effective prompting strategies for getting high-quality results from large language models, with a focus on Google’s Gemini models.
Prompt engineering is iterative!
Treat these as starting points and refine based on what works for your specific use cases.
The strategies below are ordered roughly from foundational to advanced. Start with clear instructions and few-shot examples before reaching for more complex techniques.
Structure the prompt with specific sections
A well-organized prompt has distinct sections that the model can parse independently. Think of it like a document with clear headings—the model knows where to look for what. We recommend the following structure as a starting template:
- Role or task: Define who the model is and what it’s trying to accomplish. This anchors the model’s behavior for the rest of the prompt. Example: “You are a helpful AI assistant for LivePerson Bank specialized in Question Answering.”
- Instructions: The core behavioral rules. What should the model do? What should it avoid? How should it handle edge cases? Order these by importance—the model pays more attention to instructions that appear first.
- Examples (if applicable): Few-shot examples that show the model what a correct input/output pair looks like. Keep them relevant to the actual task. Generic or off-topic examples waste tokens and can confuse the model.
- Output format: Specify tone, structure, length, and any formatting rules (JSON, bullet points, etc.). This section can go before or after the prompt input depending on your use case.
- Prompt inputs or context: The actual data the model needs to work with: knowledge articles, conversation transcripts, user queries, etc. Place this after your instructions, so the model is already primed on what to do before it reads the data.
Use clear delimiters (XML tags or Markdown headings) between each section, so the model knows where one ends and the next begins. See the Use structured markup to organize prompts section below for specific examples.
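The section template above can be sketched as a small helper that assembles the pieces in the recommended order. This is an illustrative sketch, not a required schema: the function name, tag names, and `Input:`/`Output:` example labels are all assumptions you can adapt.

```python
def build_prompt(role: str, instructions: list[str],
                 examples: list[tuple[str, str]],
                 output_format: str, context: str) -> str:
    """Assemble a prompt with XML-style delimiters between sections."""
    parts = [f"<role>\n{role}\n</role>"]
    # Instructions ordered by importance: most critical first.
    numbered = "\n".join(f"{i}. {line}" for i, line in enumerate(instructions, 1))
    parts.append(f"<instructions>\n{numbered}\n</instructions>")
    if examples:
        shots = "\n".join(f"Input: {inp}\nOutput: {out}" for inp, out in examples)
        parts.append(f"<examples>\n{shots}\n</examples>")
    parts.append(f"<output_format>\n{output_format}\n</output_format>")
    # Context goes last so the model is primed on the task before reading the data.
    parts.append(f"<context>\n{context}\n</context>")
    return "\n\n".join(parts)

prompt = build_prompt(
    role="You are a helpful AI assistant for LivePerson Bank specialized in Question Answering.",
    instructions=["Answer only from the provided context.",
                  "Keep answers under 3 sentences."],
    examples=[("What is the wire fee?",
               "The wire fee is listed in the fee schedule article.")],
    output_format="Plain text, professional tone.",
    context="Knowledge articles would go here.",
)
```

Keeping assembly in one function makes it easy to reorder sections later if iteration shows a different placement works better for your task.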
Write clear, specific instructions
One of the most impactful things you can do is be explicit about what you want.
Remove ambiguity by specifying:
- Constraints: How long should the response be? What should be included or excluded?
- Format: Do you want a table, bullet list, JSON, paragraph, or something else?
- Tone and style: Set conversational vs. formal tone, verbosity level, etc.
From our experimentation and literature review, we have found that instructions at the beginning of the prompt are given more attention by the model. This means that you should order your instructions by how important they are for completing the task. For example, a description of exactly how information should be extracted from a conversation is more crucial than an output format constraint like length.
Use Few-Shot examples
Including examples in your prompt is one of the most reliable ways to steer Gemini’s behavior. Few-shot prompts—those with examples—consistently outperform zero-shot prompts (no examples). Google’s own guidance recommends always including them.
Best practices for examples
- Show, don’t tell. Clear examples can sometimes replace written instructions entirely.
- Use positive patterns. Show what the model should do, not what it should avoid. Positive examples are more effective than anti-patterns.
- Keep formatting consistent. If your examples use different structures, the model may produce inconsistent output. Standardize your XML tags, whitespace, and delimiters.
- Don’t overdo it. Gemini picks up patterns from just a few examples. Too many can cause overfitting, where the model mimics examples too literally.
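A minimal sketch of consistent few-shot formatting: every example is rendered with the identical template and delimiters. The sentiment labels and the `Input:`/`Output:` prefixes are illustrative assumptions, not something Gemini requires.

```python
def format_shots(pairs: list[tuple[str, str]]) -> str:
    """Render every (input, output) example with the same template."""
    return "\n\n".join(f"Input: {inp}\nOutput: {out}" for inp, out in pairs)

# A handful of positive examples is enough; more risks overfitting.
shots = format_shots([
    ("I love this product!", "positive"),
    ("The checkout page keeps crashing.", "negative"),
    ("When do you open on Saturdays?", "neutral"),
])
```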
Provide context
Don’t assume the model knows what you know. Include the reference material, data, or background information it needs to give you a useful answer.
Grounding the model in provided context produces far more specific and accurate responses than relying on its general training knowledge.
From our experimentation, we have found that including your retrieved context after your instructions leads to better results: the model first primes on the task, then grounds that task in the appropriate context.
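A minimal template illustrating this instructions-first, context-after ordering. The wording and the `<context>` tag are illustrative assumptions.

```python
INSTRUCTIONS = (
    "Answer the user's question using only the information in <context>. "
    "If the context does not contain the answer, say so."
)

def grounded_prompt(context: str, question: str) -> str:
    """Instructions first, retrieved context after, question last."""
    return (
        f"{INSTRUCTIONS}\n\n"
        f"<context>\n{context}\n</context>\n\n"
        f"Question: {question}"
    )

p = grounded_prompt("Branches close at 5 p.m. on weekdays.",
                    "What time do branches close?")
```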
Use prefixes to structure prompts
Prefixes are labels that you add to different parts of the prompt to help the model parse your intent. There are three types:
| Prefix type | What it does | Example |
|---|---|---|
| Input prefix | Labels the input data so the model knows what it’s working with | "Text: ", "English: ", "Order: " |
| Output prefix | Signals what format or type the response should be | "JSON: ", "The answer is: " |
| Example prefix | Labels examples in few-shot prompts so outputs are easier to parse | "Input: ", "Output: ", "Q: ", "A: " |
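The three prefix types can be combined in one prompt, as in this hypothetical intent-classification sketch (the intent labels and example texts are invented for illustration):

```python
def intent_prompt(query: str) -> str:
    """Q:/A: label the few-shot examples, Text: labels the input data,
    and the trailing A: acts as an output prefix cueing a bare label."""
    examples = (
        "Q: Text: I can't log in to my account\n"
        "A: account_access\n\n"
        "Q: Text: Where is my refund?\n"
        "A: billing\n\n"
    )
    return f"{examples}Q: Text: {query}\nA:"

p = intent_prompt("My card was charged twice")
```

Because every example ends with `A:` followed by a label, ending the prompt on a bare `A:` nudges the model to reply with just the label, which keeps parsing simple.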
Use structured markup to organize prompts
When your prompt has multiple distinct parts—instructions, context, examples, constraints—wrapping them in XML-style tags or Markdown headings makes it much easier for the model to parse what’s what. This is especially effective with newer models, which are tuned to respond well to structured input.
XML tags work well because they create unambiguous boundaries. The model can clearly see where your context ends and your task begins, which reduces the chance of it confusing data for instructions (or vice versa). Markdown headings serve a similar purpose and feel more natural if you’re already writing in a conversational style.
XML example
<task>
Write a 3-sentence executive summary of this quarter’s performance.
</task>
<constraints>
- Tone: professional, optimistic
- Do not include exact dollar figures
</constraints>
<context>
Our Q4 revenue was $2.3M, up 15% from Q3. Customer churn dropped to 4.2%.
</context>
Markdown example
# Task
Write a 3-sentence executive summary of this quarter’s performance.
# Constraints
- Tone: professional, optimistic
- Do not include exact dollar figures
# Context
Our Q4 revenue was $2.3M, up 15% from Q3. Customer churn dropped to 4.2%.
Iterate on the prompt
If your prompt isn’t producing the results you want, try these adjustments before starting from scratch:
- Rephrase. Different wording can yield significantly different results, even when the meaning is the same.
- Reframe the task. If classification isn’t working well, try framing it as a multiple-choice question instead.
- Reorder content. The placement of examples, context, and instructions relative to each other can affect the response.
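As a hypothetical illustration of reframing, here is the same intent-detection task phrased as open-ended classification versus multiple choice (the options and message are invented):

```python
open_ended = (
    "Classify the customer's intent.\n"
    "Message: I want to close my account."
)

multiple_choice = (
    "Which option best describes the customer's intent?\n"
    "(a) open an account\n"
    "(b) close an account\n"
    "(c) update contact details\n"
    "Message: I want to close my account.\n"
    "Answer with a single letter."
)
```

The multiple-choice version constrains the output space, which often makes results easier to parse and evaluate.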
Tune model parameters
Beyond the prompt itself, you can control model behavior through hyperparameters. These are settings configured outside the prompt text: in the prompt's settings in the Prompt Library, in the API call, or in the Vertex AI playground.
- Temperature: This value between 0.0 and 2.0 inclusive controls randomness. Lower = more deterministic and repeatable; higher = more creative but riskier.
Google recommends 1.0 as the default for Gemini 2.5 models, since these models have improved internal calibration. However, for our use cases like KnowledgeAI and Conversation Assist, we recommend a temperature of 0 (zero) to ensure repeatable, reliable responses. When the model is grounding answers in retrieved context, we want consistency, not creativity.
LivePerson's Prompt Library only supports values between 0.0 and 1.0 inclusive, as values above 1 are not particularly useful.
For some newer models, like the Gemini 3 series, it is actually recommended that you keep the temperature at 1.0 to avoid unexpected behavior, regardless of the use case.
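If you are calling the model directly, temperature is set in the request's generation config rather than in the prompt text. This sketch builds a `generateContent`-style request body; the field names follow the public Gemini API, but verify them against the current API reference before relying on them.

```python
def make_request_body(prompt: str, temperature: float = 0.0) -> dict:
    """Build a generateContent-style payload with an explicit temperature."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        # 0.0 for repeatable, grounded answers; raise it for creative tasks.
        "generationConfig": {"temperature": temperature},
    }

body = make_request_body("Summarize the attached article.")
```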
Use of variables
Learn about best practices when using variables.
Prompt management
When creating and managing prompts, follow these best practices. They’ll help you avoid unintended impacts to your Generative AI solution.
- Minor changes: These include changes like typo fixes. It’s okay to edit the prompt directly.
- Major changes: To make these, we recommend duplicating the prompt and testing the changes in the copy first. This avoids impacting your Production solution while you’re testing and verifying. Always test before changing your Production solution.
- Duplicate feature: Take advantage of this feature. It lets you implement self-managed versioning: Duplicate Prompt A v1.0 to create Prompt A v2.0. Duplicate Prompt A v2.0 to create Prompt A v3.0. And so on. This strategy has two important benefits: 1) Your Production solution isn’t impacted, because you work in new, independent copies of prompts. 2) Keeping versions distinct lets you revert your solution to an earlier version of a prompt if needed.
- Edit feature: This feature lets you make changes to a prompt. But for safety, we recommend using the duplicate feature for major changes. Always fully test any substantive changes, especially major ones.
Prompt testing
Even modest changes to prompts can produce very different results, so always test a prompt fully before using it in Production. Take advantage of the following tools:
- KnowledgeAI’s testing tools: Use these to test the article matching, and to see enriched answers that are generated without any conversation context as input to the LLM. In the results, you can see the articles that were matched, the prompt sent to the LLM service, and the enriched answer that was returned. This tool can help you to tune the performance of the knowledge base. It also gives you some insight into how well the Generative AI piece of your solution is performing.
- Conversation Builder’s Preview and Conversation Tester: Use either of these tools to fully test your Generative AI solution. Both tools give you a better view into performance because both pass previous turns from the current conversation to the LLM service, not just the matched articles and the prompt. This added context enhances the quality of the enriched answers, so these tools give you the most complete picture.
Use the Conversation Tester to test the end-to-end flow. With Preview, the conversation only flows between the tool and the underlying bot server. With Conversation Tester, it goes through Conversational Cloud.
Using Generative AI in Conversation Assist? To fully test prompt changes and include the conversation context as input to the LLM, you’ll need to create a messaging bot in Conversation Builder. Configure it to match your Conversation Assist configuration, for example, use the same answer threshold. You can quickly create a messaging bot via a bot template. Specifically, use the Generative AI - Messaging bot template.
Releasing prompt changes
- Generative AI in Conversation Assist: First test via a Conversation Builder test bot. Then update the prompt configuration in Conversation Assist.
- Generative AI in Conversation Builder bots: We recommend you take advantage of Conversation Builder’s Release feature to control and manage how prompt changes are made live in your Production bots. First make the updates in a Development or Sandbox bot and test. When you’re ready, push those changes to the Production bot.