# LangGraph and LangSmith - Agentic RAG Powered by LangChain

In the following notebook we'll complete the following tasks:

- 🤝 Breakout Room #1:
  1. Install required libraries
  2. Set Environment Variables
  3. Creating our Tool Belt
  4. Creating Our State
  5. Creating and Compiling A Graph!

- 🤝 Breakout Room #2:
  1. Evaluating the LangGraph Application with LangSmith
  2. Adding Helpfulness Check and "Loop" Limits
  3. LangGraph for the "Patterns" of GenAI

# 🤝 Breakout Room #1

## Part 1: LangGraph - Building Cyclic Applications with LangChain

LangGraph is a tool that leverages LangChain Expression Language to build coordinated multi-actor and stateful applications that include cyclic behaviour.

### Why Cycles?

In essence, we can think of a cycle in our graph as a more robust and customizable loop. It allows us to keep our application agent-forward while still giving the powerful functionality of traditional loops.

Due to the inclusion of cycles over loops, we can also compose rather complex flows through our graph in a much more readable and natural fashion, effectively allowing us to recreate application flowcharts in code in an almost 1-to-1 fashion.

### Why LangGraph?

Beyond the agent-forward approach - we can easily compose and combine traditional "DAG" (directed acyclic graph) chains with powerful cyclic behaviour due to the tight integration with LCEL. This means it's a natural extension to LangChain's core offerings!

## Task 1:  Dependencies

We'll first install all our required libraries.

> NOTE: If you're running this locally - please skip this step.

In [1]:
!pip install -qU langchain langchain_openai langchain-community langgraph arxiv


## Task 2: Environment Variables

We'll want to set both our OpenAI API key and our LangSmith environment variables.

In [2]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [3]:
os.environ["TAVILY_API_KEY"] = getpass.getpass("TAVILY_API_KEY")

In [4]:
from uuid import uuid4

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE6 - LangGraph - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key: ")

## Task 3: Creating our Tool Belt

As is usually the case, we'll want to equip our agent with a toolbelt to help answer questions and add external knowledge.

There's a tonne of tools in the [LangChain Community Repo](https://github.com/langchain-ai/langchain/tree/master/libs/community/langchain_community/tools) but we'll stick to a couple just so we can observe the cyclic nature of LangGraph in action!

We'll leverage:

- [Tavily Search Results](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/tools/tavily_search/tool.py)
- [Arxiv](https://github.com/langchain-ai/langchain/tree/master/libs/community/langchain_community/tools/arxiv)

#### 🏗️ Activity #1:

Please add the tools to use into our toolbelt.

> NOTE: Each tool in our toolbelt should be a method.

In [5]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.tools.arxiv.tool import ArxivQueryRun

tavily_tool = TavilySearchResults(max_results=5)

tool_belt = [
    tavily_tool,
    ArxivQueryRun(),
]

### Model

Now we can set-up our model! We'll leverage the familiar OpenAI model suite for this example - but it's not *necessary* to use with LangGraph. LangGraph supports all models - though you might not find success with smaller models - as such, they recommend you stick with:

- OpenAI's GPT-3.5 and GPT-4
- Anthropic's Claude
- Google's Gemini

> NOTE: Because we're leveraging the OpenAI function calling API - we'll need to use OpenAI *for this specific example* (or any other service that exposes an OpenAI-style function calling API).

In [6]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o", temperature=0)

Now that we have our model set-up, let's "put on the tool belt", which is to say: We'll bind our LangChain formatted tools to the model in an OpenAI function calling format.

In [7]:
model = model.bind_tools(tool_belt)

#### ❓ Question #1:

How does the model determine which tool to use?

#### Answer #1:

The `tool_belt` list defines the available tools (Tavily Search and Arxiv search).

`bind_tools` registers the two tools with the model, so the LLM knows it has access to external tools with specific capabilities. It attaches metadata about each tool (name, description, input schema) to the model, which allows the model to decide which one to use during response generation.

Without `bind_tools`, the LLM won't know that a tool like `search_arxiv()` even exists, or how to use it.

To determine which tool to use, the LLM itself, at generation time, uses the tool metadata (name, description) together with the user query and any prior context to decide which tool to call.
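
As a rough illustration of the metadata involved, the sketch below builds a name/description/parameters record from a plain Python function using only the standard library. The `describe_tool` helper and the `search_arxiv` function are hypothetical names made up for this example; LangChain's real conversion is more thorough (it produces a full JSON schema), so treat this as a simplified picture of what the model gets to see, not the library's implementation.

```python
import inspect

def search_arxiv(query: str) -> str:
    """Search arXiv for papers matching the query."""
    return f"results for {query}"  # placeholder body, no network call

def describe_tool(func):
    """Collect the kind of metadata a model uses to pick a tool (hypothetical helper)."""
    signature = inspect.signature(func)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        # Map each parameter name to the name of its annotated type.
        "parameters": {
            name: param.annotation.__name__
            for name, param in signature.parameters.items()
        },
    }

tool_spec = describe_tool(search_arxiv)
print(tool_spec)
```

The point of the sketch is that a clear function name and docstring matter: they are the main signals the model has when choosing between tools.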

## Task 4: Putting the State in Stateful

Earlier we used this phrasing:

`coordinated multi-actor and stateful applications`

So what does that "stateful" mean?

To put it simply - we want to have some kind of object which we can pass around our application that holds information about what the current situation (state) is. Since our system will be constructed of many parts moving in a coordinated fashion - we want to be able to ensure we have some commonly understood idea of that state.

LangGraph leverages a `StateGraph` which uses an `AgentState` object to pass information between the various nodes of the graph.

There are more options than what we'll see below - but this `AgentState` object is a `TypedDict` with the key `messages`, whose value is a list of messages (`BaseMessage` objects) that is appended to whenever the state changes.

Let's think about a simple example to help understand exactly what this means (we'll simplify a great deal to try and clearly communicate what state is doing):

1. We initialize our state object:
  - `{"messages" : []}`
2. Our user submits a query to our application.
  - New State: `HumanMessage(#1)`
  - `{"messages" : [HumanMessage(#1)]}`
3. We pass our state object to an Agent node which is able to read the current state. It will use the last `HumanMessage` as input. It gets some kind of output which it will add to the state.
  - New State: `AgentMessage(#1, additional_kwargs {"function_call" : "WebSearchTool"})`
  - `{"messages" : [HumanMessage(#1), AgentMessage(#1, ...)]}`
4. We pass our state object to a "conditional node" (more on this later) which reads the last state to determine if we need to use a tool - which it can determine properly because of our provided object!
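
The walk-through above can be sketched in plain Python. The `Message` dataclass, `append_messages`, and `apply_update` are hypothetical stand-ins invented for this illustration; the real `add_messages` reducer in LangGraph does more (for example, it handles message IDs), so this only shows the append-only idea.

```python
from dataclasses import dataclass

@dataclass
class Message:
    role: str      # "human", "ai", or "tool"
    content: str

def append_messages(existing, new):
    """Reducer: return a new list with the update appended (append-only)."""
    return existing + new

def apply_update(state, update):
    """Merge a node's partial update into the state using the reducer."""
    return {"messages": append_messages(state["messages"], update["messages"])}

# Step 1: empty state. Step 2: the user's query. Step 3: the agent's reply.
state = {"messages": []}
state = apply_update(state, {"messages": [Message("human", "Who wrote QLoRA?")]})
state = apply_update(state, {"messages": [Message("ai", "Calling the search tool")]})

print([m.role for m in state["messages"]])
```

Each node only returns the messages it wants to add; the reducer is what turns those partial updates into a growing history.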

In [8]:
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
import operator
from langchain_core.messages import BaseMessage

class AgentState(TypedDict):
  messages: Annotated[list, add_messages]

## Task 5: It's Graphing Time!

Now that we have state, and we have tools, and we have an LLM - we can finally start making our graph!

Let's take a second to refresh ourselves about what a graph is in this context.

Graphs, also called networks in some circles, are a collection of connected objects.

The objects in question are typically called nodes, or vertices, and the connections are called edges.

Let's look at a simple graph.

![image](https://i.imgur.com/2NFLnIc.png)

Here, we're using the coloured circles to represent the nodes and the yellow lines to represent the edges. In this case, we're looking at a fully connected graph - where each node is connected by an edge to each other node.

If we were to think about nodes in the context of LangGraph - we would think of a function, or an LCEL runnable.

If we were to think about edges in the context of LangGraph - we might think of them as "paths to take" or "where to pass our state object next".

Let's create some nodes and expand on our diagram.

> NOTE: Due to the tight integration with LCEL - we can comfortably create our nodes in an async fashion!

In [9]:
from langgraph.prebuilt import ToolNode

def call_model(state):
  messages = state["messages"]
  response = model.invoke(messages)
  return {"messages" : [response]}

tool_node = ToolNode(tool_belt)

Now we have two total nodes. We have:

- `call_model` is a node that will...well...call the model
- `tool_node` is a node which can call a tool

Let's start adding nodes! We'll update our diagram along the way to keep track of what this looks like!


In [34]:
from langgraph.graph import StateGraph, END

uncompiled_graph = StateGraph(AgentState)

uncompiled_graph.add_node("agent", call_model)
uncompiled_graph.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x127c42ad0>

Let's look at what we have so far:

![image](https://i.imgur.com/md7inqG.png)

Next, we'll add our entrypoint. All our entrypoint does is indicate which node is called first.

In [11]:
uncompiled_graph.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x114672f90>

![image](https://i.imgur.com/wNixpJe.png)

Now we want to build a "conditional edge" which will use the output state of a node to determine which path to follow.

We can help conceptualize this by thinking of our conditional edge as a conditional in a flowchart!

Notice how our function simply checks whether the last message contains any `tool_calls`.

Then we create an edge where the origin node is our agent node and our destination node is *either* the action node or the END (finish the graph).

It's important to highlight that `add_conditional_edges` accepts an optional third parameter, a mapping from the possible outputs of our conditional function to node names. We don't pass one here, so the return values of `should_continue` are used as destinations directly: `"action"` routes to the action node and `END` finishes the graph.

In [12]:
def should_continue(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  return END

uncompiled_graph.add_conditional_edges(
    "agent",
    should_continue
)

<langgraph.graph.state.StateGraph at 0x114672f90>

Let's visualize what this looks like.

![image](https://i.imgur.com/8ZNwKI5.png)

Finally, we can add our last edge which will connect our action node to our agent node. This is because we *always* want our action node (which is used to call our tools) to return its output to our agent!

In [13]:
uncompiled_graph.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x114672f90>

Let's look at the final visualization.

![image](https://i.imgur.com/NWO7usO.png)

All that's left to do now is to compile our workflow - and we're off!

In [14]:
compiled_graph = uncompiled_graph.compile()

#### ❓ Question #2:

Is there any specific limit to how many times we can cycle?

If not, how could we impose a limit to the number of cycles?

#### Answer #2:
LangGraph does set a limit. By default, a run stops once it reaches a recursion limit of 25 steps and raises a `GraphRecursionError`, which prevents infinite loops.

To impose a different limit on the number of cycles, we can set `recursion_limit` in the config passed when invoking the graph (it is not a parameter of `compile`):

compiled_graph.invoke(inputs, config={"recursion_limit": 10})
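
To make the idea of a step cap concrete, here is a plain-Python sketch of a runner that cycles between two nodes and raises once a limit is hit. `run_with_limit`, `StepLimitExceeded`, and the toy nodes are hypothetical names for illustration only; this is not LangGraph's execution engine, just the general shape of a bounded cycle.

```python
class StepLimitExceeded(RuntimeError):
    """Raised when the loop runs more steps than allowed (hypothetical)."""

def run_with_limit(nodes, router, state, entry, limit):
    """Run nodes one after another, following the router, for at most `limit` steps."""
    current = entry
    steps = 0
    while current is not None:
        if steps >= limit:
            raise StepLimitExceeded(f"stopped after {limit} steps")
        state = nodes[current](state)
        current = router(current, state)
        steps += 1
    return state

def agent(state):
    # Toy agent: counts how many times it has been called.
    return {"count": state["count"] + 1}

def action(state):
    # Toy tool node: passes the state through unchanged.
    return state

def router(current, state):
    # Agent goes to the action node until it has run three times, then stops.
    if current == "agent":
        return "action" if state["count"] < 3 else None
    return "agent"

nodes = {"agent": agent, "action": action}
final_state = run_with_limit(nodes, router, {"count": 0}, "agent", limit=25)
print(final_state)
```

With a generous limit the run finishes normally; with a limit smaller than the number of steps needed, the runner raises instead of looping forever.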

## Using Our Graph

Now that we've created and compiled our graph - we can call it *just as we'd call any other* `Runnable`!

Let's try out a few examples to see how it fares:

In [35]:
from langchain_core.messages import HumanMessage

inputs = {"messages" : [HumanMessage(content="Who is the current captain of the Winnipeg Jets?")]}

async for chunk in compiled_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_pH8wHpgrsx3jTB6357CYOuml', 'function': {'arguments': '{"query":"current captain of the Winnipeg Jets 2023"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 162, 'total_tokens': 189, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6b6e24b474', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-2b53e9c4-f1f6-4683-88ce-e91b0284c7c9-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'current captain of the Winnipeg Jets 2023'}, 'id': 'call_pH8wHpgrsx3jTB6357CYOuml', 'type': 'tool_call'}], usage_metadata={'input_tokens': 162, 'output_t

Let's look at what happened:

1. Our state object was populated with our request
2. The state object was passed into our entry point (agent node) and the agent node added an `AIMessage` to the state object and passed it along the conditional edge
3. The conditional edge received the state object, found the "tool_calls" `additional_kwarg`, and sent the state object to the action node
4. The action node ran the requested tool, added the tool's result to the state object as a `ToolMessage`, and passed it along the edge to the agent node
5. The agent node added a response to the state object and passed it along the conditional edge
6. The conditional edge received the state object, could not find the "tool_calls" `additional_kwarg` and passed the state object to END where we see it output in the cell above!

Now let's look at an example that shows multiple tool usage - all with the same flow!

In [16]:
inputs = {"messages" : [HumanMessage(content="Search Arxiv for the QLoRA paper, then search each of the authors to find out their latest Tweet using Tavily!")]}

async for chunk in compiled_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        if node == "action":
          print(f"Tool Used: {values['messages'][0].name}")
        print(values["messages"])

        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_PCF8rpPXTP0PTo851aTpUZCD', 'function': {'arguments': '{"query":"QLoRA"}', 'name': 'arxiv'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 178, 'total_tokens': 195, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6b6e24b474', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-f26eb76d-919b-4fa9-8779-9dd2069fb246-0', tool_calls=[{'name': 'arxiv', 'args': {'query': 'QLoRA'}, 'id': 'call_PCF8rpPXTP0PTo851aTpUZCD', 'type': 'tool_call'}], usage_metadata={'input_tokens': 178, 'output_tokens': 17, 'total_tokens': 195, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'a

#### 🏗️ Activity #2:

Please write out the steps the agent took to arrive at the correct answer.

Answer:
1. The agent started with the `HumanMessage` from the user asking it to find the QLoRA paper and each author's latest tweet. Our state object was populated with this request.
2. The state object was passed into our entry point (agent node) and the agent node added an `AIMessage` to the state object and passed it along the conditional edge.
3. The conditional edge received the state object, found a tool call for the `arxiv` tool, and sent the state object to the action node.
4. The action node ran the Arxiv search for the QLoRA paper, added the tool's response to the state object, and passed it along the edge to the agent node. The result listed the paper's authors: Tim Dettmers, Artidoro Pagnoni, Ari Holtzman, Luke Zettlemoyer.
5. The agent node added a response to the state object and passed it along the conditional edge, i.e. it requested another tool, Tavily search, to find the latest tweet for each of the 4 authors above. It made four parallel tool calls in a single message, each with its own query such as 'Luke Zettlemoyer latest tweet'.
6. The action node called the Tavily Search tool for each of the 4 authors and returned the results to the agent.
7. The agent determined that results for all 4 authors had been received, formatted the answer, and ended the flow. It sent back an `AIMessage` with 4 links, one to each tweet.

# 🤝 Breakout Room #2

## Part 1: LangSmith Evaluator

### Pre-processing for LangSmith

To do a little bit more preprocessing, let's wrap our LangGraph agent in a simple chain.

In [36]:
def convert_inputs(input_object):
  return {"messages" : [HumanMessage(content=input_object["question"])]}

def parse_output(input_state):
  return input_state["messages"][-1].content

agent_chain = convert_inputs | compiled_graph | parse_output

In [37]:
agent_chain.invoke({"question" : "What is RAG?"})

"RAG stands for Retrieval-Augmented Generation. It is a technique used in natural language processing (NLP) that combines retrieval-based methods with generative models to improve the quality and relevance of generated text. Here's a brief overview of how it works:\n\n1. **Retrieval**: The system first retrieves relevant documents or pieces of information from a large corpus or database. This step is crucial for grounding the generative model in factual and contextually relevant information.\n\n2. **Augmentation**: The retrieved information is then used to augment the input to a generative model. This can involve concatenating the retrieved text with the original input or using it to inform the model's understanding of the context.\n\n3. **Generation**: Finally, a generative model, such as a transformer-based language model, uses the augmented input to produce a response or generate text. The inclusion of retrieved information helps ensure that the output is more accurate and contextua

### Task 1: Creating An Evaluation Dataset

Just as we saw last week, we'll want to create a dataset to test our Agent's ability to answer questions.

In order to do this - we'll want to provide some questions and some answers. Let's look at how we can create such a dataset below.

```python
questions = [
    "What optimizer is used in QLoRA?",
    "What data type was created in the QLoRA paper?",
    "What is a Retrieval Augmented Generation system?",
    "Who authored the QLoRA paper?",
    "What is the most popular deep learning framework?",
    "What significant improvements does the LoRA system make?"
]

answers = [
    {"must_mention" : ["paged", "optimizer"]},
    {"must_mention" : ["NF4", "NormalFloat"]},
    {"must_mention" : ["ground", "context"]},
    {"must_mention" : ["Tim", "Dettmers"]},
    {"must_mention" : ["PyTorch", "TensorFlow"]},
    {"must_mention" : ["reduce", "parameters"]},
]
```

#### 🏗️ Activity #3:

Please create a dataset in the above format with at least 5 questions.

In [19]:
questions = [
    "What optimizer is used in QLoRA?",
    "What data type was created in the QLoRA paper?",
    "What is a Retrieval Augmented Generation system?",
    "Who authored the QLoRA paper?",
    "What is the most popular deep learning framework?",
    "What significant improvements does the LoRA system make?"
]

answers = [
    {"must_mention" : ["paged", "optimizer"]},
    {"must_mention" : ["NF4", "NormalFloat"]},
    {"must_mention" : ["ground", "context"]},
    {"must_mention" : ["Tim", "Dettmers"]},
    {"must_mention" : ["PyTorch", "TensorFlow"]},
    {"must_mention" : ["reduce", "parameters"]},
]

Now we can add our dataset to our LangSmith project using the following code which we saw last Thursday!

In [20]:
from langsmith import Client

client = Client()

dataset_name = f"Retrieval Augmented Generation - Evaluation Dataset - {uuid4().hex[0:8]}"

dataset = client.create_dataset(
    dataset_name=dataset_name,
    description="Questions about the QLoRA Paper to Evaluate RAG over the same paper."
)

client.create_examples(
    inputs=[{"question" : q} for q in questions],
    outputs=answers,
    dataset_id=dataset.id,
)

{'example_ids': ['f6cf2a19-2b03-4c3b-a550-1826432dbc78',
  '93986ba4-42f3-4425-b421-6bb1a544a9bc',
  'fa2e903c-eeb7-4fea-9bb8-841f5377dcbb',
  '72151714-ca7e-44d5-9bdc-e5d6d86cb3ee',
  '6d48e0c3-be9f-49f5-9c6a-ac391fe2dc36',
  '1b832f22-ac30-4878-a594-12150aba0967'],
 'count': 6}

#### ❓ Question #3:

How are the correct answers associated with the questions?

> NOTE: Feel free to indicate if this is problematic or not


## ANSWERS
- LangSmith associates each question with the answer at the same index in the list.

This is positional: questions[0] is matched with answers[0], questions[1] is matched with answers[1].

This works correctly only if the two lists are perfectly aligned in order and length.


Yes, this is problematic:
- If you add/remove an item from `questions` but not `answers`, the pairing silently breaks.
- It's easy to misalign the lists when editing either one later.
- There's no explicit mapping, which reduces readability and robustness.

A better way to associate the answers with the questions is to use explicit examples:
examples = [
    {
        "input": {"question": "What optimizer is used in QLoRA?"},
        "output": {"must_mention": ["paged", "optimizer"]}
    },
    # ... one entry per question
]
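
One way to act on this, sketched under the assumption that each example is stored as a single dictionary, is to keep question and answer together and derive the two parallel lists only at upload time. The `split_examples` helper and the variable names are hypothetical; the derived lists have the same shape as the `inputs=` and `outputs=` arguments passed to `client.create_examples` earlier.

```python
examples = [
    {"input": {"question": "What optimizer is used in QLoRA?"},
     "output": {"must_mention": ["paged", "optimizer"]}},
    {"input": {"question": "Who authored the QLoRA paper?"},
     "output": {"must_mention": ["Tim", "Dettmers"]}},
]

def split_examples(examples):
    """Return parallel inputs/outputs lists derived from explicit pairs."""
    example_inputs = [example["input"] for example in examples]
    example_outputs = [example["output"] for example in examples]
    return example_inputs, example_outputs

example_inputs, example_outputs = split_examples(examples)
print(len(example_inputs), len(example_outputs))
```

Because both lists come from the same source, adding or removing an example can no longer shift the pairing.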

### Task 2: Adding Evaluators

Now we can add a custom evaluator to see if our responses contain the expected information.

We'll be using a fairly naive exact-match process to determine if our response contains specific strings.

In [21]:
from langsmith.evaluation import EvaluationResult, run_evaluator

@run_evaluator
def must_mention(run, example) -> EvaluationResult:
    prediction = run.outputs.get("output") or ""
    required = example.outputs.get("must_mention") or []
    score = all(phrase in prediction for phrase in required)
    return EvaluationResult(key="must_mention", score=score)

#### ❓ Question #4:

What are some ways you could improve this metric as-is?

Our current approach is a naive exact-match evaluator: it checks that the agent's response contains all of the required keywords or phrases. It is fast, deterministic, and easy to debug. However, it has limitations.

It is too rigid, so it won't catch synonyms or paraphrases: "paged optimizer" ≠ "optimizer using paging".
It is case sensitive. It also gives no partial credit: the score is True only if every required phrase is present, and False if even one is missing.

Simple Improvements:
1) Use fuzzy string matching (e.g. fuzzywuzzy, difflib)
2) Use embedding-based similarity or an LLM evaluator for semantic understanding
3) To avoid all-or-nothing scoring, score proportionally: score = num_matched / len(required)
4) Normalize case + strip punctuation
5) Add an evaluator to check if tool outputs were cited correctly

Advanced Improvements:

1) Semantic meaning: use embedding-based similarity measures (e.g. nearest-neighbour or approximate matching) to judge contextual relevance, for example via one of LangChain's string evaluators.

2) Use an LLM-based evaluator: ask GPT "Did this response correctly answer the question using the expected facts?"
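
A minimal sketch of improvements 3 and 4 combined is shown below: a scorer that normalizes case and punctuation and returns the fraction of required phrases found. `normalize` and `must_mention_score` are hypothetical helper names, and the function works on plain strings; wiring the resulting value into an `EvaluationResult` is left as in the evaluator above.

```python
import string

def normalize(text):
    """Lower-case the text and strip ASCII punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def must_mention_score(prediction, required):
    """Return the fraction of required phrases found in the prediction."""
    if not required:
        return 1.0
    normalized_prediction = normalize(prediction)
    matched = sum(normalize(phrase) in normalized_prediction for phrase in required)
    return matched / len(required)

score = must_mention_score("QLoRA uses Paged Optimizers.", ["paged", "optimizer"])
partial = must_mention_score("The paper introduces NF4.", ["NF4", "NormalFloat"])
print(score, partial)
```

The first call gets full credit despite the different casing, and the second gets half credit for mentioning one of the two required phrases instead of failing outright.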

### Task 3: Evaluating

All that is left to do is evaluate our agent's response!

In [22]:
experiment_results = client.evaluate(
    agent_chain,
    data=dataset_name,
    evaluators=[must_mention],
    experiment_prefix=f"RAG Pipeline - Evaluation - {uuid4().hex[0:4]}",
    metadata={"version": "1.0.0"},
)

View the evaluation results for experiment: 'RAG Pipeline - Evaluation - abb3-4d91a69f' at:
https://smith.langchain.com/o/4e223e9d-b789-4c00-8d16-32ad70974f10/datasets/11372d31-b6ee-42c4-8c64-cf64a0c7c413/compare?selectedSessions=63bb8652-e7de-4822-a8be-515b6f8b803d





In [23]:
experiment_results

|  | inputs.question | outputs.output | error | reference.must_mention | feedback.must_mention | execution_time | example_id | id |
|---|---|---|---|---|---|---|---|---|
| 0 | What significant improvements does the LoRA sy... | The LoRA (Low-Rank Adaptation) system has made... |  | [reduce, parameters] | False | 4.796972 | 1b832f22-ac30-4878-a594-12150aba0967 | d20395e3-acd9-4927-95ed-883dfe63f11c |
| 1 | What is the most popular deep learning framework? | In 2023, the most popular deep learning framew... |  | [PyTorch, TensorFlow] | True | 3.882287 | 6d48e0c3-be9f-49f5-9c6a-ac391fe2dc36 | 2075e68b-d272-4854-a92f-7ce5588d009e |
| 2 | Who authored the QLoRA paper? | The QLoRA paper titled "Accurate LoRA-Finetuni... |  | [Tim, Dettmers] | False | 6.106932 | 72151714-ca7e-44d5-9bdc-e5d6d86cb3ee | 779f7707-77c4-4d98-9b19-425291a390ca |
| 3 | What data type was created in the QLoRA paper? | The QLoRA paper introduced the concept of "Qua... |  | [NF4, NormalFloat] | False | 2.833395 | 93986ba4-42f3-4425-b421-6bb1a544a9bc | 4c339a08-769d-49af-965e-6c2e92181768 |
| 4 | What optimizer is used in QLoRA? | QLoRA uses "paged optimizers" to manage memory... |  | [paged, optimizer] | True | 2.221695 | f6cf2a19-2b03-4c3b-a550-1826432dbc78 | 165db289-dc6f-4567-9f9c-7812b69382c0 |
| 5 | What is a Retrieval Augmented Generation system? | A Retrieval Augmented Generation (RAG) system ... |  | [ground, context] | True | 2.372521 | fa2e903c-eeb7-4fea-9bb8-841f5377dcbb | 1c77e7ac-e7b7-4a75-9229-a89a0f2ce57e |


## Part 2: LangGraph with Helpfulness:

### Task 3: Adding Helpfulness Check and "Loop" Limits

Now that we've done evaluation - let's see if we can add an extra step where we review the content we've generated to confirm if it fully answers the user's query!

We're going to make a few key adjustments to account for this:

1. We're going to add an artificial limit on how many "loops" the agent can go through - this will help us to avoid the potential situation where we never exit the loop.
2. We'll add to our existing conditional edge to obtain the behaviour we desire.

First, let's define our state again - we can check the length of the state object, so we don't need additional state for this.

In [24]:
class AgentState(TypedDict):
  messages: Annotated[list, add_messages]

Now we can set our graph up! This process will be almost entirely the same - with the inclusion of one additional node/conditional edge!

#### 🏗️ Activity #5:

Please write markdown for the following cells to explain what each is doing.

##### YOUR MARKDOWN HERE

ANSWER MARKDOWN

- This defines the schema of the state that will be passed between nodes in the LangGraph.

- AgentState is a custom TypedDict used to represent the structure of data shared between graph steps.

- The key messages holds a list of messages (from user, agent, and tool) that represent the full conversation history.

- Annotated[list, add_messages] is a LangGraph utility that automatically appends new messages to the list after each node executes. This enables state accumulation, so the agent maintains memory of past steps.


In [25]:
graph_with_helpfulness_check = StateGraph(AgentState)

graph_with_helpfulness_check.add_node("agent", call_model)
graph_with_helpfulness_check.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x127b596d0>

##### YOUR MARKDOWN HERE


- We start by defining a new `StateGraph` that uses our custom `AgentState`. This graph will include the logic to control looping based on helpfulness and message count.
- This gives us a stateful agent execution graph that tracks progress through defined nodes and transitions based on conditional logic.
- We add two nodes to our `StateGraph`: the agent and action nodes.
 1. "agent": This node runs the language model (here `gpt-4o`), which decides the next step based on the current state (e.g., generate an answer or call a tool).
 2. "action": This node performs the tool call, executing any tools the agent selected (e.g., `tavily_search_results_json`, `arxiv`).



In [26]:
graph_with_helpfulness_check.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x127b596d0>

##### YOUR MARKDOWN HERE


This sets the "agent" node as the starting point of the graph.
When the graph is executed, it will begin by calling the language model node (call_model), which is responsible for:

- Interpreting the user's question
- Deciding whether to call a tool or generate an answer

By defining this entry point, we ensure the agent starts the reasoning loop before taking any actions or evaluations.

In [27]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

def tool_call_or_helpful(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  initial_query = state["messages"][0]
  final_response = state["messages"][-1]

  if len(state["messages"]) > 10:
    return "end"

  prompt_template = """\
  Given an initial query and a final response, determine if the final response is extremely helpful or not. Please indicate helpfulness with a 'Y' and unhelpfulness as an 'N'.

  Initial Query:
  {initial_query}

  Final Response:
  {final_response}"""

  prompt_template = PromptTemplate.from_template(prompt_template)

  helpfulness_check_model = ChatOpenAI(model="gpt-4")

  helpfulness_chain = prompt_template | helpfulness_check_model | StrOutputParser()

  helpfulness_response = helpfulness_chain.invoke({"initial_query" : initial_query.content, "final_response" : final_response.content})

  if "Y" in helpfulness_response:
    return "end"
  else:
    return "continue"

#### 🏗️ Activity #4:

Please write what is happening in our `tool_call_or_helpful` function!

##### YOUR MARKDOWN HERE

This function determines which path the graph should follow after the agent (LLM) responds. It's used as the routing function for the conditional edges leaving the agent node.
It returns one of three outcomes:
1) If the LLM's last message contains tool calls, it returns "action" so the action node can call the tool.
2) If the state already holds more than 10 messages, it ends the run and the last response is returned as final.
3) Otherwise, it uses an LLM-based helpfulness check:
- It asks GPT-4 to assess whether the agent's final response is extremely helpful with respect to the initial user query
- The LLM is asked to answer "Y" if the response is helpful, or "N" if it is not
- The function routes based on that verdict:
-- If the verdict contains "Y" → return "end"
-- Else → return "continue" to loop back to the agent
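
The three outcomes above can be sketched without any LLM by passing the helpfulness judge in as a function. Everything here is a simplified stand-in: messages are plain dictionaries rather than LangChain message objects, and `route_after_agent` and `always_unhelpful` are hypothetical names used only for this illustration.

```python
def route_after_agent(messages, judge, max_messages=10):
    """Return "action", "end", or "continue" for a list of message dicts."""
    last_message = messages[-1]
    # 1) Pending tool calls always go to the action node.
    if last_message.get("tool_calls"):
        return "action"
    # 2) Too many messages: stop regardless of helpfulness.
    if len(messages) > max_messages:
        return "end"
    # 3) Otherwise ask the judge to compare the first query with the last reply.
    verdict = judge(messages[0]["content"], last_message["content"])
    return "end" if "Y" in verdict else "continue"

def always_unhelpful(query, response):
    """Stub judge standing in for the LLM helpfulness check."""
    return "N"

short_history = [{"content": "What is LoRA?"}, {"content": "A fine-tuning method."}]
long_history = short_history * 6  # 12 messages, over the cap

print(route_after_agent(short_history, always_unhelpful))
print(route_after_agent(long_history, always_unhelpful))
```

Even with a judge that never approves, the message cap guarantees the second call ends the run, which is exactly the safety net the length check provides.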

In [28]:
graph_with_helpfulness_check.add_conditional_edges(
    "agent",
    tool_call_or_helpful,
    {
        "continue" : "agent",
        "action" : "action",
        "end" : END
    }
)

<langgraph.graph.state.StateGraph at 0x127b596d0>

##### YOUR MARKDOWN HERE

It sets up conditional logic for where the graph should go after the agent speaks (LLM call). It answers the question: should we call a tool, call the LLM again (i.e. try again), or stop and end?
If the function returns "continue", the graph calls the LLM again. If it returns "action", the graph invokes the tool node.
Alternatively, if the helpfulness check returns Y or the state holds more than 10 messages, it will END.

In [29]:
graph_with_helpfulness_check.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x127b596d0>

##### YOUR MARKDOWN HERE

It adds the edge to the graph between the two nodes: action (for tool use) and agent (LLM call).

This is important because, after a tool call is executed, the result (e.g., search results, document snippet) is appended to the messages in the state. This edge routes the graph back to the "agent" so it can interpret the tool output, decide whether it needs more tools, or generate a final answer.

In [30]:
agent_with_helpfulness_check = graph_with_helpfulness_check.compile()

##### YOUR MARKDOWN HERE

`.compile()` turns the graph we defined, with all its nodes, edges, and conditions, into an executable agent program.
- It validates the graph structure, converts our logic into an executable form, and prepares the agent to handle dynamic state transitions based on tool calls, helpfulness feedback, and the message-count limit.

In [31]:
inputs = {"messages" : [HumanMessage(content="Related to machine learning, what is LoRA? Also, who is Tim Dettmers? Also, what is Attention?")]}

async for chunk in agent_with_helpfulness_check.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_4ycCk81fq0KMPKFJ3OJwd4S3', 'function': {'arguments': '{"query": "LoRA machine learning"}', 'name': 'arxiv'}, 'type': 'function'}, {'id': 'call_p6vUMdT49oLuXpdKNLPd5GCU', 'function': {'arguments': '{"query": "Tim Dettmers"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_ogG4monw6vghpYltW8tWG4Qn', 'function': {'arguments': '{"query": "Attention mechanism machine learning"}', 'name': 'arxiv'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 72, 'prompt_tokens': 177, 'total_tokens': 249, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4o-2024-08-06', 'system_fingerprint': 'fp_6b6e24b474', 'finish_reason': 'tool_calls', 'logprobs': None},

### Task 4: LangGraph for the "Patterns" of GenAI

Let's ask our system about the 4 patterns of Generative AI:

1. Prompt Engineering
2. RAG
3. Fine-tuning
4. Agents

In [32]:
patterns = ["prompt engineering", "RAG", "fine-tuning", "LLM-based agents"]

In [33]:
for pattern in patterns:
  what_is_string = f"What is {pattern} and when did it break onto the scene??"
  inputs = {"messages" : [HumanMessage(content=what_is_string)]}
  messages = agent_with_helpfulness_check.invoke(inputs)
  print(messages["messages"][-1].content)
  print("\n\n")

**What is Prompt Engineering?**

Prompt engineering is the process of designing and refining input prompts to effectively guide the behavior of AI models. It involves structuring or crafting instructions to produce the best possible output from a generative AI model. This can include phrasing a query, specifying a style, providing relevant context, or describing a character for the AI to mimic. The goal is to improve the accuracy and effectiveness of the AI's responses, whether it's generating text, images, or other types of content. Prompt engineering is essential for creating better AI-powered services, minimizing biases, and getting better results from generative AI tools.

**When Did Prompt Engineering Become Popular?**

Prompt engineering gained significant attention with the release of GPT-3 in 2020, which showcased the power of prompts in guiding AI models to perform desired tasks. The technique became more prominent after the release of ChatGPT in 2022, when it was recognized a