Week 4 instructions for the Spotify Agent Project
Hi All, welcome to week 4 of the Spotify Agent project.
This week, we'll be implementing the architecture of our agent, and testing it out for the first time
We'll be using a library called MCP-use (https://github.com/mcp-use/mcp-use) to load the MCP tools from the MCP server we created earlier. Check out the official documentation if you are interested. MCP-Use requires us to create a JSON configuration file (https://www.oracle.com/database/what-is-json/) that tells the library exactly how to launch and configure the needed tools.
In your Spotify-Agent folder create a file titled mcp_config.json. Now copy and paste the following configuration.
{
  "mcpServers": {
    "spotify": {
      "command": "node",
      "args": ["PATH TO YOUR SPOTIFY MCP SERVER/build/index.js"]
    }
  }
}
Locate the path to the MCP server folder we cloned last week and substitute it for the placeholder, keeping build/index.js at the end.
This config will eventually tell our script how to launch the MCP server
Note, if you wanted to add more MCP servers to your code, you would add them here in their own section after "spotify". Everything would go under "mcpServers".
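For example, a config with a second server might look like the sketch below (the second server's name and path are just hypothetical placeholders, not something you need to add):
{
  "mcpServers": {
    "spotify": {
      "command": "node",
      "args": ["PATH TO YOUR SPOTIFY MCP SERVER/build/index.js"]
    },
    "some-other-server": {
      "command": "node",
      "args": ["PATH TO SOME OTHER MCP SERVER/build/index.js"]
    }
  }
}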
Run the following uv pip installs to get the libraries we're going to be using:
uv pip install langgraph
uv pip install langchain-groq
uv pip install mcp-use
And add the following to the imports section of your code
from langchain_groq import ChatGroq
from langgraph.graph import StateGraph, START, END
from langgraph.prebuilt import tools_condition, ToolNode
from langgraph.graph import MessagesState
import asyncio
from mcp_use.client import MCPClient
from mcp_use.adapters.langchain_adapter import LangChainAdapter
Before you start creating your graph, it is important that you are familiar with LangGraph, the library we'll be using to create our agent. Read through this documentation and try to get a good understanding of how everything works together.
https://docs.langchain.com/oss/python/langgraph/quickstart
Now create a new function in your agent_script file with the signature
async def create_graph():
If you recall from a few weeks ago, we use asynchronous programming when we want parts of our code to run while waiting for other parts to finish. This is useful for our use case because our agent will be communicating with external services (such as Spotify) while needing to do other things in the background.
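If you want a quick refresher, here is a tiny standalone example (not part of the agent) showing how await lets other work happen while one coroutine is waiting:
import asyncio

async def slow_task(name, seconds):
    # Simulate waiting on an external service (like a Spotify API call)
    await asyncio.sleep(seconds)
    print(f"{name} finished after {seconds}s")

async def demo():
    # Both tasks run concurrently, so this takes ~2 seconds instead of ~3
    await asyncio.gather(slow_task("task A", 2), slow_task("task B", 1))

asyncio.run(demo())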
Now let's load in our MCP tools. Copy and paste the following code into your create_graph function:
#create client
client = MCPClient.from_config_file("mcp_config.json")
#create adapter instance
adapter = LangChainAdapter()
#load in tools from the MCP client
tools = await adapter.create_tools(client)
The first line sets up the MCP client from our config file, the second creates an adapter object that connects LangChain/LangGraph to the MCP server, and the third loads the tools from the MCP server as tools that are usable in LangChain/LangGraph.
Now copy and paste the following line
tools = [t for t in tools if t.name not in ['getNowPlaying', 'getRecentlyPlayed', 'getQueue', 'playMusic', 'pausePlayback', 'skipToNext', 'skipToPrevious', 'resumePlayback', 'addToQueue', 'getMyPlaylists', 'getUsersSavedTracks', 'saveOrRemoveAlbum', 'checkUsersSavedAlbums']]
There's a lot of tools we get from the MCP server that are not relevant to the simplest possible version of this agent, so to save input tokens we'll remove them for now (more on that later).
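If you want to sanity-check what's left after filtering, you can print the remaining tool names (just a quick debugging snippet, not required for the agent):
# Optional: inspect which tools the agent will actually see
for t in tools:
    print(t.name)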
Next, we'll load in our LLM
#define llm
llm = ChatGroq(model='meta-llama/llama-4-scout-17b-16e-instruct')
#bind tools
llm_with_tools = llm.bind_tools(tools, parallel_tool_calls=False)
Here, we're loading in a model from Groq (llama-4-scout specifically because we get a lot of free usage out of it), then we're letting our model know what tools are available to it with the bind_tools function.
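Note that ChatGroq reads your Groq API key from the GROQ_API_KEY environment variable. If you set yours up differently in an earlier week that's fine; this is just an optional sanity check you could add before creating the model:
import os

# ChatGroq looks for GROQ_API_KEY in the environment
if not os.environ.get("GROQ_API_KEY"):
    raise RuntimeError("GROQ_API_KEY is not set - add it to your environment before running the agent")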
Now, it's time to add a system prompt. If you are not already familiar, a system prompt is a piece of text given to the model with every request to let it know its core purpose, objectives, and guidelines. Every chatbot you interact with actually has a system prompt in the background. Here is a list of leaked system prompts from different chatbots if you are curious: https://github.com/asgeirtj/system_prompts_leaks
Ours will not be as long, feel free to customize yours, but mine is:
system_msg = """You are a helpful assistant that has access to Spotify. You can create playlists, find songs, and provide music recommendations.
When creating playlists:
- If the user does not specify playlist size, limit playlist lengths to only 10 songs
- Always provide helpful music recommendations based on user preferences and create well-curated playlists with appropriate descriptions
- When the User requests a playlist to be created, ensure that there are actually songs added to the playlist you create
"""
We can finally begin building out the graph that defines our agent's behavior. Add the following function to your code
#define assistant
def assistant(state: MessagesState):
    return {"messages": [llm_with_tools.invoke([system_msg] + state["messages"])]}
Here, we are creating a special kind of node called an assistant node. The job of the assistant node is to receive input from the user, call the appropriate tools needed to answer the user's instructions, receive the output from those tools, and decide whether the user's instructions have been met or if more tool calls are needed. The assistant is essentially the brains of our operation; it's the one making all of the decisions.
Notice that the function receives an input called 'state' of type 'MessagesState'. This means the assistant has the full history of the conversation between itself and the user, as well as the history of all tool calls made so far. All of this information, alongside the system prompt, is passed to the llm to generate its next output.
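To make that concrete, here is a rough illustration (not something to add to your script; the graph manages state for you) of the shape of the data flowing through the assistant node. The message contents are hypothetical:
from langchain_core.messages import HumanMessage

# A MessagesState is just a dict with a "messages" list
state = {"messages": [HumanMessage(content="Make me a short workout playlist")]}
result = assistant(state)
# result is {"messages": [AIMessage(...)]} - the AIMessage may contain tool calls
# (for example, a search tool call) or a plain text reply if no tools are needed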
Finally, we are ready to create our graph
# Graph
builder = StateGraph(MessagesState)
# Define nodes: these do the work
builder.add_node("assistant", assistant)
builder.add_node("tools", ToolNode(tools))
In these first few lines, we are initializing a graph builder that uses MessagesState as its state schema (the message history) and adding two nodes: one for the assistant and one wrapping all of the tools.
# Define edges: these determine the control flow
builder.add_edge(START, "assistant")
builder.add_conditional_edges(
"assistant",
tools_condition,
)
builder.add_edge("tools", "assistant")
In the next few lines, we define our edges, which connect our nodes together. We first add an edge between START and assistant, which tells our graph to send user input to the assistant node first. Then we add a conditional edge between our assistant node and our tools node using the convenient prebuilt tools_condition function from LangGraph. This automatically creates two edges that execute conditionally: if the assistant decides that the user's request has not been met, it will continue to call tools; if it decides the user's request has been met, it will return the output to the user. Finally, we add an edge between our tools node and our assistant, so the output of the tools node (i.e. the results of individual tool calls) makes it back to the assistant.
graph = builder.compile()
return graph
Finally, we run the .compile() function on the builder to create our agent and return that agent as output so we can use it later on.
If you are curious, you can find an image of the graph we generated here: https://github.com/harshil0217/Spotify_Agent_Images
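If you would rather generate the picture yourself, LangGraph's compiled graphs can render a Mermaid diagram. Something like the sketch below should work once you have the compiled graph in hand (by default it may call out to an external rendering service, so it needs internet access):
# Optional: save an image of the compiled graph
png_bytes = graph.get_graph().draw_mermaid_png()
with open("agent_graph.png", "wb") as f:
    f.write(png_bytes)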
First, modify your main function: change the definition from
def main()
to
async def main()
This is because we are going to call our create_graph function inside of our main function, which if you recall, is asynchronous, so we need to update main to reflect that.
In your now updated main function, copy and paste the following code after you kill the existing port processes on port 8090
agent = await create_graph()
while True:
    final_text = ""
    message = input("User: ")
Notice that we are using the "await" keyword when calling our create_graph function. Whenever you want the output of an async function, you need to let Python know it has to wait for that function to finish executing before using the result. So we add the "await" keyword to tell Python to wait for create_graph to finish building the agent before we actually start using it.
I'll leave creating the rest of the main function as a challenge to you, though I would recommend referencing the following documentation
https://docs.langchain.com/oss/python/langchain/models#invocation
Also pay attention to the blurb in the documentation about an "asynchronous equivalent"; this is VERY important.
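Try to work it out yourself first, but if you get stuck, a minimal sketch of the inside of the loop might look something like this (assuming you just want to print the agent's final reply each turn):
# One possible sketch - ainvoke is the asynchronous equivalent of invoke
result = await agent.ainvoke({"messages": [("user", message)]})
final_text = result["messages"][-1].content
print("Agent:", final_text)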
Once you've finished your main function, make the following modification to where you are executing your main function.
if __name__ == "__main__":
    # Run the main function in an event loop
    asyncio.run(main())
As you can see, the call to main is wrapped in asyncio.run(), a function from Python's built-in asyncio package that starts an event loop and runs our asynchronous main function to completion.
Before you use your agent, make sure to update the access token in your MCP server code. Simply go to the folder you git cloned last week (called spotify-mcp-server) and run npm run auth. Note: if it's been a few hours since you last used your agent, you will have to run this command again, since the access token expires after a defined length of time. You could write some code to automate this process, but I'll let you figure out how to do that yourself.
You should be able to use your agent now, just run the script like a normal .py file, and you should be able to interact with the agent in your code editor's terminal.
In case you run into any issues with Groq, read the section below.
When you use an llm hosted on the cloud (on a service such as Groq for example) you are charged based on the number of input tokens you give to the model, as well as the number of output tokens the model generates. A "token", if you are not already familiar with the term, is just a piece of a word.
We are using Groq because it offers a free usage tier for many of its models, and llama-4-scout specifically because its free usage tier is a bit higher than the other models on the Groq platform.
Unfortunately, this free tier is not especially generous for our purposes, and it is very possible you will max out the defined usage limit. You can check your token usage in a dashboard on the Groq website: just sign in to Groq and go to https://console.groq.com/dashboard/metrics
You should be able to see your recent usage, whether you have exceeded the limit for input or output tokens, and the corresponding error code. Note that the input token usage is so high because we have to pass all of the instructions for how to use each tool (which is a pretty big JSON blob), as well as the history of all messages and tool calls, on every request. To keep usage down, keep playlists small (3-5 songs) and end conversations before they get too long. We'll make some optimizations next week that should help, but keep these guidelines in mind for now.
Thanks again for joining us, see you soon!