亚马逊云科技免费托管服务快速构建审批MCP工作流（详细指南）

企业在部署Agent服务时一般需要整合多种工具，从而实现函数的调用。但是这些被集成为Agent本身内部的API调用，从而导致企业级应用场景下的扩展性难题以及工具复用困难。本文将通过亚马逊云科技免费托管服务+MCP快速构建一个通过专业角色和自动化工作流的应用程序。

亚马逊云科技免费托管服务全称为“Amazon SageMaker AI”，目前提供2个月免费试用套餐，能够提供托管LLM的能力，让运维人员减轻扩展性问题或管理繁琐的基础工作。

亚马逊云科技官网：点击直达（注册即可试用SageMaker AI和百余款云服务）

参考教程：《亚马逊云科技账号注册流程图解》

接下来将介绍可扩展部署MCP Servers与MCP Clients的参考架构，该架构采用Amazon SageMaker AI作为基础模型（FMs）和LLM的托管环境。尽管此架构以Amazon SageMaker AI作为推理核心，但它也可快速调整以支持Amazon Bedrock模型。该解决方案架构如下图所示。

一、MCP是什么

近期有关“MCP”的关键词在业界内掀起了一波“热潮”，那么MCP到底是什么呢？MCP是由Anthropic公司开发的一项开源协议，主要将AI模型连接到大部分数据源或工具的标准化方式。MCP采用client-server架构（如下图所示），主要帮助开发者通过轻量级的MCP Servers来公开其数据，同时构建作为MCP Clients的AI应用程序，且客户端可连接到服务器。

传统API通常将各种功能捆绑在一起，导致需要扩展时必须升级整个系统，且更新时面临系统全面故障的高风险，而且针对不同应用程序管理不同版本的API也变得异常复杂和繁琐。尽管微服务提供了更高的模块化程度，但它们通常要求对每个服务单独进行复杂集成，造成繁琐的管理开销。

MCP的标准化client-server架构，专为高效且安全的集成而设计，克服了上述限制。它提供了一个实时的双向通信接口，使得AI系统能够依据“一次编写，随处使用”的理念，无缝连接各种外部工具、API服务以及数据源。

二、亚马逊云科技免费托管服务集成FastMCP

架构确定后，对下图所示的应用程序流程进行分析。

就使用模式而言，MCP与工具调用在逻辑上存在相似之处，但它首先增加了用于发现可用工具的步骤。

1、客户端连接到MCP Server，并获取可用工具的列表。

2、客户端使用根据MCP Server上可用工具列表设计的提示词（类型为“用户”的消息），来调用LLM。

3、LLM根据需要判断应调用的工具以及调用次数，并作出回复（类型为“助手”的消息）。

4、客户端要求MCP Server执行工具调用，并将结果提供给LLM（类型为“用户”的消息）。

5、此循环会不断迭代，直至得出最终答案，并可将答案返回给用户。

6、客户端断开与MCP Server的连接。

要创建MCP Server，需使用官方的Model Context Protocol Python SDK。以创建仅包含一个工具的简单服务器为例，该工具将模拟在广播电台搜索播放次数最多的热门歌曲，并以Python字典的形式返回结果。请务必添加适当的文档字符串以及输入与输出类型注解，以便服务器和客户端均能正确发现并使用该资源。

from mcp.server.fastmcp import FastMCP

# instantiate an MCP server client
mcp = FastMCP(“Radio Station Server”)

# DEFINE TOOLS
@mcp.tool()
def top_song(sign: str) -> dict:
“””Get the most popular song played on a radio station”””
# In this example, we simulate the return
# but you should replace this with your business logic
return {
“song”: “In the end”,
“author”: “Linkin Park”
}

@mcp.tool()
def …

if __name__ == “__main__”:
# Start the MCP server using stdio/SSE transport
mcp.run(transport=”sse”)

MCP Servers可以在亚马逊云科技的计算服务上运行，例如Amazon EC2、Amazon EKS或Amazon Lambda，随后MCP Servers可用于安全地访问亚马逊云科技云中的其他资源，例如VPC中的数据库或企业API以及外部资源。例如部署MCP Servers的一种简单方法是利用Amazon Lambda对Docker镜像的支持，在Lambda函数或Amazon Fargate上安装MCP依赖项。

服务器设置完成后，将重点转移至MCP Client，通信始于MCP Client使用可流式传输的HTTP，连接到MCP Server。

from mcp import ClientSession
from mcp.client.sse import sse_client

async def connect_to_sse_server(self, server_url: str):
“””Connect to an MCP server running with SSE transport”””
# Store the context managers so they stay alive
self._streams_context = sse_client(url=server_url)
streams = await self._streams_context.__aenter__()

self._session_context = ClientSession(*streams)
self.session: ClientSession = await self._session_context.__aenter__()

# Initialize
await self.session.initialize()

# List available tools to verify connection
print(“Initialized SSE client…”)
print(“Listing tools…”)
response = await self.session.list_tools()
tools = response.tools
print(“\nConnected to server with tools:”, [tool.name for tool in tools])

连接到MCP Server时，可使用list_tools() API向服务器请求可用工具的列表。获取工具列表及其描述后，可为调用工具定义系统提示词。

system_message = (
“You are a helpful assistant with access to these tools:\n\n”
f”{tools_description}\n”
“Choose the appropriate tool based on the user’s question. ”
“If no tool is needed, reply directly.\n\n”
“IMPORTANT: When you need to use a tool, you must ONLY respond with ”
“the exact JSON object format below, nothing else:\n”
“{\n”
‘ “tool”: “tool-name”,\n’
‘ “arguments”: {\n’
‘ “argument-name”: “value”\n’
” }\n”
“}\n\n”
“After receiving a tool’s response:\n”
“1. Transform the raw data into a natural, conversational response\n”
“2. Keep responses concise but informative\n”
“3. Focus on the most relevant information\n”
“4. Use appropriate context from the user’s question\n”
“5. Avoid simply repeating the raw data\n\n”
“Please use only the tools that are explicitly defined above.”
)

工具通常使用类似以下示例的.json模式（schema）进行定义，该工具名为top_song，功能为获取广播电台播放次数最多的热门歌曲。

{
“name”: “top_song”,
“description”: “Get the most popular song played on a radio station.”,
“parameters”: {
“type”: “object”,
“properties”: {
“sign”: {
“type”: “string”,
“description”: “The call sign for the radio station for which you want the most popular song. Example calls signs are WZPZ and WKRP.”
}
},
“required”: [“sign”]
}
}

系统提示词配置完成后，可根据需要多次运行聊天循环。在该过程中，需要在调用托管的LLM，与调用由MCP Server所支持的工具之间不断切换。可使用诸如Amazon SageMaker Boto3、Amazon SageMaker Python SDK，或者诸如LiteLLM及类似库等其他第三方库，来实现这一功能。

messages = [
{“role”: “system”, “content”: system_message},
{“role”: “user”, “content”: “What is the most played song on WZPZ?”}
]

result = sagemaker_client.invoke_endpoint(…)
tool_name, tool_args = parse_tools_from_llm_response(result)
# Identify if there is a tool call in the message received from the LLM
result = await self.session.call_tool(tool_name, tool_args)
# Parse the output from the tool called, then invoke the endpoint again
result = sagemaker_client.invoke_endpoint(…)

托管在Amazon SageMaker上的模型，在其API中并不原生支持函数调用功能，因此需要使用正则表达式或类似方法来解析响应内容。

import re, json

def parse_tools_from_llm_response(message: str)->dict:
match = re.search(r'(?s)\{(?:[^{}]|(?:\{[^{}]*\}))*\}’, content)
content = json.loads(match.group(0))
tool_name = content[“tool”]
tool_arguments = content[“arguments”]
return tool_name, tool_arguments

LLM的响应中不再包含任何待处理的工具调用请求时，即可将响应内容视为最终答案，并将其返回给用户。最后记得关闭数据流，以结束与MCP Server的交互流程。