Python SDK Integration Tutorial
Use the Python openai SDK to access ChinaWHAPI, covering synchronous calls, streaming, Function Calling, LangChain, and RAG.
Installing Dependencies
ChinaWHAPI is fully compatible with the OpenAI Python SDK; just install the openai package.
pip install openai tiktoken
Basic Usage
Point base_url at the ChinaWHAPI endpoint; everything else works exactly as with OpenAI.
from openai import OpenAI
client = OpenAI(
    api_key="your_chinawhapi_key",
    base_url="https://chinawhapi.com/v1",
)

# Synchronous call
response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "Explain what a microservices architecture is"}],
)
print(response.choices[0].message.content)

Streaming Output
Set stream=True for a typewriter effect, well suited to chat UIs and real-time generation.
stream = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "Write a Python FastAPI example"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Function Calling / Tool Use
Declare a tools array and the model will selectively call the declared tools, enabling agent workflows.
response = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[{"role": "user", "content": "What's the temperature in Beijing right now?"}],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the weather for a given city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string", "description": "city name"}},
                    "required": ["city"],
                },
            },
        }
    ],
)
print(response.choices[0].message.tool_calls)
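The tool_calls printed above are only a request from the model: your code has to run the function and send the result back in a follow-up call before the model can answer. A minimal sketch of that dispatch step, with a hypothetical local get_weather implementation (the second API call itself is left out):

```python
import json

# Hypothetical local implementation of the get_weather tool declared above.
def get_weather(city: str) -> str:
    return f"{city}: 3°C, clear"  # stand-in for a real weather lookup

def run_tool_calls(tool_calls, messages):
    """Execute each tool the model requested and append the results as
    'tool' messages. In a real loop you would first append the assistant
    message that carried tool_calls, then send the extended messages list
    back via client.chat.completions.create for the final answer."""
    for call in tool_calls:
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        result = get_weather(**args)                # only one tool in this example
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": result,
        })
    return messages
```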
LangChain Integration
In LangChain, point ChatOpenAI at the ChinaWHAPI endpoint via openai_api_base to use any ChinaWHAPI model.
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    model="qwen3.6-plus",
    openai_api_key="your_chinawhapi_key",
    openai_api_base="https://chinawhapi.com/v1",
    temperature=0.7,
)

response = llm.invoke([HumanMessage(content="Write quicksort in Python")])
print(response.content)

LlamaIndex / RAG Integration
LlamaIndex connects to ChinaWHAPI through its OpenAILike interface, a good fit for building knowledge-base Q&A systems.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="qwen3.6-plus",
    api_key="your_chinawhapi_key",
    api_base="https://chinawhapi.com/v1",
    is_chat_model=True,
)

documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(llm=llm)
response = query_engine.query("Summarize the key points of this document")
print(response)

Token Counting
Use tiktoken to estimate token counts for cost estimation and prompt optimization. Note that tiktoken ships OpenAI's tokenizers, so counts for non-OpenAI models such as Qwen are approximations rather than exact figures.
import tiktoken
enc = tiktoken.encoding_for_model("gpt-4o")
text = "这是一段中文文本,大约几十个字"
tokens = enc.encode(text)
print(f"Token count: {len(tokens)}")
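Token counts translate directly into spend. A minimal cost-estimate sketch; the price of ¥0.004 per 1K input tokens is made up for illustration, so check ChinaWHAPI's actual pricing:

```python
def estimate_cost(n_tokens: int, price_per_1k: float = 0.004) -> float:
    """Estimated cost in yuan for n_tokens at price_per_1k per 1,000 tokens.
    The default price is a placeholder, not a real ChinaWHAPI rate."""
    return n_tokens / 1000 * price_per_1k

print(f"{estimate_cost(12500):.4f} yuan")  # 12,500 tokens -> 0.0500 yuan
```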
Error Handling and Retries
Implement retries with exponential backoff to keep production deployments stable.
import time

from openai import APIStatusError, RateLimitError

def call_with_retry(client, model, messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except RateLimitError:
            time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
        except APIStatusError as e:
            if e.status_code >= 500:
                time.sleep(2 ** attempt)  # retry 5xx server errors with backoff
                continue
            raise  # 4xx client errors are not retryable
    raise RuntimeError("Max retries exceeded")
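In production you may also want jitter, so that many clients hitting the same rate limit don't all retry in lockstep. A minimal sketch of a "full jitter" backoff schedule (pure arithmetic, no API calls; function name and defaults are illustrative):

```python
import random

def backoff_delays(max_retries=3, base=2.0, cap=30.0):
    """Exponential backoff with full jitter: each delay is drawn
    uniformly from [0, min(cap, base ** attempt)]."""
    return [random.uniform(0, min(cap, base ** attempt)) for attempt in range(max_retries)]

print(backoff_delays())  # three random delays, bounded by 1s, 2s, 4s
```

Replacing the fixed `time.sleep(2 ** attempt)` above with a sleep on these jittered delays spreads retries out over time instead of synchronizing them.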