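The team introduced OpenFaaS with good intentions: reduce the complexity of backend development and let developers focus on business logic. In practice, we soon ran into a different problem. Developers no longer wrote Dockerfiles, but they were now writing, debugging, and maintaining increasingly complex stack.yml files. The CI/CD pipeline also became fragile: it was full of shell scripts wrapping faas-cli, and any small environment difference could break a deployment. New team members faced a steep learning curve, and this "toolchain friction" cancelled out most of the agility that serverless was supposed to bring.
The pain point was clear: we needed an internal developer platform (IDP) that hides the underlying complexity. The goal was not to replace faas-cli, but to offer a one-click graphical release path for the 80% of common cases. A developer should be able to submit function code through a web interface and let the platform handle testing, packaging, release, and verification. The idea became a quarter-long technical project for our team.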
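Initial design and technology choices
The core of this internal platform is an automated workflow. We started from a behavior-driven development (BDD) approach and defined the core feature as a user story: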
Feature: Self-Service Function Deployment
  Scenario: A developer successfully deploys a new Node.js function
    Given the developer is on the "New Function" page
    And they have selected the "node18-express" template
    When they provide a valid function name "user-profile-service"
    And they paste their function's source code into the editor
    And they click the "Deploy" button
    Then the system should initiate the deployment process
    And after a short period, they should see a "Deployment Successful" notification
    And the function "user-profile-service" should be available and responsive at its gateway endpoint
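This Gherkin scenario became the north star for our architecture. It describes the frontend interaction, the backend processing, and the final state verification, and it directly shaped our technology choices:
- Frontend UI: Shadcn UI. We did not need a heavyweight UI library that ships its own design system. We needed base components for building a clean, professional interface quickly, with full control over those components. Shadcn UI's model fits that requirement: it is not a dependency but a set of components you copy into your own codebase. This avoids the trouble of component-library upgrades and makes it easy to customize components to our internal design guidelines. In real projects, that maintainability and control matter far more than out-of-the-box themes.
- Backend workflow orchestration: an OpenFaaS function. Using one OpenFaaS function to deploy another sounds recursive, but it is a pragmatic choice. The IDP backend does not need to run around the clock; it is essentially an event-triggered process. Implementing it as a serverless function (which we call deployer-function) reuses the existing OpenFaaS infrastructure and avoids the cost of maintaining another always-on service. deployer-function acts as the platform-facing API: it receives requests from the frontend and completes the deployment in the background, either by running faas-cli commands or by calling the OpenFaaS API directly.
- Test framework: Vitest. The project spans frontend and backend, so we needed one tool that handles both React component tests and Node.js backend logic tests. Vitest stood out for its speed and its tight integration with the Vite ecosystem. More importantly, we can use it to write scenario-level tests that directly verify the BDD scenario defined above (against a mocked backend in this version). Covering unit, integration, and scenario tests with a single framework simplifies our development workflow considerably.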
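Step-by-step implementation: the full path from frontend to backend
The project is a pnpm monorepo with two main parts: portal (the React frontend application) and functions (the OpenFaaS functions).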
/faas-idp
├── apps/
│   └── portal/            # React + Shadcn UI frontend
│       ├── src/
│       ├── package.json
│       └── ...
├── functions/
│   ├── deployer/          # deployer function
│   │   ├── handler.js
│   │   ├── package.json
│   │   └── Dockerfile
│   └── stack.yml          # OpenFaaS function definitions
├── package.json
└── pnpm-workspace.yaml
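1. Frontend: building the deployment form
We first scaffold the portal application with create-vite, then use the shadcn-ui CLI to add the components we need: Card, Input, Textarea, Button, Label, Select, and Toast.
The core component is DeploymentForm.tsx, which collects the user's input.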
// apps/portal/src/components/DeploymentForm.tsx
import { useState, type FormEvent } from "react";
import { Button } from "@/components/ui/button";
import { Card, CardContent, CardDescription, CardFooter, CardHeader, CardTitle } from "@/components/ui/card";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { Textarea } from "@/components/ui/textarea";
import { useToast } from "@/components/ui/use-toast";

// In a real project this address should be configurable
const DEPLOYER_FUNCTION_URL = "http://<your-openfaas-gateway>/function/deployer";

interface DeploymentPayload {
  functionName: string;
  template: string;
  handlerCode: string;
}

export function DeploymentForm() {
  const [functionName, setFunctionName] = useState("");
  const [handlerCode, setHandlerCode] = useState(`module.exports = async (event, context) => {\n  return context\n    .status(200)\n    .succeed("Hello from your new function!");\n}`);
  const [isLoading, setIsLoading] = useState(false);
  const { toast } = useToast();

  const handleSubmit = async (e: FormEvent) => {
    e.preventDefault();
    if (!functionName || !handlerCode) {
      toast({
        title: "Validation Error",
        description: "Function name and code cannot be empty.",
        variant: "destructive",
      });
      return;
    }

    setIsLoading(true);
    const payload: DeploymentPayload = {
      functionName,
      template: "node18-express", // simplified; this should come from a Select component
      handlerCode,
    };

    try {
      const response = await fetch(DEPLOYER_FUNCTION_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(payload),
      });

      if (!response.ok) {
        // Try to parse the error returned by the backend; tolerate a non-JSON body
        const errorData = await response.json().catch(() => ({}));
        // Prefer the specific `details` (e.g. faas-cli stderr) over the generic message
        throw new Error(
          errorData.details || errorData.message || `Deployment failed with status: ${response.status}`
        );
      }

      const result = await response.json();
      toast({
        title: "Deployment Successful",
        description: `Function ${result.functionName} is now available at ${result.url}`,
      });
      setFunctionName(""); // reset the form for the next deployment
    } catch (error) {
      // Surface the failure reason to the user instead of swallowing it
      const errorMessage = error instanceof Error ? error.message : "An unknown error occurred.";
      console.error("Deployment failed:", errorMessage);
      toast({
        title: "Deployment Failed",
        description: errorMessage,
        variant: "destructive",
      });
    } finally {
      setIsLoading(false);
    }
  };

  return (
    <Card className="w-[650px]">
      <CardHeader>
        <CardTitle>Deploy a New Function</CardTitle>
        <CardDescription>Provide details for your new serverless function.</CardDescription>
      </CardHeader>
      <form onSubmit={handleSubmit}>
        <CardContent>
          <div className="grid w-full items-center gap-4">
            <div className="flex flex-col space-y-1.5">
              <Label htmlFor="name">Function Name</Label>
              <Input
                id="name"
                placeholder="e.g., user-profile-service"
                value={functionName}
                onChange={(e) => setFunctionName(e.target.value.toLowerCase().replace(/\s/g, '-'))}
                disabled={isLoading}
              />
            </div>
            <div className="flex flex-col space-y-1.5">
              <Label htmlFor="code">Handler Code (handler.js)</Label>
              <Textarea
                id="code"
                placeholder="Paste your Node.js function code here"
                className="font-mono h-[300px]"
                value={handlerCode}
                onChange={(e) => setHandlerCode(e.target.value)}
                disabled={isLoading}
              />
            </div>
          </div>
        </CardContent>
        <CardFooter className="flex justify-end">
          <Button type="submit" disabled={isLoading}>
            {isLoading ? "Deploying..." : "Deploy"}
          </Button>
        </CardFooter>
      </form>
    </Card>
  );
}
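The error handling and loading-state management in this code are essential for a production-grade UI: users need to know what state the system is in and why an operation failed.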
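The form rewrites the name as the user types, while the deployer function rejects any name that fails its regular expression. One way to keep the two in sync is a small shared helper. The sketch below is illustrative only: `normalizeFunctionName` and `isValidFunctionName` are hypothetical names, and collapsing runs of whitespace into a single hyphen is an assumption that differs slightly from the per-character replacement in the form. The pattern itself is copied from handler.js.

```javascript
// Hypothetical helpers; the regex mirrors the one used in functions/deployer/handler.js.
const FUNCTION_NAME_PATTERN = /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/;

// Lowercase the input and turn each run of whitespace into one hyphen.
function normalizeFunctionName(raw) {
  return raw.trim().toLowerCase().replace(/\s+/g, '-');
}

// True when the name would pass the deployer's server-side check.
function isValidFunctionName(name) {
  return FUNCTION_NAME_PATTERN.test(name);
}

console.log(normalizeFunctionName('User Profile Service')); // user-profile-service
console.log(isValidFunctionName('user-profile-service'));   // true
console.log(isValidFunctionName('-bad-name'));               // false
```

Because `trim()` would swallow a trailing space while the user is still typing, a helper like this is probably better applied on submit than on every keystroke.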
2. deployer-function: core deployment logic
This function is the heart of the platform. It needs sufficient permissions to run faas-cli commands, so we built a custom Dockerfile for it that bundles faas-cli.
functions/deployer/handler.js:
// functions/deployer/handler.js
'use strict'

const { exec } = require('child_process');
const fs = require('fs').promises;
const path = require('path');
const os = require('os');

// Security note: in production, pass these values via environment variables or OpenFaaS secrets
const OPENFAAS_GATEWAY = process.env.OPENFAAS_GATEWAY || 'http://gateway.openfaas:8080';
const TEMP_DIR_BASE = os.tmpdir();

/**
 * Helper that runs a shell command asynchronously.
 * @param {string} command - The command to execute
 * @param {import('child_process').ExecOptions} [options] - Options forwarded to exec, e.g. { cwd }
 * @returns {Promise<{stdout: string, stderr: string}>}
 */
const execAsync = (command, options = {}) => {
  return new Promise((resolve, reject) => {
    // Forward options; without them, `cwd` would be silently ignored
    exec(command, options, (error, stdout, stderr) => {
      if (error) {
        console.error(`Exec error for command "${command}": ${stderr}`);
        // Include stderr in the error message; it is essential for debugging
        reject(new Error(`Command failed: ${stderr || error.message}`));
        return;
      }
      resolve({ stdout, stderr });
    });
  });
};

module.exports = async (event, context) => {
  let tempDir = '';
  try {
    const { functionName, template, handlerCode } = event.body || {};

    // 1. Validate input
    if (!functionName || !/^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/.test(functionName)) {
      return context.status(400).fail({ message: 'Invalid function name. Use lowercase letters, digits, and hyphens; it must start and end with a letter or digit.' });
    }
    if (!template || !handlerCode) {
      return context.status(400).fail({ message: 'Template and handler code are required.' });
    }
    // The template name is interpolated into shell commands below, so restrict its characters too
    if (!/^[a-z0-9][a-z0-9-]*$/.test(template)) {
      return context.status(400).fail({ message: 'Invalid template name.' });
    }

    // 2. Create a unique temporary working directory
    tempDir = await fs.mkdtemp(path.join(TEMP_DIR_BASE, `faas-deploy-${functionName}-`));
    console.log(`Created temporary directory: ${tempDir}`);

    // 3. Use faas-cli to create the function skeleton.
    //    Both commands run inside tempDir so the pulled template is visible to `faas-cli new`.
    console.log(`Pulling template: ${template}`);
    await execAsync(`faas-cli template store pull ${template}`, { cwd: tempDir });
    console.log(`Creating new function skeleton for: ${functionName}`);
    // --gateway records which OpenFaaS instance the generated stack file targets
    await execAsync(`faas-cli new ${functionName} --lang ${template} --gateway ${OPENFAAS_GATEWAY}`, { cwd: tempDir });
    const functionDirPath = path.join(tempDir, functionName);

    // 4. Write the user's code into the handler file
    const handlerFilePath = path.join(functionDirPath, 'handler.js');
    await fs.writeFile(handlerFilePath, handlerCode);
    console.log(`Wrote user code to ${handlerFilePath}`);

    // 5. Deploy the function.
    //    The stack file was generated by `faas-cli new` inside the temporary directory.
    //    Note: `faas-cli deploy` only deploys an existing image; the build and push
    //    steps are not shown in this listing.
    const stackFilePath = path.join(tempDir, `${functionName}.yml`);
    await execAsync(`faas-cli deploy -f ${stackFilePath} --gateway ${OPENFAAS_GATEWAY}`, { cwd: tempDir });
    console.log(`Deployment command issued for ${functionName}`);

    // 6. Return a success response
    const functionUrl = `${OPENFAAS_GATEWAY}/function/${functionName}`;
    return context
      .status(200)
      .succeed({
        message: 'Deployment initiated successfully.',
        functionName: functionName,
        url: functionUrl
      });
  } catch (err) {
    // Catch and log every failure in the deployment process
    console.error('Deployment process failed:', err.message);
    // Return a meaningful error message to the frontend
    return context
      .status(500)
      .fail({
        message: 'Internal deployment error.',
        details: err.message
      });
  } finally {
    // 7. Clean up temporary files whether or not the deployment succeeded
    if (tempDir) {
      try {
        await fs.rm(tempDir, { recursive: true, force: true });
        console.log(`Cleaned up temporary directory: ${tempDir}`);
      } catch (cleanupErr) {
        console.error(`Failed to cleanup temp directory ${tempDir}:`, cleanupErr);
      }
    }
  }
}
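What makes this handler.js robust:
- Detailed logging: every step writes a log line, which makes troubleshooting easier.
- Strict input validation: invalid function names are rejected before they reach any command, which guards against injection through the name.
- Temporary file management: all work happens in a per-request temporary directory, which avoids conflicts between concurrent requests, and the finally block guarantees cleanup so the disk does not fill up.
- Clear error propagation: the stderr output of exec is captured and returned to the frontend instead of a vague "server error".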
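Input validation matters here because user-supplied values end up on a command line. A complementary safeguard is to avoid the shell altogether: Node's `child_process.execFile` passes each argument to the process as-is, so shell metacharacters in user input are never interpreted. The sketch below is a variation on the handler, not its actual implementation; `buildNewArgs`, `runFaasCli`, and the `ALLOWED_TEMPLATES` allow-list are hypothetical names introduced for illustration.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Hypothetical allow-list; in practice it would hold the templates the platform supports.
const ALLOWED_TEMPLATES = new Set(['node18-express']);
const FUNCTION_NAME_PATTERN = /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/;

// Build the argument vector for `faas-cli new` after validating untrusted input.
function buildNewArgs({ functionName, template, gateway }) {
  if (!FUNCTION_NAME_PATTERN.test(functionName)) {
    throw new Error(`Invalid function name: ${functionName}`);
  }
  if (!ALLOWED_TEMPLATES.has(template)) {
    throw new Error(`Unsupported template: ${template}`);
  }
  return ['new', functionName, '--lang', template, '--gateway', gateway];
}

// Run faas-cli without a shell: arguments are passed as an array, so
// characters such as ';' or '$(...)' are never interpreted by a shell.
async function runFaasCli(args, cwd) {
  const { stdout, stderr } = await execFileAsync('faas-cli', args, { cwd });
  return { stdout, stderr };
}

// Usage inside the handler (assumes faas-cli is on PATH in the function image):
//   await runFaasCli(buildNewArgs({ functionName, template, gateway }), tempDir);
```

Keeping argument construction in a pure function also makes the validation rules easy to unit test without having faas-cli installed.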
3. Implementing the BDD tests: verifying the full flow with Vitest
This is the step that ties everything together. We create an end-to-end style test file in the portal application that simulates user behavior and verifies how the system responds.
We use msw (Mock Service Worker) to intercept network requests to deployer-function, so the tests do not need a running OpenFaaS environment, which makes them faster and more reliable.
apps/portal/src/e2e/deployment.spec.tsx (the file contains JSX, so it needs the .tsx extension):
// apps/portal/src/e2e/deployment.spec.tsx
import { describe, it, expect, beforeAll, afterAll, afterEach } from 'vitest';
import { render, screen, waitFor } from '@testing-library/react';
import { userEvent } from '@testing-library/user-event';
import { App } from '@/App'; // assumes the App component renders DeploymentForm
import { rest } from 'msw'; // msw v1 API
import { setupServer } from 'msw/node';
// Matchers such as toBeInTheDocument come from @testing-library/jest-dom,
// assumed here to be registered in the Vitest setup file.

// Local mock address; it must match the URL that DeploymentForm posts to
const DEPLOYER_FUNCTION_URL = "http://localhost/function/deployer";

// Mock a successful response
const successResponse = rest.post(DEPLOYER_FUNCTION_URL, (req, res, ctx) => {
  return res(
    ctx.status(200),
    ctx.json({
      message: 'Deployment initiated successfully.',
      functionName: 'test-function',
      url: `${DEPLOYER_FUNCTION_URL}/test-function`
    })
  );
});

// Mock a failed response
const errorResponse = rest.post(DEPLOYER_FUNCTION_URL, (req, res, ctx) => {
  return res(
    ctx.status(500),
    ctx.json({
      message: 'Internal deployment error.',
      details: 'Command failed: faas-cli exited with code 1.'
    })
  );
});

const server = setupServer();

// Vitest lifecycle hooks
beforeAll(() => server.listen({ onUnhandledRequest: 'error' }));
afterAll(() => server.close());
afterEach(() => server.resetHandlers());

describe('Feature: Self-Service Function Deployment', () => {
  it('Scenario: A developer successfully deploys a new Node.js function', async () => {
    server.use(successResponse);
    const user = userEvent.setup();

    // Given the developer is on the "New Function" page
    render(<App />);
    expect(screen.getByRole('heading', { name: /Deploy a New Function/i })).toBeInTheDocument();

    // And they have selected the "node18-express" template (simplified)

    // When they provide a valid function name "user-profile-service"
    const nameInput = screen.getByLabelText(/Function Name/i);
    await user.type(nameInput, 'user-profile-service');

    // And they paste their function's source code into the editor
    const codeEditor = screen.getByLabelText(/Handler Code/i);
    await user.clear(codeEditor); // remove the default code
    // paste() is used instead of type() because type() treats "{" as the start of a key descriptor
    await user.click(codeEditor);
    await user.paste('module.exports = async () => ({ status: "ok" });');

    // And they click the "Deploy" button
    const deployButton = screen.getByRole('button', { name: /Deploy/i });
    await user.click(deployButton);

    // Then the system should initiate the deployment process
    expect(screen.getByRole('button', { name: /Deploying.../i })).toBeDisabled();

    // And after a short period, they should see a "Deployment Successful" notification
    await waitFor(() => {
      expect(screen.getByText(/Deployment Successful/i)).toBeInTheDocument();
    }, { timeout: 3000 });

    // And the form should be reset
    expect(nameInput).toHaveValue('');
  });

  it('Scenario: Deployment fails due to a server-side error', async () => {
    server.use(errorResponse);
    const user = userEvent.setup();

    // Given I am on the deployment page
    render(<App />);

    // When I fill the form and submit
    await user.type(screen.getByLabelText(/Function Name/i), 'failing-function');
    await user.click(screen.getByRole('button', { name: /Deploy/i }));

    // Then I should see a detailed error notification
    await waitFor(() => {
      expect(screen.getByText(/Deployment Failed/i)).toBeInTheDocument();
      // Check that the error details are shown; this matters a lot for the user experience
      expect(screen.getByText(/Command failed: faas-cli exited with code 1./i)).toBeInTheDocument();
    });

    // And the deploy button should be enabled again
    expect(screen.getByRole('button', { name: /Deploy/i })).not.toBeDisabled();
  });
});
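This test file turns the BDD scenario into executable code. It verifies the UI state changes, the API call, the success and failure flows, and whether the feedback given to the user is clear. For the frontend flow, tests like this are the last line of defense for quality.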
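The failure scenario depends on how the deployer's error body is turned into the text of the toast. One way to make that mapping explicit, and testable without rendering any component, is to isolate it in a pure function. This is a sketch under assumptions: `extractErrorMessage` is a hypothetical helper, and preferring `details` over `message` is a choice made here so that the faas-cli output reaches the user.

```javascript
// Hypothetical helper: turn the deployer's error response into one user-facing string.
// `bodyText` is the raw response body; it may or may not be valid JSON.
function extractErrorMessage(status, bodyText) {
  const fallback = `Deployment failed with status: ${status}`;
  try {
    const data = JSON.parse(bodyText);
    if (data === null || typeof data !== 'object') return fallback;
    // Prefer the specific `details` field, then the generic `message`.
    return data.details || data.message || fallback;
  } catch {
    // Body was not JSON (for example an HTML error page from a proxy).
    return fallback;
  }
}

const body = JSON.stringify({
  message: 'Internal deployment error.',
  details: 'Command failed: faas-cli exited with code 1.',
});
console.log(extractErrorMessage(500, body));
// Command failed: faas-cli exited with code 1.
console.log(extractErrorMessage(502, '<html>Bad Gateway</html>'));
// Deployment failed with status: 502
```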
Architecture flow diagram
To show the whole workflow more clearly, here it is as a Mermaid sequence diagram:
sequenceDiagram
    participant User as Developer
    participant Portal as Shadcn UI Portal
    participant Deployer as Deployer Function (OpenFaaS)
    participant OpenFaaS as OpenFaaS Gateway/Core
    participant Registry as Container Registry
    participant Cluster as Kubernetes/Container Runtime
    User->>Portal: Fills form and clicks "Deploy"
    Portal->>Deployer: POST / with JSON payload (name, code)
    activate Deployer
    Deployer->>Deployer: Creates temporary directory
    Deployer->>Deployer: Executes `faas-cli new`
    Deployer->>Deployer: Writes user code to handler.js
    Deployer->>Deployer: Executes `faas-cli build`
    Note over Deployer,Cluster: Builds container image locally
    Deployer->>Registry: Executes `faas-cli push`
    Registry-->>Deployer: Image pushed
    Deployer->>OpenFaaS: Executes `faas-cli deploy`
    activate OpenFaaS
    OpenFaaS->>Cluster: Creates/Updates Deployment & Service
    Cluster-->>OpenFaaS: Pods are running
    OpenFaaS-->>Deployer: Deployment acknowledged
    deactivate OpenFaaS
    Deployer-->>Portal: 200 OK with function URL
    deactivate Deployer
    Portal->>User: Shows "Success" Toast notification
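Open issues and future iterations
This v1 of the IDP solves the core pain point, but as a pragmatic engineering solution it still has limitations and room for improvement.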
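First, having deployer-function call exec('faas-cli ...') directly is simple, but it carries security and scalability risks. A more robust approach is for deployer-function to talk to the OpenFaaS Gateway REST API over HTTP instead of relying on the CLI. That removes the dependency on the faas-cli binary and allows finer-grained error handling.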
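As a rough idea of what the REST-based variant could look like, the sketch below builds a request for the gateway's function-creation endpoint (`POST /system/functions`, protected by basic auth). It assumes Node.js 18+ for the global `fetch`, and it assumes the container image has already been built and pushed, since the gateway API deploys images but does not build them. `buildDeployRequest` and `deployViaGateway` are hypothetical names, and only the minimal `service` and `image` fields are sent.

```javascript
// Build the request for the gateway's function-creation endpoint.
// `image` must already exist in a registry the cluster can pull from (assumption).
function buildDeployRequest({ gateway, user, password, functionName, image }) {
  const credentials = Buffer.from(`${user}:${password}`).toString('base64');
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${gateway.replace(/\/+$/, '')}/system/functions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Basic ${credentials}`,
      },
      body: JSON.stringify({ service: functionName, image }),
    },
  };
}

// Send the request and surface the gateway's own error text on failure.
async function deployViaGateway(options) {
  const { url, init } = buildDeployRequest(options);
  const response = await fetch(url, init); // global fetch, Node.js 18+
  if (!response.ok) {
    throw new Error(`Gateway returned ${response.status}: ${await response.text()}`);
  }
  return response.status;
}
```

Compared with parsing stderr from a child process, the HTTP status code and response body give the deployer a structured signal to branch on.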
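Second, the current deployment process is synchronous. After the frontend sends the request, it waits until deployer-function has finished. For complex functions (for example, ones that install many dependencies), this can exceed the HTTP request timeout. The next iteration should move to an asynchronous workflow: once the frontend submits a request, deployer-function immediately returns a task ID and publishes a NATS JetStream or Kafka message. One or more worker functions subscribe to that message, perform the actual deployment, and push the deployment status (such as BUILDING, PUSHING, DEPLOYING, READY) back to the frontend in real time via WebSocket or polling.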
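The status model behind such a workflow can be sketched independently of the transport. The example below uses an in-memory map in place of a real broker and database, purely for illustration. `JobStore` is a hypothetical name, and the `FAILED` state is an added assumption, since the list of states above only covers the success path.

```javascript
const { randomUUID } = require('node:crypto');

// States named in the text, in order. FAILED is an added assumption for error cases.
const STAGES = ['BUILDING', 'PUSHING', 'DEPLOYING', 'READY'];

// Hypothetical in-memory store; a real system would persist jobs and publish
// state changes to a message broker instead.
class JobStore {
  constructor() {
    this.jobs = new Map();
  }

  // Register a new deployment job and return its ID immediately.
  create(functionName) {
    const id = randomUUID();
    this.jobs.set(id, { id, functionName, status: STAGES[0], error: null });
    return id;
  }

  // Move a job to the next stage; READY and FAILED are left unchanged.
  advance(id) {
    const job = this.get(id);
    const index = STAGES.indexOf(job.status);
    if (index >= 0 && index < STAGES.length - 1) {
      job.status = STAGES[index + 1];
    }
    return job.status;
  }

  // Mark a job as failed and record the reason for the frontend to display.
  fail(id, reason) {
    const job = this.get(id);
    job.status = 'FAILED';
    job.error = reason;
    return job.status;
  }

  // Look up a job, e.g. from a status-polling endpoint.
  get(id) {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`Unknown job: ${id}`);
    return job;
  }
}
```

In this shape, the deployer would call `create` and return the ID right away, workers would call `advance` or `fail` as they progress, and the polling endpoint would only need `get`.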
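Finally, test coverage is still incomplete. The current end-to-end tests rely on msw to mock the backend, so they cannot catch logic errors in deployer-function itself. A more complete CI pipeline should include an integration test stage that deploys deployer-function for real in a temporary, isolated OpenFaaS environment (for example, a local cluster started with kind or k3d) and runs the tests against it, verifying the entire physical path from a click in the UI to a successfully deployed new function.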