Building a Function-as-a-Service Internal Developer Platform with BDD and OpenFaaS


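The team adopted OpenFaaS with good intentions: reduce the complexity of backend development and let developers focus on business logic. In practice, we soon ran into a different problem. Developers no longer wrote Dockerfiles, but they started writing, debugging, and maintaining increasingly complex stack.yml files. The CI/CD pipeline grew fragile too, full of shell scripts wrapping faas-cli, where any small environment difference could break a deployment. New team members faced a steep learning curve, and the agility that Serverless promised was almost entirely cancelled out by this "toolchain friction".

The pain point was clear: we needed an internal developer platform (IDP) that hides the underlying complexity. The goal was not to replace faas-cli, but to offer a "one-click" graphical release path for the 80% of common cases. A developer should be able to submit function code through a web interface and let the platform handle testing, packaging, release, and verification. This idea became a quarter-long technical project for our team.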

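Initial Concept and Technology Choices

The core of this internal platform is an automated workflow. We started with a behavior-driven development (BDD) approach and defined the core feature as a user story: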

Feature: Self-Service Function Deployment

Scenario: A developer successfully deploys a new Node.js function
  Given the developer is on the "New Function" page
  And they have selected the "node18-express" template
  When they provide a valid function name "user-profile-service"
  And they paste their function's source code into the editor
  And they click the "Deploy" button
  Then the system should initiate the deployment process
  And after a short period, they should see a "Deployment Successful" notification
  And the function "user-profile-service" should be available and responsive at its gateway endpoint

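This Gherkin scenario became the north star of our architecture. It describes the frontend interaction, the backend processing, and the final state verification, and it directly shaped our technology choices: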

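  1. Frontend (UI): Shadcn UI
    We did not need a heavyweight UI library that ships its own design system. We needed basic components for quickly building a clean, professional interface, with full control over those components. Shadcn UI's model, where it is not a dependency but a set of components you copy directly into your own codebase, fits that requirement well. It avoids the pain of component library upgrades and lets us customize freely to match our internal design guidelines. In real projects, this maintainability and control matter far more than out-of-the-box themes.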

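  2. Backend workflow orchestration: OpenFaaS Function
    Using one OpenFaaS function to deploy another sounds a bit recursive, but it is a pragmatic choice. Our IDP backend does not need to run 24/7; it is essentially an event-triggered process. Implementing it as a Serverless function (which we call deployer-function) reuses the existing OpenFaaS infrastructure and avoids the cost of maintaining another long-running service. The deployer-function acts as the platform-facing API: it receives requests from the frontend and, behind the scenes, either runs faas-cli commands or calls the OpenFaaS API directly to complete the deployment.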

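  3. Test framework: Vitest
    The project spans frontend and backend, so we needed a tool that handles both React component tests and Node.js backend logic tests. Vitest stood out for its speed and its tight integration with the Vite ecosystem. More importantly, we can use it to write end-to-end tests that directly verify the BDD scenario defined above. One framework covering unit, integration, and end-to-end tests simplifies our development workflow considerably.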

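Step-by-Step Implementation: The Full Path from Frontend to Backend

Our project uses a pnpm monorepo with two main parts: portal (the React frontend application) and functions (the OpenFaaS functions).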

/faas-idp
├── apps/
│   └── portal/         # React + Shadcn UI frontend
│       ├── src/
│       ├── package.json
│       └── ...
├── functions/
│   ├── deployer/       # deployer function
│   │   ├── handler.js
│   │   ├── package.json
│   │   └── Dockerfile
│   └── stack.yml       # OpenFaaS function definitions
├── package.json
└── pnpm-workspace.yaml

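1. Frontend: Building the Deployment Form

We first initialize the portal application with create-vite, then use the shadcn-ui CLI to add the components we need: Card, Input, Textarea, Button, Label, Select, and Toast.

The core component is DeploymentForm.tsx, which collects the user's input.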

apps/portal/src/components/DeploymentForm.tsx:

// apps/portal/src/components/DeploymentForm.tsx
import { useState, type FormEvent } from "react";
import { Button } from "@/components/ui/button";
import {
  Card,
  CardContent,
  CardDescription,
  CardFooter,
  CardHeader,
  CardTitle,
} from "@/components/ui/card";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { Textarea } from "@/components/ui/textarea";
import { useToast } from "@/components/ui/use-toast";

// In a real project this address should be configurable
const DEPLOYER_FUNCTION_URL = "http://<your-openfaas-gateway>/function/deployer";

interface DeploymentPayload {
  functionName: string;
  template: string;
  handlerCode: string;
}

export function DeploymentForm() {
  const [functionName, setFunctionName] = useState("");
  const [handlerCode, setHandlerCode] = useState(
    `module.exports = async (event, context) => {\n return context\n .status(200)\n .succeed("Hello from your new function!");\n}`
  );
  const [isLoading, setIsLoading] = useState(false);
  const { toast } = useToast();

  const handleSubmit = async (e: FormEvent) => {
    e.preventDefault();
    if (!functionName || !handlerCode) {
      toast({
        title: "Validation Error",
        description: "Function name and code cannot be empty.",
        variant: "destructive",
      });
      return;
    }

    setIsLoading(true);
    const payload: DeploymentPayload = {
      functionName,
      template: "node18-express", // simplified; this should come from a Select component
      handlerCode,
    };

    try {
      const response = await fetch(DEPLOYER_FUNCTION_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify(payload),
      });

      if (!response.ok) {
        // Try to parse the error returned by the backend; tolerate non-JSON bodies
        const errorData = await response.json().catch(() => ({}));
        // Prefer the specific `details` field over the generic `message`
        throw new Error(
          errorData.details ||
            errorData.message ||
            `Deployment failed with status: ${response.status}`
        );
      }

      const result = await response.json();
      toast({
        title: "Deployment Successful",
        description: `Function ${result.functionName} is now available at ${result.url}`,
      });
      setFunctionName(""); // reset the form for the next deployment
    } catch (error) {
      // Error handling here is critical
      const errorMessage =
        error instanceof Error ? error.message : "An unknown error occurred.";
      console.error("Deployment failed:", errorMessage);
      toast({
        title: "Deployment Failed",
        description: errorMessage,
        variant: "destructive",
      });
    } finally {
      setIsLoading(false);
    }
  };

  return (
    <Card className="w-[650px]">
      <CardHeader>
        <CardTitle>Deploy a New Function</CardTitle>
        <CardDescription>Provide details for your new serverless function.</CardDescription>
      </CardHeader>
      <form onSubmit={handleSubmit}>
        <CardContent>
          <div className="grid w-full items-center gap-4">
            <div className="flex flex-col space-y-1.5">
              <Label htmlFor="name">Function Name</Label>
              <Input
                id="name"
                placeholder="e.g., user-profile-service"
                value={functionName}
                onChange={(e) =>
                  setFunctionName(e.target.value.toLowerCase().replace(/\s/g, '-'))
                }
                disabled={isLoading}
              />
            </div>
            <div className="flex flex-col space-y-1.5">
              <Label htmlFor="code">Handler Code (handler.js)</Label>
              <Textarea
                id="code"
                placeholder="Paste your Node.js function code here"
                className="font-mono h-[300px]"
                value={handlerCode}
                onChange={(e) => setHandlerCode(e.target.value)}
                disabled={isLoading}
              />
            </div>
          </div>
        </CardContent>
        <CardFooter className="flex justify-end">
          <Button type="submit" disabled={isLoading}>
            {isLoading ? "Deploying..." : "Deploy"}
          </Button>
        </CardFooter>
      </form>
    </Card>
  );
}

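The error handling and loading-state management in this code are essential for a production-grade UI. Users must always know the current system state and why an operation failed.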
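Error bodies are not always JSON: a proxy or the gateway itself may answer with plain text or an empty body, in which case calling response.json() throws. The following is a minimal sketch of a defensive parser for that case. The helper name readErrorMessage is hypothetical, and the { message, details } shape is assumed to match what deployer-function returns.

```javascript
// Hypothetical helper: turns a failed fetch Response into a readable message.
// Assumes the backend error body looks like { message, details }.
async function readErrorMessage(response) {
  let body = null;
  try {
    body = await response.json();
  } catch {
    // Empty or non-JSON body (for example a plain-text 502); use the fallback below.
    body = null;
  }
  const fields = body && typeof body === 'object' ? body : {};
  // Prefer the most specific text available.
  return (
    fields.details ||
    fields.message ||
    `Deployment failed with status: ${response.status}`
  );
}
```

In the form, this could replace the inline parsing inside the `if (!response.ok)` branch, so that a malformed error body still produces a meaningful toast.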

2. deployer-function: Core Deployment Logic

This function is the heart of the platform. It needs enough permissions to run faas-cli commands, so we built a custom Dockerfile for it that includes faas-cli.

functions/deployer/handler.js:

// functions/deployer/handler.js
'use strict'

const { exec } = require('child_process');
const fs = require('fs').promises;
const path = require('path');
const os = require('os');

// Security note: in production, pass these values via environment variables or OpenFaaS secrets
const OPENFAAS_GATEWAY = process.env.OPENFAAS_GATEWAY || 'http://gateway.openfaas:8080';
const TEMP_DIR_BASE = os.tmpdir();

/**
 * Helper that runs a shell command asynchronously
 * @param {string} command - The command to execute
 * @param {object} [options] - Options forwarded to exec, e.g. { cwd }
 * @returns {Promise<{stdout: string, stderr: string}>}
 */
const execAsync = (command, options = {}) => {
  return new Promise((resolve, reject) => {
    exec(command, options, (error, stdout, stderr) => {
      if (error) {
        console.error(`Exec error for command "${command}": ${stderr}`);
        // Include stderr in the error message; it is essential for debugging
        reject(new Error(`Command failed: ${stderr || error.message}`));
        return;
      }
      resolve({ stdout, stderr });
    });
  });
};

module.exports = async (event, context) => {
  let tempDir = '';
  try {
    const { functionName, template, handlerCode } = event.body;

    // 1. Input validation
    if (!functionName || !/^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/.test(functionName)) {
      return context.status(400).fail({ message: 'Invalid function name. Must be lowercase alphanumeric with optional inner hyphens.' });
    }
    if (!template || !handlerCode) {
      return context.status(400).fail({ message: 'Template and handler code are required.' });
    }
    // The template name is interpolated into shell commands below, so restrict its characters too
    if (!/^[a-z0-9][-a-z0-9]*$/.test(template)) {
      return context.status(400).fail({ message: 'Invalid template name.' });
    }

    // 2. Create a unique temporary working directory
    tempDir = await fs.mkdtemp(path.join(TEMP_DIR_BASE, `faas-deploy-${functionName}-`));
    console.log(`Created temporary directory: ${tempDir}`);

    // 3. Create the function skeleton with faas-cli
    // Both commands run inside tempDir so the pulled template is where `faas-cli new` looks for it
    console.log(`Pulling template: ${template}`);
    await execAsync(`faas-cli template store pull ${template}`, { cwd: tempDir });

    console.log(`Creating new function skeleton for: ${functionName}`);
    // --gateway makes sure the CLI knows which OpenFaaS instance to talk to
    await execAsync(`faas-cli new ${functionName} --lang ${template} --gateway ${OPENFAAS_GATEWAY}`, { cwd: tempDir });

    const functionDirPath = path.join(tempDir, functionName);

    // 4. Write the user's code into the handler file
    const handlerFilePath = path.join(functionDirPath, 'handler.js');
    await fs.writeFile(handlerFilePath, handlerCode);
    console.log(`Wrote user code to ${handlerFilePath}`);

    // 5. Deploy the function
    // The stack file was generated by `faas-cli new` inside the temporary directory
    // Note: `faas-cli deploy` expects the image to already exist in the registry;
    // the build and push steps from the workflow diagram are not shown here
    const stackFilePath = path.join(tempDir, `${functionName}.yml`);
    await execAsync(`faas-cli deploy -f ${stackFilePath} --gateway ${OPENFAAS_GATEWAY}`);
    console.log(`Deployment command issued for ${functionName}`);

    // 6. Return a success response
    const functionUrl = `${OPENFAAS_GATEWAY}/function/${functionName}`;
    return context
      .status(200)
      .succeed({ 
        message: 'Deployment initiated successfully.',
        functionName: functionName,
        url: functionUrl
      });

  } catch (err) {
    // Catch and log the failure
    console.error('Deployment process failed:', err.message);
    // Return a meaningful error message to the frontend
    return context
      .status(500)
      .fail({ 
          message: 'Internal deployment error.',
          details: err.message 
      });
  } finally {
    // 7. Clean up temporary files whether or not the deployment succeeded
    if (tempDir) {
      try {
        await fs.rm(tempDir, { recursive: true, force: true });
        console.log(`Cleaned up temporary directory: ${tempDir}`);
      } catch (cleanupErr) {
        console.error(`Failed to cleanup temp directory ${tempDir}:`, cleanupErr);
      }
    }
  }
}

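The robustness of this handler.js shows in:

  • Detailed logging: every step logs its progress, which makes troubleshooting easier.
  • Strict input validation: rejects invalid function names and guards against injected data.
  • Temporary file management: work happens in a temporary directory to avoid conflicts between concurrent requests, and the finally block guarantees cleanup so the disk does not fill up.
  • Clear error propagation: the stderr from exec is captured and returned to the frontend instead of a vague "server error".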
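Input validation matters especially here because exec passes its command string through a shell, so any user-supplied value that slips past validation could inject extra commands. One way to narrow that risk is execFile, which passes arguments as an array without invoking a shell. The sketch below illustrates the idea; buildNewFunctionArgs and runFaasCli are hypothetical names, and applying the same character pattern to template names is an assumption.

```javascript
const { execFile } = require('child_process');
const { promisify } = require('util');

const execFileAsync = promisify(execFile);

// Assumed allow-list: lowercase letters, digits, and inner hyphens.
const SAFE_NAME = /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/;

// Builds the argument vector for `faas-cli new`, rejecting unsafe values.
function buildNewFunctionArgs(functionName, template, gateway) {
  for (const [label, value] of Object.entries({ functionName, template })) {
    if (typeof value !== 'string' || !SAFE_NAME.test(value)) {
      throw new Error(`Invalid ${label}: ${JSON.stringify(value)}`);
    }
  }
  return ['new', functionName, '--lang', template, '--gateway', gateway];
}

// Runs faas-cli without a shell; each array element becomes exactly one argument.
function runFaasCli(args, cwd) {
  return execFileAsync('faas-cli', args, { cwd });
}
```

A call such as `runFaasCli(buildNewFunctionArgs(name, template, gateway), tempDir)` would then stand in for the string-based execAsync call, with no shell to interpret characters like `;` or `$(...)`.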

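3. Implementing the BDD Tests: Verifying the Full Flow with Vitest

This is what ties all the pieces together. We create an end-to-end test file in the portal application that simulates user behavior and verifies how the whole system responds.

We use msw (Mock Service Worker) to intercept network requests to the deployer-function, so the tests do not need a running OpenFaaS environment, which makes them faster and more reliable.

apps/portal/src/e2e/deployment.spec.tsx:

// apps/portal/src/e2e/deployment.spec.tsx (the .tsx extension is required because the file contains JSX)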
import { describe, it, expect, beforeAll, afterAll, afterEach } from 'vitest';
import { render, screen, fireEvent, waitFor } from '@testing-library/react';
import { userEvent } from '@testing-library/user-event';
import { App } from '@/App'; // assumes the App component renders DeploymentForm
import { rest } from 'msw';
import { setupServer } from 'msw/node';

const DEPLOYER_FUNCTION_URL = "http://localhost/function/deployer"; // local or mock address; must match the URL the component calls

// Mock a successful response
const successResponse = rest.post(DEPLOYER_FUNCTION_URL, (req, res, ctx) => {
  return res(
    ctx.delay(100), // small delay so the "Deploying..." state is observable in the test
    ctx.status(200),
    ctx.json({
      message: 'Deployment initiated successfully.',
      functionName: 'test-function',
      url: `${DEPLOYER_FUNCTION_URL}/test-function`
    })
  );
});

// Mock a failed response
const errorResponse = rest.post(DEPLOYER_FUNCTION_URL, (req, res, ctx) => {
    return res(
        ctx.status(500),
        ctx.json({
            message: 'Internal deployment error.',
            details: 'Command failed: faas-cli exited with code 1.'
        })
    );
});


const server = setupServer();

// Vitest lifecycle hooks
beforeAll(() => server.listen({ onUnhandledRequest: 'error' }));
afterAll(() => server.close());
afterEach(() => server.resetHandlers());


describe('Feature: Self-Service Function Deployment', () => {

  it('Scenario: A developer successfully deploys a new Node.js function', async () => {
    server.use(successResponse);
    const user = userEvent.setup();

    // Given the developer is on the "New Function" page
    render(<App />);
    expect(screen.getByRole('heading', { name: /Deploy a New Function/i })).toBeInTheDocument();
    
    // And they have selected the "node18-express" template (simplified)
    
    // When they provide a valid function name "user-profile-service"
    const nameInput = screen.getByLabelText(/Function Name/i);
    await user.type(nameInput, 'user-profile-service');

    // And they paste their function's source code into the editor
    const codeEditor = screen.getByLabelText(/Handler Code/i);
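    await user.clear(codeEditor); // clear the default code (this also focuses the editor)
    // paste() is used instead of type() because "{" is a special character in user.type() input
    await user.paste('module.exports = async () => ({ status: "ok" });');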

    // And they click the "Deploy" button
    const deployButton = screen.getByRole('button', { name: /Deploy/i });
    await user.click(deployButton);

    // Then the system should initiate the deployment process
    expect(screen.getByRole('button', { name: /Deploying.../i })).toBeDisabled();

    // And after a short period, they should see a "Deployment Successful" notification
    await waitFor(() => {
      expect(screen.getByText(/Deployment Successful/i)).toBeInTheDocument();
    }, { timeout: 3000 });

    // And the form should be reset
    expect(nameInput).toHaveValue('');
  });

  it('Scenario: Deployment fails due to a server-side error', async () => {
    server.use(errorResponse);
    const user = userEvent.setup();

    // Given I am on the deployment page
    render(<App />);
    
    // When I fill the form and submit
    await user.type(screen.getByLabelText(/Function Name/i), 'failing-function');
    await user.click(screen.getByRole('button', { name: /Deploy/i }));

    // Then I should see a detailed error notification
    await waitFor(() => {
        expect(screen.getByText(/Deployment Failed/i)).toBeInTheDocument();
        // Verify that the error details are shown, which matters for the user experience
        expect(screen.getByText(/Command failed: faas-cli exited with code 1./i)).toBeInTheDocument();
    });

    // And the deploy button should be enabled again
    expect(screen.getByRole('button', { name: /Deploy/i })).not.toBeDisabled();
  });
});

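This test file turns the BDD scenario into executable code. It verifies the UI state changes, the API call, the success and failure paths, and whether the feedback shown to the user is clear. Tests like this are the project's last line of defense for quality.

Architecture Flow Diagram

To show the whole workflow more clearly, we can express it as a Mermaid diagram: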

sequenceDiagram
    participant User as Developer
    participant Portal as Shadcn UI Portal
    participant Deployer as Deployer Function (OpenFaaS)
    participant OpenFaaS as OpenFaaS Gateway/Core
    participant Registry as Container Registry
    participant Cluster as Kubernetes/Container Runtime

    User->>Portal: Fills form and clicks "Deploy"
    Portal->>Deployer: POST / with JSON payload (name, code)
    activate Deployer
    Deployer->>Deployer: Creates temporary directory
    Deployer->>Deployer: Executes `faas-cli new`
    Deployer->>Deployer: Writes user code to handler.js
    Deployer->>Deployer: Executes `faas-cli build`
    Note over Deployer,Cluster: Builds container image locally
    Deployer->>Registry: Executes `faas-cli push`
    Registry-->>Deployer: Image pushed
    Deployer->>OpenFaaS: Executes `faas-cli deploy`
    activate OpenFaaS
    OpenFaaS->>Cluster: Creates/Updates Deployment & Service
    Cluster-->>OpenFaaS: Pods are running
    OpenFaaS-->>Deployer: Deployment acknowledged
    deactivate OpenFaaS
    Deployer-->>Portal: 200 OK with function URL
    deactivate Deployer
    Portal->>User: Shows "Success" Toast notification

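Open Issues and Future Iterations

This v1 of the IDP solves the core pain points, but as a pragmatic engineering solution it still has limitations and room for improvement.

First, having the deployer-function run exec('faas-cli ...') directly is simple, but it carries risks for security and scalability. A more robust approach is for the deployer-function to talk to the OpenFaaS Gateway's REST API over HTTP instead of depending on the CLI. This removes the dependency on the faas-cli binary and allows finer-grained error handling.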
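As a rough illustration of the REST approach, the sketch below posts a function definition to the gateway's /system/functions endpoint using basic authentication. It assumes the container image has already been built and pushed, that Node.js 18+ provides a global fetch, and that the credentials come from a secret; buildDeployRequest and deployViaGateway are hypothetical names, not part of our codebase.

```javascript
// Builds the HTTP request for the gateway's function-deployment endpoint.
// `image` must point to an image that already exists in the registry (assumption).
function buildDeployRequest({ gateway, functionName, image, username, password }) {
  const credentials = Buffer.from(`${username}:${password}`).toString('base64');
  return {
    url: `${gateway}/system/functions`,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Basic ${credentials}`,
      },
      body: JSON.stringify({ service: functionName, image }),
    },
  };
}

// Sends the request with the global fetch (Node.js 18+) and surfaces gateway errors.
async function deployViaGateway(options) {
  const { url, init } = buildDeployRequest(options);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Gateway returned ${response.status}: ${await response.text()}`);
  }
}
```

Separating request construction from sending keeps the payload logic testable without a gateway, and the HTTP status code and response body give the caller more structured failure information than parsing CLI stderr.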

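Second, the current deployment process is synchronous. After sending the request, the frontend waits until the deployer-function finishes. For complex functions (for example, ones that install many dependencies), this can exceed the HTTP timeout. The next iteration should move to an asynchronous workflow: after the frontend submits a request, the deployer-function immediately returns a task ID and publishes a NATS JetStream or Kafka message. One or more worker functions subscribe to that message to perform the actual deployment, and the deployment status (such as BUILDING, PUSHING, DEPLOYING, READY) is pushed back to the frontend in real time via WebSocket or polling.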
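The status tracking behind such a workflow can be sketched as a small state machine. The example below uses an in-memory Map purely as a stand-in for a durable store and leaves out the message broker entirely; the DeploymentJobs class and the QUEUED and FAILED states are assumptions added for illustration.

```javascript
const { randomUUID } = require('crypto');

// BUILDING..READY come from the text above; QUEUED and FAILED are assumed additions.
const STATUS_ORDER = ['QUEUED', 'BUILDING', 'PUSHING', 'DEPLOYING', 'READY'];

class DeploymentJobs {
  constructor() {
    this.jobs = new Map(); // stand-in for a durable store shared by workers
  }

  // Registers a job and returns the task ID handed back to the frontend.
  create(functionName) {
    const id = randomUUID();
    this.jobs.set(id, { id, functionName, status: 'QUEUED' });
    return id;
  }

  // Looks up a job; this is what a polling endpoint would return.
  get(id) {
    const job = this.jobs.get(id);
    if (!job) throw new Error(`Unknown job: ${id}`);
    return job;
  }

  // Moves a job to the next status; READY and FAILED are terminal.
  advance(id) {
    const job = this.get(id);
    const index = STATUS_ORDER.indexOf(job.status);
    if (index === -1 || index === STATUS_ORDER.length - 1) {
      throw new Error(`Job ${id} is in terminal state ${job.status}`);
    }
    job.status = STATUS_ORDER[index + 1];
    return job.status;
  }

  // Marks a job as failed and records the reason for the UI.
  fail(id, reason) {
    const job = this.get(id);
    job.status = 'FAILED';
    job.reason = reason;
  }
}
```

In this shape, the deployer-function would call create() and return the ID immediately, workers would call advance() or fail() as they progress, and the frontend would poll get() (or receive the same data over WebSocket).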

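Finally, test coverage is still incomplete. The current end-to-end tests rely on msw to mock the backend, so they cannot catch logic errors in the deployer-function itself. A more complete CI pipeline should include an integration test stage that deploys the deployer-function for real in a temporary, isolated OpenFaaS environment (for example, a local cluster started with kind or k3d) and runs the tests against it, verifying the entire physical path from a UI click to a successfully deployed new function.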
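Such an integration stage would also need to check the last step of the Gherkin scenario, namely that the new function is actually available. A minimal sketch of a readiness poll against the gateway is shown below. It assumes the gateway's function status endpoint reports an availableReplicas field; the helper name waitForFunctionReady and its option names are hypothetical.

```javascript
// Polls GET /system/function/<name> until the function reports a ready replica.
// fetchImpl is injectable so the helper can be exercised without a cluster.
async function waitForFunctionReady(gateway, functionName, {
  fetchImpl = fetch, // global fetch in Node.js 18+
  attempts = 30,
  intervalMs = 2000,
  headers = {},      // e.g. an Authorization header for the gateway
} = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    const response = await fetchImpl(`${gateway}/system/function/${functionName}`, { headers });
    if (response.ok) {
      const status = await response.json();
      if ((status.availableReplicas || 0) > 0) return status;
    }
    // Wait before the next attempt, but not after the final one.
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Function ${functionName} was not ready after ${attempts} attempts`);
}
```

An integration test could call this after submitting the form against the real deployer-function and then invoke the function's gateway URL to confirm that it responds.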

