Register service

2026-02-09 23:59:25 +08:00
parent f145df4fa6
commit 3c03777b97
11 changed files with 865 additions and 456 deletions

@@ -0,0 +1,67 @@
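# Executing the Service Registration Management Solution
## 1. Task Overview
Carry out the implementation steps described in the service registration management solution document and confirm that the service registration management features work correctly.
## 2. Execution Steps
### 2.1 Check Backend Service Status
- **Task**: Check that the backend service is running normally
- **Action**: Inspect the backend service's runtime status and confirm it is serving on http://0.0.0.0:8001
- **Verification**: The service starts successfully with no error messages
### 2.2 Check Frontend Service Status
- **Task**: Check that the frontend service is running normally
- **Action**: Inspect the frontend service's runtime status and confirm it is serving on http://localhost:3000
- **Verification**: The service starts successfully with no error messages
### 2.3 Test Service Registration
- **Task**: Test that service registration works
- **Action**: Log in with an administrator account, open the service registration page, fill in the registration form, and submit it
- **Verification**: The service is registered successfully and the API returns a 201 Created status code
### 2.4 Test Service Management
- **Task**: Test that service management works
- **Action**: Open the service management page and test the service list, service details, service operations (start, stop, restart), service deletion, service status, and service logs
- **Verification**: Every feature works and returns the corresponding success status code
### 2.5 Test Service Groups
- **Task**: Test that service grouping works
- **Action**: Open the service group page and test creating a group, listing groups, viewing group details, updating a group, and deleting a group
- **Verification**: Every feature works and returns the corresponding success status code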
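Steps 2.1 and 2.2 can be scripted as a quick reachability check. The sketch below is an illustration under stated assumptions, not part of the solution document: the helper names `is_reachable` and `check_services` are hypothetical, and a client should probe `127.0.0.1` rather than `0.0.0.0`, since the latter is a bind address.

```python
import urllib.error
import urllib.request
from typing import Dict


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP GET to `url` receives any HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so it is still up.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar errors.
        return False


def check_services(targets: Dict[str, str]) -> Dict[str, bool]:
    """Map each service name to whether its URL is reachable."""
    return {name: is_reachable(url) for name, url in targets.items()}


# Example call (ports taken from steps 2.1 and 2.2):
# check_services({
#     "backend": "http://127.0.0.1:8001",
#     "frontend": "http://localhost:3000",
# })
```

A reachable URL only shows that something is listening on the port; the functional tests in 2.3 to 2.5 are still needed to confirm behavior.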
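## 3. Expected Results
- **Service registration**: New AI algorithm services can be registered successfully
- **Service management**: Registered services can be managed normally, including viewing, operating, and deleting them
- **Service groups**: Service groups can be managed normally, including creating, viewing, updating, and deleting them
- **System stability**: All features work correctly with no error messages
## 4. Notes
- **Permission check**: Make sure you log in with an administrator account
- **Docker service**: Make sure the Docker service is running, because service deployment relies on Docker
- **Database connection**: Make sure the PostgreSQL database connection is healthy
- **Network connection**: Make sure the network connection is healthy so that operations do not fail because of network problems
## 5. Troubleshooting
- **Service fails to start**: Check whether the Docker service is running normally
- **Insufficient permissions**: Make sure you are logged in with an administrator account
- **Database errors**: Check whether the PostgreSQL database connection is healthy
- **Network errors**: Check whether the network connection is healthy
- **Missing dependencies**: Make sure the project includes all required dependencies
## 6. Execution Order
1. Check backend service status
2. Check frontend service status
3. Test service registration
4. Test service management
5. Test service groups
6. Verify that all features work correctly
---
Follow the steps above to execute the service registration management solution and confirm that all features work correctly.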

@@ -1,125 +0,0 @@
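# Service Registration Management Implementation Plan
## Current State of the Project
The project currently uses a Vue 3 + FastAPI + PostgreSQL stack and already implements basic service registration, but it has the following problems:
1. **Incomplete frontend authentication**: API calls lack authentication, so loading the repository list fails
2. **Incomplete service registration**: It uses mock data and does not fetch real repository information
3. **Limited service management**: Service grouping, batch management, and monitoring are missing
4. **No database management UI**: Services and repositories stored in the database cannot be viewed directly
## Implementation Plan
### Phase 1: Fix Basic Functionality (completed)
#### 1. Improve Frontend Authentication
- **Update frontend API calls**: Make sure every API call goes through axios and automatically carries the auth token
- **Implement token management**: Add token expiry detection and automatic refresh
- **Improve login state**: Persist the user's login state and restore it automatically
#### 2. Fix the Service Registration Flow
- **Load the real repository list**: Call the backend API to fetch repository information from the database
- **Improve the registration form**: Remove algorithm selection and show the repository description and address
- **Tighten form validation**: Add stricter form validation and error messages
#### 3. Strengthen the Backend Service Registration API
- **Fetch real repository information**: Query repository details from the database
- **Complete the registration logic**: Implement real service creation and deployment
- **Add error handling**: Strengthen API error handling and exception capture
### Phase 2: Core Features
#### 1. Service Group Frontend
- **Group management**: Dialogs for creating, editing, and deleting groups
- **Categorized service view**: A group tree on the left and the service list for the selected group on the right
- **Linking services to groups**: A dropdown for choosing the group when editing a service
- **UI design**: Clear visual hierarchy and interaction feedback
#### 2. Service Monitoring
- **Health checks**: A backend scheduled task checks service status, supporting HTTP, TCP, and custom checks
- **Real-time status monitoring**: Use WebSocket for real-time frontend updates to reduce polling overhead
- **Metrics**: Core metrics such as CPU usage, memory usage, response time, and request count
- **Frontend display**: Real-time status cards, alert dialogs for anomalies, and a metrics list
- **Anomaly handling**: Raise an alert when a service's status is abnormal, with optional email notification
### Phase 3: System Integration and Optimization
#### 1. System Integration Testing
- **Scope**: Core features such as service group management, service monitoring, the service registration flow, and user authentication
- **Methods**: Unit tests (pytest), API tests (requests), and frontend integration tests (Vue Test Utils)
- **Focus**: Verify the implementation and confirm that all core features run correctly
- **Goal**: Confirm system stability and reliability; no complex performance testing
#### 2. Feature Optimization and Completion
- **Database management UI**: Display service and repository data
- **Service list improvements**: Implement pagination, filtering, and a detail view
- **Automatic API documentation**: Use FastAPI's built-in documentation
- **Documentation**: Complete the API documentation and usage guide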
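## Technical Implementation Details
### Frontend
- **Pinia for state management**: Manage user login state and service data
- **Element Plus components**: Build the service management UI with components such as Tree, Table, and Dialog
- **axios interceptors**: Handle authentication uniformly for all API calls
- **WebSocket**: Update service status in real time and reduce polling overhead
### Backend
- **FastAPI for the API**: Implement a high-performance service management API
- **SQLAlchemy for database access**: Persist services and repositories
- **JWT for authentication**: Implement secure user authentication
- **Docker for service management**: Deploy services as containers
- **apscheduler**: Run backend scheduled tasks for service health checks
- **websockets**: Provide the backend WebSocket service for real-time data push
### Database Design
- **Service table**: Stores basic service information, configuration, and status
- **Service group table**: Stores service group information, with a one-to-many relationship to the service table
- **Repository table**: Stores algorithm repository information, including name, description, and address
- **Monitoring data table**: Stores service metrics and health check results
### Core API Design
#### Service Group API
- `GET /api/service-groups`: List all service groups
- `POST /api/service-groups`: Create a new service group
- `GET /api/service-groups/{group_id}`: Get the details of a single service group
- `PUT /api/service-groups/{group_id}`: Update a service group
- `DELETE /api/service-groups/{group_id}`: Delete a service group
#### Service Monitoring API
- `GET /api/services/{service_id}/status`: Get the status of a single service
- `GET /api/services/status`: Get the status of all services
- `GET /api/services/{service_id}/metrics`: Get the metrics of a service
- `GET /api/services/metrics`: Get the metrics of all services
#### WebSocket API
- `ws://{host}:{port}/ws/services`: Real-time service status updates
- `ws://{host}:{port}/ws/metrics`: Real-time metrics updates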
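## Expected Outcome
1. **Stable service registration**: Users can register new services normally, and the system handles authentication and data storage correctly
2. **Efficient service management**: Service grouping and categorized display give services a clear organization
3. **Real-time service monitoring**: WebSocket delivers real-time status updates so anomalies are found and handled promptly
4. **Complete data display**: All services and repositories in the database can be viewed, including service details
5. **Good user experience**: The UI is simple and intuitive, workflows are smooth, and responses are fast
6. **Extensible architecture**: The system can accommodate more services and new features later
## Risk Assessment
1. **Authentication**: Every API call must handle authentication correctly to avoid 401 errors
2. **WebSocket connections**: Connection stability and reconnection after a drop must be handled
3. **Database performance**: Database queries must be optimized to keep service management responsive
4. **Service deployment**: Service deployment must be reliable and stable
5. **System integration**: The frontend, backend, and database must integrate seamlessly
## Success Metrics
1. **Registration success rate**: 100% of service registration requests are processed successfully
2. **Service management response time**: The service list loads in under 2 seconds and operations respond in under 1 second
3. **Monitoring data freshness**: Monitoring data update latency is under 0.5 seconds, giving near-real-time monitoring
4. **System stability**: The system runs for 7 consecutive days without failure and service monitoring keeps working
5. **User satisfaction**: Workflows are smooth, the UI is attractive and easy to use, and features are complete
6. **Feature completeness**: All core features (service groups, service monitoring, system integration) are implemented and working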

@@ -0,0 +1,53 @@
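# Service Deployment Plan: Running Without Docker
## Problem Analysis
The current ServiceOrchestrator class depends entirely on Docker to deploy and manage services. When Docker is not available in the environment, service deployment fails and returns the error "Docker连接失败" (Docker connection failed).
## Solution
Modify the ServiceOrchestrator class to add a local-process deployment mode that starts the service process directly on the host when Docker is unavailable.
## Implementation Steps
### 1. Modify the ServiceOrchestrator Class
1. **Add a deployment mode setting**: Accept a deployment mode in the initializer, supporting two modes, "docker" and "local".
2. **Modify deploy_service**: Choose the deployment strategy according to the deployment mode.
- When the mode is "docker", use the existing Docker deployment logic.
- When the mode is "local", use the local-process deployment logic.
3. **Implement local-process deployment**
- Create the service directory structure
- Generate the service wrapper
- Start the service process with the subprocess module
- Verify that the service started
4. **Modify the service management methods**
- Update start_service, stop_service, restart_service, and related methods to support local process management.
- Update get_service_status, get_service_logs, and related methods to support status queries and log retrieval for local processes.
### 2. Modify the Service Registration Endpoint
Modify the register_service function in services.py to add a deployment mode parameter so that users can choose between Docker and local-process deployment.
### 3. Update the Configuration File
Add a deployment mode setting to settings.py with a default of "local" so that the system also works in environments without Docker.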
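The start, stop, and restart operations described above can be sketched with the standard library alone. This is a minimal illustration, not the commit's implementation: the class name `LocalProcessManager` and its methods are hypothetical, and it keeps the `subprocess.Popen` handle instead of looking processes up by PID with psutil as the commit does.

```python
import subprocess
from typing import Dict, List, Optional


class LocalProcessManager:
    """Hypothetical helper that tracks one child process per service ID."""

    def __init__(self) -> None:
        self._procs: Dict[str, subprocess.Popen] = {}

    def is_running(self, service_id: str) -> bool:
        proc = self._procs.get(service_id)
        # poll() returns None while the child is still alive.
        return proc is not None and proc.poll() is None

    def start(self, service_id: str, cmd: List[str], cwd: Optional[str] = None) -> int:
        """Start the service if it is not already running and return its PID."""
        if self.is_running(service_id):
            return self._procs[service_id].pid
        proc = subprocess.Popen(
            cmd,
            cwd=cwd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        self._procs[service_id] = proc
        return proc.pid

    def stop(self, service_id: str, timeout: float = 30.0) -> bool:
        """Terminate the service; return False if the ID is unknown."""
        proc = self._procs.get(service_id)
        if proc is None:
            return False
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=timeout)
            except subprocess.TimeoutExpired:
                # Escalate if the process ignores the termination request.
                proc.kill()
                proc.wait()
        return True

    def restart(self, service_id: str, cmd: List[str], cwd: Optional[str] = None) -> int:
        self.stop(service_id)
        return self.start(service_id, cmd, cwd)
```

Holding the `Popen` object lets the parent reap the child with `wait()`, which avoids leaving zombie processes behind; a PID-only record would be needed instead if the state had to survive a restart of the orchestrator.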
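## Key Technical Points
1. **Process management**: Use the subprocess module to create and manage service processes so that they start and stop cleanly.
2. **Port management**: Give every service a unique port to avoid port conflicts.
3. **Service wrapper**: Reuse the existing service wrapper generation logic so that locally deployed services expose the same interface as Docker-deployed ones.
4. **State management**: Manage local process state, including start, stop, and restart.
5. **Log management**: Retrieve logs from local processes so that service runtime logs can be viewed.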
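The port management point can be illustrated with a short sketch. The function names `find_free_port` and `is_port_in_use` are hypothetical and do not appear in the commit; this is one common approach, assuming services bind to the loopback interface.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for a TCP port that is unused at the time of the call."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        # Binding to port 0 lets the OS pick an available port.
        sock.bind((host, 0))
        return sock.getsockname()[1]


def is_port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0
```

There is an unavoidable gap between picking a port and the service binding to it, so another process could claim the port in between; a caller would likely still want to handle a bind failure at service startup.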
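## Expected Outcome
After this change, service registration management works in environments without Docker, and users can choose whether to deploy a service with Docker or as a local process.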

@@ -37,6 +37,9 @@ class Settings(BaseSettings):
     # API configuration
     API_V1_STR: str = "/api/v1"
 
+    # Deployment configuration
+    DEPLOYMENT_MODE: str = "local"  # deployment mode: "docker" or "local"
+
     # Gitea configuration
     GITEA_SERVER_URL: str = ""
     GITEA_ACCESS_TOKEN: str = ""

@@ -148,9 +148,6 @@ class ServiceGroup(Base):
     status = Column(String, default="active", index=True)  # status
     created_at = Column(DateTime(timezone=True), server_default=func.now())
     updated_at = Column(DateTime(timezone=True), onupdate=func.now())
-
-    # Relationships
-    services = relationship("AlgorithmService", back_populates="group")
 
 
 class AlgorithmService(Base):
@@ -159,7 +156,6 @@ class AlgorithmService(Base):
     id = Column(String, primary_key=True, index=True)
     service_id = Column(String, unique=True, nullable=False, index=True)  # service ID
-    group_id = Column(String, ForeignKey("service_groups.id"), nullable=True, index=True)  # group ID
     name = Column(String, nullable=False, index=True)  # service name
     algorithm_name = Column(String, nullable=False)  # algorithm name
     version = Column(String, nullable=False)  # version
@@ -172,9 +168,6 @@ class AlgorithmService(Base):
     last_heartbeat = Column(DateTime(timezone=True), nullable=True)  # time of last heartbeat
     created_at = Column(DateTime(timezone=True), server_default=func.now())
     updated_at = Column(DateTime(timezone=True), onupdate=func.now())
-
-    # Relationships
-    group = relationship("ServiceGroup", back_populates="services")
 
 
 # Add the repository relationship for the Algorithm model

@@ -6,6 +6,7 @@ from pydantic import BaseModel
 import uuid
 import os
 
+from app.config.settings import settings
 from app.models.models import AlgorithmService, ServiceGroup, AlgorithmRepository
 from app.models.database import SessionLocal
 from app.routes.user import get_current_active_user
@@ -42,7 +43,7 @@ class ServiceResponse(BaseModel):
     api_url: str
     status: str
     created_at: str
-    updated_at: str
+    updated_at: Optional[str]
 
 
 class ServiceListResponse(BaseModel):
@@ -127,7 +128,7 @@ class BatchOperationResponse(BaseModel):
 # Initialize the service components
 project_analyzer = ProjectAnalyzer()
 service_generator = ServiceGenerator()
-service_orchestrator = ServiceOrchestrator()
+service_orchestrator = ServiceOrchestrator(deployment_mode=settings.DEPLOYMENT_MODE)
 
 
 @router.post("/register", status_code=status.HTTP_201_CREATED)
@@ -137,7 +138,9 @@ async def register_service(
 ):
     """Register a new service"""
     # Check user permissions
-    if current_user.role_name != "admin":
+    print(f"用户角色: {current_user.role_name}")
+    print(f"用户角色对象: {current_user.role}")
+    if not hasattr(current_user, 'role_name') or current_user.role_name != "admin":
         raise HTTPException(status_code=403, detail="Insufficient permissions")
 
     # Create a database session
@@ -236,7 +239,7 @@ def main(data):
                 "api_url": new_service.api_url,
                 "status": new_service.status,
                 "created_at": new_service.created_at.isoformat(),
-                "updated_at": new_service.updated_at.isoformat()
+                "updated_at": new_service.updated_at.isoformat() if new_service.updated_at else None
             }
         }
     finally:
@@ -272,7 +275,7 @@ async def list_services(
                 api_url=service.api_url,
                 status=service.status,
                 created_at=service.created_at.isoformat(),
-                updated_at=service.updated_at.isoformat()
+                updated_at=service.updated_at.isoformat() if service.updated_at else None
             ))
 
         return ServiceListResponse(
@@ -316,7 +319,7 @@ async def get_service(
                 api_url=service.api_url,
                 status=service.status,
                 created_at=service.created_at.isoformat(),
-                updated_at=service.updated_at.isoformat()
+                updated_at=service.updated_at.isoformat() if service.updated_at else None
             )
         )
     finally:
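
The `updated_at` column in this codebase has only `onupdate=func.now()` and no default, so it stays NULL until a row is first updated; that is why the hunks above guard each `isoformat()` call. The repeated conditional could be factored into one helper. A minimal sketch, assuming a hypothetical helper name `iso_or_none` that is not part of this commit:

```python
from datetime import datetime, timezone
from typing import Optional


def iso_or_none(value: Optional[datetime]) -> Optional[str]:
    """Format a datetime as ISO 8601, passing None through unchanged."""
    return value.isoformat() if value is not None else None
```

Each call site would then read `updated_at=iso_or_none(service.updated_at)`.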

@@ -39,8 +39,14 @@ async def get_current_active_user(db: Session = Depends(get_db), token: str = De
             )
 
         # Use UserService to fetch the user, avoiding direct use of the User model
+        print(f"尝试通过用户名获取用户: {username}")
         user = UserService.get_user_by_username(db, username)
+        print(f"获取用户结果: {user.id if user else 'None'}")
         if not user:
+            # Try querying the database directly
+            from app.models.models import User as UserModel
+            direct_user = db.query(UserModel).filter(UserModel.username == username).first()
+            print(f"直接查询数据库结果: {direct_user.id if direct_user else 'None'}")
             raise HTTPException(
                 status_code=status.HTTP_401_UNAUTHORIZED,
                 detail="Could not validate credentials",
@@ -51,19 +57,38 @@ async def get_current_active_user(db: Session = Depends(get_db), token: str = De
         if user.status != "active":
             raise HTTPException(status_code=400, detail="Inactive user")
 
-        # Use UserService to fetch the role
-        role = UserService.get_role_by_id(db, user.role_id)
-
         # Build the role response
         role_response = None
-        if role:
-            role_response = RoleResponse(
-                id=role.id,
-                name=role.name,
-                description=role.description,
-                created_at=role.created_at,
-                updated_at=role.updated_at
-            )
+        role_name = None
+
+        # Try to fetch the role
+        try:
+            # First try the preloaded role
+            if hasattr(user, 'role') and user.role:
+                role = user.role
+                role_response = RoleResponse(
+                    id=role.id,
+                    name=role.name,
+                    description=role.description,
+                    created_at=role.created_at,
+                    updated_at=role.updated_at
+                )
+                role_name = role.name
+            else:
+                # If no role was preloaded, look it up by role_id
+                role = UserService.get_role_by_id(db, user.role_id)
+                if role:
+                    role_response = RoleResponse(
+                        id=role.id,
+                        name=role.name,
+                        description=role.description,
+                        created_at=role.created_at,
+                        updated_at=role.updated_at
+                    )
+                    role_name = role.name
+        except Exception as e:
+            # A failed role lookup does not block user authentication
+            print(f"获取角色信息失败: {e}")
 
         # Build the user response
         user_response = UserResponse(
@@ -75,7 +100,7 @@ async def get_current_active_user(db: Session = Depends(get_db), token: str = De
             created_at=user.created_at,
             updated_at=user.updated_at,
             role=role_response,
-            role_name=role.name if role else None
+            role_name=role_name
         )
 
         return user_response
@@ -85,6 +110,13 @@ async def get_current_active_user(db: Session = Depends(get_db), token: str = De
             detail="Could not validate credentials",
             headers={"WWW-Authenticate": "Bearer"},
         )
+    except Exception as e:
+        print(f"获取当前用户失败: {e}")
+        raise HTTPException(
+            status_code=status.HTTP_401_UNAUTHORIZED,
+            detail="Could not validate credentials",
+            headers={"WWW-Authenticate": "Bearer"},
+        )
 
 
 from app.schemas.user import LoginRequest
@@ -151,6 +183,61 @@ async def get_users(
return {"users": users, "total": len(users)} return {"users": users, "total": len(users)}
# 角色管理API
@router.post("/roles", response_model=RoleResponse)
async def create_role(
role: RoleCreate,
current_user: UserResponse = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""创建角色"""
# 只有管理员可以创建角色
if current_user.role_name != "admin":
raise HTTPException(status_code=403, detail="Not enough permissions")
# 检查角色名称是否已存在
if UserService.get_role_by_name(db, role.name):
raise HTTPException(status_code=400, detail="Role name already exists")
# 创建角色
db_role = UserService.create_role(db, role)
return db_role
@router.get("/roles", response_model=List[RoleResponse])
async def get_roles(
current_user: UserResponse = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""获取角色列表"""
# 只有管理员可以查看所有角色
if current_user.role_name != "admin":
raise HTTPException(status_code=403, detail="Not enough permissions")
roles = UserService.get_roles(db)
return roles
@router.get("/roles/{role_id}", response_model=RoleResponse)
async def get_role(
role_id: str,
current_user: UserResponse = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""获取角色详情"""
# 只有管理员可以查看角色详情
if current_user.role_name != "admin":
raise HTTPException(status_code=403, detail="Not enough permissions")
role = UserService.get_role_by_id(db, role_id)
if not role:
raise HTTPException(status_code=404, detail="Role not found")
return role
@router.get("/{user_id}", response_model=UserResponse) @router.get("/{user_id}", response_model=UserResponse)
async def get_user( async def get_user(
user_id: str, user_id: str,
@@ -218,58 +305,3 @@ async def delete_user(
db.commit() db.commit()
return {"message": "User deleted successfully"} return {"message": "User deleted successfully"}
# 角色管理API
@router.post("/roles", response_model=RoleResponse)
async def create_role(
role: RoleCreate,
current_user: UserResponse = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""创建角色"""
# 只有管理员可以创建角色
if current_user.role_name != "admin":
raise HTTPException(status_code=403, detail="Not enough permissions")
# 检查角色名称是否已存在
if UserService.get_role_by_name(db, role.name):
raise HTTPException(status_code=400, detail="Role name already exists")
# 创建角色
db_role = UserService.create_role(db, role)
return db_role
@router.get("/roles", response_model=List[RoleResponse])
async def get_roles(
current_user: UserResponse = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""获取角色列表"""
# 只有管理员可以查看所有角色
if current_user.role_name != "admin":
raise HTTPException(status_code=403, detail="Not enough permissions")
roles = UserService.get_roles(db)
return roles
@router.get("/roles/{role_id}", response_model=RoleResponse)
async def get_role(
role_id: str,
current_user: UserResponse = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""获取角色详情"""
# 只有管理员可以查看角色详情
if current_user.role_name != "admin":
raise HTTPException(status_code=403, detail="Not enough permissions")
role = UserService.get_role_by_id(db, role_id)
if not role:
raise HTTPException(status_code=404, detail="Role not found")
return role

@@ -5,6 +5,9 @@ import json
 import time
 import docker
 import uuid
+import subprocess
+import signal
+import psutil
 from typing import Dict, Any, Optional
 from docker.errors import DockerException, NotFound
 
@@ -12,17 +15,28 @@ from docker.errors import DockerException, NotFound
 class ServiceOrchestrator:
     """Service orchestration service"""
 
-    def __init__(self):
-        """Initialize the service orchestrator"""
-        try:
-            # Connect the Docker client
-            self.client = docker.from_env()
-            # Test the connection
-            self.client.ping()
-            print("Docker连接成功")
-        except DockerException as e:
-            print(f"Docker连接失败: {e}")
-            self.client = None
+    def __init__(self, deployment_mode="local"):
+        """Initialize the service orchestrator
+
+        Args:
+            deployment_mode: deployment mode, either "docker" or "local"
+        """
+        self.deployment_mode = deployment_mode
+        self.processes = {}  # stores information about local processes
+
+        if deployment_mode == "docker":
+            try:
+                # Connect the Docker client
+                self.client = docker.from_env()
+                # Test the connection
+                self.client.ping()
+                print("Docker连接成功")
+            except DockerException as e:
+                print(f"Docker连接失败: {e}")
+                self.client = None
+        else:
+            self.client = None
+            print("使用本地进程部署模式")
 
     def deploy_service(self, service_id: str, service_config: Dict[str, Any], project_info: Dict[str, Any]) -> Dict[str, Any]:
         """Deploy a service
@@ -36,44 +50,78 @@ class ServiceOrchestrator:
部署结果 部署结果
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"service_id": service_id,
"container_id": None,
"status": "error",
"api_url": None
}
# 1. 构建Docker镜像
image_name = self._build_docker_image(service_id, project_info, service_config)
# 2. 启动服务容器
container_id = self._start_service_container(service_id, image_name, service_config)
# 3. 验证服务启动
if not self._verify_service_startup(container_id, service_config):
return {
"success": False,
"error": "服务启动验证失败",
"service_id": service_id,
"container_id": container_id,
"status": "error",
"api_url": None
}
# 4. 构建API URL
api_url = f"http://{service_config.get('host', 'localhost')}:{service_config.get('port', 8000)}"
return { return {
"success": False, "success": True,
"error": "Docker连接失败",
"service_id": service_id,
"container_id": None,
"status": "error",
"api_url": None
}
# 1. 构建Docker镜像
image_name = self._build_docker_image(service_id, project_info, service_config)
# 2. 启动服务容器
container_id = self._start_service_container(service_id, image_name, service_config)
# 3. 验证服务启动
if not self._verify_service_startup(container_id, service_config):
return {
"success": False,
"error": "服务启动验证失败",
"service_id": service_id, "service_id": service_id,
"container_id": container_id, "container_id": container_id,
"status": "error", "status": "running",
"api_url": None "api_url": api_url,
"error": None
}
else:
# 本地进程部署
# 1. 创建服务目录
service_dir = self._create_service_directory(service_id)
# 2. 生成服务包装器
self._generate_local_service_wrapper(service_dir, project_info, service_config)
# 3. 启动服务进程
process_info = self._start_local_service_process(service_id, service_dir, project_info, service_config)
# 4. 验证服务启动
if not self._verify_local_service_startup(service_id, service_config):
return {
"success": False,
"error": "服务启动验证失败",
"service_id": service_id,
"container_id": None,
"status": "error",
"api_url": None
}
# 5. 构建API URL
api_url = f"http://{service_config.get('host', 'localhost')}:{service_config.get('port', 8000)}"
return {
"success": True,
"service_id": service_id,
"container_id": service_id, # 使用服务ID作为容器ID
"status": "running",
"api_url": api_url,
"error": None
} }
# 4. 构建API URL
api_url = f"http://{service_config.get('host', 'localhost')}:{service_config.get('port', 8000)}"
return {
"success": True,
"service_id": service_id,
"container_id": container_id,
"status": "running",
"api_url": api_url,
"error": None
}
except Exception as e: except Exception as e:
return { return {
"success": False, "success": False,
@@ -95,35 +143,85 @@ class ServiceOrchestrator:
启动结果 启动结果
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"service_id": service_id,
"status": "error"
}
# 获取容器
container = self.client.containers.get(container_id)
# 启动容器
container.start()
# 验证服务启动
if not self._verify_service_health(container_id):
return {
"success": False,
"error": "服务健康检查失败",
"service_id": service_id,
"status": "error"
}
return { return {
"success": False, "success": True,
"error": "Docker连接失败",
"service_id": service_id, "service_id": service_id,
"status": "error" "status": "running",
"error": None
} }
else:
# 获取容器 # 本地进程启动
container = self.client.containers.get(container_id) if service_id not in self.processes:
return {
# 启动容器 "success": False,
container.start() "error": "服务不存在",
"service_id": service_id,
# 验证服务启动 "status": "error"
if not self._verify_service_health(container_id): }
process_info = self.processes[service_id]
# 检查进程是否已经在运行
if process_info.get("pid"):
try:
process = psutil.Process(process_info["pid"])
if process.is_running():
return {
"success": True,
"service_id": service_id,
"status": "running",
"error": None
}
except:
pass
# 重新启动进程
service_dir = process_info["service_dir"]
project_info = process_info["project_info"]
service_config = process_info["service_config"]
# 启动服务进程
new_process_info = self._start_local_service_process(service_id, service_dir, project_info, service_config)
# 验证服务启动
if not self._verify_local_service_startup(service_id, service_config):
return {
"success": False,
"error": "服务启动验证失败",
"service_id": service_id,
"status": "error"
}
return { return {
"success": False, "success": True,
"error": "服务健康检查失败",
"service_id": service_id, "service_id": service_id,
"status": "error" "status": "running",
"error": None
} }
return {
"success": True,
"service_id": service_id,
"status": "running",
"error": None
}
except NotFound: except NotFound:
return { return {
"success": False, "success": False,
@@ -150,26 +248,58 @@ class ServiceOrchestrator:
停止结果 停止结果
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"service_id": service_id,
"status": "error"
}
# 获取容器
container = self.client.containers.get(container_id)
# 停止容器
container.stop(timeout=30)
return { return {
"success": False, "success": True,
"error": "Docker连接失败",
"service_id": service_id, "service_id": service_id,
"status": "error" "status": "stopped",
"error": None
}
else:
# 本地进程停止
if service_id not in self.processes:
return {
"success": False,
"error": "服务不存在",
"service_id": service_id,
"status": "error"
}
process_info = self.processes[service_id]
# 停止进程
if process_info.get("pid"):
try:
process = psutil.Process(process_info["pid"])
if process.is_running():
process.terminate()
process.wait(timeout=30)
except:
pass
# 更新进程状态
self.processes[service_id]["pid"] = None
return {
"success": True,
"service_id": service_id,
"status": "stopped",
"error": None
} }
# 获取容器
container = self.client.containers.get(container_id)
# 停止容器
container.stop(timeout=30)
return {
"success": True,
"service_id": service_id,
"status": "stopped",
"error": None
}
except NotFound: except NotFound:
return { return {
"success": False, "success": False,
@@ -196,35 +326,81 @@ class ServiceOrchestrator:
重启结果 重启结果
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"service_id": service_id,
"status": "error"
}
# 获取容器
container = self.client.containers.get(container_id)
# 重启容器
container.restart(timeout=30)
# 验证服务启动
if not self._verify_service_health(container_id):
return {
"success": False,
"error": "服务健康检查失败",
"service_id": service_id,
"status": "error"
}
return { return {
"success": False, "success": True,
"error": "Docker连接失败",
"service_id": service_id, "service_id": service_id,
"status": "error" "status": "running",
"error": None
} }
else:
# 获取容器 # 本地进程重启
container = self.client.containers.get(container_id) if service_id not in self.processes:
return {
# 重启容器 "success": False,
container.restart(timeout=30) "error": "服务不存在",
"service_id": service_id,
# 验证服务启动 "status": "error"
if not self._verify_service_health(container_id): }
process_info = self.processes[service_id]
# 停止当前进程
if process_info.get("pid"):
try:
process = psutil.Process(process_info["pid"])
if process.is_running():
process.terminate()
process.wait(timeout=30)
except:
pass
# 重新启动进程
service_dir = process_info["service_dir"]
project_info = process_info["project_info"]
service_config = process_info["service_config"]
# 启动服务进程
new_process_info = self._start_local_service_process(service_id, service_dir, project_info, service_config)
# 验证服务启动
if not self._verify_local_service_startup(service_id, service_config):
return {
"success": False,
"error": "服务启动验证失败",
"service_id": service_id,
"status": "error"
}
return { return {
"success": False, "success": True,
"error": "服务健康检查失败",
"service_id": service_id, "service_id": service_id,
"status": "error" "status": "running",
"error": None
} }
return {
"success": True,
"service_id": service_id,
"status": "running",
"error": None
}
except NotFound: except NotFound:
return { return {
"success": False, "success": False,
@@ -252,34 +428,72 @@ class ServiceOrchestrator:
删除结果 删除结果
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"service_id": service_id
}
# 停止并删除容器
if container_id:
try:
container = self.client.containers.get(container_id)
container.stop(timeout=10)
container.remove(force=True)
except NotFound:
pass
# 删除镜像
if image_name:
try:
self.client.images.remove(image_name, force=True)
except:
pass
return { return {
"success": False, "success": True,
"error": "Docker连接失败", "service_id": service_id,
"service_id": service_id "error": None
} }
else:
# 停止并删除容器 # 本地进程删除
if container_id: if service_id not in self.processes:
return {
"success": False,
"error": "服务不存在",
"service_id": service_id
}
process_info = self.processes[service_id]
# 停止进程
if process_info.get("pid"):
try:
process = psutil.Process(process_info["pid"])
if process.is_running():
process.terminate()
process.wait(timeout=30)
except:
pass
# 删除服务目录
service_dir = process_info["service_dir"]
try: try:
container = self.client.containers.get(container_id) import shutil
container.stop(timeout=10) shutil.rmtree(service_dir)
container.remove(force=True)
except NotFound:
pass
# 删除镜像
if image_name:
try:
self.client.images.remove(image_name, force=True)
except: except:
pass pass
return { # 从进程列表中删除
"success": True, del self.processes[service_id]
"service_id": service_id,
"error": None return {
} "success": True,
"service_id": service_id,
"error": None
}
except Exception as e: except Exception as e:
return { return {
"success": False, "success": False,
@@ -297,34 +511,78 @@ class ServiceOrchestrator:
服务状态 服务状态
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"status": "unknown",
"health": "unknown"
}
# 获取容器
container = self.client.containers.get(container_id)
# 获取容器状态
status = container.status
# 检查服务健康状态
health = "unknown"
if status == "running":
if self._verify_service_health(container_id):
health = "healthy"
else:
health = "unhealthy"
return { return {
"success": False, "success": True,
"error": "Docker连接失败", "status": status,
"status": "unknown", "health": health,
"health": "unknown" "error": None
} }
else:
# 获取容器 # 本地进程状态查询
container = self.client.containers.get(container_id) # 假设container_id就是service_id
service_id = container_id
# 获取容器状态
status = container.status if service_id not in self.processes:
return {
# 检查服务健康状态 "success": False,
health = "unknown" "error": "服务不存在",
if status == "running": "status": "not_found",
if self._verify_service_health(container_id): "health": "unknown"
health = "healthy" }
process_info = self.processes[service_id]
# 检查进程状态
status = "unknown"
health = "unknown"
if process_info.get("pid"):
try:
process = psutil.Process(process_info["pid"])
if process.is_running():
status = "running"
# 检查服务健康状态
service_config = process_info["service_config"]
if self._verify_local_service_health(service_id, service_config):
health = "healthy"
else:
health = "unhealthy"
else:
status = "stopped"
except:
status = "stopped"
else: else:
health = "unhealthy" status = "stopped"
return { return {
"success": True, "success": True,
"status": status, "status": status,
"health": health, "health": health,
"error": None "error": None
} }
except NotFound: except NotFound:
return { return {
"success": False, "success": False,
@@ -351,24 +609,63 @@ class ServiceOrchestrator:
服务日志 服务日志
""" """
try: try:
if not self.client: if self.deployment_mode == "docker":
if not self.client:
return {
"success": False,
"error": "Docker连接失败",
"logs": []
}
# 获取容器
container = self.client.containers.get(container_id)
# 获取日志
logs = container.logs(tail=lines).decode('utf-8').split('\n')
return { return {
"success": False, "success": True,
"error": "Docker连接失败", "logs": logs,
"logs": [] "error": None
}
else:
# 本地进程日志获取
# 假设container_id就是service_id
service_id = container_id
if service_id not in self.processes:
return {
"success": False,
"error": "服务不存在",
"logs": []
}
process_info = self.processes[service_id]
# 获取日志文件路径
log_file = process_info.get("log_file")
if not log_file or not os.path.exists(log_file):
return {
"success": True,
"logs": [],
"error": None
}
# 读取日志文件
try:
with open(log_file, 'r') as f:
logs = f.readlines()
# 只返回最后lines行
logs = [line.rstrip('\n') for line in logs[-lines:]]
except:
logs = []
return {
"success": True,
"logs": logs,
"error": None
} }
# 获取容器
container = self.client.containers.get(container_id)
# 获取日志
logs = container.logs(tail=lines).decode('utf-8').split('\n')
return {
"success": True,
"logs": logs,
"error": None
}
except NotFound: except NotFound:
return { return {
"success": False, "success": False,
@@ -960,3 +1257,135 @@ json
} }
with open(os.path.join(build_context, "package.json"), "w") as f: with open(os.path.join(build_context, "package.json"), "w") as f:
json.dump(package_data, f, indent=2) json.dump(package_data, f, indent=2)
def _create_service_directory(self, service_id: str) -> str:
"""创建服务目录
Args:
service_id: 服务ID
Returns:
服务目录路径
"""
service_dir = os.path.join("/tmp", f"algorithm-service-{service_id}")
os.makedirs(service_dir, exist_ok=True)
return service_dir

    def _generate_local_service_wrapper(self, service_dir: str, project_info: Dict[str, Any], service_config: Dict[str, Any]):
        """Generate the wrapper script for a local service.

        Args:
            service_dir: Service directory.
            project_info: Project information.
            service_config: Service configuration.
        """
        # Generate the service wrapper
        service_wrapper_content = self._generate_service_wrapper(project_info, service_config)
        wrapper_extension = ".py" if project_info["project_type"] == "python" else ".js"
        with open(os.path.join(service_dir, f"service_wrapper{wrapper_extension}"), "w") as f:
            f.write(service_wrapper_content)

        # Create a placeholder (mock) algorithm file
        algorithm_content = """
def predict(data):
    return {"result": "Prediction result", "input": data}

def run(data):
    return {"result": "Run result", "input": data}

def main(data):
    return {"result": "Main result", "input": data}
"""
        with open(os.path.join(service_dir, "algorithm.py"), "w") as f:
            f.write(algorithm_content)

    def _start_local_service_process(self, service_id: str, service_dir: str, project_info: Dict[str, Any], service_config: Dict[str, Any]) -> Dict[str, Any]:
        """Start the local service process.

        Args:
            service_id: Service ID.
            service_dir: Service directory.
            project_info: Project information.
            service_config: Service configuration.

        Returns:
            Process information.
        """
        # Path of the log file
        log_file = os.path.join(service_dir, f"service_{service_id}.log")

        # Build the start command
        if project_info["project_type"] == "python":
            cmd = ["python", "service_wrapper.py"]
        else:
            cmd = ["node", "service_wrapper.js"]

        # Set environment variables
        env = os.environ.copy()
        env["HOST"] = service_config.get("host", "0.0.0.0")
        env["PORT"] = str(service_config.get("port", 8000))
        env["TIMEOUT"] = str(service_config.get("timeout", 30))

        # Start the process. The child keeps its own copy of the log file
        # descriptor, so the parent's handle is closed once Popen returns.
        with open(log_file, "a") as log_handle:
            process = subprocess.Popen(
                cmd,
                cwd=service_dir,
                env=env,
                stdout=log_handle,
                stderr=subprocess.STDOUT,
                start_new_session=True
            )

        # Save the process information
        process_info = {
            "pid": process.pid,
            "service_dir": service_dir,
            "log_file": log_file,
            "project_info": project_info,
            "service_config": service_config
        }
        self.processes[service_id] = process_info
        return process_info
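
Review note: the process start above redirects the child's output into a log file. The following is a minimal standalone sketch of that pattern, not the project's implementation. The helper name `start_logged_process` and its parameters are hypothetical; it assumes the parent only needs the file handle until `subprocess.Popen` returns, because the child inherits its own copy of the descriptor.

```python
import os
import subprocess
import sys
import tempfile


def start_logged_process(cmd, cwd, log_file, extra_env=None):
    """Start cmd in its own session, appending stdout/stderr to log_file.

    Hypothetical helper for illustration only.
    """
    env = os.environ.copy()
    env.update(extra_env or {})
    # The child receives its own copy of the descriptor, so the parent
    # can close its handle as soon as Popen has returned.
    with open(log_file, "a") as log_handle:
        return subprocess.Popen(
            cmd,
            cwd=cwd,
            env=env,
            stdout=log_handle,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX: detach from the parent's session
        )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "service_demo.log")
        proc = start_logged_process(
            [sys.executable, "-c", "import os; print(os.environ['PORT'])"],
            cwd=tmp,
            log_file=log_path,
            extra_env={"PORT": "8000"},
        )
        proc.wait()
        with open(log_path) as f:
            print(f.read().strip())
```

Using `sys.executable` in the demo avoids depending on whichever `python` happens to be first on `PATH`; whether that is appropriate for the deployed services is a separate decision.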
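
    def _verify_local_service_startup(self, service_id: str, service_config: Dict[str, Any]) -> bool:
        """Verify that the local service has started.

        Args:
            service_id: Service ID.
            service_config: Service configuration.

        Returns:
            True if the service started successfully.
        """
        # Give the service time to start
        time.sleep(5)
        # Verify the service health status
        return self._verify_local_service_health(service_id, service_config)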
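
Review note: the startup check above waits a fixed five seconds before probing once. One possible alternative, sketched here under the assumption that the probe is cheap to repeat, is to poll until a deadline. The helper `wait_until` and the stand-in probe are hypothetical names, not part of this commit.

```python
import time


def wait_until(check, timeout=30.0, interval=0.5):
    """Call check() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    attempts = []

    def fake_health_check():
        # Stand-in for a real health probe: succeeds on the third call.
        attempts.append(1)
        return len(attempts) >= 3

    print(wait_until(fake_health_check, timeout=5.0, interval=0.01))
```

A fast service is reported ready sooner with polling, and a slow one gets more than five seconds before it is declared failed.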

    def _verify_local_service_health(self, service_id: str, service_config: Dict[str, Any]) -> bool:
        """Verify the health status of the local service.

        Args:
            service_id: Service ID.
            service_config: Service configuration.

        Returns:
            True if the service is healthy.
        """
        try:
            import requests

            # Build the health check URL
            host = service_config.get("host", "localhost")
            port = service_config.get("port", 8000)
            health_check_url = f"http://{host}:{port}/health"

            # Send the health check request
            response = requests.get(health_check_url, timeout=10)
            return response.status_code == 200
        except Exception:
            # Any failure (import error, connection error, timeout) counts as unhealthy
            return False

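
Review note: the local log-reading branch earlier in this file loads the whole log with `readlines()` and then slices the last `lines` entries. For large logs, a bounded `collections.deque` keeps only the tail in memory. This is a sketch under that assumption; `tail_lines` is a hypothetical helper name.

```python
import os
import tempfile
from collections import deque


def tail_lines(path, lines=100):
    """Return the last `lines` lines of a text file, newline-stripped."""
    if lines <= 0:
        return []
    try:
        with open(path, "r", encoding="utf-8", errors="replace") as f:
            # deque(maxlen=N) discards older lines while iterating the file.
            return [line.rstrip("\n") for line in deque(f, maxlen=lines)]
    except OSError:
        # Missing or unreadable file: report no logs instead of raising.
        return []


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_log = os.path.join(tmp, "demo.log")
        with open(demo_log, "w", encoding="utf-8") as f:
            f.write("one\ntwo\nthree\n")
        print(tail_lines(demo_log, 2))
```

The explicit `lines <= 0` guard also avoids the slicing pitfall where `logs[-0:]` returns the entire list.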
View File

@@ -178,87 +178,41 @@ const formatDate = (dateString: string) => {
 // Load the service list
 const loadServices = async () => {
   try {
-    // The backend API should be called here to fetch the service list
-    // Mock data is used for now
-    services.value = [
-      {
-        id: '1',
-        service_id: 'service-001',
-        name: 'Image classification service',
-        algorithm_name: 'image-classification',
-        version: '1.0.0',
-        status: 'running',
-        host: '192.168.1.100',
-        port: 8000,
-        api_url: 'http://192.168.1.100:8000/execute',
-        start_time: new Date().toISOString(),
-        last_heartbeat: new Date().toISOString(),
-        description: 'ResNet-based image classification service',
-        config: {
-          cpu_limit: '2 cores',
-          memory_limit: '4GB',
-          replicas: 2,
-          timeout: 30
-        },
-        logs: [
-          '[2024-01-01 10:00:00] Service started successfully',
-          '[2024-01-01 10:05:00] Registered with the service center',
-          '[2024-01-01 10:10:00] Handling request: image classification',
-          '[2024-01-01 10:15:00] Request completed, elapsed: 120ms'
-        ]
-      },
-      {
-        id: '2',
-        service_id: 'service-002',
-        name: 'Text classification service',
-        algorithm_name: 'text-classification',
-        version: '1.0.0',
-        status: 'stopped',
-        host: '192.168.1.101',
-        port: 8001,
-        api_url: 'http://192.168.1.101:8001/execute',
-        start_time: new Date().toISOString(),
-        last_heartbeat: new Date().toISOString(),
-        description: 'BERT-based text classification service',
-        config: {
-          cpu_limit: '4 cores',
-          memory_limit: '8GB',
-          replicas: 1,
-          timeout: 60
-        },
-        logs: [
-          '[2024-01-01 09:00:00] Service started successfully',
-          '[2024-01-01 09:30:00] Service stopped'
-        ]
-      },
-      {
-        id: '3',
-        service_id: 'service-003',
-        name: 'Object detection service',
-        algorithm_name: 'object-detection',
-        version: '2.0.0',
-        status: 'running',
-        host: '192.168.1.102',
-        port: 8002,
-        api_url: 'http://192.168.1.102:8002/execute',
-        start_time: new Date().toISOString(),
-        last_heartbeat: new Date().toISOString(),
-        description: 'YOLOv5-based object detection service',
-        config: {
-          cpu_limit: '8 cores',
-          memory_limit: '16GB',
-          replicas: 1,
-          timeout: 120
-        },
-        logs: [
-          '[2024-01-01 11:00:00] Service started successfully',
-          '[2024-01-01 11:05:00] Registered with the service center',
-          '[2024-01-01 11:10:00] Handling request: object detection',
-          '[2024-01-01 11:15:00] Request completed, elapsed: 500ms'
-        ]
-      }
-    ]
-    console.log('Service list loaded')
+    // Get the token from local storage
+    const token = localStorage.getItem('token')
+    if (!token) {
+      ElMessage.error('Not logged in, please log in again')
+      return
+    }
+
+    // Call the backend API to fetch the service list
+    const response = await fetch('http://0.0.0.0:8001/api/v1/services', {
+      method: 'GET',
+      headers: {
+        'Content-Type': 'application/json',
+        'Authorization': `Bearer ${token}`
+      }
+    })
+
+    if (!response.ok) {
+      throw new Error('Failed to fetch the service list')
+    }
+
+    const data = await response.json()
+    if (data.success) {
+      // Normalize the service data, filling in missing fields
+      services.value = data.services.map((service: any) => ({
+        ...service,
+        last_heartbeat: service.last_heartbeat || null,
+        start_time: service.start_time || null,
+        description: service.description || '',
+        config: service.config || {},
+        logs: []
+      }))
+      console.log('Service list loaded', services.value)
+    } else {
+      throw new Error(data.message || 'Failed to fetch the service list')
+    }
   } catch (error) {
     console.error('Failed to load the service list:', error)
     ElMessage.error('Failed to load the service list')