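Ironic Deep Dive: How the OpenStack Bare Metal Service Works, in Theory and Practice
Role and Responsibilities
Ironic is OpenStack's bare metal service. It lets OpenStack manage physical servers the same way it manages virtual machines:
- Deploys operating systems by network booting over PXE/iPXE
- Controls server power through IPMI/Redfish
- Integrates with Nova, so users consume physical machines through the same workflow as virtual machines
- Supports RAID configuration, BIOS settings, and firmware upgrades
- Typical use cases: high-performance computing, AI training clusters, database servers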
Architecture Overview

```
User request (same API as for Nova virtual machines)
        │
        ▼
Nova API → Nova Scheduler → Nova Compute (ironic virt driver)
        │
        ▼
ironic-api (REST API)
        │ RPC
        ▼
ironic-conductor (core engine)
        │
        ├── IPMI Driver      → power control (on / off / reboot)
        ├── Redfish Driver   → modern BMC interface (successor to IPMI)
        ├── PXE Driver       → network boot
        └── iDRAC/iLO Driver → vendor-specific interfaces
```
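Bare Metal Deployment Flow

```
1. Node enrollment (operator task)
   openstack baremetal node create → supply the IPMI address, MAC address, etc.
   (older releases used the legacy ironic node-create command)

2. Node introspection
   ironic-inspector → PXE boot → collect hardware facts (CPU / memory / disk)

3. User requests a bare metal instance
   openstack server create --flavor baremetal --image centos8 myserver
   (legacy equivalent: nova boot)

4. Nova Scheduler picks a bare metal node
   (resources are matched through Placement)

5. Ironic deployment
   ├── set the node to the deploying state
   ├── power on via IPMI
   ├── PXE boot → load the deploy ramdisk
   ├── deploy ramdisk writes the image to disk (dd/partclone)
   ├── configure the bootloader (grub)
   ├── reboot via IPMI → boot from disk
   └── node becomes active and is handed to the user
```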
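Node State Machine

```
enroll (registered)
    │ manage
    ▼
manageable
    │ provide
    ▼
available (ready to deploy)
    │ deploy (triggered by Nova)
    ▼
deploying
    │
    ▼
wait call-back (waiting for the ramdisk to call back)
    │
    ▼
active (running)
    │ undeploy (Nova deletes the instance)
    ▼
deleting → cleaning
    │
    ▼
available (ready again)
```

This is a simplified view. The wait call-back state belongs to the deploy path, where the conductor waits for the deploy ramdisk; tearing down an instance goes through deleting and cleaning before the node returns to available.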
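The state machine can be sketched as a plain transition table. This is a minimal illustration rather than Ironic's own implementation: it covers only a simplified subset of the upstream states, and the internal event labels (`wait`, `resume`, `done`, `clean`) as well as the `advance` and `run` helpers are assumptions made for the example.

```python
# Simplified subset of the provisioning state machine.
# Keys are (current state, verb); values are the resulting state.
# The event labels "wait", "resume", "done", and "clean" are illustrative.
TRANSITIONS = {
    ("enroll", "manage"): "manageable",
    ("manageable", "provide"): "available",
    ("available", "deploy"): "deploying",
    ("deploying", "wait"): "wait call-back",
    ("wait call-back", "resume"): "deploying",
    ("deploying", "done"): "active",
    ("active", "undeploy"): "deleting",
    ("deleting", "clean"): "cleaning",
    ("cleaning", "done"): "available",
}


def advance(state: str, verb: str) -> str:
    """Return the next state, or raise if the verb is not allowed here."""
    try:
        return TRANSITIONS[(state, verb)]
    except KeyError:
        raise ValueError(
            f"verb {verb!r} is not valid in state {state!r}") from None


def run(state: str, verbs: list[str]) -> str:
    """Apply a sequence of verbs, starting from the given state."""
    for verb in verbs:
        state = advance(state, verb)
    return state
```

Keeping the transitions in one table makes illegal moves, such as deploying a node that is still in enroll, fail loudly instead of silently.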
PXE Boot Mechanism

```
Physical machine powers on
    │
    ▼
BIOS/UEFI → NIC PXE ROM
    │
    ▼
DHCP request (broadcast)
    │
    ▼
DHCP server (Neutron DHCP, or a standalone dnsmasq in deployments without Neutron)
    │  replies with: IP + next-server (TFTP server address) + filename (pxelinux.0)
    ▼
Download the boot file over TFTP
    ├── pxelinux.0 (BIOS mode)
    └── grubx64.efi (UEFI mode)
    │
    ▼
Load the pxelinux.cfg/<mac-address> config
    │  points at the kernel + initrd (deploy ramdisk)
    ▼
Boot the deploy ramdisk (a small Linux system running from memory)
    │
    ▼
deploy ramdisk calls back to ironic-conductor
    │  reports: ready, send me an image
    ▼
ironic-conductor supplies the image URL over HTTP
    │
    ▼
deploy ramdisk downloads the image and writes it to disk
    │  dd if=image.raw of=/dev/sda bs=4M
    ▼
Write finished; ramdisk calls back to ironic-conductor
    │
    ▼
ironic-conductor reboots the machine via IPMI
    │
    ▼
Machine boots from disk; deployment complete
```
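The per-host config lookup can be sketched in a few lines. PXELINUX searches `pxelinux.cfg/` for a file named after the client's MAC address, prefixed with the ARP hardware type `01` and written with dashes. The helper names below are hypothetical, and the `ipa-api-url` kernel parameter and the file names in the usage example are assumptions chosen for illustration; Ironic renders its own, more detailed templates.

```python
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def pxelinux_config_name(mac: str) -> str:
    """File name PXELINUX looks up for a host: '01-' + MAC with dashes."""
    normalized = mac.strip().lower().replace("-", ":")
    if not MAC_RE.match(normalized):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "01-" + normalized.replace(":", "-")


def render_pxe_entry(kernel: str, ramdisk: str, callback_url: str) -> str:
    """Render a minimal PXELINUX entry that boots the deploy ramdisk."""
    return "\n".join([
        "default deploy",
        "label deploy",
        f"  kernel {kernel}",
        # The callback URL tells the ramdisk where to report in (assumed name).
        f"  append initrd={ramdisk} ipa-api-url={callback_url}",
        "",
    ])


if __name__ == "__main__":
    print(pxelinux_config_name("AA:BB:CC:DD:EE:FF"))
    print(render_pxe_entry("deploy.kernel", "deploy.initramfs",
                           "http://ironic.example:6385"))
```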
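Driver Architecture
Power Driver

```python
from ironic.common import exception, states
from ironic.drivers import base


# Simplified illustration of an IPMI power interface. The `ipmitool` helper
# namespace used below is pseudocode; in current Ironic the real
# implementation is IPMIPower in ironic/drivers/modules/ipmitool.py.
class NativeIPMIPower(base.PowerInterface):
    def get_power_state(self, task):
        """Return the node's current power state."""
        return ipmitool.get_power_state(task.node)

    def set_power_state(self, task, power_state, timeout=None):
        """Drive the node to the requested power state."""
        if power_state == states.POWER_ON:
            ipmitool.power_on(task.node)
        elif power_state == states.POWER_OFF:
            ipmitool.power_off(task.node)
        elif power_state == states.REBOOT:
            ipmitool.power_reset(task.node)
        else:
            # Reject unknown states instead of silently doing nothing.
            raise exception.InvalidParameterValue(
                f"unsupported power state: {power_state}")
```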
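At the lowest level, IPMI power control comes down to an `ipmitool` invocation against the BMC. The sketch below builds that command line from a node's `driver_info`. The `ipmitool_argv` and `set_power` helpers and the short action names are hypothetical; the flags (`-I lanplus`, `-H`, `-p`, `-U`, `-P`) are standard `ipmitool` options.

```python
import subprocess

# Map short action names (chosen for this sketch) to ipmitool power verbs.
POWER_VERBS = {"on": "on", "off": "off", "reboot": "reset", "status": "status"}


def ipmitool_argv(driver_info: dict, action: str) -> list[str]:
    """Build the ipmitool command line for a chassis power action."""
    if action not in POWER_VERBS:
        raise ValueError(f"unsupported action: {action!r}")
    return [
        "ipmitool", "-I", "lanplus",
        "-H", driver_info["ipmi_address"],
        "-p", str(driver_info.get("ipmi_port", 623)),
        "-U", driver_info["ipmi_username"],
        "-P", driver_info["ipmi_password"],
        "chassis", "power", POWER_VERBS[action],
    ]


def set_power(driver_info: dict, action: str) -> None:
    """Run the command; raises CalledProcessError if ipmitool fails."""
    subprocess.run(ipmitool_argv(driver_info, action), check=True)


if __name__ == "__main__":
    info = {"ipmi_address": "10.0.0.100", "ipmi_username": "admin",
            "ipmi_password": "secret"}
    print(" ".join(ipmitool_argv(info, "status")))
```

Passing the password with `-P` exposes it in the process list, so production tooling usually reads it from a file with `-f` instead.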
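Redfish Driver (Modern BMCs)

```python
import sushy

from ironic.common import exception, states
from ironic.drivers import base
from ironic.drivers.modules.redfish import utils as redfish_utils


class RedfishPower(base.PowerInterface):
    def set_power_state(self, task, power_state, timeout=None):
        """Map Ironic power states onto Redfish reset types via sushy."""
        system = redfish_utils.get_system(task.node)
        if power_state == states.POWER_ON:
            system.reset_system(sushy.RESET_ON)
        elif power_state == states.POWER_OFF:
            system.reset_system(sushy.RESET_FORCE_OFF)
        else:
            raise exception.InvalidParameterValue(
                f"unsupported power state: {power_state}")
```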
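On the wire, a Redfish power change is an HTTP POST to the system's `ComputerSystem.Reset` action with a `ResetType` value. The sketch below only builds the request and does not send it. The `build_reset_request` helper, the action names, and the BMC address are hypothetical, authentication is left out, and a real client should read the action target from the System resource rather than assume the conventional path used here.

```python
import json
import urllib.request

# Short action names (chosen for this sketch) mapped to Redfish ResetType values.
RESET_TYPES = {"on": "On", "off": "ForceOff", "reboot": "ForceRestart"}


def build_reset_request(base_url: str, system_id: str,
                        action: str) -> urllib.request.Request:
    """Build (but do not send) the POST that resets a Redfish system."""
    if action not in RESET_TYPES:
        raise ValueError(f"unsupported action: {action!r}")
    url = (f"{base_url.rstrip('/')}/redfish/v1/Systems/{system_id}"
           "/Actions/ComputerSystem.Reset")
    body = json.dumps({"ResetType": RESET_TYPES[action]}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"})


if __name__ == "__main__":
    req = build_reset_request("https://10.0.0.100", "1", "off")
    print(req.get_method(), req.full_url, req.data.decode())
```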
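Node Enrollment and Introspection

```bash
# Enroll the node with its BMC credentials. The resource class is what the
# CUSTOM_BAREMETAL flavor property matches against during scheduling.
openstack baremetal node create \
  --driver ipmi \
  --driver-info ipmi_address=10.0.0.100 \
  --driver-info ipmi_username=admin \
  --driver-info ipmi_password=secret \
  --driver-info ipmi_port=623 \
  --resource-class baremetal \
  --name server-01

# Register the NIC that will PXE boot (the MAC address is a positional argument)
openstack baremetal port create \
  --node <node-uuid> \
  aa:bb:cc:dd:ee:ff

# Set hardware properties by hand (or let introspection discover them)
openstack baremetal node set server-01 \
  --property cpus=32 \
  --property memory_mb=131072 \
  --property local_gb=1800 \
  --property cpu_arch=x86_64

# Move the node to manageable; introspection requires this state
openstack baremetal node manage server-01

# Run introspection and check its progress
openstack baremetal introspection start server-01
openstack baremetal introspection status server-01

# Make the node available for deployment
openstack baremetal node provide server-01
```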
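When enrolling many machines, it is common to generate these commands from an inventory file. The following is a minimal sketch under that assumption: the record layout and the `node_create_argv` and `port_create_argv` helpers are hypothetical, and the output is only the argument list for the `openstack baremetal` commands shown above, not a call into any OpenStack library.

```python
def node_create_argv(record: dict) -> list[str]:
    """Build 'openstack baremetal node create' arguments for one record."""
    argv = ["openstack", "baremetal", "node", "create",
            "--driver", record.get("driver", "ipmi"),
            "--name", record["name"]]
    # Sort the keys so the generated command is stable across runs.
    for key, value in sorted(record["driver_info"].items()):
        argv += ["--driver-info", f"{key}={value}"]
    return argv


def port_create_argv(node_uuid: str, mac: str) -> list[str]:
    """Build 'openstack baremetal port create' arguments for one NIC."""
    return ["openstack", "baremetal", "port", "create",
            "--node", node_uuid, mac]


if __name__ == "__main__":
    record = {
        "name": "server-01",
        "driver_info": {"ipmi_address": "10.0.0.100",
                        "ipmi_username": "admin",
                        "ipmi_password": "secret"},
    }
    print(" ".join(node_create_argv(record)))
```

Returning an argument list instead of a shell string keeps values with spaces or quotes safe when the list is handed to `subprocess.run`.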
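Nova Integration

```ini
[DEFAULT]
# Use the Ironic virt driver on this nova-compute service
compute_driver = ironic.IronicDriver

[ironic]
# Keystone credentials Nova uses to call the Ironic API
auth_type = password
auth_url = http://keystone:5000/v3
username = nova
password = secret
project_name = service
# Keystone v3 password auth also needs the domains; adjust to your deployment
user_domain_name = Default
project_domain_name = Default
```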

```bash
# Bare metal flavor: request one CUSTOM_BAREMETAL unit and zero out the
# standard resources so that scheduling is based on the resource class alone
openstack flavor create baremetal.large \
  --vcpus 32 \
  --ram 131072 \
  --disk 1800 \
  --property resources:CUSTOM_BAREMETAL=1 \
  --property resources:VCPU=0 \
  --property resources:MEMORY_MB=0 \
  --property resources:DISK_GB=0

# Create a bare metal instance exactly like a virtual machine
openstack server create \
  --flavor baremetal.large \
  --image centos-8 \
  --key-name mykey \
  --network baremetal-net \
  my-bare-metal-server
```
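The `resources:CUSTOM_BAREMETAL=1` property matches nodes whose resource class is `baremetal`: the class name is upper-cased, punctuation is replaced with underscores, and `CUSTOM_` is prepended. A small sketch of that mapping follows; the helper names are hypothetical and the result is just the property dictionary a flavor would carry.

```python
import re


def placement_resource_class(resource_class: str) -> str:
    """Translate a node resource class into its Placement custom class."""
    normalized = re.sub(r"[^A-Z0-9]", "_", resource_class.upper())
    return f"CUSTOM_{normalized}"


def baremetal_flavor_properties(resource_class: str) -> dict:
    """Flavor properties that schedule purely on the resource class."""
    return {
        f"resources:{placement_resource_class(resource_class)}": "1",
        # Zero out standard resources so they are not requested from Placement.
        "resources:VCPU": "0",
        "resources:MEMORY_MB": "0",
        "resources:DISK_GB": "0",
    }


if __name__ == "__main__":
    for key, value in baremetal_flavor_properties("baremetal").items():
        print(f"--property {key}={value}")
```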
RAID Configuration

```bash
# Record the desired RAID layout on the node. This only stores the target;
# the layout is built when a cleaning step applies the RAID configuration.
openstack baremetal node set server-01 \
  --target-raid-config '{
    "logical_disks": [
      {
        "size_gb": 100,
        "raid_level": "1",
        "is_root_volume": true
      },
      {
        "size_gb": 1600,
        "raid_level": "5",
        "is_root_volume": false
      }
    ]
  }'
```
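Before submitting a target RAID config, it helps to check that it is well formed and that the physical disks can hold it. The sketch below is a standalone sanity check, not Ironic's validation: the helper names are hypothetical, only RAID levels 0, 1, 5, and 6 are handled, and the capacity formula assumes equally sized disks.

```python
# Parity overhead and minimum disk counts for the levels this sketch handles.
PARITY_DISKS = {"0": 0, "5": 1, "6": 2}
MIN_DISKS = {"0": 2, "1": 2, "5": 3, "6": 4}


def raw_capacity_gb(size_gb: float, raid_level: str, disks: int) -> float:
    """Raw space consumed to provide size_gb of usable capacity."""
    if raid_level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {raid_level!r}")
    if disks < MIN_DISKS[raid_level]:
        raise ValueError(f"RAID {raid_level} needs at least "
                         f"{MIN_DISKS[raid_level]} disks")
    if raid_level == "1":
        # Every disk in the mirror holds a full copy.
        return float(size_gb * disks)
    data_disks = disks - PARITY_DISKS[raid_level]
    return size_gb * disks / data_disks


def check_raid_config(config: dict) -> None:
    """Raise ValueError if the config is obviously malformed."""
    logical_disks = config.get("logical_disks") or []
    if not logical_disks:
        raise ValueError("logical_disks must not be empty")
    roots = [d for d in logical_disks if d.get("is_root_volume")]
    if len(roots) > 1:
        raise ValueError("at most one logical disk may be the root volume")
    for disk in logical_disks:
        if disk.get("raid_level") not in MIN_DISKS:
            raise ValueError(f"unsupported RAID level: {disk.get('raid_level')!r}")
        if disk.get("size_gb", 0) <= 0:
            raise ValueError("size_gb must be positive")
```

For the layout above, a two-disk RAID 1 volume of 100 GB consumes 200 GB of raw space, and a 1600 GB RAID 5 volume spread over five disks consumes 2000 GB.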
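Production Operations Essentials

```bash
# List all nodes with full details
openstack baremetal node list --long

# Check the last error recorded on a node
openstack baremetal node show <node-uuid> | grep last_error

# Manually clean a node (erase its disks); the node must be in the manageable state
openstack baremetal node clean server-01 \
  --clean-steps '[{"interface": "deploy", "step": "erase_devices"}]'

# Put a node into maintenance mode
openstack baremetal node maintenance set server-01 \
  --reason "hardware repair"

# Abort a stuck operation, then tear the node down
openstack baremetal node abort server-01
openstack baremetal node undeploy server-01
```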
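A routine health check can be scripted on top of `openstack baremetal node list -f json`. The sketch below filters that output for nodes in maintenance or in a failed provisioning state. It assumes the JSON keys match the CLI column headers (`Name`, `Provisioning State`, `Maintenance`), which is worth verifying against your client version; the helper names are hypothetical.

```python
import json


def needs_attention(node: dict) -> bool:
    """True for nodes in maintenance or in an error/failed state."""
    state = node.get("Provisioning State") or ""
    return (bool(node.get("Maintenance"))
            or state == "error"
            or state.endswith("failed"))


def nodes_needing_attention(node_list_json: str) -> list[dict]:
    """Filter the JSON output of the node list command."""
    return [n for n in json.loads(node_list_json) if needs_attention(n)]


if __name__ == "__main__":
    # Sample data standing in for real CLI output.
    sample = json.dumps([
        {"Name": "server-01", "Provisioning State": "active",
         "Maintenance": False},
        {"Name": "server-02", "Provisioning State": "deploy failed",
         "Maintenance": False},
    ])
    for node in nodes_needing_attention(sample):
        print(node["Name"], node["Provisioning State"])
```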
Key Source Code Paths

```
ironic/
├── api/
├── conductor/
│   ├── manager.py
│   ├── deployments.py
│   └── cleaning.py
├── drivers/
│   └── modules/
│       ├── ipmitool.py
│       ├── redfish/
│       ├── pxe.py
│       └── agent.py
└── common/
    └── states.py
```