OpenStack Production-Grade High-Availability Deployment Architecture

Overall Architecture

                    ┌─────────────────────────────────────────┐
                    │      Load Balancer Layer (HAProxy)      │
                    │       VIP: 10.0.0.10 (Keepalived)       │
                    └────────────────────┬────────────────────┘
                                         │
            ┌────────────────────────────┼────────────────────────────┐
            │                            │                            │
   ┌────────▼───────┐          ┌────────▼───────┐          ┌────────▼───────┐
   │  Controller 1  │          │  Controller 2  │          │  Controller 3  │
   │  controller-1  │          │  controller-2  │          │  controller-3  │
   │                │          │                │          │                │
   │  keystone-api  │          │  keystone-api  │          │  keystone-api  │
   │  nova-api      │          │  nova-api      │          │  nova-api      │
   │  neutron-server│          │  neutron-server│          │  neutron-server│
   │  cinder-api    │          │  cinder-api    │          │  cinder-api    │
   │  glance-api    │          │  glance-api    │          │  glance-api    │
   │  heat-api      │          │  heat-api      │          │  heat-api      │
   │  horizon       │          │  horizon       │          │  horizon       │
   └────────┬───────┘          └────────┬───────┘          └────────┬───────┘
            │                           │                           │
   ┌────────▼───────────────────────────▼───────────────────────────▼──────┐
   │                          MySQL Galera Cluster                         │
   │         controller-1         controller-2         controller-3       │
   │    (synchronous multi-master replication; any node accepts writes)   │
   └───────────────────────────────────────────────────────────────────────┘
            │                           │                           │
   ┌────────▼───────────────────────────▼───────────────────────────▼──────┐
   │                            RabbitMQ Cluster                           │
   │         controller-1         controller-2         controller-3       │
   │              (mirrored queues with message persistence)               │
   └───────────────────────────────────────────────────────────────────────┘

   ┌───────────────────────────────────────────────────────────────────────┐
   │                         Compute Nodes (× N)                           │
   │       compute-1    compute-2    compute-3    ...    compute-N         │
   │       nova-compute    neutron-openvswitch-agent    libvirt            │
   └───────────────────────────────────────────────────────────────────────┘

   ┌───────────────────────────────────────────────────────────────────────┐
   │                             Storage Layer                             │
   │                   Ceph Cluster (MON × 3 + OSD × N)                    │
   │               volumes pool / images pool / vms pool                   │
   └───────────────────────────────────────────────────────────────────────┘

Controller Node HA

HAProxy Configuration

# /etc/haproxy/haproxy.cfg

global
    log /dev/log local0
    maxconn 4096

defaults
    log global
    mode http
    option httplog
    timeout connect 5s
    timeout client 50s
    timeout server 50s

# Keystone API
frontend keystone_public
    bind *:5000
    default_backend keystone_public_back

backend keystone_public_back
    balance roundrobin
    option httpchk GET /v3
    server controller-1 10.0.0.1:5000 check inter 2s
    server controller-2 10.0.0.2:5000 check inter 2s
    server controller-3 10.0.0.3:5000 check inter 2s

# Nova API
frontend nova_api
    bind *:8774
    default_backend nova_api_back

backend nova_api_back
    balance roundrobin
    option httpchk GET /
    server controller-1 10.0.0.1:8774 check inter 2s
    server controller-2 10.0.0.2:8774 check inter 2s
    server controller-3 10.0.0.3:8774 check inter 2s

# RabbitMQ (TCP mode)
frontend rabbitmq
    bind *:5672
    mode tcp
    default_backend rabbitmq_back

backend rabbitmq_back
    mode tcp
    balance roundrobin
    server controller-1 10.0.0.1:5672 check inter 2s
    server controller-2 10.0.0.2:5672 check inter 2s
    server controller-3 10.0.0.3:5672 check inter 2s
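
The config above exposes no listener for MySQL, yet the oslo.db section below connects to the VIP on port 3306. A minimal sketch of the missing stanza, assuming the default MySQL port; marking two nodes as backup is a common choice to route all writes to one Galera node and avoid cross-node write certification conflicts:

# MySQL / Galera (TCP mode) - sketch; port and backup strategy are assumptions
listen galera
    bind *:3306
    mode tcp
    option tcpka
    server controller-1 10.0.0.1:3306 check inter 2s
    server controller-2 10.0.0.2:3306 check inter 2s backup
    server controller-3 10.0.0.3:3306 check inter 2s backup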

Keepalived VIP

# /etc/keepalived/keepalived.conf (controller-1, MASTER)

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 101              # controller-2: 100, controller-3: 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass openstack
    }
    virtual_ipaddress {
        10.0.0.10/24
    }
    track_script {
        chk_haproxy
    }
}
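
A quick failover check, assuming the interface and VIP above: stopping HAProxy on the MASTER makes chk_haproxy fail, its effective priority drops below the backups, and the VIP should move within a few advertisement intervals.

# On controller-1 (MASTER): simulate an HAProxy failure
systemctl stop haproxy

# On controller-2: the VIP should appear within a few seconds
ip addr show eth0 | grep 10.0.0.10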

MySQL Galera Cluster

# /etc/mysql/conf.d/galera.cnf

[mysqld]
binlog_format = ROW
default-storage-engine = innodb
innodb_autoinc_lock_mode = 2
bind-address = 0.0.0.0

# Galera settings
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name = "openstack_galera"
wsrep_cluster_address = "gcomm://10.0.0.1,10.0.0.2,10.0.0.3"
wsrep_node_address = "10.0.0.1"   # each node uses its own IP
wsrep_node_name = "controller-1"
wsrep_sst_method = rsync

# Performance tuning
innodb_buffer_pool_size = 8G
innodb_log_file_size = 512M
max_connections = 1000
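
First startup requires bootstrapping one node before the others join; a sketch for systemd-based MariaDB installs (the bootstrap command and unit name vary by distribution):

# On controller-1 only: bootstrap a brand-new cluster
galera_new_cluster

# On controller-2 and controller-3: join and sync via SST
systemctl start mariadb

# Verify membership
mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size';"   # expect 3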

Database Connection Pooling (oslo.db)

# nova.conf / neutron.conf, etc.
[database]
connection = mysql+pymysql://nova:password@10.0.0.10/nova
# Connect through the VIP; HAProxy load-balances across the Galera nodes

# Connection pool settings
max_pool_size = 30
max_overflow = 60
pool_timeout = 30
connection_recycle_time = 600

RabbitMQ Cluster

# Join the cluster from controller-2 and controller-3
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@controller-1
rabbitmqctl start_app

# Set the mirrored-queue policy (mirror every queue on every node)
rabbitmqctl set_policy ha-all "^" \
  '{"ha-mode":"all","ha-sync-mode":"automatic"}'

# Check cluster status
rabbitmqctl cluster_status
# nova.conf
[DEFAULT]
transport_url = rabbit://openstack:password@10.0.0.1:5672,openstack:password@10.0.0.2:5672,openstack:password@10.0.0.3:5672/
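
Listing all three brokers lets oslo.messaging fail over between them; a couple of related options can make that failover smoother (option names are from oslo.messaging, the values here are illustrative):

# nova.conf (continued)
[oslo_messaging_rabbit]
rabbit_ha_queues = true            # declare queues under the mirrored-queue policy
heartbeat_timeout_threshold = 60   # detect dead connections to a failed broker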

Compute Node High Availability

VM Evacuation (Evacuate)

When a compute node fails, automatically migrate the VMs running on it to other nodes:

# Manually evacuate a single node
nova host-evacuate compute-1

# Automatic evacuation (requires the nova-compute service to be marked down)
# Automate with Pacemaker or a custom script

# Prerequisite: shared storage (Ceph); otherwise the disk data is lost

Automated Evacuation Script

#!/usr/bin/env python3
# auto_evacuate.py - watch nova-compute state and evacuate failed hosts

import time

import openstack

conn = openstack.connect(cloud='mycloud')

def check_and_evacuate():
    services = conn.compute.services()
    for svc in services:
        if svc.binary == 'nova-compute' and svc.state == 'down':
            # Re-check after a delay to avoid reacting to a transient blip
            time.sleep(30)
            svc_recheck = conn.compute.get_service(svc.id)
            if svc_recheck.state == 'down':
                print(f"Node {svc.host} is down, starting evacuation...")
                # List every VM on the failed host
                servers = conn.compute.servers(
                    all_projects=True,
                    host=svc.host,
                )
                for server in servers:
                    conn.compute.evacuate_server(server.id)
                    print(f"  Evacuated VM: {server.name}")

if __name__ == '__main__':
    # Poll continuously; interval is a tuning choice
    while True:
        check_and_evacuate()
        time.sleep(60)
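
To keep the watcher running across reboots, one option is a systemd service; a sketch with assumed paths and unit name:

# /etc/systemd/system/auto-evacuate.service (path and name are assumptions)
[Unit]
Description=Auto-evacuate VMs from failed compute nodes
After=network-online.target

[Service]
ExecStart=/usr/bin/python3 /usr/local/bin/auto_evacuate.py
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target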

Cross-AZ Disaster Recovery Architecture

Region: RegionOne

├── AZ: az1 (Datacenter A)
│   ├── controller-1,2,3
│   ├── compute-1 ~ compute-20
│   └── Ceph Cluster A
│
└── AZ: az2 (Datacenter B)
    ├── compute-21 ~ compute-40
    └── Ceph Cluster B (asynchronous replication from A)

Cross-AZ networking:
- Control plane: dedicated-line interconnect (low latency)
- Storage replication: Ceph RBD Mirror (asynchronous)
- VM network: VXLAN stretched across AZs
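
In Nova, availability zones are exposed through host aggregates; a sketch of how the layout above could be registered (aggregate, flavor, image, and network names are placeholders):

# Map compute hosts to the two AZs via host aggregates
openstack aggregate create --zone az1 agg-az1
openstack aggregate add host agg-az1 compute-1

openstack aggregate create --zone az2 agg-az2
openstack aggregate add host agg-az2 compute-21

# Boot a VM pinned to an AZ
openstack server create --availability-zone az2 \
    --flavor m1.small --image cirros --network demo-net my-vm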

Ceph RBD Cross-AZ Mirroring

# Configure RBD mirroring between the two Ceph clusters
# On the primary cluster (az1): enable per-image mirroring
rbd mirror pool enable volumes image
rbd mirror image enable volumes/volume-xxx

# On the primary cluster: create a bootstrap token
rbd mirror pool peer bootstrap create volumes > bootstrap-token
# On the backup cluster (az2): import the token
rbd mirror pool peer bootstrap import volumes bootstrap-token
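
Replication health can then be checked from either side with the standard rbd status commands:

# Check mirroring status for the pool and for a single image
rbd mirror pool info volumes
rbd mirror pool status volumes
rbd mirror image status volumes/volume-xxx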

Key Configuration Checklist

# 1. Verify all service states
openstack compute service list
openstack network agent list
openstack volume service list

# 2. Verify the Galera cluster
mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size';"
# Should return 3

# 3. Verify the RabbitMQ cluster
rabbitmqctl cluster_status | grep running_nodes

# 4. Verify HAProxy backends
echo "show stat" | socat stdio /var/run/haproxy/admin.sock | \
  cut -d',' -f1,2,18 | grep -v "^#"

# 5. Verify Ceph health
ceph health detail
ceph osd stat

Capacity Planning Reference

Scale                  Controller Nodes         Compute Nodes       Storage
Small  (< 100 VMs)     3 × 8C16G                10 × 32C256G        Ceph, 3 nodes
Medium (< 1000 VMs)    3 × 16C32G               50 × 64C512G        Ceph, 10+ nodes
Large  (> 1000 VMs)    3 × 32C64G + Cells v2    200+ × 64C512G      Ceph, 30+ nodes

The controller-node bottleneck is usually the database; consider deploying a dedicated MySQL cluster fronted by ProxySQL for read/write splitting, as in the sketch below.
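
A minimal sketch of that split, run in the ProxySQL admin interface; the hostgroup numbers (10 = writer, 20 = readers) and routing rules are assumptions, not a prescribed layout:

-- Register the Galera nodes: one writer, two readers
INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
    (10, '10.0.0.1', 3306),
    (20, '10.0.0.2', 3306),
    (20, '10.0.0.3', 3306);

-- Route SELECT ... FOR UPDATE to the writer, plain SELECTs to the readers
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
    VALUES (1, 1, '^SELECT.*FOR UPDATE', 10, 1),
           (2, 1, '^SELECT', 20, 1);

LOAD MYSQL SERVERS TO RUNTIME;  SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME;  SAVE MYSQL QUERY RULES TO DISK;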