前言
1、开源的
2、容量达到PB级、服务器数量最多可达千台
3、提升数据读写速度、高可用性
4、无元数据的架构, 采用弹性hash定位数据
5、可以廉价的pc server上构建
6、支持多种挂载的方式
数据元数据分离:元数据服务器
数据元数据不分离:元数据和数据都在一个服务器上
一、基础环境配置
[root@c7-glusterfs-node1-21 ~]# hostnamectl set-hostname c7-glusterfs-node1-21
[root@c7-glusterfs-node1-21 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE="Ethernet"
BOOTPROTO="none"
IPADDR=172.29.7.21
PREFIX=24
GATEWAY=172.29.7.254
#dns改成dns服务器的ip地址
#改不改都可以这是我的上一篇文章中的dns服务器,主要是用来解析glusterfs域名的,也可以在每个主机中的host文件中写上
DNS1=172.29.7.10
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
#不使用DNS服务器,需要修改hosts文件
#每个节点都需要添加
#[root@c7-glusterfs-node1-21 ~]# cat /etc/hosts
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
#172.29.7.21 node1.glusterfs.com
#172.29.7.22 node2.glusterfs.com
#172.29.7.23 node3.glusterfs.com
#172.29.7.24 node4.glusterfs.com
#重启网卡
[root@c7-glusterfs-node1-21 ~]# nmcli connection reload
[root@c7-glusterfs-node1-21 ~]# nmcli connection up ens33
#测试dns或hosts是否可以成功解析
[root@c7-glusterfs-node1-21 ~]# host node2.glusterfs.com
node2.glusterfs.com has address 172.29.7.22
##
##其他四个hostname、ip、进行修改就可以,这里不做展示
##
#动态开启selinux警告模式
[root@c7-glusterfs-node1-21 ~]# setenforce 0
[root@c7-glusterfs-node1-21 ~]# cat /etc/selinux/config
# This file controls the state of SELinux on the system.
# SELINUX= can take one of these three values:
# enforcing - SELinux security policy is enforced.
# permissive - SELinux prints warnings instead of enforcing.
# disabled - No SELinux policy is loaded.
#Disabled :不启用控制系统。
#permissive:开启控制系统,但是处于警告模式。即使你违反了策略的话它让你继续操作,但是把你的违反的内容记录下来。
#Enforcing:开启控制系统,处于强制状态。一旦违反了策略,就无法继续操作下去。
SELINUX=permissive
# SELINUXTYPE= can take one of three values:
# targeted - Targeted processes are protected,
# minimum - Modification of targeted policy. Only selected processes are protected.
# mls - Multi Level Security protection.
SELINUXTYPE=targeted
#开机不启动并关闭防火墙
[root@c7-glusterfs-node1-21 ~]# systemctl disable --now firewalld
#修改时区
[root@c7-glusterfs-node1-21 ~]# timedatectl set-timezone Asia/Shanghai
#时间同步
[root@c7-glusterfs-node1-21 ~]# cat /etc/chrony.conf
# Use public servers from the pool.ntp.org project.
# Please consider joining the pool (http://www.pool.ntp.org/join.html).
##这里就使用dns服务器的时间,也可以用阿里云
server 172.29.7.10 iburst
#server 0.centos.pool.ntp.org iburst
#server 1.centos.pool.ntp.org iburst
#server 2.centos.pool.ntp.org iburst
#server 3.centos.pool.ntp.org iburst
[root@c7-glusterfs-node1-21 ~]# systemctl restart chronyd
[root@c7-glusterfs-node1-21 ~]# systemctl enable --now chronyd
[root@c7-glusterfs-node1-21 ~]# chronyc -n sources
210 Number of sources = 1
MS Name/IP address Stratum Poll Reach LastRx Last sample
===============================================================================
^? 172.29.7.10 0 7 0 - +0ns[ +0ns] +/- 0ns
二、Glusterfs部署
#ping一下,看看通不通
[root@c7-glusterfs-node1-21 ~]# for i in node1 node2 node3 node4 node5
do
ping -c 1 $i.glusterfs.com
done
#配置gluster镜像仓库
[root@c7-glusterfs-node1-21 ~]# cat /etc/yum.repos.d/gluster.repo
[gluster]
name=gluster
baseurl=https://mirrors.tuna.tsinghua.edu.cn/centos/7.9.2009/storage/x86_64/gluster-6/
enabled=1
gpgcheck=0
#给其他节点也传一份
[root@c7-glusterfs-node1-21 ~]# for i in 22 23 24 25
do
rsync -av /etc/yum.repos.d/gluster.repo root@172.29.7.$i:/etc/yum.repos.d/gluster.repo
done
#下载glusterfs
#glusterfs-server:服务端
#glusterfs-fuse:fuse挂载方式
#glusterfs:客户端
[root@c7-glusterfs-node1-21 ~]# for i in 21 22 23 24 25
do ssh root@172.29.7.$i yum -y install glusterfs-server glusterfs-fuse glusterfs
done
#启动gluster
[root@c7-glusterfs-node1-21 ~]# for i in 21 22 23 24 25
do
ssh root@172.29.7.$i systemctl enable --now glusterd
done
三、热添加磁盘
#通过扫描系统SCSI接口,热识别磁盘
[root@c7-glusterfs-node1-21 ~]# for i in {0..2}
do
echo '- - -' > /sys/class/scsi_host/host$i/scan
done
[root@c7-glusterfs-node1-21 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 300M 0 part /boot
├─sda2 8:2 0 2G 0 part [SWAP]
└─sda3 8:3 0 17.7G 0 part /
sdb 8:16 0 2G 0 disk
sdc 8:32 0 2G 0 disk
sdd 8:48 0 2G 0 disk
sr0 11:0 1 1024M 0 rom
四、关于节点磁盘的配置
#一般情况下,每个节点的磁盘都需要做raid或逻辑卷之类的,raid方面这里就用软raid练习。
#raid
mdadm
-C #创建
-a #添加磁盘
-l N #指明要创建的RAID的级别
-n N #使用N个块设备来创建此RAID
[root@c7-glusterfs-node1-21 ~]# mdadm -C /dev/md0 -a yes -l 5 -n 3 /dev/sdb /dev/sdc /dev/sdd
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@c7-glusterfs-node1-21 ~]# ls /dev/md0
/dev/md0
[root@c7-glusterfs-node2-22 ~]# mdadm -C /dev/md0 -a yes -l 5 -n 3 /dev/sdb /dev/sdc /dev/sdd
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@c7-glusterfs-node2-22 ~]# ls /dev/md0
/dev/md0
#逻辑卷
pvcreate #创造物理卷
vgcreate #创造卷组
-s #指定PE大小,数字加单位,单位为 k|K|m|M|g|G|t|T|p|P|e|E
lvcreate #创造逻辑卷
-L #指定磁盘大小
-n ##逻辑卷名称
[root@c7-glusterfs-node3-23 ~]# pvcreate /dev/sdb
Physical volume "/dev/sdb" successfully created.
[root@c7-glusterfs-node3-23 ~]# vgcreate -s 16M gluster_vg /dev/sdb
Volume group "gluster_vg" successfully created
[root@c7-glusterfs-node3-23 ~]# lvcreate -L 10G -n lv1 gluster_vg
Logical volume "lv1" created.
[root@c7-glusterfs-node3-23 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 20G 0 disk
├─sda1 8:1 0 300M 0 part /boot
├─sda2 8:2 0 2G 0 part [SWAP]
└─sda3 8:3 0 17.7G 0 part /
sdb 8:16 0 20G 0 disk
└─gluster_vg-lv1 253:0 0 10G 0 lvm
sr0 11:0 1 1024M 0 rom
五、创造卷
1、分布式卷
分布试卷 distribute
特性: 以单个文件为单位,分散存储到不同的brick上
适用于大量的小文件,增加数据读写速度
分布式卷容量==所有brick之和
无brick数量的限制
#基础配置
#node1.glusterfs.com基础配置
[root@c7-glusterfs-node1-21 ~]# mdadm -C /dev/md0 -a yes -l 5 -n 3 /dev/sdb /dev/sdc /dev/sdd
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@c7-glusterfs-node1-21 ~]# ls /dev/md0
/dev/md0
[root@c7-glusterfs-node1-21 ~]# mkfs.ext4 /dev/md0
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=256 blocks
262144 inodes, 1047040 blocks
52352 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
[root@c7-glusterfs-node1-21 ~]# mkdir /data
[root@c7-glusterfs-node1-21 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Jan 5 22:40:35 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=f5fc8c85-5347-40f5-abd3-c27a1cac0f94 / xfs defaults 0 0
UUID=b63d2603-d3de-4447-9c7a-cc7715c62cd1 /boot xfs defaults 0 0
UUID=3891180d-07ff-4917-80a3-757b618838f1 swap swap defaults 0 0
/dev/md0 /data ext4 defaults 0 0
[root@c7-glusterfs-node1-21 ~]# mount -a
[root@c7-glusterfs-node1-21 ~]# df -hT
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 471M 0 471M 0% /dev
tmpfs tmpfs 487M 0 487M 0% /dev/shm
tmpfs tmpfs 487M 8.5M 478M 2% /run
tmpfs tmpfs 487M 0 487M 0% /sys/fs/cgroup
/dev/sda3 xfs 18G 5.1G 13G 29% /
/dev/sda1 xfs 297M 163M 134M 55% /boot
tmpfs tmpfs 98M 12K 98M 1% /run/user/42
tmpfs tmpfs 98M 0 98M 0% /run/user/0
/dev/md0 ext4 3.9G 16M 3.7G 1% /data
[root@c7-glusterfs-node1-21 ~]# rm -rf /data/*
[root@c7-glusterfs-node1-21 ~]# mkdir /data/br1
#node2.glusterfs.com基础配置
[root@c7-glusterfs-node2-22 ~]# mdadm -C /dev/md0 -a yes -l 5 -n 3 /dev/sdb /dev/sdc /dev/sdd
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
[root@c7-glusterfs-node2-22 ~]# ls /dev/md0
/dev/md0
[root@c7-glusterfs-node2-22 ~]# mkfs.ext4 /dev/md0
mke2fs 1.42.9 (28-Dec-2013)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=128 blocks, Stripe width=256 blocks
262144 inodes, 1047040 blocks
52352 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
[root@c7-glusterfs-node2-22 ~]# mkdir /data
[root@c7-glusterfs-node2-22 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Jan 5 22:40:35 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=f5fc8c85-5347-40f5-abd3-c27a1cac0f94 / xfs defaults 0 0
UUID=b63d2603-d3de-4447-9c7a-cc7715c62cd1 /boot xfs defaults 0 0
UUID=3891180d-07ff-4917-80a3-757b618838f1 swap swap defaults 0 0
/dev/md0 /data ext4 defaults 0 0
[root@c7-glusterfs-node2-22 ~]# mount -a
[root@c7-glusterfs-node2-22 ~]# df -hT
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 471M 0 471M 0% /dev
tmpfs tmpfs 487M 0 487M 0% /dev/shm
tmpfs tmpfs 487M 8.5M 478M 2% /run
tmpfs tmpfs 487M 0 487M 0% /sys/fs/cgroup
/dev/sda3 xfs 18G 5.4G 13G 31% /
/dev/sda1 xfs 297M 163M 134M 55% /boot
tmpfs tmpfs 98M 12K 98M 1% /run/user/42
tmpfs tmpfs 98M 0 98M 0% /run/user/0
/dev/md0 ext4 3.9G 16M 3.7G 1% /data
[root@c7-glusterfs-node2-22 ~]# rm -rf /data/*
[root@c7-glusterfs-node2-22 ~]# mkdir /data/br1
#创造分布式卷
[root@c7-glusterfs-node1-21 ~]# gluster volume create dis_volume \
> node1.glusterfs.com:/data/br1 \
> node2.glusterfs.com:/data/br1
volume create: dis_volume: success: please start the volume to access data
#启动卷
[root@c7-glusterfs-node1-21 ~]# gluster volume start dis_volume
volume start: dis_volume: success
#查看卷信息
[root@c7-glusterfs-node1-21 ~]# gluster volume info dis_volume
Volume Name: dis_volume
Type: Distribute
Volume ID: cd165767-a425-48f0-a8ab-d76d5494a1fe
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: node1.glusterfs.com:/data/br1
Brick2: node2.glusterfs.com:/data/br1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
test节点测试
#测试节点基础环境配置
[root@c7-glusterfs-node1-21 ~]# rsync -av /etc/yum.repos.d/gluster.repo root@172.29.7.11:/etc/yum.repos.d/gluster.repo
[root@c7-test-11 ~]# yum -y install glusterfs-fuse
[root@c7-test-11 ~]# cat /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE="Ethernet"
BOOTPROTO="none"
IPADDR=172.29.7.11
PREFIX=24
GATEWAY=172.29.7.254
DNS1=172.29.7.10
NAME="ens33"
DEVICE="ens33"
ONBOOT="yes"
[root@c7-test-11 ~]# nmcli connection reload ;nmcli connection up ens33
#挂载glusterfs卷
[root@c7-test-11 ~]# mkdir /data
[root@c7-test-11 ~]# mount -t glusterfs node1.glusterfs.com:/dis_volume /data
[root@c7-test-11 ~]# df -hT
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 1.9G 0 1.9G 0% /dev
tmpfs tmpfs 1.9G 0 1.9G 0% /dev/shm
tmpfs tmpfs 1.9G 13M 1.9G 1% /run
tmpfs tmpfs 1.9G 0 1.9G 0% /sys/fs/cgroup
/dev/sda3 xfs 18G 5.4G 13G 31% /
/dev/sda1 xfs 297M 163M 134M 55% /boot
tmpfs tmpfs 378M 12K 378M 1% /run/user/42
tmpfs tmpfs 378M 0 378M 0% /run/user/0
node1.glusterfs.com:/dis_volume fuse.glusterfs 7.8G 112M 7.3G 2% /data
#配置文件挂载glusterfs卷
[root@c7-test-11 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Jan 5 22:40:35 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=f5fc8c85-5347-40f5-abd3-c27a1cac0f94 / xfs defaults 0 0
UUID=b63d2603-d3de-4447-9c7a-cc7715c62cd1 /boot xfs defaults 0 0
UUID=3891180d-07ff-4917-80a3-757b618838f1 swap swap defaults 0 0
#添加这一行
node1.glusterfs.com:/dis_volume /data glusterfs defaults,_netdev 0 0
[root@c7-test-11 ~]#mount -a
#创造文件测试
[root@c7-test-11 ~]# touch /data/{1..10}.txt
[root@c7-test-11 ~]# ls /data/
10.txt 1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt 9.txt
#文件被分散在node1和node2上
[root@c7-glusterfs-node1-21 ~]# ls /data/br1/
4.txt 8.txt 9.txt
[root@c7-glusterfs-node2-22 ~]# ls /data/br1/
10.txt 1.txt 2.txt 3.txt 5.txt 6.txt 7.txt
2、复制卷
复制卷 replica_volume
文件被复制为多份,保存到brick上
提高文件的可用性
replica指定的复制数要与brick数量一致
#基础配置
[root@c7-glusterfs-node3-23 ~]# mkfs.ext4 /dev/gluster_vg/lv1
[root@c7-glusterfs-node3-23 ~]# mkdir /data
[root@c7-glusterfs-node3-23 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Jan 5 22:40:35 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=f5fc8c85-5347-40f5-abd3-c27a1cac0f94 / xfs defaults 0 0
UUID=b63d2603-d3de-4447-9c7a-cc7715c62cd1 /boot xfs defaults 0 0
UUID=3891180d-07ff-4917-80a3-757b618838f1 swap swap defaults 0 0
/dev/gluster_vg/lv1 /data ext4 defaults 0 0
[root@c7-glusterfs-node3-23 ~]# mount -a
[root@c7-glusterfs-node3-23 ~]# df -hT
Filesystem Type Size Used Avail Use% Mounted on
devtmpfs devtmpfs 471M 0 471M 0% /dev
tmpfs tmpfs 487M 0 487M 0% /dev/shm
tmpfs tmpfs 487M 8.4M 478M 2% /run
tmpfs tmpfs 487M 0 487M 0% /sys/fs/cgroup
/dev/sda3 xfs 18G 5.1G 13G 29% /
/dev/sda1 xfs 297M 163M 134M 55% /boot
tmpfs tmpfs 98M 12K 98M 1% /run/user/42
tmpfs tmpfs 98M 0 98M 0% /run/user/0
/dev/mapper/gluster_vg-lv1 ext4 9.8G 37M 9.2G 1% /data
[root@c7-glusterfs-node3-23 ~]# mkdir /data/br1
#
##c7-glusterfs-node4-24基础配置一样
#
#创造复制卷
[root@c7-glusterfs-node1-21 ~]# gluster volume create replica_volume replica 2 \
node3.glusterfs.com:/data/br1 \
node4.glusterfs.com:/data/br1
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue?
(y/n) y
volume create: replica_volume: success: please start the volume to access data
[root@c7-glusterfs-node1-21 ~]# gluster volume start replica_volume
volume start: replica_volume: success
[root@c7-glusterfs-node1-21 ~]# gluster volume info replica_volume
Volume Name: replica_volume
Type: Replicate
Volume ID: 7bff53e5-03d4-4e34-9c7c-809fc5ef624f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node3.glusterfs.com:/data/br1
Brick2: node4.glusterfs.com:/data/br1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
test节点测试
#挂载复制卷
[root@c7-test-11 ~]# cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Jan 5 22:40:35 2024
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=f5fc8c85-5347-40f5-abd3-c27a1cac0f94 / xfs defaults 0 0
UUID=b63d2603-d3de-4447-9c7a-cc7715c62cd1 /boot xfs defaults 0 0
UUID=3891180d-07ff-4917-80a3-757b618838f1 swap swap defaults 0 0
node1.glusterfs.com:/dis_volume /data glusterfs defaults,_netdev 0 0
node1.glusterfs.com:/replica_volume /data1 glusterfs defaults,_netdev 0 0
[root@c7-test-11 ~]# mount -a
#测试
[root@c7-test-11 ~]# touch /data1/{1..10}.txt
[root@c7-test-11 ~]# ls /data1
10.txt 1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt 9.txt
#node3和node4节点
[root@c7-glusterfs-node3-23 ~]# ls /data/br1/
10.txt 1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt 9.txt
[root@c7-glusterfs-node4-24 ~]# ls /data/br1/
10.txt 1.txt 2.txt 3.txt 4.txt 5.txt 6.txt 7.txt 8.txt 9.txt
3、分布复制卷
分布复制卷 dis_replica_volume
适用于保存小文件,提升文件可用性
brick数量为replica参数的整倍数
#基础环境配置
[root@c7-glusterfs-node1-21 ~]# mkfs.ext4 /dev/sde
[root@c7-glusterfs-node1-21 ~]# mkdir /data1
[root@c7-glusterfs-node1-21 ~]# tail -n 1 /etc/fstab
/dev/sde /data1 ext4 defaults 0 0
[root@c7-glusterfs-node1-21 ~]# mount -a
[root@c7-glusterfs-node1-21 ~]# df -hT | tail -n 1
/dev/sde ext4 20G 45M 19G 1% /data1
[root@c7-glusterfs-node1-21 ~]# mkdir /data1/br1
[root@c7-glusterfs-node2-22 ~]# mkfs.ext4 /dev/sde
[root@c7-glusterfs-node2-22 ~]# mkdir /data1
[root@c7-glusterfs-node2-22 ~]# tail -n 1 /etc/fstab
/dev/sde /data1 ext4 defaults 0 0
[root@c7-glusterfs-node2-22 ~]# mount -a
[root@c7-glusterfs-node2-22 ~]# df -hT | tail -n 1
/dev/sde ext4 20G 45M 19G 1% /data1
[root@c7-glusterfs-node2-22 ~]# mkdir /data1/br1
[root@c7-glusterfs-node3-23 ~]# mkfs.ext4 /dev/sdc
[root@c7-glusterfs-node3-23 ~]# mkdir /data1
[root@c7-glusterfs-node3-23 ~]# tail -n 1 /etc/fstab
/dev/sdc /data1 ext4 defaults 0 0
[root@c7-glusterfs-node3-23 ~]# mount -a
[root@c7-glusterfs-node3-23 ~]# df -hT | tail -n 1
/dev/sdc ext4 20G 45M 19G 1% /data1
[root@c7-glusterfs-node3-23 ~]# mkdir /data1/br1
[root@c7-glusterfs-node4-24 ~]# mkfs.ext4 /dev/sdc
[root@c7-glusterfs-node4-24 ~]# mkdir /data1
[root@c7-glusterfs-node4-24 ~]# tail -n 1 /etc/fstab
/dev/sdc /data1 ext4 defaults 0 0
[root@c7-glusterfs-node4-24 ~]# mount -a
[root@c7-glusterfs-node4-24 ~]# df -hT | tail -n 1
/dev/sdc ext4 20G 45M 19G 1% /data1
[root@c7-glusterfs-node4-24 ~]# mkdir /data1/br1
#创造分布复制卷
[root@c7-glusterfs-node1-21 ~]# gluster volume create dis_replica_volume replica 2 \
node1.glusterfs.com:/data1/br1 \
node2.glusterfs.com:/data1/br1 \
node3.glusterfs.com:/data1/br1 \
node4.glusterfs.com:/data1/br1
Replica 2 volumes are prone to split-brain. Use Arbiter or Replica 3 to avoid this. See: http://docs.gluster.org/en/latest/Administrator%20Guide/Split%20brain%20and%20ways%20to%20deal%20with%20it/.
Do you still want to continue?
(y/n) y
volume create: dis_replica_volume: success: please start the volume to access data
[root@c7-glusterfs-node1-21 ~]# gluster volume start dis_replica_volume
volume start: dis_replica_volume: success
[root@c7-glusterfs-node1-21 ~]# gluster volume info dis_replica_volume
Volume Name: dis_replica_volume
Type: Distributed-Replicate
Volume ID: b44a0f00-6866-421f-9461-2b3109b81c43
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: node1.glusterfs.com:/data1/br1
Brick2: node2.glusterfs.com:/data1/br1
Brick3: node3.glusterfs.com:/data1/br1
Brick4: node4.glusterfs.com:/data1/br1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
test节点测试
[root@c7-test-11 ~]# tail -n 1 /etc/fstab
node1.glusterfs.com:/dis_replica_volume /data2 glusterfs defaults,_netdev 0 0
[root@c7-test-11 ~]# mkdir /data2
[root@c7-test-11 ~]# mount -a
[root@c7-test-11 ~]# df -hT | tail -n 1
node1.glusterfs.com:/dis_replica_volume fuse.glusterfs 40G 489M 38G 2% /data2
[root@c7-test-11 ~]# touch /data2/{1..5}.jpg
[root@c7-test-11 ~]# ls /data2/
1.jpg 2.jpg 3.jpg 4.jpg 5.jpg
[root@c7-glusterfs-node1-21 ~]# ls /data1/br1/
2.jpg 4.jpg
[root@c7-glusterfs-node2-22 ~]# ls /data1/br1/
2.jpg 4.jpg
[root@c7-glusterfs-node3-23 ~]# ls /data1/br1/
1.jpg 3.jpg 5.jpg
[root@c7-glusterfs-node4-24 ~]# ls /data1/br1/
1.jpg 3.jpg 5.jpg
4、分散卷
分散卷 perse_volume
单个文件被拆分成多份,分散存储在不同的brick上,会有指定数量的brick用于保存数据的校验码
提升数据的读写速度、可靠性
适用于大文件存储
disperse: 用于指定brick数量
redundancy: 指定用于保存校验码的brick数量,不指定的话,gluster集群会自动选定brick数量用于保存校验码
#基础环境配置
[root@c7-glusterfs-node1-21 ~]# mkfs.ext4 /dev/sdf
[root@c7-glusterfs-node1-21 ~]# mkdir /data2
[root@c7-glusterfs-node1-21 ~]# tail -n 1 /etc/fstab
/dev/sdf /data2 ext4 defaults 0 0
[root@c7-glusterfs-node1-21 ~]# mount -a
[root@c7-glusterfs-node1-21 ~]# df -hT | tail -n 1
/dev/sdf ext4 20G 45M 19G 1% /data2
[root@c7-glusterfs-node1-21 ~]# mkdir /data2/br1
[root@c7-glusterfs-node2-22 ~]# mkfs.ext4 /dev/sdf
[root@c7-glusterfs-node2-22 ~]# mkdir /data2
[root@c7-glusterfs-node2-22 ~]# tail -n 1 /etc/fstab
/dev/sdf /data2 ext4 defaults 0 0
[root@c7-glusterfs-node2-22 ~]# mount -a
[root@c7-glusterfs-node2-22 ~]# df -hT | tail -n 1
/dev/sdf ext4 20G 45M 19G 1% /data2
[root@c7-glusterfs-node2-22 ~]# mkdir /data2/br1
[root@c7-glusterfs-node3-23 ~]# mkfs.ext4 /dev/sdd
[root@c7-glusterfs-node3-23 ~]# mkdir /data2
[root@c7-glusterfs-node3-23 ~]# tail -n 1 /etc/fstab
/dev/sdd /data2 ext4 defaults 0 0
[root@c7-glusterfs-node3-23 ~]# mount -a
[root@c7-glusterfs-node3-23 ~]# df -hT | tail -n 1
/dev/sdd ext4 20G 45M 19G 1% /data2
[root@c7-glusterfs-node3-23 ~]# mkdir /data2/br1
[root@c7-glusterfs-node4-24 ~]# mkfs.ext4 /dev/sdd
[root@c7-glusterfs-node4-24 ~]# mkdir /data2
[root@c7-glusterfs-node4-24 ~]# tail -n 1 /etc/fstab
/dev/sdd /data2 ext4 defaults 0 0
[root@c7-glusterfs-node4-24 ~]# mount -a
[root@c7-glusterfs-node4-24 ~]# df -hT | tail -n 1
/dev/sdd ext4 20G 45M 19G 1% /data2
[root@c7-glusterfs-node4-24 ~]# mkdir /data2/br1
#创造分散卷
[root@c7-glusterfs-node1-21 ~]# gluster volume create perse_volume disperse 4 \
node1.glusterfs.com:/data2/br1/ \
node2.glusterfs.com:/data2/br1/ \
node3.glusterfs.com:/data2/br1/ \
node4.glusterfs.com:/data2/br1/
There isn't an optimal redundancy value for this configuration. Do you want to create the volume with redundancy 1 ? (y/n) y
volume create: perse_volume: success: please start the volume to access data
[root@c7-glusterfs-node1-21 ~]# gluster volume start perse_volume
volume start: perse_volume: success
[root@c7-glusterfs-node1-21 ~]# gluster volume info perse_volume
Volume Name: perse_volume
Type: Disperse
Volume ID: ffabae5b-6a77-4525-af68-b429843476c0
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (3 + 1) = 4
Transport-type: tcp
Bricks:
Brick1: node1.glusterfs.com:/data2/br1
Brick2: node2.glusterfs.com:/data2/br1
Brick3: node3.glusterfs.com:/data2/br1
Brick4: node4.glusterfs.com:/data2/br1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
test节点测试
[root@c7-test-11 ~]# tail -n 1 /etc/fstab
node1.glusterfs.com:/perse_volume /data3 glusterfs defaults,_netdev 0 0
[root@c7-test-11 ~]# mkdir /data3
[root@c7-test-11 ~]# mount -a
[root@c7-test-11 ~]# df -hT | tail -n 1
node1.glusterfs.com:/perse_volume fuse.glusterfs 59G 734M 56G 2% /data3
[root@c7-test-11 ~]# touch /data3/{1..5}.png
[root@c7-test-11 ~]# ls /data3/
1.png 2.png 3.png 4.png 5.png
[root@c7-glusterfs-node1-21 ~]# ls /data2/br1/
1.png 2.png 3.png 4.png 5.png
[root@c7-glusterfs-node2-22 ~]# ls /data2/br1/
1.png 2.png 3.png 4.png 5.png
[root@c7-glusterfs-node3-23 ~]# ls /data2/br1/
1.png 2.png 3.png 4.png 5.png
[root@c7-glusterfs-node4-24 ~]# ls /data2/br1/
1.png 2.png 3.png 4.png 5.png
六、管理卷
1.使用以下命令卸载所有客户端上的卷
umount mount-point
2.使用以下命令停止卷
gluster volume stop <VOLNAME>
3.更改传输类型。例如,要同时启用 tcp 和 rdma,请执行以下命令
gluster volume set test-volume config.transport tcp,rdma OR tcp OR rdma
4.在所有客户端上挂载卷。例如,要使用 rdma 传输挂载,请使用以下命令
mount -t glusterfs -o transport=rdma server1:/test-volume /mnt/glusterfs
1、扩展卷
注意:
扩展分布复制卷时,添加的brick的数量需要为replica参数的整倍数
扩展分布分散卷时,添加的brick的数量需要为disperse参数的整倍数
注意:
不能只添加卷,还需要打散卷里面的数据
1.扩展卷
#add-brick 添加卷
[root@c7-glusterfs-node1-21 ~]# gluster volume add-brick dis_volume node3.glusterfs.com:/data3/br1
volume add-brick: success
#查看添加的卷
[root@c7-glusterfs-node1-21 ~]# gluster volume info dis_volume
Volume Name: dis_volume
Type: Distribute
Volume ID: cd165767-a425-48f0-a8ab-d76d5494a1fe
Status: Started
Snapshot Count: 0
Number of Bricks: 3
Transport-type: tcp
Bricks:
Brick1: node1.glusterfs.com:/data/br1
Brick2: node2.glusterfs.com:/data/br1
Brick3: node3.glusterfs.com:/data3/br1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
2.重分布卷
#rebalance .... start 开始打散
[root@c7-glusterfs-node1-21 ~]# gluster volume rebalance dis_volume start
volume rebalance: dis_volume: success: Rebalance on dis_volume has been started successfully. Use rebalance status command to check status of the rebalance process.
ID: 6248defa-7b69-4ee3-b96f-56e0b01e9212
#rebalance .... status 查看打散情况
[root@c7-glusterfs-node1-21 ~]# gluster volume rebalance dis_volume status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
node2.glusterfs.com 7 0Bytes 7 0 0 completed 0:00:01
node3.glusterfs.com 0 0Bytes 0 0 0 completed 0:00:00
localhost 3 0Bytes 3 0 0 completed 0:00:00
volume rebalance: dis_volume: success
2、缩减卷
注意:
缩减分布复制卷、分布式分散卷时,缩减的brick数量要为replica参数的整倍数
1.迁移数据
#迁移卷中的数据
[root@c7-glusterfs-node1-21 ~]# gluster volume remove-brick dis_volume node3.glusterfs.com:/data3/br1 start
Running remove-brick with cluster.force-migration enabled can result in data corruption. It is safer to disable this option so that files that receive writes during migration are not migrated.
Files that are not migrated can then be manually copied after the remove-brick commit operation.
Do you want to continue with your current cluster.force-migration settings? (y/n) y
volume remove-brick start: success
ID: 722a8297-a838-410f-b3d4-2631255d9108
2.查看迁移情况
[root@c7-glusterfs-node1-21 ~]# gluster volume remove-brick dis_volume node3.glusterfs.com:/data3/br1 status
Node Rebalanced-files size scanned failures skipped status run time in h:m:s
--------- ----------- ----------- ----------- ----------- ----------- ------------ --------------
node3.glusterfs.com 8 0Bytes 8 0 0 completed 0:00:00
3.移除卷
[root@c7-glusterfs-node1-21 ~]# gluster volume remove-brick dis_volume node3.glusterfs.com:/data3/br1 commit
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated.
If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
[root@c7-glusterfs-node1-21 ~]# gluster volume info dis_volume
Volume Name: dis_volume
Type: Distribute
Volume ID: cd165767-a425-48f0-a8ab-d76d5494a1fe
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: node1.glusterfs.com:/data/br1
Brick2: node2.glusterfs.com:/data/br1
Options Reconfigured:
performance.client-io-threads: on
transport.address-family: inet
nfs.disable: on
3、替换故障卷
replace-brick命令
注意: 该命令只适用于分布式复制卷、复制卷
其他卷替换
1、添加新brick[不需要手动执行rebalance]
2、删除旧brick, 确认数据迁移完毕,再commit删除
[root@c7-glusterfs-node1-21 ~]# gluster volume replace-brick replica_volume node3.glusterfs.com:/data/br1 node3.glusterfs.com:/data3/br1 commit force
volume replace-brick: success: replace-brick commit force operation successful
[root@c7-glusterfs-node1-21 ~]# gluster volume info replica_volume
Volume Name: replica_volume
Type: Replicate
Volume ID: 7bff53e5-03d4-4e34-9c7c-809fc5ef624f
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node3.glusterfs.com:/data3/br1
Brick2: node4.glusterfs.com:/data/br1
Options Reconfigured:
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
4、卷优化
格式:
# gluster volume set <卷名称> <参数名称> <值>
1、基于客户端地址进行访问控制
auth.allow
auth.reject
2、performance.write-behind-window-size
设置写缓冲区大小,默认1M
3、performance.io-thread-count
设置卷的IO线程数量 1--64
4、performance.cache-size
设置卷的缓存大小
5、performance.cache-max-file-size
设置缓存的最大文件大小
6、performance.cache-min-file-size
设置缓存的最小文件大小
7、performance.cache-refresh-timeout
设置缓存区的刷新时间间隔, 单位秒