使用S3,NFS安装与客户端兼容的LeoFS分布式故障安全存储设施

我来自Luxoft。
根据Opennet的说法: LeoFSLeoFS对象的分布式容错存储设施,与使用Amazon S3 API和REST-API的客户端兼容,并且还支持作为NFS服务器的操作模式。 有用于存储大小对象的优化,有内置的缓存机制,可以在数据中心之间复制存储。 该项目的目标之一是通过重复复制和消除单点故障来实现99.9999999%的可靠性。 项目代码用Erlang编写。


LeoFS包含三个组件:


  • LeoFS存储 -服务于添加,检索和删除对象和元数据的操作,负责复制,还原和排队客户端请求。
  • LeoFS网关 -使用REST-API或S3-API提供HTTP请求并将响应重定向到客户端,并在内存和磁盘上缓存请求最多的数据。
  • LeoFS管理器 -监视LeoFS网关和LeoFS存储节点的操作,监视节点的状态并检查校验和。 确保数据完整性和高存储可用性。

在这篇文章中,使用ansible-playbook安装Leofs,测试S3,NFS。


如果您尝试使用官方手册来安装LeoFS,则将等待其他错误: 1,2 。 在这篇文章中,我将写一些避免这些错误的方法。


您将在哪里运行ansible-playbook,需要安装netcat。


库存示例


示例清单(在hosts.sample存储库中):
# Please check roles/common/vars/leofs_releases for available versions [all:vars] leofs_version=1.4.3 build_temp_path="/tmp/leofs_builder" build_install_path="/tmp/" build_branch="master" source="package" #[builder] #172.26.9.177 # nodename of leo_manager_0 and leo_manager_1 are set at group_vars/all [leo_manager_0] 172.26.9.176 # nodename of leo_manager_0 and leo_manager_1 are set at group_vars/all [leo_manager_1] 172.26.9.178 [leo_storage] 172.26.9.179 leofs_module_nodename=S0@172.26.9.179 172.26.9.181 leofs_module_nodename=S0@172.26.9.181 172.26.9.182 leofs_module_nodename=S0@172.26.9.182 172.26.9.183 leofs_module_nodename=S0@172.26.9.183 [leo_gateway] 172.26.9.180 leofs_module_nodename=G0@172.26.9.180 172.26.9.184 leofs_module_nodename=G0@172.26.9.184 [leofs_nodes:children] leo_manager_0 leo_manager_1 leo_gateway leo_storage 

服务器准备


禁用Selinux。 我希望社区为LeoFS创建Selinux策略。


  - name: Install libselinux as prerequisite for SELinux Ansible module yum: name: "{{item}}" state: latest with_items: - libselinux-python - libsemanage-python - name: Disable SELinux at next reboot selinux: state: disabled - name: Set SELinux in permissive mode until the machine is rebooted command: setenforce 0 ignore_errors: true changed_when: false 

安装netcatredhat-lsb-coreleofs-adm需要netcat在这里确定OS版本需要redhat-lsb-core


  - name: Install Packages yum: name={{ item }} state=present with_items: - nmap-ncat - redhat-lsb-core 

创建用户leofs并将其添加到wheel组


  - name: Create user leofs group: name: leofs state: present - name: Allow 'wheel' group to have passwordless sudo lineinfile: dest: /etc/sudoers state: present regexp: '^%wheel' line: '%wheel ALL=(ALL) NOPASSWD: ALL' validate: 'visudo -cf %s' - name: Add the user 'leofs' to group 'wheel' user: name: leofs groups: wheel append: yes 

安装Erlang


  - name: Remote erlang-20.3.8.23-1.el7.x86_64.rpm install with yum yum: name=https://github.com/rabbitmq/erlang-rpm/releases/download/v20.3.8.23/erlang-20.3.8.23-1.el7.x86_64.rpm 

可在此处找到已更正的ansible剧本的完整版本: https : //github.com/patsevanton/leofs_ansible


安装,配置,开始


接下来,按照https://github.com/leo-project/leofs_ansible中的说明执行,不带build_leofs.yml


 ## Install LeoFS $ ansible-playbook -i hosts install_leofs.yml ## Config LeoFS $ ansible-playbook -i hosts config_leofs.yml ## Start LeoFS $ ansible-playbook -i hosts start_leofs.yml 

在主要LeoManager上检查群集状态


 leofs-adm status 

在ansible-playbook日志中可以看到Primary和Secondary




结论将是这样的
  [System Confiuration] -----------------------------------+---------- Item | Value -----------------------------------+---------- Basic/Consistency level -----------------------------------+---------- system version | 1.4.3 cluster Id | leofs_1 DC Id | dc_1 Total replicas | 2 number of successes of R | 1 number of successes of W | 1 number of successes of D | 1 number of rack-awareness replicas | 0 ring size | 2^128 -----------------------------------+---------- Multi DC replication settings -----------------------------------+---------- [mdcr] max number of joinable DCs | 2 [mdcr] total replicas per a DC | 1 [mdcr] number of successes of R | 1 [mdcr] number of successes of W | 1 [mdcr] number of successes of D | 1 -----------------------------------+---------- Manager RING hash -----------------------------------+---------- current ring-hash | a0314afb previous ring-hash | a0314afb -----------------------------------+---------- [State of Node(s)] -------+----------------------+--------------+---------+----------------+----------------+---------------------------- type | node | state | rack id | current ring | prev ring | updated at -------+----------------------+--------------+---------+----------------+----------------+---------------------------- S | S0@172.26.9.179 | running | | a0314afb | a0314afb | 2019-12-05 10:33:47 +0000 S | S0@172.26.9.181 | running | | a0314afb | a0314afb | 2019-12-05 10:33:47 +0000 S | S0@172.26.9.182 | running | | a0314afb | a0314afb | 2019-12-05 10:33:47 +0000 S | S0@172.26.9.183 | attached | | | | 2019-12-05 10:33:58 +0000 G | G0@172.26.9.180 | running | | a0314afb | a0314afb | 2019-12-05 10:33:49 +0000 G | G0@172.26.9.184 | running | | a0314afb | a0314afb | 2019-12-05 10:33:49 +0000 -------+----------------------+--------------+---------+----------------+----------------+---------------------------- 

创建一个用户


创建用户leof:


 leofs-adm create-user leofs leofs access-key-id: 9c2615f32e81e6a1caf5 secret-access-key: 8aaaa35c1ad78a2cbfa1a6cd49ba8aaeb3ba39eb 

用户列表:


 leofs-adm get-users user_id | role_id | access_key_id | created_at ------------+---------+------------------------+--------------------------- _test_leofs | 9 | 05236 | 2019-12-02 06:56:49 +0000 leofs | 1 | 9c2615f32e81e6a1caf5 | 2019-12-02 10:43:29 +0000 

创建一个桶


做一个水桶


 leofs-adm add-bucket leofs 9c2615f32e81e6a1caf5 OK 

遗愿清单:


  leofs-adm get-buckets cluster id | bucket | owner | permissions | created at -------------+----------+--------+------------------+--------------------------- leofs_1 | leofs | leofs | Me(full_control) | 2019-12-02 10:44:02 +0000 

配置s3cmd


在“ HTTP Proxy server name字段中,指定网关服务器IP


 s3cmd --configure Enter new values or accept defaults in brackets with Enter. Refer to user manual for detailed description of all options. Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables. Access Key [9c2615f32e81e6a1caf5]: Secret Key [8aaaa35c1ad78a2cbfa1a6cd49ba8aaeb3ba39eb]: Default Region [US]: Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3. S3 Endpoint [s3.amazonaws.com]: Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used if the target S3 system supports dns based buckets. DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.s3.amazonaws.com]: leofs Encryption password is used to protect your files from reading by unauthorized persons while in transfer to S3 Encryption password: Path to GPG program [/usr/bin/gpg]: When using secure HTTPS protocol all communication with Amazon S3 servers is protected from 3rd party eavesdropping. This method is slower than plain HTTP, and can only be proxied with Python 2.7 or newer Use HTTPS protocol [No]: On some networks all internet access must go through a HTTP proxy. Try setting it here if you can't connect to S3 directly HTTP Proxy server name [172.26.9.180]: HTTP Proxy server port [8080]: New settings: Access Key: 9c2615f32e81e6a1caf5 Secret Key: 8aaaa35c1ad78a2cbfa1a6cd49ba8aaeb3ba39eb Default Region: US S3 Endpoint: s3.amazonaws.com DNS-style bucket+hostname:port template for accessing a bucket: leofs Encryption password: Path to GPG program: /usr/bin/gpg Use HTTPS protocol: False HTTP Proxy server name: 172.26.9.180 HTTP Proxy server port: 8080 Test access with supplied credentials? [Y/n] Y Please wait, attempting to list all buckets... Success. Your access key and secret key worked fine :-) Now verifying that encryption works... Not configured. Never mind. Save settings? [y/N] y Configuration saved to '/home/user/.s3cfg' 

如果收到错误ERROR:S3错误:403(AccessDenied):访问被拒绝:


 s3cmd put test.py s3://leofs/ upload: 'test.py' -> 's3://leofs/test.py' [1 of 1] 382 of 382 100% in 0s 3.40 kB/s done ERROR: S3 error: 403 (AccessDenied): Access Denied 

然后,您需要在True s3cmd配置中更正signature_v2。 本期详细内容。


如果signature_v2为False,则将出现这样的错误:


 WARNING: Retrying failed request: /?delimiter=%2F (getaddrinfo() argument 2 must be integer or string) WARNING: Waiting 3 sec... WARNING: Retrying failed request: /?delimiter=%2F (getaddrinfo() argument 2 must be integer or string) WARNING: Waiting 6 sec... ERROR: Test failed: Request failed for: /?delimiter=%2F 

开机测试


创建一个1GB的文件


 fallocate -l 1GB 1gb 

将其上传到Leofs


 time s3cmd put 1gb s3://leofs/ real 0m19.099s user 0m7.855s sys 0m1.620s 

统计资料


leofs-adm du适用于1个节点:


 leofs-adm du S0@172.26.9.179 active number of objects: 156 total number of objects: 156 active size of objects: 602954495 total size of objects: 602954495 ratio of active size: 100.0% last compaction start: ____-__-__ __:__:__ last compaction end: ____-__-__ __:__:__ 

我们看到结论并不十分有益。


让我们看看这个文件的位置。
leofs-adm whereis leofs / 1GB


 leofs-adm whereis leofs/1gb -------+----------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+---------------------------- del? | node | ring address | size | checksum | has children | total chunks | clock | when -------+----------------------+--------------------------------------+------------+--------------+----------------+----------------+----------------+---------------------------- | S0@172.26.9.181 | 657a9f3a3db822a7f1f5050925b26270 | 976563K | a4634eea55 | true | 64 | 598f2aa976a4f | 2019-12-05 10:48:15 +0000 | S0@172.26.9.182 | 657a9f3a3db822a7f1f5050925b26270 | 976563K | a4634eea55 | true | 64 | 598f2aa976a4f | 2019-12-05 10:48:15 +0000 

激活NFS


我们在Leo Gateway 172.26.9.184服务器上激活NFS。


在服务器和客户端上,安装nfs-utils


 sudo yum install nfs-utils 

根据说明,我们将修复配置文件/usr/local/leofs/current/leo_gateway/etc/leo_gateway.conf


 protocol = nfs 

在服务器172.26.9.184上,运行rpcbind和leofs-gateway


 sudo service rpcbind start sudo service leofs-gateway restart 

在运行leo_manager的服务器上,为NFS创建存储桶并生成一个密钥以连接到NFS


 leofs-adm add-bucket test 05236 leofs-adm gen-nfs-mnt-key test 05236 ip--nfs- 

连接到NFS


 sudo mkdir /mnt/leofs ## for Linux - "sudo mount -t nfs -o nolock <host>:/<bucket>/<token> <dir>" sudo mount -t nfs -o nolock ip--nfs-------gateway:/bucket/access_key_id/---gen-nfs-mnt-key /mnt/leofs sudo mount -t nfs -o nolock 172.26.9.184:/test/05236/bb5034f0c740148a346ed663ca0cf5157efb439f /mnt/leofs 

通过NFS客户端查看磁盘空间


考虑到每个存储节点都有一个40GB磁盘(3个正在运行的节点,1个连接的节点)的磁盘空间:


 df -hP Filesystem Size Used Avail Use% Mounted on 172.26.9.184:/test/05236/e7298032e78749149dd83a1e366afb328811c95b 60G 3.6G 57G 6% /mnt/leofs 

使用6个存储节点安装LeoFS。


库存(无生成器):
 # Please check roles/common/vars/leofs_releases for available versions [all:vars] leofs_version=1.4.3 build_temp_path="/tmp/leofs_builder" build_install_path="/tmp/" build_branch="master" source="package" # nodename of leo_manager_0 and leo_manager_1 are set at group_vars/all [leo_manager_0] 172.26.9.177 # nodename of leo_manager_0 and leo_manager_1 are set at group_vars/all [leo_manager_1] 172.26.9.176 [leo_storage] 172.26.9.178 leofs_module_nodename=S0@172.26.9.178 172.26.9.179 leofs_module_nodename=S0@172.26.9.179 172.26.9.181 leofs_module_nodename=S0@172.26.9.181 172.26.9.182 leofs_module_nodename=S0@172.26.9.182 172.26.9.183 leofs_module_nodename=S0@172.26.9.183 172.26.9.185 leofs_module_nodename=S0@172.26.9.185 [leo_gateway] 172.26.9.180 leofs_module_nodename=G0@172.26.9.180 172.26.9.184 leofs_module_nodename=G0@172.26.9.184 [leofs_nodes:children] leo_manager_0 leo_manager_1 leo_gateway leo_storage 

Leofs-ADM状态输出


Leofs-ADM状态输出
  [System Confiuration] -----------------------------------+---------- Item | Value -----------------------------------+---------- Basic/Consistency level -----------------------------------+---------- system version | 1.4.3 cluster Id | leofs_1 DC Id | dc_1 Total replicas | 2 number of successes of R | 1 number of successes of W | 1 number of successes of D | 1 number of rack-awareness replicas | 0 ring size | 2^128 -----------------------------------+---------- Multi DC replication settings -----------------------------------+---------- [mdcr] max number of joinable DCs | 2 [mdcr] total replicas per a DC | 1 [mdcr] number of successes of R | 1 [mdcr] number of successes of W | 1 [mdcr] number of successes of D | 1 -----------------------------------+---------- Manager RING hash -----------------------------------+---------- current ring-hash | d8ff465e previous ring-hash | d8ff465e -----------------------------------+---------- [State of Node(s)] -------+----------------------+--------------+---------+----------------+----------------+---------------------------- type | node | state | rack id | current ring | prev ring | updated at -------+----------------------+--------------+---------+----------------+----------------+---------------------------- S | S0@172.26.9.178 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:29 +0000 S | S0@172.26.9.179 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:29 +0000 S | S0@172.26.9.181 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:30 +0000 S | S0@172.26.9.182 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:29 +0000 S | S0@172.26.9.183 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:29 +0000 S | S0@172.26.9.185 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:29 +0000 G | G0@172.26.9.180 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:31 +0000 G | G0@172.26.9.184 | running | | d8ff465e | d8ff465e | 2019-12-06 05:18:31 +0000 -------+----------------------+--------------+---------+----------------+----------------+---------------------------- 

磁盘空间,并考虑到每个存储节点都有一个40GB磁盘(6个正在运行的节点):


 df -hP Filesystem Size Used Avail Use% Mounted on 172.26.9.184:/test/05236/e7298032e78749149dd83a1e366afb328811c95b 120G 3.6G 117G 3% /mnt/leofs 

如果使用5个存储节点


 [leo_storage] 172.26.9.178 leofs_module_nodename=S0@172.26.9.178 172.26.9.179 leofs_module_nodename=S1@172.26.9.179 172.26.9.181 leofs_module_nodename=S2@172.26.9.181 172.26.9.182 leofs_module_nodename=S3@172.26.9.182 172.26.9.183 leofs_module_nodename=S4@172.26.9.183 

 df -hP 172.26.9.184:/test/05236/e7298032e78749149dd83a1e366afb328811c95b 100G 3.0G 97G 3% /mnt/leofs 

日志


日志位于目录/usr/local/leofs/current/*/log


如果您手动安装/配置Leofs,则可能会遇到以下错误。


[错误]无法使用Mnesia

启动systemctl服务启动leofs-manager-master


 leofs-adm status [ERROR] Mnesia is not available 

需要在leo_manager_1上启动systemctl start leofs-manager-slave


Leofs-storage无法启动。

leofs-manager-master和leofs-manager-slave和leofs-adm status必须运行才能显示该状态。


附加节点少于副本数

当您启动leofs-adm start时,会出现以下错误:


 leofs-adm start [ERROR] Attached nodes less than # of replicas 

没有足够的存储节点。 leofs-adm状态将向您显示少于2个存储节点。 所需的最少存储节点数2。


leofs-adm状态显示为已附着,其余正在运行。

需要重新平衡节点


 leofs-adm rebalance 

启动leofs-gateway之后,您看不到leofs-adm状态的Gateway节点

需要启动leofs-adm


 leofs-adm start 

无法连接到从节点上的LeoFS Manager

(默认情况下,leofs-adm在从属节点上不起作用!]( Https://leo-project.net/leofs/docs/issues/documentation-issues/


负载测试


使用以下配置在2个节点上进行测试:


 CPU: Single Core Intel Core (Broadwell) (-MCP-) speed: 2295 MHz Kernel: 3.10.0-862.3.2.el7.x86_64 x86_64 Up: 1h 08m Mem: 1023.8/1999.6 MiB (51.2%) Storage: 10.00 GiB (43.5% used) Procs: 98 Shell: bash 4.2.46 inxi: 3.0.37 

为了测试,请拿一个小磁盘
在两个节点上,我们都看到了9.4G和5.9G的可用空间磁盘。


 df -hP Filesystem Size Used Avail Use% Mounted on /dev/vda1 9.4G 5.9G 3.1G 66% / 

电报频道: SDS和Cluster FS

Source: https://habr.com/ru/post/zh-CN478990/


All Articles