first commit

This commit is contained in:
kaichao 2023-03-31 17:01:47 +08:00
commit edc85f20d7
73 changed files with 2027 additions and 0 deletions

5
.gitignore vendored Normal file

@ -0,0 +1,5 @@
.DS_Store
.DS_Store?
Makefile

201
LICENSE Normal file

@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

0
README.md Normal file

13
cluster/.env Normal file

@ -0,0 +1,13 @@
# CentOS 7 : environment variable does not take effect
# CentOS 8 : ok
# CLUSTER=local
# ################################
# database config
# ################################
PGHOST=database
PGPORT=5432
POSTGRES_DB=scalebox
POSTGRES_USER=scalebox
POSTGRES_PASSWORD=changeme

132
cluster/README.md Normal file

@ -0,0 +1,132 @@
# Scalebox Cluster
## Scalebox Cluster Overview
### Head node
The head node runs the management services of a single scalebox cluster, mainly:
- controld: provides the gRPC-based control service to the actuator and the compute nodes;
- database: a PostgreSQL-based database that stores app, job, task, slot and other related data, and provides data storage and retrieval for controld and other services;
- actuator: the launcher, responsible for starting slots on the compute nodes.
### Compute nodes
There are two kinds of compute nodes:
- internal compute nodes: scheduled and managed by scalebox itself; slots are started via passwordless ssh;
- external compute nodes: started by an external scheduler (slurm/k8s, etc.), which also starts the slots.
## Single-Node Scalebox Cluster Installation
- OS: CentOS 7 or later
- Container engine: Docker CE 20.10+
- docker-compose: 1.29.2+
- Utility software such as dstat, htop, zstd, gmake, git and rsync, for performance monitoring etc.
### Install basic software on CentOS 7/8
- Install as the root user:
```bash
yum install -y epel-release
yum install -y htop dstat rsync pv pdsh wget make
```
- Install git v2
- sshd
  - turn off UseDNS/GSSAPIAuthentication
- set up Linux time calibration
### Install the scalebox command-line tool
- As the current (non-root) user:
```bash
mkdir -p ~/bin
wget -O ~/bin/docker-compose https://github.com/docker/compose/releases/download/1.29.2/docker-compose-Linux-x86_64
chmod +x ~/bin/docker-compose
mkdir -p ~/.ssh ~/.scalebox/log
chmod 700 ~/.ssh
docker pull hub.cstcloud.cn/scalebox/cli
id=$(docker create hub.cstcloud.cn/scalebox/cli)
docker cp $id:/usr/local/bin/scalebox ~/bin/
docker rm -v $id
chmod +x ~/bin/*
echo "alias app='scalebox app'" >> ${HOME}/.bash_profile
echo "alias job='scalebox job'" >> ${HOME}/.bash_profile
echo "alias task='scalebox task'" >> ${HOME}/.bash_profile
```
### Download and install docker-scalebox
```bash
cd && git clone https://github.com/kaichao/docker-scalebox
```
- Set up passwordless ssh from the actuator to the nodes, so the actuator can start slots:
```bash
cat ~/docker-scalebox/cluster/id_rsa.pub >> ${HOME}/.ssh/authorized_keys
```
### Start the scalebox control services
- Check whether the local IP address needs to be set:
```bash
hostname -i
```
- If the IP address reported above is not correct, set the LOCAL_IP_INDEX or LOCAL_ADDR variable in the local/defs.mk file.
  The correct local IP address can be found with ```hostname -I```.
- Example:
```
[user@node-1 local]$ hostname -I
10.0.2.21 192.168.56.21 172.17.0.1 172.20.0.1 172.19.0.1 172.22.0.1
Then set either:
LOCAL_IP_INDEX=2
LOCAL_ADDR=192.168.56.21
```
- Start the scalebox control services:
```bash
cd ~/docker-scalebox/cluster && make all
```
## Multi-Node Scalebox Cluster Installation
### Head node installation
On top of the single-node setup, install time-server, pdsh and pv:
```sh
yum install -y pdsh pdsh-rcmd-ssh
export PDSH_RCMD_TYPE=ssh
```
### Internal compute node installation
- OS: CentOS 7 or later
- Container engine: Docker 20.10 or later
- Utility software such as dstat, htop and zstd, for performance monitoring etc.

Optional:
- Linux time calibration
- add the head-node admin user and the actuator's public key
- docker-ce
- dstat
- htop
- zstd
- sshd
  - turn off UseDNS GSSAPIAuthentication
```sh
yum install -y epel-release
yum install -y htop dstat rsync pv
```
### External compute nodes
- OS: CentOS 7 or later
- Single-node container engines:
  - Docker 20.10 or later
  - podman (version ?)
  - singularity 3.8
- k8s cluster
## Installing gluster storage on compute nodes

5
cluster/bio-down/defs.mk Normal file

@ -0,0 +1,5 @@
CLUSTER := bio-down
SHARED_DIR := /gfs
NODES := n[0-3],h0


@ -0,0 +1,38 @@
version: '1.0.0'
label: Global Definition Cluster bio-down
specs:
  bio-vm-00:
    # CPU cores
    num_cores: 4
    # Memory(GB)
    mem_gb: 16.0
    # Disk(GB)
    disk: 200.0
clusters:
  bio-down:
    base_dir: /gfs
    base_data_dir: /gfs/mydata
    uid: root
    memo: bio-down cluster
    hosts:
      h0:
        label: head node
        ip_addr: 10.0.6.100
        role: head
        spec: bio-vm-00
        memo: CentOS 8
      n0:
        ip_addr: 10.0.6.101
        spec: bio-vm-00
      n1:
        ip_addr: 10.0.6.102
        spec: bio-vm-00
      n2:
        ip_addr: 10.0.6.103
        spec: bio-vm-00
      n3:
        ip_addr: 10.0.6.104
        spec: bio-vm-00


@ -0,0 +1,73 @@
version: '3.8'
services:
  controld:
    image: hub.cstcloud.cn/scalebox/controld
    restart: unless-stopped
    container_name: controld
    # hostname: host1
    logging:
      driver: "json-file"
      options:
        max-size: "100m"
        max-file: "5"
    depends_on:
      - database
    ports:
      - 50051:50051
    environment:
      - CLUSTER=${CLUSTER}
      - PGHOST=${PGHOST}
      - PGPORT=${PGPORT}
      - LOG_LEVEL=WARN
    volumes:
      - /etc/localtime:/etc/localtime
      - ${HOME}/.scalebox/log/controld:/var/log/scalebox/controld
  actuator:
    image: hub.cstcloud.cn/scalebox/actuator
    restart: unless-stopped
    container_name: actuator
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "5"
    environment:
      - CLUSTER=${CLUSTER}
      - GRPC_SERVER=controld:50051
      - PGHOST=${PGHOST}
      - PGPORT=${PGPORT}
      - LOG_LEVEL=WARN
    cap_add:
      # update default route, for CentOS8
      - NET_ADMIN
    volumes:
      - /etc/localtime:/etc/localtime
      - ${HOME}/.scalebox/log/actuator:/var/log/scalebox/actuator
    depends_on:
      - controld
  database:
    image: hub.cstcloud.cn/scalebox/database
    restart: unless-stopped
    container_name: database
    environment:
      # - TZ='GMT-8'
      # - PGTZ='GMT-8'
      - POSTGRES_DB=${POSTGRES_DB}
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
    volumes:
      # - ${PWD}/PGDATA:/var/lib/postgresql/data
      - pgdata:/var/lib/postgresql/data
      - /etc/localtime:/etc/localtime
    ports:
      - 5432:5432
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "scalebox"]
      interval: 5s
      timeout: 5s
      retries: 5
volumes:
  pgdata:

1
cluster/id_rsa.pub Normal file

@ -0,0 +1 @@
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDoHGMqm8gvuEnkcDdNy67Zna2R2lkwxUDe6Bu0D5Iwztj4ry9zvCnRnx58CpAF1Db1E3xaLZPbimwIvj/x3DTz8s7Mx3DqPef6ctamRYioRqkR1QKOjKvrYOTB2TIpOhRj8iDk0znIgdv0lHAa5saF3xC4KJGSII7eHSU6ZupGIBiQOkKgSCFx1JYwl7UbwhdXd3JZAk/mxR8n91FdvOWEwp73HA5MxW0yAV53qHvTD+394vi2zn48DkqUTnypu6aw3Z4lCbOebpQ4tT7etwliybLxD2zZhT07DWaPtmGVji7mXWDmJinZ2qbq4fo387/hrCiOKiIl/3jGB6nX79Hv fast@login1

9
cluster/local/defs.mk Normal file

@ -0,0 +1,9 @@
CLUSTER := local
SHARED_DIR := /gfs
NODES := h0
# Customize local ip by defining LOCAL_IP_INDEX or LOCAL_ADDR
# LOCAL_ADDR=127.0.0.1
# LOCAL_IP_INDEX=2


@ -0,0 +1,16 @@
version: '1.0.0'
label: Global Definition Cluster local
clusters:
  local:
    base_dir: /gfs
    base_data_dir: /gfs/mydata
    # base_log_dir: /gfs/mylog
    uname: ${USER}
    memo: local cluster
    hosts:
      h0:
        label: head node
        ip_addr: ${LOCAL_ADDR}
        role: head

65
cluster/misc.yml Normal file

@ -0,0 +1,65 @@
version: '3.8'
services:
  postgres-proxy:
    image: edoburu/pgbouncer
    environment:
      - DB_USER=${POSTGRES_USER}
      - DB_PASSWORD=${POSTGRES_PASSWORD}
      - DB_HOST=${PGHOST}
      - DB_PORT=${PGPORT}
      - DB_NAME=${POSTGRES_DB}
      - AUTH_TYPE=plain
      - ADMIN_USERS=postgres,scalebox
      - TCP_KEEPCNT=10
      - TCP_KEEPIDLE=60
      - TCP_KEEPINTVL=20
      # - POOL_MODE=session,transaction,statement
      # - TCP_KEEPALIVE=
      # - TCP_USER_TIMEOUT=2500
    ports:
      - "5432:5432"
  grpc-proxy:
    image: kaichao/envoy-grpc-proxy
    restart: unless-stopped
    ports:
      - 50051:50051
    environment:
      - SERVICE_NAME=grpc-host.org
      - SERVICE_PORT=50051
    volumes:
      - /etc/localtime:/etc/localtime
      - /var/log/envoy:/var/log/envoy
  dbadmin:
    image: adminer:4.8.1
    restart: unless-stopped
    environment:
      - ADMINER_DEFAULT_SERVER=database
    ports:
      - 8080:8080
  pgadmin4:
    image: dpage/pgadmin4:4.29
    restart: unless-stopped
    ports:
      - 8081:80
    environment:
      - PGADMIN_DEFAULT_EMAIL=myname@mail.org
      - PGADMIN_DEFAULT_PASSWORD=SuperSecret
    depends_on:
      - database
  redis:
    image: redis:5
    command: redis-server --requirepass 123456
    ports:
      - "6379:6379"
    volumes:
      - ${PWD}/DATA/redis-data:/data
  redis-Insight:
    image: redislabs/redisinsight:1.9.0
    restart: unless-stopped
    ports:
      - "9221:9221"

0
doc/README.md Normal file

0
doc/en_US/README.md Normal file

153
doc/status_code.md Normal file

@ -0,0 +1,153 @@
# Status Code
# Task/Task_exec Exit/Return Code
| Code | Number | Description |
| ----------- | ----------- | ----------- |
| ExFileNotExists | 240 | |
| ExMessageSendException | 245 | |
## Status Code Table
- code range(16-bit): [-32768..32767]
| Code | Number | Description |
| ----------- | ----------- | ----------- |
| app_code | 0~255 | |
| OK | 0 | |
| READY | -1 | |
| QUEUED | -2 | from 'READY' to 'RUNNING' |
| RUNNING| -3 | |
| INITIAL| -9 | Initial Status (used by dynamic scheduling) |
| ERROR| -32~-63 | |
- task status_code : for task scheduling
- task_exec status_code : task-exec history recording
- ssh / docker status_code (in actuator)
## ret_code vs. exit_code
## Task status_code (32-bit)
- range: [-128..-1]
- -102: timeout
- -1002: ?
- -1003: ?
## Task_exec Status Code Structure
```
10987654321098765432109876543210
-+------++------++------++-------
| sum |prepare|cleanup| run |
-+------++------++------++-------
```
```sh
task_exec_code = combine(task_sum_code, prepare_code, cleanup_code, run_code)
```
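The packing described by the diagram can be sketched in bash; this is an illustrative assumption — the actual `combine` implementation is not included in this commit. Each 8-bit field is shifted into place, with `task_sum_code` in the leftmost byte and `run_code` in the rightmost.

```shell
#!/bin/bash
# Pack four 8-bit fields into one 32-bit code, following the
# sum | prepare | cleanup | run layout shown in the diagram above.
combine() {
  local sum=$1 prepare=$2 cleanup=$3 run=$4
  echo $(( (sum << 24) | (prepare << 16) | (cleanup << 8) | run ))
}

combine 10 0 0 1   # RETRIED(10) in the sum byte, run exit code 1 -> 167772161
combine 0 1 2 3    # -> 66051
```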
### task_sum_code (left-most 8-bit, combined_code)
- range : [1..127]
- [Status codes and their use in gRPC](https://grpc.github.io/grpc/core/md_doc_statuscodes.html)
| Code | Number | Description |
| ----------- | ----------- | ----------- |
| RETRIED| 10~14 | 10 ~ 12 : 3; 13 : 10; 14 : 30 |
| | 16 | Network to control-server unavailable (failed to dial target host) |
| | 17 | Control-Service Unavailable |
| | 18 | Control-Service Unauthenticated |
| | 19 | Control-Service Permission Denied |
| | 20 | Control-Service Timeout |
| | 21 | Control-Service Method Unimplemented |
| | 24 | Invalid Input Message Format |
| | 25 | No Valid Output |
| | 127 | Unknown |
### Sub-Task Status Code (for prepare / run / cleanup)
- Range: [0..255)
## REF
### REF[1] : /usr/include/sysexits.h IN Linux
| Code | Number | Description |
| ----------- | ----------- | ----------- |
| ExOK | 0 | successful termination |
| ExUsage | 64 | command line usage error |
| ExDataErr | 65 | data format error |
| ExNoInput | 66 | cannot open input |
| ExNoUser | 67 | addressee unknown (for email) |
| ExNoHost | 68 | host name unknown (for email) |
| ExUnavailable | 69 | service unavailable (for email) |
| ExSoftware | 70 | internal software error |
| ExOSErr | 71 | system error (e.g., can't fork) |
| ExOSFile | 72 | critical OS file missing |
| ExCantCreat | 73 | can't create (user) output file |
| ExIOErr | 74 | input/output error |
| ExTempFail | 75 | temp failure; user is invited to retry |
| ExProtocol | 76 | remote error in protocol |
| ExNoPerm | 77 | permission denied |
| ExConfig | 78 | configuration error |
### REF[2] : [Advanced Bash Scripting Guide, Appendix E. Exit Codes With Special Meanings](https://tldp.org/LDP/abs/html/exitcodes.html)
| Code | Number | Description |
| ----------- | ----------- | ----------- |
| ExGeneral | 1 | |
| ExMisuse | 2 | |
| ExCantExec | 126 | |
| ExCmdNotFound | 127 | |
| ExInvalidExit | 128 | |
| ExSignals | 129~165 | |
| ExOutOfRange | 255 | |
### Additional Scalebox Sub-Task Status Code
| Code | Number | Description |
| ----------- | ----------- | ----------- |
| ExUserDef | 32~63 | |
| ExUserDef | 192~223 | |
| ExTimeOut | 224 | |
| ExCoreDump | 225 | |
| ExNotRunnable | 225 | run program is not runnable |
| ExExecNotExists | 225 | run program does not exist |
## Exit Status Code in actuator
### Docker run Exit Status
- [Docker run Exit Status](https://docs.docker.com/engine/reference/run/#exit-status)
| Code | Description |
| ----------- | ----------- |
| 125 | The error is with the Docker daemon itself |
| 126 | The contained command cannot be invoked |
| 127 | The contained command cannot be found |
### SSH Exit Status
- [SSH and SCP Return Codes](https://support.microfocus.com/kb/doc.php?id=7021696)
| Code | Description |
| -------- | ----------- |
| 0 | Operation was successful |
| 1 | Generic error, usually due to invalid command-line options or a malformed configuration |
| 2 | Connection failed |
| 65 | Host not allowed to connect |
| 66 | General error in ssh protocol |
| 67 | Key exchange failed |
| 68 | Reserved |
| 69 | MAC error |
| 70 | Compression error |
| 71 | Service not available |
| 72 | Protocol version not supported |
| 73 | Host key not verifiable |
| 74 | Connection failed |
| 75 | Disconnected by application |
| 76 | Too many connections |
| 77 | Authentication cancelled by user |
| 78 | No more authentication methods available |
| 79 | Invalid user name |

19
doc/zh_CN/README.md Normal file

@ -0,0 +1,19 @@
# Introduction
# Application Definition
# Message Format Definition
# Format of the record.json File Inside a Module
# Terminology
- Application / pipeline application (App)
- Module (Job)
- Task (Task)
- Dataset (DataSet)
- Data item / entity (Entry)
- Cluster (Cluster)
- Host (Host)
- Slot (Slot)

30
doc/zh_CN/app_def.md Normal file

@ -0,0 +1,30 @@
# 1. Introduction
A scalebox application is jointly defined by an application template file (app.yaml) and an application parameter file (scalebox.env). The template file defines the application's modules and the links between them, and contains parameter variables; these template parameter variables are defined as environment variables in the application parameter file.
## 1.1 Application template file
The default name of the application template file is app.yaml. Template parameter variables in the file are written in the form ${var_name}.
## 1.2 Application parameter file
The default name of the application parameter file is scalebox.env. It defines the parameter variables used in the application template file.
## 1.3 Application parsing
```sh
scalebox app create --env-file scalebox.env app.yaml
```
- Precedence of template parameter variable definitions during parsing:
  - environment variables set on the command line
  - the environment-variable file given on the command line (via --env-file, default scalebox.env)
  - the user-level environment configuration: ${HOME}/.scalebox/
  - the system-level environment configuration: /etc/scalebox/
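The precedence order above amounts to "first non-empty source wins". The sketch below is a hypothetical illustration of that rule (`resolve_var` is not a scalebox function; the actual resolution code is not part of this commit):

```shell
#!/bin/bash
# Hypothetical model of template-variable resolution: sources are
# checked from highest to lowest priority; the first non-empty wins.
resolve_var() {
  local cmdline=$1 envfile=$2 usercfg=$3 syscfg=$4
  local v
  for v in "$cmdline" "$envfile" "$usercfg" "$syscfg"; do
    if [ -n "$v" ]; then
      echo "$v"
      return 0
    fi
  done
  return 1
}

resolve_var "" "local" "" ""          # -> local    (falls back to scalebox.env)
resolve_var "bio-down" "local" "" ""  # -> bio-down (command-line value wins)
```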
# 2. Application Template File
# 3. Module Definition
Each module is an independent container image.


@ -0,0 +1,64 @@
# Convert a comma-separated string to an array
```bash
# comma-separated multi-message
arr=($(echo ${MESSAGE_BODY} | tr "," " "))
arr=($(/app/bin/url_parser ${SOURCE_URL}))
export MODE="${arr[0]}"
```
# Iterate over an array
```bash
# comma-separated multi-message
arr=($(echo ${MESSAGE_BODY} | tr "," " "))
for m in ${arr[@]}; do
  send-message $m
done
```
# Conditional logic in bash
```bash
if [ "${MODE}" = "ERROR_FORMAT" ]; then
  echo "url format error, ${arr[1]}" >&2
  exit 1
fi
```
# YAML parsing
```bash
#!/bin/bash
IMAGE=hub.cstcloud.cn/scalebox/parser
PGHOST=localhost
PGPORT=5432
# Pass all environment variables except PATH to the container
docker run --network host --rm -v $(cd $(dirname $1);pwd):/data:ro \
--env-file <(env|grep -v ^PATH=) -e PGHOST=${PGHOST} -e PGPORT=${PGPORT} \
${IMAGE} $(basename $1)
```
# timestamp in bash
```sh
# rfc-3339 in seconds
$ date --rfc-3339=seconds | sed 's/ /T/'
2014-03-19T18:35:03-04:00
# rfc-3339 in milliseconds
$ date --rfc-3339=ns | sed 's/ /T/; s/\(\....\).*\([+-]\)/\1\2/g'
2014-03-19T18:42:52.362-04:00
# rfc-3339 in microseconds
$ date --rfc-3339=ns | sed 's/ /T/; s/\(\.......\).*\([+-]\)/\1\2/g'
2022-11-06T22:35:04.030320+08:00
# iso-8601 on Linux
$ date --iso-8601=ns
2022-12-02T08:54:04,170460829+0800
```


@ -0,0 +1,10 @@
# message json-object
```json
{
  "type": "string",
  "text_body": "string",
  "array_body": "array",
  "priority": "string"
}
```
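A concrete message instance matching the schema above might look like this (field values are hypothetical examples, not taken from this commit):

```json
{
  "type": "normal",
  "text_body": "dir1/file-0001.dat",
  "array_body": null,
  "priority": "high"
}
```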

0
dockerfiles/README.md Normal file


@ -0,0 +1,15 @@
FROM kaichao/ftp-copy
LABEL maintainer="kaichao"
ENV REMOTE_URL= \
    LOCAL_ROOT=/ \
    NUM_PGET_CONN=4 \
    ACTION=PUSH \
    ENABLE_RECHECK_SIZE=yes \
    ENABLE_LOCAL_RELAY= \
    RAM_DISK_GB=
COPY --from=hub.cstcloud.cn/scalebox/base /usr/local/sbin /usr/local/sbin
ENTRYPOINT ["goagent"]


@ -0,0 +1,32 @@
FROM hub.cstcloud.cn/scalebox/agent
LABEL maintainer="kaichao"
# The newest curlftpfs may not be compatible; the following version is OK:
#   $ curlftpfs --version
#   curlftpfs 0.9.2 libcurl/7.74.0 fuse/2.9
RUN \
apt-get update \
&& apt-get install -y curlftpfs \
&& apt-get clean autoclean \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
# install the newest version, rsync 3.2.6, zstd 1.5.2
RUN echo "deb http://deb.debian.org/debian testing main" > /etc/apt/sources.list.d/bullseye-testing.list \
&& apt-get update \
&& apt-get install -y rsync openssh-client \
&& apt-get clean autoclean \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
COPY --from=hub.cstcloud.cn/scalebox/actuator /root/.ssh /root/.ssh
COPY --from=hub.cstcloud.cn/scalebox/rsync-agent /app/bin/url_parser /app/bin/
RUN mkdir -p /remote
ENV SOURCE_URL= \
    REGEX_FILTER= \
    RSYNC_PASSWORD=
COPY *.sh /app/bin/


@ -0,0 +1,26 @@
# list-dir
## Introduction
list-dir is a common module in scalebox. It traverses the file list of a local directory or of a remote rsync/ftp directory, generates messages, and sends them to the downstream module.
list-dir supports four types of directories:
- local: local directory
- rsync-over-ssh: server directory accessed via the ssh-based rsync protocol
- native rsync: server directory that supports the standard rsync protocol
- ftp: directory on an ftp server, with ssl encryption supported
## Environment variables
- SOURCE_URL: See table below.
| type | description |
| --- | ---- |
| local | represented by an absolute path ```</absolute-path> ```|
| rsync | anonymous access: ```rsync://<rsync-host><rsync-base-dir>```<br/> non-anonymous access: ```rsync://<rsync-user>@<rsync-host><rsync-base-dir>```|
| rsync-over-ssh | The ssh public key is stored in the ssh-server account to support password-free access <br/> ``` <ssh-user>@<ssh-host><ssh-base-dir>``` <br/>OR<br/> ``` <ssh-host><ssh-base-dir>```, default ssh-user is root |
| ftp | anonymous access: ```ftp://<ftp-host>/<ftp-base-dir>```<br/> non-anonymous access: ```ftp://<ftp-user>:<ftp-pass>@<ftp-host>/<ftp-base-dir>``` |
- REGEX_FILTER: File filtering rules represented by regular expressions
- DIR_NAME: subdirectory relative to SOURCE_URL; use "." for the current directory of SOURCE_URL
- RSYNC_PASSWORD: Non-anonymous rsync user password
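As an illustration, the four directory types might be configured as follows (hosts, paths, and credentials are placeholders, not real endpoints):

```shell
#!/bin/bash
# Placeholder SOURCE_URL values, one per supported directory type.
SOURCE_URL=/data/archive                               # local
SOURCE_URL=backup@10.0.0.5/data/archive                # rsync-over-ssh
SOURCE_URL=rsync://mirror@rsync.example.org/archive    # native rsync
SOURCE_URL=ftp://user:secret@ftp.example.org/archive   # ftp
echo "$SOURCE_URL"
```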


@ -0,0 +1,90 @@
#!/bin/bash
data_dir=$1
if [ "${MODE}" = "SSH" ]; then
  # ssh mode; the sed pattern below does not work under macOS (Linux-only)
  if [ "$data_dir" = "." ]; then
    rsync_url=${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_ROOT}
  else
    rsync_url=${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_ROOT}/${data_dir}
  fi
  rsync -avn -e "ssh -p ${REMOTE_PORT}" ${rsync_url} \
    | grep ^\- | awk '{print $5}' \
    | sed 's/^[^/]\+\//\//' \
    | awk -v p="$data_dir" '$0=p$0' \
    | egrep "${REGEX_FILTER}"
elif [ "${MODE}" = "RSYNC" ]; then
  if [ -z "${REMOTE_USER}" ]; then
    rsync_url="rsync://"
  else
    rsync_url="rsync://${REMOTE_USER}@"
  fi
  if [ "$data_dir" = "." ]; then
    rsync_url=${rsync_url}${REMOTE_HOST}:${REMOTE_PORT}${REMOTE_ROOT}
  else
    rsync_url=${rsync_url}${REMOTE_HOST}:${REMOTE_PORT}${REMOTE_ROOT}/${data_dir}
  fi
  echo "[INFO]rsync_url:${rsync_url}" >&2
  # rsync mode
  rsync -avn ${rsync_url} \
    | grep ^\- | awk '{print $5}' \
    | sed 's/^[^/]\+\//\//' \
    | awk -v p="$data_dir" '$0=p$0' \
    | egrep "${REGEX_FILTER}"
else
  if [ "${MODE}" = "FTP" ]; then
    # the ftp directory is mounted on /remote via curlftpfs (see setup.sh)
    LOCAL_ROOT="/remote"${REMOTE_ROOT}
  fi
  # MODE = 'LOCAL'
  cd ${LOCAL_ROOT} && find ${data_dir} -type f \
    | sed 's/^\.\///' \
    | egrep "${REGEX_FILTER}"
fi
# exit status of egrep:
#   0 if a line is selected
#   1 if no lines were selected
#   2 if an error occurred
status=(${PIPESTATUS[@]})
echo "[INFO]pipe_status:${status[*]}" >&2
n=${#status[*]}
if [ $n -eq 1 ]; then
  # a single status means the cd failed and the pipeline never ran
  if [ ${status[0]} -ne 0 ]; then
    echo "[ERROR]local mode, dir ${LOCAL_ROOT} not found" >&2
    exit ${status[0]}
  fi
fi
declare -i code
for ((i=n-1; i>=0; i--)); do
  code=${status[i]}
  if [ $i == $((n-1)) ]; then
    if [ $code == 1 ]; then
      echo "[WARN]All of the data was filtered out, empty dataset!" >&2
      code=0
    fi
  fi
  if [ $code -ne 0 ]; then
    break
  fi
done
exit $code

51
dockerfiles/list-dir/run.sh Executable file

@ -0,0 +1,51 @@
#!/bin/bash
arr=($(/app/bin/url_parser ${SOURCE_URL}))
export MODE="${arr[0]}"
if [ "${MODE}" = "ERROR_FORMAT" ]; then
  echo "url format error, ${arr[1]}" >&2
  exit 1
fi
if [ "${MODE}" = "LOCAL" ]; then
  # MODE | LOCAL_ROOT
  export LOCAL_ROOT="${arr[1]}"
elif [ "${MODE}" = "FTP" ]; then
  export FTP_URL="${arr[1]}"
  export REMOTE_ROOT="${arr[2]}"
else
  # MODE | REMOTE_HOST | REMOTE_PORT | REMOTE_ROOT | REMOTE_USER
  export REMOTE_HOST="${arr[1]}"
  export REMOTE_PORT="${arr[2]}"
  export REMOTE_ROOT="${arr[3]}"
  if [ ${#arr[*]} -eq 5 ]; then
    export REMOTE_USER="${arr[4]}"
  else
    export REMOTE_USER=
  fi
fi
env
# The while loop runs in a pipeline subshell, so its result is
# propagated via the subshell's exit status (PIPESTATUS[1]).
/app/bin/list-files.sh $1 | {
  ret_code=0
  while read line; do
    if [[ $line == ./* ]]; then
      # remove prefix './'
      line=${line:2}
    fi
    send-message $line
    code=$?
    if [ $code -ne 0 ]; then
      ret_code=$code
      echo "Error in send-message, message: $line" >&2
    fi
  done
  exit $ret_code
}
status=(${PIPESTATUS[@]})
ret_code=${status[1]}
if [ ${status[0]} -ne 0 ]; then
  ret_code=${status[0]}
  echo "Error running list-files.sh $1" >&2
fi
exit $ret_code

11
dockerfiles/list-dir/setup.sh Executable file

@ -0,0 +1,11 @@
#!/bin/bash
# curlftpfs -f -v -o debug,ftpfs_debug=3 -o allow_other -o ssl ${FTP_URL} /remote
if [[ $SOURCE_URL =~ (ftp://([^:]+:[^@]+@)?[^/:]+(:[^/]+)?)(/.*) ]]; then
  ftp_url=${BASH_REMATCH[1]}
  echo "ftp_url:$ftp_url" >&2
  curlftpfs -o ssl ${ftp_url} /remote
else
  echo "SOURCE_URL did not match regex!" >&2
fi


@ -0,0 +1,26 @@
```sh
# local version
CLUSTER=local SOURCE_URL=/ DIR_NAME=etc/postfix REGEX_FILTER= scalebox app create app.yaml
CLUSTER=local SOURCE_URL=/etc/postfix DIR_NAME=. REGEX_FILTER= scalebox app create app.yaml
CLUSTER=local SOURCE_URL=/ DIR_NAME=etc/postfix REGEX_FILTER=^.*cf\$ scalebox app create app.yaml
# rsync-over-ssh version
CLUSTER=local SOURCE_URL=scalebox@10.255.128.1/ DIR_NAME=etc/postfix REGEX_FILTER= scalebox app create app.yaml
CLUSTER=local SOURCE_URL=scalebox@10.255.128.1/etc/postfix DIR_NAME=. REGEX_FILTER= scalebox app create app.yaml
CLUSTER=local SOURCE_URL=scalebox@10.255.128.1/ DIR_NAME=etc/postfix REGEX_FILTER=^.*cf\$ scalebox app create app.yaml
# rsync-native version
CLUSTER=local SOURCE_URL=rsync://fast.cstcloud.cn/doi/10.1038/s41586-021-03878-5 DIR_NAME=20191021 REGEX_FILTER= scalebox app create app.yaml
CLUSTER=local SOURCE_URL=rsync://fast.cstcloud.cn/doi/10.1038/s41586-021-03878-5/20191021 DIR_NAME=. REGEX_FILTER= scalebox app create app.yaml
CLUSTER=local SOURCE_URL=rsync://fast@fast.cstcloud.cn/doi/10.1038/s41586-021-03878-5 DIR_NAME=20191021 REGEX_FILTER= RSYNC_PASSWORD=nao12345 scalebox app create app.yaml
CLUSTER=local SOURCE_URL=rsync://fast@fast.cstcloud.cn/files DIR_NAME=FRB121102/20190830 REGEX_FILTER= RSYNC_PASSWORD=nao12345 scalebox app create app.yaml
# ncbi data
CLUSTER=local SOURCE_URL=rsync://ftp.ncbi.nlm.nih.gov/1000genomes DIR_NAME=. REGEX_FILTER= RSYNC_PASSWORD= scalebox app create app.yaml
```

View File

@ -0,0 +1,25 @@
name: test.list-dir
cluster: ${CLUSTER}
parameters:
initial_status: RUNNING
jobs:
list-dir:
base_image: hub.cstcloud.cn/scalebox/list-dir
schedule_mode: HEAD
variables:
repeated: yes
parameters:
start_message: ${DIR_NAME}
command: docker run -d --rm --privileged --network=host %ENVS% %VOLUMES% %IMAGE%
paths:
- /:/local:ro
environments:
- SOURCE_URL=${SOURCE_URL}
- REGEX_FILTER=${REGEX_FILTER}
- RSYNC_PASSWORD=${RSYNC_PASSWORD}
sink_jobs:
- sink-module
sink-module:
base_image: hub.cstcloud.cn/scalebox/agent

View File

@ -0,0 +1,6 @@
CLUSTER=local
# SOURCE_URL=ftp://<ftp-user>:<ftp-pass>@<ftp-host>/<ftp-base-dir>
SOURCE_URL=
REGEX_FILTER=
DIR_NAME=.

View File

@ -0,0 +1,6 @@
CLUSTER=local
# SOURCE_URL=/<local-base-dir>
SOURCE_URL=
REGEX_FILTER=
DIR_NAME=.

View File

@ -0,0 +1,6 @@
CLUSTER=local
# SOURCE_URL=[<ssh-user>@]<ssh-host><ssh-base-dir>
SOURCE_URL=
REGEX_FILTER=
DIR_NAME=.

View File

@ -0,0 +1,8 @@
CLUSTER=local
# SOURCE_URL=rsync://[<rsync-user>@]<rsync-host>/<rsync-base-dir>
SOURCE_URL=
RSYNC_PASSWORD=
REGEX_FILTER=
DIR_NAME=.

View File

@ -0,0 +1,37 @@
FROM golang:1.19.2
COPY . /src/
RUN cd /src && go build url_parser.go && strip url_parser
FROM debian:11-slim AS agent-base
LABEL maintainer="kaichao"
# install the newest version, rsync 3.2.6, zstd 1.5.2
RUN echo "deb http://deb.debian.org/debian testing main" > /etc/apt/sources.list.d/bullseye-testing.list \
&& apt-get update \
&& apt-get install -y rsync openssh-client zstd \
&& apt-get clean autoclean \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
COPY --from=hub.cstcloud.cn/scalebox/base /usr/local/sbin /usr/local/sbin
COPY --from=hub.cstcloud.cn/scalebox/actuator /root/.ssh /root/.ssh
COPY --from=0 /src/url_parser /app/bin/
RUN mkdir -p /work /app/bin \
&& echo "PATH=/app/bin:\${PATH}" >> /root/.bashrc
WORKDIR /work
FROM agent-base
ENV \
REMOTE_HOST=localhost \
REMOTE_PORT=22 \
REMOTE_USER=root \
# for containerized remote rsyncd
REMOTE_ROOT=/local \
LOCAL_ROOT=/local \
RSYNC_PASSWORD=
ENTRYPOINT ["goagent"]

23
dockerfiles/rsync-agent/run.sh Executable file
View File

@ -0,0 +1,23 @@
#!/bin/bash
# FILENAME=$(cat /tasks/$1 | head -n 1)
# FILENAME=/data/ZD2020_1_1_2bit/$1
FILENAME=/data/$1
# REMOTE_HOST=10.255.0.10
set -e
if [ "$TRANSPORT_TYPE" = "rsync" ]; then
# RSYNC_PASSWORD="cnic123" rsync -RPut --port $RSYNC_PORT $FILENAME rsync://root@$RECEIVER_HOST:/share
RSYNC_PASSWORD="cnic123" rsync -RPut $FILENAME rsync://root@$REMOTE_HOST:$RSYNC_PORT/share
else
rsync -RPut -e "ssh -p ${SSH_PORT} -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" \
${SSH_USER}@${REMOTE_HOST}:${FILENAME} /output
# rsync -a -e "ssh -p ${SSH_PORT} -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" \
# ${SSH_USER}@${REMOTE_HOST}:${FILENAME} /tmp
fi
code=$?
echo ret_code=$code
exit $code

View File

@ -0,0 +1,101 @@
package main
import (
"fmt"
"os"
"os/user"
"regexp"
"strings"
)
/*
* url parser
* LOCAL : local directory
* SSH : user@host:port/root-dir
* RSYNC : rsync://user@host:port/root-dir
* FTP : ftp://user@host:port/root-dir
*
* LOCAL_ROOT / REMOTE_ROOT should have "/local" prefix
* for mapping to paths outside the container
*/
func main() {
url := os.Args[1]
serverContainerized := "yes" == os.Getenv("SERVER_CONTAINERIZED")
if strings.HasPrefix(url, "/") {
// MODE | LOCAL_ROOT
url = "/local" + url
fmt.Printf("%s %s", "LOCAL", url)
return
}
if strings.HasPrefix(url, "ftp://") {
// MODE | FTP_URL | REMOTE_ROOT
reg := regexp.MustCompile("(ftp://([^:]+:[^@]+@)?[^/:]+(:[^/]+)?)(/.*)")
ss := reg.FindStringSubmatch(url)
if ss == nil {
fmt.Printf("ERROR_FORMAT, url:%s", url)
} else {
ftpURL := ss[1]
remoteRoot := ss[4]
fmt.Printf("%s %s %s", "FTP", ftpURL, remoteRoot)
}
return
}
var mode string
if strings.HasPrefix(url, "rsync://") {
mode = "RSYNC"
url = url[8:]
} else {
mode = "SSH"
}
reg := regexp.MustCompile("^(([^@]+)@)?([^:/]+)(:([0-9]+))?(/.*)$")
ss := reg.FindStringSubmatch(url)
if ss == nil {
fmt.Printf("ERROR_FORMAT, url:%s", url)
return
}
uname := ss[2]
host := ss[3]
port := ss[5]
path := ss[6]
if uname == "" {
if mode == "SSH" {
if !serverContainerized {
if u, err := user.Current(); err == nil {
uname = u.Username
}
}
if uname == "" {
uname = "root"
}
} else { // "RSYNC"
if serverContainerized {
// rsyncd image's default user name
uname = "user"
}
}
}
if port == "" {
if mode == "SSH" {
if serverContainerized {
port = "2222"
} else {
port = "22"
}
} else {
port = "873"
}
}
if host == "" || path == "" {
fmt.Fprintf(os.Stderr, "REMOTE_HOST or REMOTE_ROOT is null, url=#%s#", url)
}
if serverContainerized {
path = "/local" + path
}
// for anonymous rsync && not containerized, REMOTE_USER is null
// MODE | REMOTE_HOST | REMOTE_PORT | REMOTE_ROOT | REMOTE_USER
fmt.Printf("%s %s %s %s %s", mode, host, port, path, uname)
}
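The SSH/rsync branch of `url_parser.go` can be sketched in Python (the user-name defaulting rules — current user, "root", "user" — are omitted here for brevity, so this is a simplification, not the shipped parser):

```python
import re

# Mirror of the Go regex in url_parser.go for SSH/rsync URLs
# (after an optional "rsync://" prefix has been stripped).
URL_RE = re.compile(r"^(([^@]+)@)?([^:/]+)(:([0-9]+))?(/.*)$")

def parse(url, server_containerized=False):
    mode = "SSH"
    if url.startswith("rsync://"):
        mode, url = "RSYNC", url[8:]
    m = URL_RE.match(url)
    if m is None:
        raise ValueError("ERROR_FORMAT: " + url)
    user = m.group(2) or ""
    host, port, path = m.group(3), m.group(5) or "", m.group(6)
    if not port:
        # default ports: sshd 22 (2222 when containerized), rsyncd 873
        port = ("2222" if server_containerized else "22") if mode == "SSH" else "873"
    if server_containerized:
        path = "/local" + path  # map into the container's /local mount
    return mode, host, port, path, user

print(parse("rsync://fast@fast.cstcloud.cn/files"))
# -> ('RSYNC', 'fast.cstcloud.cn', '873', '/files', 'fast')
```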

View File

@ -0,0 +1,15 @@
FROM hub.cstcloud.cn/scalebox/rsync-agent
LABEL maintainer="kaichao"
ENV SOURCE_URL= \
TARGET_URL= \
RSYNC_PASSWORD= \
ZSTD_CLEVEL=3 \
ENABLE_ZSTD= \
# Break through the permissions of some file storage
ENABLE_LOCAL_RELAY= \
SERVER_CONTAINERIZED=
COPY *.sh /app/bin/

126
dockerfiles/rsync-copy/run.sh Executable file
View File

@ -0,0 +1,126 @@
#!/bin/bash
# --cc=xxh3 , xxh3 Hash
# --compress --compress-choice=zstd
. /env.sh
# LOCAL_ROOT / REMOTE_ROOT already carry the '/local' prefix
# env
if [[ $REMOTE_MODE == 'SSH' ]]; then
if [[ $ENABLE_ZSTD == 'yes' ]]; then
if test -f "/rsync_ver_ge_323"; then
rsync_args="--cc=xxh3 --compress --compress-choice=zstd --compress-level=${ZSTD_CLEVEL}"
fi
fi
ssh_args="-T -c aes128-gcm@openssh.com -o Compression=no -x"
# ssh args in /root/.ssh/config
rsync_url=${REMOTE_USER}@${REMOTE_HOST}:${REMOTE_ROOT}
else
if [[ $ENABLE_ZSTD == 'yes' ]]; then
rsync_args="--cc=xxh3 --compress --compress-choice=zstd --compress-level=${ZSTD_CLEVEL}"
fi
if [ -z "${REMOTE_USER}" ]; then
rsync_url="rsync://"
else
rsync_url="rsync://"${REMOTE_USER}@
fi
rsync_url=${rsync_url}${REMOTE_HOST}:${REMOTE_PORT}${REMOTE_ROOT}
fi
echo "[DEBUG] rsync_url: $rsync_url" >&2
declare -i total_bytes=0 bytes
code=0
arr=($(echo $1 | tr "," " "))
# multiple files to copy
ds0=$(date --iso-8601=ns)
for file in ${arr[@]}; do
echo "copying file:"$file
if [[ $ACTION == 'PUSH' ]]; then
cd ${LOCAL_ROOT}
if [[ $REMOTE_MODE == 'SSH' ]]; then
rsync -Rut ${rsync_args} -e "ssh -p ${REMOTE_PORT} ${ssh_args}" $file $rsync_url
else
rsync -Rut ${rsync_args} $file $rsync_url
fi
code=$?
else
dest_dir=$(dirname ${LOCAL_ROOT}/$file)
mkdir -p ${dest_dir}
if [[ $REMOTE_MODE == 'SSH' ]]; then
rsync -ut ${rsync_args} -e "ssh -p ${REMOTE_PORT} ${ssh_args}" $rsync_url/$file ${dest_dir}
# rsync -Rut ${rsync_args} -e "ssh -p ${REMOTE_PORT} ${ssh_args}" $rsync_url/$file ${LOCAL_ROOT}
code=$?
else
if [[ $ENABLE_LOCAL_RELAY == 'yes' ]]; then
rsync -ut ${rsync_args} $rsync_url/$file /tmp
code=$?
if [ $code -eq 0 ]; then
mv /tmp/$(basename $file) ${dest_dir}
code=$?
fi
else
rsync -ut ${rsync_args} $rsync_url/$file ${dest_dir}
code=$?
fi
fi
fi
if [ $code -ne 0 ]; then
if [ $code -eq 23 ];then
# rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1819) [generator=3.2.3]
code=100
elif [ $code -eq 11 ];then
# Input/output error (5)
# rsync error: error in file IO (code 11) at receiver.c(871) [receiver=3.2.3]
code=101
elif [ $code -eq 255 ];then
# ssh: connect to host 60.245.209.223 port 22: Connection timed out
# rsync: connection unexpectedly closed (0 bytes received so far) [sender]
# rsync error: unexplained error (code 255) at io.c(231) [sender=3.2.6]
code=101
else
echo ret_code=$code
# code == 10
# rsync: [Receiver] failed to connect to 10.169.0.68 (10.169.0.68): Connection timed out (110)
# rsync error: error in socket IO (code 10) at clientserver.c(138) [Receiver=3.2.6]
fi
break
fi
bytes=$(stat --printf="%s" ${LOCAL_ROOT}/$file)
((total_bytes=total_bytes+bytes))
done
ds1=$(date --iso-8601=ns)
if [ $code -eq 0 ]; then
cat << EOF > /work/task-exec.json
{
"inputBytes":$total_bytes,
"outputBytes":$total_bytes,
"timestamps":["${ds0}","${ds1}"]
}
EOF
send-message $1
code=$?
else
cat << EOF > /work/task-exec.json
{
"statusCode":$code
}
EOF
fi
if [ $code -lt -127 ];then
echo "actual ret_code:"$code
ret_code=-127
elif [ $code -gt 127 ];then
echo "actual ret_code:"$code
ret_code=127
else
ret_code=$code
fi
exit $ret_code
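The error-code handling at the end of run.sh — mapping the known rsync/ssh failures to the codes 100/101 and clamping everything into a signed-byte range — can be summarized in one hypothetical helper (the mapping and the final clamp are combined here for brevity):

```python
# Sketch of run.sh's exit-code normalization:
#   23 (partial transfer)            -> 100
#   11 (file I/O error)              -> 101
#   255 (ssh connection failure)     -> 101
# anything else is clamped to [-127, 127].
def normalize(code):
    mapped = {23: 100, 11: 101, 255: 101}.get(code, code)
    return max(-127, min(127, mapped))

print(normalize(23), normalize(255), normalize(300))
# -> 100 101 127
```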

55
dockerfiles/rsync-copy/setup.sh Executable file
View File

@ -0,0 +1,55 @@
#!/bin/bash
arr1=($(/app/bin/url_parser ${SOURCE_URL}))
arr2=($(/app/bin/url_parser ${TARGET_URL}))
# MODE / REMOTE_HOST / REMOTE_PORT / REMOTE_ROOT / REMOTE_USER
if [[ (${arr1[0]} == "LOCAL") && ((${arr2[0]} == "SSH") || (${arr2[0]} == "RSYNC")) ]]; then
action="PUSH"
local_root=${arr1[1]}
remote_mode=${arr2[0]}
remote_host=${arr2[1]}
remote_port=${arr2[2]}
remote_root=${arr2[3]}
remote_user=${arr2[4]}
elif [[ (${arr2[0]} == "LOCAL") && ((${arr1[0]} == "SSH") || (${arr1[0]} == "RSYNC")) ]]; then
action="PULL"
local_root=${arr2[1]}
remote_mode=${arr1[0]}
remote_host=${arr1[1]}
remote_port=${arr1[2]}
remote_root=${arr1[3]}
remote_user=${arr1[4]}
else
echo "Only one local and one remote allowed!" >&2
exit 1
fi
cat > /env.sh << EOF
#!/bin/bash
export ACTION=$action
export LOCAL_ROOT=$local_root
export REMOTE_MODE=$remote_mode
export REMOTE_HOST=$remote_host
export REMOTE_PORT=$remote_port
export REMOTE_ROOT=$remote_root
export REMOTE_USER=$remote_user
EOF
env
chmod +x /env.sh
if [[ ${remote_mode} == "SSH" ]]; then
version=$(ssh -p $remote_port ${remote_user}@${remote_host} rsync -V|grep version|awk '{print $3}')
# version="3.2.3"
if dpkg --compare-versions "${version}" ge "3.2.3"; then
touch /rsync_ver_ge_323
fi
ssh -p ${remote_port} ${remote_user}@${remote_host} mkdir -p ${remote_root}
fi
exit $?
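The version gate above only enables the zstd/xxh3 flags when the remote rsync is at least 3.2.3. A Python sketch of the comparison for plain dotted numeric versions (dpkg's full ordering handles richer version strings, so this is a simplification):

```python
# Simplified stand-in for `dpkg --compare-versions A ge B`,
# valid only for purely numeric dotted versions like "3.2.6".
def version_ge(a, b):
    pa = [int(x) for x in a.split(".")]
    pb = [int(x) for x in b.split(".")]
    return pa >= pb  # lexicographic comparison of the numeric parts

print(version_ge("3.2.6", "3.2.3"), version_ge("3.1.9", "3.2.3"))
# -> True False
```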

View File

@ -0,0 +1,27 @@
FROM hub.cstcloud.cn/scalebox/rsync-agent
LABEL maintainer="kaichao"
RUN apt-get update \
&& apt-get install -y openssh-server sudo \
&& apt-get clean autoclean \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
RUN useradd -rm -d /home/ubuntu -s /bin/bash -g root -G sudo -u 1000 test \
&& usermod -aG sudo test \
&& echo 'test:test' | chpasswd
RUN service ssh start
ENV TRANSPORT_TYPE=rsync \
RSYNC_PORT=873 \
SSH_PORT=2222
ENV ACTION_RUN=/usr/local/bin/run.sh
COPY rootfs/ /
COPY --from=hub.cstcloud.cn/scalebox/rsync-base /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
# CMD run.sh

View File

@ -0,0 +1,18 @@
# rsyncd
The rsync module name is 'local'; it maps to the '/local' directory.
## rsync mode
```sh
make run-rsync-mode
```
## rsync over ssh
```sh
make run-rsync-over-ssh
```
## setup
The uid set in rsyncd.conf must have read and write permissions on the local directory; any of the following works:
- set uid = root
- chmod the local directory to 777
- map the uid in rsyncd.conf to an outside user

View File

@ -0,0 +1,35 @@
[global]
charset = utf-8
# uid = nobody
uid = root
gid = nogroup
# max clients
max connections = 10
reverse lookup = no
pid file = /var/run/rsyncd.pid
transfer logging = yes
log format = %t %a %m %f %b
log file = /var/log/rsync.log
exclude = lost+found/
timeout = 900
ignore nonreadable = yes
dont compress = *.gz *.tgz *.zip *.z *.Z *.rpm *.deb *.bz2 *.zst *.xz
[local]
path = /local
read only = no
# list module files
list = yes
comment = remote directory
auth users = user root
#hosts allow = 192.168.0.0/16
secrets file = /etc/rsyncd.secrets
exclude = bin/ boot/ dev/ etc/ proc/ run/ sys/ var/ usr/
strict modes = false

View File

@ -0,0 +1,2 @@
user:cas12345
root:cas12345

View File

@ -0,0 +1,11 @@
#!/bin/bash
set -e
if [ "$TRANSPORT_TYPE" = "rsync" ]; then
echo "starting rsyncd on port $RSYNC_PORT"
rsync --daemon --port $RSYNC_PORT --no-detach --log-file /dev/stdout
else
echo "starting sshd on port $SSH_PORT"
/usr/sbin/sshd -p $SSH_PORT -D
fi

0
examples/README.md Normal file
View File

View File

@ -0,0 +1,41 @@
name: app-primes-g${NUM_GROUPS}-p${NUM_PARALLEL}
label: Prototype for Distributed Primes Calculation
cluster: ${CLUSTER}
parameters:
initial_status: RUNNING
jobs:
scatter:
label: domain decomposition
base_image: app-primes/scatter
schedule_mode: HEAD
parameters:
start_message: ANY
environments:
- NUM_GROUPS=${NUM_GROUPS}
- GROUP_SIZE=${GROUP_SIZE}
sink_jobs:
- calc
calc:
label: calc primes
base_image: app-primes/calc
# schedule_mode: HEAD
hosts:
- ${CALC_HOST}:${NUM_PARALLEL}
parameters:
tasks_per_queue: 500
environments:
- LENGTH=${GROUP_SIZE}
sink_jobs:
- gather
gather:
label: Summary of results
base_image: app-primes/gather
schedule_mode: HEAD
environments:
- NUM_GROUPS=${NUM_GROUPS}
variables:
# should be 'yes', to support session
repeated: yes

View File

@ -0,0 +1,13 @@
FROM python:3.11-slim
LABEL maintainer="kaichao"
COPY --from=hub.cstcloud.cn/scalebox/base /usr/local/sbin /usr/local/sbin
COPY run.sh primes primes.py /app/bin/
ENV LENGTH=10000
RUN mkdir -p /work
WORKDIR /work
ENTRYPOINT ["goagent"]

View File

@ -0,0 +1,4 @@
#!/bin/bash
# $1 : start
# $2 : length
python $(dirname $(which $0))/primes.py $1 $2

View File

@ -0,0 +1,23 @@
#!/usr/bin/python
import sys
def isPrime(n):
    if n < 2:
        return False
    for i in range(2, n - 1):
        if n % i == 0:
            return False
    return True

def getNumPrimes(start, length):
    ret = 0
    for k in range(0, length):
        if isPrime(start + k):
            ret = ret + 1
    return ret

start = int(sys.argv[1])
length = int(sys.argv[2])
print(getNumPrimes(start, length))
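As a quick sanity check of the counting logic, a self-contained re-statement (with the trivially equivalent `range(2, n)` trial-division bound):

```python
# Mirrors getNumPrimes from primes.py: count primes in [start, start+length).
def get_num_primes(start, length):
    def is_prime(n):
        return n >= 2 and all(n % i for i in range(2, n))
    return sum(is_prime(start + k) for k in range(length))

print(get_num_primes(1, 10))   # primes 2, 3, 5, 7 -> 4
```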

15
examples/app-primes/calc/run.sh Executable file
View File

@ -0,0 +1,15 @@
#!/bin/bash
cd /app/bin
set -o pipefail
num=$(/app/bin/primes $1 $LENGTH |tail -1)
code=$?
set +o pipefail
if [ $code -eq 0 ]; then
send-message $1,$num
code=$?
fi
exit $code

View File

@ -0,0 +1,7 @@
FROM hub.cstcloud.cn/scalebox/agent
LABEL maintainer="kaichao"
COPY *.sh /app/bin/
ENV NUM_GROUPS=

View File

@ -0,0 +1,23 @@
#!/bin/bash
# the second column is num
num=$(echo $1 | awk -F "," '{print $2}')
echo $1 >> /work/tmp.txt
# define integer variables
declare -i count sum
line=$(head -1 result.txt)
count=$(echo $line | awk -F " " '{print $1}')
sum=$(echo $line | awk -F " " '{print $2}')
((count++))
sum=$(($sum + $num))
echo -n $count $sum > result.txt
# save the result to t_app, and set the status of the application to FINISHED
if [[ "$count" = "${NUM_GROUPS}" ]]; then
scalebox app set-finished -job-id=${JOB_ID} "Result is "${sum}
fi
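The per-message accumulation can be restated as a small Python sketch (the sample counts below are illustrative values for two hypothetical groups; on disk, result.txt plays the role of the state tuple):

```python
# Mirrors gather/run.sh: fold messages of the form "start,num"
# into a running (count, sum) pair.
def accumulate(state, message):
    count, total = state
    num = int(message.split(",")[1])
    return count + 1, total + num

state = (0, 0)
for msg in ["000000001,1229", "000010001,1033"]:
    state = accumulate(state, msg)
print(state)
# -> (2, 2262)
```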

View File

@ -0,0 +1,4 @@
#!/bin/bash
rm -f /work/result.txt
touch /work/result.txt

View File

@ -0,0 +1,10 @@
# CLUSTER=local
CLUSTER=bio-down
GROUP_SIZE=10000
NUM_GROUPS=100
# CALC_HOST=h0
CALC_HOST=(n[0-3])|(h0)
NUM_PARALLEL=4

View File

@ -0,0 +1,8 @@
FROM hub.cstcloud.cn/scalebox/agent
LABEL maintainer="kaichao"
COPY run.sh /app/bin/
ENV GROUP_SIZE=10000 \
NUM_GROUPS=

View File

@ -0,0 +1,11 @@
#!/bin/bash
declare -i num_groups group_size
num_groups=${NUM_GROUPS}
group_size=${GROUP_SIZE}
for ((i=num_groups*group_size-group_size+1; i>0; i=i-group_size))
do
send-message $(printf "%09d" ${i})
done
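The loop above emits one zero-padded start offset per group, highest first. A Python sketch of the same generation, using the NUM_GROUPS=3, GROUP_SIZE=10000 values from the settings file as an example:

```python
# Mirrors the scatter loop: one 9-digit start offset per group, highest first.
def starts(num_groups, group_size):
    return ["%09d" % i
            for i in range(num_groups * group_size - group_size + 1, 0, -group_size)]

print(starts(3, 10000))
# -> ['000020001', '000010001', '000000001']
```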

View File

@ -0,0 +1,5 @@
FROM hub.cstcloud.cn/scalebox/agent
LABEL maintainer="kaichao"
COPY run.sh /app/bin/

View File

@ -0,0 +1,19 @@
name: hello-scalebox.example.scalebox
cluster: local
parameters:
initial_status: RUNNING
jobs:
hello-scalebox:
base_image: hub.cstcloud.cn/scalebox/hello-scalebox
schedule_mode: HEAD
variables:
repeated: yes
max_seconds_per_task: 100
grpc_server: 127.0.0.1:50051
output_text_size: 1024
text_tranc_mode: HEAD
local_ip_index: 1
locale_mode: NONE
parameters:
start_message: scalebox

8
examples/hello-scalebox/run.sh Executable file
View File

@ -0,0 +1,8 @@
#!/bin/bash
echo "Input message:"$1
echo "Hello, $1!"
scalebox app set-finished -job-id=${JOB_ID} "Hello, Scalebox is OK!"
exit 0

View File

@ -0,0 +1,23 @@
{
"$id": "https://scalebox.dev/message.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "task message",
"description": "A representation of scalebox message",
"type": "object",
"properties": {
"body": {
"description": "message body",
"type": "string"
},
"arrayBody": {
"type": "array",
"items": {
"type": "string"
}
},
"sourceIP": {
"description": "ip_addr for source job",
"type": "string"
}
}
}

View File

@ -0,0 +1,43 @@
{
"$id": "https://scalebox.dev/task_exec.schema.json",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"description": "Interaction data between sidecar scripts and agent",
"type": "object",
"required": [ "statusCode" ],
"properties": {
"statusCode": {
"description": "Formatted Name",
"type": "string"
},
"inputBytes": {
"type": "string"
},
"inputFiles": {
"type": "array",
"items": {
"type": "string"
}
},
"outputBytes": {
"type": "string"
},
"outputFiles": {
"type": "array",
"items": {
"type": "string"
}
},
"timestamps": {
"type": "array",
"items": {
"type": "string"
}
},
"sinkJob": {
"type": "string"
},
"messageText": {
"type": "string"
}
}
}

View File

@ -0,0 +1,3 @@
FROM hub.cstcloud.cn/scalebox/agent
COPY *.sh /app/bin/

13
tests/check_test/app.yaml Normal file
View File

@ -0,0 +1,13 @@
name: check_test.test-app
cluster: local
parameters:
initial_status: RUNNING
jobs:
check_test:
base_image: check_test
schedule_mode: HEAD
variables:
repeated: yes
parameters:
start_message: ANY

8
tests/check_test/check.sh Executable file
View File

@ -0,0 +1,8 @@
#!/bin/bash
# code in [0..2]
code=$(($RANDOM%3))
echo "check_code:$code" >> /work/user-file.txt
exit $code

8
tests/check_test/run.sh Executable file
View File

@ -0,0 +1,8 @@
#!/bin/bash
echo "check_test done!" >> /work/user-file.txt
result_txt=$(cat /work/user-file.txt)
scalebox app set-finished --job-id ${JOB_ID} "$result_txt"
exit 0

View File

@ -0,0 +1,3 @@
FROM hub.cstcloud.cn/scalebox/agent
COPY run.sh /app/bin/

16
tests/retry_test/app.yaml Normal file
View File

@ -0,0 +1,16 @@
name: retry_test.test-app
cluster: local
parameters:
initial_status: RUNNING
jobs:
retry_test:
base_image: retry_test
schedule_mode: HEAD
variables:
repeated: yes
parameters:
start_message: 0
retry_rules: "['1','2:3']"
sink_jobs:
- retry_test

14
tests/retry_test/run.sh Executable file
View File

@ -0,0 +1,14 @@
#!/bin/bash
m=$1
# exit code in [0..3], derived from the message value
# random alternative: exit_code=$((${RANDOM}%3))
exit_code=$(($1%4))
if [ "$m" = "0" ]; then
for ((i = 1; i < 4; i++));do
send-message $i
done
fi
exit $exit_code

View File

@ -0,0 +1,9 @@
FROM hub.cstcloud.cn/scalebox/agent
LABEL maintainer="kaichao"
# ENV TIMEOUT_SECONDS=600
COPY run.sh /app/bin/
ENTRYPOINT [ "goagent" ]

View File

@ -0,0 +1,11 @@
name: task-exec-files.test-app
cluster: local
parameters:
initial_status: RUNNING
jobs:
task-exec-files:
base_image: task-exec-files
parameters:
start_message: 0
schedule_mode: HEAD

62
tests/task-exec-files/run.sh Executable file
View File

@ -0,0 +1,62 @@
#!/bin/bash
m=$1
input_files="\"/etc/passwd\",\"/etc/group\""
output_files="\"/etc/fstab\",\"/etc/hosts\""
input_bytes=9999
output_bytes=999
user_text="user-defined text\nHello,scalebox!"
cat << EOF > /work/timestamps.txt
2008-03-19T18:35:03-08:00
2009-11-05T17:50:20.154+08:00
2010-11-05T17:50:20.154918+08:00
2011-11-05T17:50:20.154918780+08:00
2012-11-17T08:52:21,963572856+08:00
EOF
cat << EOF > /work/user-file.txt
This is user-defined data in a file.
Multi-line is supported.
EOF
if [ "$m" = "0" ]; then
rm -f /work/timestamps.txt /work/user-file.txt
echo "stdout in message-${m}."
echo "stderr in message-${m}." >&2
cat << EOF > /work/task-exec.json
{
"statusCode":0,
"inputBytes":${input_bytes},
"outputBytes":${output_bytes},
"userText":"user-defined text\nHello scalebox in message-${m}",
"timestamps":["2018-03-19T18:35:03-08:00","2019-11-05T17:50:20.154+08:00","2020-11-05T17:50:20.154918+08:00","2021-11-05T17:50:20.154918780+08:00","2022-11-17T08:52:21,963572856+08:00"],
"sinkJob":"task-exec-files",
"messageBody":"1"
}
EOF
elif [ "$m" = "1" ]; then
echo "stdout in message-${m}."
echo "stderr in message-${m}." >&2
cat << EOF > /work/task-exec.json
{
"statusCode":0,
"inputFiles":[${input_files}],
"outputFiles":[${output_files}],
"sinkJob":"task-exec-files",
"messageBody":"2"
}
EOF
elif [ "$m" = "2" ]; then
rm -f /work/task-exec.json
echo $m
else
echo $m
fi
exit 0

View File

@ -0,0 +1,9 @@
FROM hub.cstcloud.cn/scalebox/agent
LABEL maintainer="kaichao"
ENV SLEEP_SECONDS=
COPY run.sh /app/bin/
ENTRYPOINT [ "goagent" ]

View File

@ -0,0 +1,16 @@
name: timeout-gen.test-app
cluster: local
parameters:
initial_status: RUNNING
jobs:
timeout-gen:
base_image: timeout-gen
parameters:
start_message: 0
variables:
max_seconds_per_task: 10
repeated: yes
schedule_mode: HEAD
environments:
- SLEEP_SECONDS=20

12
tests/timeout-gen/run.sh Executable file
View File

@ -0,0 +1,12 @@
#!/bin/bash
echo "SLEEP_SECONDS:"$SLEEP_SECONDS
# sleep $SLEEP_SECONDS
for ((i = 0; i < $SLEEP_SECONDS; i++))
do
sleep 1
date
done
exit 0