1. Accelerating OSS Directly with Jindo
```bash
export ENDPOINT=oss-cn-beijing-internal.aliyuncs.com
export BUCKET=
export AK=
export SK=
```
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: myosssecret
type: Opaque
stringData:
  fs.oss.accessKeyId: ${AK}
  fs.oss.accessKeySecret: ${SK}
EOF
```
```bash
kubectl apply -f - <<EOF
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: myoss-jindo
spec:
  mounts:
    - mountPoint: oss://${BUCKET}/test2/
      options:
        fs.oss.endpoint: ${ENDPOINT}
      encryptOptions:
        - name: fs.oss.accessKeyId
          valueFrom:
            secretKeyRef:
              name: myosssecret
              key: fs.oss.accessKeyId
        - name: fs.oss.accessKeySecret
          valueFrom:
            secretKeyRef:
              name: myosssecret
              key: fs.oss.accessKeySecret
  accessModes:
    - ReadWriteMany
EOF
```
```bash
kubectl apply -f - <<EOF
apiVersion: data.fluid.io/v1alpha1
kind: JindoRuntime
metadata:
  name: myoss-jindo
spec:
  replicas: 2
  tieredstore:
    levels:
      - mediumtype: SSD
        path: /cache
        quota: 40960
        low: "0.1"
EOF
```
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: myoss-jindo
spec:
  containers:
    - name: demo
      image: shaowenchen/demo-ubuntu
      volumeMounts:
        - mountPath: /data
          name: data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: myoss-jindo
EOF
```
1.1 juicefs Benchmark
```bash
juicefs bench --block-size 4096 --big-file-size 1024 --threads 30 ./
BlockSize: 4096 MiB, BigFileSize: 1024 MiB, SmallFileSize: 128 KiB, SmallFileCount: 100, NumThreads: 30
+------------------+---------------+-----------------+
| ITEM             | VALUE         | COST            |
+------------------+---------------+-----------------+
| Write big file   | 1520.22 MiB/s | 20.21 s/file    |
| Read big file    | 1595.94 MiB/s | 19.25 s/file    |
| Write small file | 8.9 files/s   | 3373.29 ms/file |
| Read small file  | 289.1 files/s | 103.79 ms/file  |
| Stat file        | 496.8 files/s | 60.39 ms/file   |
+------------------+---------------+-----------------+
```
1.2 dd Benchmark
```bash
time dd if=/dev/zero of=./dd.txt bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 24.8855 s, 421 MB/s

real    0m25.047s
user    0m0.004s
sys     0m2.857s
```
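dd reports throughput in decimal megabytes per second, i.e. bytes copied divided by elapsed seconds, divided by 10^6. A quick sanity check of the figure above:

```shell
# dd throughput = bytes copied / elapsed seconds, in decimal MB/s
awk 'BEGIN { printf "%.0f MB/s\n", 10485760000 / 24.8855 / 1e6 }'
# -> 421 MB/s
```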
```bash
time dd if=./dd.txt of=/dev/null bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 21.6259 s, 485 MB/s

real    0m21.683s
user    0m0.000s
sys     0m1.451s
```
```bash
time dd if=./dd.txt of=/dev/null bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 19.6688 s, 533 MB/s

real    0m19.692s
user    0m0.004s
sys     0m1.284s
```
2. JuiceFS Community Edition with OSS
2.1 Configure Environment Variables
```bash
export REDIS_IP=x.x.x.x
export REDIS_PORT=6379
export REDIS_USER=default
export REDIS_PASSWORD=mypassword
export REDIS_DIRECTSERVER=redis://${REDIS_USER}:${REDIS_PASSWORD}@${REDIS_IP}:${REDIS_PORT}/1
export ACCESS_KEY=xxx
export SECRET_KEY=xxx
export BUCKET=xxx
export ENDPOINT=oss-cn-beijing-internal.aliyuncs.com
export BUCKET_ENPOINT=$BUCKET.$ENDPOINT
```
2.2 Initialize the File System
```bash
curl -sSL https://d.juicefs.com/install | sh -
```
```bash
juicefs format \
    --storage oss \
    --bucket ${BUCKET_ENPOINT} \
    ${REDIS_DIRECTSERVER} \
    oss-direct
```
3. Mounting JuiceFS Directly on the Host
3.1 juicefs Benchmark
```bash
juicefs mount -d --buffer-size 2000 --max-uploads 150 ${REDIS_DIRECTSERVER} ./oss-direct --cache-dir=/data/jfs-oss-direct
+------------------+-------------------+---------------+
| ITEM             | VALUE             | COST          |
+------------------+-------------------+---------------+
| Write big file   | 2348.54 MiB/s     | 13.08 s/file  |
| Read big file    | 5988.49 MiB/s     | 5.13 s/file   |
| Write small file | 867.1 files/s     | 34.60 ms/file |
| Read small file  | 35705.2 files/s   | 0.84 ms/file  |
| Stat file        | 103844.2 files/s  | 0.29 ms/file  |
| FUSE operation   | 534217 operations | 0.91 ms/op    |
| Update meta      | 9543 operations   | 0.10 ms/op    |
| Put object       | 10680 operations  | 154.07 ms/op  |
| Get object       | 7680 operations   | 71.21 ms/op   |
| Delete object    | 0 operations      | 0.00 ms/op    |
| Write into cache | 4314 operations   | 1.10 ms/op    |
| Read from cache  | 3000 operations   | 0.16 ms/op    |
+------------------+-------------------+---------------+
```
3.2 dd Benchmark
```bash
time dd if=/dev/zero of=./dd.txt bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 5.99897 s, 1.7 GB/s

real    0m6.001s
user    0m0.004s
sys     0m3.112s
```
```bash
time dd if=./dd.txt of=/dev/null bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 28.4491 s, 369 MB/s

real    0m29.033s
user    0m0.000s
sys     0m3.808s
```
```bash
time dd if=./dd.txt of=/dev/null bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 1.65887 s, 6.3 GB/s

real    0m1.660s
user    0m0.000s
sys     0m1.659s
```
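The jump from 369 MB/s on the first read to 6.3 GB/s on the second comes from JuiceFS serving the repeat read out of the local cache directory (`/data/jfs-oss-direct` in the mount command above) rather than OSS. A rough sketch of the cache speedup, computed from the two runs:

```shell
# Second (cached) read vs first (cold) read of the same 10 GB file:
# speedup = cold elapsed time / cached elapsed time
awk 'BEGIN { printf "cache speedup: %.0fx\n", 28.4491 / 1.65887 }'
# -> cache speedup: 17x
```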
4. Mounting JuiceFS in a Pod
4.1 Create the Test Workload
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: juicefs-direct-secret
type: Opaque
stringData:
  metaurl: redis://${REDIS_USER}:${REDIS_PASSWORD}@${REDIS_IP}:6379/1
  access-key: ${ACCESS_KEY}
  secret-key: ${SECRET_KEY}
EOF
```
```bash
kubectl apply -f - <<EOF
apiVersion: data.fluid.io/v1alpha1
kind: Dataset
metadata:
  name: juicefs-direct-demo
spec:
  accessModes:
    - ReadWriteMany
  mounts:
    - name: oss-direct
      mountPoint: "juicefs:///"
      options:
        bucket: ${BUCKET_ENPOINT}
        storage: oss
      encryptOptions:
        - name: metaurl
          valueFrom:
            secretKeyRef:
              name: juicefs-direct-secret
              key: metaurl
        - name: access-key
          valueFrom:
            secretKeyRef:
              name: juicefs-direct-secret
              key: access-key
        - name: secret-key
          valueFrom:
            secretKeyRef:
              name: juicefs-direct-secret
              key: secret-key
EOF
```
Note that the mount `name` here must match the filesystem name given to `juicefs format` (`oss-direct` above).
```bash
kubectl apply -f - <<EOF
apiVersion: data.fluid.io/v1alpha1
kind: JuiceFSRuntime
metadata:
  name: juicefs-direct-demo
spec:
  replicas: 1
  tieredstore:
    levels:
      - mediumtype: SSD
        path: /cache
        quota: 40960
        low: "0.1"
EOF
```
```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: juicefs-direct-demo
spec:
  containers:
    - name: demo
      image: shaowenchen/demo-ubuntu
      volumeMounts:
        - mountPath: /data/jfs
          name: data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: juicefs-direct-demo
EOF
```
4.2 juicefs Benchmark
Exec into the Pod and install the JuiceFS client with `curl -sSL https://d.juicefs.com/install | sh -`.
```bash
juicefs bench --block-size 4096 --big-file-size 1024 --threads 30 ./
+------------------+-------------------+---------------+
| ITEM             | VALUE             | COST          |
+------------------+-------------------+---------------+
| Write big file   | 754.37 MiB/s      | 40.72 s/file  |
| Read big file    | 1808.45 MiB/s     | 16.99 s/file  |
| Write small file | 628.6 files/s     | 47.72 ms/file |
| Read small file  | 1129.1 files/s    | 26.57 ms/file |
| Stat file        | 120037.9 files/s  | 0.25 ms/file  |
| FUSE operation   | 536005 operations | 3.39 ms/op    |
| Update meta      | 9547 operations   | 0.32 ms/op    |
| Put object       | 10680 operations  | 80.47 ms/op   |
| Get object       | 15152 operations  | 50.53 ms/op   |
| Delete object    | 0 operations      | 0.00 ms/op    |
| Write into cache | 0 operations      | 0.00 ms/op    |
| Read from cache  | 0 operations      | 0.00 ms/op    |
+------------------+-------------------+---------------+
```
4.3 dd Benchmark
```bash
time dd if=/dev/zero of=./dd.txt bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 13.198 s, 794 MB/s

real    0m13.199s
user    0m0.004s
sys     0m2.860s
```
```bash
time dd if=./dd.txt of=/dev/null bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 34.8118 s, 301 MB/s

real    0m35.162s
user    0m0.004s
sys     0m3.222s
```
```bash
time dd if=./dd.txt of=/dev/null bs=4M count=2500
10485760000 bytes (10 GB, 9.8 GiB) copied, 1.48848 s, 7.0 GB/s

real    0m1.490s
user    0m0.000s
sys     0m1.489s
```
5. Summary
| Scenario | Big-file write | Big-file read | Small-file write | Small-file read |
| --- | --- | --- | --- | --- |
| Jindo-accelerated OSS | 1520.22 MiB/s | 1595.94 MiB/s | 8.9 files/s | 289.1 files/s |
| JuiceFS + OSS on the host | 2348.54 MiB/s | 5988.49 MiB/s | 867.1 files/s | 35705.2 files/s |
| JuiceFS + OSS in a Pod | 754.37 MiB/s | 1808.45 MiB/s | 628.6 files/s | 1129.1 files/s |
| Scenario | Write speed | First read | Second read |
| --- | --- | --- | --- |
| Jindo-accelerated OSS | 421 MB/s | 485 MB/s | 533 MB/s |
| JuiceFS + OSS on the host | 1.7 GB/s | 369 MB/s | 6.3 GB/s |
| JuiceFS + OSS in a Pod | 794 MB/s | 301 MB/s | 7.0 GB/s |
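To put the benchmark numbers in perspective, here is a quick sketch of the relative gains of host-mounted JuiceFS over Jindo, computed from the figures above:

```shell
# Host JuiceFS vs Jindo, ratios from the juicefs bench results above
awk 'BEGIN {
  printf "big-file write:   %.1fx\n", 2348.54 / 1520.22
  printf "big-file read:    %.1fx\n", 5988.49 / 1595.94
  printf "small-file write: %.0fx\n", 867.1 / 8.9
  printf "small-file read:  %.0fx\n", 35705.2 / 289.1
}'
```

The gap is modest for big files but two orders of magnitude for small files, where JuiceFS's metadata engine and local cache dominate.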
Based on these results, on Alibaba Cloud, simply using JindoRuntime to mount OSS into Pods as a PVC is sufficient for model-inference workloads.
Using Fluid to accelerate object storage directly in this way is highly recommended for loading models at inference time: it avoids deploying a metadata service for JuiceFS, and it provides two-way synchronization between the PVC and OSS, which greatly simplifies operations.