tidb-in-action

1.2.6 Importing Data with Lightning on a Kubernetes Cluster

1. Background

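When Mydumper + Loader export and import data with multiple threads, every statement still has to pass through TiDB's SQL parser, so TiDB's compute capacity becomes the new bottleneck. This led to another idea: skip SQL parsing during import, convert the data directly into KV key-value pairs, and write them to the TiKV cluster.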

TiDB Lightning overall architecture:

[Figure: TiDB Lightning overall architecture]

TiDB Lightning consists of two main components: tidb-lightning and tikv-importer.

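On Kubernetes, tikv-importer is part of the TiDB cluster Helm chart and is deployed as a StatefulSet with a single replica (replicas=1). tidb-lightning lives in a separate Helm chart and is deployed as a Job.

To restore data with TiDB Lightning, both tikv-importer and tidb-lightning must be deployed, each in its own step.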

2. Deploying tikv-importer

tikv-importer can be enabled either on an existing TiDB cluster or when a new TiDB cluster is created.
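
Both paths can be sketched as follows, assuming the Helm 2 syntax used elsewhere in this section and the `importer.create` chart value that the demo in part 4 sets. The release name, namespace, and chart version are placeholders borrowed from that demo; the `run` helper and the `--reuse-values` flag are additions for illustration. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: enable tikv-importer through the tidb-cluster Helm chart.
# RELEASE, NAMESPACE and CHART_VERSION are placeholders; adjust them to your setup.
RELEASE="${RELEASE:-cluster-2}"
NAMESPACE="${NAMESPACE:-test-cluster}"
CHART_VERSION="${CHART_VERSION:-v1.1.0-beta.2}"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Existing cluster: keep the current values and switch the importer on.
enable_importer_on_existing() {
  run helm upgrade "$RELEASE" pingcap/tidb-cluster \
    --version="$CHART_VERSION" --reuse-values --set importer.create=true
}

# New cluster: switch the importer on at install time (Helm 2 syntax).
enable_importer_on_new() {
  run helm install pingcap/tidb-cluster --name="$RELEASE" \
    --namespace="$NAMESPACE" --version="$CHART_VERSION" \
    --set importer.create=true
}

enable_importer_on_existing
```

Reviewing the printed command before rerunning with `DRY_RUN=0` is a cheap way to catch a wrong release name, since `helm upgrade` changes a live cluster.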

3. Deploying tidb-lightning

(1) Configure TiDB Lightning

Use the following command to obtain the default TiDB Lightning configuration.

helm inspect values pingcap/tidb-lightning --version=<chart-version> > tidb-lightning-values.yaml

The tidb-lightning Helm chart can restore backup data from either local or remote storage.
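
As a rough sketch, the backup source can be chosen with `--set-string` values instead of editing the values file. The `dataSource.adhoc.pvcName` key is the one used in the demo in part 4; the `dataSource.local.nodeName` and `dataSource.local.hostPath` keys are an assumption about the chart's values layout and should be checked against the `tidb-lightning-values.yaml` generated above. The helper function names are hypothetical, and the remote source is not covered here.

```shell
#!/bin/sh
# Sketch: build the --set-string argument that selects a backup source.
# Key names under dataSource.local are assumptions; verify them against
# tidb-lightning-values.yaml before use.

# Backup directory that sits on one node's local disk.
local_source_flags() {
  node="$1"
  path="$2"
  echo "dataSource.local.nodeName=${node},dataSource.local.hostPath=${path}"
}

# Backup stored in an existing PVC (the form used in the demo).
adhoc_source_flags() {
  pvc="$1"
  echo "dataSource.adhoc.pvcName=${pvc}"
}

# Print both variants using the node, path and PVC names from the demo.
local_source_flags tidb-operator-worker2 /mnt/disks/20/fullbackup-202003080800
adhoc_source_flags fullbackup-202003080800
```

The printed string would be passed as `--set-string <flags>` to the `helm install pingcap/tidb-lightning` command.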

(2) Deploy TiDB Lightning

helm install pingcap/tidb-lightning --name=<tidb-lightning-release-name> --namespace=<namespace> --set failFast=true -f tidb-lightning-values.yaml --version=<chart-version>

4. Demo

Run a full logical backup with Mydumper

The namespace test-cluster contains two clusters: cluster-1 and cluster-2.

View the MySQL data in cluster-1:

'select * from cloud.test_tbl;'
+----+------------+--------+------------+
| id | title      | author | date       |
+----+------------+--------+------------+
|  1 | K8s        | shonge | 2020-03-07 |
|  2 | operator   | shonge | 2020-03-07 |
|  3 | kubernetes | shonge | 2020-03-07 |
+----+------------+--------+------------+

Create the Secret that holds the TiDB credentials, deploy the tidb-backup chart in the same namespace as the cluster, and check the backup Job:

kubectl create secret generic backup-secret --namespace=test-cluster --from-literal=user=root --from-literal=password=<root_password>
helm install pingcap/tidb-backup --version=v1.1.0-beta.2 --name backup-cluster-1 --namespace test-cluster --set-string clusterName=cluster-1,storage.size=500Gi
kubectl -n test-cluster get job -l app.kubernetes.io/instance=backup-cluster-1
NAME                            COMPLETIONS   DURATION   AGE
basic-fullbackup-202003080800   1/1           3s         3m32s
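
The check of the COMPLETIONS column can be scripted. The sketch below is a hypothetical helper (`jobs_completed` is not part of kubectl) that reads `kubectl get job` output on stdin and succeeds only when every listed Job reports n/n completions.

```shell
#!/bin/sh
# Sketch: decide from `kubectl get job` output whether every listed Job
# has finished, i.e. its COMPLETIONS column reads n/n.
jobs_completed() {
  awk '
    $1 == "NAME" || NF == 0 { next }   # skip the header row and blank lines
    {
      seen = 1
      split($2, c, "/")                # COMPLETIONS looks like "1/1"
      if (c[1] != c[2]) bad = 1
    }
    END {
      if (seen && !bad) exit 0         # at least one Job, all complete
      exit 1
    }
  '
}

# Usage (requires cluster access, so it is left commented out):
# kubectl -n <namespace> get job -l app.kubernetes.io/instance=backup-cluster-1 \
#   | jobs_completed && echo "backup finished"
```

For a single known Job name, `kubectl wait --for=condition=complete job/<job-name>` is an alternative that blocks until the Job finishes.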

Find the mount path of the backup PV.

kubectl -n test-cluster get pvc -l app.kubernetes.io/instance=backup-cluster-1

NAME                      STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    AGE
fullbackup-202003080800   Bound    local-pv-2a2853fb   77Gi       RWO            local-storage   62m

kubectl describe pv local-pv-2a2853fb

Name:              local-pv-2a2853fb
Labels:            kubernetes.io/hostname=tidb-operator-worker2
Annotations:       pv.kubernetes.io/bound-by-controller: yes
                   pv.kubernetes.io/provisioned-by: local-volume-provisioner-tidb-operator-worker2-9d6bdbba-89ff-4180-9917-35b4dda3a3db
Finalizers:        [kubernetes.io/pv-protection]
StorageClass:      local-storage
Status:            Bound
Claim:             test-cluster/fullbackup-202003080800
Reclaim Policy:    Delete
Access Modes:      RWO
VolumeMode:        Filesystem
Capacity:          500Gi
Node Affinity:
  Required Terms:
    Term 0:        kubernetes.io/hostname in [tidb-operator-worker2]
Message:
Source:
    Type:  LocalVolume (a persistent volume backed by local storage on a node)
    Path:  /mnt/disks/20
Events:    <none>
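
The two fields needed for the next step, the node name and the local path, can be pulled out of the `kubectl describe pv` output with standard text tools. This is a minimal sketch with hypothetical helper names; it assumes the output layout shown above.

```shell
#!/bin/sh
# Sketch: extract the local path and the node name from
# `kubectl describe pv` output supplied on stdin.

# The path is printed on a line of the form "    Path:  /mnt/disks/20".
pv_local_path() {
  awk '$1 == "Path:" { print $2 }'
}

# The node appears in the label kubernetes.io/hostname=<node>.
pv_node() {
  sed -n 's/.*kubernetes\.io\/hostname=\([^ ,]*\).*/\1/p' | head -n 1
}

# Usage (requires cluster access, so it is left commented out):
# kubectl describe pv local-pv-2a2853fb | pv_local_path
# kubectl describe pv local-pv-2a2853fb | pv_node
```

For local volumes, `kubectl get pv <pv-name> -o jsonpath='{.spec.local.path}'` should return the same path without any text parsing.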

Inspect the backup files, using a kind worker node as an example:

docker exec -ti tidb-operator-worker2 ls /mnt/disks/20/fullbackup-202003080800

cloud-schema-create.sql                   mysql.opt_rule_blacklist-schema.sql
cloud.test_tbl-schema.sql                 mysql.role_edges-schema.sql
cloud.test_tbl.sql                        mysql.stats_buckets-schema.sql
metadata                                  mysql.stats_feedback-schema.sql
mysql-schema-create.sql                   mysql.stats_histograms-schema.sql
mysql.GLOBAL_VARIABLES-schema.sql         mysql.stats_histograms.sql
mysql.GLOBAL_VARIABLES.sql                mysql.stats_meta-schema.sql
mysql.bind_info-schema.sql                mysql.stats_meta.sql
mysql.columns_priv-schema.sql             mysql.stats_top_n-schema.sql
mysql.db-schema.sql                       mysql.tables_priv-schema.sql
mysql.default_roles-schema.sql            mysql.tidb-schema.sql
mysql.expr_pushdown_blacklist-schema.sql  mysql.tidb.sql
mysql.gc_delete_range-schema.sql          mysql.user-schema.sql
mysql.gc_delete_range_done-schema.sql     mysql.user.sql
mysql.global_priv-schema.sql              test-schema-create.sql
mysql.help_topic-schema.sql
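
The listing follows a naming pattern: `<db>-schema-create.sql` creates a database, `<db>.<table>-schema.sql` creates a table, and `<db>.<table>.sql` holds the rows. The sketch below relies on that pattern to list only the tables that have data files; the function name is hypothetical, and chunked dumps with numeric suffixes are not handled.

```shell
#!/bin/sh
# Sketch: list the tables that have data files in a Mydumper backup directory.
# Assumes the naming seen in the listing above.
list_data_tables() {
  dir="$1"
  for f in "$dir"/*.sql; do
    [ -e "$f" ] || continue                  # directory without .sql files
    name=$(basename "$f" .sql)
    case "$name" in
      *-schema|*-schema-create) continue ;;  # DDL files carry no row data
    esac
    echo "$name"
  done
}

# Usage: run it wherever the backup directory is readable, for example
# list_data_tables /mnt/disks/20/fullbackup-202003080800
```

Comparing this list with the tables expected in the target cluster is a quick sanity check before starting a restore.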

Restore data with Lightning

helm upgrade cluster-2 --set-string importer.create=true pingcap/tidb-cluster
helm install pingcap/tidb-lightning --version=v1.1.0-beta.2 --name restore-cluster-1 --namespace test-cluster --set-string dataSource.adhoc.pvcName='fullbackup-202003080800',targetTidbCluster.name='cluster-2'
kubectl -n test-cluster get job -l app.kubernetes.io/name='restore-cluster-1-tidb-lightning'

NAME                               COMPLETIONS   DURATION   AGE
restore-cluster-1-tidb-lightning   1/1           3s         9m3s

Query cluster-2 to confirm that the restored data matches the source:

MySQL [(none)]> select * from cloud.test_tbl;
+----+------------+--------+------------+
| id | title      | author | date       |
+----+------------+--------+------------+
|  1 | K8s        | shonge | 2020-03-07 |
|  2 | operator   | shonge | 2020-03-07 |
|  3 | kubernetes | shonge | 2020-03-07 |
+----+------------+--------+------------+
3 rows in set (0.01 sec)
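
For larger tables, comparing the two clusters by eye does not scale. The following is a minimal sketch of an automated comparison; the host names (`cluster-1-tidb`, `cluster-2-tidb`), port 4000, and the root user are assumptions for illustration, the function names are hypothetical, and the `mysql` client is assumed to read the password from the `MYSQL_PWD` environment variable.

```shell
#!/bin/sh
# Sketch: compare one table between the source and the restored cluster.
# Host names, port and user are assumptions; adjust them to your setup.

# Dump a table as tab-separated rows without a header, ordered by the
# first column so that both dumps list rows in the same order.
dump_table() {
  host="$1"
  table="$2"
  mysql -h "$host" -P 4000 -u root -N -B \
    -e "SELECT * FROM ${table} ORDER BY 1"
}

# Succeed only when the two dump files are byte-for-byte identical.
dumps_match() {
  cmp -s "$1" "$2"
}

# Usage (requires access to both clusters, so it is left commented out):
# dump_table cluster-1-tidb cloud.test_tbl > /tmp/src.tsv
# dump_table cluster-2-tidb cloud.test_tbl > /tmp/dst.tsv
# dumps_match /tmp/src.tsv /tmp/dst.tsv && echo "restore verified"
```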