BlueKing Monitor 3.2: Scaling Out Transfer

陈锦恒[Snow]

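[Abstract]

When a customer environment carries a large volume of data, a single transfer instance may not process it fast enough, so transfer has to be scaled out to increase its data-processing capacity. The procedure below raises the Kafka partition count of the topic for data ID 1001 from 1 to 2 and updates the matching metadata.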

 

[Body]

 

1. Stop the monitor (cron) process

 

```shell

# Run on: the control host
source /data/install/utils.fc

# 1. Check the current status
for i in "${BKMONITORV3_MONITOR_IP[@]}";
do
    ssh "$i" "/opt/py36/bin/python /opt/py36/bin/supervisorctl \
       -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf status \
       metadata_config_cron:metadata_config_cron0";
done

# 2. Stop the process
for i in "${BKMONITORV3_MONITOR_IP[@]}";
do
    ssh "$i" "/opt/py36/bin/python /opt/py36/bin/supervisorctl \
       -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf stop \
       metadata_config_cron:metadata_config_cron0";
done

# 3. Confirm that it has stopped
for i in "${BKMONITORV3_MONITOR_IP[@]}";
do
    ssh "$i" "/opt/py36/bin/python /opt/py36/bin/supervisorctl \
       -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf status \
       metadata_config_cron:metadata_config_cron0";
done

```
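The status, stop, and start loops differ only in the supervisorctl action, so they can be folded into one helper. The following is a minimal sketch, not part of the original procedure: the function name `monitor_cron` and the `DRY_RUN` switch are hypothetical, and it assumes `BKMONITORV3_MONITOR_IP` is a bash array populated by `/data/install/utils.fc`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original article): run one supervisorctl
# action against metadata_config_cron on every monitor host.
# Assumes BKMONITORV3_MONITOR_IP is a bash array set by utils.fc.
monitor_cron() {
    local action="$1"
    local host
    # Only allow the three actions used in this procedure.
    case "$action" in
        status|stop|start) ;;
        *) echo "usage: monitor_cron status|stop|start" >&2; return 2 ;;
    esac
    local cmd="/opt/py36/bin/python /opt/py36/bin/supervisorctl -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf $action metadata_config_cron:metadata_config_cron0"
    for host in "${BKMONITORV3_MONITOR_IP[@]}"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Dry run: print the command instead of executing it.
            echo "ssh $host $cmd"
        else
            ssh "$host" "$cmd"
        fi
    done
}
```

With this in place, the three steps become `monitor_cron status`, `monitor_cron stop`, and `monitor_cron status`. Running with `DRY_RUN=1` first prints the exact commands without touching any host.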

 

2. Check the configuration

 

```shell

# Run on: the control host
# Record a timestamp for this change
export BACKUPTIME=$(date +%Y%m%d_%H%M%S)
mkdir -p /data/.variation
echo "SPDB20200623-2" > /data/.variation/$BACKUPTIME

source /data/install/utils.fc

# 1. Check ZK
ssh $ZK_IP
cd /data/bkee/service/zk/bin
./zkCli.sh \
    -server zk.service.consul:2181 \
    get /gse/config/etc/dataserver/data/1001 2>&1 | grep data_set | jq .

# Returns:
# [
#   {
#      "server_id": -1,
#      "data_set": "0bkmonitor_10010",
#      "partition": 1,
#      "cluster_index": 500,
#      "biz_id": 0,
#      "msg_system": 1,
#      "bkmonitor_config": true
#   },
#   {
#      "data_set": "snapshot",
#      "partition": 0,
#      "cluster_index": 1,
#      "biz_id": 2,
#      "msg_system": 3,
#      "type": 1
#   }
# ]
exit  # leave the ZK host

# 2. Check Kafka
ssh $KAFKA_IP
cd /data/bkee/service/kafka/bin
./kafka-topics.sh \
    --describe \
    --zookeeper zk.service.consul:2181/common_kafka \
    --topic 0bkmonitor_10010

# Returns:
# Topic:0bkmonitor_10010 PartitionCount:1 ReplicationFactor:2 Configs:
#   Topic: 0bkmonitor_10010 Partition: 0 Leader: 1 Replicas: 1,3 Isr: 3,1
exit  # leave the Kafka host

# 3. Check MySQL
mysql -u$MYSQL_USER -p$MYSQL_PASS -h$MYSQL_IP
mysql> USE bkmonitorv3_alert;
mysql> SELECT * FROM metadata_kafkatopicinfo WHERE bk_data_id=1001;

# Returns:
# +----+------------+-----------------+-----------+
# | id | bk_data_id | topic           | partition |
# +----+------------+-----------------+-----------+
# |  1 |       1001 | 0bkmonitor_10010|         1 |
# +----+------------+-----------------+-----------+

```

 

 

 

3. Add a Kafka partition

 

```shell

# Run on: $KAFKA_IP
cd /data/bkee/service/kafka/bin

# 1. Increase the partition count
./kafka-topics.sh \
    --alter \
    --zookeeper zk.service.consul:2181/common_kafka \
    --topic 0bkmonitor_10010 \
    --partitions 2

# Returns:
# WARNING: If partitions are increased for a topic that has a key, the partition logic or ordering of the messages will be affected
# Adding partitions succeeded!

# 2. Verify
./kafka-topics.sh \
    --describe \
    --zookeeper zk.service.consul:2181/common_kafka \
    --topic 0bkmonitor_10010

# Returns:
# Topic:0bkmonitor_10010 PartitionCount:2 ReplicationFactor:2 Configs:
#   Topic: 0bkmonitor_10010 Partition: 0 Leader: 1 Replicas: 1,3 Isr: 3,1
#   Topic: 0bkmonitor_10010 Partition: 1 Leader: 1 Replicas: 1,2 Isr: 1,2

```
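To check the result in a script rather than by eye, the partition count can be pulled out of the `--describe` output. This is a minimal sketch under stated assumptions: `partition_count` is a hypothetical helper name, and it assumes the summary line contains `PartitionCount:` followed by a number, as in the output shown above (the exact spacing can differ between Kafka versions).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original article): read
# `kafka-topics.sh --describe` output on stdin and print the number that
# follows "PartitionCount:" on the first line that contains it.
partition_count() {
    sed -n 's/.*PartitionCount:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

A possible use on the Kafka host is `./kafka-topics.sh --describe --zookeeper zk.service.consul:2181/common_kafka --topic 0bkmonitor_10010 | partition_count`, then comparing the printed value with the target of 2 before moving on to the database update.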

 

 

 

4. Update the database record

 

```mysql

-- Run in the client opened with: mysql -u$MYSQL_USER -p$MYSQL_PASS -h$MYSQL_IP

USE bkmonitorv3_alert;

-- 1. Update the partition count
UPDATE metadata_kafkatopicinfo SET `partition`=2 WHERE bk_data_id=1001;

-- 2. Verify
SELECT * FROM metadata_kafkatopicinfo WHERE bk_data_id=1001;
-- Returns:
-- +----+------------+-----------------+-----------+
-- | id | bk_data_id | topic           | partition |
-- +----+------------+-----------------+-----------+
-- |  1 |       1001 | 0bkmonitor_10010|         2 |
-- +----+------------+-----------------+-----------+

```

 

 

 

5. Verify

 

```shell

# Run on: the control host
source /data/install/utils.fc

ssh $BKMONITORV3_TRANSFER_IP
cd /data/bkee/bkmonitorv3/transfer
./transfer shadow | grep 1001

# Expected: two instances are returned
exit  # return to the control host

ssh $ZK_IP
cd /data/bkee/service/zk/bin
./zkCli.sh \
    -server zk.service.consul:2181 \
    get /gse/config/etc/dataserver/data/1001 2>&1 | grep data_set | jq .

# Returns:
# [
#   {
#      "server_id": -1,
#      "data_set": "0bkmonitor_10010",
#      "partition": 2,
#      "cluster_index": 500,
#      "biz_id": 0,
#      "msg_system": 1,
#      "bkmonitor_config": true
#   },
#   {
#      "data_set": "snapshot",
#      "partition": 0,
#      "cluster_index": 1,
#      "biz_id": 2,
#      "msg_system": 3,
#      "type": 1
#   }
# ]

```

 

 

 

6. Start the monitor (cron) process

 

```shell

# Run on: the control host
source /data/install/utils.fc

# 1. Check the current status
for i in "${BKMONITORV3_MONITOR_IP[@]}";
do
    ssh "$i" "/opt/py36/bin/python /opt/py36/bin/supervisorctl \
       -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf status \
       metadata_config_cron:metadata_config_cron0";
done

# 2. Start the process
for i in "${BKMONITORV3_MONITOR_IP[@]}";
do
    ssh "$i" "/opt/py36/bin/python /opt/py36/bin/supervisorctl \
       -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf start \
       metadata_config_cron:metadata_config_cron0";
done

# 3. Confirm that it is running
for i in "${BKMONITORV3_MONITOR_IP[@]}";
do
    ssh "$i" "/opt/py36/bin/python /opt/py36/bin/supervisorctl \
       -c /data/bkee/etc/supervisor-bkmonitorv3-monitor.conf status \
       metadata_config_cron:metadata_config_cron0";
done

```

 

 
