在接入大量数据插入到CK时,遇到了一个插入报错,记录如下

报错信息

Too many partitions for single INSERT block (more than 100)

详细信息:

The limit is controlled by ‘max_partitions_per_insert_block’ setting. Large number of partitions is a common misconception. It will lead to severe negative performance impact, including slow server startup, slow INSERT queries and slow SELECT queries. Recommended total number of partitions for a table is under 1000..10000. Please note, that partitioning is not intended to speed up SELECT queries (ORDER BY key is sufficient to make range queries fast). Partitions are intended for data manipulation (DROP PARTITION, etc). (version 19.17.4.11 (official build))

max_partitions_per_insert_block (官方文档)参数用来限制单个插入Block中,包含的最大分区数量,默认值为100。设置为0时,表示不限制。

修改方法:

  1. 配置文件修改: 在users.xml配置文件中进行配置。配置在 块中。CH的配置文件是即时生效,不需要重启服务🙅‍♂️

  2. 在一个会话中临时修改:SET max_partitions_per_insert_block=1000。用于临时导入大量数据的情况。

客户端解决方法:

为了防止可能出现额极限情况,批量插入时,在逻辑中判断涉及到的分区是否超过max_partitions_per_insert_block,如果超过,则立即执行一次插入。

以下是官方文档原文:

Settings

There are multiple ways to make all the settings described below. Settings are configured in layers, so each subsequent layer redefines the previous settings.

Ways to configure settings, in order of priority:

  • Settings in the users.xml server configuration file.

    Set in the element .

  • Session settings.

    Send SET setting=value from the ClickHouse console client in interactive mode. Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you need to specify the session_id HTTP parameter.

  • Query settings.

    • When starting the ClickHouse console client in non-interactive mode, set the startup parameter –setting=value.

    • When using the HTTP API, pass CGI parameters (URL?setting_1=value&setting_2=value…).

Settings that can only be made in the server config file are not covered in this section.