Documented Sink connector configuration in README

Kanthi Subramanian 2022-11-24 12:56:18 -05:00
parent 34864f929a
commit 57fe038828
2 changed files with 25 additions and 2 deletions

@@ -107,6 +107,29 @@ mvn install -DskipTests=true
| YEAR | INT32 | INT32 |
| GEOMETRY | Binary of WKB | String |
### Sink Connector Configuration
| Property                          | Default | Description                                                                                                                                           |
|-----------------------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------|
| tasks.max                         | (none)  | Number of SinkConnector tasks (essentially threads); ideally this should match the number of Kafka partitions.                                        |
| topics.regex                      | (none)  | Regex of matching topics. Example: "SERVER5432.test.(.*)" matches SERVER5432.test.employees and SERVER5432.test.products.                             |
| topics                            | (none)  | The list of topics. Either topics or topics.regex has to be provided.                                                                                 |
| clickhouse.server.url             | (none)  | ClickHouse server URL.                                                                                                                                |
| clickhouse.server.user            | (none)  | ClickHouse server username.                                                                                                                           |
| clickhouse.server.pass            | (none)  | ClickHouse server password.                                                                                                                           |
| clickhouse.server.database        | (none)  | ClickHouse database name.                                                                                                                             |
| clickhouse.server.port            | 8123    | ClickHouse server port.                                                                                                                               |
| clickhouse.topic2table.map        | (none)  | Map of Kafka topics to table names, <topic_name1>:<table_name1>,<topic_name2>:<table_name2>. Overrides the default mapping of topics to table names. |
| store.kafka.metadata              | false   | If set to true, Kafka metadata columns are added to the ClickHouse tables.                                                                            |
| store.raw.data                    | false   | If set to true, the entire row is converted to JSON and stored in the column defined by `store.raw.data.column`.                                      |
| store.raw.data.column             | (none)  | ClickHouse table column to store the raw data in JSON form (String ClickHouse data type).                                                             |
| metrics.enable                    | true    | Enable Prometheus scraping.                                                                                                                           |
| metrics.port                      | 8084    | Metrics port.                                                                                                                                         |
| buffer.flush.time.ms              | 30      | Buffer (batch of records) flush time in milliseconds.                                                                                                 |
| thread.pool.size                  | 10      | Number of threads used to connect to ClickHouse.                                                                                                      |
| auto.create.tables                | false   | If set to true, the sink connector creates tables in ClickHouse if they do not exist.                                                                 |
| snowflake.id                      | true    | Use Snowflake ID (timestamp + GTID) as the version column for ReplacingMergeTree.                                                                     |
| replacingmergetree.delete.column  | "sign"  | Column used as the sign column for ReplacingMergeTree.                                                                                                |
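
As an illustration of how these properties fit together, here is a minimal sketch of a connector properties file; the connector.class value and all server, credential, and topic values are assumptions for the example, not taken from this commit.

```properties
# Minimal illustrative sink configuration (placeholder values only).
name=clickhouse-sink-connector
# NOTE: connector.class is assumed here for illustration.
connector.class=com.altinity.clickhouse.sink.connector.ClickHouseSinkConnector
tasks.max=4
topics.regex=SERVER5432.test.(.*)
clickhouse.server.url=localhost
clickhouse.server.port=8123
clickhouse.server.user=root
clickhouse.server.pass=secret
clickhouse.server.database=test
auto.create.tables=true
buffer.flush.time.ms=30
snowflake.id=true
replacingmergetree.delete.column=sign
```

With topics.regex as above, records from SERVER5432.test.employees would land in an employees table unless clickhouse.topic2table.map overrides the mapping.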
## ClickHouse Loader (Load Data from MySQL to CH for Initial Load)
[ClickHouse Loader](python/README.md) is a program that loads data dumped from MySQL into a CH database compatible with the sink connector (ReplacingMergeTree with virtual columns `_version` and `_sign`).
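
For concreteness, a hypothetical DDL for such a sink-compatible table might look like the sketch below; the table name and data columns are invented for the example, and only the engine choice and the `_version`/`_sign` columns reflect the description above.

```sql
-- Hypothetical sink-compatible table: ReplacingMergeTree keyed on the
-- _version column, with _sign used by the connector to mark deleted rows.
CREATE TABLE test.employees
(
    emp_no   Int32,
    name     String,
    _version UInt64,
    _sign    Int8
)
ENGINE = ReplacingMergeTree(_version)
ORDER BY emp_no;
```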

@@ -214,7 +214,7 @@ public class ClickHouseSinkConnectorConfig extends AbstractConfig {
.define(
ClickHouseSinkConnectorConfigVariables.STORE_KAFKA_METADATA,
Type.BOOLEAN,
"true",
"false",
Importance.LOW,
"True, if the kafka metadata needs to be stored in Clickhouse tables, false otherwise",
CONFIG_GROUP_CONNECTOR_CONFIG,
@@ -254,7 +254,7 @@ public class ClickHouseSinkConnectorConfig extends AbstractConfig {
.define(
ClickHouseSinkConnectorConfigVariables.STORE_RAW_DATA_COLUMN,
Type.STRING,
"false",
"",
Importance.LOW,
"Column name to store the raw data(JSON form), only applicable if STORE_RAW_DATA is set to True",
CONFIG_GROUP_CONNECTOR_CONFIG,