5 ways to migrate from MySQL to ClickHouse

5 ways to migrate from MySQL to ClickHouse

Data migration needs to be imported from MySQL to ClickHouse. The summary plan is as follows, including three methods supported by ClickHouse itself and two third-party tools.

create table engin mysql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
 name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
 name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
 ...
 INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
 INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = MySQL('host:port', 'database', 'table', 'user', 'password'[, replace_query, 'on_duplicate_clause']);

Official documentation: https://clickhouse.yandex/docs/en/operations/table_engines/mysql/

Note that the actual data is stored in a remote MySQL database, which can be understood as a foreign table.

You can verify this by adding and deleting data in MySQL.

insert into select from
-- Create a table first CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
 name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
 name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
 ...
) ENGINE = engine
-- Import data INSERT INTO [db.]table [(c1, c2, c3)] select column or * from mysql('host:port', 'db', 'table_name', 'user', 'password')

You can customize the column type and number of columns, and use clickhouse functions to process data, such as

select toDate(xx) from mysql("host:port","db","table_name","user_name","password")
create table as select from
CREATE TABLE [IF NOT EXISTS] [db.]table_name
ENGINE = Log
AS 
SELECT *
FROM mysql('host:port', 'db', 'article_clientuser_sum', 'user', 'password')

Netizen article: http://jackpgao.github.io/2018/02/04/ClickHouse-Use-MySQL-Data/

Custom columns are not supported, and the ENGIN=MergeTree test written by the blogger in the reference failed.

It can be understood as a combination of create table and insert into select

Altinity/clickhouse-mysql-data-reader

Altinity company open sourced a Python tool for migrating data from MySQL to ClickHouse (supports incremental updates of binlog and full import), but the official readme is out of sync with the code, and it does not work according to the quick start.

## Create table clickhouse-mysql \
 --src-host=127.0.0.1 \
 --src-user=reader \
 --src-password=Qwerty1# \
 --table-templates-with-create-database \
 --src-table=airline.ontime > create_clickhouse_table_template.sql
## Modify the script vim create_clickhouse_table_template.sql
## Import and create table clickhouse-client -mn < create_clickhouse_table_template.sql
## Import data into clickhouse-mysql \
 --src-host=127.0.0.1 \
 --src-user=reader \
 --src-password=Qwerty1# \
 --table-migrate \
 --dst-host=127.0.0.1 \
 --dst-table=logunified \
 --csvpool

Official documentation: https://github.com/Altinity/clickhouse-mysql-data-reader#mysql-migration-case-1—migrate-existing-data

Note that the above three methods are all imported from MySQL to ClickHouse. If the amount of data is large, it will put a lot of pressure on MySQL. The following are two offline methods (streamsets support both real-time and offline)

csv

## Ignore creating table clickhouse-client \
 -h host \
 --query="INSERT INTO [db].table FORMAT CSV" < test.csv

However, if the source data is of low quality, there will often be problems, such as special characters (delimiters, escape characters) or line breaks. I was badly cheated.

Custom delimiter, --format_csv_delimiter="|"
When an error occurs, skip it instead of aborting. --input_format_allow_errors_num=10 allows up to 10 rows of errors, --input_format_allow_errors_ratio=0.1 allows 10% errors. csv skips null values ​​and reports Code: 27. DB::Exception: Cannot parse input: expected , before: xxxx: (at row 69) ERROR: garbage after Nullable(Date): "8,002<LINE FEED>0205" sed ' :a;s/,,/,\\N,/g;ta' |clickhouse-client -h host --query "INSERT INTO [db].table FORMAT CSV" Replace ,, with ,\N,
python clean_csv.py --src=src.csv --dest=dest.csv --chunksize=50000 --cols --encoding=utf-8 --delimiter=,

clean_csv.py Refer to my other article 032-csv file fault tolerance processing

streamsets

Streamsets supports full import from MySQL or reading CSV, and also supports incremental insertion by subscribing to binlog. Please refer to my other article 025-Big Data ETL Tool StreamSets Installation and Subscription to MySQL Binlog.

This article only shows how to import clickhouse from mysql

This article assumes that you have already set up the streamsets service

Enable and restart the service

Upload the jdbc jar and dependency packages of mysql and clickhouse

Convenient way, create pom.xml and use maven to download it uniformly

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
 <modelVersion>4.0.0</modelVersion>
 <groupId>com.anjia</groupId>
 <artifactId>demo</artifactId>
 <packaging>jar</packaging>
 <version>1.0-SNAPSHOT</version>
 <name>demo</name>
 <url>http://maven.apache.org</url>
 <dependencies>
 <dependency>
 <groupId>ru.yandex.clickhouse</groupId>
 <artifactId>clickhouse-jdbc</artifactId>
 <version>0.1.54</version>
 </dependency>
 <dependency>
 <groupId>mysql</groupId>
 <artifactId>mysql-connector-java</artifactId>
 <version>5.1.47</version>
 </dependency>
 </dependencies>
</project>

If Maven is installed locally, execute the following command

mvn dependency:copy-dependencies -DoutputDirectory=lib -DincludeScope=compile

All required jar files will be downloaded and copied to the lib directory.

Then copy it to streamsets /opt/streamsets-datacollector-3.9.1/streamsets-libs-extras/streamsets-datacollector-jdbc-lib/lib/ directory

Restart the streamsets service

       

Summarize

The above are the 5 methods of migrating MySQL to ClickHouse introduced by the editor. I hope it will be helpful to everyone. If you have any questions, please leave me a message and the editor will reply to you in time. I would also like to thank everyone for their support of the 123WORDPRESS.COM website!
If you find this article helpful, please feel free to reprint it and please indicate the source. Thank you!

You may also be interested in:
  • Detailed steps for migrating the data folder of the MySQL database
  • Detailed explanation of how to migrate a MySQL database to another machine
  • Summary of methods for migrating Oracle database to MySQL
  • Migrate mysql database to Oracle database
  • MySQL database migration quickly exports and imports large amounts of data
  • How to use mysql Innodb tablespace unloading, migration, and loading
  • Detailed explanation of MySQL data migration--data directory direct replacement precautions
  • Detailed explanation of mysql5.5 database data directory migration method
  • MySQL backup and migration data synchronization method
  • A MySQL migration plan and practical record of pitfalls

<<:  How to deploy SpringBoot project using Docker

>>:  ElementUI implements the el-form form reset function button

Recommend

Elegant practical record of introducing iconfont icon library into vue

Table of contents Preface Generate SVG Introducti...

Issues with Rancher deployment and importing K8S clusters

Rancher deployment can have three architectures: ...

CentOS7 deployment Flask (Apache, mod_wsgi, Python36, venv)

1. Install Apache # yum install -y httpd httpd-de...

jQuery implements ad display and hide animation

We often see ads appear after a few seconds and t...

Ubuntu16.04 builds php5.6 web server environment

Ubuntu 16.04 installs the PHP7.0 environment by d...

How to inherit CSS line-height

How is Line-height inherited?Write a specific val...

MySQL database import and export data error solution example explanation

Exporting Data Report an error SHOW VARIABLES LIK...

Example of how to upload a Docker image to a private repository

The image can be easily pushed directly to the Do...

MySQL quickly inserts 100 million test data

Table of contents 1. Create a table 1.1 Create te...

Use of Linux relative and absolute paths

01. Overview Absolute paths and relative paths ar...

MySQL 5.7.17 and workbench installation and configuration graphic tutorial

This article shares the installation and configur...

React error boundary component processing

This is the content of React 16. It is not the la...

Vue implements bottom query function

This article example shares the specific code of ...

How to install and configure Redis in CentOS7

Introduction There is no need to introduce Redis ...

JavaScript canvas to load pictures

This article shares the specific code of JavaScri...