shyiko/mysql-binlog-connector-java

MySQL Binary Log connector

Top Related Projects

4,062

Maxwell's daemon, a mysql-to-json kafka producer

28,667

Alibaba's MySQL binlog incremental subscription & consumption component

10,544

Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

Quick Overview

MySQL Binary Log Connector (Java) is a library that allows Java developers to connect to and parse MySQL binary log events. It provides a way to capture and process data changes in real-time, making it useful for various applications such as data replication, auditing, and event-driven architectures.

Pros

  • Supports both MySQL and MariaDB
  • Can listen in the calling thread (client.connect()) or in a separate thread (client.connect(timeout))
  • Provides a simple and intuitive API for consuming binary log events

Cons

  • The original repository is no longer maintained; its README recommends migrating to osheroff/mysql-binlog-connector-java
  • Requires elevated database privileges to access binary logs
  • May have a learning curve for developers unfamiliar with MySQL replication concepts
  • Performance can be impacted when processing large volumes of data changes
  • Limited documentation for advanced use cases

Code Examples

  1. Connecting to MySQL and starting event processing:
BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
client.registerEventListener(event -> {
    // Process the event
    System.out.println(event);
});
client.connect();
  2. Filtering events by database and table:
EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
    EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
    EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
);
client.setEventDeserializer(eventDeserializer);

client.registerEventListener(event -> {
    if (event.getHeader().getEventType() == EventType.TABLE_MAP) {
        TableMapEventData data = event.getData();
        if ("mydatabase".equals(data.getDatabase()) && "mytable".equals(data.getTable())) {
            // Process events for the specified database and table
        }
    }
});
  3. Handling specific event types:
client.registerEventListener(event -> {
    EventHeader header = event.getHeader();
    EventData data = event.getData();

    if (header.getEventType() == EventType.WRITE_ROWS) {
        WriteRowsEventData writeData = (WriteRowsEventData) data;
        // Process inserted rows
    } else if (header.getEventType() == EventType.UPDATE_ROWS) {
        UpdateRowsEventData updateData = (UpdateRowsEventData) data;
        // Process updated rows
    } else if (header.getEventType() == EventType.DELETE_ROWS) {
        DeleteRowsEventData deleteData = (DeleteRowsEventData) data;
        // Process deleted rows
    }
});
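
Once an update event has been identified, a common next step is to work out which columns actually changed. The following is a minimal sketch, assuming each updated row is available as a "before" and an "after" array of column values; RowDiff and changedColumns are hypothetical names used for illustration and are not part of the library.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical helper: compares the before/after images of one updated row. */
final class RowDiff {
    private RowDiff() {}

    /** Returns the zero-based indexes of columns whose value differs between the two images. */
    static List<Integer> changedColumns(Serializable[] before, Serializable[] after) {
        if (before.length != after.length) {
            throw new IllegalArgumentException("column count mismatch");
        }
        List<Integer> changed = new ArrayList<>();
        for (int i = 0; i < before.length; i++) {
            // deepEquals compares byte[] contents rather than references,
            // which matters for values delivered as byte arrays
            if (!Objects.deepEquals(before[i], after[i])) {
                changed.add(i);
            }
        }
        return changed;
    }
}
```

The indexes are column positions only; mapping them to column names needs schema information from elsewhere, because the binary log does not carry it.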

Getting Started

  1. Add the dependency to your project:
<dependency>
    <groupId>com.github.shyiko</groupId>
    <artifactId>mysql-binlog-connector-java</artifactId>
    <version>0.21.0</version>
</dependency>
  2. Create a BinaryLogClient instance and connect:
BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
client.registerEventListener(event -> {
    // Process events here
});
client.connect();
  3. Ensure the MySQL user has the necessary privileges:
GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'your_user'@'%';
  4. Configure MySQL to enable binary logging in my.cnf:
[mysqld]
server-id        = 1
log_bin          = /var/log/mysql/mysql-bin.log
binlog_format    = row

Competitor Comparisons

4,062

Maxwell's daemon, a mysql-to-json kafka producer

Pros of Maxwell

  • Higher-level abstraction, providing a complete solution for MySQL replication and streaming
  • Built-in JSON output with producers for destinations such as Kafka, Kinesis, RabbitMQ, and Redis
  • Ships as a ready-to-run daemon, so no application code is needed for basic streaming

Cons of Maxwell

  • Less flexibility for custom implementations compared to the lower-level mysql-binlog-connector-java
  • Potentially higher resource usage due to additional features and abstractions
  • May have a steeper learning curve for users who only need basic binlog parsing functionality

Code Comparison

Maxwell (Java):

public class Maxwell {
    public static void main(String[] args) throws Exception {
        MaxwellConfig config = new MaxwellConfig(args);
        Maxwell maxwell = new Maxwell(config);
        maxwell.run();
    }
}

mysql-binlog-connector-java (Java):

BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
client.registerEventListener(event -> {
    // Process binlog event
});
client.connect();

The code snippets demonstrate the higher-level abstraction provided by Maxwell compared to the more low-level approach of mysql-binlog-connector-java. Maxwell encapsulates configuration and execution in a single class, while mysql-binlog-connector-java requires more manual setup and event handling.

28,667

Alibaba's MySQL binlog incremental subscription & consumption component

Pros of Canal

  • More comprehensive solution with support for multiple databases and data synchronization
  • Better performance and scalability for large-scale data processing
  • Active development and maintenance by Alibaba, with frequent updates and improvements

Cons of Canal

  • Steeper learning curve due to its complexity and extensive features
  • Requires more setup and configuration compared to mysql-binlog-connector-java
  • Primarily documented in Chinese, which may be challenging for non-Chinese speakers

Code Comparison

mysql-binlog-connector-java:

BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
client.registerEventListener(event -> {
    // Process event
});
client.connect();

Canal:

CanalConnector connector = CanalConnectors.newSingleConnector(new InetSocketAddress("hostname", 11111), "example", "username", "password");
connector.connect();
connector.subscribe(".*\\..*");
Message message = connector.getWithoutAck(100);
List<CanalEntry.Entry> entries = message.getEntries();
// Process entries

Both projects aim to capture and process MySQL binary logs, but Canal offers a more robust and scalable solution suitable for enterprise-level applications. mysql-binlog-connector-java provides a simpler, more straightforward approach for smaller-scale projects or those primarily focused on Java integration. The choice between the two depends on the specific requirements of the project, including scalability needs, supported databases, and the development team's expertise.

10,544

Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.

Pros of Debezium

  • Supports multiple databases (MySQL, PostgreSQL, MongoDB, etc.), not just MySQL
  • Provides a complete CDC (Change Data Capture) solution with Kafka integration
  • Offers more advanced features like schema evolution and data type conversions

Cons of Debezium

  • More complex setup and configuration compared to mysql-binlog-connector-java
  • Higher resource consumption due to its comprehensive feature set
  • Steeper learning curve for developers new to CDC or Kafka ecosystems

Code Comparison

mysql-binlog-connector-java:

BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
client.registerEventListener(event -> {
    // Process event
});
client.connect();

Debezium:

Configuration config = Configuration.create()
    .with("name", "mysql-connector")
    .with("connector.class", "io.debezium.connector.mysql.MySqlConnector")
    .with("database.hostname", "mysql")
    .with("database.port", "3306")
    .with("database.user", "debezium")
    .with("database.password", "dbz")
    .with("database.server.id", "184054")
    .with("database.server.name", "dbserver1")
    .with("database.include.list", "inventory")
    .with("include.schema.changes", "true")
    .build();

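// Sketch using Debezium's older embedded API (EmbeddedEngine); newer releases use DebeziumEngine instead
EmbeddedEngine engine = EmbeddedEngine.create()
    .using(config)
    .notifying(record -> {
        // Process change record
    })
    .build();
engine.run();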

README

mysql-binlog-connector-java

ATTENTION: This repository is no longer maintained. I recommend migrating to osheroff/mysql-binlog-connector-java.

MySQL Binary Log connector.

The project initially started as a fork of open-replicator, but ended up as a complete rewrite. Key differences/features:

  • automatic binlog filename/position | GTID resolution
  • resumable disconnects
  • pluggable failover strategies
  • binlog_checksum=CRC32 support (for MySQL 5.6.2+ users)
  • secure communication over TLS
  • JMX-friendly
  • real-time stats
  • availability in Maven Central
  • no third-party dependencies
  • test suite over different versions of MySQL releases

If you are looking for something similar in other languages - check out siddontang/go-mysql (Go), noplay/python-mysql-replication (Python).

Usage

Get the latest JAR(s) from here. Alternatively, you can include the following Maven dependency (available through Maven Central):

<dependency>
    <groupId>com.github.shyiko</groupId>
    <artifactId>mysql-binlog-connector-java</artifactId>
    <version>0.21.0</version>
</dependency>

Reading binary log file

File binlogFile = ...
EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
    EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
    EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
);
BinaryLogFileReader reader = new BinaryLogFileReader(binlogFile, eventDeserializer);
try {
    for (Event event; (event = reader.readEvent()) != null; ) {
        ...
    }
} finally {
    reader.close();
}

Tapping into MySQL replication stream

PREREQUISITES: Whichever user you plan to use for the BinaryLogClient MUST have the REPLICATION SLAVE privilege. Unless you specify binlogFilename/binlogPosition yourself (in which case automatic resolution won't kick in), you'll need REPLICATION CLIENT granted as well.

BinaryLogClient client = new BinaryLogClient("hostname", 3306, "username", "password");
EventDeserializer eventDeserializer = new EventDeserializer();
eventDeserializer.setCompatibilityMode(
    EventDeserializer.CompatibilityMode.DATE_AND_TIME_AS_LONG,
    EventDeserializer.CompatibilityMode.CHAR_AND_BINARY_AS_BYTE_ARRAY
);
client.setEventDeserializer(eventDeserializer);
client.registerEventListener(new EventListener() {

    @Override
    public void onEvent(Event event) {
        ...
    }
});
client.connect();

You can register a listener for onConnect / onCommunicationFailure / onEventDeserializationFailure / onDisconnect using client.registerLifecycleListener(...).

By default, BinaryLogClient starts from the current (at the time of connect) master binlog position. If you wish to kick off from a specific filename or position, use client.setBinlogFilename(filename) + client.setBinlogPosition(position).
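
To resume from a known point after a restart, the application has to remember the last filename and position it processed. Below is a minimal sketch of such a checkpoint, assuming a local properties file is an acceptable store. BinlogCheckpoint, its methods, and the property keys are hypothetical names and not part of the library; the loaded values are what you would pass to client.setBinlogFilename(...) and client.setBinlogPosition(...).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical helper: persists the last processed binlog coordinates. */
final class BinlogCheckpoint {
    final String filename;
    final long position;

    BinlogCheckpoint(String filename, long position) {
        this.filename = filename;
        this.position = position;
    }

    /** Writes the coordinates to a properties file, replacing any previous content. */
    void save(Path file) {
        Properties props = new Properties();
        props.setProperty("binlog.filename", filename);
        props.setProperty("binlog.position", Long.toString(position));
        try (OutputStream out = Files.newOutputStream(file)) {
            props.store(out, "last processed binlog coordinates");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the saved coordinates, or null if nothing usable has been saved yet. */
    static BinlogCheckpoint load(Path file) {
        if (!Files.exists(file)) {
            return null;
        }
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String name = props.getProperty("binlog.filename");
        String pos = props.getProperty("binlog.position");
        if (name == null || pos == null) {
            return null;
        }
        return new BinlogCheckpoint(name, Long.parseLong(pos));
    }
}
```

The sketch overwrites the file in place and does not guard against a crash in the middle of a write; a production setup would likely want an atomic write or a transactional store.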

client.connect() is blocking (meaning that client will listen for events in the current thread). client.connect(timeout), on the other hand, spawns a separate thread.

Controlling event deserialization

You might need it for several reasons: you don't want to waste time deserializing events you won't need; there is no EventDataDeserializer defined for the event type you are interested in (or there is but it contains a bug); you want certain type of events to be deserialized in a different way (perhaps *RowsEventData should contain table name and not id?); etc.

EventDeserializer eventDeserializer = new EventDeserializer();

// do not deserialize EXT_DELETE_ROWS event data, return it as a byte array
eventDeserializer.setEventDataDeserializer(EventType.EXT_DELETE_ROWS, 
    new ByteArrayEventDataDeserializer()); 

// skip EXT_WRITE_ROWS event data altogether
eventDeserializer.setEventDataDeserializer(EventType.EXT_WRITE_ROWS, 
    new NullEventDataDeserializer());

// use custom event data deserializer for EXT_DELETE_ROWS
eventDeserializer.setEventDataDeserializer(EventType.EXT_DELETE_ROWS, 
    new EventDataDeserializer() {
        ...
    });

BinaryLogClient client = ...
client.setEventDeserializer(eventDeserializer);

Exposing BinaryLogClient through JMX

MBeanServer mBeanServer = ManagementFactory.getPlatformMBeanServer();

BinaryLogClient binaryLogClient = ...
ObjectName objectName = new ObjectName("mysql.binlog:type=BinaryLogClient");
mBeanServer.registerMBean(binaryLogClient, objectName);

// following bean accumulates various BinaryLogClient stats 
// (e.g. number of disconnects, skipped events)
BinaryLogClientStatistics stats = new BinaryLogClientStatistics(binaryLogClient);
ObjectName statsObjectName = new ObjectName("mysql.binlog:type=BinaryLogClientStatistics");
mBeanServer.registerMBean(stats, statsObjectName);

Using SSL

Introduced in 0.4.0.

TLSv1.1 & TLSv1.2 require JDK 7+.
Prior to MySQL 5.7.10, MySQL supported only TLSv1 (see Secure Connection Protocols and Ciphers).

To check that MySQL server is properly configured with SSL support - mysql -h host -u root -ptypeyourpasswordmaybe -e "show global variables like 'have_%ssl';" ("Value" should be "YES"). State of the current session can be determined using \s ("SSL" should not be blank).

System.setProperty("javax.net.ssl.trustStore", "/path/to/truststore.jks");
System.setProperty("javax.net.ssl.trustStorePassword", "truststore.password");
System.setProperty("javax.net.ssl.keyStore", "/path/to/keystore.jks");
System.setProperty("javax.net.ssl.keyStorePassword", "keystore.password");

BinaryLogClient client = ...
client.setSSLMode(SSLMode.VERIFY_IDENTITY);

Implementation notes

  • data of numeric types (tinyint, etc.) is always returned signed(!), regardless of whether the column definition includes the "unsigned" keyword.
  • data of var*/*text/*blob types is always returned as a byte array (for var* this is true starting from 1.0.0).
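
Given the two notes above, a consumer that knows a column is declared UNSIGNED, or that a byte array holds text, has to convert the value itself. The following is a minimal sketch of those conversions. It assumes integer values arrive sign-extended as int (or long for bigint) and that text is UTF-8 encoded; both are assumptions to verify against your library version and column character sets. UnsignedColumns and its method names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for reinterpreting values of columns declared UNSIGNED. */
final class UnsignedColumns {
    private UnsignedColumns() {}

    /** TINYINT UNSIGNED: keep the low 8 bits, so -1 becomes 255. */
    static int tinyint(int signed) {
        return signed & 0xFF;
    }

    /** SMALLINT UNSIGNED: keep the low 16 bits. */
    static int smallint(int signed) {
        return signed & 0xFFFF;
    }

    /** MEDIUMINT UNSIGNED: keep the low 24 bits. */
    static int mediumint(int signed) {
        return signed & 0xFFFFFF;
    }

    /** INT UNSIGNED: widen to long, because the unsigned range does not fit in int. */
    static long integer(int signed) {
        return Integer.toUnsignedLong(signed);
    }

    /** BIGINT UNSIGNED: the unsigned range does not fit in long, so use BigInteger. */
    static BigInteger bigint(long signed) {
        return new BigInteger(Long.toUnsignedString(signed));
    }

    /** Decodes a byte array as text; UTF-8 is an assumption about the column charset. */
    static String text(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Whether a column is unsigned is not part of the row data, so the caller has to look it up from the table definition and apply the conversion only where it applies.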

Frequently Asked Questions

Q. What does a typical transaction look like?

A. GTID event (if gtid_mode=ON) -> QUERY event with "BEGIN" as sql -> ... -> XID event | QUERY event with "COMMIT" or "ROLLBACK" as sql.
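
The sequence above can be used to buffer events until a transaction ends. The following is a minimal sketch under simplifying assumptions: events are reduced to a hypothetical Kind enum, and QUERY events are assumed to have been classified already by their SQL text ("BEGIN", "COMMIT", "ROLLBACK"). TransactionGrouper is an illustrative name and not part of the library.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: groups a flat stream of events into transactions. */
final class TransactionGrouper {
    /** Simplified event kinds; QUERY events are pre-classified by their SQL text. */
    enum Kind { GTID, BEGIN, ROWS, XID, COMMIT, ROLLBACK }

    private final List<Kind> current = new ArrayList<>();
    private final List<List<Kind>> completed = new ArrayList<>();

    /** Buffers the event and closes the current transaction on XID, COMMIT or ROLLBACK. */
    void accept(Kind kind) {
        current.add(kind);
        if (kind == Kind.XID || kind == Kind.COMMIT || kind == Kind.ROLLBACK) {
            completed.add(new ArrayList<>(current));
            current.clear();
        }
    }

    /** Transactions that have reached a terminating event, in arrival order. */
    List<List<Kind>> completed() {
        return completed;
    }

    /** Number of buffered events belonging to a transaction that has not ended yet. */
    int pending() {
        return current.size();
    }
}
```

Transactions that end with ROLLBACK are still reported as completed here; a consumer would typically check the last element and discard those.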

Q. EventData for inserted/updated/deleted rows has no information about table (except for some weird id). How do I make sense out of it?

A. Each WriteRowsEventData/UpdateRowsEventData/DeleteRowsEventData event is preceded by TableMapEventData, which contains the schema & table name. If for some reason you need to know column names (types, etc.), the easiest way is to

select TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION, COLUMN_DEFAULT, IS_NULLABLE, 
DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH, NUMERIC_PRECISION, NUMERIC_SCALE, 
CHARACTER_SET_NAME, COLLATION_NAME from INFORMATION_SCHEMA.COLUMNS;
# see https://dev.mysql.com/doc/refman/5.6/en/columns-table.html for more information

(yes, binary log DOES NOT include that piece of information).

You can find JDBC snippet here.
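
Because the table name arrives in a separate, preceding event, a listener usually keeps a small lookup from table id to name. The following is a minimal sketch of that bookkeeping. TableNameCache and its methods are hypothetical names; it assumes the table id is available on both the table-map data and the rows data (for example through a getTableId() accessor, which this document does not show).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: remembers which table each table id refers to. */
final class TableNameCache {
    private final Map<Long, String> namesById = new HashMap<>();

    /** Call for every table-map event; a later mapping for the same id replaces the earlier one. */
    void onTableMap(long tableId, String database, String table) {
        namesById.put(tableId, database + "." + table);
    }

    /** Returns "database.table" for a rows event's table id, or null if no mapping was seen. */
    String resolve(long tableId) {
        return namesById.get(tableId);
    }
}
```

In a listener, onTableMap would be fed from TableMapEventData (getDatabase() and getTable() as used earlier), and resolve would be called when a write/update/delete rows event arrives.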

Documentation

API overview

There are two entry points: BinaryLogClient (which you can use to read binary logs from a MySQL server) and BinaryLogFileReader (for offline log processing). Both rely on EventDeserializer to deserialize the stream of events. Each Event consists of an EventHeader (containing, among other things, a reference to the EventType) and EventData. The aforementioned EventDeserializer has one EventHeaderDeserializer (EventHeaderV4Deserializer by default) and a collection of EventDataDeserializers. If no EventDataDeserializer is registered for a particular type of Event, the default EventDataDeserializer (NullEventDataDeserializer) kicks in.

MySQL Internals Manual

For insight into the internals of MySQL, look here. The MySQL Client/Server Protocol and The Binary Log sections are particularly useful as reference documentation for the **.binlog.network and **.binlog.event packages.

Real-world applications

Some of the OSS using / built on top of mysql-binlog-connector-java:

  • apache/nifi An easy to use, powerful, and reliable system to process and distribute data.
  • debezium A low latency data streaming platform for change data capture (CDC).
  • mavenlink/changestream A stream of changes for MySQL built on Akka.
  • mardambey/mypipe MySQL binary log consumer with the ability to act on changed rows and publish changes to different systems with emphasis on Apache Kafka.
  • ngocdaothanh/mydit MySQL to MongoDB data replicator.
  • sharetribe/dumpr A Clojure library for live replicating data from a MySQL database.
  • shyiko/rook Generic Change Data Capture (CDC) toolkit.
  • streamsets/datacollector Continuous big data ingestion infrastructure.
  • twingly/ecco MySQL replication binlog parser in JRuby.
  • zendesk/maxwell A MySQL-to-JSON Kafka producer.
  • zzt93/syncer A tool to sync & manipulate data from MySQL/MongoDB to ES/Kafka/MySQL, which makes an 'Eventual Consistency' promise.

It's also used on a large scale at MailChimp. You can read about it here.

Development

git clone https://github.com/shyiko/mysql-binlog-connector-java.git
cd mysql-binlog-connector-java
mvn # shows how to build, test, etc. project

Contributing

In lieu of a formal styleguide, please take care to maintain the existing coding style.
Executing mvn checkstyle:check within project directory should not produce any errors.
If you are willing to install vagrant (required by integration tests) it's highly recommended to check (with mvn clean verify) that there are no test failures before sending a pull request.
Additional tests for any new or changed functionality are also very welcome.

License

Apache License, Version 2.0