[Bytebase] Database Schema Change and Version Control
Bytebase is an open-source tool for database schema change and version control for teams. It mainly addresses the collaboration problem between developers and DBAs (database administrators) when they change database schemas.
Author | Version | Date | Content | Notes
Allen | V0.0.1 | 2021/11/26 | Initial draft completed |
Allen | V0.0.2 | 2021/11/29 | Added local run example |
Allen | V1.0.0 | 2021/11/30 | Added detailed introduction |
Allen | V1.0.1 | 2021/12/01 | Added VCS workflow |
1. Background
1.1 Why we build Bytebase
Database schema management is one of the two fundamental tasks in developing any non-trivial application. It covers the stateful aspect of an application. Its counterpart is source code management, which covers the stateless aspect. While using GitLab/GitHub to manage source code has become standard practice, the tooling around database schema management is still lacking.
For schema management, frameworks such as Ruby on Rails and Django come with built-in schema migration support, and many teams use tools such as Liquibase and Flyway to manage the database schema. These are all viable options. However, we believe the existing options deliver only a partial solution. For example:
- There is no web-based workspace providing UX and workflow optimized for the collaboration among DBAs and Developers. (Like how Figma delivers such an experience to Designers, Product Managers and Developers)
- There is no cohesive experience to deliver an end-to-end integration between schema management and VCS. (Like how Terraform delivers such an experience for managing cloud infrastructure).
- All existing tools merely serve as a building block in the large CI pipeline, mostly as a step in the entire CI pipeline. However, the tool itself doesn't gather info from the CI context and catch signals from the database instance to provide a holistic view of the schema state across all development environments, spanning all history timelines.
In short, unlike code management, there is no equivalent tool like GitLab/GitHub which provides a comprehensive solution to manage database schema, and Bytebase wants to fill this gap. It's like GitLab, but for managing database schema related tasks instead of code related tasks.
It's also like Terraform: Bytebase integrates with VCS to manage database schema. This is known as database-as-code, a concept similar to Terraform's infrastructure-as-code, which integrates with VCS to manage cloud infrastructure.
1.2 Existing database tools
1.2.1 Navicat
[Navicat] Visual database development tool: https://www.navicat.com.cn
1.2.2 CloudQuery
[CloudQuery] Database management and control platform
1.2.3 Flyway
[Flyway] Database migration tool
1.2.4 Gh-ost
[Gh-ost] Online DDL changes
2. Technical overview
2.1 What is Bytebase
Open source, self-hosted, web-based, zero-config, dependency-free database schema change and version control management tool for Developers and DBAs.
2.2 Introduction
Bytebase is a database schema change and version control management tool for teams. It consists of a web console and a backend. The backend has a migration core to manage database schema changes. It also integrates with VCS to enable version controlled schema management.
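2.3 Technology stack
3. Detailed introduction
3.1 Glossary
VCS: version control system, a system that records changes to the contents of one or more files so that specific revisions can be consulted later. A version control system can be applied not only to the text files of software source code but to files of any type. Commonly used ones include SVN and Git;
GitLab EE: GitLab Enterprise Edition;
GitLab CE: GitLab Community Edition;
Figma: a browser-based collaborative UI design tool;
CI: Continuous Integration;
CD: Continuous Delivery or Continuous Deployment. Once CI has automated the build, unit tests, and integration tests, continuous delivery automatically releases the validated code to a repository. Continuous deployment goes one step further and automatically releases the application to the production environment.
3.2 Overall architecture
3.2.1 Functional applications
3.2.2 Data model
3.3 Core features
3.3.1 Creating resources
3.3.1.1 Create the admin account
3.3.1.2 Create an Environment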
Environment models after various environments in the development pipeline such as test, staging, prod. Most of the time, there is a 1:1 mapping between Environment and the real environment.
Most of the time, Owners and DBAs work with the Environment.
By default, only the Test and Prod environments exist.
Create additional environments as needed, such as a development environment (Dev), and set the corresponding approval policy and backup plan.
Arrange the order of the environments.
3.3.1.3 Create an Instance
Database Instance or simply Instance models after a single database instance which is usually accessed via a host:port address. A typical database instance could be your on-premises MySQL instance, an AWS RDS instance etc. Each Database Instance belongs to an Environment.
Most of the time, Owners and DBAs work with the Database Instance.
3.3.1.4 Create a Database
Database refers to a single database from a Database Instance. A database is the one created by 'CREATE DATABASE xxx'. A database always belongs to a single Project.
Most of the time, Developers and DBAs work with the Database.
Add an existing database to a project.
3.3.1.5 Create a Project
Project is a logical unit that models a team effort. It's similar to the project concept in other dev tools such as Jira and GitLab. Project is the container that groups logically related Databases, Issues and Users together. In Bytebase, a Database or an Issue always belongs to a single Project. Project is also the peer entity of the VCS repository when setting up the version control workflow.
Most of the time, Developers work with the Project.
3.3.1.6 Submit an Issue
Issue represents a specific collaboration activity between Developer and DBA such as creating a database, altering a schema. It's similar to the issue concept in other issue management tools.
In Bytebase, Issue is optimized for the database domain. An Issue always belongs to a Project. A single Issue deals with only one particular Database Instance (e.g. creating a database on a database instance). Except for database-creation issues, most issues are also associated with an existing Database (e.g. altering a table on a database).
Internally, the issue progression is represented by a Pipeline. A Pipeline contains multiple Stages, each of which usually corresponds to an Environment. A Stage contains multiple Tasks, each dealing with a specific database operation such as altering a table. A single Task can run multiple times (e.g. it fails first and is then retried). Each run is represented by a Task Run.
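The Pipeline, Stage, Task and Task Run hierarchy described above can be pictured with a small sketch. The class and field names here are illustrative assumptions chosen to mirror the prose; they are not taken from Bytebase's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskRun:
    """One execution attempt of a Task (hypothetical fields)."""
    status: str  # e.g. "FAILED" or "DONE"


@dataclass
class Task:
    """A single database operation, such as altering a table."""
    statement: str
    runs: List[TaskRun] = field(default_factory=list)

    def run(self, status: str) -> TaskRun:
        # Every attempt, including a retry, is recorded as a new TaskRun.
        task_run = TaskRun(status=status)
        self.runs.append(task_run)
        return task_run


@dataclass
class Stage:
    """A group of Tasks that usually corresponds to one Environment."""
    environment: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Pipeline:
    """The progression of one Issue across its Stages."""
    stages: List[Stage] = field(default_factory=list)


task = Task(statement="ALTER TABLE foo ADD COLUMN bar INT")
pipeline = Pipeline(stages=[Stage("Test", [task]), Stage("Prod", [])])
task.run("FAILED")  # the first attempt fails
task.run("DONE")    # the retry succeeds
```

After the two calls, the single Task holds two Task Runs, which matches the "failed first, then retried" case in the text.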
Rollback
3.3.1.7 Add members
3.3.1.8 Associate with a project
3.3.2 UI workflow
This is the classic SQL review workflow: the developer submits a SQL review ticket directly from Bytebase and waits for the assigned DBA or a peer developer to review it. Bytebase applies the SQL change after the review is approved.
3.3.3 Version control workflow
The VCS Integration is a 4-step setup. You can check this demo issue created by observing the code commit to see what it looks like after the setup.
3.3.3.1 Add Git Provider
This can only be performed by the "Workspace Owner" with the help of the selected Git provider instance admin. It only needs to be configured once for each Git provider.
3.3.3.2 Enable Version Control Workflow in Project
Configure project to use "Version control workflow" and link the project with a repository from the Git provider configured in Step 1. This can only be performed by the "Project Owner".
3.3.3.3 Name and Organize Schema Files
Organize the repository schema files according to the configured base directory and file path template in step 2. Afterwards, the file changes can be observed and identified by Bytebase to apply the schema changes to the corresponding database.
The default file path template is {{ENV_NAME}}/{{DB_NAME}}__{{VERSION}}__{{TYPE}}__{{DESCRIPTION}}.sql
Let's say the base directory is bytebase:
- An example file path for normal migration type: bytebase/env1/db1__202101131000__migrate__create_tablefoo_for_bar.sql
- An example file path for baseline migration type: bytebase/env1/db1__202101131000__baseline__create_tablefoo_for_bar.sql
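As a rough illustration of how the template expands into the example paths above, the following sketch fills in each placeholder and prepends the base directory. The helper name render_path is hypothetical and is not part of Bytebase; only the template string and the example values come from the text.

```python
import re

DEFAULT_TEMPLATE = "{{ENV_NAME}}/{{DB_NAME}}__{{VERSION}}__{{TYPE}}__{{DESCRIPTION}}.sql"


def render_path(base_dir, template, values):
    """Fill every {{NAME}} placeholder and prepend the base directory."""
    def substitute(match):
        # Raises KeyError if the template names a placeholder we have no value for.
        return values[match.group(1)]

    return base_dir + "/" + re.sub(r"\{\{(\w+)\}\}", substitute, template)


path = render_path("bytebase", DEFAULT_TEMPLATE, {
    "ENV_NAME": "env1",
    "DB_NAME": "db1",
    "VERSION": "202101131000",
    "TYPE": "migrate",
    "DESCRIPTION": "create_tablefoo_for_bar",
})
print(path)
```

Changing the TYPE value to baseline yields the second example path.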
3.3.3.4 Create the First Baseline Migration
To bootstrap the VCS integration, Bytebase needs to know the current schema of the corresponding live database. This is achieved by using a baseline migration script which includes the entire schema of that live database. The first migration script after the setup should always be a baseline migration script so that Bytebase can establish the baseline of the current schema in the corresponding live database.
3.4 Extended features
3.4.1 SQL Review
3.4.2 Anomaly Center
3.4.3 Notifications (webhook)
Slack
Discord
Microsoft Teams
DingTalk(钉钉)
Feishu(飞书)
WeCom(企业微信)
WeCom does not provide its own official guide. Please follow this similar setup from Tencent Cloud instead.
3.4.4 Backup and Restore
Set a backup policy for each environment.
Restore at any time from the Archive menu.
3.5 Related conventions
3.5.1 File Path Template
Bytebase allows users to customize the file path of the schema file. This file path is relative to the base directory.
The default file path template is {{ENV_NAME}}/{{DB_NAME}}__{{VERSION}}__{{TYPE}}__{{DESCRIPTION}}.sql
Let's say the base directory is bytebase:
- An example file path for normal migration type: bytebase/env1/db1__202101131000__migrate__create_tablefoo_for_bar.sql
- An example file path for baseline migration type: bytebase/env1/db1__202101131000__baseline__create_tablefoo_for_bar.sql
3.5.2 Supported Placeholders
- All placeholders can contain one or more characters in [a-zA-Z0-9+-=/_#?!$. ] (whitespace is also allowed)
- To improve readability, we recommend using a separator between placeholders. One common separator is __ (two underscores).
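To make the role of the placeholders and the __ separator concrete, here is a minimal sketch that turns a file path template into a regular expression and extracts the placeholder values from a path. It is not Bytebase's own parser: the function names are hypothetical, and the character class is a simplified assumption that leaves out "/" so that a placeholder never spans a directory boundary.

```python
import re

DEFAULT_TEMPLATE = "{{ENV_NAME}}/{{DB_NAME}}__{{VERSION}}__{{TYPE}}__{{DESCRIPTION}}.sql"

# Simplified character class (assumption): "/" is deliberately excluded.
PLACEHOLDER_CHARS = r"[a-zA-Z0-9+\-=_#?!$. ]"


def template_to_regex(template):
    """Turn a file path template into a compiled regex with named groups."""
    # re.split with one capture group alternates literal text and placeholder names.
    parts = re.split(r"\{\{(\w+)\}\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)  # literal text such as "/" or "__"
        else:
            # Non-greedy, so each value stops at the first following separator.
            pattern += "(?P<%s>%s+?)" % (part, PLACEHOLDER_CHARS)
    return re.compile(pattern)


def parse_path(path, base_dir, template=DEFAULT_TEMPLATE):
    """Return the placeholder values for a path, or None if it does not match."""
    prefix = base_dir.rstrip("/") + "/"
    if not path.startswith(prefix):
        return None
    match = template_to_regex(template).fullmatch(path[len(prefix):])
    return match.groupdict() if match else None


info = parse_path(
    "bytebase/env1/db1__202101131000__baseline__create_tablefoo_for_bar.sql",
    "bytebase",
)
print(info)
```

Because the description may itself contain single underscores, the two-underscore separator is what lets the non-greedy groups split the file name unambiguously in this sketch.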
3.5.3 Version (Required)
Version can be an arbitrary string as long as it's unique among all SQL files. Bytebase uses the alphabetical order of the version part to determine the order in which the SQL files are applied. A common practice is to use a timestamp such as YYYYMMDDHHMMSS, or v1, v2, as the version name.
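The ordering rule has a practical consequence for naming. The sketch below assumes that "alphabetical order" means plain string comparison, which is an assumption about the implementation rather than something the source states. Under that assumption, fixed-width timestamps sort as expected, while unpadded names such as v10 sort before v2.

```python
def apply_order(file_versions):
    """Return versions in plain string (lexicographic) order."""
    return sorted(file_versions)


# Fixed-width timestamps sort chronologically.
timestamps = apply_order(["20211201093000", "20210113100000", "20210301120000"])

# Unpadded counters do not: "v10" compares lower than "v2".
unpadded = apply_order(["v10", "v2", "v1"])

# Zero-padding to a fixed width restores the intended order.
padded = apply_order(["v010", "v002", "v001"])

print(timestamps)
print(unpadded)
print(padded)
```

If the v1, v2 style is used, zero-padding the number to a fixed width is one way to keep the string order aligned with the intended order.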
3.5.4 Schema Path Template
The default schema path template is {{ENV_NAME}}/.{{DB_NAME}}__LATEST.sql
Let's say the base directory is bytebase:
- An example schema path is bytebase/env1/.db1__LATEST.sql
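4. Prerequisites
4.1 Environment preparation
Item | Resource
GitLab |
Deployment server | 10.0.53.29
4.2 Service dependencies
Install Docker on the server (10.0.53.29).
Open the designated port (8084).
5. Running locally
5.1 Debug run
5.1.1 Docker deployment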
docker run -d --init --name bytebase --restart always --add-host host.docker.internal:host-gateway --publish 8084:8084 --volume ~/.bytebase/data:/var/opt/bytebase bytebase/bytebase:0.8.1 --data /var/opt/bytebase --host http://localhost --port 8084
5.1.2 Open the port
firewall-cmd --zone=public --add-port=8084/tcp --permanent && firewall-cmd --reload
5.2 Results
6. Deployment
6.1 Docker deployment
6.1.1 Startup command
When running on Docker, the host port and container port in --publish {{hostport}}:{{containerport}} and the value of the --port flag must all be the same. In the example below, all 3 ports are 5678: --publish 5678:5678 --port 5678
docker run --init --name bytebase --restart always --add-host host.docker.internal:host-gateway --publish 5678:5678 --volume ~/.bytebase/data:/var/opt/bytebase bytebase/bytebase:0.8.1 --data /var/opt/bytebase --host http://localhost --port 5678
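One way to avoid mismatched ports is to generate the command from a single port value. The sketch below does that; the helper name bytebase_docker_command is hypothetical, and every flag is copied from the command above rather than from any additional Bytebase documentation.

```python
def bytebase_docker_command(port, version="0.8.1", host="http://localhost"):
    """Build the docker run arguments so that all three port values are identical."""
    return [
        "docker", "run", "--init",
        "--name", "bytebase",
        "--restart", "always",
        "--add-host", "host.docker.internal:host-gateway",
        "--publish", f"{port}:{port}",   # host port and container port
        "--volume", "~/.bytebase/data:/var/opt/bytebase",
        f"bytebase/bytebase:{version}",
        "--data", "/var/opt/bytebase",
        "--host", host,
        "--port", str(port),             # port Bytebase listens on inside the container
    ]


command = bytebase_docker_command(5678)
# No argument contains spaces, so a plain join gives a line that can be pasted into a shell.
print(" ".join(command))
```

The result is printed as a single line instead of being executed, so the shell that runs it still expands the ~ in the volume path.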
6.1.2 Access and use
With the command above, Bytebase will then start on http://localhost:5678 and store its data under ~/.bytebase/data (check Server Startup Options for other startup options).
Open http://localhost:5678 in your browser and create the admin account.
6.2 Building from source
6.2.1 Prerequisites
1.Install Yarn
2.Install Go. Bytebase requires Go >= 1.16
6.2.2 Build and deploy
Download source code from GitHub, then go to the source root directory and run:
$ scripts/build.sh [<<out_directory>>]
If out_directory is not specified, the default directory is ./bytebase-build
Suppose you run scripts/build.sh foo. After the build completes, run:
$ foo/bytebase --host http://localhost --port 8080
(check Server Startup Options for other startup options)
6.2.3 Access and use
Open http://localhost:8080 in your browser and create the admin account.
7. FAQ
7.1 Which database engines are supported?
Bytebase currently supports MySQL, PostgreSQL, TiDB, ClickHouse and Snowflake. We may add other open source databases in the future. On the other hand, we do not plan to support any commercial databases such as Oracle, SQL Server.
7.2 Which versions of each database engine are supported?
Bytebase officially supports the following major versions for each supported database engine:
- MySQL - 8.0 and 5.7
- PostgreSQL - 12.0, 13.0, 14.0
- TiDB - 5.0
- Snowflake
- ClickHouse - 21.0
Bytebase usually works fine with older database versions; we just won't support features specific to those older versions.
7.3 Which version control systems (VCS) and providers are supported?
Bytebase only supports Git-based VCS. It currently supports GitLab Enterprise Edition (EE) and Community Edition (CE). We plan to support more Git providers, roughly in the following order:
- GitHub Enterprise
- GitLab.com
- GitHub.com
7.4 System requirements
Bytebase is lightweight and has no external dependency. For normal workload, it consumes 10MB ~ 20MB memory and can run on the lowest tier machine from any cloud provider.
7.5 How do you make money?
Bytebase is still at an early stage, so everything is free to use. We plan to keep a free community version and offer an additional paid team plan.
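8. Summary
Overall, the feature set is complete. Core features include SQL review, version control, notifications, and backup and restore. However, it partly overlaps with the existing tools CloudQuery and Flyway, and there is no Chinese version yet (English only). Running the full workflow requires network connectivity among all the components involved. The product was created by a team from Google and Ant Group. Documentation is currently relatively sparse, so an actual trial may carry some cost.
Because Bytebase aims to become an industry benchmark, it has targeted the global market from the start and compares itself against benchmark products worldwide. Apart from the pinyin GitHub usernames in some TODO comments, the Bytebase codebase should not contain a single line of Chinese.
Bytebase announced that it recently closed a US$3 million angel round funded solely by Matrix Partners China (经纬创投). The product is currently in its initial promotion phase. The Bytebase code was fully open-sourced in July of this year under the Apache 2.0 license. Since then it has received good feedback and has appeared on GitHub trending several times.
The founding team will keep polishing the product, and plans to launch a paid team edition within the year and an enterprise edition next year. Its goal is to make Bytebase an extensible, open-source, standardized tool that values developer experience, and to become the industry benchmark for collaboration between developers and DBAs.
9. References
Official documentation: https://docs.bytebase.com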
