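The official Citus documentation is good and thorough, and it includes a tutorial for a multi-tenant application. I ran through that tutorial here as a learning exercise.
Environment setup
The stack runs with docker-compose and also includes a GraphQL engine (Hasura), which is convenient.
- docker-compose file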
version: '2.1'
services:
  graphql-engine:
    image: hasura/graphql-engine:v1.0.0-alpha26
    ports:
      - "8080:8080"
    command: >
      /bin/sh -c "
      graphql-engine --database-url postgres://postgres@master/postgres serve --enable-console;
      "
  master:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_master"
    image: 'citusdata/citus:7.5.1'
    ports: ["${MASTER_EXTERNAL_PORT:-5432}:5432"]
    labels: ['com.citusdata.role=Master']
  worker:
    image: 'citusdata/citus:7.5.1'
    labels: ['com.citusdata.role=Worker']
    depends_on: { manager: { condition: service_healthy } }
  manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }
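A minimal sketch of bringing the stack up and talking to the coordinator, assuming the file above is saved as docker-compose.yml in the current directory. The helper function names are hypothetical; the container name follows the `container_name` default in the compose file (`citus_master` when `COMPOSE_PROJECT_NAME` is unset).

```shell
#!/bin/sh
# Compose project name; the compose file falls back to "citus" for container names.
PROJECT="${COMPOSE_PROJECT_NAME:-citus}"
MASTER="${PROJECT}_master"

# Start all services in the background.
start_cluster() {
  docker-compose -p "$PROJECT" up -d
}

# Run psql inside the coordinator container; extra arguments are passed through.
psql_master() {
  docker exec -i "$MASTER" psql -U postgres "$@"
}

# List the worker nodes currently registered with the coordinator.
list_workers() {
  psql_master -c "SELECT * FROM master_get_active_worker_nodes();"
}
```

After sourcing the snippet, `start_cluster` followed by `list_workers` should show at least one worker once the membership manager has registered it. With no arguments, `psql_master` reads SQL from standard input, so the statements in the following steps can be piped into it.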
- Data preparation
curl https://examples.citusdata.com/tutorial/companies.csv > companies.csv
curl https://examples.citusdata.com/tutorial/campaigns.csv > campaigns.csv
curl https://examples.citusdata.com/tutorial/ads.csv > ads.csv
- Create the tables
CREATE TABLE companies (
    id bigint NOT NULL,
    name text NOT NULL,
    image_url text,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);

CREATE TABLE campaigns (
    id bigint NOT NULL,
    company_id bigint NOT NULL,
    name text NOT NULL,
    cost_model text NOT NULL,
    state text NOT NULL,
    monthly_budget bigint,
    blacklisted_site_urls text[],
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);

CREATE TABLE ads (
    id bigint NOT NULL,
    company_id bigint NOT NULL,
    campaign_id bigint NOT NULL,
    name text NOT NULL,
    image_url text,
    target_url text,
    impressions_count bigint DEFAULT 0,
    clicks_count bigint DEFAULT 0,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);
- Add primary keys
ALTER TABLE companies ADD PRIMARY KEY (id);
ALTER TABLE campaigns ADD PRIMARY KEY (id, company_id);
ALTER TABLE ads ADD PRIMARY KEY (id, company_id);
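Distributing the data with Citus
- Turn the tables into distributed tables
This is simple: each table is distributed by a SELECT statement that calls a function.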
SELECT create_distributed_table('companies', 'id');
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id');
- Import data
Once the Citus environment is up and the tables are distributed, the downloaded CSV files can be loaded into them.
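The import step can be sketched as follows. This assumes the three CSV files downloaded earlier are in the current directory and that the coordinator container is named `citus_master` (the compose default). The function names and the `/tmp` staging path inside the container are illustrative choices, not part of the original tutorial.

```shell
#!/bin/sh
# Coordinator container name, matching the compose file's default.
MASTER="${COMPOSE_PROJECT_NAME:-citus}_master"
TABLES="companies campaigns ads"

# Print the psql \copy meta-command that loads one table from its staged CSV.
copy_cmd() {
  printf '%s\n' "\\copy $1 from '/tmp/$1.csv' with csv"
}

# Copy each CSV into the container, then load it through psql.
# ON_ERROR_STOP makes psql exit non-zero if the load fails.
load_all() {
  for t in $TABLES; do
    docker cp "$t.csv" "$MASTER:/tmp/$t.csv" || return 1
    copy_cmd "$t" | docker exec -i "$MASTER" psql -U postgres -v ON_ERROR_STOP=1 || return 1
  done
}
```

Calling `load_all` performs the import. `copy_cmd ads` on its own just prints the command, which is handy for checking what will be sent to psql.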
- Result
- A join query
SELECT campaigns.id, campaigns.name, campaigns.monthly_budget,
       sum(impressions_count) AS total_impressions,
       sum(clicks_count) AS total_clicks
FROM ads, campaigns
WHERE ads.company_id = campaigns.company_id
  AND campaigns.company_id = 5
  AND campaigns.state = 'running'
GROUP BY campaigns.id, campaigns.name, campaigns.monthly_budget
ORDER BY total_impressions, total_clicks;
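- Result
Notes on the data model
The core of the steps above is creating distributed tables with create_distributed_table and choosing company_id as the tenant ID that isolates each tenant's data. For the companies table, the id column plays that role.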
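One way to check how the tables were distributed is to read the Citus metadata table `pg_dist_partition` on the coordinator. The sketch below assumes the coordinator container is named `citus_master`; the function names are hypothetical. Tables that report the same `colocationid` keep rows with equal distribution values on the same node, which is what lets a tenant-scoped join run locally on one worker.

```shell
#!/bin/sh
# Coordinator container name, matching the compose file's default.
MASTER="${COMPOSE_PROJECT_NAME:-citus}_master"

# Print SQL that lists each distributed table with its method and co-location group.
colocation_sql() {
  cat <<'SQL'
SELECT logicalrelid::regclass AS table_name, partmethod, colocationid
FROM pg_dist_partition;
SQL
}

# Run the metadata query on the coordinator.
show_colocation() {
  colocation_sql | docker exec -i "$MASTER" psql -U postgres
}
```

Running `show_colocation` after the `create_distributed_table` calls should list companies, campaigns, and ads.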
Everything after that is ordinary SQL. Later posts will cover some good practices for developing multi-tenant applications on Citus.
References