在线时间:8:00-16:00
迪恩网络APP
随时随地掌握行业动态
扫描二维码
关注迪恩网络微信公众号
无序uuid对数据库的影响由于最近在做超大表的性能测试,在该过程中发现了无序uuid做主键对表插入性能有一定影响。结合实际情况发现当表的数据量越大,对表插入性能的影响也就越大。 测试环境PostgreSQL创建插入脚本,测试各种情况的tps。 数据库版本:PostgreSQL 10.4 (ArteryBase 5.0.0, Thunisoft) 操作系统配置:CentOS Linux release 7 ,32GB内存,8 cpu 测试参数:pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb 空表,1000w数据,5000w数据,一亿数据的各种主键测试。 测试无序的uuid,有序的uuid,序列,有普通btree,有唯一索引和没有主键的情况 测试1.创建表 --无序的uuid pgbenchdb=# create table test_uuid_v4(id char(32) primary key); CREATE TABLE --有序的uuid pgbenchdb=# create table test_time_nextval(id char(32) primary key); CREATE TABLE --递增序列 pgbenchdb=# create table test_seq_bigint(id int8 primary key); CREATE TABLE --创建序列 create sequence test_seq start with 1 ; 2.测试脚本 --测试无序uuid脚本 vi pgbench_uuid_v4.sql insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-','')); --测试有序uuid脚本 vi pgbench_time_nextval.sql insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-','')); --测试序列脚本 vi pgbench_seq_bigint.sql insert into test_seq_bigint (id) values (nextval('test_seq'::regclass)); 无序uuid,无数据情况 磁盘使用情况 avg-cpu: %user %nice %system %iowait %steal %idle 0.76 0.00 0.38 4.67 0.00 94.19 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sda 0.00 0.00 0.00 96.00 0.00 2048.00 42.67 1.02 10.67 0.00 10.67 10.33 99.20 dm-0 0.00 0.00 0.00 96.00 0.00 2048.00 42.67 1.02 10.66 0.00 10.66 10.32 99.10 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 tps: [thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb transaction type: /opt/thunisoft/pgbench_uuid_v4.sql scaling factor: 1 query mode: prepared number of clients: 8 number of threads: 8 duration: 60 s number of transactions actually processed: 53494 latency average = 8.974 ms tps = 891.495404 (including connections establishing) tps = 891.588967 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 9.006 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-','')); 无数据情况下,tps 类别 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await ---------------+---------+---------+---------+---------+-------+------- 无序uuid | 919 | 907 | 891 | 906 | 99.2% | 10.66 有序uuid | 985 | 882 | 932 | 933 | 98.7% | 4.4 序列 | 1311 | 1277 | 1280 | 1289 | 97.5% | 3.4 向表里面初始化100w数据 pgbenchdb=# insert into test_uuid_v4 (id) select replace(uuid_generate_v4()::text,'-','') from generate_series(1,1000000); INSERT 0 1000000 Time: 43389.817 ms (00:43.390) pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,1000000); INSERT 0 1000000 Time: 30585.134 ms (00:30.585) pgbenchdb=# insert into test_seq_bigint select generate_series (1,1000000); INSERT 0 1000000 Time: 9818.639 ms (00:09.819) 无序uuid插入100w需要43s,有序需要30s,序列需要10s。 插入一百万数据后的tps 类别 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await ---------------+---------+---------+---------+---------+-------+------- 无序uuid | 355 | 440 | 302 | 365 | 98.8% | 13 有序uuid | 948 | 964 | 870 | 927 | 97.2% | 4.0 序列 | 1159 | 1234 | 1115 | 1169 | 96.6% | 3.5 插入一千万数据后的tps 类别 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await ---------------+---------+---------+---------+---------+-------+------- 无序uuid | 260 | 292 | 227 | 260 | 99.2% | 16.8 有序uuid | 817 | 960 | 883 | 870 | 97.7% | 3.9 序列 | 1305 | 1261 | 1270 | 1278 | 96.8% | 3.0 插入五千万数据后 向表中插入5kw数据,并且添加主键 pgbenchdb=# insert into test_time_nextval (id) select replace(uuid_time_nextval()::text,'-','') from generate_series(1,50000000); INSERT 0 50000000 Time: 453985.318 ms (07:33.985) pgbenchdb=# insert into test_seq_bigint select generate_series (1,50000000); INSERT 0 50000000 Time: 352206.160 ms (05:52.206) pgbenchdb=# insert into test_uuid_v4 (id) select replace(uuid_generate_v4()::text,'-','') from generate_series(1,50000000); INSERT 0 50000000 Time: 1159689.338 ms (00:19:19.689) 在无主键情况下,插入五千万数据,有序uuid耗时7分钟,序列耗时6分钟,而无序uuid耗时接近20分钟。 pgbenchdb=# alter table test_uuid_v4 add primary key ("id"); ALTER TABLE Time: 845199.296 ms (14:05.199) pgbenchdb=# alter table test_time_nextval add primary key ("id"); ALTER TABLE Time: 932151.103 ms (15:32.151) pgbenchdb=# alter table test_seq_bigint add primary key ("id"); ALTER TABLE Time: 148138.871 ms (02:28.139) pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_uuid_v4')); pg_size_pretty ---------------- 6072 MB (1 row) Time: 0.861 ms pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_time_nextval')); pg_size_pretty ---------------- 6072 MB (1 row) Time: 0.942 ms pgbenchdb=# select pg_size_pretty(pg_total_relation_size('test_seq_bigint')); pg_size_pretty ---------------- 2800 MB (1 row) Time: 0.699 ms 插入5kw后 类别 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await ---------------+---------+---------+---------+---------+-------+------- 无序uuid | 162 | 163 | 163 | 163 | 99.6% | 18.4 有序uuid | 738 | 933 | 979 | 883 | 97.7% | 3.9 序列 | 1132 | 1264 | 1265 | 1220 | 96.8% | 3.5 插入1亿条数据后 类别 | 第一次 | 第二次 | 第三次 | 平均值(tps) |%util |await ---------------+---------+---------+---------+---------+-------+------- 无序uuid | 121 | 131 | 143 | 131 | 99.6% | 28.2 有序uuid | 819 | 795 | 888 | 834 | 99.2% | 28.7 序列 | 1193 | 1115 | 1109 | 1139 | 96.8% | 11.3 普通btree索引上面测了无序uuid,1kw情况下,有主键的tps是260,无主键的tps是1234。尝试测试普通的索引,和唯一索引tps --创建普通索引 pgbenchdb=# create index i_test_uuid_v4_id on test_uuid_v4(id); CREATE INDEX Time: 316367.010 ms (05:16.367) --创建普通索引后 [thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb transaction type: /opt/thunisoft/pgbench_uuid_v4.sql scaling factor: 1 query mode: prepared number of clients: 8 number of threads: 8 duration: 60 s number of transactions actually processed: 13308 latency average = 36.080 ms tps = 221.727391 (including connections establishing) tps = 221.749660 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 38.512 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-','')); --创建唯一索引 pgbenchdb=# drop index i_test_uuid_v4_id; DROP INDEX Time: 267.451 ms pgbenchdb=# create unique index i_test_uuid_v4_id on test_uuid_v4(id); CREATE INDEX Time: 153372.622 ms (02:33.373) [thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb ^[[3~transaction type: /opt/thunisoft/pgbench_uuid_v4.sql scaling factor: 1 query mode: prepared number of clients: 8 number of threads: 8 duration: 60 s number of transactions actually processed: 13847 latency average = 34.693 ms tps = 230.593988 (including connections establishing) tps = 230.620469 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 36.410 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-','')); 无论是普通btree索引和唯一索引,都会影响插入的效率。 删除所有的主键索引--删除所有主键 alter table test_uuid_v4 drop constraint "test_uuid_v4_pkey"; alter table test_time_nextval drop constraint "test_time_nextval_pkey" ; alter table test_seq_bigint drop constraint "test_seq_bigint_pkey"; 1,--无序uuid:测试pgbench_uuid_v4.sql [thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_uuid_v4.sql -U sa pgbenchdb transaction type: /opt/thunisoft/pgbench_uuid_v4.sql scaling factor: 1 query mode: prepared number of clients: 8 number of threads: 8 duration: 60 s number of transactions actually processed: 74109 latency average = 6.479 ms tps = 1234.842229 (including connections establishing) tps = 1235.042674 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 6.112 insert into test_uuid_v4 (id) values (replace(uuid_generate_v4()::text,'-','')); 2、--有序uuid,测试pgbench_time_nextval.sql [thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_time_nextval.sql -U sa pgbenchdb transaction type: /opt/thunisoft/pgbench_time_nextval.sql scaling factor: 1 query mode: prepared number of clients: 8 number of threads: 8 duration: 60 s number of transactions actually processed: 74027 latency average = 6.486 ms tps = 1233.364360 (including connections establishing) tps = 1233.482292 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 6.186 insert into test_time_nextval (id) values (replace(uuid_time_nextval()::text,'-','')); 3、--序列,测试pgbench_seq_bigint.sql [thunisoft@localhost thunisoft]$ pgbench -M prepared -r -n -j 8 -c 8 -T 60 -f /opt/thunisoft/pgbench_seq_bigint.sql -U sa pgbenchdb transaction type: /opt/thunisoft/pgbench_seq_bigint.sql scaling factor: 1 query mode: prepared number of clients: 8 number of threads: 8 duration: 60 s number of transactions actually processed: 76312 latency average = 6.290 ms tps = 1271.832907 (including connections establishing) tps = 1272.124397 (excluding connections establishing) script statistics: - statement latencies in milliseconds: 5.916 insert into test_seq_bigint (id) values (nextval('test_seq'::regclass)); 删除主键约束后,三种情况下tps非常接近,都达到了1200+。 Btree索引,插入操作的平均tps对比 类别/平均tps | 无数据 | 一千万 | 五千万 | 一亿 | ---------------+---------+---------+---------+---------+ 无序uuid | 960 | 260 | 163 | 131 | 有序uuid | 933 | 870 | 883 | 834 | 序列 | 1289 | 1278 | 1220 | 1139 | 根据测试数据可以看出无序的uuid在数据到达1kw后插入数据的tps下降的非常厉害,而有序的uuid和递增序列下降的比较少。到一亿数据的tps有序uuid是无序的6倍,序列是无序uuid的9倍。 创建单独的表空间用来存储索引信息如果有多快磁盘那么可以将索引和数据分开存储,以此来加快写入的速度。 创建单独的索引空间: create tablespace indx_test owner sa location '/home/tablespace/index_test'; 指定索引存储目录: create index i_test_uuid_v4_id on test_uuid_v4 using btree(id) tablespace indx_test; 关于有序uuid测试使用的sequential-uuids插件,生成的有序uuid。 有序uuid的结构为(block ID; random data),实际上就是把数据拆成两部分,一部分自增,一部分随机。 提供了两种算法: 1.uuid_sequence_nextval(sequence regclass, block_size int default 65536, block_count int default 65536) 前缀为自增序列,如果块ID使用2字节存储,一个索引BLOCK里面可以存储256条记录(假设8K的BLOCK,一条记录包括uuid VALUE(16字节)以及ctid(6字节),所以一个索引页约存储363条记录(8000 /(16 + 6))) 2.uuid_time_nextval(interval_length int default 60, interval_count int default 65536) RETURNS uuid 默认每60秒内的数据的前缀是一样的,前缀递增1,到65535后循环。 使用uuid_time_nextval生成的有序uuid pgbenchdb=# select id from test_time_nextval; id ---------------------------------- a18b7dd0ca92b0b5c1844a402f9c6999 a18b540b8bbe0ddb2b6d0189b2e393c6 a18b83eb7320b0a90e625185421e065e a18bade4ff15e05dab81ecd3f4c2dee4 a18b79e41c3bc8d2d4ba4b70447e6b29 a18bdad18d9e0d2fa1d9d675bc7129f0 a18b13723ec7be9a2f1a3aec5345a88b a18bd9d866047aec69a064d30e9493d2 a18bd76e8c787c7464479502f381e6d7 a18ba5c0c966f81cfdbeff866618da8d ...... 有序uuid前四位有序,后面的随机生成。 结语1.关于有序的uuid,前4位是有序的,后面都是随机生成的。 2.在该环境中发现,无序uuid随着数据量的不断增大,tps下滑比较厉害。 3.由于btree索引的存在,无序的uuid会导致大量的离散io。导致磁盘使用率高。进而影响插入效率。随着表数据量的增大更加明显。 4.该测试是在普通的磁盘上面测试,并未在ssd上面测试。 5.如果要使用有序uuid,有多种实现方式,还需要考虑分布式情况下生成全局有序uuid。 以上就是postgresql无序uuid性能测试的详细内容,更多关于postgresql无序uuid性能测试的资料请关注极客世界其它相关文章! |
请发表评论