PostgreSQL

Slow query on the primary server runs fast on the hot standby

  • May 3, 2018

I have a primary PostgreSQL 9.3.4 server with 64GB of RAM that replicates, via streaming replication, to a hot standby server with 32GB of RAM. My problem is the following: I have detected a query that takes a very long time to execute on my primary server but runs very fast on the standby. I ran EXPLAIN ANALYZE on the query:

EXPLAIN ANALYZE
SELECT this_.id AS id1_31_0_,
      this_.nom AS nom2_31_0_,
      this_.nom_slug AS nom3_31_0_,
      this_.descripcio AS descripc4_31_0_,
      this_.url AS url5_31_0_,
      this_.data_captura AS data6_31_0_,
      this_.data_publicacio AS data7_31_0_,
      this_.propietari AS propieta8_31_0_,
      this_.privacitat AS privacit9_31_0_,
      this_.desnivellpujada AS desnive10_31_0_,
      this_.desnivellbaixada AS desnive11_31_0_,
      this_.longitud AS longitu13_31_0_,
      this_.beginpoint AS beginpo14_31_0_,
      this_.endpoint AS endpoin15_31_0_,
      this_.caixa3d AS caixa16_31_0_,
      this_.pic_id AS pic17_31_0_,
      this_.skill AS skill18_31_0_,
      this_.spatial_type AS spatial19_31_0_,
      this_.tags_cached AS tags20_31_0_,
      this_.images_cached AS images21_31_0_,
      this_.ncomments AS ncommen22_31_0_,
      this_.group_order AS group23_31_0_,
      this_.author AS author24_31_0_,
      this_.proper_a AS proper25_31_0_,
      this_.duration AS duratio26_31_0_,
      this_.isloop AS isloop27_31_0_,
      this_.seo_country AS seo28_31_0_,
      this_.seo_region AS seo29_31_0_,
      this_.seo_place AS seo30_31_0_,
      this_.source AS source31_31_0_,
      this_.source_name AS source32_31_0_,
      this_.api_key AS api33_31_0_,
      this_.ratingui AS ratingu34_31_0_,
      this_.nratings AS nrating35_31_0_,
      this_.trailrank AS trailra36_31_0_,
      this_.ncoords AS ncoords37_31_0_,
      this_.egeom AS egeom38_31_0_,
      this_.elevels AS elevels39_31_0_,
      this_.elevations AS elevati40_31_0_,
      this_.nphotoswpts AS nphotos41_31_0_,
      this_.nfavourited AS nfavour42_31_0_,
      this_.ncompanions AS ncompan43_31_0_,
      this_.group_id AS group44_31_0_
FROM spatial_artifact this_
WHERE this_.group_id IS NULL
 AND this_.propietari=7649
ORDER BY this_.id DESC LIMIT 20

-- PRIMARY SERVER (EXPLAIN ANALYZE output)

"Limit  (cost=0.43..22734.71 rows=20 width=604) (actual time=1804.124..293469.085 rows=20 loops=1)"
"  ->  Index Scan Backward using "PK_SPATIAL_ARTIFACT" on spatial_artifact this_  (cost=0.43..7776260.84 rows=6841 width=604) (actual time=1804.121..293469.056 rows=20 loops=1)"
"        Filter: ((group_id IS NULL) AND (propietari = 7649))"
"        Rows Removed by Filter: 2848286"
"Total runtime: 293469.135 ms"

-- STANDBY SERVER (EXPLAIN ANALYZE output)

"Limit  (cost=23533.73..23533.78 rows=20 width=604) (actual time=2.566..2.569 rows=20 loops=1)"
"  ->  Sort  (cost=23533.73..23550.83 rows=6841 width=604) (actual time=2.566..2.567 rows=20 loops=1)"
"        Sort Key: id"
"        Sort Method: top-N heapsort  Memory: 35kB"
"        ->  Index Scan using idx_own_spas on spatial_artifact this_  (cost=0.43..23351.70 rows=6841 width=604) (actual time=0.037..2.119 rows=618 loops=1)"
"              Index Cond: (propietari = 7649)"
"Total runtime: 2.612 ms"

I have already run ANALYZE on the table and reindexed idx_own_spas on my primary server, but it did not help.
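(For reference, a minimal sketch of those maintenance commands, assuming they were run in their plain form:)

ANALYZE spatial_artifact;    -- refresh planner statistics for the table
REINDEX INDEX idx_own_spas;  -- rebuild the index (holds an exclusive lock while running)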

Here is the postgresql configuration of my two servers:

-- PRIMARY SERVER (postgresql.conf)

shared_buffers = 8GB    
work_mem = 42MB
maintenance_work_mem = 2GB
effective_cache_size = 44GB
random_page_cost = 4
seq_page_cost = 1
default_statistics_target = 100

-- STANDBY SERVER (postgresql.conf)

shared_buffers = 800MB
work_mem = 20MB
maintenance_work_mem = 128MB
effective_cache_size = 1024MB
random_page_cost = 4
seq_page_cost = 1
default_statistics_target = 100

The EXPLAIN cost estimates seem to be way off, but I don't understand why that is. Can you shed some light on this? Thank you.


EDIT: some additional information:

Table SQL:

CREATE TABLE spatial_artifact
(
 id integer NOT NULL,
 nom character varying(128),
 descripcio character varying(25000),
 data_captura timestamp without time zone NOT NULL DEFAULT now(),
 propietari integer NOT NULL,
 privacitat character varying(7) NOT NULL DEFAULT 'private'::character varying,
 data_publicacio timestamp without time zone NOT NULL DEFAULT now(),
 desnivellpujada integer,
 desnivellbaixada integer,
 longitud integer,
 geometria geometry,
 spatial_type smallint NOT NULL DEFAULT 0,
 tags_cached character varying(1024),
 images_cached text,
 nviews integer DEFAULT 0,
 ndowngpx integer DEFAULT 0,
 nratings smallint DEFAULT 0,
 ndownkml integer DEFAULT 0,
 rating numeric(3,2) DEFAULT 2.50,
 caixa3d box3d NOT NULL,
 bbox geometry NOT NULL,
 group_id integer,
 group_order smallint DEFAULT 0,
 pic_id smallint DEFAULT 0,
 skill smallint DEFAULT 2,
 url character varying(256),
 ncomments smallint DEFAULT 0,
 author character varying(50),
 proper_a character varying(200) DEFAULT NULL::character varying,
 beginpoint geometry NOT NULL,
 endpoint geometry,
 trailrank real DEFAULT 0,
 seo_country character varying(3),
 seo_region character varying(15),
 seo_place integer,
 nfavourited integer NOT NULL DEFAULT 0,
 togoogle integer NOT NULL DEFAULT 1,
 source smallint DEFAULT 0,
 ncoords integer DEFAULT 0,
 egeom character varying,
 elevels character varying,
 duration bigint DEFAULT 0,
 isloop integer DEFAULT 0,
 nom_slug character varying(128),
 source_name character varying(50),
 api_key character varying,
 elevations text,
 ratingui numeric(3,2) DEFAULT 0.00,
 wptphotos boolean DEFAULT false,
 ncompanions integer DEFAULT 0,
 nphotoswpts integer DEFAULT 0,
 CONSTRAINT "PK_SPATIAL_ARTIFACT" PRIMARY KEY (id),
 CONSTRAINT "FK_PICTOGRAM" FOREIGN KEY (pic_id)
     REFERENCES pictogram (id) MATCH SIMPLE
     ON UPDATE RESTRICT ON DELETE RESTRICT,
 CONSTRAINT "FK_SPATIAL_ARTIFACT_GROUP" FOREIGN KEY (group_id)
     REFERENCES spatial_artifact (id) MATCH SIMPLE
     ON UPDATE NO ACTION ON DELETE NO ACTION,
 CONSTRAINT "FK_SPATIAL_ARTIFACT_USUARI" FOREIGN KEY (propietari)
     REFERENCES usuari (id) MATCH SIMPLE
     ON UPDATE RESTRICT ON DELETE RESTRICT,
 CONSTRAINT enforce_dims_bbox CHECK (st_ndims(bbox) = 2),
 CONSTRAINT enforce_dims_beginpoint CHECK (st_ndims(beginpoint) = 2),
 CONSTRAINT enforce_dims_endpoint CHECK (st_ndims(endpoint) = 2),
 CONSTRAINT enforce_dims_geometria CHECK (st_ndims(geometria) = 4),
 CONSTRAINT enforce_geotype_beginpoint CHECK (st_geometrytype(beginpoint) = 'ST_Point'::text OR beginpoint IS NULL),
 CONSTRAINT enforce_geotype_endpoint CHECK (st_geometrytype(endpoint) = 'ST_Point'::text OR endpoint IS NULL),
 CONSTRAINT enforce_geotype_geometria CHECK (st_geometrytype(geometria) = 'ST_Line'::text OR st_geometrytype(geometria) = 'ST_LineString'::text OR geometria IS NULL),
 CONSTRAINT enforce_srid_bbox CHECK (st_srid(bbox) = 4326),
 CONSTRAINT enforce_srid_beginpoint CHECK (st_srid(beginpoint) = 4326),
 CONSTRAINT enforce_srid_endpoint CHECK (st_srid(endpoint) = 4326),
 CONSTRAINT enforce_srid_geometria CHECK (st_srid(geometria) = 4326)
)
WITH (
 OIDS=FALSE
);

Indexes:

CREATE INDEX idx_own_spas
 ON spatial_artifact
 USING btree
 (propietari)
 WHERE group_id IS NULL;

CREATE INDEX "FKI_SPATIAL_ARTIFACT_GROUP"
 ON spatial_artifact
 USING btree
 (group_id);


CREATE INDEX idx_seo_spatial_artifact
 ON spatial_artifact
 USING btree
 (seo_country COLLATE pg_catalog."default", seo_region COLLATE pg_catalog."default", seo_place);

CREATE INDEX idx_spatial_artifact_data_publicacio
 ON spatial_artifact
 USING btree
 (data_publicacio)
 WHERE proper_a IS NULL AND privacitat::text = 'public'::text AND (spatial_type = ANY (ARRAY[0, 3]));

CREATE INDEX idx_spatial_artifact_data_publicacio_only
 ON spatial_artifact
 USING btree
 (data_publicacio);

CREATE INDEX idx_spatial_artifact_propietari
 ON spatial_artifact
 USING btree
 (propietari);

Size of the affected table:

select pg_size_pretty(pg_total_relation_size('spatial_artifact'));
pg_size_pretty 
----------------
98 GB
(1 row)


select relpages,reltuples, relname from pg_class where relname in ('idx_own_spas','spatial_artifact','PK_SPATIAL_ARTIFACT');

relpages |  reltuples  |       relname       
----------+-------------+---------------------
   49217 | 9.70814e+06 | PK_SPATIAL_ARTIFACT
   14329 | 5.22472e+06 | idx_own_spas
 3423824 | 1.11087e+07 | spatial_artifact
(3 rows)

Hardware

As far as hardware goes, the primary server has a better CPU, more RAM, and faster disks than the standby. For example, the primary's database lives on SSDs, while the standby runs on spinning disks.

System statistics

Nothing unusual, as far as I can tell:

top (sorted by CPU)

top - 15:26:05 up 667 days,  4:36,  1 user,  load average: 1.86, 2.24, 2.25
Tasks: 158 total,   2 running, 156 sleeping,   0 stopped,   0 zombie
%Cpu(s): 13.6 us,  0.5 sy,  0.0 ni, 84.6 id,  1.1 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem:  66082816 total, 64774132 used,  1308684 free,   322604 buffers
KiB Swap: 33554428 total,     3664 used, 33550764 free, 16937780 cached

 PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
17356 root      20   0 19.5g  15g 2.3g S   108 25.3   4837:05 java
4540 postgres  20   0 8481m 264m 250m R    13  0.4   0:02.24 postgres
5592 postgres  20   0 8480m 185m 172m S     6  0.3   0:01.37 postgres
6787 root      20   0 24568 1440 1060 R     6  0.0   0:00.01 top
31897 root      20   0 36632 4528  976 S     6  0.0 215:24.83 nginx
   1 root      20   0 10648   32    0 S     0  0.0   5:51.09 init
   2 root      20   0     0    0    0 S     0  0.0   0:01.40 kthreadd
   3 root      20   0     0    0    0 S     0  0.0  26:29.26 ksoftirqd/0
   6 root      rt   0     0    0    0 S     0  0.0   2:37.09 migration/0
   7 root      rt   0     0    0    0 S     0  0.0   2:16.42 watchdog/0
   8 root      rt   0     0    0    0 S     0  0.0   3:00.24 migration/1
  10 root      20   0     0    0    0 S     0  0.0   6:29.60 ksoftirqd/1
  12 root      rt   0     0    0    0 S     0  0.0   1:51.26 watchdog/1
  13 root      rt   0     0    0    0 S     0  0.0   1:31.24 migration/2
  15 root      20   0     0    0    0 S     0  0.0   5:37.32 ksoftirqd/2
  16 root      rt   0     0    0    0 S     0  0.0   1:34.45 watchdog/2
  17 root      rt   0     0    0    0 S     0  0.0   1:34.16 migration/3
  19 root      20   0     0    0    0 S     0  0.0   4:38.17 ksoftirqd/3
  20 root      rt   0     0    0    0 S     0  0.0   1:28.69 watchdog/3
  21 root      rt   0     0    0    0 S     0  0.0   1:29.48 migration/4
  23 root      20   0     0    0    0 S     0  0.0   4:07.97 ksoftirqd/4
  24 root      rt   0     0    0    0 S     0  0.0   1:29.92 watchdog/4
  25 root      rt   0     0    0    0 S     0  0.0   1:24.09 migration/5

top (sorted by memory)

top - 15:27:43 up 667 days,  4:37,  1 user,  load average: 1.51, 2.01, 2.17
Tasks: 156 total,   1 running, 155 sleeping,   0 stopped,   0 zombie
%Cpu(s): 13.6 us,  0.5 sy,  0.0 ni, 84.6 id,  1.1 wa,  0.0 hi,  0.2 si,  0.0 st
KiB Mem:  66082816 total, 65083212 used,   999604 free,   331972 buffers
KiB Swap: 33554428 total,     3664 used, 33550764 free, 17188544 cached

 PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
30270 root      20   0 28.4g  26g  932 S     0 42.7   2042:56 redis-server
17356 root      20   0 20.7g  15g 2.3g S   137 25.3   4839:24 java
8911 postgres  20   0 8436m 8.1g 8.1g S     0 12.9  18:49.76 postgres
30259 root      20   0 4058m 3.7g 1012 S     7  5.9   1981:01 redis-server
8912 postgres  20   0 8433m 1.8g 1.7g S     0  2.8  18:19.52 postgres
32263 postgres  20   0 8482m 550m 536m S     0  0.9   0:05.71 postgres
4540 postgres  20   0 8482m 456m 441m S     0  0.7   0:04.16 postgres
5592 postgres  20   0 8482m 312m 298m S     0  0.5   0:02.56 postgres
5593 postgres  20   0 8477m 305m 291m S     0  0.5   0:02.35 postgres
2810 postgres  20   0 8477m 296m 283m S     0  0.5   0:02.50 postgres
5591 postgres  20   0 8477m 230m 216m S     0  0.4   0:01.48 postgres
8908 postgres  20   0 8429m 216m 215m S     0  0.3   3:13.31 postgres
4556 postgres  20   0 8477m 191m 178m S     0  0.3   0:01.16 postgres
5624 postgres  20   0 8480m 182m 169m S     7  0.3   0:00.74 postgres
5632 postgres  20   0 8476m  46m  37m S     0  0.1   0:00.11 postgres
17472 root      20   0  391m  42m 5868 S     0  0.1   1:56.15 java
5631 postgres  20   0 8457m  20m  16m S     0  0.0   0:00.03 postgres
8913 postgres  20   0 8433m  17m  16m S     0  0.0  21:01.87 postgres
15261 root      20   0 54496 9560 1792 S     0  0.0   5:45.10 munin-node
2338 root      20   0  117m 5420  436 S     0  0.0 238:08.17 rsyslogd
31897 root      20   0 36632 4528  976 S     0  0.0 215:25.31 nginx
28132 root      20   0 71372 3736 2844 S     0  0.0   0:00.02 sshd
31896 root      20   0 35628 3472  980 S     0  0.0 212:50.13 nginx
28512 xorxio1   20   0 21964 3340 1684 S     0  0.0   0:00.04 bash
31898 root      20   0 35404 3336  976 S     7  0.0 221:58.88 nginx
8176 root      20   0 35024 3328 1700 S     0  0.0   0:00.03 nginx
31899 root      20   0 35404 3304  976 S     0  0.0 213:28.28 nginx
31900 root      20   0 35388 3304  976 S     0  0.0 210:39.00 nginx
30446 postfix   20   0 42624 3160 2004 S     0  0.0   0:05.03 tlsmgr
31903 root      20   0 35140 3048  968 S     0  0.0 247:17.16 nginx
31901 root      20   0 35172 3040  976 S     0  0.0 210:03.54 nginx
31902 root      20   0 35100 3040  976 S     0  0.0 261:43.64 nginx
19032 postgres  20   0 8434m 2836 1568 S     0  0.0   0:44.81 postgres
29894 postfix   20   0 40356 2788 1808 S     0  0.0  12:14.27 qmgr
8914 postgres  20   0 8434m 2420 1364 S     0  0.0   1:11.83 postgres
8909 postgres  20   0 20492 2400  760 S     0  0.0   0:42.46 postgres

vmstat 1

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
3  0   3664 1115260 308108 16988420    0    0   139   424    0    0 14  1 85  1
13  0   3664 1109744 308156 16988740    0    0  2036  3536 5340 4782 17  1 79  3
1  0   3664 1113616 308168 16989100    0    0  3540   296 4986 4691 14  0 84  2
2  0   3664 1107704 308608 16992984    0    0  2100   220 6467 5861 14  1 84  2
2  0   3664 1105884 308652 16994704    0    0  1804   728 5163 4835 15  1 82  2
2  0   3664 1103364 308652 16997228    0    0  2288    40 4597 4601  9  1 89  1
1  0   3664 1094156 308800 17006684    0    0  1808   236 9617 8739 18  1 79  2
1  0   3664 1090656 308840 17008044    0    0  2196   284 7062 5923 26  1 72  1
0  0   3664 1089352 308852 17009588    0    0  1048   276 4306 3951  9  1 89  1
3  0   3664 1087740 308852 17011224    0    0  1600     0 4533 4066  6  1 92  1

sar -dp 1

03:16:09 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
03:16:10 PM       sdb    124.00   2432.00     32.00     19.87      0.02      0.13      0.13      1.60
03:16:11 PM       sdb    150.00   3312.00    160.00     23.15      0.02      0.11      0.11      1.60
03:16:12 PM       sdb    144.00   3296.00    368.00     25.44      0.05      0.36      0.28      4.00
03:16:13 PM       sdb    124.00   3104.00     32.00     25.29      0.03      0.26      0.23      2.80
03:16:14 PM       sdb    139.00   3312.00     16.00     23.94      0.04      0.29      0.26      3.60
03:16:15 PM       sdb     90.00   1472.00    120.00     17.69      0.02      0.18      0.18      1.60
03:16:16 PM       sdb    145.00   3616.00    128.00     25.82      0.04      0.28      0.28      4.00

sar -q 1 (the CPU is an Intel Xeon E5 with 6 physical cores)

03:18:29 PM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
03:18:30 PM         2       657      2.96      2.56      2.26         0
03:18:31 PM         1       657      2.96      2.56      2.26         0
03:18:32 PM         1       657      2.96      2.56      2.26         0
03:18:33 PM         1       657      2.96      2.56      2.26         0
03:18:34 PM         2       657      2.72      2.52      2.25         0
03:18:35 PM         3       657      2.72      2.52      2.25         0
03:18:36 PM         1       657      2.72      2.52      2.25         1
03:18:37 PM         1       657      2.72      2.52      2.25         0
03:18:38 PM         2       657      2.72      2.52      2.25         0

sar -w 1

03:20:47 PM    proc/s   cswch/s
03:20:48 PM      0.00   4902.00
03:20:49 PM     15.00   5289.00
03:20:50 PM      0.00   4663.00
03:20:51 PM      2.00   5223.00
03:20:52 PM      1.00   5124.00
03:20:53 PM      0.00   4426.00
03:20:54 PM      0.00   4327.00
03:20:55 PM      1.00   3936.00
03:20:56 PM      0.00   4369.00
03:20:57 PM     28.00   6492.00
03:20:58 PM      1.00   4465.00
03:20:59 PM      0.00   4028.00

A bad plan can be the result of many different problems, which means there are many ways to fix it. My guess is that a confluence of problems causes the two plans to look incorrectly close to each other in cost, and that the difference in the effective_cache_size setting between primary and standby is (possibly) the straw that breaks the camel's back and tips the primary into picking the wrong one.

The simplest place to start fixing it is probably this:

"        ->  Index Scan using idx_own_spas on spatial_artifact this_  (cost=0.43..23351.70 rows=6841 width=604) (actual time=0.037..2.119 rows=618 loops=1)"
"              Index Cond: (propietari = 7649)"

This one is simple because the scan runs to completion and has only a single index qualification, yet the estimate is off by a factor of 10. Since your statistics are up to date, the problem must be an insufficient default_statistics_target. You can increase it globally, or increase it for just the propietari column using ALTER TABLE ... SET STATISTICS, and then re-analyze the table. If this estimation problem gets fixed, the primary will almost certainly switch to that faster plan, because it will then look faster.
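A minimal sketch of the per-column approach (the target of 1000 is only an illustrative value, not a tuned recommendation):

ALTER TABLE spatial_artifact ALTER COLUMN propietari SET STATISTICS 1000;
ANALYZE spatial_artifact;  -- re-gather statistics with the new target

You can then re-run the EXPLAIN ANALYZE above to see whether the row estimate for the idx_own_spas scan has moved closer to the actual 618 rows.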

Another problem is that it thinks the rows satisfying ((group_id IS NULL) AND (propietari = 7649)) are randomly distributed over the values of this_.id, when they apparently are not. This leads it to think the LIMIT clause will be satisfied sooner than it actually is, which makes the slow plan look faster than it really is. Unfortunately, there is nothing simple you can do to improve that estimate. However, if you create the right index, you can offer the planner a third plan that is actually much faster than either of the current two, and that will also look faster.

That index would be:

CREATE INDEX idx_own_spas
 ON spatial_artifact
 USING btree
 (propietari,id)
 WHERE group_id IS NULL;

This index can replace the existing similar one that lacks the id column, so it should not cost you much in extra space or maintenance. The swap is sketched below.
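One way to do the swap without a name collision, sketched here with a placeholder name (idx_own_spas_id) and using CREATE INDEX CONCURRENTLY so writes are not blocked during the build:

CREATE INDEX CONCURRENTLY idx_own_spas_id
 ON spatial_artifact
 USING btree
 (propietari, id)
 WHERE group_id IS NULL;

DROP INDEX idx_own_spas;                             -- retire the old single-column version
ALTER INDEX idx_own_spas_id RENAME TO idx_own_spas;  -- optional: keep the original name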

Finally, effective_cache_size can be changed per session, so it is easy to change that value on both the primary and the standby to see whether it really is what flips the plan from the good one to the bad one. Even if it is, though, that still doesn't tell you what to do about it: just because a value causes the planner to choose wrongly for one query doesn't mean it isn't set correctly for the system as a whole. So hopefully one of the two earlier approaches will do the trick.
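For example, on the primary you could mirror the standby's setting for the current session only (the value is taken from the standby's postgresql.conf above):

SET effective_cache_size = '1024MB';  -- standby's value, session-local
-- ... re-run the EXPLAIN ANALYZE from above and compare the chosen plan ...
RESET effective_cache_size;           -- back to the configured default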

Source: https://dba.stackexchange.com/questions/144168