A good layout for 3D point data used in spatial queries in Postgres?

January 19, 2018

As shown in another question, I deal with a lot (>10,000,000) of point entries in 3D space. These points are defined as follows:

CREATE TYPE float3d AS (
 x real,
 y real,
 z real);

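If I am not mistaken, it takes 3*8 bytes + 8 bytes of padding (MAXALIGN is 8) to store one of these points. Is there a better way to store this kind of data? In the question mentioned above, it was said that composite types involve quite some overhead.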

I often run spatial queries like this:

 SELECT t1.id, t1.parent_id, (t1.location).x, (t1.location).y, (t1.location).z,
        t1.confidence, t1.radius, t1.skeleton_id, t1.user_id,
        t2.id, t2.parent_id, (t2.location).x, (t2.location).y, (t2.location).z,
        t2.confidence, t2.radius, t2.skeleton_id, t2.user_id
 FROM treenode t1
      INNER JOIN treenode t2 ON
        (   (t1.id = t2.parent_id OR t1.parent_id = t2.id)
         OR (t1.parent_id IS NULL AND t1.id = t2.id))
       WHERE (t1.LOCATION).z = 41000.0
         AND (t1.LOCATION).x > 2822.6
         AND (t1.LOCATION).x < 62680.2
         AND (t1.LOCATION).y > 33629.8
         AND (t1.LOCATION).y < 65458.6
         AND t1.project_id = 1 LIMIT 5000;

A query like this takes about 160 ms, but I wonder whether this can be reduced.

This is the layout of the table that uses the type:

   Column     |           Type           |                       Modifiers                    
---------------+--------------------------+-------------------------------------------------------
id            | bigint                   | not null default nextval('location_id_seq'::regclass)
user_id       | integer                  | not null
creation_time | timestamp with time zone | not null default now()
edition_time  | timestamp with time zone | not null default now()
project_id    | integer                  | not null
location      | float3d                  | not null
editor_id     | integer                  |
parent_id     | bigint                   |
radius        | real                     | not null default 0
confidence    | smallint                 | not null default 5
skeleton_id   | integer                  | not null

Indexes:
   "treenode_pkey" PRIMARY KEY, btree (id)
   "treenode_parent_id" btree (parent_id)
   "treenode_project_id_location_x_index" btree (project_id, ((location).x))
   "treenode_project_id_location_y_index" btree (project_id, ((location).y))
   "treenode_project_id_location_z_index" btree (project_id, ((location).z))
   "treenode_project_id_skeleton_id_index" btree (project_id, skeleton_id)
   "treenode_project_id_user_id_index" btree (project_id, user_id)
   "treenode_skeleton_id_index" btree (skeleton_id)

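The composite type is clean design, but it does not help performance at all.
First of all, float translates to float8, also known as double precision, in Postgres. You are building on a misunderstanding.
The data type real occupies 4 bytes (not 8). It must be aligned at multiples of 4 bytes.
Measure the actual size with pg_column_size().
An SQL Fiddle demonstrates the actual sizes.
The composite type real3d occupies 36 bytes. That is: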
23 byte tuple header
1 byte padding
4 bytes real x
4 bytes real y
4 bytes real z
---
36 bytes
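If you embed the type in a table, padding may have to be added. On the other hand, the header of the type can be 3 bytes smaller on disk. The on-disk representation is typically a bit smaller than the one in RAM. The difference is not large.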
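The measurement described above can be reproduced with a short sketch like the following. The type name real3d_demo is a hypothetical scratch name introduced here so the example does not collide with an existing type; it is assumed to be equivalent to the three-real composite discussed in the answer.

```sql
-- Assumption: real3d_demo is a throwaway type created only for this test.
CREATE TYPE real3d_demo AS (x real, y real, z real);

SELECT pg_column_size(1::real) AS one_real_bytes,
       pg_column_size(ROW(1::real, 2::real, 3::real)::real3d_demo) AS composite_bytes;

-- Clean up the scratch type.
DROP TYPE real3d_demo;
```

Comparing composite_bytes with three times one_real_bytes shows how much of the composite value is header and padding rather than payload.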
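More:
Configuring PostgreSQL for read performance
Calculating and saving space in PostgreSQL
Table layout
Use this equivalent design to substantially reduce the row size: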
   Column     |           Type           |                       Modifiers
---------------+--------------------------+---------------------------------
id            | bigint                   | not null default nextval(...
creation_time | timestamp with time zone | not null default now()
edition_time  | timestamp with time zone | not null default now()
user_id       | integer                  | not null
project_id    | integer                  | not null
location_x    | real                     | not null
location_y    | real                     | not null
location_z    | real                     | not null
radius        | real                     | not null default 0
skeleton_id   | integer                  | not null
confidence    | smallint                 | not null default 5
parent_id     | bigint                   |
editor_id     | integer                  |
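As a rough sketch, the layout above could be declared as follows. The table name treenode_flat is hypothetical, the sequence default for id is left out because the listing elides it, and the primary key is assumed from the index list in the question.

```sql
-- Sketch only: column order and types follow the listing above.
CREATE TABLE treenode_flat (            -- hypothetical table name
    id            bigint      NOT NULL, -- sequence default elided in the listing
    creation_time timestamptz NOT NULL DEFAULT now(),
    edition_time  timestamptz NOT NULL DEFAULT now(),
    user_id       integer     NOT NULL,
    project_id    integer     NOT NULL,
    location_x    real        NOT NULL,
    location_y    real        NOT NULL,
    location_z    real        NOT NULL,
    radius        real        NOT NULL DEFAULT 0,
    skeleton_id   integer     NOT NULL,
    confidence    smallint    NOT NULL DEFAULT 5,
    parent_id     bigint,
    editor_id     integer,
    PRIMARY KEY (id)                    -- assumed, mirrors treenode_pkey
);

-- The bounding-box filter from the question, written against plain columns.
SELECT id, location_x, location_y, location_z
FROM   treenode_flat
WHERE  location_z = 41000.0
AND    location_x > 2822.6  AND location_x < 62680.2
AND    location_y > 33629.8 AND location_y < 65458.6
AND    project_id = 1
LIMIT  5000;
```

With plain columns, the filter and any indexes can reference location_x, location_y and location_z directly instead of expressions such as ((location).x).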
Test before and after to verify my claim:
SELECT pg_relation_size('treenode') AS table_size;

SELECT avg(pg_column_size(t)) AS avg_row_size
FROM   treenode t;
More details:
Measuring the size of a PostgreSQL table row
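One possible way to run the before-and-after comparison without touching the original table is to build a scratch copy with the composite column split into plain columns. The name treenode_flat_copy is hypothetical, and the sketch assumes the treenode table from the question exists. Note that CREATE TABLE AS does not carry over constraints, defaults or indexes, so this is only suitable for comparing sizes.

```sql
-- Assumption: treenode_flat_copy is a scratch table used only for measurement.
CREATE TABLE treenode_flat_copy AS
SELECT id, creation_time, edition_time, user_id, project_id,
       (location).x AS location_x,
       (location).y AS location_y,
       (location).z AS location_z,
       radius, skeleton_id, confidence, parent_id, editor_id
FROM   treenode;

-- Total heap size of each table.
SELECT pg_relation_size('treenode')           AS old_table_size,
       pg_relation_size('treenode_flat_copy') AS new_table_size;

-- Average size of a row value in each table.
SELECT (SELECT avg(pg_column_size(t)) FROM treenode t)           AS old_avg_row_size,
       (SELECT avg(pg_column_size(f)) FROM treenode_flat_copy f) AS new_avg_row_size;
```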

Quoted from: https://dba.stackexchange.com/questions/72787
