Why does Redshift UNLOAD increase the precision of a float4 column in the output?

October 22, 2019

I created a table in AWS Redshift, for example:
CREATE TABLE exampleTableName (
 id       int identity(1,1),
 accId    varchar(16) encode zstd,
 amount   float4,

 primary key(id)
)
distkey(accId)
interleaved sortkey(accId);
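A sample record in the table has the value 120.12 in the amount column.
However, when I export the data by running UNLOAD, the generated file (essentially a CSV) shows extra precision for that column.
The UNLOAD command: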
UNLOAD ('SELECT * from exampleTableName')
TO 's3://bucket/prefixFile_'
IAM_ROLE 'XXX'
HEADER
ADDQUOTES
PARALLEL OFF
MAXFILESIZE AS 5gb
DELIMITER AS ',' 
GZIP;
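The column value in the resulting output is 120.120003 (that is, it gains 4 decimal places that are not in the original dataset).
Why does this happen, and how can I prevent the extra precision (i.e. the extra decimal places) from being written out by the UNLOAD command?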

Answer from the AWS forums:

This happens when you use FLOAT for decimal data, because it cannot store values
of arbitrary precision. https://en.wikipedia.org/wiki/IEEE_754

I would generally always recommend using the DECIMAL datatype unless you have an
existing application that has an unchangeable requirement for FLOAT, e.g., 
calculation expects FLOAT and output cannot change.

Additionally, by using DECIMAL you are able to use our new AZ64 compression encoding
which will reduce the amount of storage needed and improve your query performance.
https://aws.amazon.com/about-aws/whats-new/2019/10/amazon-redshift-introduces-az64-a-new-compression-encoding-for-optimized-storage-and-high-query-performance/
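
To make the IEEE 754 point concrete, here is a small sketch in Python (not Redshift itself) that rounds 120.12 to the nearest 32-bit float, which is what a float4 column can hold, and prints it back with different numbers of significant digits. The helper name `as_float32` is made up for this illustration, and the choice of 9 significant digits is an assumption: it is the number of digits needed to round-trip any 32-bit float, and it happens to reproduce the value seen in the unloaded file.

```python
import struct
from decimal import Decimal


def as_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    # Packing with "<f" stores the value as an IEEE 754 single-precision number.
    return struct.unpack("<f", struct.pack("<f", value))[0]


# What a 32-bit float column actually stores for the input 120.12.
stored = as_float32(120.12)

# 120.12 has no exact binary representation, so the stored value is slightly off.
print(stored == 120.12)      # False

# Printed with 9 significant digits, the stored value shows the "extra" decimals.
print(f"{stored:.9g}")       # 120.120003

# Printed with 6 significant digits, it looks like the original input again.
print(f"{stored:.6g}")       # 120.12

# A decimal type keeps the digits exactly as written.
print(Decimal("120.12"))     # 120.12
```

So the extra digits are not added by UNLOAD; they are the stored float4 value written out at full precision. Following the answer's recommendation, one option would be to declare the column as DECIMAL. Another, if the table cannot be changed, might be to cast in the unloaded query, for example `CAST(amount AS DECIMAL(12,2))`, where the precision and scale are placeholders to adjust for the real data.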

Source: https://dba.stackexchange.com/questions/250485
