Mysql

Performance drops in MariaDB (MySQL) when joining a small table and filtering on a non-key column

  • August 15, 2020

I am fairly new to MariaDB and I am struggling with a problem I cannot get to the bottom of. Here is the query:

SELECT SQL_NO_CACHE STRAIGHT_JOIN
   `c`.`Name` AS `CategoryName`, 
   `c`.`UrlSlug` AS `CategorySlug`, 
   `n`.`Description`, 
   IF(n.OriginalImageUrl IS NOT NULL, n.OriginalImageUrl, s.LogoUrl) AS `ImageUrl`, 
   `n`.`Link`, 
   `n`.`PublishedOn`, 
   `s`.`Name` AS `SourceName`, 
   `s`.`Url` AS `SourceWebsite`, 
  s.UrlSlug AS SourceUrlSlug,
   `n`.`Title`
FROM `NewsItems` AS `n`
INNER JOIN `NewsSources` AS `s` ON `n`.`NewsSourceId` = `s`.`Id`
LEFT JOIN `Categories` AS `c` ON `n`.`CategoryId` = `c`.`CategoryId`
WHERE s.UrlSlug = 'slug'
#WHERE s.Id = 52
ORDER BY `n`.`PublishedOn` DESC
LIMIT 50

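NewsSources is a table with about 40 rows, and NewsItems has about 1 million rows. Each news item belongs to one source, and a source can have many items. I am trying to fetch all items of the source identified by the source's URL slug.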

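  1. If I use STRAIGHT_JOIN and query a source that has many news items, the query returns immediately. However, if I query a source with few items (~100), or a URL slug that does not belong to any source (a result set of 0 rows), the query runs for 12 seconds.
  2. If I remove STRAIGHT_JOIN, I see the opposite of the first case: the query is very slow for a news source with many items, and returns immediately for a source with few items or when the result set is empty because the URL slug does not belong to any news source.
  3. If I query by news source ID (the commented-out WHERE s.Id = 52), the result comes back immediately, whether the source has many items or 0 items.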

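I want to point out again that the NewsSources table contains only about 40 rows.

Here are the analyzer results for the query above: Explain Analyzer

What can I do to make this query run fast in all cases?

Here are the table and index definitions: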

-- --------------------------------------------------------
-- Server version:               10.4.13-MariaDB-1:10.4.13+maria~bionic - mariadb.org binary distribution
-- Server OS:                    debian-linux-gnu
-- --------------------------------------------------------

-- Dumping structure for table Categories
CREATE TABLE IF NOT EXISTS `Categories` (
 `CategoryId` int(11) NOT NULL AUTO_INCREMENT,
 `Name` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL,
 `Description` longtext COLLATE utf8mb4_unicode_ci NOT NULL,
 `UrlSlug` varchar(30) COLLATE utf8mb4_unicode_ci NOT NULL,
 `CreatedOn` datetime(6) NOT NULL,
 `ModifiedOn` datetime(6) NOT NULL,
 PRIMARY KEY (`CategoryId`)
) ENGINE=InnoDB AUTO_INCREMENT=16 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;


-- Dumping structure for table NewsItems
CREATE TABLE IF NOT EXISTS `NewsItems` (
 `Id` bigint(20) NOT NULL AUTO_INCREMENT,
 `NewsSourceId` int(11) NOT NULL,
 `Title` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 `Link` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 `Description` longtext COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 `PublishedOn` datetime(6) NOT NULL,
 `GlobalId` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 `CategoryId` int(11) DEFAULT NULL,
 PRIMARY KEY (`Id`),
 KEY `IX_NewsItems_CategoryId` (`CategoryId`),
 KEY `IX_NewsItems_NewsSourceId_GlobalId` (`NewsSourceId`,`GlobalId`),
 KEY `IX_NewsItems_PublishedOn` (`PublishedOn`),
 KEY `IX_NewsItems_NewsSourceId` (`NewsSourceId`),
 FULLTEXT KEY `Title` (`Title`,`Description`),
 CONSTRAINT `FK_NewsItems_Categories_CategoryId` FOREIGN KEY (`CategoryId`) REFERENCES `Categories` (`CategoryId`),
 CONSTRAINT `FK_NewsItems_NewsSources_NewsSourceId` FOREIGN KEY (`NewsSourceId`) REFERENCES `NewsSources` (`Id`)
) ENGINE=InnoDB AUTO_INCREMENT=649802 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;


-- Dumping structure for table NewsSources
CREATE TABLE IF NOT EXISTS `NewsSources` (
 `Id` int(11) NOT NULL AUTO_INCREMENT,
 `Name` varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL,
 `Url` varchar(500) COLLATE utf8mb4_unicode_ci NOT NULL,
 `UrlSlug` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 `LogoUrl` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 PRIMARY KEY (`Id`)
) ENGINE=InnoDB AUTO_INCREMENT=55 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

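Regarding point 3:

If I query by news source ID (the commented-out WHERE s.Id = 52), the result comes back immediately, whether the source has many items or 0 items.

This is possible because, with WHERE s.Id = 52, indexes on both the NewsSources and NewsItems tables can be used. Do check the explain plan for that case; it may differ from the one you posted.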
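One way to compare the two plans is to run EXPLAIN on both variants. The sketch below trims the select list for brevity and reuses the placeholder slug 'slug' and Id 52 from the question; the actual plans depend on your data and index statistics.

```sql
-- Plan when filtering on the non-key column UrlSlug
EXPLAIN
SELECT n.Title, n.PublishedOn
FROM NewsItems AS n
INNER JOIN NewsSources AS s ON n.NewsSourceId = s.Id
WHERE s.UrlSlug = 'slug'
ORDER BY n.PublishedOn DESC
LIMIT 50;

-- Plan when filtering on the primary key of NewsSources
EXPLAIN
SELECT n.Title, n.PublishedOn
FROM NewsItems AS n
INNER JOIN NewsSources AS s ON n.NewsSourceId = s.Id
WHERE s.Id = 52
ORDER BY n.PublishedOn DESC
LIMIT 50;
```

In the output, the order of the rows shows which table is read first, and the key column shows which index, if any, was chosen for each table.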

Try creating the following index:

create index IDX_UrlSlug on NewsSources(UrlSlug);  

And optimize all three tables:


OPTIMIZE TABLE NewsSources;
OPTIMIZE TABLE NewsItems;
OPTIMIZE TABLE Categories;
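As a side note, for InnoDB tables OPTIMIZE TABLE rebuilds the table and then refreshes its statistics, which can take a while on a table the size of NewsItems. If only fresh optimizer statistics are wanted, a lighter sketch, assuming a rebuild is not needed, would be:

```sql
-- Refresh index statistics without rebuilding the tables
ANALYZE TABLE NewsSources;
ANALYZE TABLE NewsItems;
ANALYZE TABLE Categories;
```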

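YOUR PROBLEM COMES DOWN TO WHAT STRAIGHT_JOIN DOES

Using STRAIGHT_JOIN can take the query optimizer out of the picture for certain steps.

For example, note what the MySQL Internals documentation says:

Direct use of find_best() and greedy_search() will not apply for LEFT JOIN or RIGHT JOIN. For example, starting with MySQL 4.0.14, the optimizer may change a left join to a straight join and swap the table order in some cases. See also Outer Join Optimization.

Using STRAIGHT_JOIN makes the server process the tables in the order they appear in the query. This is not always good. Why not ???

For example, look at this part of the query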

FROM `NewsItems` AS `n`
INNER JOIN `NewsSources` AS `s` ON `n`.`NewsSourceId` = `s`.`Id`

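Removing STRAIGHT_JOIN lets the query optimizer examine the table and index metrics of both tables (i.e. row counts, index cardinalities, etc.) and decide which one should come first. Using STRAIGHT_JOIN bypasses that step by making the query optimizer always process the NewsItems table before the NewsSources table, regardless of which table has the better metrics.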
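To see roughly what those metrics look like on your server, you can inspect them yourself. This is only a sketch: it assumes the tables live in the currently selected database, and the numbers InnoDB reports are estimates rather than exact counts.

```sql
-- Index cardinality estimates for both tables
SHOW INDEX FROM NewsItems;
SHOW INDEX FROM NewsSources;

-- Approximate row counts as recorded in the data dictionary
SELECT TABLE_NAME, TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME IN ('NewsItems', 'NewsSources');
```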

I suggest three (3) things:

SUGGESTION #1

Do not use STRAIGHT_JOIN

SELECT SQL_NO_CACHE
   `c`.`Name` AS `CategoryName`, 
   `c`.`UrlSlug` AS `CategorySlug`, 
   `n`.`Description`, 
   IF(n.OriginalImageUrl IS NOT NULL, n.OriginalImageUrl, s.LogoUrl) AS `ImageUrl`, 
   `n`.`Link`, 
   `n`.`PublishedOn`, 
   `s`.`Name` AS `SourceName`, 
   `s`.`Url` AS `SourceWebsite`, 
  s.UrlSlug AS SourceUrlSlug,
   `n`.`Title`
FROM `NewsItems` AS `n`
INNER JOIN `NewsSources` AS `s` ON `n`.`NewsSourceId` = `s`.`Id`
LEFT JOIN `Categories` AS `c` ON `n`.`CategoryId` = `c`.`CategoryId`
WHERE s.UrlSlug = 'slug'
#WHERE s.Id = 52
ORDER BY `n`.`PublishedOn` DESC
LIMIT 50

SUGGESTION #2

If you still need STRAIGHT_JOIN, change the order of the tables:

SELECT SQL_NO_CACHE STRAIGHT_JOIN
   `c`.`Name` AS `CategoryName`, 
   `c`.`UrlSlug` AS `CategorySlug`, 
   `n`.`Description`, 
   IF(n.OriginalImageUrl IS NOT NULL, n.OriginalImageUrl, s.LogoUrl) AS `ImageUrl`, 
   `n`.`Link`, 
   `n`.`PublishedOn`, 
   `s`.`Name` AS `SourceName`, 
   `s`.`Url` AS `SourceWebsite`, 
  s.UrlSlug AS SourceUrlSlug,
   `n`.`Title`
FROM `NewsSources` AS `s`
INNER JOIN `NewsItems` AS `n` ON `n`.`NewsSourceId` = `s`.`Id`
LEFT JOIN `Categories` AS `c` ON `n`.`CategoryId` = `c`.`CategoryId`
WHERE s.UrlSlug = 'slug'
ORDER BY `n`.`PublishedOn` DESC
LIMIT 50

SUGGESTION #3 (OPTIONAL)

If you need to run the query with a specific s.Id (e.g. 52), look up that row before performing any join:

SELECT SQL_NO_CACHE
   `c`.`Name` AS `CategoryName`, 
   `c`.`UrlSlug` AS `CategorySlug`, 
   `n`.`Description`, 
   IF(n.OriginalImageUrl IS NOT NULL, n.OriginalImageUrl, s.LogoUrl) AS `ImageUrl`, 
   `n`.`Link`, 
   `n`.`PublishedOn`, 
   `s`.`Name` AS `SourceName`, 
   `s`.`Url` AS `SourceWebsite`, 
  s.UrlSlug AS SourceUrlSlug,
   `n`.`Title`
FROM `NewsItems` AS `n`
INNER JOIN (SELECT * FROM `NewsSources` WHERE Id = 52) AS `s`
ON `n`.`NewsSourceId` = `s`.`Id`
LEFT JOIN `Categories` AS `c` ON `n`.`CategoryId` = `c`.`CategoryId`
WHERE s.UrlSlug = 'slug'
ORDER BY `n`.`PublishedOn` DESC
LIMIT 50

UPDATE 2020-06-14 14:42 EDT

Another suggestion: move WHERE s.UrlSlug = 'slug' into the subquery, without STRAIGHT_JOIN

SELECT SQL_NO_CACHE
   `c`.`Name` AS `CategoryName`, 
   `c`.`UrlSlug` AS `CategorySlug`, 
   `n`.`Description`, 
   IF(n.OriginalImageUrl IS NOT NULL, n.OriginalImageUrl, s.LogoUrl) AS `ImageUrl`, 
   `n`.`Link`, 
   `n`.`PublishedOn`, 
   `s`.`Name` AS `SourceName`, 
   `s`.`Url` AS `SourceWebsite`, 
   s.UrlSlug AS SourceUrlSlug,
   `n`.`Title`
FROM `NewsItems` AS `n`
INNER JOIN (SELECT * FROM `NewsSources` WHERE UrlSlug = 'slug') AS `s`
ON `n`.`NewsSourceId` = `s`.`Id`
LEFT JOIN `Categories` AS `c` ON `n`.`CategoryId` = `c`.`CategoryId`
ORDER BY `n`.`PublishedOn` DESC
LIMIT 50

Source: https://dba.stackexchange.com/questions/268532