Google-Bigquery

Counting unique records across 2 columns and calculating a score based on a boolean value

  • October 7, 2016

I have a table in Google BigQuery that looks like this:

custID  InteractionDate Purchased?  Sales
1       20150312           F         0
1       20150312           T         200
1       20150314           T         150
2       20150221           T         400
2       20150221           F         0
2       20150221           T         120

..which I need to convert into the following structure:

custID InteractionDate Success Sales
1     20150312          0.5    200
1     20150314           1     150
2     20150221         0.66    520

..where, in the second table, the Success column is defined as

count(purchase=Yes)/(count(purchase=Yes) + count(purchase=No)),

and Sales is simply the sum.

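I am very new to BQ and to non-SQL languages. I have no idea how to do this. I have searched and seen suggestions to use partitioning, but I have not found a single example that actually does it. Thanks very much in advance.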

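This is quite simple in SQL. As far as I can tell from the BigQuery query reference, it supports everything needed (the GROUP BY clause, the COUNT() and SUM() functions, and CASE expressions):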

SELECT 
   custID, InteractionDate,
   1.0 * COUNT(CASE WHEN Purchased = 'T' THEN 1 END) / COUNT(*) AS Success,
   SUM(Sales) AS Sales
FROM
   tableName
GROUP BY
   custID, InteractionDate ;
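
For anyone who wants to try the same grouping without creating a table first, here is a minimal sketch in BigQuery standard SQL. It is not part of the answer above: the CTE name interactions is made up for illustration, the rows are the six sample rows from the question, and COUNTIF is used as a shorter stand-in for the COUNT(CASE ...) expression.

```sql
-- Sketch only: "interactions" is a hypothetical inline table holding the
-- six sample rows from the question.
WITH interactions AS (
  SELECT 1 AS custID, 20150312 AS InteractionDate, 'F' AS Purchased, 0 AS Sales
  UNION ALL SELECT 1, 20150312, 'T', 200
  UNION ALL SELECT 1, 20150314, 'T', 150
  UNION ALL SELECT 2, 20150221, 'T', 400
  UNION ALL SELECT 2, 20150221, 'F', 0
  UNION ALL SELECT 2, 20150221, 'T', 120
)
SELECT
  custID,
  InteractionDate,
  -- Share of rows in the group with Purchased = 'T'.
  -- In standard SQL, "/" on two integers returns a FLOAT64.
  ROUND(COUNTIF(Purchased = 'T') / COUNT(*), 2) AS Success,
  SUM(Sales) AS Sales
FROM interactions
GROUP BY custID, InteractionDate
ORDER BY custID, InteractionDate;
```

Note that ROUND gives 0.67 for customer 2, whereas the table in the question shows 2/3 truncated to 0.66.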

You can use this (BigQuery legacy SQL, where the comma between subqueries in FROM acts as UNION ALL):

SELECT custID,
      InteractionDate,
      IFNULL(ROUND((yes)/(yes+NO), 2),0) AS Success,
      sales
FROM
 (SELECT custID,
         InteractionDate,
         sum(if(Purchased=='T',1,0)) AS yes,
         sum(if(Purchased!='T',1,0)) AS NO,
         sum(sales) AS sales
  FROM
    (SELECT 1 AS custID,
            20150312 AS InteractionDate,
            'F' AS Purchased,
            0 AS Sales),
    (SELECT 1 AS custID,
            20150312 AS InteractionDate,
            'T' AS Purchased,
            200 AS Sales),
    (SELECT 1 AS custID,
            20150314 AS InteractionDate,
            'T' AS Purchased,
            150 AS Sales),
    (SELECT 2 AS custID,
            20150221 AS InteractionDate,
            'T' AS Purchased,
            400 AS Sales),
    (SELECT 2 AS custID,
            20150221 AS InteractionDate,
            'F' AS Purchased,
            0 AS Sales),
    (SELECT 2 AS custID,
            20150221 AS InteractionDate,
            'T' AS Purchased,
            120 AS Sales)
  GROUP BY custID,
           InteractionDate)

Or, more simply:

SELECT custID,
      InteractionDate,
      round(sum(if(Purchased=='T',1,0))/(sum(if(Purchased=='T',1,0))+sum(if(Purchased!='T',1,0))),2) AS Success,
      sum(sales) AS sales
FROM TABLE
GROUP BY custID,
        InteractionDate
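
The two queries above use legacy SQL syntax (== and the comma union). A rough standard SQL equivalent might look like the sketch below. It assumes a hypothetical table `my_dataset.interactions` with the columns from the question. SAFE_DIVIDE returns NULL instead of raising an error when the denominator is zero, which can only happen here if every Purchased value in a group is NULL, and IFNULL then maps that case to 0, as the first query does.

```sql
-- Sketch only: `my_dataset.interactions` is a hypothetical table name.
SELECT
  custID,
  InteractionDate,
  IFNULL(
    ROUND(
      SAFE_DIVIDE(
        COUNTIF(Purchased = 'T'),  -- the "yes" count
        COUNT(Purchased)           -- yes + no: rows where Purchased is not NULL
      ),
      2),
    0) AS Success,
  SUM(Sales) AS Sales
FROM `my_dataset.interactions`
GROUP BY custID, InteractionDate;
```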

Source: https://dba.stackexchange.com/questions/95773