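tl;dr: I want to generate a date table in Redshift to make a report easier to build, preferably without relying on a large table already in Redshift or having to upload a CSV file. How do I create a date table in Redshift?
Long version: I am working on a report where I have to average the number of new items created per week. The date range can span months or more, so there may be 5 Mondays but only 4 Sundays, which makes the math a little tricky. I also have no guarantee that there is an instance of an item for every single day, especially once a user starts slicing the data. This, among other things, is tripping up the BI tool.
The best way to solve this is most likely a date table. However, most tutorials for date tables use SQL commands that Redshift does not offer or does not fully support (I'm looking at you, generate_series).
Is there an easy way to generate a date table in Redshift?
The code I tried to use (based on this suggestion, which also does not work: http://elliot.land/post/building-a-date-dimension-table-in-redshift)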
CREATE TABLE facts.dates (
"date_id" INTEGER NOT NULL PRIMARY KEY,
-- DATE
"full_date" DATE NOT NULL,
-- YEAR
"year_number" SMALLINT NOT NULL,
"year_week_number" SMALLINT NOT NULL,
"year_day_number" SMALLINT NOT NULL,
-- QUARTER
"qtr_number" SMALLINT NOT NULL,
-- MONTH
"month_number" SMALLINT NOT NULL,
"month_name" CHAR(9) NOT NULL,
"month_day_number" SMALLINT NOT NULL,
-- WEEK
"week_day_number" SMALLINT NOT NULL,
-- DAY
"day_name" CHAR(9) NOT NULL,
"day_is_weekday" SMALLINT NOT NULL,
"day_is_last_of_month" SMALLINT NOT NULL
) DISTSTYLE ALL SORTKEY (date_id)
;
INSERT INTO facts.dates
(
"date_id"
,"full_date"
,"year_number"
,"year_week_number"
,"year_day_number"
-- QUARTER
,"qtr_number"
-- MONTH
,"month_number"
,"month_name"
,"month_day_number"
-- WEEK
,"week_day_number"
-- DAY
,"day_name"
,"day_is_weekday"
,"day_is_last_of_month"
)
SELECT
cast(seq + 1 AS INTEGER) AS date_id,
-- DATE
datum AS full_date,
-- YEAR
cast(extract(YEAR FROM datum) AS SMALLINT) AS year_number,
cast(extract(WEEK FROM datum) AS SMALLINT) AS year_week_number,
cast(extract(DOY FROM datum) AS SMALLINT) AS year_day_number,
-- QUARTER
cast(to_char(datum, 'Q') AS SMALLINT) AS qtr_number,
-- MONTH
cast(extract(MONTH FROM datum) AS SMALLINT) AS month_number,
to_char(datum, 'Month') AS month_name,
cast(extract(DAY FROM datum) AS SMALLINT) AS month_day_number,
-- WEEK
cast(to_char(datum, 'D') AS SMALLINT) AS week_day_number,
-- DAY
to_char(datum, 'Day') AS day_name,
CASE WHEN to_char(datum, 'D') IN ('1', '7')
THEN 0
ELSE 1 END AS day_is_weekday,
CASE WHEN
extract(DAY FROM (datum + (1 - extract(DAY FROM datum)) :: INTEGER +
INTERVAL '1' MONTH) :: DATE -
INTERVAL '1' DAY) = extract(DAY FROM datum)
THEN 1
ELSE 0 END AS day_is_last_of_month
FROM
-- Generate days for 81 years starting from 2000.
(
SELECT
'2000-01-01' :: DATE + generate_series AS datum,
generate_series AS seq
FROM generate_series(0,81 * 365 + 20,1)
) DQ
ORDER BY 1;
This throws the following error:
[Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
1 statement failed.
...because, I assume, INSERT and generate_series are not allowed in the same command in Redshift.
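As you have discovered, `generate_series()` cannot be used with actual table data, because it executes only on the leader node. Your approach of generating a numbers table and then joining against it works well. Alternatively, create the source file in Excel and just import the result. A date table like this is great for reporting. Other things you might want to add: a public holiday flag, a last-day-of-quarter flag, and a last-day-of-year flag (useful for reports that group by the last date of a period). –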
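For anyone looking for a starting point, here is a minimal sketch of the numbers-table approach mentioned in the comment above. It has not been run against the asker's cluster. The temp table names `digits` and `numbers` are made up for illustration, and the sketch assumes that multi-row `INSERT ... VALUES`, `CREATE TEMP TABLE ... AS`, and `DATEADD` are available on the cluster. The idea is to cross-join a ten-row digits table with itself so the sequence comes from ordinary tables rather than from `generate_series()`.

```sql
-- Hypothetical helper table: the ten decimal digits.
CREATE TEMP TABLE digits (d INTEGER NOT NULL);
INSERT INTO digits VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9);

-- Cross-join five copies of digits to get 0..99999, then keep only the
-- range used in the question (0 .. 81 * 365 + 20 = 29585).
CREATE TEMP TABLE numbers AS
SELECT seq
FROM (
    SELECT (a.d + 10 * b.d + 100 * c.d + 1000 * e.d + 10000 * f.d)::INTEGER AS seq
    FROM digits a
    CROSS JOIN digits b
    CROSS JOIN digits c
    CROSS JOIN digits e
    CROSS JOIN digits f
) all_numbers
WHERE seq <= 81 * 365 + 20;

-- Same shape as the DQ subquery in the question: one row per day
-- starting from 2000-01-01, plus the running sequence number.
SELECT
    DATEADD(day, seq, '2000-01-01'::DATE)::DATE AS datum,
    seq
FROM numbers
ORDER BY seq;
```

Under those assumptions, the final SELECT could stand in for the `generate_series` subquery aliased `DQ` in the question's INSERT, since it reads from a real table instead of a leader-node-only function.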
I like those extra columns. Thanks, John! – Phillip