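Joining tables on dates that do not match
Using SAS: how can I join two tables by date when the dates do not actually match? For example, I want to add a column to full_table containing the 'type' from the changepoints table, matched intelligently by date, so that each row gets the type of the most recent changepoint on or before its date.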
ods listing;
/**********************************************************
main table
***********************************************************/
DATA full_table;
input id $ date date9.;
FORMAT date date9.;
DATALINES;
a 01APR2015
b 02APR2015
c 03APR2015
d 01JUN2015
e 24JUN2015
f 01DEC2015
;
RUN;
PROC PRINT;
run;
/**********************************************************
additional information
***********************************************************/
DATA changepoints;
input date date9. type $;
FORMAT date date9.;
DATALINES;
15MAR2014 spiral
05JUN2015 circle
29NOV2015 square
;
RUN;
PROC PRINT;
run;
/**********************************************************
Desired result
***********************************************************/
DATA new_table;
input id $ date date9. type $;
FORMAT date date9.;
DATALINES;
a 01APR2015 spiral
b 02APR2015 spiral
c 03APR2015 spiral
d 01JUN2015 spiral
e 24JUN2015 circle
f 01DEC2015 square
;
RUN;
PROC PRINT;
run;
/**********************************************************
An equality join does not work: no dates match exactly,
so type comes back missing for every row
***********************************************************/
PROC SQL;
create table new_table2 as
select full_table.*, changepoints.type
from full_table left join changepoints
on full_table.date = changepoints.date;
QUIT;
The desired output would be:
Obs id date type
1 a 01APR2015 spiral
2 b 02APR2015 spiral
3 c 03APR2015 spiral
4 d 01JUN2015 spiral
5 e 24JUN2015 circle
6 f 01DEC2015 square
ANSWER, based on the correct answer below:
ods listing;
/**********************************************************
main table
***********************************************************/
DATA full_table;
input id $ date date9.;
FORMAT date date9.;
DATALINES;
a 01APR2015
b 02APR2015
c 03APR2015
d 01JUN2015
e 24JUN2015
f 01DEC2015
;
RUN;
PROC PRINT;
RUN;
/**********************************************************
additional information
***********************************************************/
DATA changepoints;
input date date9. type $;
FORMAT date date9.;
DATALINES;
15MAR2014 spiral
05JUN2015 circle
29NOV2015 square
;
RUN;
PROC PRINT;
RUN;
/**********************************************************
Update changepoints to have start/end dates so the sql join
works
***********************************************************/
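/* Sort newest first so that lag(date) returns the date of the next (later) changepoint */
PROC SORT data=changepoints;
by descending date;
RUN;
DATA changepoints;
set changepoints;
end = lag(date); /* exclusive end of the interval; missing for the latest changepoint */
start = date; /* inclusive start of the interval */
format start end date9.;
RUN;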
PROC SORT data=changepoints;
by date;
RUN;
/* The latest changepoint has no successor, so close its interval at today's date */
DATA changepoints;
set changepoints end=eof;
by start;
IF eof and missing(end) THEN end = today();
RUN;
PROC PRINT;
RUN;
/**********************************************************
Join
***********************************************************/
proc sql noprint;
create table test as
select a.id,a.date,b.type
from full_table as a
left join
changepoints as b
on a.date >= b.start
and a.date < b.end;
quit;
PROC PRINT;
RUN;
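For comparison, here is a minimal sketch of a different approach that skips the start/end columns: join every row to all changepoints on or before its date, then keep only the latest one per row. This is not the method from the accepted answer. The output name new_table3 is made up for illustration, and the query relies on PROC SQL remerging the summary statistic max(b.date) back onto the detail rows (SAS writes a NOTE about this to the log). It assumes each id/date pair appears only once in full_table.

```sas
proc sql;
  /* new_table3 is a hypothetical output name */
  create table new_table3 as
  select a.id, a.date, b.type
  from full_table as a
  left join changepoints as b
    on b.date <= a.date          /* all changepoints on or before the row's date */
  group by a.id, a.date
  having b.date = max(b.date);   /* keep only the most recent of those */
quit;
PROC PRINT;
RUN;
```

Rows dated before the first changepoint should still appear, with a missing type, because SAS treats two missing values as equal in the HAVING comparison. This version also does not depend on today().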
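Although this is not a complete answer (it requires writing code to add more dates to the changepoints dataset), I decided to go with it because it is cleaner, clearer, and easier to understand. Thanks!! – variable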