Classification / decision trees and choosing splits

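This is a very basic example, but I am doing some data analysis and keep finding myself writing very similar SQL count queries to generate probability tables.

My tables are defined so that a value of 0 means the event did not occur and a value of 1 means the event did occur.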

> sqldf("select count(distinct Date) from joinedData where C_O_Above_prevHigh = 0 and C_O_Below_prevLow = 0") 
    count(distinct Date) 
1     1081 

> sqldf("select count(distinct Date) from joinedData where C_O_Above_prevHigh = 0 and C_O_Below_prevLow = 0 and E_halfGap = 1") 
    count(distinct Date) 
1     956 

> sqldf("select count(distinct Date) from joinedData where C_O_Above_prevHigh = 1 OR C_O_Below_prevLow = 1 and E_halfGap = 1") 
    count(distinct Date) 
1     504

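In the example above, my predictor variables are C_O_Above_prevHigh and C_O_Below_prevLow, and my outcome variable is E_halfGap. In some cases there may be more predictor variables, for example Time.

Rather than doing the above and typing out every permutation of my queries by hand, is there something available in R or in some other application that will:

1) Output the possible probability paths based on my predictors?
2) Let me choose how to split the paths?

I appreciate your input.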

2012-04-26 Dave

If you want all totals and subtotals, you can compute them either in SQL or in R.

addmargins(Titanic) 
# More readable: 
ftable(addmargins(Titanic))
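Applied to the question's data, the same idea might look like the sketch below. The joinedData frame here is synthetic and purely illustrative: it assumes one row per Date (so row counts match count(distinct Date)) and reuses the column names from the question. Only base R functions are used (xtabs, addmargins, ftable, prop.table).

```r
# Hypothetical stand-in for joinedData: one row per Date, 0/1 flags.
set.seed(1)
n <- 200
joinedData <- data.frame(
  Date               = seq(as.Date("2011-01-03"), by = "day", length.out = n),
  C_O_Above_prevHigh = rbinom(n, 1, 0.3),
  C_O_Below_prevLow  = rbinom(n, 1, 0.3),
  E_halfGap          = rbinom(n, 1, 0.6)
)

# Count rows for every combination of predictors and outcome in one call,
# instead of one hand-written query per combination.
counts <- xtabs(~ C_O_Above_prevHigh + C_O_Below_prevLow + E_halfGap,
                data = joinedData)

# Totals and subtotals, printed as a flat table.
print(ftable(addmargins(counts)))

# Conditional probabilities P(E_halfGap | predictors):
# normalise each predictor combination over the outcome dimension.
probs <- prop.table(counts, margin = c(1, 2))
print(ftable(round(probs, 3)))
```

Adding another predictor such as Time should only require adding it to the formula, provided it is first binned into a small number of categories.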

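In SQL, use GROUP BY CUBE (not available in SQLite, which sqldf uses by default); in R, use addmargins. If you want to build a decision tree, you can use the rpart package, or check the Machine Learning and Graphical Models task views on CRAN.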
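As a rough sketch of the rpart route: the data below is synthetic (the outcome is simulated so that it depends on the predictors), and the parameter values are illustrative assumptions rather than recommendations. Note that rpart picks the splits itself; what you control is the splitting criterion (parms) and the stopping rules (rpart.control).

```r
library(rpart)

# Hypothetical data using the question's column names.
set.seed(1)
n <- 500
joinedData <- data.frame(
  C_O_Above_prevHigh = rbinom(n, 1, 0.3),
  C_O_Below_prevLow  = rbinom(n, 1, 0.3)
)

# Simulated outcome: more likely when neither predictor event occurred.
inside <- joinedData$C_O_Above_prevHigh == 0 & joinedData$C_O_Below_prevLow == 0
joinedData$E_halfGap <- factor(rbinom(n, 1, ifelse(inside, 0.9, 0.4)))

# Fit a classification tree. split = "gini" or "information" selects the
# splitting criterion; minsplit and cp limit how far the tree grows.
fit <- rpart(E_halfGap ~ C_O_Above_prevHigh + C_O_Below_prevLow,
             data    = joinedData,
             method  = "class",
             parms   = list(split = "gini"),
             control = rpart.control(minsplit = 20, cp = 0.01))

# Each printed node shows the split rule, the counts, and class probabilities.
print(fit)

# Per-row class probabilities from the fitted tree.
p <- predict(fit, type = "prob")
print(head(p))
```

Each path from the root to a leaf in the printed tree corresponds to one of the hand-written queries, with the class probabilities at the leaf playing the role of the probability table.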

2012-04-26 08:42:57

Thanks for pointing me toward CUBE. – Dave 2012-05-23 15:49:20
