There is probably a more efficient way, but this function gives you what you want:
consolidate_rules <- function(tree) {
  # Candidate split variables, taken from the root node's criterion matrix
  split.vars <- colnames(tree$node$info$criterion)
  # One rule string per terminal node (unexported partykit helper)
  split <- partykit:::.list.rules.party(tree)
  new.split <- character(length(split))
  for (i.split in seq_along(split)) {
    # Individual conditions of this rule
    x1 <- strsplit(split[i.split], " & ", fixed = TRUE)[[1]]
    parts <- c()
    for (i.split.var in split.vars) {
      # Conditions on this variable only (exact name match at the start)
      x2 <- x1[startsWith(x1, paste0(i.split.var, " "))]
      x3l <- strsplit(grep(" <= ", x2, value = TRUE, fixed = TRUE),
                      " <= ", fixed = TRUE)   # lower than or equal
      x3g <- strsplit(grep(" > ", x2, value = TRUE, fixed = TRUE),
                      " > ", fixed = TRUE)    # greater than
      x3e <- strsplit(grep(" %in% ", x2, value = TRUE, fixed = TRUE),
                      " %in% ", fixed = TRUE) # elements
      if (length(x3e) != 0) {
        # Parse each level set: drop only the c( ) wrapper, the quotes and NA
        lev <- lapply(x3e, function(p) {
          b <- sub("^c\\((.*)\\)$", "\\1", trimws(p[2]))
          b <- gsub('"', "", trimws(strsplit(b, ",", fixed = TRUE)[[1]]),
                    fixed = TRUE)
          b[b != "NA"]
        })
        # The conditions are joined by "&", so only levels common to all remain
        b4 <- Reduce(intersect, lev)
        parts <- c(parts, paste0(i.split.var, ' %in% c("',
                                 paste0(b4, collapse = '", "'), '")'))
      }
      if (length(x3l) != 0) {
        # The smallest upper bound is the binding one
        parts <- c(parts, paste0(i.split.var, " <= ",
                                 min(as.numeric(sapply(x3l, "[[", 2)))))
      }
      if (length(x3g) != 0) {
        # The largest lower bound is the binding one
        parts <- c(parts, paste0(i.split.var, " > ",
                                 max(as.numeric(sapply(x3g, "[[", 2)))))
      }
    }
    new.split[i.split] <- paste(parts, collapse = " & ")
  }
  names(new.split) <- names(split)
  new.split
}
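The core idea for numeric splits can be seen on a single rule string, using base R only. This is a minimal sketch: the un-consolidated rule string below is made up for illustration and is not actual output of .list.rules.party.

```r
# Hypothetical un-consolidated rule (an assumption, not real partykit output)
rule <- "Sepal.Length <= 6.2 & Sepal.Length <= 5.5 & Petal.Width > 0.6"

# Split the rule into its individual conditions
conds <- strsplit(rule, " & ", fixed = TRUE)[[1]]

# Keep the "<=" conditions on one variable and extract their thresholds
on_var <- conds[startsWith(conds, "Sepal.Length ")]
upper <- grep(" <= ", on_var, value = TRUE, fixed = TRUE)
thresholds <- as.numeric(sub("^.* <= ", "", upper))

# Among several upper bounds, only the smallest one is binding
tightest <- paste("Sepal.Length <=", min(thresholds))
tightest
```

The same reasoning with max() applies to the ">" conditions.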
You can call the function like this:
library(partykit)
ct <- ctree(Petal.Length ~ ., data = iris)
consolidate_rules(ct)
For node 6, the result then looks like this:
6
"Sepal.Length <= 5.5 & Petal.Width <= 1.3 & Petal.Width > 0.6"
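Since the result is "just" a character string containing the rules, I don't know whether you can work with it in the same way as with the output of .list.rules.party.
But I hope this might help you.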
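One possible way to work with such a string, offered as a sketch rather than a tested partykit workflow: because the rule is valid R syntax, it can be parsed and evaluated with the data frame as the evaluation environment. The variable names `rule`, `in_node` and `node_data` are only illustrative, and eval(parse()) should only be used on strings you trust.

```r
# Rule string for node 6, as printed above
rule <- "Sepal.Length <= 5.5 & Petal.Width <= 1.3 & Petal.Width > 0.6"

# Evaluate the rule inside the data frame: one TRUE/FALSE per row
in_node <- eval(parse(text = rule), envir = iris)

# Rows of iris that satisfy the rule
node_data <- iris[in_node, ]
nrow(node_data)
```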
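Thanks for your reply. What you say is certainly true, and I am not saying that the tree can be simplified. However, once the tree is converted into rules, 'Sepal.Length <= 6.2' does become redundant. That is, subset(iris, Sepal.Length <= 5.5 & Petal.Width > 0.6 & Petal.Width <= 1.3) recovers the 9 cases in node 6 without needing the clause 'Sepal.Length <= 6.2'. So it is in this sense that I am trying to consolidate the rules. – qoheleth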
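The redundancy described in this comment can be checked directly in base R. This is a minimal sketch: the position of the redundant clause within the full rule is an assumption, and the object names are illustrative.

```r
# Rule including the redundant clause (clause order is assumed)
with_clause <- subset(iris, Sepal.Length <= 6.2 & Sepal.Length <= 5.5 &
                            Petal.Width > 0.6 & Petal.Width <= 1.3)

# Consolidated rule without 'Sepal.Length <= 6.2'
without_clause <- subset(iris, Sepal.Length <= 5.5 &
                               Petal.Width > 0.6 & Petal.Width <= 1.3)

# Both select exactly the same rows, since Sepal.Length <= 5.5 implies <= 6.2
identical(with_clause, without_clause)  # TRUE
```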