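Error in data.frame: arguments imply differing number of rows: 2, 5, 19, 7, 1, 11, 4, 6, 9, 3, 13, 14, 22, 26, 27, 30, 31, 29, 35

I am working on an R project. To analyze sentiment, I have to create a data frame (in my case, "sentiment.df"):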
sentiment.df <- data.frame(text, emotion=emotion, polarity=polarity, stringsAsFactors=FALSE)
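Here, text is a list of processed (cleaned) tweets, each split into keywords; emotion is a character vector with the emotion assigned to each tweet; polarity is a character vector with the positive/negative classification. When I run the line of code above, RStudio throws the following error: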
Error in data.frame(c("httpstcoux1aacnxbk", "endalz"), c("i", "have", :
arguments imply differing number of rows: 2, 5, 19, 7, 1, 11, 4, 6, 9, 3, 13, 17, 8, 10, 24, 21, 15, 12, 25, 16, 20, 23, 18, 28, 14, 22, 26, 27, 30, 31, 29, 35
All three variables (text, emotion and polarity) have the same length: 2621.
This is what my data looks like:
> str(text)
List of 2621
$ : chr [1:2] "httpstcoux1aacnxbk" "endalz"
$ : chr [1:5] "i" "have" "the" "best" ...
$ : chr [1:19] "kenny" "easley" "seahawks" "captain" ...
$ : chr [1:2] "good" "defense"
$ : chr [1:7] "superbowlxlix" "party" "<U+FFFD><U+FFFD><U+FFFD><U+FFFD>""| __truncated__ "" ...
$ : chr "ihatetombrady"
$ : chr [1:11] "coachbourbonusa" "understood" "still" "dont" ...
$ : chr [1:19] "tiwaworks" "whitney" "houston" "sings" ...
$ : chr [1:4] "thats" "still" "bae" "<U+2764><U+FE0F>""| __truncated__
$ : chr [1:6] "were" "a" "thousand" "miles" ...
$ : chr [1:7] "dredoo24" "what" "i" "like" ...
$ : chr [1:2] "bww" "<U+FFFD><U+FFFD><U+FFFD><U+FFFD>""| __truncated__
$ : chr [1:9] "i" "seriously" "cant" "wait" ...
$ : chr [1:3] "flyysociety" "photoshoot<U+2716><U+FE0F>""| __truncated__ "httptcoxkywsj5i2x"
$ : chr [1:5] "lienne11" "wait" "whos" "performing" ...
$ : chr [1:13] "game" "on" "go" "wildcats<U+FFFD><U+FFFD>\u2b07<U+FE0F>""| __truncated__ ...
$ : chr [1:2] "good" "defense"
$ : chr [1:11] "seattle" "seahawks" "fan" "" ...
$ : chr [1:9] "realprestonj" "congratulations" "preston" "the" ...
$ : chr [1:5] "tsu19" "so" "funny" "bruh" ...
$ : chr [1:4] "drunk" "tweets" "coming" "soon"
$ : chr "tb12"
$ : chr [1:13] "hicksville" "schools" "will" "be" ...
$ : chr [1:5] "but" "momma" "said" "superbowl" ...
$ : chr [1:4] "raggedy" "ass" "bitch" ""
$ : chr [1:5] "arbyscares" "arbys" "prairie" "village" ...
$ : chr [1:17] "lovetruth79" "ltltltloves" "to" "send" ...
$ : chr [1:8] "「boynamedhxlz""| __truncated__ "quote" "this" "tweet" ...
$ : chr [1:13] "stretching" "for" "ballet" "now" ...
$ : chr [1:7] "jerrodflusche" "janabewley" "narnia" "for" ...
$ : chr [1:8] "here" "goes" "my" "whole" ...
$ : chr [1:10] "who" "you" "going" "for" ...
$ : chr [1:3] "good" "stop" "hawks"
$ : chr [1:5] "brady" "be" "smokin" "blounts" ...
$ : chr [1:8] "me" "decepcioné" "perdoné" "hice" ...
$ : chr [1:7] "happy21stbirthdayharry" "" "its" "also" ...
$ : chr [1:24] "teammic3rd" "sounds" "amazing" "" ...
$ : chr [1:21] "millions" "of" "people" "packed" ...
$ : chr [1:8] "missed" "idina" "singing" "by" ...
$ : chr [1:2] "your" "stupid"
$ : chr [1:5] "seahawks" "all" "the" "way" ...
$ : chr [1:4] "takeathillpill" "you" "are" "vile"
$ : chr [1:3] "lets" "goo" "superbowlixlix"
$ : chr [1:4] "snow" "day" "nigga" "<U+FFFD><U+FFFD><U+FFFD><U+FFFD>""| __truncated__
$ : chr [1:6] "ill" "just" "watch" "total" ...
$ : chr [1:9] "liveextra" "site" "down" "its" ...
$ : chr [1:3] "time" "to" "punt"
$ : chr [1:5] "zachdettloff516" "groans" "at" "terrible" ...
$ : chr [1:3] "go" "seahawks" "<U+FFFD><U+FFFD>""| __truncated__
$ : chr [1:7] "pizza" "friends" "super" "bowl" ...
$ : chr [1:9] "hold" "onto" "me" "cause" ...
$ : chr [1:6] "tom" "gonna" "get" "his" ...
$ : chr [1:6] "lets" "goooooo" "nice" "3rd" ...
$ : chr [1:15] "2" "fatal" "crashes" "reported" ...
$ : chr [1:12] "supra" "dope" "atx" "sundayfunday" ...
$ : chr [1:19] "all" "these" "students" "from" ...
$ : chr [1:3] "danstricko" "not" "happening"
$ : chr [1:17] "tom" "brady" "may" "wear" ...
$ : chr "httptconqabzdezwf"
$ : chr [1:4] "i" "miss" "you" "<U+FFFD><U+FFFD><U+FFFD><U+FFFD>""| __truncated__
$ : chr [1:25] "john" "legend" "and" "idina" ...
$ : chr [1:13] "snowed" "in" "with" "kadybuchler" ...
$ : chr [1:6] "that" "bright" "green" "and" ...
$ : chr [1:9] "ive" "got" "the" "seahawks" ...
$ : chr [1:9] "sds" "by" "mac" "miller" ...
$ : chr [1:5] "jakeski52" "rotowire" "or" "roger" ...
$ : chr "damnit"
$ : chr "hawks"
$ : chr [1:7] "my" "nephews" "and" "niece" ...
$ : chr [1:16] "liking" "your" "own" "posts" ...
$ : chr [1:2] "bailaconbruce" "fb"
$ : chr [1:4] "djones7" "hell" "no" "<U+FFFD><U+FFFD>""| __truncated__
$ : chr [1:7] "best" "part" "of" "the" ...
$ : chr [1:13] "holls016" "f" "u" "i" ...
$ : chr [1:6] "mikebarnicle" "nice" "to" "meet" ...
$ : chr [1:5] "u" "played" "me" "dirty" ...
$ : chr [1:13] "my" "bac" "is" "looking" ...
$ : chr [1:2] "est" "2008"
$ : chr [1:12] "vacation" "time" "" "thats" ...
$ : chr [1:3] "<U+FFFD><U+FFFD>""| __truncated__ "ok" "<U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD><U+FFFD"| __truncated__
$ : chr [1:2] "common" "seattle"
$ : chr [1:3] "no" "cacc" "talc"
$ : chr "lob"
$ : chr [1:3] "cut" "the" "crap"
$ : chr [1:11] "im" "at" "las" "alitas" ...
$ : chr [1:3] "backstreets" "back" "alrighttttt"
$ : chr [1:6] "the" "seahawks" "are" "going" ...
$ : chr [1:13] "baby" "its" "cold" "outside" ...
$ : chr [1:15] "i" "have" "sooo" "much" ...
$ : chr [1:10] "so" "whos" "gonna" "pull" ...
$ : chr [1:5] "my" "driveway" "tonight" "nwiweather" ...
$ : chr "fuck"
$ : chr [1:21] "now" "that" "its" "actually" ...
$ : chr [1:7] "green" "goats" "<U+FFFD><U+FFFD>""| __truncated__ "" ...
$ : chr [1:15] "i" "guess" "its" "time" ...
$ : chr [1:3] "lets" "go" "seattle"
$ : chr [1:20] "jozybrambila7" "do" "you" "ever" ...
$ : chr [1:4] "reggiewo" "nice" "choice" "cheers"
$ : chr [1:20] "i" "enjoy" "super" "bowl" ...
[list output truncated]
> str(emotion)
chr [1:2621] "unknown" "unknown" "unknown" "unknown" "unknown" "unknown" "unknown" "joy" ...
> str(polarity)
chr [1:2621] "positive" "positive" "positive" "positive" "positive" "positive" "positive" ...
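The str() output above hints at why the three lengths can match while the row counts do not: length(text) counts list elements (tweets), whereas data.frame() compares the length of each element. Below is a minimal sketch on a made-up three-tweet list; toy_text and toy_emotion are hypothetical stand-ins, not the real data.

```r
# Toy stand-ins for the real `text` and `emotion` objects (hypothetical data)
toy_text <- list(c("good", "defense"),
                 c("i", "have", "the", "best", "team"),
                 "tb12")
toy_emotion <- c("unknown", "joy", "unknown")

# Both objects have the same top-level length ...
length(toy_text)      # number of tweets
length(toy_emotion)

# ... but every list element has its own length, and these are the
# numbers that data.frame() ends up comparing
element_lengths <- lengths(toy_text)
element_lengths

# data.frame() treats each list element as a separate column, so the
# call fails; tryCatch() captures the error message instead of stopping
result <- tryCatch(
  data.frame(toy_text, emotion = toy_emotion, stringsAsFactors = FALSE),
  error = function(e) conditionMessage(e)
)
result
```

Running lengths(text) on the real data should list the same numbers that appear in the error message.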
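When I posted this error online, programmers told me that the number of rows and columns is not the same, i.e. it is not a square matrix, and that a data frame will not work for a rectangular matrix.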
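A note on that advice: as far as I can tell, a data frame does not have to be square. What it needs is columns of equal length (or lengths that recycle evenly). A small sketch with made-up values, where ok and bad are illustrative names only:

```r
# A 3-row, 2-column data frame: rectangular, not square, and still valid
ok <- data.frame(emotion  = c("unknown", "joy", "unknown"),
                 polarity = c("positive", "positive", "negative"),
                 stringsAsFactors = FALSE)
dim(ok)

# What data.frame() rejects is columns whose lengths cannot be reconciled;
# tryCatch() captures the error message instead of stopping
bad <- tryCatch(
  data.frame(a = 1:3, b = 1:2),
  error = function(e) conditionMessage(e)
)
bad
```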
I would be grateful if someone could help me resolve this error.
Thanks in advance!
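You could check 'str(emotion)' and 'str(polarity)' – akrun
Sometimes objects look fine but have a problematic internal structure. Look at 'str(text)' –
I'm guessing you have 'text' stored as a list, so it tries to make each element of the list a column. You could try 'data.frame(unlist(text), emotion = emotion, polarity = polarity, stringsAsFactors = FALSE)', depending on your exact data layout – jeremycg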
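One caveat with unlist(text): it flattens all tokens into one long vector, so its length will generally no longer match emotion and polarity. Two alternatives that keep one row per tweet are sketched below on made-up data (the toy_* names are hypothetical): collapse each tweet's tokens into a single string, or store the list itself as a list column with I().

```r
# Toy stand-ins for the real objects (hypothetical data)
toy_text <- list(c("good", "defense"),
                 c("i", "have", "the", "best", "team"),
                 "tb12")
toy_emotion  <- c("unknown", "joy", "unknown")
toy_polarity <- c("positive", "positive", "positive")

# Option 1: paste each tweet's tokens back into one string per tweet
collapsed <- vapply(toy_text, paste, character(1), collapse = " ")
df1 <- data.frame(text = collapsed, emotion = toy_emotion,
                  polarity = toy_polarity, stringsAsFactors = FALSE)

# Option 2: keep the token vectors as a list column; I() stops
# data.frame() from splitting the list into separate columns
df2 <- data.frame(text = I(toy_text), emotion = toy_emotion,
                  polarity = toy_polarity, stringsAsFactors = FALSE)

nrow(df1)
nrow(df2)
df2$text[[2]]   # tokens of the second tweet
```

Option 1 is probably the easier one to work with if later steps expect plain strings; option 2 keeps the tokenization intact.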