Fisher-Yates shuffle in Python
I have the following data set (this is an example):
ID Sub1 Sub2 Sub3 Sub4
Creb3l1 10.14 9.67 10.14 10.42
Chchd6 11.25 10.74 10.80 11.07
Arih1 9.91 9.25 10.20 9.34
Prpf8 11.54 11.58 11.14 11.36
Rfng 11.71 11.56 10.81 10.72
Rnf114 12.66 12.60 12.59 12.56
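I want to run a Fisher-Yates shuffle on this data set 10 times (i.e. write 10 output files, each containing one randomisation of the data produced with a Fisher-Yates shuffle).
I wrote this code: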
import sys
import itertools
from itertools import permutations
for line in open(sys.argv[1]).readlines()[2:]:
    line = line.strip().split()
    ID = line[0]
    expression_values = line[1:]
    for shuffle in permutations(expression_values):
        print(shuffle)
The output of this code looks like this (sample):
('11.25', '10.74', '10.80', '11.07')
('11.25', '10.74', '11.07', '10.80')
('11.25', '10.80', '10.74', '11.07')
('11.25', '10.80', '11.07', '10.74')
('11.25', '11.07', '10.74', '10.80')
('11.25', '11.07', '10.80', '10.74')
('10.74', '11.25', '10.80', '11.07')
('10.74', '11.25', '11.07', '10.80')
('10.74', '10.80', '11.25', '11.07')
('10.74', '10.80', '11.07', '11.25')
('10.74', '11.07', '11.25', '10.80')
('10.74', '11.07', '10.80', '11.25')
('10.80', '11.25', '10.74', '11.07')
('10.80', '11.25', '11.07', '10.74')
('10.80', '10.74', '11.25', '11.07')
('10.80', '10.74', '11.07', '11.25')
('10.80', '11.07', '11.25', '10.74')
('10.80', '11.07', '10.74', '11.25')
('11.07', '11.25', '10.74', '10.80')
('11.07', '11.25', '10.80', '10.74')
('11.07', '10.74', '11.25', '10.80')
('11.07', '10.74', '10.80', '11.25')
('11.07', '10.80', '11.25', '10.74')
('11.07', '10.80', '10.74', '11.25')
('9.91', '9.25', '10.20', '9.34')
('9.91', '9.25', '9.34', '10.20')
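I am having trouble producing a specific chunk of the randomised data (e.g. a set of 7 Fisher-Yates randomised lines that I can write to a file). If someone could show me how to edit the code above to generate 10 output files, each containing 7 lines of text (i.e. the same number of lines as the input file) and each holding one Fisher-Yates shuffled set of values, I would really appreciate it.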
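For reference, `itertools.permutations` enumerates every ordering of a row (24 for four values), whereas a Fisher-Yates shuffle draws a single ordering at random. A minimal sketch of the shuffle itself follows; the function name `fisher_yates_shuffle` and its `rng` parameter are made up for this illustration and are not part of the code above.

```python
import random


def fisher_yates_shuffle(values, rng=random):
    """Return a new list holding `values` in a uniformly random order."""
    shuffled = list(values)  # copy, so the caller's list is left untouched
    # Walk from the last index down to 1, swapping each position with a
    # randomly chosen position at or before it.
    for i in range(len(shuffled) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        shuffled[i], shuffled[j] = shuffled[j], shuffled[i]
    return shuffled


row = ['11.25', '10.74', '10.80', '11.07']
print(fisher_yates_shuffle(row))  # one random ordering, not all 24
```

The standard library's `random.shuffle` does the same job in place, so in practice it can stand in for the hand-written loop.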
Edit 1: I have tried a few different approaches, for example the code below:
for line in open(sys.argv[1]).readlines()[2:]:
    line = line.strip().split()
    gene_name = line[0]
    expression_values = line[1:]
    RandomList = []
    for shuffle in permutations(expression_values):
        while len(RandomList) < 10:
            RandomList.append(shuffle)
    print(RandomList)
I thought this would give me 10 randomisations per line. Instead it gives me the same ordering 10 times for each line:
[('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07'), ('11.25', '10.74', '10.80', '11.07')]
[('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34'), ('9.91', '9.25', '10.20', '9.34')]
[('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36'), ('11.54', '11.58', '11.14', '11.36')]
[('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72'), ('11.71', '11.56', '10.81', '10.72')]
[('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56'), ('12.66', '12.60', '12.59', '12.56')]
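The repeated tuples can be reproduced in isolation. In the sketch below (which uses one row from the sample data, with illustrative variable names), the inner `while` loop fills the list completely during the first pass of the `for` loop, so every entry is the first permutation. The second half shows one possible alternative, assuming the aim is 10 independent random orderings per row: `random.sample` with `k` equal to the row length returns a shuffled copy.

```python
import random
from itertools import permutations

values = ['11.25', '10.74', '10.80', '11.07']

random_list = []
for shuffle in permutations(values):
    # On the first pass this loop runs until the list holds 10 items,
    # so all 10 are copies of the first permutation.
    while len(random_list) < 10:
        random_list.append(shuffle)
    # On the remaining 23 passes the condition is already false.

print(len(set(random_list)))  # number of distinct orderings collected

# Alternative: draw 10 independent random orderings of the same row.
draws = [random.sample(values, len(values)) for _ in range(10)]
for draw in draws:
    print(draw)
```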
Edit 2: Sean: thank you very much for your help. I do know how to write to a file in general; for example, I can say:
for i in range(10):
    output_file = "random." + str(i)
    open_output_file = open(output_file, 'a')
    ***for each line of the randomised array***:
        open_output_file.write(line + "\n")
    open_output_file.close()
My problem with writing to a file is that I can't even get the output I want printed to the screen first. For example, if I run this code:
import sys
import itertools
from itertools import permutations
for i in range(10):
    for line in open(sys.argv[1]).readlines()[2:]:
        line = line.strip().split()
        gene_name = line[0]
        expression_values = line[1:]
        for shuffle in permutations(expression_values):
            print(shuffle[:6])
    print("***")
    i += 1
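I would expect the output to be 7 randomised lines, followed by "***", then another 7 randomised lines, and so on 10 times. Instead, it prints every permutation of every line.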
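One way the whole task could be put together is sketched below. It rests on several assumptions that the question does not confirm: the first line of the input is a header that should be copied unchanged, each remaining row is shuffled independently, and the output files are named `random.0` to `random.9` as in Edit 2. The helper names `read_rows` and `write_shuffled_files` are hypothetical.

```python
import random


def read_rows(path):
    """Return (header, rows); each row is a (gene_id, values) pair."""
    with open(path) as handle:
        lines = [line.split() for line in handle if line.strip()]
    header, body = lines[0], lines[1:]  # assumes line 1 is the header
    return header, [(fields[0], fields[1:]) for fields in body]


def write_shuffled_files(path, n_files=10, prefix="random.", rng=random):
    """Write n_files copies of the table, shuffling each row's values."""
    header, rows = read_rows(path)
    written = []
    for i in range(n_files):
        out_path = prefix + str(i)
        with open(out_path, "w") as out:  # "w" so reruns do not append
            out.write(" ".join(header) + "\n")
            for gene_id, values in rows:
                shuffled = list(values)
                rng.shuffle(shuffled)  # in-place Fisher-Yates shuffle
                out.write(" ".join([gene_id] + shuffled) + "\n")
        written.append(out_path)
    return written
```

Calling `write_shuffled_files(sys.argv[1])` from a script would then produce the 10 files, each with as many lines as the input. The same loop could print each block to the screen, followed by a `***` separator, in place of the file writes.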
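Which part are you stuck on? Getting groups of seven? Writing them to a file? There are already answers for all of these. – jonrsharpe
Thanks, I have edited the question. Yes, the output I get is 120 lines printed to the screen / written to the file. I am confused about how to get groups of 7, e.g. print a block of 7 lines each time and write it to a file (and then do that 10 times). – user1288515
What have you tried? Building a list, perhaps? Acting once it reaches the right length? If you have made an attempt, show it. If you haven't, make one! Or just [do some research](http://stackoverflow.com/questions/3992735/python-generator-that-groups-another-iterable-into-groups-of-n). – jonrsharpe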