2017-09-27 98 views
1

Add a feature above the x-axis or at the top of the figure

I have a plot that looks like this:

enter image description here

Now I would like to add some additional information slightly above the x-axis, based on two x-coordinates.

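For example, connect the values 1376 and 1837 and annotate the span, so that it looks like this (I know this looks bad, but it is just to give you the idea; the text placement is of course not ideal):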

enter image description here

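There will be several regions, and they can overlap. I tried to do this with plt.arrow(1376, 0, 1837, 0), but the arrow did not stop at 1837; it ran on to the end of the x-axis. I also tried the basic text annotation tools, but never got what I wanted. Another solution would be to add the information at the top of the plot, below the title. So any ideas for either the top or the bottom would be helpful.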

+0

To clarify: do you want to indicate genomic regions, i.e. a range of x values, or do you want to mark specific x values? – Paul

+0

I want to indicate genomic regions. – JFS31

+0

Also, your arrow doesn't work because the call signature is 'plt.arrow(x, y, dx, dy)'. You need something like 'plt.arrow(1376, 0, 1837-1376, 0)'. I would just draw a line, though. – Paul
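
A minimal sketch of the fix described in the comment above, assuming the span should sit at y = 0. The helper names `span_with_arrow` and `span_with_annotate` are made up for this illustration. The first passes an offset (dx) to `ax.arrow`, as its signature expects; the second uses `ax.annotate`, which takes two end points and sizes the arrow heads in points rather than data units.

```python
import matplotlib.pyplot as plt


def span_with_arrow(ax, x_start, x_stop, y=0.0, **arrow_kwargs):
    """Draw an arrow from x_start to x_stop at height y with ax.arrow."""
    # ax.arrow(x, y, dx, dy) expects an offset, not the end point.
    # length_includes_head=True makes the tip land exactly on x_stop.
    arrow_kwargs.setdefault("length_includes_head", True)
    return ax.arrow(x_start, y, x_stop - x_start, 0, **arrow_kwargs)


def span_with_annotate(ax, x_start, x_stop, y=0.0, arrowstyle="<->"):
    """Draw a double-headed arrow between two points with ax.annotate."""
    # The empty string means no text is drawn; xytext is the tail of the
    # arrow and xy is the tip, both given in data coordinates by default.
    return ax.annotate(
        "",
        xy=(x_stop, y),
        xytext=(x_start, y),
        arrowprops=dict(arrowstyle=arrowstyle),
    )
```

With the x-axis in the thousands and the y-axis on a much smaller scale, the head of an `ax.arrow` is hard to size sensibly in data units, which is probably why a plain line or the `annotate` variant is easier to work with here.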

Answers

2

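One possible solution, although it is a manual process and not ideal (it may get a bit tedious if you have many of these), is simply to draw an additional line on the plot. You specify the x-coordinates between which the line should be drawn, and the y-coordinate sets its vertical position on the plot.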

import matplotlib.pyplot as plt 
import numpy as np 

# create some data 
x = np.arange(0,10,0.1) 
y = np.sin(x) 

fig, ax = plt.subplots() 
ax.plot(x,y) 

ax.plot([2,4],[-1,-1], color="red", lw=1) # add the line 
ax.annotate('Test 1', xy=(2.5, -0.95)) # add text above the line 

# increase the thickness of the line using lw = 
ax.plot([6,8],[-1,-1], color="red", lw=3) 
ax.annotate('Test 2', xy=(6.5, -0.95)) 

plt.show() 

This results in a figure like:

enter image description here
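
In the answer above the bar's height is given in data units, so it has to be re-tuned whenever the y-range changes. A possible variation, sketched here under the assumption that the bar should sit at a fixed fraction of the axes height, uses the blended transform returned by `ax.get_xaxis_transform()` (x in data coordinates, y in axes coordinates). The helper name `mark_region` and its `y_frac` parameter are made up for this illustration.

```python
import matplotlib.pyplot as plt
import numpy as np


def mark_region(ax, start, stop, label, y_frac=0.05, **line_kwargs):
    """Draw a bar from start to stop (data x) at y_frac (axes fraction)."""
    # Blended transform: x is interpreted in data coordinates,
    # y as a fraction of the axes height (0 = bottom, 1 = top).
    trans = ax.get_xaxis_transform()
    line_kwargs.setdefault("color", "red")
    line_kwargs.setdefault("lw", 3)
    line, = ax.plot([start, stop], [y_frac, y_frac],
                    transform=trans, **line_kwargs)
    # Centre the label over the bar; va="bottom" puts the text above it.
    text = ax.text((start + stop) / 2, y_frac, label, transform=trans,
                   ha="center", va="bottom")
    return line, text


# Example data, mirroring the answer above.
x = np.arange(0, 10, 0.1)
y = np.sin(x)
```

After `fig, ax = plt.subplots()` and `ax.plot(x, y)`, a call such as `mark_region(ax, 2, 4, 'Test 1')` draws the bar just above the x-axis, and a `y_frac` close to 0.95 would place it near the top of the axes, below the title.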

+0

That looks good to me. I will work with this. Thanks :) – JFS31

1

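Depending on how many of these plots you need to make, you may want to automate the process for a list of regions/intervals. The question then, of course, is how to handle overlapping intervals. The code below is an attempt to automate the process while resolving interval overlaps.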

enter image description here

#!/usr/bin/env python 
# -*- coding: utf-8 -*- 

import numpy as np 
import matplotlib.pyplot as plt 
from itertools import chain, combinations 

def annotate_intervals(intervals, labels, y0=0, dy=-1, ax=None):
    """
    Annotate each interval with a bar and a centred label just above it.

    Arguments:
    ----------
    intervals - (N, 2) array
        list of intervals
    labels - (N,) iterable of strings
        list of corresponding labels
    y0 - int/float (default 0)
        baseline y value of annotations
    dy - int/float (default -1)
        shift in y to avoid overlaps of annotations
    ax - matplotlib axis object (default plt.gca())
        axis to annotate
    """

    if ax is None:
        ax = plt.gca()

    # _get_levels indexes with lists of indices, so make sure we have an array
    intervals = np.asarray(intervals)

    # assign y values to each interval; resolve overlaps
    y = y0 + _get_levels(intervals) * dy

    for (start, stop), yy, label in zip(intervals, y, labels):
        ax.plot([start, stop], [yy, yy], lw=3)
        ax.text(start + (stop-start)/2., yy, label,
                horizontalalignment='center', verticalalignment='bottom')

def _get_levels(intervals):
    """
    Assign a 'level' to each interval such that no two overlapping intervals are on the same level.
    Fill lower levels as much as possible before creating a new level.
    """

    # initialise output
    n = len(intervals)
    levels = np.zeros((n))

    # resolve overlaps
    overlaps = _get_overlaps(intervals)
    if np.any(overlaps):
        contains_overlaps, = np.where(np.any(overlaps, axis=0))
        remaining = list(contains_overlaps)
        ctr = 0
        while len(remaining) > 0:
            indices = _get_longest_non_overlapping_set(intervals[remaining])
            longest = [remaining.pop(ii) for ii in indices[::-1]]
            levels[longest] = ctr
            ctr += 1

    return levels

def _get_overlaps(intervals):
    """
    Arguments:
    ----------
    intervals - (N, 2) array
        list of intervals

    Returns:
    --------
    overlap - (N, N) array
        type of overlap (if any)

    overlap[ii,jj] = 0 - no overlap
    overlap[ii,jj] = 1 - start of interval[jj] within interval[ii]
    overlap[ii,jj] = 2 - stop of interval[jj] within interval[ii]
    overlap[ii,jj] = 3 - interval[jj] encapsulated by interval[ii]
    overlap[ii,jj] = 4 - interval[jj] encapsulates interval[ii]

    """

    n = len(intervals)
    # the np.int alias was removed in NumPy 1.24; the builtin int is equivalent
    overlap = np.zeros((n, n), dtype=int)
    for ii, (start, stop) in enumerate(intervals):
        for jj, (s, t) in enumerate(intervals):
            if ii != jj:
                overlap[ii,jj] += int((s >= start) and (s < stop))
                overlap[ii,jj] += 2 * int((t >= start) and (t < stop))

    # if interval[jj] encapsulates interval[ii], overlaps[ii,jj] is still 0
    mask = overlap == 3
    overlap[mask.T] += 4

    return overlap

def _get_longest_non_overlapping_set(intervals):
    """
    Brute-force approach:
    1) Get all possible sets of intervals.
    2) Filter for non-overlapping sets.
    3) Determine total length of intervals for each.
    4) Select set with highest total.
    """
    indices = np.arange(len(intervals))
    lengths = np.diff(intervals, axis=1)
    powerset = list(_get_powerset(indices))
    powerset = powerset[1:] # exclude empty set

    total_lengths = np.zeros((len(powerset)))
    for ii, selection in enumerate(powerset):
        selection = np.array(selection)
        if not np.any(_get_overlaps(intervals[selection])):
            total_lengths[ii] = np.sum(lengths[selection])

    return powerset[np.argmax(total_lengths)]

def _get_powerset(iterable): 
    "powerset([1,2,3]) -->() (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)" 
    s = list(iterable) # allows duplicate elements 
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1)) 

def test():
    import string
    n = 6
    intervals = np.sort(np.random.rand(n, 2), axis=1)
    labels = list(string.ascii_lowercase[:n])
    annotate_intervals(intervals, labels)
    plt.show()

if __name__ == '__main__':
    test()

+0

Thank you. Fortunately, I only need three plots for now, so doing it manually is not hard. But your code may well be very useful in the future :) – JFS31
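
The brute-force search in `_get_longest_non_overlapping_set` above enumerates every subset of the overlapping intervals, so its cost grows exponentially with their number. For longer lists, a greedy level assignment may be enough. The sketch below is not the method used in the answer: it does not maximise the total length on the lower levels, it only guarantees that overlapping intervals end up on different levels. The function name `get_levels_greedy` is made up for this illustration, and intervals that merely touch (one stops exactly where the next starts) are assumed not to overlap.

```python
import numpy as np


def get_levels_greedy(intervals):
    """Assign each interval the lowest level on which it overlaps nothing."""
    intervals = np.asarray(intervals, dtype=float).reshape(-1, 2)
    # Process intervals from left to right by their start value.
    order = np.argsort(intervals[:, 0], kind="stable")
    levels = np.zeros(len(intervals), dtype=int)
    level_ends = []  # right-most stop value currently placed on each level

    for idx in order:
        start, stop = intervals[idx]
        for level, end in enumerate(level_ends):
            if start >= end:
                # Fits after the last interval on this level.
                level_ends[level] = stop
                levels[idx] = level
                break
        else:
            # Overlaps something on every existing level: open a new one.
            level_ends.append(stop)
            levels[idx] = len(level_ends) - 1

    return levels
```

It returns one integer level per interval, in the input order, so it could stand in for the `_get_levels(intervals)` call inside `annotate_intervals`.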