2013-05-11 78 views
1

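Selecting every second row in Mnesia?

I have a large table in Mnesia and, for various reasons (not important here; say I am executing the select remotely and the result has to be sent over the network using a third-party library), I can't select all rows in one select. I have split the select so that each call retrieves only a specific set of columns.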

For example, this select retrieves only specific columns:

mnesia:dirty_select(table, [{{table,'$1','_','$3','$4','_','_','_'},[],['$$']}]).

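I run the select twice with different sets of columns and then merge the results. But it now turns out that one of the columns is also too big to retrieve in a single select. So I would like to split that one large select into two, each fetching only half of the rows of that column. Is there a simple way to retrieve every second row? Say, select only the odd rows and then only the even rows? Or perhaps a way to retrieve the first half and then the second half?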

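I tried selecting all rows from just one of the columns and then using the result as an index for retrieving specific rows. This works, but it takes quite a while to construct the selects and then execute them.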

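EDIT: Sorry, I didn't stress enough that the selects are executed remotely. I know about iterating over records or accessing the file directly, but the challenge here is that the problematic records have to be retrieved with a relatively small number of commands, and those commands must be executable remotely.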

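For example:

  1. Select just the first column (a simple, single mnesia:dirty_select command).
  2. Once the result is retrieved (over the network), split it into two sets and use them as keys to construct selects that fetch specific records (each select will contain a very long list of keys to retrieve, but that is fine because they are simple Erlang terms that can be sent over the network).
  3. Use the two sets of keys created in step 2 to retrieve all rows in two steps.

This works, but it is neither easy nor optimal because it sends a lot of data in both directions. There is probably no simple solution unless the selects are constructed to take into account the specific data contained in each column and row (e.g. match all rows where the name in the first column starts with the letters 'A' to 'M'). I am just not sure what is possible with standard Mnesia commands and am hoping there is a simpler solution.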
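The range-based split mentioned above can be written with match-specification guards, which may compare a bound variable against a constant using Erlang term order. The following is only a minimal sketch: it assumes the first field is the key and that a pivot value is already known (for instance, the middle element of the sorted key list). The module name `split_select`, the function `halves/2` and the `Pivot` argument are hypothetical.

```erlang
-module(split_select).
-export([halves/2]).

%% Returns {Lower, Upper}: the selected columns of the rows whose key is
%% below Pivot, and of the rows whose key is Pivot or above.
%% Assumes a seven-field record whose name equals the table name,
%% as in the select shown in the question.
halves(Tab, Pivot) ->
    Head = {Tab, '$1', '_', '$3', '$4', '_', '_', '_'},
    %% {const, Pivot} stops a tuple-valued pivot from being read as a guard expression.
    LowerSpec = [{Head, [{'<', '$1', {const, Pivot}}], ['$$']}],
    UpperSpec = [{Head, [{'>=', '$1', {const, Pivot}}], ['$$']}],
    Lower = mnesia:dirty_select(Tab, LowerSpec),
    Upper = mnesia:dirty_select(Tab, UpperSpec),
    {Lower, Upper}.
```

Each match specification is a plain Erlang term, so each half could in principle be fetched with a single remote call such as `rpc:call(Node, mnesia, dirty_select, [table, LowerSpec])`, where `Node` is a placeholder for the remote node name. The two halves are only roughly equal in size if the pivot sits near the middle of the key distribution.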

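I know this isn't directly related to your question, but what are the symptoms of retrieving too much data in one select? One thing to keep in mind when splitting the query is the potential increase in race conditions. – 2013-05-11 12:14:57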

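Reading your question, it seems the limitation you are facing does not come from Mnesia or Erlang. So why don't you get your list of objects in a single select and then split the result into as many parts as your third-party library requires? – Pascal 2013-05-11 17:38:29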

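That was only to explain why I need to use multiple selects to retrieve the data. I am aware of all the issues this involves. As I said, I am executing them remotely, and the problem is that the third-party library can only send a limited amount of data per executed command. That is beside the point of the question, but I appreciate your input. – Amiramix 2013-05-11 20:11:17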

Answers

0

If you put an auto-incrementing integer key on your table, then you can do roughly this with QLC.

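%% Table must be a QLC handle such as mnesia:table(table); qlc:q/1 needs -include_lib("stdlib/include/qlc.hrl").
Evens = qlc:q([Rec || Rec <- Table, (Rec#table.int_key rem 2) =:= 0]).
Odds = qlc:q([Rec || Rec <- Table, (Rec#table.int_key rem 2) =:= 1]).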

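The alternative, if you don't have or don't want such a key, is to use qlc:fold/3 on your table to filter out every second record. If needed for memory reasons, Mnesia should iterate over the data using temporary files.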

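%% Keeps the records at positions 0, 2, 4, ...; swap the two clauses to keep the others.
CollectOddsFun = fun(Rec, {N, List}) ->
    NewList = case N rem 2 of
        0 -> [Rec | List];
        1 -> List
    end,
    {N + 1, NewList}
end,

%% Returns {Count, Records}, with Records in reverse iteration order.
qlc:fold(CollectOddsFun, {0, []}, Table).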
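The fold approach could be packaged as a small module, so that one call returns one half of the table. This is a hedged sketch rather than a drop-in solution: the module name `every_other` and the function `fetch/2` are made up for illustration, and it assumes the table is not modified between the two calls, because the iteration order of a `set` table is only stable while its contents stay the same.

```erlang
-module(every_other).
-export([fetch/2]).

%% Returns the records at positions Offset, Offset + 2, Offset + 4, ...
%% (Offset is 0 or 1) in the table's iteration order.
fetch(Tab, Offset) when Offset =:= 0; Offset =:= 1 ->
    Collect = fun(Rec, {N, Acc}) ->
                      case N rem 2 of
                          Offset -> {N + 1, [Rec | Acc]};
                          _ -> {N + 1, Acc}
                      end
              end,
    %% A QLC handle from mnesia:table/1 is evaluated inside a Mnesia transaction here.
    {atomic, {_Count, Recs}} =
        mnesia:transaction(fun() ->
            qlc:fold(Collect, {0, []}, mnesia:table(Tab))
        end),
    lists:reverse(Recs).
```

The fold runs on the node where `fetch/2` is called, so invoking it through `rpc:call/4` requires this module to be loaded on the remote node. In that case only the filtered half of the records crosses the network.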

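Thanks @d11wtq, but the first approach is no good. I don't own the table, so I can't add a new column. Also, as I mentioned, I am executing the selects remotely. Is the second approach feasible? Would the fun be sent over the network and executed on the remote node? – Amiramix 2013-05-14 14:26:19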

0

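I created a gen_server-based behaviour which I called gen_select. With it you write a callback module using the module attribute -behaviour(gen_select). In your init/1 callback you open an ets or dets file and define a match specification and a limit. The process then chunks through the table, calling your handle_record/2 callback for each record, until the end of the file. I have found this a handy paradigm for some "big data" work I've been doing. You could use it on the underlying ets table of your mnesia table (if that is appropriate), or modify it to use mnesia:select/4.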

%%% gen_select.erl 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 
%%% @doc This module implements a behaviour pattern where a potentially 
%%%  large number of records are read from an {@link //stdlib/ets. ets} 
%%%  or {@link //stdlib/dets. dets} table. This is used in an application 
%%%  to have supervised workers mapping over all the records in a table. 
%%%  The user will call `gen_select:start_link/3', from a supervisor, to 
%%%  create a process which will iterate over the selected records of a 
%%%  table. The `init/1' callback should open the table to 
%%%  be read and construct a match specification to be used to select 
%%%  records from the table. It should return a tuple of the form: 
%%%  ``` 
%%%  {ok, TableType, Table, MatchSpec, Limit, State} | {stop, Reason} | ignore 
%%%   TableType :: ets | dets 
%%%   Table :: ets:tid() | atom() % when Type=ets 
%%%   Table :: dets:tab_name()  % when Type=dets 
%%%   MatchSpec :: match_spec() % see ets:select/2 
%%%   Limit :: integer()   % see ets:select/3 
%%%   State :: term() 
%%%   Reason :: term() 
%%%  ''' 
%%% After initialization {@link //stdlib/ets:select/3. ets:select/3} 
%%% or {@link //stdlib/dets:select/3. dets:select/3} will be called 
%%% using the `match_spec()' and `Limit' returned by `init/1'. The
%%% callback function `handle_record/2' will then be called for each 
%%% record returned then `select/1' will be called to get more records. 
%%% This is repeated until the end of the table is reached when the 
%%% callback `terminate/2' is called with `Reason=eof'.
%%% 
-module(gen_select). 
-author('[email protected]'). 

%% export the gen_select API 
-export([start_link/3]). 

%% export the callbacks needed for a system process 
-export([system_continue/3, system_terminate/4, system_code_change/4]). 
-export([format_status/2]). 

%% exports used internally 
-export([init_it/6]). 

%% define the callback exports of a module behaving as gen_select 
-type state() :: term(). 
-callback init(Args :: term()) -> 
    {ok, TableType :: ets | dets, Table :: ets:tid() | atom() | dets:tab_name(), 
     MatchSpec :: ets:match_spec(), Limit :: non_neg_integer(), State :: state()} 
     | {stop, Reason :: term()} | ignore. 
-callback handle_record(Record :: tuple(), State :: state()) -> 
    {next_record, NewState :: state()} 
     | {stop, Reason :: term(), NewState :: state()}. 
-callback terminate(Reason :: eof | term(), State :: state()) -> 
    any(). 

-import(error_logger, [format/2]). 

%%---------------------------------------------------------------------- 
%% gen_select API 
%%---------------------------------------------------------------------- 

-spec start_link(Mod :: atom(), Args :: term(), 
     Options :: gen:options()) -> gen:start_ret(). 
%% @doc Creates a {@module} process as part of a supervision tree. 
%% 
start_link(Mod, Args, Options) -> 
    gen:start(?MODULE, link, Mod, Args, Options). 

%%---------------------------------------------------------------------- 
%% internal exports 
%%---------------------------------------------------------------------- 

-spec init_it(Starter :: pid(), LinkP :: gen:linkage(), Pid :: pid(), 
     CallBackMod :: atom(), Args :: term(), Options :: gen:options()) -> 
    no_return(). 
%% @doc Called by {@link //stdlib/gen:start/5. gen:start/5} to initialize 
%% the process. 
%% Copied from //stdlib/gen_server:init_it/6. 
%% @hidden 
init_it(Starter, Parent, Pid, CallBackMod, Args, Options) -> 
    Debug = debug_options(Pid, Options), 
    case catch CallBackMod:init(Args) of 
     {ok, TableMod, Table, MatchSpec, Limit, State} -> 
      %% Use the table module returned by init/1 (ets or dets) and
      %% acknowledge the starter exactly once on every path.
      case catch TableMod:select(Table, MatchSpec, Limit) of
       {Matches, Cont} when is_list(Matches) ->
        proc_lib:init_ack(Starter, {ok, self()}),
        loop1(Parent, CallBackMod, Debug, State,
          TableMod, Cont, Matches);
       '$end_of_table' ->
        proc_lib:init_ack(Starter, {error, eof}),
        exit(eof);
       {error, Reason} ->
        proc_lib:init_ack(Starter, {error, Reason}),
        exit(Reason);
       {'EXIT', Reason} ->
        proc_lib:init_ack(Starter, {error, Reason}),
        exit(Reason)
      end;
     {stop, Reason} -> 
      proc_lib:init_ack(Starter, {error, Reason}), 
      exit(Reason); 
     ignore -> 
      proc_lib:init_ack(Starter, ignore), 
      exit(normal); 
     {'EXIT', Reason} -> 
      proc_lib:init_ack(Starter, {error, Reason}), 
      exit(Reason); 
     Else -> 
      Error = {bad_return_value, Else}, 
      proc_lib:init_ack(Starter, {error, Error}), 
      exit(Error) 
    end. 

%%---------------------------------------------------------------------- 
%% system process callbacks 
%%---------------------------------------------------------------------- 

-type misc() :: [CallBackMod :: atom() | [State :: state() 
     | [TableMod :: atom() | [Cont :: term() 
     | [Matches :: [tuple()] | []]]]]]. 

-spec system_continue(Parent :: pid(), Debug :: [gen:dbg_opt()], 
     Misc :: misc()) -> no_return(). 
%% @doc Called by {@link //sys:handle_system_msg/6} to continue. 
%% @private 
system_continue(Parent, Debug, [CallBackMod, State, 
     TableMod, Cont, Matches]) -> 
    loop1(Parent, CallBackMod, Debug, State, TableMod, Cont, Matches). 

-spec system_terminate(Reason :: term(), Parent :: pid(), 
     Debug :: [gen:dbg_opt()], Misc :: misc()) -> no_return(). 
%% @doc Called by {@link //sys:handle_system_msg/6} to terminate. 
%% @private 
system_terminate(Reason, _Parent, Debug, [CallBackMod, State, 
     _TableMod, _Cont, _Matches]) -> 
    terminate(Reason, CallBackMod, Debug, State). 

-spec system_code_change(Misc :: misc(), Module :: atom(), 
     OldVsn :: undefined | term(), Extra :: term()) -> 
    {ok, NewMisc :: misc()}. 
%% @doc Called by {@link //sys:handle_system_msg/6} to update `Misc'. 
%% @private 
system_code_change([CallBackMod, State, TableMod, Cont, Matches], 
     _Module, OldVsn, Extra) -> 
    case catch CallBackMod:code_change(OldVsn, State, Extra) of 
     {ok, NewState} -> 
      {ok, [CallBackMod, NewState, TableMod, Cont, Matches]}; 
     Other -> 
      Other 
    end. 

-type pdict() :: [{Key :: term(), Value :: term()}]. 
-type status_data() :: [PDict :: pdict() | [SysState :: term() 
     | [Parent :: pid() | [Debug :: [gen:dbg_opt()] | [Misc :: misc() | []]]]]]. 
-spec format_status(Opt :: normal | terminate, StatusData :: status_data()) -> 
    [tuple()]. 
%% @doc Called by {@link //sys:get_status/1} to print state. 
%% @private 
format_status(Opt, [PDict, SysState, Parent, Debug, 
     [CallBackMod, State, _TableMod, _Cont, _Matches]]) -> 
    Header = gen:format_status_header("Status for table reader", self()), 
    Log = sys:get_debug(log, Debug, []), 
    DefaultStatus = [{data, [{"State", State}]}], 
    Specfic = case erlang:function_exported(CallBackMod, format_status, 2) of 
     true -> 
      case catch CallBackMod:format_status(Opt, [PDict, State]) of 
       {'EXIT', _} -> 
        DefaultStatus; 
       StatusList when is_list(StatusList) -> 
        StatusList; 
       Else -> 
        [Else] 
      end; 
     _ -> 
      DefaultStatus 
    end, 
    [{header, Header}, 
      {data, [{"Status", SysState}, 
        {"Parent", Parent}, 
        {"Logged events", Log}]} 
      | Specfic]. 

%%---------------------------------------------------------------------- 
%% internal functions 
%%---------------------------------------------------------------------- 

-spec loop1(Parent :: pid(), CallBackMod :: atom(), Debug :: [gen:dbg_opt()], 
     State :: state(), TableMod :: atom(), 
     Cont :: term(), Matches :: [tuple()]) -> no_return(). 
%% @doc Main loop. 
%% Copied from //stdlib/gen_server:loop1/6. 
%% @hidden 
loop1(Parent, CallBackMod, Debug, State, TableMod, Cont, Matches) -> 
    receive 
     {system, From, Req} -> 
      sys:handle_system_msg(Req, From, Parent, ?MODULE, Debug, 
        [CallBackMod, State, TableMod, Cont, Matches]); 
     {'EXIT', Parent, Reason} -> 
      terminate(Reason, CallBackMod, Debug, State); 
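     Msg ->
      %% Log the unexpected message and keep looping instead of returning.
      Debug1 = sys:handle_debug(Debug, fun print_event/3, self(), {in, Msg}),
      loop1(Parent, CallBackMod, Debug1, State, TableMod, Cont, Matches)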
    after 0 -> 
     loop2(Parent, CallBackMod, Debug, State, TableMod, Cont, Matches) 
    end. 

-spec loop2(Parent :: pid(), CallBackMod :: atom(), Debug :: [gen:dbg_opt()], 
     State :: state(), TableMod :: atom(), Cont :: term(), 
     Matches :: [tuple()]) -> no_return(). 
%% @doc Run the `select/1' function. 
%% @hidden 
loop2(Parent, CallBackMod, Debug, State, TableMod, Cont, [H | T]) -> 
    case catch CallBackMod:handle_record(H, State) of 
     {next_record, NewState} -> 
      loop1(Parent, CallBackMod, Debug, NewState, TableMod, Cont, T); 
     {stop, Reason, NewState} -> 
      terminate(Reason, CallBackMod, Debug, NewState); 
     {'EXIT', Reason} -> 
      terminate(Reason, CallBackMod, Debug, State) 
    end; 
loop2(Parent, CallBackMod, Debug, State, TableMod, Cont, []) -> 
    case catch TableMod:select(Cont) of 
     {Matches, NewCont} when is_list(Matches) -> 
      Debug1 = sys:handle_debug(Debug, fun print_event/3, self(), {read, Matches}),
      loop1(Parent, CallBackMod, Debug1, State, TableMod, NewCont, Matches);
     '$end_of_table' -> 
      terminate(eof, CallBackMod, Debug, State); 
     {error, Reason} -> 
      terminate(Reason, CallBackMod, Debug, State); 
     {'EXIT', Reason} -> 
      terminate(Reason, CallBackMod, Debug, State) 
    end. 

-spec terminate(Reason :: term(), CallBackMod :: atom(), Debug :: [gen:dbg_opt()], 
     State :: state()) -> no_return(). 
%% @doc Terminate the {@module} process. 
%% Copied from //stdlib/gen_server:terminate/6. 
%% @hidden 
terminate(Reason, CallBackMod, Debug, State) -> 
    case catch CallBackMod:terminate(Reason, State) of 
     {'EXIT', R} -> 
      error_info(R, State, Debug), 
      exit(R); 
     _ -> 
      case Reason of 
       normal -> 
        exit(normal); 
       shutdown -> 
        exit(shutdown); 
       {shutdown, _} = Shutdown -> 
        exit(Shutdown); 
       _ -> 
        FmtState = case erlang:function_exported(CallBackMod, 
          format_status, 2) of 
         true -> 
          case catch CallBackMod:format_status(terminate, 
            [get(), State]) of 
           {'EXIT', _} -> 
            State; 
           Else -> 
            Else 
          end; 
         _ -> 
          State 
        end, 
        error_info(Reason, FmtState, Debug), 
        exit(Reason) 
      end 
    end. 

-spec error_info(Reason :: term(), State :: state(), 
     Debug :: [gen:dbg_opt()]) -> ok. 
%% @doc Print error log message. 
%% Copied from //stdlib/gen_server:error_info/5. 
%% @hidden 
error_info(Reason, State, Debug) -> 
    Reason1 = case Reason of 
     {undef, [{M, F, A, L} | MFAs]} -> 
      case code:is_loaded(M) of 
       false -> 
        {'module could not be loaded', [{M, F, A, L} | MFAs]}; 
       _ -> 
        case erlang:function_exported(M, F, length(A)) of 
         true -> 
          Reason; 
         false -> 
          {'function not exported', [{M, F, A, L} | MFAs]} 
        end 
      end; 
     _ -> 
      Reason 
    end, 
    format("** Table reader ~p terminating \n" 
      "** When Server state == ~p~n" 
      "** Reason for termination == ~n** ~p~n", 
      [self(), State, Reason1]), 
    sys:print_log(Debug), 
    ok. 

%% Copied from //stdlib/gen_server:opt/2 
opt(Op, [{Op, Value} | _]) -> 
    {ok, Value}; 
opt(Op, [_ | Options]) -> 
    opt(Op, Options); 
opt(_, []) -> 
    false. 

%% Copied from //stdlib/gen_server:debug_options/2 
debug_options(Name, Opts) -> 
    case opt(debug, Opts) of 
     {ok, Options} -> 
      dbg_options(Name, Options); 
     _ -> 
      dbg_options(Name, []) 
    end. 

%% Copied from //stdlib/gen_server:dbg_options/2 
dbg_options(Name, []) -> 
    Opts = case init:get_argument(generic_debug) of 
     error -> 
      []; 
     _ -> 
      [log, statistics] 
    end, 
    dbg_opts(Name, Opts); 
dbg_options(Name, Opts) -> 
    dbg_opts(Name, Opts). 

%% Copied from //stdlib/gen_server:dbg_opts/2 
dbg_opts(Name, Opts) -> 
    case catch sys:debug_options(Opts) of 
     {'EXIT',_} -> 
      format("~p: ignoring erroneous debug options - ~p~n", 
        [Name, Opts]), 
      []; 
     Dbg -> 
      Dbg 
    end. 

-spec print_event(IoDevice :: io:device(), Event :: term(), Pid :: pid()) -> ok. 
%% @doc Called by {@link //sys:handle_debug/4} to print trace events. 
print_event(Dev, {in, Msg}, Pid) -> 
    io:format(Dev, "*DBG* ~p got ~p~n", [Pid, Msg]); 
print_event(Dev, {read, Matches}, Pid) -> 
    io:format(Dev, "*DBG* ~p read ~b records~n", [Pid, length(Matches)]). 
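The suggestion to use mnesia:select/4 could look roughly like the sketch below, shown without the behaviour machinery. It is a minimal illustration under stated assumptions, not part of gen_select: the module name `chunked_select` and the function `foreach_chunk/4` are hypothetical. The continuation returned by mnesia:select/4 is only usable inside the activity that created it, so the whole loop runs in one transaction.

```erlang
-module(chunked_select).
-export([foreach_chunk/4]).

%% Calls Fun(Rows) for each chunk of matching rows. Limit is only a
%% recommendation to Mnesia, so chunks may be smaller (even empty).
%% Returns {atomic, ok} on success.
foreach_chunk(Tab, MatchSpec, Limit, Fun) ->
    mnesia:transaction(fun() ->
        loop(mnesia:select(Tab, MatchSpec, Limit, read), Fun)
    end).

%% Walks the continuation until Mnesia reports the end of the table.
loop('$end_of_table', _Fun) ->
    ok;
loop({Rows, Cont}, Fun) ->
    Fun(Rows),
    loop(mnesia:select(Cont), Fun).
```

Because a transaction fun may be retried, `Fun` is best kept free of side effects that must not happen twice.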

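Thanks @VanceShipley, but as I mentioned, I am executing the selects remotely. I cannot access the database file because it is on a remote node. I mean, I could access the file remotely, just as I can execute selects, but chunking over the network would degrade performance. – Amiramix 2013-05-14 14:29:16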