2017-03-16 153 views

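TensorFlow: saving/restoring session, checkpoint, metagraph

I have been trying to restore a model in TensorFlow, but I keep running into problems when I try to import the metagraph.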

This is my code for importing the metagraph:

#Create a clean graph and import MetaGraphDef nodes 
new_graph = tf.Graph() 
with tf.Session(graph=new_graph) as sess: 
    # Import the previously exported metagraph 
    saver = tf.train.import_meta_graph('/tmp/saver-model.meta') 
    saver.restore(sess, tf.train.latest_checkpoint('./')) 

In my model class I define the placeholders and add them to a collection as follows:

"""Place Holders""" 
    self.input = tf.placeholder(tf.float32, [None, sl], name = 'input') 
    self.labels = tf.placeholder(tf.int64, [None], name = 'labels') 
    self.keep_prob = tf.placeholder("float", name= 'Drop_out_keep_prob') 
    tf.add_to_collection('vars', self.input) 
    tf.add_to_collection('vars', self.labels) 
    tf.add_to_collection('vars', self.keep_prob) 

I train my model as follows:

saver = tf.train.Saver(tf.global_variables()) 
# Session time 
sess = tf.Session() # without context manager, close the session later. 
writer = tf.summary.FileWriter("/tmp/model/log_tb", sess.graph) # Writer for tensorboard 
sess.run(model.init_op) 

self.init_op = tf.global_variables_initializer()

and export it using these three different options, including the undocumented export_scoped_meta_graph:

# Export the model to /tmp/my-model.meta. 
scoped_meta = meta_graph.export_scoped_meta_graph(filename='/tmp/scoped.meta') 
meta_graph_def = tf.train.export_meta_graph(filename='/tmp/my-model.meta') 
saver.save(sess, '/tmp/saver-model') 

This is the error I get when trying to run it under Windows 10:

E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "BestSplits" device_type: "CPU"') for unknown op: BestSplits 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "CountExtremelyRandomStats" device_type: "CPU"') for unknown op: CountExtremelyRandomStats 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "FinishedNodes" device_type: "CPU"') for unknown op: FinishedNodes 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "GrowTree" device_type: "CPU"') for unknown op: GrowTree 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "ReinterpretStringToFloat" device_type: "CPU"') for unknown op: ReinterpretStringToFloat 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "SampleInputs" device_type: "CPU"') for unknown op: SampleInputs 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "ScatterAddNdim" device_type: "CPU"') for unknown op: ScatterAddNdim 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TopNInsert" device_type: "CPU"') for unknown op: TopNInsert 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TopNRemove" device_type: "CPU"') for unknown op: TopNRemove 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TreePredictions" device_type: "CPU"') for unknown op: TreePredictions 
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "UpdateFertileSlots" device_type: "CPU"') for unknown op: UpdateFertileSlots 
TypeError: expected bytes, NoneType found 

During handling of the above exception, another exception occurred: 


--------------------------------------------------------------------------- 
TypeError         Traceback (most recent call last) 
TypeError: expected bytes, NoneType found 

During handling of the above exception, another exception occurred: 

SystemError        Traceback (most recent call last) 
<ipython-input-37-60792895b01c> in <module>() 
     6  #saver = tf.train.import_meta_graph('/tmp/saver-model.meta') 
     7  saver = tf.train.import_meta_graph('/tmp/my-model.meta') 
----> 8  saver.restore(sess, tf.train.latest_checkpoint('./')) 

c:\users\carlos\anaconda3\lib\site-packages\tensorflow\python\training\saver.py in restore(self, sess, save_path) 
    1437  return 
    1438  sess.run(self.saver_def.restore_op_name, 
-> 1439    {self.saver_def.filename_tensor_name: save_path}) 
    1440 
    1441 @staticmethod 

c:\users\carlos\anaconda3\lib\site-packages\tensorflow\python\client\session.py in run(self, fetches, feed_dict, options, run_metadata) 
    765  try: 
    766  result = self._run(None, fetches, feed_dict, options_ptr, 
--> 767       run_metadata_ptr) 
    768  if run_metadata: 
    769   proto_data = tf_session.TF_GetBuffer(run_metadata_ptr) 

c:\users\carlos\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run(self, handle, fetches, feed_dict, options, run_metadata) 
    963  if final_fetches or final_targets: 
    964  results = self._do_run(handle, final_targets, final_fetches, 
--> 965        feed_dict_string, options, run_metadata) 
    966  else: 
    967  results = [] 

c:\users\carlos\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata) 
    1013  if handle is None: 
    1014  return self._do_call(_run_fn, self._session, feed_dict, fetch_list, 
-> 1015       target_list, options, run_metadata) 
    1016  else: 
    1017  return self._do_call(_prun_fn, self._session, handle, feed_dict, 

c:\users\carlos\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _do_call(self, fn, *args) 
    1020 def _do_call(self, fn, *args): 
    1021  try: 
-> 1022  return fn(*args) 
    1023  except errors.OpError as e: 
    1024  message = compat.as_text(e.message) 

c:\users\carlos\anaconda3\lib\site-packages\tensorflow\python\client\session.py in _run_fn(session, feed_dict, fetch_list, target_list, options, run_metadata) 
    1002   return tf_session.TF_Run(session, options, 
    1003         feed_dict, fetch_list, target_list, 
-> 1004         status, run_metadata) 
    1005 
    1006  def _prun_fn(session, handle, feed_dict, fetch_list): 

SystemError: <built-in function TF_Run> returned a result with an error set 

And this is what I get when trying to run it under Debian:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0 1 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 1: Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:01:00.0) 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:02:00.0) 
Traceback (most recent call last): 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1022, in _do_call 
    return fn(*args) 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1004, in _run_fn 
    status, run_metadata) 
    File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__ 
    next(self.gen) 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status 
    pywrap_tensorflow.TF_GetCode(status)) 
tensorflow.python.framework.errors_impl.InternalError: Unable to get element from the feed as bytes. 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "<stdin>", line 3, in <module> 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/training/saver.py", line 1439, in restore 
    {self.saver_def.filename_tensor_name: save_path}) 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 767, in run 
    run_metadata_ptr) 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 965, in _run 
    feed_dict_string, options, run_metadata) 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1015, in _do_run 
    target_list, options, run_metadata) 
    File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/client/session.py", line 1035, in _do_call 
    raise type(e)(node_def, op, message) 
tensorflow.python.framework.errors_impl.InternalError: Unable to get element from the feed as bytes. 
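You don't need to call 'export_scoped_meta_graph' and 'export_meta_graph'. By default, 'Saver' saves the metagraph together with the checkpoint. – Temak

To clarify, I tried the three ways independently of each other and got the errors I posted. I believe the problem is with the checkpoint. – SerialDev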

Answer

I managed to solve it and decided to share the solution in case someone comes across this in the future:

Add all the placeholders to a collection:

tf.add_to_collection('vars', input) 
tf.add_to_collection('vars', labels) 
tf.add_to_collection('vars', keep_prob) 
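The point of putting the placeholders in a collection is that they can be fetched back by collection name once the metagraph has been imported. Here is a minimal sketch of that round trip. Several things in it are assumptions for illustration and not part of the original model: the toy weight `w`, the extra 'outputs' collection, the temporary directory, and the use of `tensorflow.compat.v1` so that it also runs on newer installs (on TensorFlow 1.0 a plain `import tensorflow as tf` would be the equivalent).

```python
import os
import tempfile

import tensorflow.compat.v1 as tf  # assumption: TF 1.14+ or 2.x with the v1 compat API

save_dir = tempfile.mkdtemp()                    # hypothetical location
prefix = os.path.join(save_dir, "model.ckpt")

# Build and save a toy graph with the placeholder registered in a collection.
train_graph = tf.Graph()
with train_graph.as_default():
    x = tf.placeholder(tf.float32, [None, 3], name="input")
    w = tf.get_variable("w", shape=[3, 1], initializer=tf.ones_initializer())
    y = tf.matmul(x, w, name="output")
    tf.add_to_collection("vars", x)
    tf.add_to_collection("outputs", y)           # extra collection, illustration only
    saver = tf.train.Saver()
    with tf.Session(graph=train_graph) as sess:
        sess.run(tf.global_variables_initializer())
        saved_path = saver.save(sess, prefix)    # also writes saved_path + ".meta"

# Import the metagraph into a clean graph and look the tensors up by collection.
new_graph = tf.Graph()
with new_graph.as_default():
    with tf.Session(graph=new_graph) as sess:
        restored_saver = tf.train.import_meta_graph(saved_path + ".meta")
        restored_saver.restore(sess, saved_path)
        (x_restored,) = tf.get_collection("vars")
        (y_restored,) = tf.get_collection("outputs")
        # All weights are ones, so the output is the sum of the input row.
        result = sess.run(y_restored, {x_restored: [[1.0, 2.0, 3.0]]})

print(result)
```

The tensors come back in the order they were added, so with three placeholders as in the question, `tf.get_collection('vars')` would be unpacked into three names.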

Merge the summaries and initialize the variables independently (avoid using tf.global_variables_initializer()):

merged = tf.summary.merge([loss_summ, cost_summ, tloss_summ, acc_summ]) 

Save the model during training:

if i%100 == 0: 
    saver.save(sess, save_dir + 'model.ckpt', global_step=i+100) 
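Two details of this call are easy to miss: `save_dir + 'model.ckpt'` only yields the intended path if `save_dir` ends with a path separator, and when `global_step` is passed, `Saver.save` appends `-<step>` to the prefix. The sketch below shows both in plain Python. The helper names and the "checkpoints" directory are hypothetical, and `os.path.join` is used in place of string concatenation.

```python
import os


def checkpoint_prefix(save_dir, name="model.ckpt"):
    """Join directory and file name without relying on a trailing separator."""
    return os.path.join(save_dir, name)


def checkpoint_name(prefix, global_step):
    """Mirror the '<prefix>-<step>' naming used when global_step is given."""
    return "%s-%d" % (prefix, global_step)


prefix = checkpoint_prefix("checkpoints")  # hypothetical directory
# Same schedule as the loop above: save every 100 iterations, step = i + 100.
names = [checkpoint_name(prefix, i + 100) for i in range(0, 300, 100)]

for name in names:
    print(name)
```

These names are the prefixes to pass to `saver.restore`; the files on disk carry additional suffixes such as `.meta`.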

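Initialize the new metagraph, including the saver, before importing the metagraph into the new session:

This prevents the saver.saver_def.filename_tensor_name error:

The name 'save/Const:0' refers to a Tensor which does not exist

This is because: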

* The default name scope for a tf.train.Saver is "save/" and the placeholder 
is actually a tf.constant() whose name defaults to "Const:0", which explains 
why the flag defaults to "save/Const:0". 



saver = tf.train.Saver() 
sess = tf.Session() 
sess.run(init_op) 

Get the checkpoint using tf.train.get_checkpoint_state():

sess = tf.Session()
ckpt = tf.train.get_checkpoint_state(save_dir) 
saver.restore(sess, ckpt.model_checkpoint_path)
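A note on the original tracebacks: both tf.train.latest_checkpoint() and tf.train.get_checkpoint_state() return None when no checkpoint is found in the given directory. In the question the model was saved under /tmp while the restore looked in './', so one plausible reading is that None reached saver.restore(), which would match "expected bytes, NoneType found". The helper below is a hypothetical guard, not part of TensorFlow, that fails early with a readable message instead.

```python
def checkpoint_path_or_raise(path, searched_dir):
    """Return `path` unchanged, or fail early if no checkpoint was found."""
    if not path:
        raise FileNotFoundError(
            "No checkpoint found in %r. Was the model saved to a different "
            "directory?" % (searched_dir,)
        )
    return path


# Intended use with the calls from this thread (kept as comments so the
# helper itself has no TensorFlow dependency):
#   ckpt = tf.train.get_checkpoint_state(save_dir)
#   path = checkpoint_path_or_raise(ckpt and ckpt.model_checkpoint_path, save_dir)
#   saver.restore(sess, path)
```

With this in place, a wrong directory produces an error that names the directory instead of a SystemError from inside TF_Run.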