Algorithmia model persistence with Sklearn

I'm very new to Algorithmia, but I've used scikit-learn a bit, and I know how to persist my machine learning model with joblib after training it:

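from sklearn.ensemble import RandomForestRegressor
# On newer scikit-learn releases, sklearn.externals.joblib is gone; use `import joblib` instead
from sklearn.externals import joblib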

model = RandomForestRegressor() 
# Train the model, etc 
joblib.dump(model, "prediction/model/model.pkl")

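Now I'd like to host my ML model and call it as a service using Algorithmia, but I can't figure out how to read the model back in. I created a collection in Algorithmia named "testcollection" containing a file named "model.pkl", which is the output of the joblib.dump call. According to the documentation, this means my file should be located at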

data://(username)/testcollection/model.pkl

I'd like to read that file into a model using joblib.load. Here is my current Algorithmia algorithm:

import Algorithmia
from sklearn.externals import joblib

def apply(input): 
    client = Algorithmia.client() 
    f = client.file("data://(username)/testcollection/model.pkl") 
    print(f.path) 
    print(f.url) 
    print(f.getName()) 
    model = joblib.load(f.url) # Or f.path, both don't work 
    return "empty"

Here is the output:

(username)/testcollection/model.pkl 
/v1/data/(username)/testcollection/model.pkl 
model.pkl

and it fails on the joblib.load line with "No such file or directory" (whichever path I pass in).

Here are all the paths/URLs I tried passing to joblib.load:

/v1/data/(username)/testcollection/model.pkl
data://(username)/testcollection/model.pkl
(username)/testcollection/model.pkl
https://algorithmia.com/v1/data/(username)/testcollection/model.pkl

How can I load the model from the file using joblib? Am I going about this the wrong way?

Source

2016-12-15 Nick

I think you just need to replace 'f.url' with 'f.name'. path and url are supposed to be private fields inside the DataFile object... but it's Python, so nothing is private – jamesatha

There are several ways to access data on the Data API.

Here are 4 different ways to access a file through the Python client.

import Algorithmia

client = Algorithmia.client("<YOUR_API_KEY>")

# As a file object
dataFile = client.file("data://<USER_NAME>/<COLLECTION_NAME>/<FILE_NAME>").getFile()

# As a string
dataText = client.file("data://<USER_NAME>/<COLLECTION_NAME>/<FILE_NAME>").getString()

# As parsed JSON
dataJSON = client.file("data://<USER_NAME>/<COLLECTION_NAME>/<FILE_NAME>").getJson()

# As raw bytes
dataBytes = client.file("data://<USER_NAME>/<COLLECTION_NAME>/<FILE_NAME>").getBytes()

Since Sklearn expects a path to the model file, the easiest way to get one is through a file object (a.k.a. the data file).
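For context, the strings tried in the question fail because none of them names a file on the local disk; they are Data API URIs or URL fragments. A minimal sketch of that check, using only the standard library and assuming the working directory contains no files with these (unlikely) names:

```python
import os

# The four strings from the question. They identify the file on the
# Data API, not on the filesystem of the machine running the algorithm.
candidates = [
    "/v1/data/(username)/testcollection/model.pkl",
    "data://(username)/testcollection/model.pkl",
    "(username)/testcollection/model.pkl",
    "https://algorithmia.com/v1/data/(username)/testcollection/model.pkl",
]


def local_file_status(paths):
    """Map each string to whether it is an existing local file."""
    return {p: os.path.isfile(p) for p in paths}


status = local_file_status(candidates)
for path, found in status.items():
    print(found, path)
```

Any loader that opens its argument as a local path will report "No such file or directory" for strings like these, which matches the error in the question.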

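According to the Official Python 2.7 Documentation, when a file object is created with the open() function, its name attribute is the name of the file, which in practice is the path it was opened with.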

In this case, you would write something like this:

import Algorithmia
from sklearn.externals import joblib

def apply(input): 

    # You don't need to write your API key if you're editing in the web editor 
    client = Algorithmia.client() 

    modelFile = client.file("data://(username)/testcollection/model.pkl").getFile() 

    modelFilePath = modelFile.name 

    model = joblib.load(modelFilePath) 

    return "empty"
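Why a file object's name can serve as a path is easy to try with a stand-in. The sketch below does not call Algorithmia at all: download_to_temp is a hypothetical helper that imitates a client writing downloaded bytes to a temporary file and handing back an open handle. That is an assumption about how such a client might behave, not a description of its internals.

```python
import os
import tempfile


def download_to_temp(payload):
    """Hypothetical stand-in for a download: save bytes locally, return a handle."""
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".pkl")
    tmp.write(payload)
    tmp.close()
    # Re-open with open(), so .name is the path that was passed in
    return open(tmp.name, "rb")


handle = download_to_temp(b"not a real model")
local_path = handle.name                 # a real path on the local disk
path_exists = os.path.isfile(local_path)
print(local_path, path_exists)

# Clean up the temporary copy
handle.close()
os.remove(local_path)
```

Because the handle comes from open(), its name is an ordinary filesystem path, which is the kind of value a path-based loader can work with.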

However, according to the Official Sklearn Model Persistence Documentation, you should also be able to pass a file-like object instead of a filename.

So we can skip the part where we get the filename and just pass the modelFile object:

import Algorithmia
from sklearn.externals import joblib

def apply(input): 

    # You don't need to write your API key if you're editing in the web editor 
    client = Algorithmia.client() 

    modelFile = client.file("data://(username)/testcollection/model.pkl").getFile() 

    model = joblib.load(modelFile) 

    return "empty"
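The difference between loading by path and loading from a file-like object can be tried locally without an Algorithmia account. This sketch swaps in the standard library's pickle for joblib so that it runs without scikit-learn, and uses a plain dict as a placeholder for a trained model. load_model is a hypothetical helper that accepts either a path or an open binary file object.

```python
import io
import os
import pickle
import tempfile

# Placeholder for a trained model (an assumption for illustration)
model = {"n_estimators": 10, "feature_names": ["a", "b"]}

# Persist to a temporary file, playing the role of joblib.dump
with tempfile.NamedTemporaryFile(delete=False, suffix=".pkl") as tmp:
    pickle.dump(model, tmp)
    model_path = tmp.name


def load_model(source):
    """Load from a filesystem path or from an open binary file object."""
    if isinstance(source, str):
        with open(source, "rb") as fh:
            return pickle.load(fh)
    return pickle.load(source)


# Variant 1: pass the path
from_path = load_model(model_path)

# Variant 2: pass an open file object
with open(model_path, "rb") as fh:
    from_handle = load_model(fh)

# Variant 3: pass an in-memory buffer built from raw bytes
with open(model_path, "rb") as fh:
    from_memory = load_model(io.BytesIO(fh.read()))

os.remove(model_path)
print(from_path == from_handle == from_memory == model)
```

If loading from a handle fails, checking that the handle was opened in binary mode is a reasonable first step, since pickled data is bytes.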

Edit: Here's also an article in the Official Algorithmia Developer Center talking about Model Persistence in Scikit-Learn.

Disclosure: I work at Algorithmia as an algorithm engineer.

Source

2016-12-15 23:12:56

This is awesome, thank you! – Nick
