2014-09-12 92 views
2

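I have a file containing the JSON returned by a curl command. Each object has a set of values, but the keys for those values have the same names in every object. I am trying to use awk/sed to extract values from these JSON objects, but I can't get it to work. Please see the example below.

These objects are part of a larger workflow object. "Cleaning up" is the last operation that runs in our workflow. A JSON file in this format is created for every video that goes through the workflow. (There are more than these three objects; this is just for illustration.)

I want to take the value of completed from the object with "description": "Cleaning up" and store it in the variable $end_time. Then I want to take the value of completed from the object with "description": "Ingest" and store it in the variable $start_time. Subtracting the two gives an integer number of milliseconds, so I can work out how long a video took to get through that part of the process. I'm fine with the math and know how to do it. It's extracting the values that I'm struggling with.

I hope this makes sense. Any help would be greatly appreciated. Thanks in advance!

Edit: I removed the original code from the post because of the character limit.

Here is a proper example of the file I am working with: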

{ 
    "workflows": { 
     "count": "20", 
     "searchTime": "1", 
     "startPage": "0", 
     "totalCount": "1", 
     "workflow": { 
      "configurations": { 
       "configuration": [ 
        { 
         "$": "1409750880000", 
         "key": "schedule.start" 
        }, 
        { 
         "$": "1409755980000", 
         "key": "schedule.stop" 
        }, 
        { 
         "$": "Capture_agent", 
         "key": "schedule.location" 
        }, 
        { 
         "$": "false", 
         "key": "trimHold" 
        }, 
        { 
         "$": "true", 
         "key": "archiveOp" 
        }, 
        { 
         "$": "false", 
         "key": "captionHold" 
        }, 
        { 
         "$": "false", 
         "key": "videoPreview" 
        } 
       ] 
      }, 
      "creator": { 
       "organization": "mh_default_org", 
       "roles": [ 
        "76b1bdde-a080-40a4-b929-bde89af6a0a8_Instructor", 
        "ROLE_ADMIN", 
        "ROLE_ANONYMOUS", 
        "ROLE_USER" 
       ], 
       "userName": "user_name" 
      }, 
      "description": "This workflow definition defines the steps involved in scheduling a recording, capturing it, and\n ingesting it, after which processing operations may be added.\n ", 
      "errors": "", 
      "id": "15518", 
      "mediapackage": { 
       "attachments": "", 
       "creators": { 
        "creator": "Name" 
       }, 
       "id": "2d25ed19-2978-458d-a4a0-c9c56d791c68", 
       "license": "Creative Commons 3.0: Attribution-NonCommercial-NoDerivs", 
       "media": "", 
       "metadata": "", 
       "publications": { 
        "publication": { 
         "channel": "engage-player", 
         "id": "b7b68f91-2c33-4673-ba7c-2e9b891788f9", 
         "mimetype": "text/html", 
         "tags": "", 
         "url": "http://some.url.com:80/engage/ui/watch.html?id=2d25ed19-2978-458d-a4a0-c9c56d791c68" 
        } 
       }, 
       "series": "76b1bdde-a080-40a4-b929-bde89af6a0a8", 
       "seriestitle": "Recording_Title_user_name", 
       "start": "2014-09-03T13:28:00Z", 
       "title": "Recording_Title" 
      }, 
      "operations": { 
       "operation": [ 
        { 
         "abortable": "false", 
         "completed": 1409750882092, 
         "configurations": { 
          "configuration": [ 
           { 
            "$": "1409750880000", 
            "key": "schedule.start" 
           }, 
           { 
            "$": "1409755980000", 
            "key": "schedule.stop" 
           }, 
           { 
            "$": "Capture_agent", 
            "key": "schedule.location" 
           } 
          ] 
         }, 
         "continuable": "false", 
         "description": "Scheduled", 
         "execution-history": "", 
         "execution-host": "http://some.url.com:8080", 
         "fail-on-error": "true", 
         "failed-attempts": "0", 
         "hold-action-title": "View schedule", 
         "holdurl": "/workflow/hold/org.opencastproject.workflow.handler.scheduleworkflowoperationhandler", 
         "id": "schedule", 
         "job": "15519", 
         "max-attempts": "1", 
         "retry-strategy": "none", 
         "started": 1409750881745, 
         "state": "SUCCEEDED", 
         "time-in-queue": 0 
        }, 
        { 
         "abortable": "false", 
         "configurations": "", 
         "continuable": "false", 
         "description": "Capture", 
         "execution-history": "", 
         "execution-host": "http://some.url.com:8080", 
         "fail-on-error": "true", 
         "failed-attempts": "0", 
         "hold-action-title": "Monitor capture", 
         "holdurl": "/workflow/hold/org.opencastproject.workflow.handler.captureworkflowoperationhandler", 
         "id": "capture", 
         "job": "42894", 
         "max-attempts": "1", 
         "retry-strategy": "none", 
         "started": 1409750884085, 
         "state": "SKIPPED", 
         "time-in-queue": 0 
        }, 
        { 
         "completed": 1409756171224, 
         "configurations": "", 
         "description": "Ingest", 
         "execution-history": "", 
         "fail-on-error": "true", 
         "failed-attempts": "0", 
         "id": "ingest", 
         "max-attempts": "1", 
         "retry-strategy": "none", 
         "state": "SUCCEEDED" 
        },      
        { 
         "completed": 1409854379552, 
         "configurations": { 
          "configuration": { 
           "key": "preserve-flavors" 
          } 
         }, 
         "description": "Cleaning up", 
         "execution-history": "", 
         "execution-host": "http://some.url.com:8080", 
         "fail-on-error": "false", 
         "failed-attempts": "0", 
         "id": "cleanup", 
         "job": "45113", 
         "max-attempts": "1", 
         "retry-strategy": "none", 
         "started": 1409854378128, 
         "state": "SUCCEEDED", 
         "time-in-queue": 0 
        } 
       ] 
      }, 
      "organization": { 
       "adminRole": "ROLE_ADMIN", 
       "anonymousRole": "ROLE_ANONYMOUS", 
       "id": "mh_default_org", 
       "name": "Opencast Project", 
       "properties": { 
        "property": [ 
         { 
          "$": "true", 
          "key": "adminui.i18n_tab_episode.enable" 
         }, 
         { 
          "$": "false", 
          "key": "adminui.i18n_tab_users.enable" 
         }, 
         { 
          "$": "/engage/ui/img/mh_logos/OpencastLogo.png", 
          "key": "logo_small" 
         }, 
         { 
          "$": "http://opencast.org/matterhorn/", 
          "key": "engageui.link_mobile_redirect.url" 
         }, 
         { 
          "$": "false", 
          "key": "engageui.annotations.enable" 
         }, 
         { 
          "$": "true", 
          "key": "engageui.links_media_module.enable" 
         }, 
         { 
          "$": "2024", 
          "key": "adminui.chunksize" 
         }, 
         { 
          "$": "false", 
          "key": "adminui.series_prepopulate.enable" 
         }, 
         { 
          "$": "true", 
          "key": "engageui.link_download.enable" 
         }, 
         { 
          "$": "false", 
          "key": "engageui.link_mobile_redirect.enable" 
         }, 
         { 
          "$": "For more information have a look at the official site.", 
          "key": "engageui.link_mobile_redirect.description" 
         }, 
         { 
          "$": "/engage/ui/img/mh_logos/MatterhornLogo_large.png", 
          "key": "logo_large" 
         } 
        ] 
       }, 
       "servers": { 
        "server": { 
         "name": "localhost", 
         "port": "8080" 
        } 
       } 
      }, 
      "parent": { 
       "nil": "true" 
      }, 
      "state": "SUCCEEDED", 
      "template": "full", 
      "title": "Scheduled Workflow" 
     } 
    } 
} 
+3

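Try a JSON parser instead of awk or sed. – 2014-09-12 14:17:27

+0

Try using Python. – 2014-09-12 14:21:55

+0

awk and sed are built around regular expressions, and as you've found, you can't parse a JSON structure with regular expressions. For more complex structures like XML and JSON you need something like Python or Perl, which have modules that can handle these data structures. – 2014-09-12 14:22:08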

Answers

1

Here is a jq example that should point you toward what you want:

#!/bin/bash 
# Assuming the json is in a file workflow.json 
end_time=$(jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json) 
start_time=$(jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json) 

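This assumes the operations sit in the array at .workflows.workflow.operations.operation, as in the JSON you posted. Here it is on the command line: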

$ jq '.workflows.workflow.operations.operation[] | select(.description == "Ingest") | .completed' < workflow.json 
1406051539118 
$ jq '.workflows.workflow.operations.operation[] | select(.description == "Cleaning up") | .completed' < workflow.json 
1406051695440 
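
To tie the two lookups to the subtraction the question asks about, here is a minimal sketch. The helper names completed_for and elapsed_ms are made up for illustration, and it assumes the JSON is saved as workflow.json with the same layout as the posted sample. It uses jq's --arg option to pass the description in as a variable, and -r for raw output.

```shell
#!/bin/bash
# Sketch only: completed_for and elapsed_ms are hypothetical helper names.

# Print the "completed" timestamp (epoch milliseconds) of the operation whose
# description equals $1, reading the JSON from the file named in $2.
completed_for() {
    jq -r --arg desc "$1" \
        '.workflows.workflow.operations.operation[]
         | select(.description == $desc)
         | .completed' "$2"
}

# Print end minus start; $1 is the start and $2 is the end, both in milliseconds.
elapsed_ms() {
    echo $(( $2 - $1 ))
}

# Usage, assuming the JSON is saved as workflow.json:
#   start_time=$(completed_for "Ingest" workflow.json)
#   end_time=$(completed_for "Cleaning up" workflow.json)
#   elapsed_ms "$start_time" "$end_time"
```

With the two completed values in the posted sample (1409756171224 for Ingest and 1409854379552 for Cleaning up), the difference is 98208328 ms, a little over 27 hours. Note that an operation with no completed key, such as "Capture" in the sample, makes jq print null, so it is worth checking the value before doing arithmetic on it.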
+0

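Thank you very much. I'll test it tomorrow and let you know! Thanks a lot! – BashNewbie 2014-09-13 18:13:07

+0

I'm struggling a bit here. I've tried your commands, but I keep getting an error: "jq: error: Cannot iterate over null". I've added the complete JSON file to the original post, since there aren't enough characters in this comment. – BashNewbie 2014-09-14 11:10:00

+0

I updated the answer based on the new JSON, but I don't see "Extracting text..." in the example. If you look for that, you'll get a blank result. However, the "Cleaning up" example works with the JSON you provided. The thing to remember is that the first part is the array you want to look in, and the original example didn't have the full path. I would try the 'jq' command on the command line first to verify that it gives you what you want. – zerodiff 2014-09-14 17:47:13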
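
One way to track down a "Cannot iterate over null" error is to walk the path one level at a time and look at the keys jq actually finds there. The following is a rough sketch, not part of the original answer: keys_at is a hypothetical helper name and workflow.json is an assumed file name. It relies only on jq's built-in keys function and the -c (compact output) option.

```shell
#!/bin/bash
# Sketch only: keys_at is a hypothetical helper name.

# Print the sorted key names found at jq path $1 in the JSON file $2,
# so a wrong or incomplete path shows up before you iterate over it.
keys_at() {
    jq -c "$1 | keys" "$2"
}

# Usage, assuming the JSON is saved as workflow.json:
#   keys_at '.' workflow.json                     # top-level keys
#   keys_at '.workflows' workflow.json            # one level down
#   keys_at '.workflows.workflow' workflow.json   # should include "operations"
#
# Appending ? to the iterator, as in '.workflow[]?', makes jq skip a path
# that is null instead of reporting an error; the result is then empty.
```

If a key you expect is missing from the list, the path in the filter needs another segment in front of it, which is what the comment above means by the full path.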
