
Our requirement is to programmatically back up Google Datastore and load those backups into Google BigQuery for further analysis. We successfully automated the backups from Datastore to Google Cloud Storage using the following approach:

    Queue queue = QueueFactory.getQueue("datastoreBackupQueue");

    /*
     * Create a task equivalent to the backup URL mentioned in the
     * cron.xml above, using a new queue that has Datastore admin enabled
     */
    TaskOptions taskOptions = TaskOptions.Builder.withUrl("/_ah/datastore_admin/backup.create")
            .method(TaskOptions.Method.GET).param("name", "").param("filesystem", "gs")
            .param("gs_bucket_name",
                    "db-backup" + "/" + TimeUtils.parseDateToString(new Date(), "yyyy/MMM/dd"))
            .param("queue", queue.getQueueName());

    /*
     * Get the list of dynamic entity kind names from the datastore,
     * based on the kinds present at the start of the backup
     */
    List<String> entityNames = getEntityNamesForBackup();
    for (String entityName : entityNames) {
        taskOptions.param("kind", entityName);
    }

    /* Add this task to the queue created above */
    queue.add(taskOptions);
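`TimeUtils.parseDateToString` above is a project-specific helper, not a standard API. A minimal sketch of what it might do, using `SimpleDateFormat` (the class and method names here just mirror the snippet's assumptions):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class TimeUtils {
    /**
     * Formats a date with the given pattern, e.g. "yyyy/MMM/dd" -> "2016/Jan/05",
     * so each day's backup lands under its own date-based GCS "folder".
     */
    public static String parseDateToString(Date date, String pattern) {
        return new SimpleDateFormat(pattern, Locale.US).format(date);
    }
}
```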

I was able to import this backup into Google BigQuery manually, but how can we automate this process as well?

I have also looked through most of the documentation, and it wasn't much help: https://cloud.google.com/bigquery/docs/loading-data-cloud-storage#loading_data_from_google_cloud_storage

Answers

1

I solved this myself; below is the solution in Java. The following code picks up the backup files from Google Cloud Storage and loads them into Google BigQuery.

    AppIdentityCredential bqCredential = new AppIdentityCredential(
            Collections.singleton(BigqueryScopes.BIGQUERY));

    AppIdentityCredential dsCredential = new AppIdentityCredential(
            Collections.singleton(StorageScopes.CLOUD_PLATFORM));

    Storage storage = new Storage(HTTP_TRANSPORT, JSON_FACTORY, dsCredential);
    Objects list = storage.objects().list(bucket).setPrefix(prefix).setFields("items/name").execute();

    if (list == null) {
        Log.severe(BackupDBController.class, "BackupToBigQueryController",
                "List from Google Cloud Storage was null", null);
    } else if (list.isEmpty()) {
        Log.severe(BackupDBController.class, "BackupToBigQueryController",
                "List from Google Cloud Storage was empty", null);
    } else {
        for (String kind : getEntityNamesForBackup()) {
            Job job = new Job();
            JobConfiguration config = new JobConfiguration();
            JobConfigurationLoad loadConfig = new JobConfigurationLoad();

            /* Find the .backup_info file for this kind */
            String url = "";
            for (StorageObject obj : list.getItems()) {
                String currentUrl = obj.getName();
                if (currentUrl.contains(kind + ".backup_info")) {
                    url = currentUrl;
                    break;
                }
            }

            if (StringUtils.isStringEmpty(url)) {
                continue;
            }
            url = "gs://" + bucket + "/" + url;

            List<String> gsUrls = new ArrayList<>();
            gsUrls.add(url);

            loadConfig.setSourceUris(gsUrls);
            loadConfig.setSourceFormat("DATASTORE_BACKUP");
            loadConfig.setAllowQuotedNewlines(true);

            TableReference table = new TableReference();
            table.setProjectId(projectId);
            table.setDatasetId(datasetId);
            table.setTableId(kind);
            loadConfig.setDestinationTable(table);

            config.setLoad(loadConfig);
            job.setConfiguration(config);

            Bigquery bigquery = new Bigquery.Builder(HTTP_TRANSPORT, JSON_FACTORY, bqCredential)
                    .setApplicationName("BigQuery-Service-Accounts/0.1").setHttpRequestInitializer(bqCredential)
                    .build();
            Insert insert = bigquery.jobs().insert(projectId, job);

            JobReference jr = insert.execute().getJobReference();
            Log.info(BackupDBController.class, "BackupToBigQueryController",
                    "Load job " + jr.getJobId() + " submitted to BigQuery", null);
        }
    }
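One caveat: `jobs().insert(...)` only submits the load job, which then runs asynchronously, so the code above returns before the load has actually finished. To confirm completion you would poll `bigquery.jobs().get(projectId, jobId)` until the returned state is `DONE`. The polling loop itself can be kept generic; a minimal sketch, where the `Supplier` is assumed to wrap the actual BigQuery status call:

```java
import java.util.function.Supplier;

public class JobPoller {
    /**
     * Polls stateFetcher until it reports "DONE", sleeping between attempts.
     * Returns true if the job finished within maxAttempts, false otherwise.
     */
    public static boolean waitUntilDone(Supplier<String> stateFetcher,
                                        int maxAttempts, long sleepMillis)
            throws InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            if ("DONE".equals(stateFetcher.get())) {
                return true;
            }
            Thread.sleep(sleepMillis);
        }
        return false;
    }
}
```

In the answer's code the fetcher would be something along the lines of `() -> bigquery.jobs().get(projectId, jr.getJobId()).execute().getStatus().getState()`, after which any load errors can be read from `getStatus().getErrorResult()`.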

If anyone has a better approach, please let me know.

1

The loading data from Google Cloud Storage article that you mentioned in your question only describes programmatic examples of importing from GCS using the command line, Node.js, or Python.

You can also automate the import of data located in Cloud Storage into BigQuery by running the following command in a script:

$ gcloud alpha bigquery import SOURCE DESTINATION_TABLE 

For more information on this command, see this article.
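To run this on a schedule, the command can be wrapped in a small script and invoked from cron. A dry-run sketch (the bucket path, kind names, and dataset are hypothetical; drop the `echo` to actually execute):

```shell
#!/bin/sh
# Hypothetical backup location and entity kinds -- adjust to your project.
BUCKET="gs://db-backup/2016/Jan/05"
DATASET="backup_dataset"

for KIND in User Order; do
  # Each kind's .backup_info file is the import source for its table.
  echo gcloud alpha bigquery import \
    "$BUCKET/$KIND.backup_info" "$DATASET.$KIND"
done
```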
