2012-02-09 155 views
0

Importing CSV data into MySQL: consider the following fragment of CSV data from "NASDAQ.csv"

"Symbol,""Name"",""LastSale"",""MarketCap"",""ADR TSO"",""IPOyear"",""Sector"",""industry"",""Summary Quote"",";; 
"FLWS,""1-800 FLOWERS.COM, Inc."",""2.9"",""81745200"",""n/a"",""1999"",""Consumer Services"",""Other Specialty Stores"",""http://www.nasdaq.com/symbol/flws"",";; 
"FCTY,""1st Century Bancshares, Inc"",""4"",""36172000"",""n/a"",""n/a"",""Finance"",""Major Banks"",""http://www.nasdaq.com/symbol/fcty"",";; 
"FCCY,""1st Constitution Bancorp (NJ)"",""8.8999"",""44908895.4"",""n/a"",""n/a"",""Finance"",""Savings Institutions"",""http://www.nasdaq.com/symbol/fccy"",";; 

I'm trying to import the symbol, name, sector, and industry into the corresponding fields of a MySQL table:

$path = "NASDAQ.csv"; 
$row = 1; 
if (($handle = fopen($path, "r")) !== FALSE) { 
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) { 
    $row++; 
    $entries[] = $data ; 
    } 
    fclose($handle); 
} 

foreach ($entries as $line) { 
    db_query(" 
    INSERT INTO us_stocks (symbol, name, sector, industry) 
    VALUES ('%s', '%s', '%s', '%s')", 
    $line[0], $line[1], $line[6], $line[7] 
); 
} 

However, the result is not what I expected. In the database, only the symbol field gets filled, and even that is not correct:

symbol  name sector industry 
---------------------------------- 
Symbol,"Na 
FLWS,"1-80 
FCTY,"1st 
FCCY,"1st 

What am I doing wrong?

[EDIT]

If I print_r($entries), the output looks like:

Array (
    [0] => Array(
    [0] => Symbol,"Name","LastSale","MarketCap","ADR TSO","IPOyear","Sector","industry","Summary Quote",;; 
) 
    [1] => Array(
    [0] => FLWS,"1-800 FLOWERS.COM, Inc.","2.9","81745200","n/a","1999","Consumer Services","Other Specialty Stores","http://www.nasdaq.com/symbol/flws",;; 
) 
    [2] => Array(
    [0] => FCTY,"1st Century Bancshares, Inc","4","36172000","n/a","n/a","Finance","Major Banks","http://www.nasdaq.com/symbol/fcty",;; 
) 
) 

[EDIT2]

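I removed the first line of the CSV, as suggested. I now have a quick and dirty way to accomplish what I want. Basically, things only get messed up when a company name contains ", Inc". So I just glue that part back onto the name: $data[1] = $data[1] . $data[2];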

$path = "NASDAQ.csv"; 
$row = 1; 
if (($handle = fopen($path, "r")) !== FALSE) { 
    while (($data = fgetcsv($handle, 1000, ";;")) !== FALSE) { 
    if ($row < 100) { 
     $row++; 
     $data = explode(',', $data[0]); 
     if (substr($data[2], 0, 1) == ' ') { 
     $data[1] = $data[1] . $data[2]; 
     unset($data[2]); 
     } 
     $entries[] = $data ; 
    } 
    } 
    fclose($handle); 
} 

A print_r($entries) now gives:

[0] => Array 
    (
     [0] => FLWS 
     [1] => "1-800 FLOWERS.COM Inc." 
     [3] => "2.9" 
     [4] => "81745200" 
     [5] => "n/a" 
     [6] => "1999" 
     [7] => "Consumer Services" 
     [8] => "Other Specialty Stores" 
     [9] => "http://www.nasdaq.com/symbol/flws" 
     [10] => 
    ) 

One last question: I don't know how to renumber the keys, so that 3 becomes 2, 4 becomes 3, and so on, making the output look like:

[0] => Array 
    (
     [0] => FLWS 
     [1] => "1-800 FLOWERS.COM Inc." 
     [2] => "2.9" 
     [3] => "81745200" 
     [4] => "n/a" 
     [5] => "1999" 
     [6] => "Consumer Services" 
     [7] => "Other Specialty Stores" 
     [8] => "http://www.nasdaq.com/symbol/flws" 
     [9] => 
    ) 

Any help would be greatly appreciated!

+1

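I'm guessing it has to do with the double quotes used in the CSV file. The fourth parameter of `fgetcsv()` (`$enclosure`) could be set to `"\"\""` to see whether that is the case. – Crontab 2012-02-09 13:03:52

Answers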

1

As Crontab said, this is probably a quoting problem. Try:

foreach ($entries as $line) { 

    // Escape (see mysql_real_escape_string too) and remove double quotes 
    foreach ($line as $k => $v) $line[$k] = mysql_escape_string(trim($v, '"')); 

    // Rebuild array 
    $line = array_values($line); 

    db_query(" 
    INSERT INTO us_stocks (symbol, name, sector, industry) 
    VALUES ('%s', '%s', '%s', '%s')", 
    $line[0], $line[1], $line[6], $line[7] 
); 

} 

PS: I don't know whether you already escape the strings inside db_query().
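On the renumbering question from EDIT2, the `array_values()` call above is the relevant part. A minimal sketch of what it does after an `unset()`, using a hypothetical shortened row modeled on the question's data:

```php
<?php
// Hypothetical, shortened row; the real file has more columns.
$data = explode(',', 'FLWS,"1-800 FLOWERS.COM, Inc.","2.9","81745200"');
// explode() splits the name at its embedded comma:
// [0] => FLWS, [1] => "1-800 FLOWERS.COM, [2] =>  Inc.", [3] => "2.9", [4] => "81745200"

if (substr($data[2], 0, 1) === ' ') {
    $data[1] = $data[1] . $data[2]; // glue the two halves of the name together
    unset($data[2]);                // leaves a gap: keys are now 0, 1, 3, 4
}

// array_values() returns the same values with keys renumbered from 0.
$data = array_values($data);

print_r(array_keys($data)); // keys are 0, 1, 2, 3
```

The values themselves are untouched; only the keys change, so `$data[2]` now holds `"2.9"` (quotes included).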

+0

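I already do. It doesn't work, though, and neither does your code. It now just reads FLWS,"1-8 and so on, because of the double escaping. Maybe it would be better to use a regex to strip all single and double quotes from each $data row? – Pr0no 2012-02-09 13:18:51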

+0

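`trim($v, '"')` removes one or more double quotes from the start and end of the string, so I'm afraid `fgetcsv()` is failing to parse the CSV correctly. Try looking at the output of `print_r($line)` before the query and without my code. Are the fields split correctly? – 2012-02-09 13:25:26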

+0

See the edit :) – Pr0no 2012-02-09 13:27:21

2

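I'd say the data isn't "real" CSV.

"FLWS,""1-800 FLOWERS.COM, Inc."",""2.9"", should be: "FLWS","1-800 FLOWERS.COM, INC","2.9". The quotes should wrap each individual field, with commas separating the fields. Numeric fields are usually not quoted.

Depending on how you load the data, a comma inside the data can confuse things (e.g. FLOWERS.COM, INC).

By the way, if it really is CSV, see: http://dev.mysql.com/doc/refman/5.1/en/load-data.html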
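To illustrate the point about quoting: each line in the file appears to be one big quoted field (with the inner quotes doubled) followed by `;;`, which is why `fgetcsv()` returns a single element per row. One possible workaround, sketched here under that assumption, is to parse each line twice. The function name `parse_nasdaq_line` is made up for this example.

```php
<?php
/**
 * Parse one raw line of the file shown in the question.
 * Assumes the whole line is a single quoted CSV field followed by ";;".
 */
function parse_nasdaq_line(string $raw): array
{
    // Drop trailing whitespace and the ";;" terminator.
    $raw = rtrim($raw);
    if (substr($raw, -2) === ';;') {
        $raw = substr($raw, 0, -2);
    }

    // Pass 1: unwrap the outer quotes; doubled quotes become single quotes.
    $outer = str_getcsv($raw, ',', '"', '\\');

    // Pass 2: parse the unwrapped text as ordinary CSV.
    // The dangling trailing comma is removed first so no empty last field appears.
    return str_getcsv(rtrim($outer[0], ','), ',', '"', '\\');
}

$raw = '"FLWS,""1-800 FLOWERS.COM, Inc."",""2.9"",""81745200"",""n/a"",""1999"",""Consumer Services"",""Other Specialty Stores"",""http://www.nasdaq.com/symbol/flws"",";; ';
$fields = parse_nasdaq_line($raw);

echo $fields[0], ' | ', $fields[1], ' | ', $fields[6], ' | ', $fields[7], "\n";
// FLWS | 1-800 FLOWERS.COM, Inc. | Consumer Services | Other Specialty Stores
```

With this approach the comma inside the company name survives, because the second pass sees it inside a properly quoted field, and the keys are already consecutive.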

+0

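Well, it's certainly not the best CSV file I've ever seen... but it's the one provided on nasdaq.com, and I can't find any other source for importing the ticker symbols of all US stocks (I have other CSVs from the same site, such as AMEX and NYSE). Can't I just strip all the ' and " characters from all fields? – Pr0no 2012-02-09 13:16:37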

+0

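The first line must contain a typo, since the separator between Symbol and Name is not inside quotes. I would just replace "" with " (change, or tr, the 2x quotes into 1x quote), then use LOAD DATA INFILE, skipping line 1 and specifying the columns to load. *I guarantee* that if you use LOAD DATA INFILE, the insert will be very fast. – FreudianSlip 2012-02-09 13:26:54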
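A rough sketch of that suggestion, assuming every line has the shape shown in the question (outer quote, doubled inner quotes, trailing `,";;`). The helper name `normalise_line` and the output file name `NASDAQ_clean.csv` are hypothetical, and the SQL is only built as a string here, not executed.

```php
<?php
// Turn one raw line into conventional CSV by undoing the extra quoting.
function normalise_line(string $raw): string
{
    $line = rtrim($raw);
    $line = preg_replace('/";;$/', '', $line); // drop closing outer quote and ";;"
    $line = preg_replace('/^"/', '', $line);   // drop opening outer quote
    $line = str_replace('""', '"', $line);     // 2x quotes become 1x quote
    return rtrim($line, ',');                  // drop the dangling trailing comma
}

$raw = '"FLWS,""1-800 FLOWERS.COM, Inc."",""2.9"",""81745200"",""n/a"",""1999"",""Consumer Services"",""Other Specialty Stores"",""http://www.nasdaq.com/symbol/flws"",";; ';
echo normalise_line($raw), "\n";

// Once every line has been written to a cleaned file, a statement along
// these lines could load it. The @variables discard unwanted columns.
$sql = <<<'SQL'
LOAD DATA LOCAL INFILE 'NASDAQ_clean.csv'
INTO TABLE us_stocks
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(symbol, name, @lastsale, @marketcap, @adr, @ipoyear, sector, industry, @url)
SQL;
```

Note that the plain string replacement would mangle a field that legitimately contains a quote character, so it is only a fit for data like the sample above.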

+0

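Maybe, but for now I think hacking something together in PHP gets the job done faster for me... almost there, btw :) Please take a look at my last question (how to renumber the keys) if you have the time. – Pr0no 2012-02-09 14:15:22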