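MySQL matching text fields

I have a MySQL database with about 1.5 million company records (name, country, and other small text fields). I want to mark identical records with a flag: for example, if two companies in the USA have the same name, I have to set a field (match_id) to the same integer, say 10, on both of them, and likewise for other matches. At the moment it takes a very long time (days), and I feel I am not using MySQL correctly. I am posting my code below. Is there a faster way to do this?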
<?php
//Create the table if does not already exist
mysql_query("CREATE TABLE IF NOT EXISTS proj (
        id INT(11) NOT NULL AUTO_INCREMENT PRIMARY KEY,
        company_id text NOT NULL,
        company_name varchar(40) NOT NULL,
        company_name_text varchar(33) NOT NULL,
        company_name_metaphone varchar(19) NOT NULL,
        country varchar(20) NOT NULL,
        file_id int(2) NOT NULL,
        thompson_id varchar(11) NOT NULL,
        match_no int(7) NOT NULL,
        INDEX(company_name_text))")
    or die ("Couldn't create the table: " . mysql_error());

//********Real script starts********
$countries_searched = array(); //To save record ids already flagged (save time)
$counter = 1; //Flag

//Since the company_names which are same are going to be from the same country so I get all the countries first in the below query and then in the next get all the companies in that country
$sql = "SELECT DISTINCT country FROM proj WHERE country='Canada'";
$result = mysql_query($sql) or die(mysql_error());

while ($resultrow = mysql_fetch_assoc($result)) {
    $country = $resultrow['country'];
    $res = mysql_query("SELECT company_name_metaphone, id, company_name_text
                        FROM proj
                        WHERE country='$country'
                        ORDER BY id") or die (mysql_error());

    //Loop through the company records
    while ($row = mysql_fetch_array($res, MYSQL_NUM)) {
        //If record id is already flagged (matched and saved in the countries searched array) don't waste time doing anything
        if (in_array($row[1], $countries_searched)) {
            continue;
        }

        if (strlen($row[0]) > 9) {
            $row[0] = substr($row[0], 0, 9);
            $query = mysql_query("SELECT id FROM proj
                                  WHERE country='$country'
                                  AND company_name_metaphone LIKE '$row[0]%'
                                  AND id<>'$row[1]'") or die (mysql_error());
            while ($id = mysql_fetch_array($query, MYSQL_NUM)) {
                if (!in_array($id[0], $countries_searched)) $countries_searched[] = $id[0];
            }
            if (mysql_num_rows($query) > 0) {
                mysql_query("UPDATE proj SET match_no='$counter'
                             WHERE country='$country'
                             AND company_name_metaphone LIKE '$row[0]%'")
                    or die (mysql_error()." ".mysql_errno());
                $counter++;
            }
        }
        else if (strlen($row[0]) > 3) {
            $query = mysql_query("SELECT id FROM proj WHERE country='$country'
                                  AND company_name_text='$row[2]' AND id<>'$row[1]'")
                or die (mysql_error());
            while ($id = mysql_fetch_array($query, MYSQL_NUM)) {
                if (!in_array($id[0], $countries_searched)) $countries_searched[] = $id[0];
            }
            if (mysql_num_rows($query) > 0) {
                mysql_query("UPDATE proj SET match_no='$counter'
                             WHERE country='$country'
                             AND company_name_text='$row[2]'") or die (mysql_error());
                $counter++;
            }
        }
    }
}
?>

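Please fix your code formatting. It appears broken for me. – jsw 2011-04-26 02:40:35
What are you actually trying to accomplish? What are the requirements? I can see a lot of problems in the code, but without knowing the requirements I'm not sure which direction to point you in. For example, your first while loop is pointless. Are you just trying to de-duplicate your records? Or do you only need to flag all matching records with the same INT? What is your end goal? – 2011-04-26 03:04:58
Yes, the goal is to flag the matching records with the same int. – nikhil 2011-04-26 06:04:56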
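
Given that the goal is only to flag matching records with the same integer, one possible direction is to let MySQL do the grouping in a few set-based statements instead of issuing a SELECT and an UPDATE per row from PHP. What follows is a minimal sketch, not a tested solution for the 1.5 million row table. It assumes that "matching" means identical country and company_name_text. The helper table match_groups and the index name idx_country_name are hypothetical names that do not appear in the posted code.

```sql
-- Assumption: a composite index so the grouping and the join below
-- can use (country, company_name_text) instead of scanning the table.
-- idx_country_name is a hypothetical index name.
ALTER TABLE proj ADD INDEX idx_country_name (country, company_name_text);

-- Hypothetical helper table: one row per group of duplicates.
-- The AUTO_INCREMENT key plays the role of $counter in the PHP script.
CREATE TABLE match_groups (
    match_no          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    country           VARCHAR(20) NOT NULL,
    company_name_text VARCHAR(33) NOT NULL,
    UNIQUE KEY (country, company_name_text)
);

-- Keep only (country, name) pairs that occur more than once.
INSERT INTO match_groups (country, company_name_text)
SELECT country, company_name_text
FROM proj
GROUP BY country, company_name_text
HAVING COUNT(*) > 1;

-- Copy each group's number back onto every member row in one statement.
UPDATE proj p
JOIN match_groups g
  ON g.country = p.country
 AND g.company_name_text = p.company_name_text
SET p.match_no = g.match_no;
```

This sketch covers only the exact-name branch of the script. For the fuzzier branch, the same pattern could presumably be keyed on LEFT(company_name_metaphone, 9) rather than company_name_text, although that would not be exactly equivalent to the LIKE prefix logic in the posted code.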