2016-10-03

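I'm fairly new to Python, and I'm importing an Excel file into PostgreSQL. The Address column in the Excel sheet contains duplicates that I want to catch. What is the most practical way to do this? How do I catch the duplicates?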

import psycopg2 
import xlrd 
book = xlrd.open_workbook("data.xlsx") 
sheet = book.sheet_by_name("List") 
database = psycopg2.connect (database = "Excel", user="SQL", password="PASS", host="YES", port="DB") 
cursor = database.cursor() 
# Drop any previous copy of the table so the import starts clean
delete = """DROP TABLE IF EXISTS "Python".list"""
print(delete)
cursor.execute(delete)
cursor.execute('''CREATE TABLE "Python".list
    (DCAD_Prop_ID varchar(50),
    Address varchar(50),
    Addition varchar(50),
    Block varchar(50),
    Lot integer,
    Project_ID integer
    );''')
print("Table created successfully")
query = """INSERT INTO "Python".list (DCAD_Prop_ID, Address,Addition,Block ,Lot,Project_ID) 
VALUES (%s, %s, %s, %s, %s, %s)""" 
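# Start at row 1 (row 0 is assumed to be the header) and insert each row
for r in range(1, sheet.nrows):
    DCAD_Prop_ID = sheet.cell(r,0).value
    Address = sheet.cell(r,1).value
    Addition = sheet.cell(r,2).value
    Block = sheet.cell(r,3).value
    Lot = sheet.cell(r,4).value
    Project_ID = sheet.cell(r,5).value
    # These two lines must sit inside the loop; otherwise only the last row is inserted
    values = (DCAD_Prop_ID, Address, Addition, Block, Lot, Project_ID)
    cursor.execute(query, values)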
cursor.close() 
database.commit() 
database.close() 
print("")
print("All Done! Bye, for now.")
print("")
columns = str(sheet.ncols)
rows = str(sheet.nrows)
print("I just imported Excel into postgreSQL")

Answer


This returns the duplicated rows:

select DCAD_Prop_ID, Address,Addition,Block,Lot,Project_ID, count(*) 
from "Python".list 
group by 1,2,3,4,5,6 
having count(*) > 1 
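
The query above groups on all six columns, so it reports only rows that match in every column. If only the Address column matters, as in the question, grouping on that column alone may be closer to what is wanted. Here is a minimal sketch of that variant. It assumes Python's built-in sqlite3 module as an in-memory stand-in for PostgreSQL so that it runs without a server. The two-column table, the sample rows, and the helper name `find_duplicate_addresses` are made up for illustration.

```python
import sqlite3


def find_duplicate_addresses(rows):
    """Return (address, count) pairs for addresses that occur more than once.

    rows: iterable of (DCAD_Prop_ID, Address) tuples (hypothetical sample shape).
    """
    # In-memory SQLite database, used here only as a stand-in for PostgreSQL
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE list (DCAD_Prop_ID TEXT, Address TEXT)")
        conn.executemany(
            "INSERT INTO list (DCAD_Prop_ID, Address) VALUES (?, ?)", rows
        )
        # Group on Address only, and keep the groups with more than one row
        cur = conn.execute(
            "SELECT Address, COUNT(*) FROM list "
            "GROUP BY Address HAVING COUNT(*) > 1 ORDER BY Address"
        )
        return cur.fetchall()
    finally:
        conn.close()


# Made-up sample data: the first and third rows share an address
sample_rows = [
    ("A1", "100 Main St"),
    ("A2", "200 Oak Ave"),
    ("A3", "100 Main St"),
]
duplicate_addresses = find_duplicate_addresses(sample_rows)
print(duplicate_addresses)
```

Against the real table, the same SELECT could presumably be run through the psycopg2 cursor from the question, with "Python".list as the table name.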

To eliminate duplicates in Python, use a set:

>>> t = ((1,2),(2,3),(1,2)) 
>>> set(t) 
set([(1, 2), (2, 3)]) 
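
To catch duplicates on a single column while the rows are being read, one option is to keep a set of the values seen so far and check each row against it. Here is a minimal sketch of that idea. It assumes each row is a sequence whose second element (index 1) is the address. The sample rows and the helper name `split_duplicates` are hypothetical, and addresses are compared exactly, with no trimming or case folding.

```python
def split_duplicates(rows, key_index=1):
    """Split rows into first occurrences and repeats, keyed on one column."""
    seen = set()           # key values encountered so far
    unique_rows = []       # first row seen for each key
    duplicate_rows = []    # every later row that repeats a key
    for row in rows:
        key = row[key_index]
        if key in seen:
            duplicate_rows.append(row)
        else:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows, duplicate_rows


# Made-up sample data: the first and third rows share an address
sample_rows = [
    ("A1", "100 Main St", "North", "B", 1, 10),
    ("A2", "200 Oak Ave", "South", "C", 2, 11),
    ("A3", "100 Main St", "North", "B", 3, 12),
]
unique_rows, duplicate_rows = split_duplicates(sample_rows)
print(duplicate_rows)
```

In an import script the rows could come from xlrd, for example `sheet.row_values(r)` for each data row, with only `unique_rows` passed on to the INSERT. The set has to be created once, outside the loop, and collect one key per row. Calling `set()` on a single string cell value would instead produce a set of that string's characters.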

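Thanks for responding, Clodoaldo. This works at the SQL level, but I want to catch the duplicates while the program is running in Python. Is that possible in Python?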


@PLearner Sure. Use a set, as shown in the edit.


Thanks, Clodoaldo. How would I apply a set in my original script? Would it be something like Address = set(sheet.cell(r,1).value)?