2016-11-30

MySQL subquery comparing the values of two fields

I am using MySQL and have 3 tables, like this:

CREATE TABLE users (
    firstName VARCHAR(255),  -- lengths are placeholders; the original post omitted them
    lastName  VARCHAR(255),
    userName  VARCHAR(255),
    email     VARCHAR(255),
    created   DATETIME
    -- etc.
);

CREATE TABLE data_2013 (
    uid VARCHAR(255),
    d1 INT,
    d2 INT,
    d3 INT
    -- etc.
);

CREATE TABLE data_2016 (
    uid VARCHAR(255),
    d1 INT,
    d2 INT,
    d3 INT
    -- etc.
);
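
  • uid in both data tables matches the userName field in users.

  • Each user exists twice (or more) in the users table, but always with matching firstName + lastName.

  • A subset of these users (about 100) have data in both "data_xxxx" tables.

  • For the 2013 data, userName is an 8-character string. For the 2016 data, userName is their current email address (not necessarily the same as in 2013).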

I can get all users who have 2016 data with a query like this:

SELECT firstName, lastName, userName
FROM users
WHERE created > '2016-01-01'
AND userName IN (SELECT uid FROM data_2016);

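What I want now is a query that gives me the list of those users, by userName, who also have 2013 data. As I said, the userName (or uid) will not match between the two years, but the firstName + lastName values should.

I need something like this, in pseudocode: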

SELECT userName 
FROM users 
WHERE created < '2014-01-01' 
and firstName,lastName IN (
    SELECT firstName,lastName 
    FROM users 
    WHERE created > '2016-01-01' 
    AND userName IN(SELECT uid FROM data_2016)) 

I'm sure a union or a join is the answer, but I can't figure it out.

Any tips?

Thanks

EDIT

Here is some sample data. Example of 2013 data for one user:

    +--------+---------------------+----+----+----+----+----+
    | uid    | created             | d1 | d2 | d3 | d4 | d5 |
    +--------+---------------------+----+----+----+----+----+
    | rwhite | 2013-08-05 13:24:24 | 38 | 31 |  7 | 22 | 46 |
    +--------+---------------------+----+----+----+----+----+

The rows for the same user in the users table:

    +------------------------+-----------+----------+------------------------+---------------------+
    | userName               | firstName | lastName | email                  | created             |
    +------------------------+-----------+----------+------------------------+---------------------+
    | rwhite                 | ROBERT    | WHITE    | [email protected]      | 2013-08-05 13:13:23 |
    | [email protected]      | Robert    | White    | [email protected]      | 2016-10-23 20:26:52 |
    +------------------------+-----------+----------+------------------------+---------------------+

Example of 2016 data:

    +--------------------+---------------------+----+----+----+----+----+
    | uid                | created             | d1 | d2 | d3 | d4 | d5 |
    +--------------------+---------------------+----+----+----+----+----+
    | [email protected]  | 2016-10-24 12:37:29 | 38 | 48 | 59 | 71 | 17 |
    +--------------------+---------------------+----+----+----+----+----+

EDIT2

I forgot that I have a 4th table with additional data for some customers:

CREATE TABLE users_custA (
    userName VARCHAR(255),  -- lengths are placeholders; the original post omitted them
    id_num   VARCHAR(255)
    -- etc.
);

And an example of the same user in that table:

+--------------------+-----------+
| userName           | id_num    |
+--------------------+-----------+
| rwhite             | N0        |
| [email protected]  | N0        |
+--------------------+-----------+

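This id_num is guaranteed to be unique for a given person (i.e., R White is a single person with two entries in the users_custA table).

The question remains: how do I construct a query that produces a list of userNames that have data in both data_xxxx tables?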

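Do your data_* tables have first name and last name fields? – Nerdwood

Showing us some sample data would be more helpful. – Blank

The data_* tables only have the following fields: uid, d1..dN, created – atreyu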

Answer

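In general it is risky to expect names to be unique and consistent over time, but if you are confident that this holds in your data, you can adapt your query like this (assuming you have a case-insensitive collation, so that 'ROBERT' matches 'Robert'):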

-- userNames of accounts created in 2013 whose owner (matched by name)
-- also has a 2016 account with rows in data_2016
SELECT u2013.userName
FROM users AS u2013
WHERE u2013.created >= '2013-01-01'
  AND u2013.created < '2014-01-01'
  AND EXISTS (
    SELECT 1
    FROM users AS u2016
    WHERE u2016.created >= '2016-01-01'
      AND u2016.created < '2017-01-01'
      AND u2016.firstName = u2013.firstName
      AND u2016.lastName = u2013.lastName
      AND EXISTS (SELECT 1 FROM data_2016 WHERE data_2016.uid = u2016.userName));

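I used WHERE EXISTS rather than WHERE ... IN because I understood that WHERE (col1, col2) IN ... is not supported and that IN only works with a single column. That is true of some database systems, but MySQL does accept the row-constructor form; EXISTS simply has the advantage of being portable and of making the correlation between the two user rows explicit.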
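For comparison, here is a rough sketch of the same name-based match written with a MySQL row constructor, which is close to the pseudocode in the question. The extra EXISTS on data_2013 is an assumption added here, on the reading that the 2013 account should itself have rows in data_2013; drop it if only the account's existence matters.

```sql
-- Sketch only: name-based matching via a row constructor instead of EXISTS.
SELECT u2013.userName
FROM users AS u2013
WHERE u2013.created >= '2013-01-01'
  AND u2013.created <  '2014-01-01'
  -- assumption: the 2013 account must also have rows in data_2013
  AND EXISTS (SELECT 1 FROM data_2013 AS d13 WHERE d13.uid = u2013.userName)
  -- (firstName, lastName) is compared as a pair against each subquery row
  AND (u2013.firstName, u2013.lastName) IN (
        SELECT u2016.firstName, u2016.lastName
        FROM users AS u2016
        WHERE u2016.created >= '2016-01-01'
          AND u2016.created <  '2017-01-01'
          AND u2016.userName IN (SELECT uid FROM data_2016)
      );
```

As with the EXISTS version, this relies on a case-insensitive collation. If the name columns use a case-sensitive collation, comparing LOWER(firstName) and LOWER(lastName) on both sides is one way to get the same effect.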

EDIT

You can bring in your users_custA table this way to get a more reliable match:

SELECT *
FROM users_custA
WHERE id_num IN (
    SELECT id_num
    FROM (
        -- id_nums with an account created in 2013
        SELECT DISTINCT a.id_num
        FROM users AS u
        JOIN users_custA AS a ON u.userName = a.userName
        WHERE u.created >= '2013-01-01'
          AND u.created < '2014-01-01'
        UNION ALL
        -- id_nums with an account created in 2016
        SELECT DISTINCT a.id_num
        FROM users AS u
        JOIN users_custA AS a ON u.userName = a.userName
        WHERE u.created >= '2016-01-01'
          AND u.created < '2017-01-01'
    ) AS union_subquery
    GROUP BY id_num
    -- keep only id_nums that appear in both years
    HAVING COUNT(*) = 2
);

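Thanks for the reply. Will give it a try. Before I do, I have to mention that I forgot I have a fourth table with a unique ID for each user. I'll explain more in another edit. I agree that relying on names is bad. – atreyu

@atreyu In your question you say the 'users_custA' table has "additional data for *some* customers". If it doesn't have *all* customers, then I don't think it will help in answering your query. – mendosi

Good point. Fortunately, I can guarantee that although the 'users_custA' table does not have additional data for every user in the 'users' table, every user in the 'data_xxxx' tables does have additional data, which is what matters to me for this query. – atreyu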
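Given that guarantee, one possible sketch is to skip the name and date matching entirely and link the two data tables through id_num. It assumes, as stated in the comment above, that every uid in data_2013 and data_2016 has a row in users_custA; the column aliases userName_2013 and userName_2016 are made up for illustration.

```sql
-- Sketch only: pair each person's 2013 and 2016 userNames via id_num.
SELECT DISTINCT
    a13.id_num,
    d13.uid AS userName_2013,   -- hypothetical alias
    d16.uid AS userName_2016    -- hypothetical alias
FROM data_2013   AS d13
JOIN users_custA AS a13 ON a13.userName = d13.uid      -- 2013 uid -> person
JOIN users_custA AS a16 ON a16.id_num   = a13.id_num   -- same person, any userName
JOIN data_2016   AS d16 ON d16.uid      = a16.userName; -- person -> 2016 uid
```

Because these are inner joins, only people with rows in both data tables come back, and DISTINCT collapses duplicates if a uid has several rows in a data table. For the sample rows in the question, this would pair rwhite with the email-style userName that shares its id_num.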