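Function to compute a weighted score from MySQL data?

I have two tables (topic and term) and a third table, bagging, that holds the many-to-many relationship between the two entities.
Each relationship row, called a bagging, has a source (text) and a weight (an int between 0 and 100). The same (topic, term) pair can have several baggings from different sources, each with its own weight.
Now, when I query a topic to find its best (heaviest) terms, I would like a single computed weight per term such that:
- a weight of 100 means the term is at the maximum
- several weights for the same pair (from different sources) count for more than a single one
- there is no "negative" weight
Here is the database schema: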
| TOPIC
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| label | varchar(255)     | NO   | UNI | NULL    |                |
| wtext | varchar(40)      | YES  |     | NULL    |                |
+-------+------------------+------+-----+---------+----------------+
| TERM
+-------+---------------------+------+-----+---------+----------------+
| Field | Type                | Null | Key | Default | Extra          |
+-------+---------------------+------+-----+---------+----------------+
| id    | bigint(20) unsigned | NO   | PRI | NULL    | auto_increment |
| label | varchar(255)        | NO   | UNI | NULL    |                |
| slug  | varchar(255)        | NO   |     | NULL    |                |
+-------+---------------------+------+-----+---------+----------------+
| BAGGING
+----------+---------------------+------+-----+---------+----------------+
| Field    | Type                | Null | Key | Default | Extra          |
+----------+---------------------+------+-----+---------+----------------+
| id       | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| topic_id | int(11) unsigned    | NO   | MUL | NULL    |                |
| term_id  | bigint(11) unsigned | NO   | MUL | NULL    |                |
| weight   | tinyint(1) unsigned | NO   |     | NULL    |                |
| source   | varchar(8)          | YES  |     | GEN     |                |
+----------+---------------------+------+-----+---------+----------------+
Here is my simple query:
SELECT
    bagging.topic_id AS topic_id,
    topic.label AS topic_label,
    bagging.term_id AS term_id,
    term.label AS term_label,
    bagging.weight AS weight,
    bagging.source AS source
FROM
    bagging
    JOIN term ON term.id = bagging.term_id
    JOIN topic ON topic.id = bagging.topic_id
WHERE
    bagging.topic_id = (SELECT id FROM topic WHERE label = 'Altruism')
ORDER BY
    bagging.weight DESC
This gives me the following result:
+----------+-------------+---------+-----------------------+--------+--------+
| topic_id | topic_label | term_id | term_label            | weight | source |
+----------+-------------+---------+-----------------------+--------+--------+
| 8        | Altruism    | 83      | Altruism              | 100    | TOPIC  |
+----------+-------------+---------+-----------------------+--------+--------+
| 8        | Altruism    | 100     | Altruism (philosophy) | 95     | WPRD   |
| 8        | Altruism    | 100     | Altruism (philosophy) | 95     | MAN    |
| 8        | Altruism    | 84      | Truist                | 95     | MAN    |
| 8        | Altruism    | 84      | Truist                | 15     | WPRD   |
+----------+-------------+---------+-----------------------+--------+--------+
| 8        | Altruism    | 94      | Selfless action       | 95     | WPRD   |
| 8        | Altruism    | 95      | Alturism              | 95     | WPRD   |
| 8        | Altruism    | 96      | Digital altruism      | 95     | WPRD   |
| 8        | Altruism    | 97      | Selflessly            | 95     | WPRD   |
| 8        | Altruism    | 98      | Altruistical          | 95     | WPRD   |
| 8        | Altruism    | 99      | Law of mutual aid     | 95     | WPRD   |
| 8        | Altruism    | 101     | Altruistically        | 95     | WPRD   |
| 8        | Altruism    | 85      | Altruistic            | 95     | WPRD   |
| 8        | Altruism    | 86      | Altruist              | 95     | WPRD   |
| 8        | Altruism    | 87      | Otherism              | 95     | WPRD   |
| 8        | Altruism    | 88      | Unselfishness         | 95     | WPRD   |
| 8        | Altruism    | 89      | Altruistic behavior   | 95     | WPRD   |
| 8        | Altruism    | 90      | Altutrists            | 95     | WPRD   |
| 8        | Altruism    | 91      | Altruists             | 95     | WPRD   |
| 8        | Altruism    | 102     | Pathological altruism | 95     | WPRD   |
+----------+-------------+---------+-----------------------+--------+--------+
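Now, how can I create a scoring function that takes the following into account for this particular example?
- Altruism is unmatched and can only be equalled (= 100).
- Truist should obviously be penalized by its 15/100 weight, but the fact that it has two weights should also count, especially since the second one is 95.
- Altruism (philosophy) should outweigh all the others (except Altruism, which can only be equalled), even though 95 twice looks bigger than 100.
The final result does not have to range from 1 to 100; it can be a relative or abstract rating as long as it respects these constraints.
I tried computing, for each row, (term_sum_weight * 100 / topic_weight_sum_of_all_terms), but the results do not carry enough weight.
The formula matters more than the language it will be written in: MySQL, or a procedure in Python/PHP.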
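To make the attempted formula concrete, here is a minimal Python sketch that applies the share-of-total calculation to the rows returned by the query. The function name `share_of_total` and the in-memory `ROWS` list are illustrative assumptions; in practice the rows would be read from the bagging table.

```python
from collections import defaultdict

# Terms that have a single bagging with weight 95 in the query result.
SINGLE_SOURCE_TERMS = [
    "Selfless action", "Alturism", "Digital altruism", "Selflessly",
    "Altruistical", "Law of mutual aid", "Altruistically", "Altruistic",
    "Altruist", "Otherism", "Unselfishness", "Altruistic behavior",
    "Altutrists", "Altruists", "Pathological altruism",
]

# (term_label, weight) pairs copied from the result table for topic 'Altruism'.
ROWS = [
    ("Altruism", 100),
    ("Altruism (philosophy)", 95),
    ("Altruism (philosophy)", 95),
    ("Truist", 95),
    ("Truist", 15),
] + [(label, 95) for label in SINGLE_SOURCE_TERMS]


def share_of_total(rows):
    """Return term_sum_weight * 100 / topic_weight_sum_of_all_terms per term."""
    term_sum = defaultdict(int)
    for label, weight in rows:
        term_sum[label] += weight
    total = sum(term_sum.values())
    return {label: s * 100 / total for label, s in term_sum.items()}


scores = share_of_total(ROWS)
# Show the four highest shares, best first.
for label, score in sorted(scores.items(), key=lambda kv: -kv[1])[:4]:
    print(f"{label:<22} {score:5.2f}")
```

On this data the share formula puts Altruism (philosophy) at about 10.41 and Truist at about 6.03, both above Altruism at about 5.48, so it breaks the requirement that Altruism can only be equalled.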
Expected result (something along these lines...):
+----------+-------------+---------+-----------------------+-------+--------+
| topic_id | topic_label | term_id | term_label            | score | source |
+----------+-------------+---------+-----------------------+-------+--------+
| 8        | Altruism    | 83      | Altruism              | 1     | TOPIC  |
+----------+-------------+---------+-----------------------+-------+--------+
| 8        | Altruism    | 100     | Altruism (philosophy) | 0.98  | WPRD   |
| 8        | Altruism    | 84      | Truist                | 0.96  | MAN    |
+----------+-------------+---------+-----------------------+-------+--------+
| 8        | Altruism    | 94      | Selfless action       | 0.95  | MAN    |
| 8        | Altruism    | 95      | Alturism              | 0.95  | MAN    |
| 8        | Altruism    | 96      | Digital altruism      | 0.95  | MAN    |
...........
| 8        | Altruism    | 97      | Selflessly            | 0.95  | MAN    |
| 8        | Altruism    | 90      | Altutrists            | 0.95  | MAN    |
| 8        | Altruism    | 91      | Altruists             | 0.95  | MAN    |
| 8        | Altruism    | 102     | Pathological altruism | 0.95  | MAN    |
+----------+-------------+---------+-----------------------+-------+--------+
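One candidate rule, offered here as an assumption and not as an answer taken from this thread, is a probabilistic OR (sometimes called noisy-OR): scale each weight to w/100 and combine the weights of one (topic, term) pair as 1 minus the product of (1 - w/100). The names `combined_score` and `WEIGHTS_BY_TERM` are hypothetical, and the sample weights are copied from the query result.

```python
from math import prod

# Weights per term for topic 'Altruism', copied from the query result.
# Only one of the single-source terms is listed; the others behave the same.
WEIGHTS_BY_TERM = {
    "Altruism": [100],
    "Altruism (philosophy)": [95, 95],
    "Truist": [95, 15],
    "Selfless action": [95],
}


def combined_score(weights):
    """Probabilistic OR of 0-100 weights, returned on a 0-1 scale.

    - a single weight of 100 yields exactly 1.0 (the maximum)
    - every extra positive weight raises the score
    - a weight of 0 leaves the score unchanged, so nothing is negative
    """
    return 1 - prod(1 - w / 100 for w in weights)


ranking = sorted(
    ((label, combined_score(ws)) for label, ws in WEIGHTS_BY_TERM.items()),
    key=lambda kv: -kv[1],
)
for label, score in ranking:
    print(f"{label:<22} {score:.4f}")
```

On the sample this gives 1.0 for Altruism, about 0.9975 for Altruism (philosophy), about 0.9575 for Truist and 0.95 for a term with a single weight of 95. That is the same ordering as the expected table, though not the same numbers. Note that any pair with at least one source at 100 reaches the maximum under this rule.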
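What is your expected result? – Viki888
See http://meta.stackoverflow.com/questions/333952/why-should-i-provide-an-mcve-for-what-seems-to-me-to-be-a-very-simple-sql-query – Strawberry
1. Your description of what you want (the bullet list and the introduction) is not understandable. 2. Roughly, as a guess, your scoring function should probably divide the sum of the weights for a given pair by the number of rows for that pair. – philipxy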
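The suggestion in the last comment (the sum of a pair's weights divided by its number of rows) is a plain average. A small Python sketch of it follows; the helper name `average_weight` is hypothetical and the weights are taken from the result table in the question.

```python
def average_weight(weights):
    """Sum of a pair's weights divided by its number of rows."""
    return sum(weights) / len(weights)


# Weights per term for topic 'Altruism', copied from the query result.
samples = {
    "Altruism": [100],
    "Altruism (philosophy)": [95, 95],
    "Truist": [95, 15],
    "Selfless action": [95],
}

averages = {label: average_weight(ws) for label, ws in samples.items()}
for label, value in averages.items():
    print(f"{label:<22} {value:.1f}")
```

With this rule Truist averages 55, and two sources at 95 average 95, exactly the same as a single source at 95. An average on its own therefore penalizes the low Truist weight but does not reward a pair for having several sources.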