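Ideally, you would normalize your data and store these words in a separate table.
For your immediate needs, however, you first need a UDF that splits 'desc' into words. I dug up this function: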
-- this function splits the provided strings on a delimiter
-- similar to .Net string.Split.
-- I'm sure there are alternatives (such as calling string.Split through
-- a CLR function).
CREATE FUNCTION [dbo].[Split]
(
    @RowData NVARCHAR(MAX),
    @Delimeter NVARCHAR(MAX)
)
RETURNS @RtnValue TABLE
(
    ID INT IDENTITY(1,1),
    Data NVARCHAR(MAX)
)
AS
BEGIN
    DECLARE @Iterator INT
    SET @Iterator = 1

    DECLARE @FoundIndex INT
    SET @FoundIndex = CHARINDEX(@Delimeter, @RowData)

    WHILE (@FoundIndex > 0)
    BEGIN
        INSERT INTO @RtnValue (Data)
        SELECT
            Data = LTRIM(RTRIM(SUBSTRING(@RowData, 1, @FoundIndex - 1)))

        SET @RowData = SUBSTRING(@RowData,
            @FoundIndex + DATALENGTH(@Delimeter) / 2,
            LEN(@RowData))

        SET @Iterator = @Iterator + 1
        SET @FoundIndex = CHARINDEX(@Delimeter, @RowData)
    END

    INSERT INTO @RtnValue (Data)
    SELECT Data = LTRIM(RTRIM(@RowData))

    RETURN
END
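As a quick sanity check, here is a minimal sketch of calling the function on its own. The sample string is made up for illustration, and the second query assumes SQL Server 2016 or later (database compatibility level 130+), where the built-in `STRING_SPLIT` is available as an alternative for single-character delimiters.

```sql
-- Assumes dbo.Split has already been created as defined above.
-- The input string is a made-up example.
SELECT ID, Data
FROM dbo.Split(N'Small Brown Dog', N' ')
ORDER BY ID;
-- ID  Data
-- 1   Small
-- 2   Brown
-- 3   Dog

-- Alternative on SQL Server 2016+ (compatibility level 130 or higher):
-- the built-in STRING_SPLIT returns a single column named "value".
SELECT value
FROM STRING_SPLIT(N'Small Brown Dog', N' ');
```

Note that with dbo.Split, two consecutive delimiters (for example a double space) yield an empty-string row, so you may want to filter out rows where Data = '' before counting words.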
Then you need to split the descriptions and do some grouping (which you would also have to do if the data were normalized):
-- get the number of rows in each grp_id
with group_count as
(
    select grp_id, count(*) cnt from [Group]
    group by grp_id
),
-- get the count of each word in each grp_id
group_word_count as
(
    select count(*) cnt, grp_id, data from
    (
        select * from [Group] g
        cross apply dbo.Split(g.[Desc], ' ')
    ) t
    group by grp_id, data
)
-- return the words whose count within a grp_id equals the number of rows in that grp_id
select gwc.GRP_ID, gwc.Data [Desc] from group_word_count gwc
inner join group_count gc on gwc.GRP_ID = gc.GRP_ID and gwc.cnt = gc.cnt
where [Group] is your table.
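For comparison, here is a rough sketch of what the normalized layout mentioned at the start might look like, with the same "word appears in every row of its group" query written against it. The table and column names (`dbo.GroupItem`, `dbo.GroupItemWord`, `item_id`, `word`) are hypothetical and not taken from your schema; adjust them to whatever you actually have.

```sql
-- Hypothetical normalized layout: one row per item, one row per (item, word).
CREATE TABLE dbo.GroupItem
(
    item_id INT IDENTITY(1,1) PRIMARY KEY,
    grp_id  INT NOT NULL
);

CREATE TABLE dbo.GroupItemWord
(
    item_id INT NOT NULL REFERENCES dbo.GroupItem (item_id),
    word    NVARCHAR(100) NOT NULL,
    PRIMARY KEY (item_id, word)  -- each word is stored at most once per item
);

-- Words that appear in every item of their group.
WITH group_count AS
(
    -- number of items in each group
    SELECT grp_id, COUNT(*) AS cnt
    FROM dbo.GroupItem
    GROUP BY grp_id
),
group_word_count AS
(
    -- number of items in each group that contain a given word
    SELECT i.grp_id, w.word, COUNT(*) AS cnt
    FROM dbo.GroupItem AS i
    INNER JOIN dbo.GroupItemWord AS w ON w.item_id = i.item_id
    GROUP BY i.grp_id, w.word
)
SELECT gwc.grp_id, gwc.word
FROM group_word_count AS gwc
INNER JOIN group_count AS gc
    ON gc.grp_id = gwc.grp_id AND gc.cnt = gwc.cnt;
```

With this layout no string splitting is needed at query time, and the primary key on (item_id, word) means a word repeated within one description is only counted once, which the split-based query above does not guarantee.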
Would '3 Brown' and '3 Medium' also be among your expected results? – Byron 2013-02-24 20:23:30