4

給定一個二維空間中的點列表,您想要在 Haskell中執行函數來找到兩個最近點之間的距離。 例子: 輸入:項目[(1,5),(3,4),(2,8),(-1,2),(-8.6),(7.0),(1.5),(5.5), (4.8),(7.4)] 輸出:2.0Haskell找到兩個最近點之間的距離

假設在列表中的最遠的兩個點之間的距離爲至多10000

Here's我的代碼:

import Data.List 
import System.Random 

sort_ :: Ord a => [a] -> [a] 
sort_ [] = [] 
sort_ [x] = [x] 
sort_ xs = merge (sort_ left) (sort_ right) 
    where 
    (left, right) = splitAt (length xs `div` 2) xs 
    merge [] xs = xs 
    merge xs [] = xs 
    merge (x:xs) (y:ys)= 
    if x <= y then 
     x : merge xs (y:ys) 
    else y : merge (x:xs) ys  

project :: [(Float,Float)] -> Float 
project [] = 0 
project (x:xs)= 
    if null (xs) then 
     error "The list have only 1 point" 
    else head(sort_(dstList(x:xs))) 

distance :: (Float,Float)->(Float,Float) -> Float 
distance (x1,y1) (x2,y2) = sqrt((x1 - x2)^2 + (y1 - y2)^2) 


dstList :: [(Float,Float)] -> [Float] 
dstList (x:xs)= 
    if length xs == 1 then 
     (dstBetween x xs):[] 
    else (dstBetween x xs):(dstList xs) 


dstBetween :: (Float,Float) -> [(Float,Float)] -> Float 
dstBetween pnt (x:xs)= 
    if null (xs) then 
     distance pnt x 
    else minimum ((distance pnt):((dstBetween pnt xs)):[]) 

{- 
Calling generator to create a file created at random points 
-} 
generator = do 
    putStrLn "Enter File Name" 
    file <- getLine 
    g <- newStdGen 
    let pts = take 1000 . unfoldr (Just . (\([a,b],c)->((a,b),c)) . splitAt 2) 
       $ randomRs(-1,1) g :: [(Float,Float)] 
    writeFile file . show $ pts 

{- 
Call the main to read a file and pass it to the function of project 
The function of the project should keep the name 'project' as described 
in the statement 
-} 
main= do 
    putStrLn "Enter filename to read" 
    name <- getLine 
    file <- readFile name 
    putStrLn . show . project $ readA file 

readA::String->[(Float,Float)] 
readA = read 

我可以執行程序的運行,如下例所示或使用生成器:

in haskell interpreter must鍵入「生成器」,程序會在這裏要求一個包含千分的文件名。並且在Haskell解釋器中生成文件後,必須寫入main,並請求一個文件名,這是使用「generator」創建的文件的名稱。

問題是,對於1000點隨機生成我的程序需要很長時間,在雙核處理器的計算機上約3分鐘。我究竟做錯了什麼?我如何優化代碼以更快地工作?

+0

你是否介紹了你的程序? – Jonke

+0

你爲什麼要刪除這麼多你的帖子?看看你的嘗試是有幫助的。 – AndrewC

+2

我已恢復第二個版本,以恢復上下文。 –

回答

11

您使用的是二次算法:

project [] = error "Empty list of points" 
project [_] = error "Single point is given" 
project ps = go 10000 ps 
    where 
    go a [_] = a 
    go a (p:ps) = let a2 = min a $ minimum [distance p q | q<-ps] 
        in a2 `seq` go a2 ps 

你應該用更好的算法。 Search computational-geometry tag on SO爲更好的算法。請參閱http://en.wikipedia.org/wiki/Closest_pair_of_points_problem


@maxtaldykin proposes一個不錯的,簡單而有效的改變的算法,應該爲隨機數據的真正的區別 - 預排序X點座標,從來沒有嘗試點超過d單位離目前來看,在X座標(其中d是目前已知的最小距離):

import Data.Ord (comparing) 
import Data.List (sortBy) 

project2 [email protected](_:_:_) = go 10000 p1 t 
    where 
    (p1:t) = sortBy (comparing fst) ps 
    go d _   [] = d 
    go d [email protected](x1,_) t = g2 d t 
     where 
     g2 d []   = go d (head t) (tail t) 
     g2 d ([email protected](x2,_):r) 
      | x2-x1 >= d = go d (head t) (tail t) 
      | d2 >= d  = g2 d r 
      | otherwise = g2 d2 r -- change it "mid-flight" 
       where 
       d2 = distance p1 p2 

隨機數據,g2將在O(1)時間工作,使go將採取O(n)整個事情將由SOR爲界t,~ n log n

Empirical orders of growth~ n^2.1顯示用於第一碼(上1K/2K範圍)和~n^1.1用於第二,上10K/20K範圍,測試它quick'n'dirty編譯加載到GHCI(與第二代碼運行50倍的速度比2000年的第一名還要快,而在3000分的情況下,則快80倍)。

+0

優秀點。 O(n^2)給出了1 000 000個計算的非常粗略的估計,O(n log n)給出了非常粗略的估計值3000.您還應該將生成函數和主函數合併到一個主函數中,並使用ghc編譯文件 - O2,與解釋者相比,它會加快速度。 – AndrewC

+0

@AndrewC,不要認爲'ghc -O2'會有幫助,這個問題被標記爲* hugs * –

+0

@maxtaldykin糟糕。然後修改建議也是下載[Haskell平臺](http://www.haskell.org/platform/)。 – AndrewC

6

可以稍微修改你的bruteforce搜索以獲得更好的隨機數據性能。

主要思想是用x座標對點進行排序,並且在比較循環中的距離時,只考慮水平距離不大於當前最小距離的點。

這可能是數量級更快,但在最壞的情況下,它仍然是O(n^2)
實際上,在2000點上,我的機器速度快了50倍。

project points = loop1 10000 byX 
    where 
    -- sort points by x coordinate 
    -- (you need import Data.Ord to use `comparing`) 
    byX = sortBy (comparing fst) points 

    -- loop through all points from left to right 
    -- threading `d` through iterations as a minimum distance so far 
    loop1 d = foldl' loop2 d . tails 

    -- `tail` drops leftmost points one by one so `x` is moving from left to right 
    -- and `xs` contains all points to the right of `x` 
    loop2 d [] = d 
    loop2 d (x:xs) = let 
     -- we take only those points of `xs` whose horizontal distance 
     -- is not greater than current minimum distance 
     xs' = takeWhile ((<=d) . distanceX x) xs 
     distanceX (a,_) (b,_) = b - a 

     -- then just get minimum distance from `x` to those `xs'` 
     in minimum $ d : map (distance x) xs' 

順便說一句,請不要使用太多括號。 Haskell不需要包含函數參數。

+0

非常好!簡單而有效。 :) –