2016-07-27 90 views
1

Extract part of a string in Golang?

I'm learning Golang so that I can rewrite some of my shell scripts.

My URL looks like this:

https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value 

I want to extract the following part:

https://example-1.example.com/a/c482dfad3573acff324c/list.txt 

In a shell script, I would do something like this:

echo "$myString" | grep -oE 'https?://.*\.txt'

What is the best way to do the same thing in Golang, using only the standard library?

Answers

6

There are a few options:

// match regexp as in question
pat := regexp.MustCompile(`https?://.*\.txt`)
s := pat.FindString(myString)

// everything before the query
s := strings.Split(myString, "?")[0]

// same as previous, but avoids []string allocation
s := myString
if i := strings.IndexByte(s, '?'); i >= 0 {
    s = s[:i]
}

// parse and clear query string
u, err := url.Parse(myString)
if err != nil {
    log.Fatal(err) // do not use u if parsing failed
}
u.RawQuery = ""
s := u.String()

The last option is the best, because it handles all of the possible corner cases.
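To make the last option concrete, here is a minimal sketch that wraps it in a function with error handling. The name `stripQuery` is hypothetical, and clearing `u.Fragment` as well is an assumption that goes beyond the snippet above, which clears only the query.

```go
package main

import (
	"fmt"
	"net/url"
)

// stripQuery is a hypothetical helper: it parses raw and returns the URL
// without its query string. Clearing the fragment too is an assumption
// added for this sketch; the original snippet clears only RawQuery.
func stripQuery(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.RawQuery = ""
	u.Fragment = ""
	return u.String(), nil
}

func main() {
	inputs := []string{
		"https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value",
		"https://example-1.example.com/a/c482dfad3573acff324c/list.txt",     // no query
		"https://example-1.example.com/a/c482dfad3573acff324c/list.txt#top", // fragment only
	}
	for _, in := range inputs {
		s, err := stripQuery(in)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Println(s)
	}
}
```

All three inputs should print the same `list.txt` URL, since the query and fragment are the only parts that differ.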


+0

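I recommend using url.Parse, since it should handle any odd edge cases that a regular expression or a split might miss. For example, what about a URL with no '?'?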

+0

I agree that url.Parse is the best approach. All of the options listed do handle URLs without a '?'.

1

You can use strings.IndexRune, strings.IndexByte, strings.Split, strings.SplitAfter, strings.FieldsFunc, url.Parse, regexp, or your own function.

First, the simplest way:
you can use i := strings.IndexRune(s, '?') or i := strings.IndexByte(s, '?') and then s[:i], like this (output shown in the comments):

package main 

import "fmt" 
import "strings" 

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 
    i := strings.IndexByte(s, '?') 
    if i != -1 { 
     fmt.Println(s[:i]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt 
    } 
} 

Or you can use url.Parse(s) (this is the one I use):

package main 

import "fmt" 
import "net/url" 

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 
    u, err := url.Parse(s) // named u so it does not shadow the url package
    if err == nil {
        u.RawQuery = ""
        fmt.Println(u.String()) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
    }
} 

Or you can use regexp.MustCompile(`.*\.txt`):

package main 

import "fmt" 
import "regexp" 

var rgx = regexp.MustCompile(`.*\.txt`) 

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 

    fmt.Println(rgx.FindString(s)) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt 
} 
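One thing to watch with the regexp approach: `.*` is greedy, so if the query string itself contains `.txt`, the match runs to the last occurrence. The sketch below compares a greedy and a lazy (`.*?`) pattern on a hypothetical input; the URL and the variable names are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// greedy matches as much as possible before the final ".txt".
	greedy = regexp.MustCompile(`https?://.*\.txt`)
	// lazy stops at the first ".txt" it can reach.
	lazy = regexp.MustCompile(`https?://.*?\.txt`)
)

func main() {
	// Hypothetical input: the query string also contains ".txt".
	s := "https://example.com/a/list.txt?next=https://example.com/b/other.txt"

	fmt.Println(greedy.FindString(s)) // the whole string, up to other.txt
	fmt.Println(lazy.FindString(s))   // https://example.com/a/list.txt
}
```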

Or you can use splits := strings.FieldsFunc(s, func(r rune) bool { return r == '?' }) and then splits[0]:

package main 

import "fmt" 
import "strings" 

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 
    splits := strings.FieldsFunc(s, func(r rune) bool { return r == '?' }) 
    fmt.Println(splits[0]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt 
} 
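Note that strings.FieldsFunc drops empty fields, so it behaves differently from strings.Split on unusual input: an empty string yields an empty slice (indexing it would panic), and a string that starts with '?' yields the query as the first field. A minimal sketch with a guard follows; `firstField` and `isQuestionMark` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// isQuestionMark reports whether r is the query separator.
func isQuestionMark(r rune) bool { return r == '?' }

// firstField is a hypothetical wrapper around strings.FieldsFunc that
// returns "" instead of panicking when there are no fields.
func firstField(s string) string {
	fields := strings.FieldsFunc(s, isQuestionMark)
	if len(fields) == 0 {
		return ""
	}
	return fields[0]
}

func main() {
	fmt.Printf("%q\n", firstField("https://example.com/list.txt?a=1")) // "https://example.com/list.txt"
	fmt.Printf("%q\n", firstField("?a=1"))                             // "a=1": the empty leading field is dropped
	fmt.Printf("%q\n", firstField(""))                                 // ""
	fmt.Printf("%q\n", strings.Split("?a=1", "?")[0])                  // "": Split keeps the empty leading field
}
```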

Or you can use splits := strings.Split(s, "?") and then splits[0]:

package main 

import "fmt" 
import "strings" 

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 
    splits := strings.Split(s, "?") 
    fmt.Println(splits[0]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt 
} 
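On Go 1.18 and later, strings.Cut does the same job as Split followed by [0] without building a slice. This is a sketch assuming a newer toolchain than the one available when this answer was written; `beforeQuery` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// beforeQuery is a hypothetical helper that returns the part of s before
// the first '?'. strings.Cut returns s unchanged when there is no '?'.
func beforeQuery(s string) string {
	before, _, _ := strings.Cut(s, "?")
	return before
}

func main() {
	s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value`
	fmt.Println(beforeQuery(s)) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt
}
```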

Or you can use splits := strings.SplitAfter(s, ".txt") and then splits[0]:

package main 

import "fmt" 
import "strings" 

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 
    splits := strings.SplitAfter(s, ".txt") 
    fmt.Println(splits[0]) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt 
} 

Or you can use your own function (the most self-contained way):

package main 

import "fmt" 

// left returns the part of s before the first '?',
// or s itself if it contains no '?'.
func left(s string) string {
    for i, r := range s {
        if r == '?' {
            return s[:i]
        }
    }
    return s
}

func main() { 
    s := `https://example-1.example.com/a/c482dfad3573acff324c/list.txt?parm1=value,parm2=value,parm3=https://example.com/a?parm1=value,parm2=value` 
    fmt.Println(left(s)) // https://example-1.example.com/a/c482dfad3573acff324c/list.txt 
} 
1

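If you are processing only URLs, you can use Go's net/url package (https://golang.org/pkg/net/url/) to parse the URL, truncate the query and fragment parts (the query being parm1=value,parm2=value and so on), and extract the remaining scheme://host/path part, as in the following example (https://play.golang.org/p/Ao0jU22NyA):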

package main 

import (
    "fmt" 
    "net/url" 
) 

func main() { 
    u, err := url.Parse("https://example-1.example.com/a/b/c/list.txt?parm1=value,parm2=https%3A%2F%2Fexample.com%2Fa%3Fparm1%3Dvalue%2Cparm2%3Dvalue#somefragment")
    if err != nil {
        fmt.Println("parse error:", err)
        return
    }
    u.RawQuery, u.Fragment = "", ""
    fmt.Printf("%s\n", u)
} 

Output:

https://example-1.example.com/a/b/c/list.txt
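To see what is being truncated, the parsed url.URL also exposes the individual components. The sketch below rebuilds scheme://host/path from those fields instead of clearing the query and fragment; `parts` is a hypothetical helper, and using the decoded `u.Path` is a simplification that assumes the path contains no percent-encoded characters.

```go
package main

import (
	"fmt"
	"net/url"
)

// parts is a hypothetical helper that splits raw into its base
// (scheme://host/path), its raw query, and its fragment.
// It uses the decoded u.Path, which is an assumption that the path
// needs no percent-encoding when written back out.
func parts(raw string) (string, string, string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	base := u.Scheme + "://" + u.Host + u.Path
	return base, u.RawQuery, u.Fragment, nil
}

func main() {
	base, query, fragment, err := parts("https://example-1.example.com/a/b/c/list.txt?parm1=value,parm2=https%3A%2F%2Fexample.com%2Fa%3Fparm1%3Dvalue%2Cparm2%3Dvalue#somefragment")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("base:    ", base)     // https://example-1.example.com/a/b/c/list.txt
	fmt.Println("query:   ", query)    // the raw, still-encoded query string
	fmt.Println("fragment:", fragment) // somefragment
}
```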