2010-06-17 142 views
16

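I'm in the process of converting a project with 71 .jar files in its global lib/ directory to use Maven. Of course, these jars have been pulled from the web by many developers over the past ten years of the project's history, and they were not always added to VCS with all the necessary version information. The task: finding the right version of the right JAR in a Maven repository.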

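Is there an easy, automated way to go from that set of .jar files to the corresponding <dependency/> elements for use in my pom.xml files? I'm hoping for a web page where I can submit the checksum of a jar file and get back an XML snippet. The Google hits for 'Maven repository search' basically only find name-based searches, and as far as I can see http://repo1.maven.org/ has no search at all.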

Update: GrepCode looks like it can find projects given an MD5 checksum, but it doesn't provide the specific details (groupId, artifactId) that Maven needs.

Here is the script I came up with, based on the accepted answer:

#!/bin/bash

# For every jar in the current directory, look up its MD5 on Jarvana and
# print a <dependency> element built from the POM page the result links to.
for f in *.jar; do
    s=$(md5sum "$f" | cut -d ' ' -f 1)
    p=$(wget -q -O - "http://www.jarvana.com/jarvana/search?search_type=content&content=${s}&filterContent=digest" | grep inspect-pom | cut -d \" -f 4)
    pj="http://www.jarvana.com${p}"
    rm -f tmp
    wget -q -O tmp "$pj"

    # Take the first groupId/artifactId/version that appears on the POM page.
    g=$(grep groupId tmp | head -n 1 | cut -d \> -f 3 | cut -d \< -f 1)
    a=$(grep artifactId tmp | head -n 1 | cut -d \> -f 3 | cut -d \< -f 1)
    v=$(grep version tmp | head -n 1 | cut -d \> -f 3 | cut -d \< -f 1)
    rm -f tmp

    echo "<dependency> <!-- $f $s $pj -->"
    echo " <groupId>$g</groupId>"
    echo " <artifactId>$a</artifactId>"
    echo " <version>$v</version>"
    echo "</dependency>"
    echo
done
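Taking the first `groupId`/`artifactId`/`version` line with grep can pick up the values of a `<parent>` block when that block comes first in the POM. As a rough sketch of a more structured alternative, assuming you have the raw POM XML in hand (not an HTML page that displays it), the coordinates can be read with Python's standard `xml.etree.ElementTree`. The function name `pom_coordinates` and the `org.example` POM are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard Maven POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def pom_coordinates(pom_text):
    """Return (groupId, artifactId, version) read from raw POM XML.

    groupId and version fall back to the <parent> element when the
    project itself does not declare them.
    """
    root = ET.fromstring(pom_text)

    def find_child(element, tag):
        # Try the namespaced tag first, then the bare tag (POM without xmlns).
        if element is None:
            return None
        node = element.find("m:" + tag, POM_NS)
        if node is None:
            node = element.find(tag)
        return node

    def child_text(element, tag):
        node = find_child(element, tag)
        if node is None or node.text is None:
            return None
        return node.text.strip()

    parent = find_child(root, "parent")
    group = child_text(root, "groupId") or child_text(parent, "groupId")
    artifact = child_text(root, "artifactId")
    version = child_text(root, "version") or child_text(parent, "version")
    return group, artifact, version


# Hypothetical POM: the project inherits groupId and version from its parent.
EXAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <parent>
    <groupId>org.example</groupId>
    <artifactId>example-parent</artifactId>
    <version>1.2.3</version>
  </parent>
  <artifactId>example-lib</artifactId>
</project>"""

print(pom_coordinates(EXAMPLE_POM))
```

Note that the artifactId is never taken from the parent, since the parent's artifactId names a different artifact.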
+0

Wow, nice find! Looks like a new business opportunity. – 2010-06-17 15:51:07

Answers

0

Hi, you can use mvnrepository to search for artifacts, or you can use Eclipse and go through Add Dependency, which has a search that uses the Maven Central index.

2

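I was in the same situation as the OP, but as mentioned in later answers, Jarvana is no longer around.

I used the checksum functionality of Maven Central Search and its search API to achieve the same result.

First create a file with the SHA-1 sums: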

sha1sum *.jar > jar-sha1sums.txt 
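On machines without `sha1sum` (Windows, for instance), an equivalent file can be produced with Python's standard `hashlib`. This is only a sketch: the helper names `sha1_of_file` and `write_sha1sums` are made up here, and the output mimics the `<hash>  <file name>` layout that `sha1sum` prints.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sha1sums(directory, out_path):
    """Write one '<sha1>  <file name>' line per .jar in directory.

    Returns the number of jars that were hashed.
    """
    jars = sorted(name for name in os.listdir(directory) if name.endswith(".jar"))
    with open(out_path, "w") as out:
        for name in jars:
            sha = sha1_of_file(os.path.join(directory, name))
            out.write("%s  %s\n" % (sha, name))
    return len(jars)
```

Calling `write_sha1sums("lib", "jar-sha1sums.txt")` would then play the role of the `sha1sum` command, with `lib` standing in for wherever the jars live.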

Then use the following Python script to check whether there is any information on the jars in question:

import json
import urllib.request

# Each line of jar-sha1sums.txt has the form "<sha1>  <file name>".
with open('./jar-sha1sums.txt', 'r') as f, open('./pom.xml', 'w') as pom:
    for line in f:
        if not line.strip():
            continue
        # Split on any run of whitespace: sha1sum separates the two fields with two spaces.
        sha, jar = line.split(None, 1)
        jar = jar.strip()
        print("Looking up " + jar)
        # Field "1" in the Maven Central search API is the SHA-1 checksum.
        searchurl = 'http://search.maven.org/solrsearch/select?q=1:%22' + sha + '%22&rows=20&wt=json'
        with urllib.request.urlopen(searchurl) as page:
            data = json.loads(page.read().decode('utf-8'))
        if data["response"] and data["response"]["numFound"] == 1:
            print("Found info for " + jar)
            jarinfo = data["response"]["docs"][0]
            pom.write('<dependency>\n')
            pom.write('\t<groupId>' + jarinfo["g"] + '</groupId>\n')
            pom.write('\t<artifactId>' + jarinfo["a"] + '</artifactId>\n')
            pom.write('\t<version>' + jarinfo["v"] + '</version>\n')
            pom.write('</dependency>\n')
        else:
            # No unique match: write a placeholder to fill in by hand.
            print("No info found for " + jar)
            pom.write('<!-- TODO Find information on this jar file -->\n')
            pom.write('<dependency>\n')
            pom.write('\t<groupId></groupId>\n')
            pom.write('\t<artifactId>' + jar.replace(".jar", "") + '</artifactId>\n')
            pom.write('\t<version></version>\n')
            pom.write('</dependency>\n')

Whether any information turns up for a given jar will vary (YMMV).
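One reason a lookup can come back empty-handed in the script above is that it only accepts exactly one match (`numFound == 1`), so a jar whose checksum matches several published coordinates is treated as not found. A possible variation, sketched below, is to emit one candidate snippet per returned document and choose by hand. The field names `g`, `a` and `v` are the ones the script already reads; the function names and the sample response are invented for illustration.

```python
from xml.sax.saxutils import escape


def dependency_xml(doc):
    """Render one search result document as a <dependency> element."""
    return (
        "<dependency>\n"
        "\t<groupId>%s</groupId>\n"
        "\t<artifactId>%s</artifactId>\n"
        "\t<version>%s</version>\n"
        "</dependency>\n"
    ) % (escape(doc["g"]), escape(doc["a"]), escape(doc["v"]))


def candidate_snippets(data):
    """Return one XML snippet per document in a search response."""
    docs = data.get("response", {}).get("docs", [])
    return [dependency_xml(doc) for doc in docs]


# Hypothetical response in which the same jar is known under two coordinates.
SAMPLE_RESPONSE = {
    "response": {
        "numFound": 2,
        "docs": [
            {"g": "org.example", "a": "example-lib", "v": "1.0"},
            {"g": "com.example.repackaged", "a": "example-lib", "v": "1.0"},
        ],
    }
}

for snippet in candidate_snippets(SAMPLE_RESPONSE):
    print(snippet)
```

Writing all candidates into the generated file, each inside a comment, would keep the output valid while leaving the final choice to a person.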

2

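I borrowed code and ideas from @Karl Tryggvason but couldn't get the Python script to work. Being a Windows monkey, I did something similar in PowerShell (requires v3). It is less sophisticated (it doesn't generate a pom, it just dumps the results), but I thought it might save someone a few minutes.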

$log = 'c:\temp\jarfind.log' 

Get-Date | Tee-Object -FilePath $log 

$jars = gci d:\source\myProject\lib -Filter *.jar 

foreach ($jar in $jars) 
{ 
    $sha = Get-FileHash -Algorithm SHA1 -Path $jar.FullName | select -ExpandProperty hash 
    $name = $jar.Name 
    $json = Invoke-RestMethod "http://search.maven.org/solrsearch/select?q=1:%22$($sha)%22&rows=20&wt=json" 
    "Found $($json.response.numfound) jars with sha1 matching that of $($name)..." | Tee-Object -FilePath $log -Append 
    $jarinfo = $json.response.docs 
    $jarinfo | Tee-Object -FilePath $log -Append 
} 
0

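If you also want to fall back on the artifactId and version read from the jar's file name, you can use the following code. It is an improved version of Karl's script.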

import os
import sys
from subprocess import check_output

import requests


class bcolors:
    # ANSI escape codes used to highlight error messages.
    FAIL = '\033[91m'
    ENDC = '\033[0m'


def searchByShaChecksum(sha):
    # Look up an artifact on Maven Central by its SHA-1 checksum.
    searchurl = 'http://search.maven.org/solrsearch/select?q=1:%22' + sha + '%22&rows=20&wt=json'
    resp = requests.get(searchurl)
    data = resp.json()
    return data


def searchAsArtifact(artifact, version):
    # Look up an artifact on Maven Central by artifactId and version.
    searchurl = 'http://search.maven.org/solrsearch/select?q=a:"' + artifact + '" AND v:"' + version.strip() + '"&rows=20&wt=json'
    resp = requests.get(searchurl)
    data = resp.json()
    return data


def processAsArtifact(file: str):
    # Try every split of "name-parts-version.jar" into artifactId and version
    # until exactly one artifact matches.
    data = {'response': {'start': 0, 'docs': [], 'numFound': 0}}
    jar = file.replace(".jar", "")
    splits = jar.split("-")
    if len(splits) < 2:
        return data
    for i in range(1, len(splits)):
        artifact = "-".join(splits[0:i])
        version = "-".join(splits[i:])
        data = searchAsArtifact(artifact, version)
        if data["response"] and data["response"]["numFound"] == 1:
            return data
    return data


def writeToPom(pom: object, grp: str = None, art: str = None, ver: str = None):
    # Write a <dependency> element; flag it with a TODO when details are missing.
    if grp is None or ver is None:
        pom.write('<!-- TODO Find information on this jar file -->\n')
    pom.write('<dependency>\n')
    grp = grp if grp is not None else ""
    art = art if art is not None else ""
    ver = ver if ver is not None else ""
    pom.write('\t<groupId>' + grp + '</groupId>\n')
    pom.write('\t<artifactId>' + art + '</artifactId>\n')
    pom.write('\t<version>' + ver + '</version>\n')
    pom.write('</dependency>\n')


def main(argv):
    if len(argv) == 0:
        print(bcolors.FAIL + 'Syntax : findPomJars.py <lib_dir_path>' + bcolors.ENDC)
        sys.exit(1)
    lib_home = str(argv[0])
    if not os.path.exists(lib_home):
        print(bcolors.FAIL + lib_home + " directory doesn't exist" + bcolors.ENDC)
        sys.exit(1)
    os.chdir(lib_home)

    successList = []
    failedList = []
    jarCount = 0
    with open('./auto_gen_pom_list.xml', 'w') as pom:
        # List '.' because the working directory is now lib_home.
        for jar in sorted(os.listdir('.')):
            if not jar.endswith(".jar"):
                continue
            jarCount += 1
            sys.stdout.write("\rProcessed Jar Count: %d" % jarCount)
            sys.stdout.flush()
            # sha1sum prints "<sha1>  <file name>"; only the hash is needed.
            sha = check_output(["sha1sum", jar]).decode().split()[0]
            data = searchByShaChecksum(sha)
            if data["response"] and data["response"]["numFound"] == 0:
                # No checksum match: fall back to the file name.
                data = processAsArtifact(jar)

            if data["response"] and data["response"]["numFound"] == 1:
                successList.append("Found info for " + jar)
                jarinfo = data["response"]["docs"][0]
                writeToPom(pom, jarinfo["g"], jarinfo["a"], jarinfo["v"])
            else:
                failedList.append("No info found for " + jar)
                writeToPom(pom, art=jar.replace(".jar", ""))

    print("\n")
    print("Success : %d" % len(successList))
    print("Failed : %d" % len(failedList))

    for entry in successList:
        print(entry)
    for entry in failedList:
        print(entry)


if __name__ == "__main__":
    main(sys.argv[1:])

The code is also available on GitHub.
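To make the file-name fallback easier to follow, here is a small standalone sketch of just the splitting step: every hyphen in the jar name is tried as the boundary between artifactId and version. The ordering that puts digit-leading versions first is an extra heuristic of this sketch, not something the script above does, and both function names are made up.

```python
def candidate_splits(jar_name):
    """Return every (artifactId, version) split of a jar file name."""
    stem = jar_name[:-4] if jar_name.endswith(".jar") else jar_name
    parts = stem.split("-")
    return [
        ("-".join(parts[:i]), "-".join(parts[i:]))
        for i in range(1, len(parts))
    ]


def likely_splits(jar_name):
    """Same candidates, with versions that start with a digit tried first."""
    return sorted(
        candidate_splits(jar_name),
        key=lambda pair: not pair[1][:1].isdigit(),
    )


print(candidate_splits("commons-lang-2.6.jar"))
print(likely_splits("commons-lang-2.6.jar"))
```

Trying the digit-leading candidates first would usually mean fewer search requests per jar, though a name without a hyphen still yields no candidates at all.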