Lexical analyzer for C in Python

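I wrote a lexical analyzer for C in Python as part of developing a parser. In my code I wrote some functions that recognize keywords, numbers, operators and so on. No errors are shown when it is compiled. When I run it, I can enter the name of a .c file. The output should list all the keywords, identifiers, etc. found in the input file, but it shows nothing. Can anyone help me? The code is attached.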

import sys 
import string 
delim=['\t','\n',',',';','(',')','{','}','[',']','#','<','>'] 
oper=['+','-','*','/','%','=','!'] 
key=["int","float","char","double","bool","void","extern","unsigned","goto","static","class","struct","for","if","else","return","register","long","while","do"] 
predirect=["include","define"] 
header=["stdio.h","conio.h","malloc.h","process.h","string.h","ctype.h"] 
word_list1="" 
i=0 
j=0 
f=0 
numflag=0 
token=[0]*50 


def isdelim(c): 
    for k in range(0,14): 
     if c==delim[k]: 
      return 1 
     return 0 

def isop(c): 
    for k in range(0,7): 
     if c==oper[k]: 
      ch=word_list1[i+1] 
      i+=1 
      for j in range(0,6): 
       if ch==oper[j]: 
        fop=1 
        sop=ch 
        return 1 
       #ungetc(ch,fp); 
       return 1 
       j+=1 
     return 0; 
     k+=1 

def check(t): 
    print t 
    if numflag==1: 
     print "\n number "+str(t) 
     return 
    for k in range(0,2):#(i=0;i<2;i++) 
     if strcmp(t,predirect[k])==0: 
      print "\n preprocessor directive "+str(t) 
      return 
    for k in range(0,6): #=0;i<6;i++) 
     if strcmp(t,header[k])==0: 
      print "\n header file "+str(t) 
      return 
    for k in range(0,21): #=0;i<21;i++) 
     if strcmp(key[k],t)==0: 
      print "\n keyword "+str(key[k]) 
      return 
     print "\n identifier \t%s"+str(t) 

def skipcomment(): 
    ch=word_list[i+1] 
    i+=1 
    if ch=='/': 
     while word_list1[i]!='\0': 
      i+=1#ch=getc(fp))!='\0': 
    elif ch=='*': 
     while f==0: 
      ch=word_list1[i] 
      i+=1 
     if c=='/': 
      f=1 
    f=0 




a=raw_input("Enter the file name:") 
s=open(a,"r") 
str1=s.read() 
word_list1=str1.split() 




i=0 
#print word_list1[i] 
for word in word_list1 : 
    print word_list1[i] 
    if word_list1[i]=="/": 
     print word_list1[i] 
    elif word_list1[i]==" ": 
     print word_list1[i] 
    elif word_list1[i].isalpha(): 
     if numflag!=1: 
      token[j]=word_list1[i] 
      j+=1 
     if numflag==1: 
      token[j]='\0' 
      check(token) 
      numflag=0 
      j=0 
      f=0 
     if f==0: 
      f=1 
    elif word_list1[i].isalnum(): 
     if numflag==0: 
      numflag=1 
      token[j]=word_list1[i] 
      j+=1 
     else: 
      if isdelim(word_list1[i]): 
       if numflag==1: 
        token[j]='\0' 
        check(token) 
        numflag=0 
       if f==1: 
        token[j]='\0' 
        numflag=0 
        check(token) 
       j=0 
       f=0 
       print "\n delimiters : "+word_list1[i] 
    elif isop(word_list1[i]): 
     if numflag==1: 
      token[j]='\0' 
      check(token) 
      numflag=0 
      j=0 
      f=0 
     if f==1: 
      token[j]='\0' 
      j=0 
      f=0 
      numflag=0 
      check(token)  
     if fop==1: 
      fop=0 
      print "\n operator \t"+str(word_list1[i])+str(sop) 
     else: 
      print "\n operator \t"+str(c) 
    elif word_list1[i]=='.': 
     token[j]=word_list1[i] 
     j+=1 
    i+=1


2010-10-22 Aneeshia

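Wow. That's a lot of work to reinvent the wheel. Why not download 'ply' and start from an existing C-language parser? Why do this? – 2010-10-22 10:51:07

I don't understand why you are doing this. You got a lot of good advice on your previous question (which I assume is your motivation here), http://stackoverflow.com/questions/3976665/parser-generation, including references to a complete C parser written in Python. – 2010-10-22 19:51:58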

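Your code is not good. Try splitting it into smaller functions that you can test individually. Have you tried debugging the program? Once you have found the place that causes the problem, you can come back here and ask a more specific question.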
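
As a rough illustration of what "smaller functions that you can test individually" might look like, here is a sketch in Python 3 syntax (the question itself uses Python 2). The names KEYWORDS and classify_word are made up for this example and do not come from the question's code; the keyword set is deliberately shortened.

```python
# Hypothetical helper: classify one word that has already been separated out.
KEYWORDS = {"int", "float", "char", "if", "else", "while", "return"}

def classify_word(word):
    """Return a label for a single word; no global state is modified."""
    if word in KEYWORDS:
        return "keyword"
    if word.isdigit():
        return "number"
    starts_ok = bool(word) and (word[0].isalpha() or word[0] == "_")
    if starts_ok and all(ch.isalnum() or ch == "_" for ch in word):
        return "identifier"
    return "unknown"

# Each case can be checked on its own, without reading a file first.
assert classify_word("int") == "keyword"
assert classify_word("42") == "number"
assert classify_word("count") == "identifier"
```

Because the function takes its input as a parameter and returns a value instead of printing, a failing case points straight at the function that is wrong.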

Some more hints. You can implement isdelim much more simply, like this:

def isdelim(c): 
    return c in delim

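To compare strings for equality, use string1 == string2. strcmp does not exist in Python. I don't know whether you are aware that Python is usually interpreted rather than compiled. This means that if you call a function that does not exist, you will not get a compiler error. The program will only complain at run time, when that call is actually reached.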
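
A small sketch of both points, in Python 3 syntax. The function names same_text and call_missing_function are invented for the demonstration; the only thing taken from the question is the call to the undefined strcmp.

```python
def same_text(a, b):
    # Plain == compares the contents of two strings in Python.
    return a == b

def call_missing_function():
    # strcmp is not defined anywhere. Defining this function is fine;
    # Python only notices the problem when this line is executed.
    return strcmp("int", "int")

print(same_text("include", "include"))

try:
    call_missing_function()
except NameError as exc:
    print("failed at run time:", exc)
```

This is probably why the question reports "no errors" up front: the bad call only fails once check() is actually reached.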

Your function isop contains unreachable code. The lines j += 1 and k += 1 can never be reached, because each comes directly after a return statement.
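
To make that concrete, here is one possible way to restructure isop so that nothing follows a return and no globals are touched. It is a sketch in Python 3 syntax, not a fix for the whole program. The name read_operator, its parameters and the TWO_CHAR set are assumptions made for this example (the original code treats any operator followed by another operator as a pair).

```python
OPERATORS = "+-*/%=!"
# Assumed set of two-character operators, for illustration only.
TWO_CHAR = {"==", "!=", "+=", "-=", "*=", "/=", "%=", "++", "--"}

def read_operator(text, pos):
    """Return the operator that starts at text[pos], or None if there is none."""
    if pos >= len(text) or text[pos] not in OPERATORS:
        return None
    pair = text[pos:pos + 2]   # slicing never raises, even at the end
    if pair in TWO_CHAR:
        return pair
    return text[pos]

print(read_operator("a += 1", 2))
print(read_operator("a + 1", 2))
print(read_operator("a + 1", 0))
```

Returning the operator text (or None) means the caller can advance its own position by len(result) instead of relying on a shared counter.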

Iterating over a collection in Python looks like this:

for item in collection: 
    # do stuff with item
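
Applied to the lists from the question, that pattern could look roughly like the sketch below (Python 3 syntax, lists shortened). find_category is a hypothetical name, not a function from the original code. Note that a hard-coded bound such as range(0, 21) over the question's 20-item key list can run past the end of the list, which direct iteration avoids.

```python
key = ["int", "float", "char"]
predirect = ["include", "define"]

def find_category(word):
    # Loop over the items directly: no range() bound and no index to get wrong.
    for keyword in key:
        if word == keyword:
            return "keyword"
    for directive in predirect:
        if word == directive:
            return "preprocessor directive"
    return "identifier"

# enumerate() supplies the position as well, when it is really needed.
for position, keyword in enumerate(key):
    print(position, keyword)
```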

These are just a few hints. You really should read the Python Tutorial.


2010-10-22 09:25:07

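I am new to Python.. anyway thanx. – Aneeshia 2010-10-22 09:37:56

@Aneeshia: "I am new to Python". That means you must read the Python tutorial first. Then, after reading the tutorial, you should Google "Python lexical scanning" and read the code you find there. Starting with a piece of code this large is a bad idea. The tutorial is a good idea. – 2010-10-22 12:51:36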
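
For readers wondering what such scanner code tends to look like, below is a minimal regex-based sketch that uses only the standard re module (Python 3 syntax). The token names, the pattern table and the tokenize function are assumptions made for this illustration. It covers only a small subset of C, classifies < and > as operators rather than delimiters, and is not the asker's code.

```python
import re

KEYWORDS = {"int", "float", "char", "void", "if", "else", "for", "while", "return"}

# Order matters: earlier alternatives win, so comments come before "/".
TOKEN_SPEC = [
    ("comment",    r"/\*.*?\*/|//[^\n]*"),
    ("number",     r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator",   r"==|!=|<=|>=|\+\+|--|[-+*/%=!<>]"),
    ("delimiter",  r"[,;(){}\[\]#]"),
    ("skip",       r"\s+"),
    ("mismatch",   r"."),
]
MASTER = re.compile("|".join("(?P<%s>%s)" % pair for pair in TOKEN_SPEC), re.DOTALL)

def tokenize(source):
    """Yield (kind, text) pairs for a small subset of C."""
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind in ("skip", "comment"):
            continue                      # whitespace and comments are dropped
        if kind == "identifier" and text in KEYWORDS:
            kind = "keyword"
        yield kind, text

for token in tokenize("int x = 42; // answer"):
    print(token)
```

The key difference from the question's approach is that the scanner walks the raw text character by character (through the regex engine) instead of calling split() first, so "x=42;" is still broken into four tokens.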

def isdelim(c): 
    if c in delim: 
     return 1 
    return 0

You should learn more about the basics of Python. ATM, your code contains far too many ifs and fors.

Try learning it the hard way.


2010-10-22 09:27:21

It seems to produce quite a lot of output for me, but the code is hard to follow. I ran it on itself and it failed like this:

Traceback (most recent call last): 
    File "C:\dev\snippets\lexical.py", line 92, in <module> 
    token[j]=word_list1[i] 
IndexError: list assignment index out of range

Honestly, this is pretty bad code. You should give the functions better names, and you should not use magic numbers like this:

for k in range(0,14)

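I mean, you have already made a list whose length you can use for the range.

for k in range(len(delim))

makes more sense.

But all you want to determine is whether c is in the list delim, so just say: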

if c in delim

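Why return 1 and 0, and what do they mean? Why not use True and False?

There are probably several other obvious problems, such as the entire "main" section of the code.

This is not very Pythonic: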

token=[0]*50

Did you really just mean to say this?

token = []

Now it is just an empty list.

Instead of trying to use a counter like this:

token[j]=word_list1[i]

you want to append, like this:

token.append(word_list1[i])
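
Put together, the empty list plus append might be used along these lines. This is only a sketch in Python 3 syntax; split_words is a hypothetical helper that collects runs of letters, digits and underscores, not a replacement for the full program.

```python
def split_words(text):
    """Collect runs of letters/digits into words, one character at a time."""
    words = []
    token = []                      # starts empty and grows as needed
    for ch in text:
        if ch.isalnum() or ch == "_":
            token.append(ch)        # no index counter, no fixed size of 50
        elif token:
            words.append("".join(token))
            token = []              # start a fresh token
    if token:                       # flush whatever is left at the end
        words.append("".join(token))
    return words

print(split_words("int total = count1 + 42;"))
```

Since the list grows on demand, the IndexError from the traceback above cannot occur here, however long a token gets.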

Honestly, I think you have started with too hard a problem.


2010-10-22 09:36:03 jgritty
