2017-03-06 58 views
0

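Python 3: compressing text by removing punctuation from an input string and counting the positions of the words in the string

I am creating a program that takes an input string from the user and then uses the ord function to strip the punctuation from the string. It should then calculate the position of each word (starting from 1) and ignore any repeated words. The compressed sentence should be written to a text file, as should the positions.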

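The problem with my code is that the input string gets split into individual letters, and the position count counts individual letters. I'm sure there is a simple fix, but after 84 versions I have run out of ideas.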

import string 

sentence=input("Please enter a sentence: ") 
sentence=sentence.upper() 
sentencelist = open("sentence_List.txt","w") 
sentencelist.write(str(sentence)) 
sentencelist.close() 


words=list(str.split(sentence)) 
wordlist=len(words) 
position=[] 
text=() 
uniquewords=[] 
texts="" 
nsentence=(sentence) 

for c in list(sentence): 
     if not ord(c.lower()) in range(97,122): 
       nsentence=nsentence.replace(c, "")#Ascii a-z 
print(nsentence) 


nsentencelist=len(nsentence) 
print(nsentencelist) 
nsentencelist2 = open("nsentence_List.txt","w") 
nsentencelist2.write(str(nsentence)) 
nsentencelist2.close() 
+0

Could you please include sample input and the desired output? – Crispin

+0

If that is the case, why do you replace the non-alphabetic characters with "" (an empty string)? – EvanL00

Answers

0

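The problem is that you replace punctuation with "" (an empty string), so when you try to split the sentence "we are good. OK" into words, you are actually splitting "wearegoodOK". Try replacing the punctuation with a space " " instead.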
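
A minimal sketch of that fix, assuming the goal is to keep only ASCII letters and treat everything else as a word separator. The function name `strip_punctuation` is hypothetical. It tests membership in `string.ascii_letters` instead of using `ord`, because `range(97,122)` in the question stops at 121 and so also drops the letter z. It also keeps whitespace explicitly, since a space fails the a-z test in the question's loop and is removed along with the punctuation:

```python
import string

def strip_punctuation(sentence):
    """Replace every character that is not an ASCII letter or whitespace with a space."""
    cleaned = []
    for c in sentence:
        if c in string.ascii_letters or c.isspace():
            cleaned.append(c)
        else:
            # a space keeps neighbouring words apart, unlike ""
            cleaned.append(" ")
    return "".join(cleaned)

words = strip_punctuation("we are good. OK").split()
print(words)  # ['we', 'are', 'good', 'OK']
```

Calling `split()` with no arguments collapses runs of whitespace, so the double space left behind by ". " does not produce an empty word.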

Or you can use a regular expression to split the sentence into words, as in Strip Punctuation From String in Python
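
As a rough illustration of the regular-expression route, the sketch below extracts the words directly rather than deleting punctuation first. The helper name `split_words` and the pattern are assumptions for this example, not taken from the linked question:

```python
import re

def split_words(sentence):
    """Return the runs of ASCII letters in the sentence, in order."""
    return re.findall(r"[A-Za-z]+", sentence)

print(split_words("we are good. OK"))  # ['we', 'are', 'good', 'OK']
```

With this pattern an apostrophe also acts as a separator, so "don't" would come back as two words, "don" and "t". Whether that is acceptable depends on the input you expect.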

0

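As a suggestion, here is a function that returns the sentence with punctuation and case stripped, together with an ordered dictionary of word: index_of_first_occurrence pairs. You can write this data out to a file, which I have not done here because I don't know your exact output requirements: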

import re 
from collections import OrderedDict 

def compress(sentence):

    # matches anything that is not an ASCII letter or whitespace
    PUNCTUATION_REGEX = re.compile(r'[^a-zA-Z\s]')

    # use an OrderedDict to keep words in order of first occurrence
    words = OrderedDict()
    # replace punctuation with an empty string and lower-case the result
    sentence = PUNCTUATION_REGEX.sub('', sentence).lower()
    # loop through the words in the sentence
    for idx, word in enumerate(sentence.split()):
        # check that we haven't encountered this word before
        if word not in words:
            # add new word to dict, with its 1-based position as value
            words[word] = idx + 1
    return sentence, words
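
One way the output step could look, using the same first-occurrence rule as the function above. This is only a sketch under assumptions: the file layout (unique words on the first line, one position per word of the sentence on the second), the file name `compressed.txt`, and the helper names `word_positions` and `write_compressed` are all hypothetical, since the question does not spell out the required format.

```python
import os
import re
import tempfile

def word_positions(sentence):
    """Return the cleaned words and a dict of word -> 1-based first position."""
    words = re.sub(r"[^a-zA-Z\s]", "", sentence).lower().split()
    first_seen = {}
    for idx, word in enumerate(words, start=1):
        # setdefault only stores the position the first time a word appears
        first_seen.setdefault(word, idx)
    return words, first_seen

def write_compressed(sentence, path):
    """Write the unique words and the position list to a text file (assumed format)."""
    words, first_seen = word_positions(sentence)
    positions = [first_seen[w] for w in words]
    with open(path, "w") as f:
        f.write(" ".join(first_seen) + "\n")
        f.write(" ".join(str(p) for p in positions) + "\n")
    return positions

# hypothetical output location, placed in the temp directory for the example
out_path = os.path.join(tempfile.gettempdir(), "compressed.txt")
print(write_compressed("We are good. We are OK!", out_path))  # [1, 2, 3, 1, 2, 6]
```

A repeated word gets the position of its first occurrence, so the second "we" is written as 1 rather than 4.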