2016-09-28 192 views

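Removing tags from HTML input with sed

I have an HTML table from which I want to remove the rows that have a certain class. However, when I try `sed 's/<tr class="expandable">.*<\/tr>//g'`, it does nothing at all (that is, the tags are not removed).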


<tr><td>Some col</td></tr> 
<tr class="expandable"> 
    <td colspan="6"> 
     <div class="expandable-content"> 
<p>Holds ACCA Practising Certificate: This indicates a member holding a practising certificate issued by ACCA. This means that the member is authorised to provide a range of general accountancy services to individuals and businesses, including business and tax advice and planning, preparation of personal and business tax returns, set up of book-keeping and business systems, providing book-keeping services, payroll work, assistance with management accounting help with raising finance, budgeting and cash-flow advice, business start-up advice and expert witness.</p> 



Obligatory [don't parse HTML with regular expressions](http://stackoverflow.com/a/1732454/7552) link.


"Have you tried using an XML parser?" -> Neither xmllint nor xidel can delete a certain "type" of row, at least not in any way I know of – Fuzzyma


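I think the sample input shown has a typo; the last line is probably `</tr>`... This might work: `perl -0777 -pe 's|<tr class="expandable">.*?</tr>||gs' file`, but it is not robust, as already pointed out – Sundeep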




xmlstarlet ed -d '//tr[@class="expandable"]' <<ENDHTML 
    <tr><td>Some col</td></tr> 
    <tr class="expandable"> 
     <td colspan="6"> 
      <div class="expandable-content"> 
    <p>Holds ACCA Practising Certificate: This indicates a member holding a practising certificate issued by ACCA. This means that the member is authorised to provide a range of general accountancy services to individuals and businesses, including business and tax advice and planning, preparation of personal and business tax returns, set up of book-keeping and business systems, providing book-keeping services, payroll work, assistance with management accounting help with raising finance, budgeting and cash-flow advice, business start-up advice and expert witness.</p> 
<?xml version="1.0"?> 
     <td>Some col</td> 
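Since the snippet above is cut off, here is a self-contained sketch of the same `xmlstarlet ed -d` call on a well-formed sample. The closing tags, the `<table>` wrapper and the shortened paragraph text are assumptions made only so that the input parses as XML; the script also assumes the binary is installed under the name `xmlstarlet` and skips the demonstration otherwise.

```shell
#!/bin/sh
# Sketch only: hypothetical, well-formed completion of the truncated input
# (closing tags and the <table> wrapper are assumptions).
input=$(mktemp)
cat > "$input" <<'ENDHTML'
<table>
<tr><td>Some col</td></tr>
<tr class="expandable">
    <td colspan="6">
     <div class="expandable-content">
<p>Holds ACCA Practising Certificate</p>
     </div>
    </td>
</tr>
</table>
ENDHTML

if command -v xmlstarlet >/dev/null 2>&1; then
    # ed = edit mode, -d = delete every node matched by the XPath expression.
    cleaned=$(xmlstarlet ed -d '//tr[@class="expandable"]' "$input")
    printf '%s\n' "$cleaned"
else
    echo "xmlstarlet is not installed; nothing to demonstrate" >&2
fi
rm -f "$input"
```

The predicate `@class="expandable"` only matches rows whose `class` attribute is exactly `expandable`. For rows carrying several classes, an XPath such as `//tr[contains(concat(" ", normalize-space(@class), " "), " expandable ")]` may be a closer fit.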