Limit text to a certain number of characters, ignoring HTML tags/attributes

I have a block of text like this:

<p class="post">Lorem ipsum dolor sit amet, <a href="http://website.com/link" target="_blank" title="hello">consectetur adipiscing elit</a>. Pellentesque vehicula tortor eget tortor fermentum bibendum. Duis mollis nisl et metus vulputate, a aliquam quam pharetra. <a href="http://website.com/link" target="_blank" title="hello">consectetur adipiscing elit</a> quis hendrerit nibh ultrices eget. <span class="highlight">Praesent</span> eu mollis lectus, sed convallis quam.</p>

I want to truncate the text after 100 characters. With a plain text string, I would use something like:

var new_string = text_string.substring(0,100);

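But I need to count characters so that the text is truncated after 100 visible characters, taking the links and other HTML in the text into account, rather than after 100 characters of the HTML itself, and the HTML tags in the text must be preserved.

Note: I can't leave any HTML tag open, so I need either to avoid cutting the text before a closing tag, or to truncate the text and then add the correct closing tags.

Is this possible?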

2016-12-16 John

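You can walk the nodes in document order, and whenever you reach a text node, check how many characters it has. Keep a running total; when you reach the node that takes you past the maximum, truncate there, then empty every subsequent text node. – 2016-12-16 22:13:16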
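The traversal described in this comment could look roughly like the sketch below. The function name `truncateTextNodes` is made up for illustration, and the code only assumes that each node exposes `nodeType`, `childNodes` and `nodeValue`, as DOM nodes do. It does not remove elements that end up empty.

```javascript
// Sketch: truncate the visible text under `root` to `limit` characters.
// Works on anything shaped like a DOM node (nodeType, childNodes, nodeValue).
function truncateTextNodes(root, limit) {
  var remaining = limit;
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers

  function walk(node) {
    if (node.nodeType === TEXT_NODE) {
      var text = node.nodeValue;
      if (text.length > remaining) {
        // Cut the node that crosses the limit; once `remaining` is 0,
        // every later text node is cut down to an empty string.
        node.nodeValue = text.slice(0, remaining);
      }
      remaining -= node.nodeValue.length;
      return;
    }
    // Visit the children in document order.
    for (var i = 0; i < node.childNodes.length; i++) {
      walk(node.childNodes[i]);
    }
  }

  walk(root);
  return root;
}
```

In a browser, one way to use it would be to create a `div` with `document.createElement('div')`, assign the markup to its `innerHTML`, call `truncateTextNodes(div, 100)`, and read `div.innerHTML` back. Because the browser serializes the tree, every tag in the result is properly closed.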

You could run a regular expression to find all the text between > and <. – Alon

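Do you want to strip the HTML, or truncate the text and leave the HTML in place? This is usually done after stripping the HTML, because counting only the text while still ending up with valid HTML, without a bunch of empty tags or formatting that could blow up the layout, is not easy. –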

Strip all the HTML tags from the string with a regular expression, then take the substring:

var new_string = text_string.replace(/<[^>]*>/g, "").substring(0,100);

[UPDATE] I have now read that the HTML has to be preserved. The only solution I can think of is this:

var tagPattern = /(<[^>]*>)/; // no "g" flag, so test() keeps no state between calls
var limit = 100;
var counter = 0;

// split the string into an array, using the HTML tags as delimiters and keeping them as array elements
var strArray = str.split(tagPattern);

for (var i = 0, len = strArray.length; i < len; i++) {
    // skip the array elements that are HTML tags
    if (tagPattern.test(strArray[i])) {
        continue;
    }
    // once the limit is reached, blank out every later text element
    // (blanking instead of splicing keeps the indexes and the cached length valid)
    if (counter >= limit) {
        strArray[i] = '';
        continue;
    }
    // otherwise add this element's length to the counter
    counter += strArray[i].length;
    // if that goes over the limit, trim this element so the total is exactly the limit
    if (counter > limit) {
        strArray[i] = strArray[i].slice(0, limit - counter);
        counter = limit;
    }
}

// build the new string from the array
var new_string = strArray.join('');

// remove the element pairs left empty by the truncation, innermost first;
// the backreference makes sure the closing tag matches the opening one
var emptyPair = /<([a-z][a-z0-9]*)\b[^>]*><\/\1>/gi;
var previous;
do {
    previous = new_string;
    new_string = new_string.replace(emptyPair, '');
} while (new_string !== previous);

Live example on CodePen

2016-12-16 22:28:57 Davebra

Thanks for your reply. The problem is that I need to keep any HTML tags in the text, not just strip them and truncate the text. – John

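Sorry, I hadn't read that. The only solution I can think of is to split the string into an array, using a regular expression for HTML tags as the "splitter", then run a for loop with a counter variable that counts characters only in the elements containing text, and once the counter reaches 100, either break or remove the remaining text elements. I have posted the code with comments. – Davebra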

This is exactly what I needed, and it looks perfect! You're the man! Thank you very much, much appreciated! – John

One way to do it:

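var html = 'YOUR HTML STRING';
// a detached element is enough to let the browser parse the markup
var elt = document.createElement('div');
elt.innerHTML = html;
// textContent is the text with every tag dropped
var text = elt.textContent;
var result = text.substring(0, 100);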

2016-12-16 22:21:01 IAmDranged

Thanks for your reply. The problem is that I need to keep any HTML tags in the text, not just strip them and truncate the text. – John

If str is your string, you can get all of its text this way:

var str = '<p class="post">Lorem ipsum dolor sit amet, <a href="http://website.com/link" target="_blank" title="hello">consectetur adipiscing elit</a>. Pellentesque vehicula tortor eget tortor fermentum bibendum. Duis mollis nisl et metus vulputate, a aliquam quam pharetra. <a href="http://website.com/link" target="_blank" title="hello">consectetur adipiscing elit</a> quis hendrerit nibh ultrices eget. <span class="highlight">Praesent</span> eu mollis lectus, sed convallis quam.</p>' 
 
var allTheText = str.replace(/<[^>]*>/g,"") 
 
console.log(allTheText.length)

2016-12-16 22:21:40 Alon

Thanks for your reply. The problem is that I need to keep any HTML tags in the text, not just strip them and truncate the text. – John

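@john You can take the length of allTheText, find the last character you want to keep, locate that position in the original string, and remove everything after it. – Alon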
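Locating that position in the original string could be sketched as follows. The function name `htmlIndexForVisibleCount` is hypothetical, and the code assumes that `<` and `>` appear only as tag delimiters, so a `>` inside an attribute value would confuse it.

```javascript
// Sketch: find where to cut the original HTML so that exactly
// `visibleCount` text characters are kept (tag characters are not counted).
function htmlIndexForVisibleCount(html, visibleCount) {
  var seen = 0;
  var inTag = false;
  for (var i = 0; i < html.length; i++) {
    var ch = html.charAt(i);
    if (inTag) {
      if (ch === '>') inTag = false; // the tag ends, resume counting
    } else if (ch === '<') {
      inTag = true; // a tag starts, stop counting
    } else {
      if (seen === visibleCount) return i; // index of the first character to drop
      seen++;
    }
  }
  return html.length; // the text is shorter than the limit
}

var sample = '<p>Hello <b>big</b> world</p>';
var cut = htmlIndexForVisibleCount(sample, 8);
console.log(sample.slice(0, cut)); // prints "<p>Hello <b>bi"
```

As the printed result shows, cutting at that index leaves `<b>` and `<p>` open, so the closing tags would still have to be added afterwards to meet the requirement in the question.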
