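I need to parse a batch of C-style strings (about 500k), each containing 4 floating-point numbers separated by a single space character. Here is an example of one string:
"90292 5879 89042.2576 5879"
I need to store these numbers in two structs representing two points. Given that the strings may be modified during parsing, and that 99.99% of the numbers are just unsigned integers, what is the fastest way to do this?
Here is my current implementation: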
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdlib>  // atof
#include <cstring>  // strlen
#include <iostream>
#include <string>
#include <vector>
using namespace std;
using namespace chrono;
struct PointF
{
    float x;
    float y;
};
// Parses "x1 y1 x2 y2" in place: each separating space is overwritten with '\0'
// so that atof sees one number at a time.
void parse_points(char* points, PointF& p1, PointF& p2)
{
    auto start = points;
    const auto end = start + strlen(points);

    // p1.x
    start = std::find(start, end, ' ');
    assert(start < end);
    *start = '\0';
    p1.x = static_cast<float>(atof(points));
    points = start + 1;

    // p1.y (start sits on the '\0' just written, so find moves on to the next space)
    start = std::find(start, end, ' ');
    assert(start < end);
    *start = '\0';
    p1.y = static_cast<float>(atof(points));
    points = start + 1;

    // p2.x
    start = std::find(start, end, ' ');
    assert(start < end);
    *start = '\0';
    p2.x = static_cast<float>(atof(points));
    points = start + 1;

    // p2.y: last field, so no further space is expected
    start = std::find(start, end, ' ');
    assert(start == end);
    p2.y = static_cast<float>(atof(points));
}
int main()
{
    const auto n = 500000;
    char points_str[] = "90292 5879 89042.2576 5879";
    PointF p1, p2;

    // Build n copies of the sample string to parse.
    vector<string> data(n);
    for (auto& s : data)
        s.assign(points_str);

    // steady_clock is monotonic, which makes it the safer choice for timing.
    const auto t0 = steady_clock::now();
    for (auto i = 0; i < n; i++)
        parse_points(&data[i][0], p1, p2);  // writable buffer; writing through c_str() is undefined behavior
    const auto t1 = steady_clock::now();

    const auto elapsed = duration_cast<milliseconds>(t1 - t0).count();
    cout << "Elapsed: " << elapsed << " ms" << endl;
    cin.get();
    return 0;
}
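One direction the "99.99% are unsigned integers" observation suggests is a hand-rolled fast path: accumulate plain digit runs directly and hand anything else to `strtof`. The following is only a minimal sketch of that idea, not a measured solution. The names `parse_number` and `parse_points_fast` are hypothetical, and it assumes a 32-bit `unsigned` and that a field is either a short run of digits or something `strtof` understands.

```cpp
#include <cstdlib>

struct PointF
{
    float x;
    float y;
};

// Hypothetical helper: parses one number starting at s and advances s past it.
// Fast path: up to 9 decimal digits are accumulated as an unsigned integer
// (9 digits cannot overflow a 32-bit unsigned).
// Slow path: anything else (sign, '.', exponent, longer digit run) goes to strtof.
inline float parse_number(const char*& s)
{
    const char* p = s;
    unsigned value = 0;
    int digits = 0;
    while (*p >= '0' && *p <= '9' && digits < 9)
    {
        value = value * 10 + static_cast<unsigned>(*p - '0');
        ++p;
        ++digits;
    }
    // Accept the fast path only if the digit run is the whole field.
    if (digits > 0 && (*p == ' ' || *p == '\0'))
    {
        s = p;
        return static_cast<float>(value);  // exact for values up to 2^24
    }
    char* end = nullptr;
    const float result = std::strtof(s, &end);
    s = end;  // equals the old s if strtof could not parse anything
    return result;
}

// Hypothetical replacement for parse_points: reads four space-separated numbers
// without modifying the input. Returns false if the layout does not match.
bool parse_points_fast(const char* s, PointF& p1, PointF& p2)
{
    float* const out[4] = { &p1.x, &p1.y, &p2.x, &p2.y };
    for (int i = 0; i < 4; ++i)
    {
        const char* before = s;
        *out[i] = parse_number(s);
        if (s == before)
            return false;  // nothing was parsed
        if (i < 3)
        {
            if (*s != ' ')
                return false;  // expected a separator
            ++s;
        }
    }
    return *s == '\0';  // no trailing characters allowed
}
```

In the timing loop this would be called as `parse_points_fast(data[i].c_str(), p1, p2)`, with no `const_cast` needed since the string is left untouched. Whether it actually beats the `atof` version has to be measured on the real data.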
I'd guess `boost::lexical_cast` is faster than `atof`. –
@sorosh_sabz It's actually more than 8 times slower. – Nick
There are too many parsing questions already; the least you could do is search first. Try this: ["stackoverflow c++ read file space separated float"](https://www.google.com/search?q=stackoverflow+c%2B%2B+read+file+space+separated+float&ie=utf-8&oe=utf-8) –