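These are the two earlier posts:
This post compares the speed of Eigen on its own with Eigen linked against OpenBLAS or MKL, and uses Python's NumPy as a further reference. Matrix inversion is the test case.
The main conclusions are:
- Without an external high-performance linear algebra library, Eigen is slow in this test, and extra cores bring little speedup. Note that the compile commands used here pass neither an optimization flag (such as -O2 or -O3) nor -fopenmp, and Eigen depends on both for speed and for multithreading, so these numbers describe an unoptimized build.
- Linking OpenBLAS or MKL makes Eigen markedly faster.
- MKL is faster than OpenBLAS, except at the smallest matrix sizes.
- Even with OpenBLAS or MKL, Eigen's matrix inversion is still slower than Python's np.linalg.inv(). The explanation offered here is that NumPy calls the MKL library and is also better optimized at the interface level.
For the larger matrices, the ranking from slowest to fastest is: Eigen < Eigen (OpenBLAS) < Eigen (MKL) < NumPy (MKL).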
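The ranking can be checked directly against the numbers reported later in this post. The sketch below uses only the single-core 5000x5000 timings copied from those results; the helper names `rank_slowest_to_fastest` and `speedup_over` are made up for this illustration.

```python
# Single-core time per inversion (seconds) for 5000x5000,
# copied from the results listed in this post.
single_core_5000 = {
    "Eigen": 2298.678,
    "Eigen (OpenBLAS)": 15.269,
    "Eigen (MKL)": 7.359,
    "NumPy": 5.558,
}


def rank_slowest_to_fastest(timings):
    """Return the labels ordered from the largest time to the smallest."""
    return sorted(timings, key=timings.get, reverse=True)


def speedup_over(timings, baseline):
    """Return how many times faster each entry is than the baseline entry."""
    base = timings[baseline]
    return {name: base / t for name, t in timings.items()}


if __name__ == "__main__":
    print(" < ".join(rank_slowest_to_fastest(single_core_5000)))
    for name, factor in speedup_over(single_core_5000, "Eigen").items():
        print(f"{name:18s} {factor:8.1f}x the speed of plain Eigen")
```

At this size the gap between plain Eigen and the library-backed builds is more than two orders of magnitude, while the three library-backed variants are within a factor of about three of each other.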
C++ test code, a.cpp:
#define EIGEN_USE_BLAS  // comment or uncomment this line to test
// #define EIGEN_USE_MKL_ALL  // when using MKL, prefer EIGEN_USE_MKL_ALL
#include <iostream>
#include <chrono>
#include <vector>
#include <iomanip>
#include <Eigen/Dense>

int main() {
    std::vector<int> sizes = {100, 200, 300, 500, 1000, 2000, 3000, 5000};  // matrix sizes to test
    const int trials = 3;  // number of runs per size

    for (int size : sizes) {
        std::cout << "Testing size: " << size << "x" << size << std::endl;

        Eigen::MatrixXd A = Eigen::MatrixXd::Random(size, size);
        A = A.transpose() * A + Eigen::MatrixXd::Identity(size, size);  // make sure the matrix is invertible

        // Time `trials` inversions together and report the average.
        auto start = std::chrono::high_resolution_clock::now();
        for (int i = 0; i < trials; ++i) {
            Eigen::MatrixXd A_inv = A.inverse();
        }
        auto end = std::chrono::high_resolution_clock::now();

        auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
        std::cout << "Average time per inversion: "
                  << std::fixed << std::setprecision(3)
                  << (static_cast<double>(duration.count()) / 1000 / trials)
                  << " s" << std::endl;
        std::cout << "----------------------------------" << std::endl;
    }
    return 0;
}
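Compile command without an external high-performance linear algebra library (with the EIGEN_USE_BLAS line commented out, since otherwise the BLAS symbols cannot be resolved at link time):
g++ -std=c++14 a.cpp
Compile command when linking OpenBLAS:
g++ -std=c++14 a.cpp -lopenblas
Compile command when linking MKL:
g++ -std=c++14 a.cpp -lmkl_rt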
Python test a.py, which uses np.linalg.inv() for matrix inversion, for comparison:
import numpy as np
import time

sizes = [100, 200, 300, 500, 1000, 2000, 3000, 5000]  # matrix sizes to test
trials = 3  # number of runs per size

for size in sizes:
    print(f"Testing size: {size}x{size}")
    A = np.random.rand(size, size)
    A = A.T @ A + np.eye(size)  # make sure the matrix is invertible
    start = time.time()
    for _ in range(trials):
        A_inv = np.linalg.inv(A)
    end = time.time()
    duration = end - start
    print(f"Average time per inversion: {duration/trials:.3f} s")
    print("----------------------------------")
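Both test programs time the very first call together with the rest, and a few of the 100x100 results reported in this post are slower than the 200x200 or 300x300 ones, which may point to a one-time start-up cost in the library. A minimal sketch of a timing helper that discards warm-up calls is shown here. The name `benchmark` and its parameters are hypothetical, and `time.perf_counter()` is swapped in for `time.time()` because it is the standard-library clock intended for measuring short intervals.

```python
import time


def benchmark(fn, trials=3, warmup=1):
    """Call fn() `warmup` times untimed, then time `trials` calls.

    Returns (mean_seconds, best_seconds) over the timed calls.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return sum(samples) / len(samples), min(samples)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without NumPy;
    # in a.py this would be `lambda: np.linalg.inv(A)`.
    mean_s, best_s = benchmark(lambda: sum(i * i for i in range(100_000)))
    print(f"mean {mean_s:.4f} s, best {best_s:.4f} s")
```

Reporting the best time next to the mean makes it easier to see whether a single slow run is skewing the average.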
The single-core results follow.
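The post does not say how the number of cores was limited for each run. One common approach, shown here purely as an assumption about a possible setup, is to set the thread-count environment variables read by OpenMP, OpenBLAS and MKL before launching the program. The helper names `env_with_threads` and `run_with_threads` are hypothetical.

```python
import os
import subprocess
import sys

# Environment variables read by OpenMP runtimes, OpenBLAS and MKL respectively.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def env_with_threads(n, base=None):
    """Return a copy of the environment with every thread variable set to n."""
    env = dict(os.environ if base is None else base)
    for var in THREAD_VARS:
        env[var] = str(n)
    return env


def run_with_threads(cmd, n):
    """Run cmd (a list of arguments) with the thread count limited to n."""
    return subprocess.run(cmd, env=env_with_threads(n), check=True)


if __name__ == "__main__":
    # Demonstration: the child process sees the value that was set.
    # For the tests in this post, cmd would be ["./a.out"] or [sys.executable, "a.py"].
    child = "import os; print(os.environ['OMP_NUM_THREADS'])"
    run_with_threads([sys.executable, "-c", child], 1)
```

Plain Eigen only runs multithreaded when it is compiled with OpenMP enabled (for example with -fopenmp), so these variables would not be expected to change the result of a build made without that flag.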
Eigen without an external high-performance linear algebra library (single core):
Testing size: 100x100
Average time per inversion: 0.022 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.152 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.483 s
----------------------------------
Testing size: 500x500
Average time per inversion: 2.456 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 18.990 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 149.247 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 499.806 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 2298.678 s
----------------------------------
Eigen linked against OpenBLAS (single core):
Testing size: 100x100
Average time per inversion: 0.003 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.016 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.023 s
----------------------------------
Testing size: 500x500
Average time per inversion: 0.068 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 0.297 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 1.673 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 4.302 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 15.269 s
----------------------------------
Eigen linked against MKL (single core):
Testing size: 100x100
Average time per inversion: 0.077 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.005 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.009 s
----------------------------------
Testing size: 500x500
Average time per inversion: 0.027 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 0.140 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 0.820 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 2.069 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 7.359 s
----------------------------------
np.linalg.inv() in Python (single core):
Testing size: 100x100
Average time per inversion: 0.002 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.002 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.003 s
----------------------------------
Testing size: 500x500
Average time per inversion: 0.010 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 0.064 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 0.491 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 1.415 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 5.558 s
----------------------------------
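As a sanity check on the single-core numbers above: dense matrix inversion through LU factorization costs on the order of n^3 operations, so doubling n should multiply the time by roughly 8. The sketch below estimates the exponent from the 1000x1000 and 2000x2000 timings copied from the results above. The function name `scaling_exponent` is made up for this illustration.

```python
import math

# (time at n=1000, time at n=2000) in seconds, single core,
# copied from the results above.
single_core = {
    "Eigen": (18.990, 149.247),
    "Eigen (OpenBLAS)": (0.297, 1.673),
    "Eigen (MKL)": (0.140, 0.820),
    "NumPy": (0.064, 0.491),
}


def scaling_exponent(n1, t1, n2, t2):
    """Estimate p in t ~ n**p from two (size, time) measurements."""
    return math.log(t2 / t1) / math.log(n2 / n1)


if __name__ == "__main__":
    for name, (t1000, t2000) in single_core.items():
        p = scaling_exponent(1000, t1000, 2000, t2000)
        print(f"{name:18s} exponent ~ {p:.2f}")
```

Plain Eigen comes out very close to 3, which suggests its timings are dominated by the arithmetic itself rather than by fixed overhead. Exponents below 3 for the library-backed runs would be consistent with the smaller size not yet reaching peak throughput, although this post has no measurement that confirms it.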
The eight-core results follow.
Eigen without an external high-performance linear algebra library (eight cores):
Testing size: 100x100
Average time per inversion: 0.027 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.175 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.565 s
----------------------------------
Testing size: 500x500
Average time per inversion: 2.224 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 17.355 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 135.106 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 454.176 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 2093.090 s
----------------------------------
Eigen linked against OpenBLAS (eight cores):
Testing size: 100x100
Average time per inversion: 0.003 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.013 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.022 s
----------------------------------
Testing size: 500x500
Average time per inversion: 0.060 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 0.229 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 1.135 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 2.610 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 7.531 s
----------------------------------
Eigen linked against MKL (eight cores):
Testing size: 100x100
Average time per inversion: 0.062 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.020 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.010 s
----------------------------------
Testing size: 500x500
Average time per inversion: 0.038 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 0.132 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 0.529 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 1.215 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 3.517 s
----------------------------------
np.linalg.inv() in Python (eight cores):
Testing size: 100x100
Average time per inversion: 0.025 s
----------------------------------
Testing size: 200x200
Average time per inversion: 0.021 s
----------------------------------
Testing size: 300x300
Average time per inversion: 0.003 s
----------------------------------
Testing size: 500x500
Average time per inversion: 0.006 s
----------------------------------
Testing size: 1000x1000
Average time per inversion: 0.021 s
----------------------------------
Testing size: 2000x2000
Average time per inversion: 0.129 s
----------------------------------
Testing size: 3000x3000
Average time per inversion: 0.298 s
----------------------------------
Testing size: 5000x5000
Average time per inversion: 1.008 s
----------------------------------
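To put a number on how much the extra cores help, the single-core and eight-core timings for 5000x5000 can be divided. The sketch below copies those eight values from the results above; the function name `multicore_speedup` is made up for this illustration.

```python
# (single-core time, eight-core time) in seconds for 5000x5000,
# copied from the results above.
timings_5000 = {
    "Eigen": (2298.678, 2093.090),
    "Eigen (OpenBLAS)": (15.269, 7.531),
    "Eigen (MKL)": (7.359, 3.517),
    "NumPy": (5.558, 1.008),
}


def multicore_speedup(single, multi):
    """Return how many times faster the multi-core run is than the single-core run."""
    return single / multi


if __name__ == "__main__":
    for name, (t1, t8) in timings_5000.items():
        print(f"{name:18s} {multicore_speedup(t1, t8):5.2f}x on eight cores")
```

With these figures plain Eigen gains about 10%, the OpenBLAS and MKL builds roughly double in speed, and NumPy gains more than a factor of five, all well short of the ideal factor of eight.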
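[Note: this site mainly shares personal notes and code, and its content may be revised from time to time. So that the version seen across the web is always the latest one, please do not repost these articles without permission. When citing, please credit the source: https://www.guanjihuan.com]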