Parallelizing the bone matrix hierarchy calculation when computing final bone matrices

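ReadNodeHeirarchy finds the interpolated translation, rotation, and scale at a given timestamp. These values are computed every frame and then fed to the GPU for each bone's final transformation. Is there a way to optimize this code? Has anyone tried moving this calculation to the GPU?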

    void Mesh::ReadNodeHeirarchy(float AnimationTime, const aiNode* pNode, const Matrix4f& ParentTransform)
    {
        string NodeName(pNode->mName.data);

        const aiAnimation* pAnimation = m_pScene->mAnimations[0];

        Matrix4f NodeTransformation(pNode->mTransformation);

        const aiNodeAnim* pNodeAnim = FindNodeAnim(pAnimation, NodeName);

        if (pNodeAnim) {
            // Interpolate scaling and generate scaling transformation matrix
            aiVector3D Scaling;
            CalcInterpolatedScaling(Scaling, AnimationTime, pNodeAnim);
            Matrix4f ScalingM;
            ScalingM.InitScaleTransform(Scaling.x, Scaling.y, Scaling.z);

            // Interpolate rotation and generate rotation transformation matrix
            aiQuaternion RotationQ;
            CalcInterpolatedRotation(RotationQ, AnimationTime, pNodeAnim);
            Matrix4f RotationM = Matrix4f(RotationQ.GetMatrix());

            // Interpolate translation and generate translation transformation matrix
            aiVector3D Translation;
            CalcInterpolatedPosition(Translation, AnimationTime, pNodeAnim);
            Matrix4f TranslationM;
            TranslationM.InitTranslationTransform(Translation.x, Translation.y, Translation.z);

            // Combine the above transformations
            NodeTransformation = TranslationM * RotationM * ScalingM;
        }

        Matrix4f GlobalTransformation = ParentTransform * NodeTransformation;

        if (m_BoneMapping.find(NodeName) != m_BoneMapping.end()) {
            uint BoneIndex = m_BoneMapping[NodeName];
            m_BoneInfo[BoneIndex].FinalTransformation = m_GlobalInverseTransform * GlobalTransformation *
                                                        m_BoneInfo[BoneIndex].BoneOffset;
        }

        for (uint i = 0; i < pNode->mNumChildren; i++) {
            ReadNodeHeirarchy(AnimationTime, pNode->mChildren[i], GlobalTransformation);
        }
    }

Asked 2016-09-14 by surya

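Your code is recursive, and GPUs are very poor at running recursive code or branch-heavy loops. You would have to restructure your logic so that the GPU can accept it and deliver better performance.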
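One common way to restructure a recursive traversal like this is to flatten the node tree into an array in which every parent is stored before its children, so the global matrices come out of a single linear loop. The following is only a minimal sketch of that idea, not something taken from the question or the answer: `Mat4`, `FlatNode`, `Translation`, and `ComputeGlobals` are hypothetical names, `Mat4` is a plain row-major stand-in for the question's `Matrix4f`, and building the flat array from the `aiNode` tree is assumed to happen once at load time and is not shown.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Row-major 4x4 matrix; a hypothetical stand-in for Matrix4f in the question.
using Mat4 = std::array<float, 16>;

// Plain triple-loop product r = a * b.
Mat4 Multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};  // zero-initialized
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            for (int k = 0; k < 4; ++k)
                r[row * 4 + col] += a[row * 4 + k] * b[k * 4 + col];
    return r;
}

// Helper used only to build example data: identity plus a translation column.
Mat4 Translation(float x, float y, float z) {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    m[3] = x;
    m[7] = y;
    m[11] = z;
    return m;
}

// Hypothetical flattened node. `parent` indexes into the same array and is -1
// for the root. Assumption: every parent is stored before its children.
struct FlatNode {
    int parent;
    Mat4 local;  // this frame's interpolated T * R * S (or the static node transform)
};

// Replaces the recursion with one forward pass: because parents come first,
// globals[parent] is always ready by the time a child is processed.
std::vector<Mat4> ComputeGlobals(const std::vector<FlatNode>& nodes) {
    std::vector<Mat4> globals(nodes.size());
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        const FlatNode& n = nodes[i];
        globals[i] = (n.parent < 0) ? n.local
                                    : Multiply(globals[n.parent], n.local);
    }
    return globals;
}
```

With this layout, the per-node interpolation that fills `local` has no dependency between nodes and could be spread across threads, while the forward pass itself still depends on parent results and stays sequential along each parent-to-child chain.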

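You are actually mixing up two different problems here.

1) Computing the global matrix of every node, which has to recurse through the scene graph. This is a CPU problem, not a GPU problem, because recursion is the best way to do it.

2) The actual interpolation and vector math. This can be sped up with SIMD-optimized code, which can give up to a 4x speedup on vector math, since a 128-bit SIMD register processes four floats at once.

For your problem, I would suggest SIMD optimization.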
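To illustrate what SIMD optimization means for the matrix products in this kind of code, here is a minimal sketch of a 4x4 multiply that uses SSE intrinsics when they are available and falls back to scalar code otherwise. It is not the answerer's implementation: the `Mat4` struct, `Identity`, and `Multiply` are hypothetical names, the column-major layout is an assumption, and on iOS or Android hardware the equivalent would be written with ARM NEON intrinsics rather than SSE.

```cpp
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define SKETCH_USE_SSE 1
#endif

// Column-major 4x4 matrix, 16-byte aligned so columns can be loaded directly.
struct alignas(16) Mat4 {
    float m[16];
};

inline Mat4 Identity() {
    Mat4 r = {};
    r.m[0] = r.m[5] = r.m[10] = r.m[15] = 1.0f;
    return r;
}

// out = a * b. Each result column is a linear combination of a's columns,
// weighted by the entries of the matching column of b.
inline void Multiply(const Mat4& a, const Mat4& b, Mat4& out) {
#if defined(SKETCH_USE_SSE)
    // Load a's four columns once; each register holds four floats.
    const __m128 a0 = _mm_load_ps(a.m + 0);
    const __m128 a1 = _mm_load_ps(a.m + 4);
    const __m128 a2 = _mm_load_ps(a.m + 8);
    const __m128 a3 = _mm_load_ps(a.m + 12);
    for (int col = 0; col < 4; ++col) {
        const float* bc = b.m + col * 4;
        __m128 r = _mm_mul_ps(a0, _mm_set1_ps(bc[0]));
        r = _mm_add_ps(r, _mm_mul_ps(a1, _mm_set1_ps(bc[1])));
        r = _mm_add_ps(r, _mm_mul_ps(a2, _mm_set1_ps(bc[2])));
        r = _mm_add_ps(r, _mm_mul_ps(a3, _mm_set1_ps(bc[3])));
        _mm_store_ps(out.m + col * 4, r);
    }
#else
    // Scalar fallback; the temporary keeps the result correct if out aliases a or b.
    Mat4 t = {};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            for (int k = 0; k < 4; ++k)
                t.m[col * 4 + row] += a.m[k * 4 + row] * b.m[col * 4 + k];
    out = t;
#endif
}
```

The actual gain depends on the compiler, the target CPU, and how much of the frame time is really spent in matrix math, so it is worth profiling before and after.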

Note: I wrote something very similar to what you posted here, using Assimp and OpenGL ES for iOS and Android.

Answered 2016-09-14 12:19:47 by codetiger

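Are you using any third-party library for the SIMD operations? I use glm for matrix multiplication, which appears to be SIMD-optimized. – surya

GLM should be fine for SIMD, but I use my own SIMD-optimized version, derived from Apple's GLK (GLKit) library, which is much faster than GLM. To render the skinned mesh I use GPU skinning, meaning I do all the interpolation on the CPU and send the global matrices of all bones in a uniform array. – codetiger

Yes, I do the same thing, but I send them as a texture, because low-end Qualcomm chips only have 240 uniforms. – surya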
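The texture approach mentioned in the last comment can be sketched on the CPU side as packing each bone matrix into four RGBA float texels. This is a guess at the general shape of that technique, not surya's actual code: `Mat4`, `PackBoneTexture`, and `TexelIndex` are hypothetical names, and the one-row, column-major layout is an assumption. If the limit of 240 refers to vec4 uniform slots, a uniform array could hold at most 60 full 4x4 matrices, which is the kind of ceiling a texture avoids.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Column-major 4x4 matrix; a hypothetical stand-in for the engine's matrix type.
using Mat4 = std::array<float, 16>;

// One RGBA float texel holds one matrix column, so each bone takes four texels.
constexpr std::size_t kTexelsPerBone = 4;

// Copies all bone matrices back to back into a float buffer laid out as a
// one-row RGBA texture of width bones.size() * kTexelsPerBone.
std::vector<float> PackBoneTexture(const std::vector<Mat4>& bones) {
    std::vector<float> texels;
    texels.reserve(bones.size() * 16);
    for (const Mat4& m : bones)
        texels.insert(texels.end(), m.begin(), m.end());
    return texels;
}

// Texel x-coordinate at which column `column` of bone `boneIndex` is stored.
constexpr std::size_t TexelIndex(std::size_t boneIndex, std::size_t column) {
    return boneIndex * kTexelsPerBone + column;
}
```

The resulting buffer would then be uploaded each frame as a floating-point RGBA texture and read in the vertex shader by bone index. Whether float textures and vertex texture fetch are available depends on the device and OpenGL ES version, so that part has to be checked per target.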
