

The GPU will not rasterize a primitive that falls entirely outside the viewport. It also avoids processing hidden fragments when the depth test or stencil test fails (P4). However, this does not mean that the GPU should do all the work of deciding what is visible. Vertices must be loaded, assembled, and processed by the vertex shader before the GPU can decide whether to cull or clip parts of the geometry, so that cost is paid even for geometry that is never drawn.

In contrast to an array of structures, a structure of arrays stores each vertex attribute in a separate buffer, using the same offset for each attribute and a stride of zero. This layout forces the GPU to jump around and fetch from different memory locations as it assembles the attributes needed for each vertex. The structure of arrays layout is therefore less efficient than an array of structures in most cases. The only time to consider it is when one or more attributes must be updated dynamically, because strided writes into an array of structures can be expensive relative to the number of bytes modified. In this scenario, the recommendation is to partition the attributes so that constant attributes can be read sequentially and dynamic attributes can be written sequentially. Attributes that remain constant should be stored in an array of structures. Attributes that are updated dynamically should be stored in smaller, separate buffer objects (or perhaps a single buffer if they are all updated with the same frequency).
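As a minimal sketch of partitioning constant and dynamic attributes, assume a hypothetical vertex format in which position and texture coordinates never change while a per-vertex color is rewritten regularly. The names `STATIC_VERTEX`, `DYNAMIC_COLOR`, `build_buffers`, and `update_colors` are illustrative only, and plain byte arrays stand in for GPU buffer objects; no graphics API is called.

```python
import struct

# Hypothetical vertex format, chosen only for illustration.
# Constant attributes are interleaved (array of structures):
#   position: 3 floats at offset 0, texcoord: 2 floats at offset 12.
STATIC_VERTEX = struct.Struct("<3f2f")  # 20 bytes per vertex
# The dynamic attribute lives in its own tightly packed buffer:
#   color: 4 unsigned bytes per vertex.
DYNAMIC_COLOR = struct.Struct("<4B")    # 4 bytes per vertex


def build_buffers(positions, texcoords, colors):
    """Pack constant attributes interleaved and the dynamic attribute separately."""
    static_buf = bytearray(STATIC_VERTEX.size * len(positions))
    dynamic_buf = bytearray(DYNAMIC_COLOR.size * len(colors))
    for i, (pos, uv) in enumerate(zip(positions, texcoords)):
        STATIC_VERTEX.pack_into(static_buf, i * STATIC_VERTEX.size, *pos, *uv)
    for i, color in enumerate(colors):
        DYNAMIC_COLOR.pack_into(dynamic_buf, i * DYNAMIC_COLOR.size, *color)
    return static_buf, dynamic_buf


def update_colors(dynamic_buf, colors):
    """Rewrite every color in one sequential write; the static buffer is untouched."""
    flat = [channel for color in colors for channel in color]
    struct.pack_into(f"<{len(flat)}B", dynamic_buf, 0, *flat)
```

Had the color been interleaved with the other attributes, each vertex would occupy 24 bytes and a color update would modify only 4 of them per vertex, spread across the whole buffer. With the split above, the same update is a single contiguous write to the small buffer, and the constant attributes can still be fetched sequentially.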
