Computer Science, asked by TbiaSamishta, 11 months ago

Consider a 1 GHz processor with a 32K cache and a cache line size of 16 bytes. If the memory latency (for 1 cache line) is 100 ns and the cache latency is 1 ns per word (4 bytes), compute the FLOPS for performing the dot product of two vectors of N elements, each a 32-bit floating-point number. You can assume that the madd operation takes 1 cycle to execute. Ignore the time to perform address increment operations and loop branching. Assume N is a large integer.

Answers

Answered by aqibkincsem

Two matrices are fetched into the cache, 2K words in total (2 × 32 × 32 for n = 32). At 100 ns per word, this takes about 200 µs (2000 × 100 ns).


Multiplying two n × n matrices takes 2n^3 operations, which for n = 32 is 64K operations.


These operations need 16K cycles (that is, four operations per cycle), which is 16 µs at 1 GHz. The total time is therefore 200 + 16 = 216 µs, for a computation rate of 64K FLOP / 216 µs ≈ 303 MFLOPS.
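
To check the arithmetic in this answer, here is a minimal Python sketch. The function name `rate_mflops` and its parameters are made up for illustration. It assumes, as the answer's numbers imply, 100 ns per word fetched and four operations per cycle; it models the matrix multiplication worked above, not a general cache simulation.

```python
def rate_mflops(words_fetched, flops, cycles, ns_per_word=100, clock_ghz=1.0):
    """Return the computation rate in MFLOPS (FLOP per microsecond).

    Assumptions (taken from the answer's numbers, not from a real machine):
    every word costs ns_per_word to fetch, and fetch time and compute time
    simply add up with no overlap.
    """
    fetch_us = words_fetched * ns_per_word / 1000    # ns -> microseconds
    compute_us = cycles / clock_ghz / 1000           # cycles at clock_ghz -> microseconds
    return flops / (fetch_us + compute_us)


# Rounded figures used in the answer: 2000 words, 64K FLOP, 16,000 cycles.
rounded = rate_mflops(words_fetched=2000, flops=64 * 1024, cycles=16_000)

# Unrounded figures for n = 32, assuming four operations per cycle.
n = 32
exact = rate_mflops(words_fetched=2 * n * n, flops=2 * n**3, cycles=2 * n**3 // 4)

print(f"rounded: {rounded:.1f} MFLOPS")  # about 303.4
print(f"exact:   {exact:.1f} MFLOPS")    # about 296.3
```

The 303 MFLOPS figure comes from mixing 64K = 65,536 operations with the rounded 216 µs; using unrounded values throughout gives roughly 296 MFLOPS.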
