Things to do

Upload: muhammad-enam-ul-haq

Post on 03-Apr-2018


  • 7/29/2019 Things to do


    COMP 2213 X2: Computer Architecture and Organization

    Assignment 3

    Due: Wednesday, March 20, 2013 at midnight

    Part 1: Caching

    1. [25 pts]

    In class, code for computing the sum of all cells in a 10,000 x 10,000 matrix of double-precision numbers was presented, and the performance of row-wise summation vs. column-wise summation was compared. For the row-wise summation, the following output was produced when the program was executed under the performance profiler on Linux:

    duane@laptop:~/mat_sum$ perf stat -e cycles:u -e instructions:u \
        -e L1-dcache-loads:u -e L1-dcache-load-misses:u \
        -e L1-icache-load-misses:u ./matrix_sum_rowwise
    Sum: 49997291.929030

    Performance counter stats for './matrix_sum_rowwise':

        3,884,859,754 cycles:u           # 0.000 GHz
        8,800,430,880 instructions:u     # 2.27 insns per cycle
        2,900,762,202 L1-dcache-loads
           12,569,154 L1-dcache-misses   # 0.43% of all L1-dcache hits
                8,246 L1-icache-misses

        1.304023634 seconds time elapsed

    Based on the number of L1 data cache misses (slightly over 12,500,000) while computing the sum of the matrix in a row-wise fashion, how many bytes are stored in each L1 cache entry of the processor on which this program was executed? Describe how you derive your answer.
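One possible line of reasoning can be sketched in Python. This is only a sketch under stated assumptions, not an official solution: it assumes each double occupies 8 bytes, that the row-wise traversal touches memory sequentially so each miss loads exactly one new cache entry, that nearly all reported misses come from reading the matrix, and that cache entry sizes are powers of two. The function name `estimate_line_size` is hypothetical.

```python
import math

ROWS = COLS = 10_000
BYTES_PER_DOUBLE = 8            # assumption: 8-byte double precision
L1_DCACHE_MISSES = 12_569_154   # taken from the perf output in the question


def estimate_line_size(rows, cols, elem_bytes, misses):
    """Estimate bytes per L1 cache entry from a sequential traversal.

    Assumes every miss brings in exactly one cache entry and that
    all misses are caused by reading the matrix itself.
    """
    total_bytes = rows * cols * elem_bytes
    bytes_per_miss = total_bytes / misses
    # Assumption: entry sizes are powers of two, so round on a log2 scale.
    nearest_pow2 = 2 ** round(math.log2(bytes_per_miss))
    return bytes_per_miss, nearest_pow2


bytes_per_miss, line_size = estimate_line_size(
    ROWS, COLS, BYTES_PER_DOUBLE, L1_DCACHE_MISSES
)
print(f"{bytes_per_miss:.2f} bytes fetched per miss")
print(f"nearest power of two: {line_size} bytes per cache entry")
```

Under these assumptions the matrix occupies 800,000,000 bytes, which works out to roughly 63.6 bytes per miss, so the estimate lands on a 64-byte cache entry (8 doubles per entry). The small shortfall from 64 is consistent with a few extra misses from sources other than the matrix.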

    2. [10 pts]

    The i7 processor has three levels of cache. Each processor has at least two cores (like CPUs within a CPU). Each core has its own small L1 cache, and a larger and slightly slower L2 cache. Then, there is an L3 cache shared by the whole processor, and the even larger L3 cache is connected to the memory system.

    What is the advantage to the L3 cache being shared by all cores, instead of each core having its own large L3 cache?

    Part 2: Virtual Memory

    1. [30 pts] Suppose that in an Intel system with a 3-level hierarchical page table, as described in section 3.1 of

    https://www.kernel.org/doc/gorman/html/understand/understand006.html ,

    each entry in a process' pgd table and in the pmd tables consumes 8 bytes, and each entry in the page tables (described as pte_t page frames at the above URL) consumes 16 bytes. Also assume that each table consumes 4K of memory.

    a. How many pages can be assigned by the system to a single program?

    b. Assuming 4K page sizes, how much memory (in gigabytes) can be assigned to a single program?
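The arithmetic behind parts a and b can be sketched as follows. This is an illustrative calculation under stated assumptions rather than an official solution: it assumes a process has exactly one pgd table, that every table is completely filled with entries, that "4K" means 4096 bytes, and that a gigabyte means 2^30 bytes. All variable names are hypothetical.

```python
TABLE_BYTES = 4 * 1024      # each table consumes 4K (assumed 4096 bytes)
PGD_ENTRY_BYTES = 8
PMD_ENTRY_BYTES = 8
PTE_ENTRY_BYTES = 16
PAGE_BYTES = 4 * 1024       # 4K pages, as stated in part b

# Number of entries that fit in one table at each level.
pgd_entries = TABLE_BYTES // PGD_ENTRY_BYTES   # pmd tables per pgd table
pmd_entries = TABLE_BYTES // PMD_ENTRY_BYTES   # page tables per pmd table
pte_entries = TABLE_BYTES // PTE_ENTRY_BYTES   # pages per page table

# Assumption: one pgd table per process, every table fully populated.
max_pages = pgd_entries * pmd_entries * pte_entries
max_gib = max_pages * PAGE_BYTES / 2**30

print(f"entries per table: pgd={pgd_entries}, pmd={pmd_entries}, pte={pte_entries}")
print(f"maximum pages: {max_pages:,}")
print(f"maximum memory: {max_gib:.0f} GiB")
```

With these assumptions the three levels hold 512, 512, and 256 entries, giving 2^26 pages and 2^38 bytes of addressable memory per program.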

    2. [35 pts] Assuming the structure as provided in question #1, if a program is assigned 3G of memory:

    a. What is the smallest number of pages assigned to this program?

    b. What is the smallest number of page tables (pte_t page frames) required for this program?

    c. What is the smallest number of pmd tables required for this program?

    d. What is the smallest number of pgd tables required for this program?

    e. Assuming the space overhead for memory management is the number of bytes required to store all the tables related to paging (i.e. your answers to b, c, and d), how much memory is consumed in memory management overhead for this program?
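A minimal sketch of the calculation for parts a through e follows. It is not an official solution and rests on several assumptions: "3G" is read as 3 x 2^30 bytes, pages are 4096 bytes, the program's pages are packed as densely as possible so that tables are filled before new ones are allocated, and a process has a single pgd table. The helper `ceil_div` and all variable names are hypothetical.

```python
TABLE_BYTES = 4 * 1024
PAGE_BYTES = 4 * 1024                 # assumption: 4K pages as in question 1
PMD_ENTRIES = TABLE_BYTES // 8        # 8-byte entries in a pmd table
PTE_ENTRIES = TABLE_BYTES // 16       # 16-byte entries in a page table
PROGRAM_BYTES = 3 * 2**30             # assumption: 3G means 3 GiB


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


pages = ceil_div(PROGRAM_BYTES, PAGE_BYTES)        # part a
page_tables = ceil_div(pages, PTE_ENTRIES)         # part b
pmd_tables = ceil_div(page_tables, PMD_ENTRIES)    # part c
pgd_tables = 1                                     # part d, one per process (assumed)

# Part e: every table occupies TABLE_BYTES regardless of how full it is.
overhead_bytes = (page_tables + pmd_tables + pgd_tables) * TABLE_BYTES

print(f"pages: {pages:,}")
print(f"page tables: {page_tables:,}")
print(f"pmd tables: {pmd_tables}")
print(f"pgd tables: {pgd_tables}")
print(f"overhead: {overhead_bytes:,} bytes")
```

Under these assumptions the overhead is a little over 12 MiB, a small fraction of the 3 GiB being mapped. A sparser layout of the same pages would need more tables, which is why the questions ask for the smallest counts.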