Testing HCN for PRAM

Michael Jones, Ganesh Gopalakrishnan

University of Utah, School of Computing

Outline

• Goals:

  – re-use an abstraction for branching topologies

  – combine test model checking and abstraction

• How HCN works

• What was verified and how

• Discussion

HCN

• Directory-based hierarchical caching network.

• Obeys sequential consistency (SC); PRAM is weaker than SC.

• Written by Arvind and Xiaowei Shen

HCN Model

[Figure: HCN topology, a hierarchy of memories (M) with processors (P)]

HCN Model

[Figure: HCN topology with memories M0, M1, and M2 labeled; messages shown: wr_req(a,2), ex-req(a)]

HCN Model

[Figure: HCN topology with memories M0, M1, and M2 labeled; messages shown: wr_req(a,2), ex-req(a), ex-req(a)]

Testing for PRAM

• Any 3 processors

• Located anywhere in any HCN network

• Sharing a single address

• Always satisfy PRAM

• Abstraction to cover all networks

• Test model check for PRAM with N=3.

Testing for PRAM

• # Procs sharing address: 3

• # Procs in system: arbitrary

• # Caches in system: arbitrary

• # Addresses being shared: 1

• # Addresses in system: arbitrary

• Property: memory model
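
To make the property concrete: PRAM (pipelined RAM) consistency requires that every processor observe the writes of any one processor in the order that processor issued them, while writes from different processors may be observed in different orders. The Python sketch below is an illustration only, not the test automaton used in the talk. It assumes each write carries a hypothetical (writer, sequence number) tag, which the slides do not describe, and it checks only the per-writer ordering condition; a full check would also have to account for a reader's own writes.

```python
from typing import Dict, List, Tuple

# A write is identified by (writer_id, seq), where seq follows the writer's
# program order. These tags are an assumption made for illustration.
WriteTag = Tuple[int, int]

def satisfies_pram(observed: Dict[int, List[WriteTag]]) -> bool:
    """observed[p] lists, in read order, the writes whose values p read."""
    for tags in observed.values():
        latest: Dict[int, int] = {}  # highest seq seen so far, per writer
        for writer, seq in tags:
            if seq < latest.get(writer, -1):
                return False  # an older write was seen after a newer one
            latest[writer] = seq
    return True

# Three processors sharing one address, as in the N=3 test.
ok = {
    0: [(1, 0), (2, 0), (1, 1)],
    1: [(2, 0), (1, 0), (1, 1)],  # a different interleaving of writers is allowed
    2: [(1, 0), (1, 1), (2, 0)],
}
bad = {0: [(1, 1), (1, 0)]}       # P0 sees P1's second write before its first

print(satisfies_pram(ok), satisfies_pram(bad))
```

The two sample histories differ only in whether a single writer's writes are ever seen out of order, which is the distinction the PRAM test is meant to detect.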

Abstraction Recipe

1. Throw away enough transactions and structure, and...

2. Merge enough structure to get a finite state model.

3. Add enough non-determinism to get the same behavior on the remaining observed state.

(Inspired by trace inclusion refinement)
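
The recipe can be illustrated on a toy example. The Python sketch below is an assumption-laden illustration, not the HCN abstraction itself: a request climbs a linear chain of n memories, the interior memories are merged into one abstract node, and the merged node gains a self-loop. That self-loop is the added non-determinism, since the request may stay in the merged node for any number of steps. The names `chain_edges`, `merge_middle`, and `abstract_edges` are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

State = Hashable
Edge = Tuple[State, State]

def abstract_edges(edges: Iterable[Edge], h: Callable[[State], State]) -> Set[Edge]:
    """Keep an abstract edge whenever some concrete edge maps onto it."""
    return {(h(s), h(t)) for s, t in edges}

def chain_edges(n: int) -> Set[Edge]:
    """Concrete model: a request moves up a chain of memories 0..n-1."""
    return {(i, i + 1) for i in range(n - 1)}

def merge_middle(n: int) -> Callable[[int], str]:
    """Map position 0 to 'leaf', position n-1 to 'root', the rest to 'mid'."""
    def h(i: int) -> str:
        if i == 0:
            return "leaf"
        if i == n - 1:
            return "root"
        return "mid"
    return h

for n in (3, 5, 8):
    print(n, sorted(abstract_edges(chain_edges(n), merge_middle(n))))
```

For every chain with at least four memories the abstract edge set is the same three edges, so one finite model stands in for chains of any length, and every concrete path maps to a path of the abstract model.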

Why the Recipe Works

For some class of protocols, a “nice amount” of non-determinism is required to capture all behaviors of the observed state in the reduced model

HCN Abstraction

[Figure: the memory hierarchy with M0, M1, and M2 labeled]

HCN Abstraction

[Figure: the memory hierarchy with M0, M1, and M2 labeled; leaf labels P, Q, P]

HCN Abstraction

[Figure: reduced hierarchy with M0 and M1 labeled; leaf labels P, Q, P]

Merging Linear State

[Figure: memories (M) joined by elided linear segments, shown as "..."]

HCN Abstraction

[Figure: reduced hierarchy with M0 and M1 labeled; leaf labels P, Q, P and P, P, Q]

|{Finite State Configs}| is Finite

[Figure: three configurations with leaves labeled P and Q]

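Modeling a TRS in Murphi

    Rule "receive wb rep and send sh rep"
      -- Guard: a sh_req is pending for addr, and the head of the input
      -- queue is a wb_rep for addr sent by the current writer.
      (trec[addr].req = sh_req & hd_in.opc = wb_rep &
       hd_in.addr = addr & state[addr] = ex_w &
       (current_writer(addr,m) = hd_in.src))
    ==>
    var rep_msg : tMsg;
    begin
      -- Forward the written-back data to the pending requester as a sh_rep.
      rep_msg.opc := sh_rep;
      rep_msg.src := m;
      rep_msg.dst := trec[addr].id;
      rep_msg.addr := addr;
      rep_msg.data := hd_in.data;
      enqueue (outq, rep_msg);
      state[addr] := ex_r;
      -- Record both the requester and the former writer in the directory.
      add_to_dir (addr, trec[addr].id, m, dir);
      add_to_dir (addr, hd_in.src, m, dir);
      -- Clear the pending request and consume the wb_rep.
      clearTrec (addr,trec);
      delete (inq, 0);
    end;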

receive-wb-rep-and-send-sh-rep

    <id, Cell(a,u,(Ex,W(idk)))|m, Msg(idk,id,Wbrep,a,v)+i, o, Trec(a,(idp,Sh-req))|t>

    →

    <id, Cell(a,v,(Ex,R(idk|idj)))|m, i, o+Msg(id,idj,sh-rep,a,v), t>
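
For readers unfamiliar with guarded-command notation, the same rule can be sketched in Python as a guard followed by an atomic update. This is a loose illustration, not the authors' model: the `Msg` and `Memory` classes, the separate `writer` map, and the behavior assumed for `current_writer`, `add_to_dir`, and `clearTrec` are all guesses made for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Set, Tuple

@dataclass
class Msg:
    opc: str
    src: int
    dst: int
    addr: str
    data: int

@dataclass
class Memory:  # hypothetical stand-in for one memory's state
    ident: int
    state: Dict[str, str] = field(default_factory=dict)           # addr -> "ex_w" / "ex_r"
    writer: Dict[str, int] = field(default_factory=dict)          # addr -> child with exclusive copy
    directory: Dict[str, Set[int]] = field(default_factory=dict)  # addr -> children with shared copies
    trec: Dict[str, Tuple[int, str]] = field(default_factory=dict)  # addr -> (requester, request)
    inq: Deque[Msg] = field(default_factory=deque)
    outq: Deque[Msg] = field(default_factory=deque)

def recv_wb_rep_send_sh_rep(m: Memory, addr: str) -> bool:
    """Fire the rule if its guard holds; return whether it fired."""
    if not m.inq or addr not in m.trec:
        return False
    hd_in = m.inq[0]
    requester, req = m.trec[addr]
    guard = (req == "sh_req" and hd_in.opc == "wb_rep" and hd_in.addr == addr
             and m.state.get(addr) == "ex_w" and m.writer.get(addr) == hd_in.src)
    if not guard:
        return False
    # Reply to the pending requester with the written-back data.
    m.outq.append(Msg("sh_rep", m.ident, requester, addr, hd_in.data))
    m.state[addr] = "ex_r"
    # Assumed effect of add_to_dir: requester and former writer become readers.
    m.directory.setdefault(addr, set()).update({requester, hd_in.src})
    m.writer.pop(addr, None)  # assumption: no exclusive writer remains
    del m.trec[addr]          # assumed effect of clearTrec
    m.inq.popleft()           # consume the wb_rep
    return True

m = Memory(ident=1, state={"a": "ex_w"}, writer={"a": 7}, trec={"a": (9, "sh_req")})
m.inq.append(Msg("wb_rep", 7, 1, "a", 2))
fired = recv_wb_rep_send_sh_rep(m, "a")
print(fired, m.state["a"], sorted(m.directory["a"]))
```

Calling the function a second time on the same memory returns False, because the pending request has been cleared and the input queue is empty.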

Testing for PRAM

[Figure: PRAM test with transitions labeled wr(A,0), wr(A,1), wr(A,2), rd(A,0), rd(A,1), and rd(A,-); two states labeled E; a box labeled Model Checker]

Inadvertently Seeded Error

Model Checking Results

[Figure: three abstract configurations with leaf labels P P Q, P Q P, and P P Q]

            States     CPU time (sec)
            110,995     87.57
            151,598     65.51
            618,874    282.40
    Total   881,467    435.48

Discussion

• At least one error in which topology matters.

• Abstraction carried over nicely to a non-PCI protocol.

• N=4 and 2 addresses: both too big.

  – only explore several million states per model

• Abstraction + test model checking = more general results.

Inadvertently Seeded Error

[Figure labels: read&miss, sh-req]

Inadvertently Seeded Error

[Figure labels: read&miss, sh-req; write&miss, ex-req]

Inadvertently Seeded Error

[Figure labels: read&miss, sh-req; write&miss, ex-req; write&miss, ex-req]

Inadvertently Seeded Error

[Figure labels: read&miss, sh-req; write&miss, ex-req; write&miss, ex-req; ex-req(2); 10 2]

Inadvertently Seeded Error

[Figure labels: read&miss, sh-req; write&miss, ex-req; write&miss, ex-rep; 10 2:0]

Inadvertently Seeded Error

[Figure labels: read&miss, sh-req; write&miss, ex-req; write&miss, wb-req; ex-req(1); 10 2:0]

Cache State Encoding

[Figure: a memory M holds cells (cell, cell, ..., cell); each cell has the fields Address, Value, and State, and State has the parts Cache and Home]

Cache State Encoding

[Figure: cells (cell, cell, ..., cell); fields Address, Value, and State (Cache, Home)]

“Cstate”: shared or exclusive with respect to siblings (“horizontal” state)

Sh = shared with siblings

Ex = has an exclusive copy

Cache State Encoding

[Figure: cells (cell, cell, ..., cell); fields Address, Value, and State (Cache, Home)]

“Hstate”: which children have cached the state and why (“vertical” state)

R(dir) = all children in dir have shared copies for reading

W(id) = the child id has an exclusive copy for writing
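
The cell encoding can be written down as a small data structure. The Python sketch below is one possible rendering, assuming a cell is the tuple (address, value, Cstate, Hstate) suggested by terms such as Cell(a,u,(Ex,W(idk))); the class and field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Union

class CState(Enum):
    SH = "Sh"  # shared with siblings
    EX = "Ex"  # has an exclusive copy

@dataclass(frozen=True)
class R:
    children: FrozenSet[str]  # children holding shared copies for reading

@dataclass(frozen=True)
class W:
    child: str                # child holding an exclusive copy for writing

HState = Union[R, W]

@dataclass(frozen=True)
class Cell:
    address: str
    value: int
    cstate: CState  # "horizontal" state
    hstate: HState  # "vertical" state

# The two cell terms of the receive-wb-rep-and-send-sh-rep rule,
# with illustrative values 0 and 2 standing in for u and v.
before = Cell("a", 0, CState.EX, W("idk"))
after = Cell("a", 2, CState.EX, R(frozenset({"idk", "idj"})))

print(before)
print(after)
```

Keeping Cstate and Hstate as separate fields mirrors the slides' split between state relative to siblings and state relative to children.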

HCN Model

[Figure: HCN topology with memories M0, M1, and M2 labeled]

M1 is a child of M0, so M1 is a cache for data in M0.

HCN Model

[Figure: HCN topology with memories M0, M1, and M2 labeled]

M1 is the parent of M2, so M1 is the home of data in M2.

HCN Model

[Figure: HCN topology, a hierarchy of memories (M) with processors (P)]

Innermost memories, or L1 caches.

HCN Model

[Figure: HCN topology, a hierarchy of memories (M) with processors (P)]

Outermost memory

HCN Model

[Figure: HCN topology with memories M0, M1, and M2 labeled; messages shown: wr_req(a,2), ex-req(a), ex-req(a), wb-req(a)]

