Department of Computer Science

A Static Program Analyzer to increase software reuse

Ramakrishnan Venkitaraman and

Gopal Gupta

Source: Data and Analysis Center for Software

Cost of software always on the rise

Why do we need a software standard?

Lack of software reuse because of lack of software standards

Non-availability of a rich set of COTS components

Time to market for new products is measured in years rather than months

Incompatibilities make integration of software from multiple vendors impossible

The discussion refers mainly to DSP software but the problems are comparable to any software development process

TI TMS320 DSP Algorithm Standard

Contains 34 rules and 15 guidelines

Intended to enable a rich COTS marketplace and significantly reduce the time to market for new products

Will allow system integrators to integrate compliant algorithms from multiple vendors into a single system

Reduces time to market, increases software quality and software reuse

General Programming Rules

No tool currently exists to check for compliance

Programs must be relocatable

No hard coded data memory locations

No hard coded program memory locations

Programs must be reusable

Algorithms must be re-entrant

Hard Coded Addresses

Generally a bad programming practice unless you are programming for device drivers

Results in non-relocatable code

Results in non-reusable code

A pointer variable is said to be NOT hard coded if

a) the address is derived from a call to a memory allocation routine such as “malloc” or “calloc”

b) the address is derived as a function of the “stack pointer”

c) the address is derived from another pointer that is legitimate.
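
The distinction above can be illustrated with a small C sketch. The function names and the address 0x8000 are hypothetical and chosen only for illustration; they do not come from the TI standard or from the analyzer itself.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hard coded: the address is a literal constant chosen by the programmer.
   0x8000 is an arbitrary illustrative value. The pointer is only formed
   here, never dereferenced. */
volatile int *hard_coded_pointer(void) {
    return (volatile int *)(uintptr_t)0x8000u;
}

/* (a) Legitimate: the address comes from a memory allocation routine. */
int *heap_pointer(void) {
    return malloc(sizeof(int));
}

/* (b) Legitimate: the address of a local variable is computed
   relative to the stack pointer by the compiler. */
int sum_via_stack(int a, int b) {
    int local[2] = {a, b};
    int *p = local;          /* stack-derived pointer */
    return p[0] + p[1];
}

/* (c) Legitimate: the address is derived from another pointer that
   is itself legitimate (the caller's responsibility in this sketch). */
int *second_element(int *base) {
    assert(base != NULL);
    return base + 1;
}
```

Only the first function would make an algorithm non-relocatable; the other three obtain their addresses from sources the slide lists as legitimate.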

Static Program Analysis

Static program analysis (or static analysis for brevity) is defined as any analysis of a program carried out without completely executing the program

The traditional data-flow analysis found in compiler back-ends is an example of static analysis

Another example of static analysis is abstract interpretation, in which a program's data and operations are approximated and the program abstractly executed

Basic Blocks and Flow Graphs

A “Basic Block” is a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halting or possibility of branching except at the end.

The basic blocks form the nodes in a directed graph called the “Control Flow-Graph”. This graph will help us to visualize and arrive at all possible paths through which program control could flow at runtime. All such paths must be analyzed for compliance.
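
The slides do not say how the basic blocks are found. One common way is the textbook "leader" method: the first instruction, every branch target, and every instruction following a branch each start a new block. The sketch below assumes a hypothetical, highly simplified instruction record (Insn) rather than real TMS320 disassembly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified instruction: only control-flow facts are kept. */
typedef struct {
    bool is_branch;  /* instruction may transfer control */
    int  target;     /* index of the branch target, or -1 if none/unknown */
} Insn;

/* Marks leader[i] = true for every instruction that starts a basic block
   and returns the number of basic blocks found. */
size_t mark_leaders(const Insn *code, size_t n, bool *leader) {
    assert(code != NULL && leader != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) leader[i] = false;
    if (n > 0) leader[0] = true;                 /* first instruction */
    for (size_t i = 0; i < n; i++) {
        if (!code[i].is_branch) continue;
        if (code[i].target >= 0 && (size_t)code[i].target < n)
            leader[code[i].target] = true;       /* branch target */
        if (i + 1 < n)
            leader[i + 1] = true;                /* instruction after a branch */
    }
    for (size_t i = 0; i < n; i++)
        if (leader[i]) count++;
    return count;
}
```

Each block then runs from one leader up to, but not including, the next leader; the edges of the flow graph follow the branch targets and fall-throughs.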

Overview of our approach

Input: Object Code of the algorithm

Output: Compliant / Not Compliant status

Activity Diagram for our Static Analyzer

Our Algorithm for Static Analysis

1) Get the disassembled code from the input object code

2) From the disassembled code, get the basic blocks and construct the flow-graph

3) Analyze the flow-graph and check for the dereferencing of pointer variables

4) For each such dereferencing, scan back and find out where the pointer got its value (this involves the formation of unsafe sets, which are explained later)

• If the original source of this pointer is hard coded, then declare that the algorithm is not compliant (“unsafe”)

• If the original source of this pointer is legitimate, then declare that the dereferencing is safe

5) The algorithm is declared to be safe if and only if all such pointer dereferences are safe

Phases in Static Analysis of the Flow Graph

Phase 1: The analyzer detects statements in the disassembled code which correspond to the dereferencing of pointer variables by scanning downwards in the flow graph

Phase 2: The analyzer checks whether any dereferencing detected in phase 1 is safe by scanning upwards in the flow graph

Building Unsafe Sets

“Unsafe Set” is the set of registers which may potentially contain hard coded references

First element is added to the unsafe set when phase 1 detects dereferencing of a pointer

Example: If we find “ *Reg ” in the analyzed code, the unsafe set is initialized to {Reg}

Note: Most Examples used in the presentation use the ‘C’ programming language for easy understanding while the real analysis is done at the Assembly Language level.

Building unsafe sets (continued)

Phase 2 populates the unsafe set by “scanning backwards”

For example, if we find Reg = Reg1 + Reg2, the element “Reg” is deleted from the unsafe set and the elements “Reg1” and “Reg2” are inserted into the unsafe set

Contents of the unsafe set will now become {Reg1, Reg2}

Now we scan backwards searching for both “Reg1” and “Reg2” in this case
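
The backward scan can be sketched in C for straight-line code. Everything here is an assumption made for illustration: the instruction kinds, the Insn record, the limit of 32 registers (so the unsafe set fits in a bitmask), and the choice to treat registers that are never defined in the scanned code as safe incoming values. Any literal constant that reaches the unsafe set is reported as hard coded, which is likely stricter than a practical tool would be.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical instruction kinds for this sketch. */
enum Kind { OP_CONST, OP_ALLOC, OP_FROM_SP, OP_ADD, OP_DEREF };

typedef struct {
    enum Kind kind;
    int dst;    /* register written; for OP_DEREF, the register dereferenced */
    int src1;   /* source registers, used by OP_ADD only */
    int src2;
} Insn;

/* Returns 1 if the dereference at code[deref_index] is safe, 0 if its
   address can be traced back to a literal constant. Registers are 0..31. */
int deref_is_safe(const Insn *code, int deref_index) {
    assert(code[deref_index].kind == OP_DEREF);
    uint32_t unsafe = UINT32_C(1) << code[deref_index].dst;  /* {Reg} */

    for (int i = deref_index - 1; i >= 0 && unsafe != 0; i--) {
        const Insn *in = &code[i];
        if (in->kind == OP_DEREF) continue;       /* defines no register here */
        uint32_t bit = UINT32_C(1) << in->dst;
        if (!(unsafe & bit)) continue;            /* not a register we track */

        unsafe &= ~bit;                           /* delete Reg from the set */
        switch (in->kind) {
        case OP_CONST:                            /* Reg = literal: hard coded */
            return 0;
        case OP_ADD:                              /* Reg = Reg1 + Reg2 */
            unsafe |= (UINT32_C(1) << in->src1) | (UINT32_C(1) << in->src2);
            break;
        default:                                  /* malloc result or SP-based */
            break;
        }
    }
    return 1;
}
```

For the sequence r1 = malloc(...), r2 = SP + offset, r0 = r1 + r2, *r0, the set evolves as {r0}, {r1, r2}, {r1}, {} and the dereference is reported safe.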

Analysis Stops when…

All pointer dereferences in the program are declared to be “safe” (not hard coded)

OR

At least one of the pointer dereferences in the program is declared to be “unsafe” (hard coded)

Handling Loops

Complex because the number of iterations of the loop may not be known until runtime

We scan and cycle through the loop until the unsafe set reaches a “Fixed Point”

A Fixed Point is reached when

• The unsafe set repeats itself at the same point in the loop during successive iterations

• No new information is added to the unsafe set during successive iterations

Handling Function Calls

Similar to a Branch statement

Marks the beginning and end of basic blocks

Recursive function calls are handled as if they were looping constructs

Handling Parallelism

The || characters signify that an instruction is to execute in parallel with the previous instruction

Example: Instructions A, B, C are executed in parallel

Instruction A
|| Instruction B
|| Instruction C

Handle/Skip parallel instructions encountered during phase 2 until an instruction in the previous cycle is found
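
One way to read the rule above is that instructions issued in the same cycle cannot supply each other's input values, so the backward scan should jump to the last instruction of the previous cycle. The sketch below assumes the disassembly has been reduced to a hypothetical array of flags, where parallel[i] is true when instruction i carries the || prefix.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns the index of the first instruction of the group of parallel
   instructions that contains instruction i. */
int packet_start(const bool *parallel, int i) {
    assert(i >= 0);
    while (i > 0 && parallel[i]) i--;   /* walk back over "||" instructions */
    return i;
}

/* Returns the index of the last instruction of the previous cycle,
   or -1 if instruction i belongs to the first cycle. */
int last_of_previous_cycle(const bool *parallel, int i) {
    return packet_start(parallel, i) - 1;
}
```

For the flags {false, true, true, false, true}, instructions 0 to 2 form one cycle and instructions 3 and 4 the next, so a backward scan starting at instruction 4 would resume at instruction 2.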

Current Work

Current work includes fine tuning the handling of loops and extending our system for the remaining rules

The development and testing of the tool is currently in progress

The system is being developed using the ‘C’ programming language

Related Work and Conclusion

Compared to Dynamic Analysis, Static Analysis can give correct results for a larger set of cases, because it examines all possible paths through the flow graph rather than only the paths exercised by particular runs of the program

Our work so far can be regarded as an attempt to demonstrate the efficacy of static analysis to perform these checks and aid in software reuse

References

Ramakrishnan Venkitaraman and Gopal Gupta, “Static Program Analysis to Detect Hard Coded Addresses and its Application to TI's DSP Processor”, CS department technical report UTD CS-23-03

For More information, contact [email protected]

Questions……….