IEEE 2011 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan


Cloud Computing And EDA: Is Cloud Technology Ready for Verification…

(And Is Verification Ready for Cloud)?

Hasmukh Ranjan Vice President Engineering Compute & Infrastructure Services

Synopsys Inc. 700 East Middlefield Road

Mountain View, CA, USA 94043

GROWTH OF VERIFICATION COMPLEXITY

ASIC/SoC design size and complexity are growing at unprecedented rates, driving more-than-exponential growth in the verification effort for these designs. This in turn drives increasing investment in the HW infrastructure needed to support verification – to the point of becoming unsustainable. Even when companies can afford the cost of such expansion, many have hit other limits – data center space, cooling, power, etc.

Variable Characteristics of Verification

For companies that can afford a large farm of servers whose cost is spread across multiple projects, and that have adopted a modern verification methodology, another challenge arises – the variability of the verification workload. Typically, the earliest stages of verification within a single project involve simulation of individual blocks and subsystems. Each is relatively small compared to the entire SoC or system; individual tests complete relatively quickly and are not limiting factors. However, as multiple subsystems are integrated and the easy bugs are found, test queues grow much larger, and much longer individual tests are typically added. At this point, the HW resources provisioned at the beginning of the project may no longer support the expected schedule, and either the schedule must be delayed or verification coverage (i.e., quality) must be sacrificed.

Unfortunately, schedules don’t always go as planned. Last-minute bugs, for example, can require an entire regression run to be repeated, causing unpredictable demand peaks and further delays. This can lead to a cascade effect, where delays in one project impact the schedules of others. Even minor schedule variations can lead to resource challenges. And occasionally, a “perfect alignment” of multiple demand peaks can overwhelm a server farm, leading to massive delays.

Ramifications of Over and Under Provisioning

To keep things on track, these peaks and valleys in workload must be planned for in advance. That is a difficult task at best, and worst-case over-provisioning of HW infrastructure, while effective, is hugely expensive. A last-minute, unexpected regression run on a large project could require a sizeable increase in the server farm (servers, storage, networking, power, etc.) and a corresponding increase in the number of EDA software licenses. Such over-provisioning clearly leads to very low utilization. Under-provisioning, by contrast, means lower up-front expenses and much better utilization, but it also means longer schedules in the best case and potential disaster otherwise.
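To make the utilization cost of worst-case provisioning concrete, here is a minimal back-of-the-envelope sketch in Python. The demand profile and all numbers are hypothetical assumptions chosen for illustration, not figures from this paper:

```python
# Hypothetical weekly demand (in servers) for one project cycle, with a
# regression-driven spike in week 10. All numbers are illustrative assumptions.
weekly_demand = [200] * 9 + [1000] + [200] * 2

peak = max(weekly_demand)            # worst-case provisioning target
used = sum(weekly_demand)            # server-weeks actually consumed

# Utilization if the farm is sized for the peak for the whole period.
provisioned = peak * len(weekly_demand)
print(f"Peak-provisioned utilization: {used / provisioned:.0%}")  # -> 27%
```

Even a single one-week spike drags utilization of a peak-sized farm below a third – exactly the over-provisioning waste described above.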

Historically, most companies have learned to live with occasional schedule delays, since even reserve capacity is seen as a wasted expense. But with the growth in complexity, these delays are sure to become the norm unless something else is considered. Verification capacity needs are growing beyond companies’ financial means to allocate resources. And with small delays in one project cascading into others, companies face more and more risk. Missed schedules, delayed products, lost market share, and reduced revenue are the likely results.

CLOUD COMPUTING AND VERIFICATION

And so, more and more companies are looking for a paradigm shift in their design methodology. What they need are two key capabilities:

Scalable and flexible compute provisioning: For peak demand.

Predictable and accurate baseline resource provisioning: To reduce cost.

Looking at the above requirements, cloud computing for EDA appears best situated to meet them. Cloud computing is unique in that it provides:

Enormous scalability: Thousands of servers can be made available (pre-configured) in a matter of minutes.

On-demand availability: No lag time between when you need it and when you get it. And if you don’t need it, you don’t pay for it.

Flexibility and elasticity: Perhaps the most important point – when your peak demands pass, you ramp down your resources – virtually instantly – and stop paying for them.

FIGURE 1. SERVER FARM DEPLOYMENT HURDLES [1]

978-1-4244-8499-7/11/$26.00 ©2011 IEEE

Is Cloud Economical?

This question typically arises when one compares the HW cost of adding an on-site server to the hourly access cost of cloud-based resources. That is a simplistic approach to expenses, and the ultimate determination of cost and benefit depends on many factors. For example, what is the true cost of a server? It’s not just the cost of the hardware – it’s also the cost of electricity, cooling, storage, networking, physical space, management, redundancy, HW obsolescence, back-ups, etc. This is the fully-loaded cost of a server, and it can be many times the cost of the hardware itself. And unless your company has unusual buying power, the big cloud players are likely purchasing orders of magnitude more hardware than you are – which gives them significant leverage for bigger discounts.
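As an illustration of the fully-loaded argument, the sketch below totals hypothetical overhead line items against a raw hardware price; every figure is an assumption chosen for illustration, not data from the paper:

```python
# Fully-loaded server cost vs. raw hardware price over a 3-year life.
# Every figure below is a hypothetical assumption chosen for illustration.
hardware = 5_000                       # purchase price per server (USD)
overheads = {
    "power_and_cooling": 3_600,
    "space_and_racking": 1_500,
    "storage_and_networking": 2_000,
    "admin_and_management": 4_500,     # staff cost amortized per server
    "redundancy_and_backups": 1_500,
}
fully_loaded = hardware + sum(overheads.values())
print(f"Fully-loaded cost: ${fully_loaded:,}, "
      f"{fully_loaded / hardware:.1f}x the hardware price")
```

With these assumed figures, the all-in cost lands at roughly 3.6x the hardware price, illustrating the “many times more” point above.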

All that said, for most companies the outright replacement of on-premise compute resources is not optimal. The real economies arise when one looks at using Cloud to augment existing infrastructure. By leveraging cloud to meet peak verification demands, companies can reduce over-provisioning costs and meet all demand requirements – even those that are driven by unexpected, last-minute bugs.
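The burst-to-cloud idea above can be sketched as a simple sizing calculation. The function name, the perfect-parallelism assumption, and all numbers here are hypothetical:

```python
# Minimal sketch of sizing a cloud burst so a regression suite meets a
# deadline despite fixed on-prem capacity.
import math

def cloud_servers_needed(total_sim_hours: float,
                         deadline_hours: float,
                         onprem_servers: int) -> int:
    """Extra servers to rent so on-prem plus cloud finish the suite in time.

    Assumes jobs parallelize perfectly across identical servers.
    """
    required = math.ceil(total_sim_hours / deadline_hours)
    return max(0, required - onprem_servers)

# A last-minute 48,000 simulation-hour regression, a 48-hour window,
# and 600 on-prem servers:
print(cloud_servers_needed(48_000, 48, 600))  # -> 400
```

When the queue fits within existing capacity, the function returns zero – no cloud spend at all, which is the pay-only-for-peaks appeal.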

Perhaps a more important advantage is the ability to avoid schedule delays and, in some cases, to shorten project schedules by having more compute resources available. This is especially applicable to projects that require long verification regression suites. Though there may be extra cost up front, delivering a product to market early can pay off many times over. As the Ateq model shown in Figure 2 illustrates, late market entry generally means lower market share and less revenue over the life of the product [2]. Likewise, early market entry means more revenue and a greater market share.
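As a quantitative stand-in for the graphical model, a commonly cited market-window approximation (an assumption of this sketch, not a formula from the paper) estimates the fraction of lifetime revenue lost when entering a market of window W with a delay D:

```python
# Commonly cited market-window approximation (assumed here for illustration):
# fraction of lifetime revenue lost by entering late into a finite market window.

def revenue_loss_fraction(delay_months: float, window_months: float) -> float:
    """Loss = D * (3W - D) / (2 * W^2), valid for 0 <= D <= W."""
    d, w = delay_months, window_months
    return d * (3 * w - d) / (2 * w ** 2)

# A 3-month slip on a product with a 24-month market window:
print(f"{revenue_loss_fraction(3, 24):.0%} of lifetime revenue lost")  # -> 18%
```

Under this approximation even a modest slip costs a disproportionate share of lifetime revenue, consistent with the late-entry penalty the model depicts.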

If you work for a publicly traded company, not to mention a start-up, the other big advantage of cloud computing is that it shifts investment dollars from Capital Expenditures (CAPEX) to Operating Expenditures (OPEX). This matches project expenses more precisely to activities and frees up investment dollars when they are scarce (typical for start-ups).
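One way to reason about the CAPEX/OPEX trade-off is a break-even utilization: below it, renting by the hour is cheaper than owning. All rates below are illustrative assumptions:

```python
# Break-even sketch for the CAPEX-vs-OPEX trade-off: below this utilization,
# renting by the hour beats owning. All rates are illustrative assumptions.
LIFETIME_HOURS = 3 * 365 * 24          # assumed 3-year service life
fully_loaded_cost = 18_000             # owned server, all-in, over its life (USD)
cloud_rate = 1.50                      # assumed on-demand price per server-hour

owned_rate = fully_loaded_cost / LIFETIME_HOURS   # ownership cost per wall-clock hour
break_even = owned_rate / cloud_rate              # utilization at which costs match
print(f"Owning wins only above ~{break_even:.0%} utilization")  # -> ~46%
```

An owned server costs its amortized rate every hour whether busy or idle, while cloud is billed only for hours used – so bursty verification workloads that sit below the break-even line favor OPEX.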

What about Security?

A close look shows that most cloud providers offer security at least as good as, and typically better than, that of a typical enterprise. For the purposes of this paper, we will reference Amazon’s Elastic Compute Cloud (EC2) [3].

For Amazon and other major cloud providers, the cost of security is spread over many customers. This means that what would be considered too expensive, or simply overkill, for a single customer makes economic sense for them. Physical intrusion prevention, virus detection, defense against service attacks, and data storage all benefit from these economies of scale. Amazon and other cloud providers hold the following certifications, which few enterprise design companies attain:

ISO 27001: ISO 27001/27002 is a widely-adopted global security standard that sets out requirements and best practices for managing company and customer information. It is based on periodic risk assessments covering infrastructure, data centers, and services. [4]

SAS 70 Type II: SAS70 certifies that a service organization has had an in-depth audit of its controls (including control objectives and control activities), which in the case of AWS relates to operational performance and security to safeguard customer data. [4]

What Changes Must Be Made To Move To Cloud?

For verification customers to be ready to move to the cloud, a few major areas must be addressed:

Technical Readiness: Customer designs and scripts must be prepared to run optimally on a cloud infrastructure.

Export Laws: Care must be taken to assure that moving data to the cloud does not violate national security laws.

Corporate Readiness: Companies must ensure that their internal company policies support running workloads on cloud environments.

Business Readiness: Companies must have the business processes in place to procure on-demand, scalable cloud products.

CLOUD COMPUTING – WELL WORTH A CLOSER LOOK…

It is clear that market pressure to deliver bigger, faster, lower-cost, lower-power integrated products is taking a toll on design engineers. Today, verification is the dominant cost in developing an ASIC, and as design size grows, verification effort grows exponentially. Unless a paradigm shift in methodology is adopted soon, engineers will find themselves in a resource crunch, with schedules dictated by the availability of verification compute resources. We believe that paradigm shift is cloud computing. The industry is now mature, the limitations are well understood, and the costs are now viable. The time for a shift in thinking is now.

REFERENCES

[1] Filani, He, Gao, Rajappa, Kumar, Shah, Nagappan, “Dynamic Data Center Power Management: Trends, Issues, and Solutions,” Intel Technology Journal, vol. 12, issue 1, 2/21/2008, p. 60.

[2] Fred Y. Philips, Market-Oriented Technology Management: Innovation for Profit in Entrepreneurial Times, New York: Springer, 2001, p. 88.

[3] http://aws.amazon.com/security/, http://media.amazonwebservices.com/pdf/AWS_Security_Whitepaper.pdf

[4] http://www.iso.org/iso/iso_catalogue.htm

FIGURE 2. COST OF BEING LATE TO MARKET [2]