
Faculdade de Engenharia da Universidade do Porto

Manufacturing Equipment Data Collection Framework

Daniel José Barbudo Aguilar

Report of Project
Master in Informatics and Computing Engineering

Supervisor: Maria Teresa Galvão Dias (PhD)

July 2008

© Daniel Aguilar, 2008

Manufacturing Equipment Data Collection Framework

Daniel José Barbudo Aguilar

Report of Project
Master in Informatics and Computing Engineering

Approved in oral examination by the committee:

Chair: António Augusto de Sousa (PhD)
External Examiner: César Analide (PhD)
Internal Examiner: Maria Teresa Galvão Dias (PhD)

31st July, 2008

Abstract

The Bee Framework project is essentially a data collection framework used to collect the data generated by equipment, and it intends to help improve the equipment integration process. Even considering different equipment types, there are several common data collection functionalities that can be easily reused with only light changes, or even no changes at all. Some of these functionalities are, for example, database access, folder monitoring actions, communication and data collection, among others.

The main goal of the project described in this report is the elaboration of a detailed specification for a data collection framework used to collect the data generated by several pieces of equipment in an assembly line. Additionally, the core requirements of the framework should also be implemented, and an equipment integration should be carried out as a proof of concept. This framework intends to control and monitor this equipment so that the data it generates can be collected. This data will be fed into data analysis solutions that provide the tools required to follow up on continuous process evolution, optimization and process control.

However, the wide diversity of the equipment used in such assembly lines is a well-known problem that makes it hard to adopt a common integration solution. Moreover, a large number of these machines do not follow some of the international standards for data collection and data control used in the semiconductor industry. These strong difficulties in adopting a common solution lead to the development of a specific data collection solution for each new equipment type.

The Bee Framework has been designed with a modular architecture in order to provide easier and more general methods capable of integrating equipment and its related systems. By using such a framework, it will not be necessary to develop a specific data collection solution for each new equipment type's integration, since the core functionalities will already be available and ready for use. It will just be necessary to configure the framework according to the equipment being integrated and adapt it to meet the specific equipment requirements.

Beyond the aspects related to the framework architecture and specification described in this document, this project has also considered the development of a proof of concept. The main goal of the proof of concept is to show the resulting advantages of using the framework in the equipment integration process. The equipment type considered already had a specific solution to collect the generated data and is integrated in the Qimonda assembly line using a different approach. So, this proof of concept intends to establish a comparison regarding both the effort and time needed to integrate the same equipment type using the framework approach instead of the previous one.


Resumo

The project described in this report, named Bee Framework, is essentially a data collection framework used for equipment integration, which intends to assist and improve the integration process. Even across different equipment types, there is a large set of common data collection functionalities, so these common functionalities can be reused with only slight changes, or even none at all. Some of these functionalities are, for example, database access, folder monitoring, communication and data collection.

The main goal of the described project is the elaboration of a detailed specification for a framework for collecting the data generated by equipment used in production lines. Additionally, the core requirements of the framework should also be implemented, and the integration of one piece of equipment should be carried out as a proof of concept. The aim is thus to control and monitor this equipment in order to collect the data it produces and store it in a form that allows its later analysis and interpretation, enabling continuous process evolution and optimization.

However, due to the great diversity of equipment existing in a production line, and because some of this equipment does not follow the existing standards used in the semiconductor industry for data collection and control, it becomes necessary to develop specific data collection solutions for each new piece of equipment to be integrated, in order to collect the data it generates.

The Bee Framework has a modular architecture, so as to provide easy and general mechanisms capable of integrating equipment and its related systems. Using a framework of this kind, it will not be necessary to develop a specific solution for each new equipment integration, since the core functionalities will already be available and ready to use. The framework need only be configured according to the specific requirements of the equipment being integrated.

Beyond the aspects related to the framework architecture and specification described in this report, this project also included the development of a proof of concept. Its main goal is to demonstrate the advantages of using the framework in the integration of a piece of equipment, that is, collecting the data produced by the equipment while it operates in a Qimonda production line. A comparison can thus be made regarding the amount of effort and time required, since the equipment considered had already been integrated previously using another approach.


Acknowledgments

I would like to thank Qimonda Portugal for giving me the conditions I needed to complete my project. I would like to thank all my work colleagues, especially my project supervisor Nuno Soares, but also Rui Alves and all the other colleagues belonging to the Equipment Control team, for being so supportive and helping me better understand my project domain.

A special thank you, too, to my project supervisor at Faculdade de Engenharia da Universidade do Porto, professor Teresa Galvão, for giving me all the encouragement and support I needed, not only in this project but also during the course of my studies, always with honest interest and friendship.

I would also like to thank all the teachers who in some way contributed not only to my education but also to my personal life.

Additionally, I extend my thanks to the English professor Sónia Nogueira for the help she provided in the reviewing of this document.

Finally, my biggest gratitude goes to my family, especially my parents and sister, and to all my friends, for all the support they give me, both in good and bad times, always helping me face my problems and giving me the encouragement and boldness I needed to keep going and achieve my goals. My sincere and special word of thanks to all of you.

    Daniel Aguilar


To my family and friends


Contents

Abstract
Resumo
Acknowledgments
Contents
List of Figures
List of Tables
Glossary

1 Introduction
1.1 Project Introduction
1.2 Project Motivation
1.3 Project Goals
1.4 Approach Methodology and Constraints
1.5 Report Structure

2 Data Collection Problem Analysis
2.1 Data Collection Overview
2.2 Data Collection in Semiconductor Industry
2.3 Data Collection at Qimonda

3 State of the Art
3.1 Technology Review
3.1.1 Programming Languages and Tools
3.1.2 Modeling Languages and Tools
3.1.3 RDBMS and Database Tools
3.1.4 Data Persistence
3.1.5 Communication Technologies
3.1.6 Markup Languages
3.2 Previous Work
3.2.1 Collecting Data from Files
3.2.2 Database Concurrency
3.2.3 Handling Messages and Communication
3.3 Summary

4 Framework Specification and Architecture
4.1 Framework Black-Box Overview
4.1.1 Collecting Data
4.1.2 Saving Data
4.2 Framework White-Box Overview
4.3 Framework Architecture
4.3.1 Folder Monitor Module
4.3.2 Backup Module
4.3.3 Equipment Modules
4.3.4 Message Handling
4.4 Framework Services
4.4.1 YODA Service
4.4.2 Database Service
4.4.3 Email Service
4.4.4 Logging Service
4.4.5 Framework Messages Service
4.4.6 Timer Service
4.5 Summary

5 Prototype Development
5.1 Prototype Goals
5.2 AOI — Automatic Optical Inspection — Equipment Overview
5.3 Collecting Data From AOI Equipments
5.4 AOI Integration Use Cases
5.4.1 Complementary Functions
5.5 AOI Use Cases Implementation
5.5.1 Process XML Files
5.5.2 Notify XML Generation Down
5.5.3 Backup AOI Log Files
5.5.4 Log Equipment Breakdown Reason
5.6 AOI Integration Architecture
5.6.1 Global Logical View
5.6.2 Bee Framework Logical View
5.7 AOI Integration Test Cases
5.7.1 Process XML Files
5.8 Summary

6 Findings and Discussion
6.1 Event-based Framework
6.1.1 Detecting Changes in Files
6.1.2 Notifications
6.2 Parsing XML Files
6.3 Database Access and Saving Data
6.4 Time Required for Integration
6.5 Summary

7 Conclusions
7.1 Project Applicability
7.2 Final Recommendations and Perspectives of Future Work
7.3 Final Conclusions

References
Index

A Bee Framework Configurations

B Services Configurations
B.1 YODA and Message Services
B.2 Database Service
B.3 Email Service
B.4 Logging Service
B.5 Timer Service

C AOI Integration Test Cases
C.1 Process XML Files
C.2 Notify XML File Generation Down
C.3 Backup AOI Log Files
C.4 Log Equipment Breakdown Reason

D AOI Database Schema
D.1 AOI Control Table
D.2 AOI Raw Data Tables
D.3 AOI Summary Tables
D.4 AOI Target Table

E AOI — SQL*Loader Usage
E.1 SQL*Loader Header Files
E.1.1 Board Inspection Header
E.1.2 Board Rework Header
E.1.3 Location Inspection Header
E.1.4 Location Rework Header

F AOI Configurations
F.1 Bee Framework Configuration File
F.2 Folder Monitor Module Configuration File
F.2.1 CopyFile Message
F.2.2 MoveFile Message
F.2.3 LoadWatchers Message
F.2.4 StartMonitoring Message
F.2.5 ListFilesDirectory Message
F.3 AOI Equipment Module Configurations
F.3.1 CommandLot Message
F.3.2 Created.AOI Watcher and Created.AOI Log Watcher Messages


List of Figures

1.1 Bee Framework logo

3.1 Visual Studio logo
3.2 Resharper logo
3.3 NUnit logo
3.4 UML logo
3.5 Oracle Corporation logo
3.6 Interdependencies of the Enterprise Library application blocks
3.7 Technology review

4.1 Black-box overview
4.2 White-box overview
4.3 Framework architecture overview
4.4 Singleton pattern
4.5 Framework modules hierarchy
4.6 Gang of Four — Factory Method pattern
4.7 Bee Framework — Factory Method pattern
4.8 Relationship between Observer pattern actors
4.9 Implemented architecture of the Observer pattern
4.10 FolderMonitor sequence diagram
4.11 Automatic updates of external assemblies
4.12 Equipment modules hierarchy
4.13 Start data collection flow
4.14 Strategy pattern
4.15 Implemented architecture of the Strategy pattern
4.16 Template Method pattern
4.17 Architecture that supports message handling
4.18 Flow of actions performed when instantiating modules
4.19 Messages hierarchy and parameters
4.20 Chain of Responsibility pattern
4.21 Sequence followed by a request in the chain
4.22 Implemented architecture of the Chain of Responsibility pattern
4.23 Example of broadcasting a message in the chain

5.1 AOI equipment
5.2 AOI integration overview
5.3 FolderMonitorModule use cases
5.4 AOIEquipmentModule use cases
5.5 Use case: Process XML files
5.6 Use case: Notify XML generation down
5.7 Use case: Backup AOI log files
5.8 Use case: Log equipment breakdown reason
5.9 AOI main flow
5.10 AOI integration entities
5.11 Bee Framework logical view

D.1 AOI database schema
D.2 AOI Control table
D.3 AOI raw data tables
D.4 AOI summary tables
D.5 AOI Target table

List of Tables

5.1 Process XML files — Test case description
5.2 Process XML files — Test case details

C.1 Process XML files — Test case description
C.2 Process XML files — Test case details
C.3 Notify XML file generation down — Test case description
C.4 Notify XML file generation down — Test case details
C.5 Backup AOI log files — Test case description
C.6 Backup AOI log files — Test case details
C.7 Log equipment breakdown reason — Test case description
C.8 Log equipment breakdown reason — Test case details

Glossary

AOI Automatic Optical Inspection

API Application Programming Interface

Application block Software component designed to be as agnostic as possible to the application architecture, so that it can be easily reused by different software applications.

Assembly line Manufacturing process in which parts are added to a product in a sequential manner, using optimized logistics and operations plans in order to create a finished product faster.

CFGmgr Configuration Manager

CLR Common Language Runtime — Virtual machine component of Microsoft's .NET initiative.

CPU Central Processing Unit

DAB Data Access Block — Application block provided by the Microsoft Enterprise Library related to database access architecture.

Daemon Computer program that runs as a background process.

Data collection Process of preparing, collecting and saving the data generated by some type of source.

Data mining Process of sorting through large amounts of data and picking out relevant information.

Data warehouse Electronic repository of an organization's stored data.

DDL Data Definition Language — Computer language for defining data structures.

Deadlock Situation that occurs when two or more competing actions are each waiting for the other to finish, and thus neither ever does.

Design pattern General reusable solution to a commonly occurring problem in software design.


DLL Dynamic Link Library

DMS Decision Making System — Computer-based information system, including knowledge-based systems, that supports decision-making activities. Also known as Decision Support System (DSS).

ECMA European Computer Manufacturers Association — International and private (membership-based) standards organization for information and communication systems.

EDA Equipment Data Acquisition — Collection of SEMI standards for the semiconductor industry to improve and facilitate communication between data collection software applications and factory equipment.

EDC Engineering Data Collection — Engineering processes and tools used in data collection.

Event-based programming Programming paradigm in which the flow of the program is determined by sensor outputs, user actions, or messages from other programs or threads.

Folder monitoring Process of observing the contents of a folder and detecting changes in its files or subfolders.

Framework Reusable design of a software system described by a set of abstract classes and by the way instances of these classes collaborate, allowing the reuse of both code and design architecture, which considerably reduces the development effort needed.

GNU Computer operating system composed entirely of free software.

GoF Gang of Four — The group of authors formed by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.

GUI Graphical User Interface


HTML HyperText Markup Language — Markup language that provides a means to describe the structure of text-based information in a document.

IDE Integrated Development Environment — Software application that provides comprehensive facilities for software development to computer programmers.

Integration tests Tests in which individual software modules are combined and tested as a group.

Interface A Alternative name for EDA (Equipment Data Acquisition).

Inversion of Control Abstract principle describing an aspect of some software architecture designs in which the flow of a system is inverted in comparison to the traditional architecture of software libraries.

Locking (database) Mechanism used to prevent data from being corrupted or invalidated when multiple users need to access a database concurrently.

Lot (semiconductor) Set of semiconductor modules acting as a single unit.

LotEquipParamsHistSrv Lot Equipment Parameters History Server

Manufacturing equipment Semiconductor manufacturing equipment consists of the equipment used in a clean room for the fabrication of semiconductor chips, the test equipment used in manufacturing and in research and development environments to test semiconductor manufacturing equipment, and the fixtures in place to support a semiconductor fabrication facility.

Message center Global access point of a software application used to monitor the flow of messages inside the application and to guarantee the attending and dispatching of received messages.

Message handling Comprises the concepts of sending, receiving and correctly routing messages in a software application.


MSMQ Microsoft Message Queuing — Messaging protocol that allows applications running on disparate servers to communicate in a failsafe manner.

OCL Object Constraint Language — Declarative language for describing rules that apply to UML models.

OEDV Online Equipment Data Visualization — Tools used for online visualization of the data generated by equipment.

OMG Object Management Group — Consortium focused on modeling and model-based standards.

OOP Object-Oriented Programming — Programming paradigm based on the use and interaction of different software units known as "objects".

ORM Object-Relational Mapping — Programming technique for converting data between incompatible type systems in relational databases and object-oriented programming languages.

PCB Printed Circuit Board — Board used to mechanically support and electrically connect electronic components using conductive pathways etched from copper sheets laminated onto a non-conductive substrate.

PL/SQL Procedural Language / Structured Query Language — Procedural extension to the SQL database language.

Raw data Term used to refer to unprocessed data (also known as primary data).

RDBMS Relational Database Management System

Refactoring Rewriting of some pieces of software code to improve its performance or to increase its understandability, without changing its initial meaning and behavior.

Reflection (programming) Mechanism of discovering class information solely at runtime.


Rolling (log files) Combination of rotation and translation operations used when adding a new log entry. The oldest entries are translated (or deleted if necessary) in favor of new entries, keeping a log file always updated with the most recent entries.
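The rolling behavior described in this entry can be pictured with a small sketch. The Python fragment below is purely illustrative (the thesis's logging service itself is a .NET component) and uses a fixed-capacity in-memory log as a stand-in for a log file.

```python
from collections import deque

# A rolling log with room for three entries: adding a fourth entry
# pushes out the oldest one, so the log always holds the most recent entries.
log = deque(maxlen=3)
for entry in ["boot", "load", "run", "stop"]:
    log.append(entry)   # "boot" is discarded when "stop" arrives

print(list(log))        # prints ['load', 'run', 'stop']
```

A file-based implementation would apply the same idea, rewriting or rotating the file so that only the newest entries are kept.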

RS-232 Recommended Standard 232 — Standard for serial binary data signals commonly used in computer serial ports.

SAM Statistical Appearance Model — Technique that detaches users from the parameter adjustment of complex algorithms and that makes a more systematic use of training images during teaching steps.

SECS/GEM SEMI Equipment Communications Standard / Generic Equipment Module — Standard interface used in the semiconductor industry for equipment communications.

SEMI Semiconductor Equipment and Materials International — International organization whose main focus is the promotion of the semiconductor industry and of associated manufacturers of equipment and materials used in the fabrication of semiconductor devices.

Semiconductor industry Collection of business firms engaged in the design and fabrication of semiconductor devices.

SGML Standard Generalized Markup Language — ISO standard metalanguage used to define markup languages for documents.

SMT Surface Mounting Technology — Method for constructing electronic circuits in which the components are mounted directly onto the surface of printed circuit boards.

SMTP Simple Mail Transfer Protocol — Standard for email transmission across the Internet.

SQL Structured Query Language — Database computer language designed for the retrieval and management of data in an RDBMS.


SVN Subversion — Version control system used to maintain current and historical versions of files such as source code, web pages, and documentation.

TCP/IP Transmission Control Protocol / Internet Protocol — Set of communications protocols that implement the protocol stack on which the Internet and most commercial networks run. Also known as the Internet Protocol Suite.

TDD Test Driven Development — Software development technique consisting of short iterations where new test cases covering the desired improvement or new functionality are written first, then the production code necessary to pass the tests is implemented, and finally the software is refactored to accommodate the changes.

Test automation Use of software tools to control the execution of tests automatically.

TNS Transparent Network Substrate — Transparent layer that enables a heterogeneous network consisting of different protocols to function as a homogeneous network.

UML Unified Modeling Language — Object modeling and specification language used in software engineering.

Unit test Automated test that validates whether individual units of source code are working properly.

W3C World Wide Web Consortium — Main international standards organization for the World Wide Web.

Wrapper (software) Type of packaging that hides implementation code from end users and just provides the required interfaces that allow the execution of a wrapped functionality.

XML Extensible Markup Language — General-purpose specification for creating custom markup languages.


YODA Your Own Data Adapter — Middleware cross-platform application that allows different applications to exchange messages on the same network.


Chapter 1

    Introduction

This first chapter introduces the relevant themes needed for a full understanding of both the current document and the project context. It comprises five sections, covering the project summary, the project's main motivations and expected goals, the methodology used, and finally the report structure.

    1.1 Project Introduction

The project Manufacturing Equipment Data Collection Framework (Bee Framework) is essentially a framework specification for collecting the data generated by a wide variety of equipment in an assembly line.

Its subtitle, Bee Framework, is a pure analogy with the world of bees. Bees focus either on gathering nectar or on gathering pollen, depending on demand, from a wide variety of sources. Bees also play an important role in pollinating flowering plants, and are the major pollinator in ecosystems that contain flowering plants. It is estimated that one third of the human food supply depends on insect pollination, most of which is accomplished by bees [1]. Moreover, pollen and nectar can then be used to produce a large number of products, such as honey, candles and beeswax, among others, and their usage comprises areas like medicine, food, cosmetics, beebread, pollination and even pollution monitoring [2, 3].

    Similarly, the Bee Framework project is intended to provide the tools required to perform data collection from a wide range of manufacturing equipment. The collected data then needs to be prepared and finally fed into the data analysis solutions that support and help improve the manufacturing process. In summary,

    Figure 1.1: Bee Framework logo

    data may be collected from many sources and may as well have many target destinations, just like the pollen and nectar collected by bees, which can be used in many areas.

    Most of the generated data is available through equipment local databases or file systems. This means that the data needs to be collected by querying a database or by reading, parsing and interpreting files, respectively. Since there is a large number of different equipment types, integrating equipment and collecting the data it generates becomes an unpleasant and difficult task. The Bee Framework must then provide, above all, the common methods to collect the generated data and the crucial services generally used while integrating equipment, such as database access, folder monitoring, communication, or logging, among others.
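    As an illustration, a folder-monitoring service of the kind mentioned above could be sketched in C# 2.0 on top of the standard FileSystemWatcher class. The class and member names below are hypothetical, not the Bee Framework's actual API:

```csharp
using System.IO;

// Hypothetical sketch of a reusable folder-monitoring service; the Bee
// Framework's real API may differ.
public class FolderMonitorService
{
    private readonly FileSystemWatcher watcher;

    // Raised whenever a new file appears in the monitored folder.
    public event FileSystemEventHandler FileCreated;

    public FolderMonitorService(string path, string filter)
    {
        watcher = new FileSystemWatcher(path, filter);
        watcher.Created += delegate(object sender, FileSystemEventArgs e)
        {
            if (FileCreated != null)
                FileCreated(sender, e);
        };
    }

    public bool IsMonitoring
    {
        get { return watcher.EnableRaisingEvents; }
    }

    public void Start() { watcher.EnableRaisingEvents = true; }
    public void Stop()  { watcher.EnableRaisingEvents = false; }
}
```

    An equipment-specific integration would subscribe to FileCreated and plug in its own file parser, while the monitoring logic itself stays unchanged across integrations.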

    The impact of such a framework on the equipment integration process would be considerable, mostly because common requirements are general enough to be easily reused, decreasing the amount of effort and time needed to integrate equipment and collect the data it generates.

    1.2 Project Motivation

    With the continuous development and improvement of existing and new data analysis and data mining software solutions, there is an increasing need to collect data. However, these software tools only provide valid and reliable results if large amounts of data are considered; otherwise, the correctness of the achieved results can be questionable and controversial. Given this need, it becomes obvious that data collection plays a critical role in the early phase of the process.

    Nowadays, a growing variety of problem domains require this kind of work and decision tool in order to follow up on continuous process optimization and control. To meet these demands from data analysis solutions, effective and powerful data collection tools are also needed.


    A data collection framework encompassing the common functionalities generally used to collect data from a wide scope of sources and to save the collected data into numerous possible targets can therefore be particularly useful. A framework meeting these characteristics could be used in a large set of problem domains with a small amount of effort and in a very short time.

    Moreover, since the problem domain considered is data collection from manufacturing equipment, there are still several interface constraints and limitations in following international standards regarding the way equipment generates data and how this data can be collected. Such restrictions generally imply the development of new, specific data collection solutions, which does not promote reusability and decreases developers' productivity by keeping them repeating similar approaches.

    1.3 Project Goals

    The main goal of this project is to produce a detailed system specification of a framework to be used for data collection from manufacturing equipment in an assembly line. Additionally, the core requirements of the framework should also be implemented and an equipment integration should be done as a proof of concept. The proposed solution must take into consideration some of the existing data collection solutions and a wide range of possible equipment, so that it can achieve a high level of abstraction. This level of abstraction will then make it possible to collect data from different equipment types without much integration effort.

    The data collection framework must consider the existence of common data collection functionalities and different approaches for integrating different equipment types. These common functionalities are considered the central part of the framework. They provide the necessary tools to promote code and component reusability while integrating equipment and must be general enough that no changes need to be made when their usage is required.

    Finally, a prototype should be built as a proof of concept of the proposed solution. This prototype should consider the core functionalities of data collection for manufacturing equipment. Since the data collection problem is closely related to the equipment integration topic, the prototype should consider the integration of an equipment type used in the Qimonda1 assembly line in terms of collecting and saving the data it generates.

    1 http://www.qimonda.com


    1.4 Approach Methodology and Constraints

    This section first describes the approach methodology used during the project. Afterwards, some constraints that affected the normal progress of the project or that imposed restrictions on project decisions are also referred to.

    The collection and analysis of equipment data collection requirements was the first task. Usual data collection flows and the main requirements related to data collection were analyzed, and some previous equipment integration solutions regarding data collection were studied, so that the common functionalities could be identified.

    At the same time, research on state-of-the-art technologies that could possibly be used in the prototype development was also started, so that decisions about the choice of appropriate technologies and applications could be made.

    After collecting the major requirements for a data collection framework and after studying both the previous work on this theme and the technologies used, the design of the solution started almost in parallel with the development of the framework core functionalities.

    When the development of these core functionalities was successfully completed, work on the prototype started: integrating an equipment type and collecting the data it generates.

    Throughout the prototype development, a Test Driven Development (TDD) methodology was followed whenever possible, and the SVN (Subversion) version control system was adopted. In addition, some notes about architecture design and a high-level design document about data collection for the AOI — Automatic Optical Inspection — equipment have been written. This equipment type and its data collection process will be detailed in Chapter 5.

    The constraints that affected the normal progress of the project are mostly related to technological characteristics and to decisions about the applications and technologies used in the project. For example, both the C# programming language and UML modeling can be considered project norms.

    These constraints are essentially related to software applications that could possibly be used and the lack of the required licenses for using them. This situation has led to the usage of other software alternatives or, in most cases, of older releases of the desired software. These technological constraints will be detailed in the Technology Review section (Section 3.1) of the State of the Art chapter (Chapter 3).


    1.5 Report Structure

    This report has been written and structured to help readers understand the document and the project through a top-down analysis. This writing approach starts by giving readers a high-level overview of the project contents so that they can immediately understand the global concepts of the project and what it is intended to do. After this high-level overview, the document gradually discloses all the details concerning the problem analysis and design, as well as the current implementation of the proposed proof-of-concept solution.

    Following this approach, this initial chapter gives readers an overview of the project context, its main goals and motivation, and the approach methodology that led to the project's conclusion.

    Chapter 2 will present a more detailed analysis of the data collection problem. It will also refer to the main motivations that led to the origin of the problem and the expected results that may have a considerable impact on improving data collection. The global problem of data collection will then be divided into smaller problems, detailing each one and presenting a review of the previous approaches.

    Chapter 3 will focus on the State of the Art. The first section of this chapter will present a technology review, focusing on the technologies and applications considered. An individual description of each of these technologies and applications will be given, as well as a comparative analysis relating them and the main reasons that led to their adoption or abandonment. The following section of this chapter will cover the previous work done regarding data collection from manufacturing equipment used in the assembly line.

    Chapter 4 will refer to the proposed Framework Architecture and Specification. Initially, this chapter will focus on the main requirements identified for the presented solution, especially regarding the data collection problem related to manufacturing equipment. The following section of this chapter will then detail the proposed architecture for fulfilling the identified requirements and explain the usage of design patterns to solve some of those requirements and needs.

    Chapter 5 will describe the project development and some of the technical decisions taken at the implementation level. This chapter will mainly focus on the characteristics of each requirement previously specified in the fourth chapter. Moreover, the chapter will explain the overall framework development, the required configuration settings and, finally, how the application works. Since the project is about data collection, specifically data collection from manufacturing equipment, this chapter will also cover the development of the proof of concept regarding the integration of one equipment type, as an example, and the consequent data collection.


    In Chapter 6, the main findings will be discussed. Additionally, a comparison between the proposed solution and the previous approaches will be made. This comparison will mostly cover the integration of the same type of equipment using both approaches, in terms of the effort and time needed, as well as a performance evaluation.

    Finally, the main conclusions reached after the project's completion will be presented in Chapter 7. Furthermore, the conclusions will also include some final recommendations and perspectives on future work to improve and expand the proposed solution.


  • Chapter 2

    Data Collection Problem Analysis

    This chapter presents a detailed description of the data collection problem, relating it to manufacturing equipment. The needs that led to the origin of the problem and the expected results that may have a considerable impact on tasks related to data collection will also be considered. The chapter starts by giving an overview of the data collection problem in general, then refers to the data collection problem in the semiconductor industry and finally details the problem taking into consideration the specific case of data collection at Qimonda.

    2.1 Data Collection Overview

    People live in an information society in which the creation, distribution, diffusion, use, and manipulation of information is significant for almost every activity. Now more than ever, governments, industry and society need reliable information to make better decisions in tackling their problems [4]. Consequently, the importance of data collection mechanisms has been, and still is, growing considerably, so that the collected data can be used to provide the adequate information required by society.

    A data collection system collects data from the outside world; its main goal is to feed other systems with this data, such as decision making systems (DMS), usually for the purpose of controlling a system [5]. The concept behind data collection means collecting data from a wide range of possible data sources and turning it into useful information that can be further used. This information can then be critical not only to control a process or a system, but also to provide a better understanding of the domain considered in the data collection process [6, 7].
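    The collect-and-feed flow described above can be sketched as a minimal pair of interfaces; the names are purely illustrative, not taken from the literature:

```csharp
using System.Collections.Generic;

// Illustrative sketch: a data collection system pulls records from a source
// in the outside world and feeds them to a consumer such as a DMS.
public interface IDataSource
{
    // Collect raw records from a file, database, equipment port, etc.
    IList<string> Collect();
}

public interface IDataTarget
{
    // Deliver the prepared records to the consuming system.
    void Feed(IList<string> records);
}

public static class CollectionRun
{
    public static void Execute(IDataSource source, IDataTarget target)
    {
        target.Feed(source.Collect());
    }
}
```

    The point of the split is that sources and targets can be swapped independently: the same controlling system can consume data regardless of where it was collected from.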


    2.2 Data Collection in Semiconductor Industry

    The existing competitiveness in the semiconductor industry and the demands for high-quality, very reliable memory products lead companies in this sector to adopt manufacturing processes based on a set of international standards, rules and conventions, in order to guarantee final products that meet customer quality demands. The adoption of common standards is a sign of a mature industry. Worldwide semiconductor companies, though fierce competitors in the marketplace, have shown remarkable willingness to cooperate in creating and adopting common standards for factory automation [8].

    Manufacturing processes and equipment used in the semiconductor industry have been continually improved. Along with the technical advances of the semiconductor manufacturing processes themselves, factory productivity and efficient manufacturing control are key to a fab's success. In fact, that success increasingly relies on the collection and analysis of growing amounts of detailed process, measurement and operational data from the equipment to improve yield, efficiency, productivity and more. As processes become more complex, it becomes more important to use the data to reduce process variation, minimize the impact of excursions, and improve overall equipment effectiveness [9].

    During all the phases of the manufacturing process, products are exhaustively tested to ensure a high quality level. The equipment used both when the memory products are being mounted and when they are being tested generates large amounts of data. This data is related to measurements, physical or electrical failures detected, temperature or humidity values, tensions and voltages, or optical inspections, for example.

    The data generated by these different equipment types is extremely important and plays a critical role in the improvement of manufacturing processes. Collection of real data is a vital step in managing a modern manufacturing organization. This data can be very useful since it provides process engineers with the values that can help them better understand all the manufacturing process stages and help detect which steps can be improved and how these improvements should be achieved. Furthermore, information is available immediately, so problems can be identified and corrected when they occur, not when they are noticed after a full day of incorrect production [10].

    To address the issues of the semiconductor industry, an organization was founded in the 1970s: Semiconductor Equipment and Materials International — SEMI for short. Its initial main focus was to promote the semiconductor industry and to associate manufacturers of equipment and materials used in the fabrication of semiconductor


    devices such as integrated circuits, transistors, diodes, and thyristors [11]. Among

    other activities, SEMI acts as a “clearinghouse” for the generation of standards

    specific to the industry and the generation of long-range plans for the industry.

    The best-known standard developed by this organization is SECS/GEM (SEMI Equipment Communications Standard / Generic Equipment Model). The SECS/GEM interface is the semiconductor industry's standard for equipment-to-host communications. In an automated fab, the interface can start and stop equipment processing, collect measurement data, change variables and select recipes for products. The SECS/GEM standards do all this in a defined way, defining a common set of equipment behavior and capabilities [12].

    With the purpose of continuously improving performance and productivity for semiconductor fabs and the equipment used, a new standard named Equipment Data Acquisition (EDA) Interface, also known as Interface A, is available and ready to be deployed in manufacturing organizations. Whereas the SECS/GEM standards were created to improve tool control and to facilitate and support high levels of factory automation, the EDA standards focus on improving process monitoring and control, given the advancing technology and increasing complexity of semiconductor manufacturing processes [13].

    Although Interface A offers improved data ports over SECS/GEM, it does not replace the SECS/GEM standards, which pertain to equipment control and configuration. It is also distinct from Interface B, which facilitates data sharing between applications, and Interface C, which provides remote access to equipment data. Industry adoption of Interface A has been gaining momentum, but more needs to be done to fully implement the standard across the industry [14].

    2.3 Data Collection at Qimonda

    However, even considering the adoption of international standards by the industry, some of the equipment used in Qimonda assembly lines is generic and not exclusive to the semiconductor industry. Consequently, some of this equipment does not follow the SECS/GEM standards defined by the semiconductor industry, making equipment integration tasks harder to perform and introducing many difficulties into the data collection process.

    The integration of such equipment leads to the development of specific data collection solutions for each equipment type considered. Since all this equipment follows different conventions and rules, not only in the way it is used but also in how it generates data and how this data can be collected, it becomes very hard to promote consistency.


    This situation led the Qimonda equipment integration team to develop multiple, different approaches to data collection from these manufacturing equipment types. These approaches are designed and planned almost exclusively around a specific type of equipment, resulting in a single, unique architecture design each time. Consequently, the amount of time and effort required from equipment integration team members increases, not only because they have to design and implement a new architecture solution each time a new equipment type needs to be integrated, but also because they have to document it. Obviously, the productivity of the team members is then negatively affected.

    The existing lack of consistency among different integration architecture approaches not only led developers to focus on specific equipment instead of a global architecture, but also led them to adapt software components used in previously achieved integration approaches. This reuse can be considered a positive aspect, but it also has some negative consequences and drawbacks. Since integration solutions are developed focusing on a single, specific equipment type, similar components cannot usually be reused directly without being changed and adapted to the new equipment requirements.

    Consequently, these components have to be continuously adapted and developers recurrently face the same problems. Additionally, the reuse of such components also introduces some constraints related to technological issues. Old versions of these software components are commonly used, and this fact limits some choices related to the technologies and implementation approaches used, due to compatibility requirements. This way, even when considering the development of new integration solutions, these solutions are limited from their beginning by out-of-date technologies, which can considerably affect both the performance of solutions and the maintenance effort needed.

    A project to evaluate Qimonda EDA has looked at the problem of factory-wide

    deployment of Interface A from a number of perspectives and has tried to incorporate the goals of the EDC — Engineering Data Collection — refactoring into a

    comprehensive vision [15]. It is difficult to conclude with certainty how rapid and

    pervasive the adoption of the nascent EDA standards will be.

    Moreover, even considering this new standard, the data collection problem is still not solved, because there is still equipment that does not follow the standards of the semiconductor industry and for which a data collection approach is needed.


  • Chapter 3

    State of the Art

    This chapter introduces both the technologies used in the project and the previous work done to help solve the data collection problem described in the second chapter. Both introductions focus on the semiconductor industry, since some of the technologies used to create the framework are related to this sector.

    Additionally, some alternative technologies that could be used for data collection will also be considered in this chapter. However, once again, some decisions about technological choices have been made with the semiconductor industry in mind, so any decision about replacing one technology with another should always take this factor into consideration.

    3.1 Technology Review

    This section presents the main applications and technologies studied during the initial phase of the project and later used in its development. Additionally, some alternative technologies and applications that were considered but not used in the project will also be described. For these technologies, the main focus is the pros and cons of using them for the data collection problem related to manufacturing equipment, and the main reasons that led to their abandonment.

    3.1.1 Programming Languages and Tools

    3.1.1.1 C# — C Sharp

    C Sharp is a widely adopted object-oriented programming language developed by Microsoft as part of the .NET initiative and later approved as a standard by ECMA [16]. C# 3.0 is the current version of the language and was released on 19 November 2007 as part of .NET Framework 3.5, but due to some technical and


    licensing aspects, version 2.0 of the language and .NET Framework 2.0 have been used in this project instead. This programming language has a procedural, object-oriented syntax initially based on the C family of languages, but also includes very strong influences from several other programming languages (C++, Python and most notably Java), with a particular emphasis on code simplification.

    C# is intended to be a simple, modern, type-safe, general-purpose programming language which allows the development of robust and durable applications. The language includes strong type checking, array bounds checking, detection of attempts to use uninitialized variables, source code portability, exception handling and automatic garbage collection. This way, the language not only promotes software robustness and durability, but also helps programmers increase their productivity.

    C# is an object-oriented language, but it further includes support for component-oriented programming. Currently, software design relies more and more on software components [17]. Key to such components is that they present a programming model with properties, methods, and events; they also have attributes that provide declarative information about the component and incorporate their own documentation. One of the biggest advantages of C# is that the language directly supports all these concepts, making it a very natural language for creating and using software components.
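    For illustration, the hypothetical component below shows these concepts side by side: a property, a method, an event and a declarative attribute. None of this is framework code; it is only a sketch of the component model:

```csharp
using System;
using System.ComponentModel;

// Hypothetical component illustrating C#'s first-class support for
// properties, methods, events and declarative attributes.
public class TemperatureProbe
{
    private double lastReading;

    [Description("Most recent temperature read from the equipment, in Celsius.")]
    public double LastReading
    {
        get { return lastReading; }
    }

    // Raised after every successful reading.
    public event EventHandler ReadingTaken;

    public void Read(double value)
    {
        lastReading = value;
        if (ReadingTaken != null)
            ReadingTaken(this, EventArgs.Empty);
    }
}
```

    A design tool or host application can read the [Description] attribute through reflection, while client code consumes the component through its property and event without knowing its internals.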

    C# can be considered a very high-level programming language when compared to languages such as C or Assembly. Although C# applications are intended to be economical in terms of memory and processing power requirements, the language cannot compete on performance with those low-level languages. C# applications, like all programs written for the .NET Framework, tend to require more system resources than functionally similar applications that access machine resources more directly.

    Microsoft Visual Studio1 has been the development environment chosen to develop the framework and the proof of concept. However, two different versions of this Microsoft product have been considered: Microsoft Visual Studio 2005 and Microsoft Visual Studio 2008.

    Figure 3.1: Visual Studio logo

    1 http://www.microsoft.com


    Microsoft Visual Studio 2008

    Visual Studio 2008 is the most recent version of this IDE — Integrated Development Environment — from Microsoft, released in November 2007. It was the version initially chosen to develop the C# components of the framework. However, due to licensing difficulties, this choice was abandoned, which also determined the choice of the .NET Framework and C# language versions [18].

    Microsoft Visual Studio 2005

    Visual Studio 2005 is the predecessor of the Microsoft IDE referred to in the previous section, and it was the one adopted as the development environment [19]. This product supports the features introduced by version 2.0 of the .NET Framework.

    Resharper

    Furthermore, a trial version of Resharper 2.0 has been used. Resharper, a product from JetBrains2, is a refactoring add-in for Visual Studio which helps programmers increase their productivity while developing. Resharper provides code completion, easy refactoring, code analysis and assistance, code formatting, code generation and templates, and easier code navigation [20].

    This tool proved very useful, allowing easier and quicker development of the framework components.

    Figure 3.2: Resharper logo

    NUnit

    NUnit3 is an open source unit-testing framework for all .NET languages. Initially ported from JUnit (used for the same purpose in Java), it is written entirely in C# and has been completely redesigned to take advantage of many .NET language features, for example custom attributes and other reflection-related capabilities [21]. This testing framework discovers test methods using reflection and provides test automation to control the execution of unit tests, the comparison of actual outcomes to predicted outcomes, the setting up of test preconditions, and other test control and reporting functions [22].

    2 http://www.jetbrains.com
    3 http://www.nunit.org


    Since a Test Driven Development [23] agile methodology was adopted, this unit-testing framework was not only very useful but also fundamental in the development process.
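    As an illustration of what such tests look like, the sketch below shows a typical NUnit 2.x fixture; the LotParser class under test is purely hypothetical and not part of the framework:

```csharp
using NUnit.Framework;

// Hypothetical class under test: extracts a lot identifier from an
// equipment result file name such as "LOT1234_results.txt".
public static class LotParser
{
    public static string ExtractLotId(string fileName)
    {
        return fileName.Split('_')[0];
    }
}

[TestFixture]
public class LotParserTests
{
    [SetUp]
    public void SetUp()
    {
        // Runs before each test: establish preconditions here.
    }

    [Test]
    public void ParsesLotIdFromFileName()
    {
        // NUnit discovers this method through the [Test] attribute
        // using reflection, then compares actual and expected outcomes.
        Assert.AreEqual("LOT1234", LotParser.ExtractLotId("LOT1234_results.txt"));
    }
}
```

    Under TDD, a fixture like this would be written before the parsing code itself, driving the design of the component being integrated.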

    Figure 3.3: NUnit logo

    3.1.2 Modeling Languages and Tools

    3.1.2.1 UML — Unified Modeling Language

    UML4 is a standardized visual specification language for object modeling used in the software engineering field. It is a general-purpose modeling language that includes a graphical notation used to create an abstract model of a system, referred to as a UML model. UML is the OMG's — Object Management Group — most used specification, and the way the world models not only application structure, behavior, and architecture, but also business processes and data structures [24].

    Figure 3.4: UML logo

    Modeling is the design of software applications before coding. The resulting models are very helpful since they let us work at a higher level of abstraction, supporting the specification, visualization and documentation of software systems, including their structure and design [25].

    Its origin dates from 1994, when the large abundance of modeling languages was slowing down the adoption of object technology. A unified method was needed, and a consortium of several organizations, named UML Partners, was established in 1996 with the purpose of producing a specification for a unified modeling language. As a result of this collaboration, a strong UML 1.0 definition has been

    14

    http://www.uml.org

  • State of the Art

    achieved. This modeling language was already well defined, expressive, powerful, and generally applicable. It was submitted to the OMG in January 1997 as an initial Request for Proposal response [26].

    UML has matured significantly since its early versions. Several minor revisions fixed shortcomings and bugs, and the UML 2.0 major revision was adopted by the OMG in 2003. There are actually four parts to the UML 2.x specification: the Superstructure (defines the notation and semantics for diagrams and their model elements), the Infrastructure (defines the core metamodel on which the Superstructure is based), the OCL — Object Constraint Language — (defines rules for model elements) and finally the UML Diagram Interchange (defines how UML 2 diagram layouts are exchanged). The current versions of these standards are: UML Superstructure version 2.1.2, UML Infrastructure version 2.1.2, OCL version 2.0, and UML Diagram Interchange version 1.0.

    Visio

    Visio5 is diagramming software originally developed by Visio Corporation, a company bought by Microsoft in 2000. It uses vector graphics to create diagrams and follows the recent standards of the UML modeling language.

    Visio provides a wide range of templates - business process flowcharts, network

    diagrams, workflow diagrams, database models, and software diagrams - that can

    be used to visualize and streamline business processes, track projects and resources,

    chart organizations, map networks, diagram building sites, and optimize systems

    [27].

    Visio 2007 is the most recent version of this software, and it has been used to model all UML diagrams concerning the framework specification and architecture.

    3.1.3 RDBMS and Database Tools

    3.1.3.1 Oracle

Oracle Database6 is a relational database management system (RDBMS), commonly referred to simply as Oracle, which has become a major presence in database computing. Oracle Corporation, the company that produces and markets this database software, was founded in 1977, and since then many widespread computing platforms have come to use the Oracle database extensively, making the company the market leader [28].

The latest version of Oracle Database, 11g, has been recently released. Once again, due to licensing and technical matters,

5 http://office.microsoft.com
6 http://www.oracle.com




    Figure 3.5: Oracle Corporation logo

the version considered while developing this proof of concept is 9i. Moreover, data collection generally involves large volumes of data, which increases the risks of migrating existing databases. However, the database operations described in this report and used in the proof of concept should behave as expected in recent releases of this software, even considering the grid-computing technology introduced with the 10g and later versions.

Unlike the C# programming language referred to before, a powerful database development environment was not needed. Some database applications, such as SQL Navigator or PL/SQL Developer, have been considered, but SQL Developer, which already comes with the Oracle Database Client, met the necessary requirements.

    SQL Developer

    Oracle SQL Developer7 is a free graphical tool for database development. SQL

    Developer can be used to browse, create and modify database objects, run SQL

    statements and SQL scripts, and edit, run and debug PL/SQL statements. It can

    also run any number of provided reports, as well as create and save new types of

    reports. Additionally, SQL Developer allows to export / import data and DDL and

    supports version control. SQL Developer is a tool that enhances productivity and

    simplifies database development tasks [29].

    SQL*Loader

    SQL*Loader8 is a bulk loader utility used for moving data from external files into

    the Oracle database. It comes with some configurable loading options and supports

various load formats, selective loading, and multi-table loads [30]. Its usage is particularly recommended for loading large volumes of data into Oracle Database because it consumes fewer resources, especially time, memory, and CPU.
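As an illustration, a minimal SQL*Loader control file might look as follows; the file, table, and column names here are hypothetical and are not taken from the framework:

```
-- load.ctl: hypothetical control file for loading semicolon-separated
-- equipment measurements into an EQUIPMENT_DATA table.
LOAD DATA
INFILE 'measurements.dat'
APPEND
INTO TABLE equipment_data
FIELDS TERMINATED BY ';' OPTIONALLY ENCLOSED BY '"'
(equipment_id,
 lot_id,
 measured_at DATE "YYYY-MM-DD HH24:MI:SS",
 measured_value)
```

The load would then be started with something like `sqlldr control=load.ctl`, with the database credentials supplied separately.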

    3.1.3.2 SQL Server, Access and PostgreSQL

    Microsoft SQL Server9, Microsoft Access10 and PostgreSQL11 have been considered

    as possible alternative database management systems. However, since data collec-

7 http://www.oracle.com/technology/products/database/sql_developer/
8 http://www.orafaq.com/wiki/SQL*Loader_FAQ
9 http://www.microsoft.com/SQL

10 http://office.microsoft.com/access
11 http://www.postgresql.org




    tion can be very resource consuming, usually involving large volumes of data and

    strongly related to mechanisms of data warehousing and data mining, Oracle has

    been the natural database choice. The main reason for this choice is the well known

    robustness and efficiency of Oracle databases in such situations. Nevertheless, the

    data collection framework presented not only considers Oracle but also these al-

    ternative databases. This topic will be detailed and explained in Chapter 5.

    3.1.4 Data Persistence

    3.1.4.1 Enterprise Library

    Microsoft Enterprise Library12 or simply Enterprise Library is a collection of reusable

    software components for the Microsoft .NET Framework. These components are

    application blocks designed to assist software developers with common enterprise

development challenges and problems that commonly recur from one project to the next [31].

    Application blocks are designed to encapsulate the Microsoft recommended best

    practices for .NET applications. In addition, they can be added to .NET applica-

    tions quickly and easily [32]. Application blocks are a type of guidance, provided

    not only as source code but also as documentation that can be directly used, ex-

    tended, or modified by developers to use on complex, enterprise-level line-of-business

    development projects.

    This guidance is based on real-world experience and goes far beyond typical

    white-papers and sample applications. They provide proven architectures, produc-

    tion quality code, and recommended engineering best practices. The technical guid-

    ance is created, reviewed, and approved by a wide diversity of experienced people

    including Microsoft architects, partners and customers, engineering teams, consul-

    tants, and product support engineers. The result is a thoroughly engineered and

    tested set of recommendations that can be followed with confidence when building

    applications based on this guidance [33].

    Enterprise Library and its applications blocks provide an API to facilitate best

    practices in core areas of programming such as logging, validation, data access,

    exception handling, and many others. However, application blocks are designed to

    be as “agnostic” as possible to the application architecture, so that they can be

    easily reused in different contexts.

Figure 3.6 shows the application blocks available in the Enterprise Library 3.1 release and illustrates their interdependencies [32]. Both the Data Access and Logging

    Application Blocks have been used in the Bee Framework. Their usage will be

12 http://msdn.microsoft.com/entlib




    Figure 3.6: Interdependencies of the Enterprise Library application blocks

    explained later in the Framework Specification and Architecture chapter (Chapter

4) because they have a significant impact on the proposed architecture.

    Amongst the main benefits of using Enterprise Library we can identify the pro-

    ductivity and testability enhancements: each of the application blocks provides

    several interfaces meant to satisfy common concerns and a level of isolation that

    allows individual testing of each block. Additionally, extensibility (developers can

customize the application blocks and extend their functionality to suit their own needs),

    consistency (design patterns are applied in similar fashion in all the blocks), ease of

    use and integration (application blocks can be used as pluggable components) are

    also strong advantages of using Enterprise Library [34].

Enterprise Library 4.0 is the most recent version and was released in May 2008. This release includes bug corrections, new application blocks, some performance improvements, and already supports Microsoft



    Visual Studio 2008 and the .NET Framework 3.5 [35]. However, since this last

    version has been released after the beginning of the project, the previous release

    of Enterprise Library, version 3.1 - May 2007, has been used instead. Moreover,

    as described before in the C# Technology Review section, Visual Studio 2005 and

    .NET Framework 2.0 have been used, so using the last release of Enterprise Library

    would be impossible due to software requirements.

    3.1.4.2 NHibernate

    NHibernate13 is a port of the famous Hibernate Core for Java to the .NET Frame-

    work [36]. It is an Object-Relational Mapping (ORM) solution that provides an

easy-to-use framework for mapping an object-oriented domain model to a traditional relational database, handling the persistence of plain .NET objects to and from the underlying database.

With this support for transparent persistence, object classes do not have to follow a restrictive programming model. These persistent classes do not need to implement any interface or inherit from a special base class. Just by giving an XML

    description of the entities and relationships, NHibernate automatically generates

    the necessary SQL for loading and storing the objects. This characteristic makes

    it possible to design the business logic using plain .NET (CLR — Common Lan-

    guage Runtime) objects and object-oriented idiom. This object-oriented approach

    relieves the developer from a significant amount of relational data persistence-related

    programming tasks.
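For illustration, such a mapping description for NHibernate 1.2 might look as follows; the Measurement class and MEASUREMENT table below are hypothetical examples, not entities of the framework:

```xml
<?xml version="1.0" encoding="utf-8" ?>
<!-- Hypothetical mapping: a Measurement class persisted to a MEASUREMENT table. -->
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <class name="Demo.Measurement, Demo" table="MEASUREMENT">
    <id name="Id" column="ID">
      <generator class="native" />
    </id>
    <property name="EquipmentId" column="EQUIPMENT_ID" />
    <property name="MeasuredValue" column="MEASURED_VALUE" />
  </class>
</hibernate-mapping>
```

From a description of this kind alone, NHibernate generates the SQL needed to load and store the corresponding objects.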

NHibernate is free, open-source software distributed under the GNU Lesser General Public License, and its most recent version is 1.2.1. NHibernate 2.0 is currently under development [37].

However, despite these apparent advantages and interesting features, NHibernate has been abandoned in favor of Microsoft Enterprise Library, described in the previous section. The reason for this choice is inspired by expressions such as “no solution is the final solution” and “all have pros and cons”: the decision depends on the data access layer architecture and on how this layer is implemented. Each case must be individually considered in order to determine whether one technology or application is better or worse than another.

    The main reason for this decision is the type of application considered: a data

    collection framework. When referring to a data collection framework, it is expected

    that all database accesses can be abstract, not depending on which type of database

is used, which tables exist, which columns each table contains, and so on. Such a

13 http://www.hibernate.org




    framework is expected to have a high level of abstraction, providing a wrapper that

can execute any kind of query, stored procedure, or transaction on different databases. Of course, each query, stored procedure, or transaction must be defined when using the framework services, but the way of using database services remains exactly the same, whatever the database is.

By choosing NHibernate, it would not be possible to achieve the required universality, because it would be necessary to generate the mapping XML files, then generate the SQL used for database access, and finally generate all object-oriented class

    files. Even considering the usage of automatic code generation tools for NHibernate,

such as MyGeneration14 or CodeSmith15, some adjustments in the XML files and object classes must usually be done, especially when dealing with complex entity relationships. This situation increases the complexity and difficulty of maintenance tasks

    because it is easy to introduce an error and very hard to detect its origin.

    Unlike NHibernate, the Data Access Block (DAB) from Enterprise Library makes

    calling stored procedures very easy and uniform. The DAB manages the state of

    existing database connections and also provides the required uniform way for data

    access operations, making all the code look and behave similarly [38]. Moreover,

    Enterprise Library supports multiple database types (Oracle, SQL Server, DB2, or

Access, for example): due to its abstraction level regarding database types and due to the usage of the abstract factory design pattern, there is no need to change a single line of code if the database type changes.

    Additionally, Enterprise Library supports applications using multiple databases

    and provides a simple way of choosing and alternating between all the configured

    databases and connection strings.
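By way of analogy only, this configuration-driven, factory-based approach can be sketched in Python; the class and provider names below are illustrative and are not the Enterprise Library API:

```python
# Sketch of a provider-agnostic data access factory, analogous in spirit
# to the Enterprise Library Data Access Block; all names are illustrative.

class Database:
    """Common interface: calling code never sees the concrete provider."""
    def execute_scalar(self, query):
        raise NotImplementedError

class OracleDatabase(Database):
    def execute_scalar(self, query):
        return "oracle:" + query      # placeholder for a real Oracle call

class SqlServerDatabase(Database):
    def execute_scalar(self, query):
        return "sqlserver:" + query   # placeholder for a real SQL Server call

# In practice this mapping would be read from a configuration file.
PROVIDERS = {"Oracle": OracleDatabase, "SqlServer": SqlServerDatabase}

def create_database(config):
    """Pick the provider named in configuration; calling code is
    unchanged when the configured database type changes."""
    return PROVIDERS[config["provider"]]()

db = create_database({"provider": "Oracle"})
print(db.execute_scalar("SELECT 1 FROM dual"))  # oracle:SELECT 1 FROM dual
```

Switching the configured provider from "Oracle" to "SqlServer" changes which concrete class is instantiated without touching any calling code, which is the property the text attributes to the Data Access Block.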

    3.1.5 Communication Technologies

    3.1.5.1 TIBCO RendezVous

TIBCO Rendezvous16 is a software product from TIBCO Software that allows message interchange between different applications. It is a very efficient, robust, reliable, and scalable product and is the leading low-latency messaging product for real-time, high-throughput data distribution applications. It is a widely deployed, supported, and proven low-latency messaging solution on the market today [39].

    TIBCO Rendezvous can be integrated with external components and provides

different Application Programming Interfaces (APIs) to support the development of

    applications in different programming languages.

14 http://www.mygenerationsoftware.com
15 http://www.codesmithtools.com
16 http://www.tibco.com




    The basic message passing is conceptually simple. A message has a single subject

    composed of elements separated by periods and has some message parameters, each

    one following the name-value-type paradigm [40]. The message is then sent to a

    single Rendezvous Daemon and a listener announces its subjects of interest to a

Daemon (with a wildcard facility). Messages with matching subjects are then delivered to that listener [41].

    The main components for an application using TIBCO Rendezvous are the fol-

    lowing:

    • the messages and their content parameters;

    • the events related to the subscription, sending and receiving of messages;

• finally, the transport and the logical connection between different applications, which includes the connection settings.
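Rendezvous subjects support wildcards such as “*” (matching a single subject element) and “>” (matching one or more trailing elements). The matching semantics can be sketched in Python; this is an illustration of the semantics only, not the TIBCO implementation:

```python
def subject_matches(pattern, subject):
    """Rendezvous-style subject matching sketch: '*' matches exactly one
    element, '>' matches one or more trailing elements."""
    p, s = pattern.split("."), subject.split(".")
    for i, elem in enumerate(p):
        if elem == ">":
            return len(s) > i          # '>' needs at least one more element
        if i >= len(s):
            return False               # subject ran out of elements
        if elem != "*" and elem != s[i]:
            return False               # literal element mismatch
    return len(p) == len(s)            # no wildcard: lengths must agree

# Hypothetical equipment subjects, for illustration only.
assert subject_matches("EQUIP.*.DATA", "EQUIP.E01.DATA")
assert subject_matches("EQUIP.>", "EQUIP.E01.DATA")
assert not subject_matches("EQUIP.*.DATA", "EQUIP.E01.STATUS")
```

A listener announcing the subject "EQUIP.>" would thus receive every message whose subject starts with the "EQUIP" element.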

    3.1.5.2 YODA

    YODA is a middleware software solution developed by Infineon Technologies17.

    YODA stands for Your Own Data Adapter and is a set of components and libraries

which are application, platform, and technology independent. These components and libraries follow a set of well-defined rules and conventions [42].

YODA's main goal is to achieve an efficient and complete integration between distributed applications, using a reliable and quick method to exchange messages between them. YODA is essentially an internal network-based protocol that allows different applications running on different platforms to intercommunicate and exchange information via YODA messages.

    It is a high-level layer based on TIBCO Rendezvous software that allows appli-

    cations to communicate through a network. YODA provides a uniform and easy

    way to send and receive messages. The main advantage of using YODA is that

the communication between different applications is done by sending and receiving messages over a network, and neither the sender nor the receiver application needs to know the other's location on the network.

An application subscribes to messages by creating a transport, IfxTransport, and specifying the subjects it wants to receive. When a message is available on the network, it is delivered to the applications that have subscribed to that message subject. These applications only have to install an event handler, so that they can receive the delivered messages and process them.

17 http://www.infineon.com




    3.1.5.3 Microsoft Message Queuing

    Microsoft Message Queuing (MSMQ) is a technology provided by Microsoft that

    enables applications running at different times to communicate across different net-

works and systems. The main advantages of this technology are that it guarantees message delivery and provides efficient routing, security, and priority-based messaging. Additionally, it can also be used for implementing solutions for both synchronous and

    asynchronous messaging scenarios, which means it also supports systems that may

    be temporarily offline [43].

    MSMQ is a middleware tool, not responsible for passing the messages themselves

    bit by bit; this middleware leaves that low level work to already existing standards

    and only provides a friendly interface API to help developers. Each computer par-

    ticipating in the distributed application needs a message queue, which allows the

    application to send asynchronous messages to a disconnected computer [44].

    However, there is no need to use an advanced technology like MSMQ to support

    communication, since most of its features are not required by the framework. Addi-

    tionally, all the communications between different applications are already supported

by YODA, which is widely used in the Qimonda universe.

    3.1.6 Markup Languages

    3.1.6.1 XML

    XML stands for Extensible Markup Language and it is a W3C18 recommendation.

    XML was developed by an XML Working Group (originally known as the SGML19

    Editorial Review Board) formed under the auspices of the W3C in 1996 [45].

XML is a simple and very flexible text format originally designed to meet the challenges of large-scale electronic publishing. XML is also playing an increasingly

    important role in the exchange of a wide variety of data on the Web and elsewhere

    [46]. XML is a markup language much like HTML, but XML is not a replacement for

    HTML since they were designed with different goals. XML was designed to transport

    and store data, with focus on what data is; HTML was designed to display data,

    with focus on how data looks [47].

    XML documents are made up of storage units called entities, which contain either

    parsed or unparsed data. Parsed data is made up of characters, some of which form

    character data, and some of which form markup. Markup encodes a description of

    the document’s storage layout and logical structure. XML provides a mechanism

to impose constraints on the storage layout and logical structure. Unparsed data consists of content that may or may not be text and, if text, may be in a notation other than XML.

18 World Wide Web Consortium
19 Standard Generalized Markup Language



The XML language can thus be used to describe any kind of data, because tags are not predefined; tags are defined as needed, which makes XML an extensible language. This characteristic makes XML documents self-descriptive: such documents are easy and intuitive to understand, since they are relatively human-legible and reasonably clear [48].

Amongst its main purposes are facilitating the sharing and transport of structured data across different information systems (interoperability), the encoding of documents, and the serialization of data. This data is stored in plain text format, which

    provides a software and hardware independent way of storing data. This makes XML

    straightforwardly usable because it is much easier to create data that different ap-

    plications can share. Moreover, XML documents should not only be easy to create,

but the design of XML documents should also be formal, concise, and quickly prepared. These characteristics also help reduce the complexity of exchanging data

    between incompatible systems, since the data can be read by different incompatible

    applications [49].
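As a small illustration of such a self-descriptive document, the following fragment (with tag names invented for this example) can be created and read with standard tooling, here Python's xml.etree.ElementTree:

```python
# A small self-descriptive XML document; the tag and attribute names
# (measurement, equipment, lot, value) are invented for illustration.
import xml.etree.ElementTree as ET

doc = """
<measurement equipment="E01">
  <lot>L123</lot>
  <value unit="mm">4.2</value>
</measurement>
"""

root = ET.fromstring(doc)
print(root.get("equipment"))           # E01
print(root.find("value").get("unit"))  # mm
print(root.find("value").text)         # 4.2
```

Because the tags describe the data they contain, a reader (human or program) needs no external schema to make basic sense of the document.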

    3.1.6.2 XPath

XML Path Language, or simply XPath20, is a language for finding information in an XML

    document. XPath is used to navigate through elements and attributes in an XML

    document [50]. In addition, XPath may be used to compute values (strings, num-

    bers, or boolean values) from the content of an XML document. XPath became a

    W3C Recommendation in November 1999 and the current version of the language

    is XPath 2.0 [51].

    XPath operates on the abstract, logical structure of an XML document, rather

    than its surface syntax. The XPath language is based on a tree representation of

    the XML document, and provides the ability to navigate around the tree, selecting

    nodes by a variety of criteria. XPath has a natural subset that can be used for

    matching (testing whether or not a node matches a pattern) [52].

    XPath gets its name from its use of a path notation for navigating through the

    hierarchical structure of an XML document. XPath uses path expressions to select

    nodes or node-sets in an XML document. These path expressions look very much

    like the expressions you see when you work with a traditional computer file system.
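A brief illustration of such path expressions, using Python's xml.etree.ElementTree, which implements a subset of XPath 1.0 (the element and attribute names are invented):

```python
# XPath-style path expressions over the tree representation of an XML
# document; ElementTree supports a subset of XPath 1.0.
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<lots>"
    "<lot id='L1'><value>1.0</value></lot>"
    "<lot id='L2'><value>2.5</value></lot>"
    "</lots>"
)

# Select every <value> element anywhere below the root.
print([v.text for v in root.findall(".//value")])   # ['1.0', '2.5']

# Select the <value> of the <lot> whose id attribute is 'L2'.
print(root.find(".//lot[@id='L2']/value").text)     # 2.5
```

The slash-separated steps mirror file-system paths, which is the analogy the text above draws.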

    3.2 Previous Work

    This section concerns some of the previous work done regarding the data collection

    in manufacturing equipments used in an assembly line. It presents the state of the

20 http://www.w3.org/TR/xpath




art in terms of how data is collected from files generated by equipments, how the issues related to concurrent database accesses are resolved, and how communication is handled both between equipments and applications and between different components of the same application.

    3.2.1 Collecting Data from Files

    Some manufacturing equipments generate data and save it into files. These files are

    usually saved in a folder configured in the data collection application settings. The

    folder containing these generated files is usually accessed via a network mapping

using the TCP/IP protocol, which allows these remote files to be accessed just as if they were on the same computer as the data collection application.

    This way, the problem related to accessing these files is solved. However, data

    collection approaches used at Qimonda for collecting files have some limitations,

especially because applications know neither the exact moment a new file is created nor when the file becomes available for use and unlocked by the equipment software. Consequently, a periodic approach must be used. Depending on the frequency at which the equipment generates data, the interval between two consecutive folder inspections is adjusted. Because of these periodic inspections, it is impossible to know a priori which files have been generated between two consecutive inspections, so each inspection needs to check all the files and folders existing in the mapped network folder. This way, a list containing the files and folders existing

    inside a directory must be kept in memory, so that comparisons between two con-

    secutive inspections can be made. Additionally, the list should always be updated

    at the end of each inspection, so that it can be used in the next inspection.

Another important point related to collecting data from files concerns the file contents. File contents need to be parsed, and specific parsers must be configured to match the requirements of each specific equipment. The parsing approaches commonly used implement a sequential parsing of files, which leads to parsers that are less tolerant when errors occur and also makes it harder to find the desired information inside the file contents.
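A more tolerant, line-oriented alternative can be sketched in Python: each line is matched independently, so a single malformed line does not abort the whole file (the key-value field format here is invented for illustration):

```python
# Sketch of a tolerant line-oriented parser: bad lines are recorded and
# skipped instead of stopping the sequential parse.
import re

LINE = re.compile(r"^(?P<key>\w+)\s*=\s*(?P<value>.+?)\s*$")

def parse_report(text):
    """Return (records, error_line_numbers) for a key=value report file."""
    records, errors = {}, []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue                   # ignore blank lines
        match = LINE.match(line)
        if match:
            records[match.group("key")] = match.group("value")
        else:
            errors.append(lineno)      # remember the bad line, keep going
    return records, errors

data, bad = parse_report("LOT = L123\n???\nVALUE = 4.2\n")
print(data)  # {'LOT': 'L123', 'VALUE': '4.2'}
print(bad)   # [2]
```

A strictly sequential parser would have stopped at the malformed second line; here the remaining fields are still recovered and the error is reported.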

    3.2.2 Database Concurrency

Databases play a critical role in the data collection process: some of the data may be collected from equipment local databases and, above all, the main target for the collected data is usually a database.

    However, database accesses should be handled carefully because there is the

possibility of having many read and write operations using the same data rows at the same time. This may be potentially dangerous due to concurrency problems



    related to concurrent accesses. These concurrent acce