Neural Network Analysis of Flow Cytometry Immunophenotype Data
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 43, NO. 8, AUGUST 1996. Ravi Kothari,* Member, IEEE, Hernani Cualing, and Thiagarajan Balachander. Mehrshad Mokhtaran M.D.
Neural Network Analysis of Flow Cytometry Immunophenotype Data
Mehrshad Mokhtaran M.D.
IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, VOL. 43, NO. 8, AUGUST 1996
Ravi Kothari,* Member, IEEE, Hernani Cualing, and Thiagarajan Balachander
Acute Leukemia
• Definition
– Malignant event
– Replacement of the bone marrow with blasts
– Clinical complications: anemia, infection, bleeding
– Rapidly fatal
– With appropriate therapy, the natural history can be markedly altered, and many patients can be cured.
Acute Leukemia
• Etiology:
– Radiation
– Oncogenic viruses
– Genetic and congenital factors
– Chemicals and drugs
Acute Leukemia
• Incidence:
– Annual new cases (all leukemias): 8 to 10 per 100,000
– Has remained static over the past three decades
– ALL: 11%, CLL: 29%, AML: 46%, CML: 14%
– 3% of all cancers in the United States
– ALL is the most common cancer in children (<15 y)
– ALL is the second leading cause of death in children (<15 y)
– ALL has two incidence peaks by age
– AML incidence gradually increases with age
– Half of AML cases occur in patients younger than 50 y
Acute Leukemia
• Pathophysiology:
Acute Leukemia
• Classification
– Morphology
– Cytochemistry
– Cell-surface markers
– Cytoplasmic markers
– Cytogenetics
– Oncogene expression
Acute Leukemia
• The most important distinction is between AML and ALL
– Clinical behavior, prognosis, response to therapy
• AML (FAB)
– M0, M1, M2, M3: Increasing degree of differentiation
– M4, M5: Monocytic lineage
– M6: Erythroid cell lineage
– M7: Acute megakaryocytic leukemia
• ALL (FAB)
– L1
– L2
– L3
Acute Leukemia
• Cell-surface Markers:
– AML
  • Normal immature myeloid cells and blast cells from most patients with AML: CD13, CD14, CD33, CD34
  • M6, M7: Antigens restricted to red cell and platelet lineage
  • AML may express HLA-DR antigen
  • 10-20%: B- or T-cell lineage
– ALL
  • 60% of ALL: CALLA (CD10) (early pre-B-cell differentiation state)
  • Pre-B-cell ALL: 20%, CALLA-positive with intracytoplasmic immunoglobulin
  • B-cell ALL (5%): Immunoglobulin on cell surface
  • T-cell ALL (20%): CD5, CD3 or CD2 (normal early T-cell)
  • Null cell ALL (15%): Fails to express CALLA or B- or T-cell markers
  • 25% of ALL: Myeloid antigens
Acute Leukemia
• Cytogenetics and Molecular biology:
Acute Leukemia
• Clinical Manifestations:
– Decreased normal marrow function:
  • Anemia: Fatigue, pallor, headache, angina or heart failure
  • Thrombocytopenia: Bleeding (petechiae, ecchymoses, bleeding gums, epistaxis)
  • Granulocytopenia (AML>ALL): Infections (bacterial)
– Invasion of normal organs by leukemic blasts (ALL>AML):
  • Enlargement of lymph nodes, liver, spleen
  • Bone pain
  • Skin (leukemia cutis)
  • Leukemic meningitis: Headache, nausea
  • CNS (particularly in relapse): Palsies and seizures
  • Testicular involvement (particularly in relapse)
  • Any soft tissue (AML>ALL): Chloroma, myeloblastoma
– Specific subtypes of leukemia:
  • M3: DIC (disseminated intravascular coagulation)
Acute Leukemia
• Laboratory Manifestations:
– CBC
– Bone marrow aspiration and biopsy
– PT (Prothrombin Time) & PTT (Partial Thromboplastin Time)
– LDH (Lactate dehydrogenase)
– …
Acute Leukemia
• Treatment:
– Combination chemotherapy
– Bone marrow transplantation
– Stabilization:
  • Hematological
  • Metabolic
  • Psychological
• Introduction
• Data Collection
• Classifier Design
• Results
• Discussion
• Conclusion
Introduction
• Immunophenotype data
• Flow cytometry
• Lineage & Differentiation
• ALL: Immature (CALLA+), Pre-B, Mature-B, T-Lymphoblastic
• Response to chemotherapy
• AML: M1, M2, …, M8
• No relevant prognosis
Data Collection
• Flow cytometry immunophenotype data of cases with leukemia or reactive bone marrow were collected retrospectively from a computerized archival database.
• Selection Criteria:
– Confirmed diagnosis
– Complete flow cytometry antibody panel result
• Total cases: 170
– 151 leukemia and 19 nonleukemia
– 62 children and 89 adults
– 81 males and 70 females
First Phase
• Lineage Categories
• Categorize into:
– Reactive
– ALL
– Remission
– Mixed AML-ALL
– AML
Second Phase
• Categorize the ALL cases into subcategories based on differentiation
• Categorize into:
– Pre-B
– CALLA+
– T Phenotype
• Not included: Mature-B (difficulty in obtaining sufficient data for meaningful interpretation)
Data
• Validation / Training set size = 33-50%
• Only bone marrow phenotypes (most sensitive and specific)
• Excluded: Peripheral blood and cerebrospinal fluid immunophenotypes
• Flow cytometry immunophenotype data:
– Mean fluorescence intensity of a minimum of 10,000 cells analyzed using either a red or green fluorescence-tagged antibody
Data
• 27 standardized and most commonly used monoclonal antibodies with defined specificities.
• Not all of these are utilized for each case.
• Average of 15 antibodies for each case.
• At least ten antibodies are commonly used for acute leukemia as a standard practice.
• A zero value is entered if an antibody was not used.
• These values, together with an additional binary input denoting a past diagnosis of leukemia, were used as input to a neural network classifier.
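A minimal sketch of how one case might be encoded as the 27 + 1 input vector described on this slide. The antibody names in `PANEL` are placeholders (the slides do not list the 27 antibodies), and `encode_case` is a hypothetical helper, not code from the paper.

```python
# Placeholder names for the 27 panel antibodies; the real panel is not listed in the slides.
PANEL = [f"antibody_{i:02d}" for i in range(1, 28)]


def encode_case(measured, past_leukemia):
    """Build the 27 + 1 input vector for one case.

    measured: dict mapping antibody name -> mean fluorescence intensity
    past_leukemia: True if the patient has a past diagnosis of leukemia
    """
    unknown = set(measured) - set(PANEL)
    if unknown:
        raise ValueError(f"antibodies not in panel: {sorted(unknown)}")
    # Antibodies that were not used for this case get a zero value.
    vector = [float(measured.get(name, 0.0)) for name in PANEL]
    # Final element: binary flag for a past diagnosis of leukemia.
    vector.append(1.0 if past_leukemia else 0.0)
    return vector


example = encode_case({"antibody_01": 12.5, "antibody_10": 3.0}, past_leukemia=True)
print(len(example))  # 28
```

One consequence of this encoding is that a zero can mean either "not measured" or "measured as zero"; the network has no separate signal to tell the two apart.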
Classifier Design
• A feed-forward neural network
• Trained using the backpropagation algorithm
Classifier
• How many hidden layer neurons are needed for a particular task?
– Having a large number of redundant weights leads to overfitting
Classifier
• Given a network with a certain number of inputs, hidden layer neurons, and outputs, how many training samples are needed to achieve good generalization?
• For accuracy of (1-ε):
p ≥ O(W/ε)
p: Number of training samples.
W: Total number of weights in the network.
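To get a feel for the bound, the sketch below plugs in the 28-50-5 architecture and the 170 cases reported elsewhere in these slides. The bound is an order-of-magnitude statement, so the constant factor of 1 is an assumption made purely for illustration, and `samples_needed` is a hypothetical helper.

```python
import math


def samples_needed(num_weights, epsilon, constant=1.0):
    """Order-of-magnitude estimate p ~ constant * W / epsilon.

    The constant hidden in the O(.) notation is unknown; 1.0 is an assumption.
    """
    if not 0 < epsilon < 1:
        raise ValueError("epsilon must be in (0, 1)")
    return math.ceil(constant * num_weights / epsilon)


# Fully connected 28-50-5 network, bias weights not counted (assumption).
W = 28 * 50 + 50 * 5
p = samples_needed(W, epsilon=0.1)  # target accuracy of 90%
print(W, p)
```

Under these assumptions the estimate is roughly two orders of magnitude larger than the 170 available cases, which is the motivation for reducing the number of effective weights.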
Classifier
• Perturbation: Generate a large number of cases by introducing small variations in actual cases.
• Optimal Brain Damage: The weight whose removal least increases the error can be eliminated.
• Optimal Brain Surgeon: The sensitivity of an interconnection is expressed as the cumulative sum of the changes experienced by a weight during training.
• Weight Decay: Each weight has a tendency to decay to zero at a rate proportional to the magnitude of the weight.
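The perturbation idea can be sketched as follows. The slides do not say what kind of variation is introduced, so relative Gaussian noise, the `copies` and `scale` values, and the helper name `perturb_cases` are all assumptions for illustration.

```python
import random


def perturb_cases(cases, copies=10, scale=0.05, seed=0):
    """Generate extra training cases by adding small Gaussian noise.

    cases: list of (features, label) pairs; features is a list of floats
    copies: number of perturbed copies per original case (assumed value)
    scale: noise standard deviation relative to each feature value (assumed value)
    """
    rng = random.Random(seed)
    augmented = []
    for features, label in cases:
        augmented.append((list(features), label))  # keep the original case
        for _ in range(copies):
            # The noise is relative, so zero entries (antibody not used) stay zero.
            noisy = [x + rng.gauss(0.0, scale * abs(x)) for x in features]
            augmented.append((noisy, label))
    return augmented


cases = [([12.5, 0.0, 3.0], "ALL"), ([1.0, 8.0, 0.0], "AML")]
bigger = perturb_cases(cases, copies=5)
print(len(bigger))  # 12
```

Each perturbed copy keeps the label of the case it was derived from, so the noise level has to stay small enough that the label remains plausible.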
Classifier
• Inputs: 27 + 1
• Hidden: 50, reached by progressively increasing the number of hidden neurons until acceptable performance was achieved on the training data.
• Output:
– First phase (based on lineage): 5
– Second phase (based on differentiation): 3
• Learning rate (η): 0.1
• Weight Decay Coefficient (λ): 0.05
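A minimal sketch of one gradient-descent step with weight decay, using the η and λ values listed above. The slides do not give the exact penalty term, so the simple quadratic form (update w ← w − η(∂E/∂w + λw)) is an assumption, and `decay_step` is a hypothetical helper.

```python
def decay_step(weights, gradients, eta=0.1, lam=0.05):
    """One gradient-descent step with weight decay.

    Assumed update rule: w <- w - eta * (dE/dw + lam * w),
    so each weight is pulled toward zero in proportion to its magnitude.
    """
    return [w - eta * (g + lam * w) for w, g in zip(weights, gradients)]


# With a zero error gradient, each step multiplies a weight by (1 - eta * lam) = 0.995.
w = [1.0, -2.0]
for _ in range(1000):
    w = decay_step(w, [0.0, 0.0])
print(w)
```

Weights that the error gradient does not keep "pushing back" shrink toward zero over training, which is how weight decay ends up marking redundant connections.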
Results
• Mean error was acceptably low (0.0001) in both cases.
• First phase weights:
– Total: 1650
– Nonzero: 1106
– Very small value (<0.1): 544
• Second phase weights:
– Total: 1550
– Nonzero: 446
– Very small value (<0.1): 1104
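The reported totals can be checked with a short calculation. The sketch assumes a single fully connected hidden layer with no bias weights counted; that assumption is not stated in the slides, but it reproduces both totals. `count_weights` is a hypothetical helper.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    """Weights in a fully connected one-hidden-layer network, biases not counted."""
    return n_inputs * n_hidden + n_hidden * n_outputs


first = count_weights(28, 50, 5)   # lineage network
second = count_weights(28, 50, 3)  # differentiation network
print(first, second)  # 1650 1550

# The reported nonzero and very small (<0.1) counts add up to the totals.
print(1106 + 544 == first, 446 + 1104 == second)  # True True

# Fraction of weights driven to very small values in each phase.
print(round(544 / first, 2), round(1104 / second, 2))  # 0.33 0.71
```

So about a third of the first-phase weights and over two thirds of the second-phase weights ended up very small.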
Fig. 2. Performance of the network for categorization into reactive and the lineage categories of leukemia (ALL, Remission, Mixed AML-ALL, and AML).
Fig. 3. Performance of the network for categorization of ALL cases into subcategories based on differentiation (Pre-B, CALLA+, and T Phenotype).
Results
• Generalization Error:
– First phase: 10.3%
– Second phase: 10.0%
• Backpropagation without the complexity regularization term (weight decay):
– Generalization performance was poor
Discussion
• Clustering-based methods fall into one of two categories:
– Partitioning
– Hierarchical
Discussion
• Partitioning:
– e.g., k-means, fuzzy c-means clustering
– Divide the inputs so that members of a cluster are close to each other and far away from other clusters
– The shared specificity of some monoclonal antibodies makes this extremely difficult.
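For reference, a partitioning method such as k-means can be sketched in a few lines. This is plain k-means on synthetic, well-separated 2-D points, not on immunophenotype data; the function name, iteration count, and seed are illustrative assumptions.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means on a list of equal-length float lists."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            assignment[i] = distances.index(min(distances))
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


points = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
labels, centers = kmeans(points, k=2)
print(labels)
```

The method works here because the two groups are compact and far apart. When classes overlap in feature space, as the slide suggests happens with antibodies of shared specificity, the nearest-centroid assignment no longer separates them cleanly.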
Discussion
• Hierarchical:
– e.g., centroid sorting, linkage methods
– Merge the two closest data points (or clusters) at each step, and repeat the process until there is only one cluster.
– Have a better chance of succeeding given the variability in immunophenotype data
– An error in merging made early on is propagated throughout.
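The merge-until-one-cluster procedure can be sketched as below. Centroid linkage on synthetic 2-D points is used as one possible choice; the function name and the data are illustrative assumptions, not the method evaluated in the paper.

```python
def agglomerate(points):
    """Bottom-up clustering with centroid linkage; returns the merge history."""
    clusters = [[i] for i in range(len(points))]  # each point starts as its own cluster
    history = []

    def centroid(members):
        return [sum(points[i][d] for i in members) / len(members)
                for d in range(len(points[0]))]

    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    while len(clusters) > 1:
        # Find the pair of clusters whose centroids are closest.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist2(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        history.append(merged)
        # A merge is never undone, so an early mistake carries through every later step.
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return history


points = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
history = agglomerate(points)
for step in history:
    print(sorted(step))
```

Because every later merge builds on the earlier ones, the history is a strict hierarchy: there is no step at which a point wrongly merged early can be moved to another cluster.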
Conclusion
• Off-line retraining
• Extract rules from trained networks