data warehousing and data mining


Upload: ramesh-yadav

Post on 22-Aug-2014


Code No: V3244/R07 Set No: 1

III B. Tech - II Semester Regular, Examinations, April/May 2011
DATA WAREHOUSING AND DATA MINING
(Information Technology)
Time: 3 Hours Max. Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. a) What kind of patterns can be mined?

b) Discuss about data reduction. (8+8)

2. a) Draw and explain the architecture of data warehouse.
b) Explain about further development of data cube technology. (8+8)

3. a) What defines a data mining task?
b) Discuss in detail about DMQL. (8+8)

4. a) Discuss about analytical characterization.
b) Explain mining descriptive statistical measures in large databases. (8+8)

5. A database has four transactions. Let min_sup=60% and min_conf=80%.

TID    Date       items bought
------------------------------------------
T100   10/15/99   {K,A,D,B}
T200   10/15/99   {D,A,C,E,B}
T300   10/19/99   {C,A,B,E}
T400   10/22/99   {B,A,D}

Find all frequent item sets using Apriori and FP-growth, respectively. Compare the efficiency of the two mining processes. (16)
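The Apriori half of Question 5 can be checked with a short script. This is a minimal sketch and not a model answer: the names `transactions`, `support` and `apriori` are illustrative, the dates are dropped because they do not affect support counting, and min_sup=60% is read as "at least 60% of the four transactions". FP-growth and the efficiency comparison are not covered here.

```python
from itertools import combinations

# Transactions from Question 5 (dates omitted: they play no role in support counting).
transactions = {
    "T100": {"K", "A", "D", "B"},
    "T200": {"D", "A", "C", "E", "B"},
    "T300": {"C", "A", "B", "E"},
    "T400": {"B", "A", "D"},
}
MIN_SUP = 0.60  # assumption: threshold is a fraction of all transactions


def support(itemset, db):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in db.values()) / len(db)


def apriori(db, min_sup):
    """Level-wise search; returns {frozenset: support} for all frequent itemsets."""
    items = sorted(set().union(*db.values()))
    level = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while level:
        # Count support and keep only candidates that reach the threshold.
        current = {c: support(c, db) for c in level}
        current = {c: s for c, s in current.items() if s >= min_sup}
        frequent.update(current)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        level = [
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        ]
    return frequent


frequent_itemsets = apriori(transactions, MIN_SUP)
for itemset, s in sorted(frequent_itemsets.items(),
                         key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With a 60% threshold an itemset has to appear in at least three of the four transactions, so C, E and K drop out in the first pass and only combinations of A, B and D survive to later levels.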

6. a) Explain the following classification methods
i) k-Nearest neighbor classifiers
ii) case-based reasoning
iii) rough set approach
iv) fuzzy set approaches (3+3+2+2)

b) Explain classifier accuracy. (6)

7. a) Briefly outline how to compute the dissimilarity between objects described by the following types of variables:
i) Asymmetric binary variables
ii) Nominal variables
iii) Ratio-scaled variables
iv) Numeric (interval-scaled) variables

b) Explain about grid-based methods. (8+8)

8. a) Explain the generalization of structured data.
b) What is a multimedia database? What are different approaches for similarity-based retrieval in image databases?
c) How can we study time-series data? Explain with suitable example.
d) What are the basic steps in latent semantic indexing? (4x4)


www.jntuhub.com

Code No: V3244/R07 Set No: 2

III B. Tech - II Semester Regular, Examinations, April/May 2011
DATA WAREHOUSING AND DATA MINING
(Information Technology)
Time: 3 Hours Max. Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. Briefly compare the following concepts. Use an example to explain your points.
i) Snowflake schema, fact constellation, star net query model.
ii) Data cleaning, data transformation, refresh.
iii) Discovery driven cube, multifeature cube, and virtual warehouse. (16)

2. a) Briefly discuss the data smoothing techniques.
b) Explain about concept hierarchy generation for categorical data. (8+8)

3. a) Briefly discuss the various forms of Presenting and visualizing the discovered patterns.
b) Discuss about the objective measures of pattern interestingness. (8+8)

4. a) How can we specify a data mining query for characterization with DMQL?
b) Describe the transformation of a data mining query to a relational query. (8+8)

5. Explain the following.
i) Generating Association rules from frequent items.
ii) Improving the efficiency of Apriori.
iii) Constraint based Association mining. (4+4+8)

6. a) Can any ideas from association rule mining be applied to classification? Explain. (6)
b) Explain training Bayesian belief networks. (5)
c) How does tree pruning work? What are some enhancements to basic decision tree induction? (5)

7. a) Explain the categorization of major clustering methods. (6)
b) Write CURE algorithm. (4)
c) Explain DBSCAN and OPTICS with examples. (3+3)

8. a) Explain the classification and prediction analysis of multimedia data. (4)
b) What are basic measures for text retrieval? What methods are there for information retrieval? (6)
c) What is meant by ‘authoritative’ Web pages? Explain about mining the Web’s link structures to identify authoritative web page. (6)


Code No: V3244/R07 Set No: 3

III B. Tech - II Semester Regular, Examinations, April/May 2011
DATA WAREHOUSING AND DATA MINING
(Information Technology)
Time: 3 Hours Max. Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. a) Explain data mining as a step in the process of knowledge discovery.
b) Differentiate operational database systems and data warehousing. (8+8)

2. a) What is a Data Warehouse? Explain in detail.
b) Discuss about further development of Data Cube technology. (8+8)

3. a) Discuss about Concept Hierarchies in detail.
b) Explain syntax for specifying the kind of Knowledge to be mined. (8+8)

4. a) Explain efficient implementation of Attribute-Oriented Induction.
b) Discuss about analytical characterization. (8+8)

5. a) Explain the following.
i) Meta rule-guided mining of Association rules.
ii) Mining guided by additional rule constraint. (4+4)
b) How might the efficiency of Apriori be improved? Explain. (8)

6. a) Why is naïve Bayesian classification called “naive”? Briefly outline the major ideas of naïve Bayesian classification.
b) Define regression. Briefly explain about linear, non-linear and multiple regressions. (8+8)

7. a) Given the following measurements for the variable age:
16, 25, 28, 46, 29, 44, 38, 37, 54, 27
Standardize the variable by the following:
Compute the mean absolute deviation of age.
Compute the Z-score for the first four measurements. (4+4)
b) Explain clustering using representatives algorithm with example. (4)
c) Write an algorithm for DBSCAN and give an example of DBSCAN. (4)
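The arithmetic asked for in Question 7 a) can be sketched as follows. This assumes the question intends the z-score to be standardized by the mean absolute deviation computed in the first step (rather than by the standard deviation); the function names are illustrative only.

```python
# Age measurements from Question 7 a).
ages = [16, 25, 28, 46, 29, 44, 38, 37, 54, 27]


def mean(values):
    """Arithmetic mean of the measurements."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Average absolute distance of each measurement from the mean."""
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def z_scores(values):
    """Standardize each measurement as (x - mean) / mean absolute deviation.

    Assumption: the mean absolute deviation is the scale factor, as the
    wording of the question suggests; a standard-deviation z-score would
    give different numbers.
    """
    m = mean(values)
    s = mean_absolute_deviation(values)
    return [(x - m) / s for x in values]


mad = mean_absolute_deviation(ages)
first_four = z_scores(ages)[:4]
print("mean:", mean(ages))
print("mean absolute deviation:", mad)
print("z-scores of first four:", [round(z, 2) for z in first_four])
```

Under that assumption the mean is 34.4 and the mean absolute deviation works out to 9.4, so the first four measurements standardize to roughly -1.96, -1.00, -0.68 and 1.23.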

8. Explain the following:
i) Spatial association analysis
ii) Sequential pattern mining
iii) Latent semantic indexing
iv) Text mining (4x4)


Code No: V3244/R07 Set No: 4

III B. Tech - II Semester Regular, Examinations, April/May 2011
DATA WAREHOUSING AND DATA MINING
(Information Technology)
Time: 3 Hours Max. Marks: 80

Answer any FIVE Questions
All Questions carry equal marks
*****

1. a) What motivated Data Mining? Why is it important? What is Data Mining?
b) Why preprocess the data? Explain Data Cleaning. (8+8)

2. a) What are schemas for Multidimensional databases? Explain with examples.
b) Explain the implementation of Data Warehouse. (8+8)

3. a) Explain syntaxes for Concept Hierarchy Specification and Interestingness Measure Specification.
b) Describe why concept hierarchies are useful in data mining. (8+8)

4. a) Discuss why analytical characterization is needed and how it can be performed. Compare the result of two induction methods:
(1) with relevance analysis and (2) without relevance analysis.
b) How is class comparison performed? (8+8)

5. a) What is an iceberg query? Give an example. (5)
b) Explain about mining distance based association rules. (6)
c) How are meta rules useful? Explain with example. (5)

6. a) What is a decision tree? Write basic algorithm for inducing a decision tree from training samples and explain.
b) Explain about prediction in detail. (8+8)

7. a) Briefly discuss about density-based methods.
b) Explain COBWEB model. (8+8)

8. a) What kinds of association can be mined in multimedia data? What are the differences between mining association rules in multimedia databases versus in transactional databases? (6)
b) How does latent semantic indexing reduce the size of the term frequency matrix? Explain. (6)
c) Describe the construction of a multilayered web information base. (4)
