shodhganga.inflibnet.ac.inshodhganga.inflibnet.ac.in/bitstream/10603/21171/9/9.doc · web...


8. CONCLUSION

This research work addresses the problem of developing a framework for
Telugu cross-language information retrieval (CLIR). In particular, it is
concerned with making use of bilingual ontologies and language grammar rules
for Telugu information retrieval.

The problem we faced in evaluating CLIR with different approaches was that
retrieval performance may be affected by several factors, for example
stemming, term segmentation, and the retrieval model. The results suggest
that the language grammar based model led to much better retrieval
performance than traditional methods.

The principal objective attained in this research work, as shown by our
methodology and the results of our experiments, was an approach to
cross-language information retrieval for Telugu that uses the ontology and
language grammar rules for query and content conversion.

The research work presented in this thesis has developed a new grammar rule
based technique to process user-given queries. It leads to an improvement in
CLIR effectiveness and can also be used to improve the retrieval of relevant
information for a given Telugu query.

In this research, we provided new ways to acquire linguistic resources using
multilingual content on the web. These linguistic resources not only improve
the efficiency and effectiveness of Telugu-English cross-language information
retrieval but also have wider applications beyond CLIR. The focus for the
future will be on designing strategies that can convert the full content of
the retrieved results.


We evaluated user acceptance of the retrieval performance attained with
rule-based cross-language information retrieval for Telugu, using the
technology acceptance model (TAM).

Limitations and Future Work

The main focus of the work presented in this thesis was the investigation of
our hypothesis for rule-based cross-language information retrieval for
Telugu, namely that a CLIR system for Telugu can perform better when it uses
bilingual ontologies and language grammar rules to convert user queries, and
the content retrieved for those queries, than when it uses classic dictionary
translation approaches.
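
The contrast stated in this hypothesis can be illustrated with a toy sketch. It is not the implementation evaluated in this thesis: the two-entry lexicon, the single suffix rule, and the function names are all assumptions made for illustration, and the Telugu words are given in Roman transliteration. The baseline looks up each query token exactly as typed, while the rule-based variant first tries candidate stems produced by a grammar rule.

```python
# Toy lexicon (assumption): transliterated entries chosen for illustration only.
TOY_LEXICON = {"kukka": "dog", "pilli": "cat"}

# Hypothetical grammar rules as (suffix, replacement) pairs.
# Here: strip the plural suffix "lu" before lookup.
TOY_RULES = [("lu", "")]


def dictionary_translate(query, lexicon=TOY_LEXICON):
    """Baseline: translate only tokens found verbatim in the lexicon."""
    return [lexicon[tok] for tok in query.split() if tok in lexicon]


def rule_based_translate(query, lexicon=TOY_LEXICON, rules=TOY_RULES):
    """Try the surface form first, then every stem produced by a rule."""
    translated = []
    for tok in query.split():
        candidates = [tok]
        for suffix, replacement in rules:
            if tok.endswith(suffix) and len(tok) > len(suffix):
                candidates.append(tok[: -len(suffix)] + replacement)
        for cand in candidates:
            if cand in lexicon:
                translated.append(lexicon[cand])
                break
    return translated


if __name__ == "__main__":
    query = "kukkalu pilli"  # inflected plural followed by a base form
    print(dictionary_translate(query))
    print(rule_based_translate(query))
```

In this sketch the baseline loses the inflected token because it is not a lexicon headword, whereas the rule-based lookup recovers it. A real system would need a much richer rule set and an ontology rather than a flat dictionary.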

Content conversion is another issue. There is no gold standard or complete
set of content in the Telugu language, which implies that a content
conversion mechanism is needed for Telugu cross-language information
retrieval.

There is also a series of research aspects related to CLIR that require
further investigation, such as domain knowledge acquisition, complete
conversion of the content represented by the snippets, and the adaptation of
the algorithm for mobile devices.




APPENDIX 1

TELUGU LANGUAGE

Telugu is mainly spoken in the state of Andhra Pradesh and the Yanam district
of Pondicherry, as well as in the neighboring states of Tamil Nadu,
Pondicherry, Karnataka, Maharashtra, Odisha, Chhattisgarh, some parts of
Jharkhand and the Kharagpur region of West Bengal in India. It is also spoken
in the United States, where the Telugu diaspora numbers more than 800,000,
with the highest concentration in Central New Jersey and Silicon Valley, as
well as in Australia, New Zealand, Bahrain, Canada, Fiji, Malaysia,
Singapore, Mauritius, Ireland, South Africa, Trinidad and Tobago, the United
Arab Emirates, the United Kingdom and other western European countries, where
there is also a considerable Telugu diaspora. At 7.2% of the population,
Telugu is the third-most-spoken language in the Indian subcontinent after
Hindi and Bengali. In Karnataka, 7.0% of the population speak Telugu, and in
Tamil Nadu, where it is commonly known [123] as Telungu, 5.6%.

History and Affiliation

According to the Russian linguist Andronov [124], Telugu split from the
Proto-Dravidian languages between 1500 and 1000 BC. Inscriptions containing
Telugu words claimed to "date back to 400 B.C." were discovered in
Bhattiprolu in Guntur district. During this period the separation of the
Telugu script from the Kannada script took place. Tikkana wrote his works in
this script.

Telugu is one of the 22 official languages of India. The Andhra Pradesh
Official Language Act, 1966, declares [126] Telugu the official language of
Andhra Pradesh. This enactment was implemented by GOMs No 420 in 2005.
Telugu, along with Kannada, was declared one of the classical languages of
India in the year 2008. The fourth World Telugu Conference was organized in
Tirupathi city in the last week of December 2012 and deliberated at length on
issues related to Telugu development.

Telugu has four important dialectal areas, namely Kalinga, Telangana,
Rayalaseema and the Coastal area. Structurally, Telugu follows the
Subject-Object-Verb (SOV) pattern. There are three persons, namely first,
second and third; a two-way distinction in number, namely singular (sg.) and
plural (pl.); and a three-way distinction in gender, namely masculine,
feminine and neuter. In Telugu the feminine singular belongs to the neuter
and the feminine plural belongs to the human class. Telugu has three tenses,
namely past, present and future, and one more special tense, the future
habitual.

Telugu Script

The main elements of the Telugu alphabet are syllables; it should therefore
rightly be called a syllabary, or most appropriately a mixed
alphabetic-syllabic script. Unlike in the Roman alphabet used for English, in
the Telugu alphabet the correspondence between the symbols (graphemes) and
sounds (phonemes) is more or less exact. In its most general sense this term
refers to the whole process of morphological variation in the constitution of
words, which includes the two main divisions of inflection (word variations,
signaling, lexical relationships). However, there exist some differences
between the alphabet and the phonemic inventory of Telugu. The overall
pattern consists of 60 symbols: 16 vowels, 3 vowel modifiers and 41
consonants.

The Telugu writing system is a syllabic alphabet in which all consonants have
an inherent vowel. Diacritics, which can appear above, below, before or after
the consonant they belong to, are used to change the inherent vowel. When
they appear at the beginning of a syllable, vowels are written as independent
letters. When certain consonants occur together, special conjunct symbols are
used which combine the essential parts of each letter.
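
The roles described above (consonants, dependent vowel signs, independent vowels, conjuncts) are visible in how Telugu text is encoded in Unicode. The following is a minimal sketch based on the Unicode Telugu block rather than on anything specified in this thesis; the function names and the label strings are assumptions. It labels each code point of a string, relying on the fact that Unicode encodes a conjunct as consonant + virama + consonant.

```python
import unicodedata

# Code point ranges in the Unicode Telugu block (U+0C00-U+0C7F).
INDEPENDENT_VOWELS = range(0x0C05, 0x0C15)  # TELUGU LETTER A .. AU
CONSONANTS = range(0x0C15, 0x0C3A)          # TELUGU LETTER KA .. HA
VIRAMA = 0x0C4D                             # suppresses the inherent vowel


def classify(ch):
    """Return a coarse label for one character (labels are illustrative)."""
    name = unicodedata.name(ch, "")
    if not name.startswith("TELUGU"):
        return "other"  # non-Telugu or unassigned code point
    cp = ord(ch)
    if cp in INDEPENDENT_VOWELS:
        return "independent vowel"
    if cp in CONSONANTS:
        return "consonant"
    if cp == VIRAMA:
        return "virama"
    if name.startswith("TELUGU VOWEL SIGN"):
        return "vowel sign"
    return "other"


def describe(text):
    """Label every code point in the text, in order."""
    return [classify(ch) for ch in text]


if __name__ == "__main__":
    print(describe("\u0c15\u0c3f"))        # KA + VOWEL SIGN I
    print(describe("\u0c07"))              # independent vowel I
    print(describe("\u0c15\u0c4d\u0c37"))  # KA + VIRAMA + SSA (a conjunct)
```

A tokenizer for retrieval would normally group these code points into syllable units before stemming; that step is left out of the sketch.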

Telugu Grammar


Telugu grammar is called “Vyākaranam”. Every Telugu grammatical rule is
derived from the concepts of Pāṇini, Katyayana and Patanjali. In particular,
a high percentage of Pāṇinian aspects and techniques are borrowed in Telugu.

Gender Marking On Noun

Though the inflection classes are insensitive to gender distinctions, there
are distinctions of gender discernible from the morphology of agreement on
verbs, adjectives, possessives, predicate nominals, numerals and deictic
categories. It is necessary to identify four distinctions in gender, viz. in
the singular, nouns indicating:

• Human males, and

• Other than human males;

and in the plural, nouns indicating:

• Humans, and

• Non-humans.

This distinction is necessitated by the distribution of nouns indicating
human females, which are grouped with neuter nouns in the singular but with
human males in the plural. However, a number of nouns denoting human males
end in –du, and a number denoting human females end in –di.

Number Marking In Telugu Nouns

Telugu nouns usually occur in two numbers, singular and plural. However, only
plural nouns are explicitly marked. For a large number of nouns the form of
the plural suffix is –lu, while for some nouns of the human male category the
plural suffix alternant is –ru.
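
The plural marking described above can be sketched as a naive suffix splitter. This is an illustration under stated assumptions, not the morphological analysis used in this thesis: the input is assumed to be Roman transliteration, only the two suffixes named above are handled, and the function name is hypothetical.

```python
# Plural suffixes named in the text, in Roman transliteration.
PLURAL_SUFFIXES = ("lu", "ru")


def split_plural(noun):
    """Return (stem, suffix) if the noun ends in a listed plural suffix,
    otherwise (noun, None). Purely string-based: no lexicon is consulted."""
    for suffix in PLURAL_SUFFIXES:
        if noun.endswith(suffix) and len(noun) > len(suffix):
            return noun[: -len(suffix)], suffix
    return noun, None


if __name__ == "__main__":
    for word in ("kukkalu", "kukka", "illu"):
        print(word, split_plural(word))
```

Because the check is purely orthographic, it over-segments singular nouns that happen to end in the same letters, such as illu ("house"). A practical analyzer would therefore validate the candidate stem against a lexicon before accepting the split.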

Gender-Number-Person Marking On Nouns

When Telugu nouns function as nominal predicates, they show agreement with
the gender, number and person of the surface subject of the clause.
Pronominalized possessive nouns (possessors) show agreement (in gender,
number and person) with the nouns of possession and function as heads of
possessive phrases. In these two cases nouns are marked by pronominal
suffixes of the relevant gender-number-person. The person marking on nouns
is, however, explicit only in the 1st and 2nd person, both singular and
plural. In the case of the 3rd person, only the number is marked explicitly
and not the person.

Case Markers and Post-Positions

Nouns are usually inflected for case by case markers and post-positions to
indicate their semantic-syntactic function in clausal predication. The terms
case markers and post-positions roughly correspond to the Type-1 and Type-2
post-positions of Krishnamurti and Gwynn. They use the term post-positions,
which corresponds in meaning to prepositions in English. However, they make a
distinction between two types of post-positions, viz. Type-1 and Type-2,
based on criteria such as the freedom of distribution (bound and free) and
the nature of composition of post-positions (Type-1 post-positions are
attached to Type-2 post-positions and not vice versa).

Telugu uses a wide variety of case markers and post-positions, and
combinations of them, to indicate various relations between nouns and verbs
or between nouns. Case suffixes and post-positions fall into two types, viz.
“Grammatical” and “Semantic or location and directional”. Grammatical case
suffixes are those which express grammatical case relations such as
nominative, accusative, dative, instrumental, genitive, comitative, vocative
and causal. The semantic cases include those of nouns inflected for location
in time and space. Nouns, when attached to various combinations of adverbial
nouns and case markers or post-positions, express many more such relations.

In Telugu grammar a verb denotes the state of, or action by, a substance. A
Telugu verb may be finite or non-finite. All finite verbs and some non-finite
verbs can occur, according to the situation, before the utterance-final
juncture /#/, characterized by one of the following terminal contours: rising
pitch, meaning question; level pitch, falling pitch, meaning command. A
finite verb does not occur before any of the non-final junctures. On the
morphological level, no non-finite verb contains a morpheme indicating
person; this statement should not, however, be taken to mean that all finite
verbs necessarily contain a morpheme indicating person. Since any verb,
finite or non-finite, occurs only after some marked juncture, by definition
of these junctures, all verbs have phonetic stress or prominence on their
first syllable, which is invariably part of the root. Almost every Telugu
verb has a finite and a non-finite form. A finite form is one that can stand
as the main verb of a sentence and occur before a final pause (full stop). A
non-finite form cannot stand as a main verb and rarely occurs before a final
pause.


APPENDIX 2

Section 1: Demographic Information, Awareness and extent of usage of Computers and Internet

1. Name of the Participant :

2. Native Language :

3. Languages Known :

4. Select your age range : □ 17-19 □ 20-22 □ 23-25 □ above 25

5. Gender : □ Male □ Female

6. Occupation : □ Student □ Employed □ Others

7. Proficiency in Computers : □ Very High □ High □ Low □ Very Low □ No

8. Proficiency in Web Usage : □ Very High □ High □ Low □ Very Low □ No

9. Accessing Telugu Information over Web : □ Frequently □ Very Rare □ Never

10. How much time do you spend on the Internet to access native language content every day?

□ Not at all □ 30 Minutes □ 1 Hour □ 2-3 Hours □ More than 3 Hours

11. How often do you use the following features of internet for learning activity?

To Search for academic materials from search engines (like Google, Yahoo, Bing, MSN etc.)
□ many times a week □ at least once in a month □ about once a term □ never

To download notes or similar items like PPT, PDF, Video, Audio & Doc, etc.
□ many times a week □ at least once in a month □ about once a term □ never

To access content in native or other language through search engines (like Google, Yahoo, Bing, MSN etc.)
□ many times a week □ at least once in a month □ about once a term □ never

Section 2: Perceived Usefulness (PU)

Questions | I strongly agree | I agree | Can’t decide | I disagree | I strongly disagree

1. I would find this system useful for retrieval | □ | □ | □ | □ | □
2. Using this system, content is retrieved more quickly | □ | □ | □ | □ | □
3. The system provides content that seems to be just about exactly what I need | □ | □ | □ | □ | □
4. If I use this system, I will increase my chances of getting knowledge | □ | □ | □ | □ | □
5. The content presented by this system is easy to understand | □ | □ | □ | □ | □

Section 3: Perceived Ease of Use (PEOU)

Questions | I strongly agree | I agree | Can’t decide | I disagree | I strongly disagree

1. Interaction with this system is clear and understandable | □ | □ | □ | □ | □
2. It is easy to access the information and to become skillful at using this system | □ | □ | □ | □ | □
3. I would find this system easy to use | □ | □ | □ | □ | □
4. Learning to operate this system is easy for me | □ | □ | □ | □ | □
5. I find this system flexible to access | □ | □ | □ | □ | □

Section 4: Attitude Towards Using Technology (ATU)

Questions | I strongly agree | I agree | Can’t decide | I disagree | I strongly disagree

1. Using this system is a bad idea (negative) | □ | □ | □ | □ | □
2. This system makes retrieving information more interesting | □ | □ | □ | □ | □
3. Working with this system is fun | □ | □ | □ | □ | □
4. Using this system, it is easier to do my job | □ | □ | □ | □ | □
5. This system has the user’s best interest | □ | □ | □ | □ | □

Section 5: Behavioral Intention (BI)

Questions | I strongly agree | I agree | Can’t decide | I disagree | I strongly disagree

1. Assuming I had access to this system, I intend to use it | □ | □ | □ | □ | □
2. I will recommend this system to others | □ | □ | □ | □ | □
3. As a whole, I am satisfied with this system | □ | □ | □ | □ | □
4. As a whole, this system is successful | □ | □ | □ | □ | □

LIST OF PUBLICATIONS

[1] Dinesh Mavaluru, R. Shriram and W. Aisha Banu, “Ensemble

Approach for Cross Language Information Retrieval”, in Springer,

Lecture Notes in Computer Science, Vol. 2, pp. 274-286, H-Index -

100, ISSN No: 0302-9743, 2012. (Annexure I)

[2] Dinesh Mavaluru, R. Shriram and W. Aisha Banu, “Factors Affecting

Acceptance and Use of Telugu Cross Language Information Retrieval

System”, in International Journal of Applied Engineering Research

(IJAER), H-Index - 2, ISSN No: 0973-4562, 2013. (Annexure I)

[3] Dinesh Mavaluru and R. Shriram, “Telugu English Cross Language

Information Retrieval: A Case Study”, in International Journal of

Research in Advance Technology in Engineering (IJRATE), Volume 1,

issue 5, 2013.

[4] W. Aisha Banu, P. Sheik Abdul Khader and Dinesh Mavaluru,

“Information Retrieval in Mobile Phones Using Snippet Clustering

Methods”, in International Conference on Network and Computer

Science, Kanyakumari, IEEE Proceedings, v5-270, 2011.

CURRICULUM VITAE

Mr. Dinesh Mavaluru (RRN: 1194207) was born on 10th May 1987 in

Tirupathi, Andhra Pradesh. He did his schooling at Seven Hills High School,

Tirupathi, and secured first division. He did his Higher Secondary education at

Priyadarshini Junior College (Vizag Defence Academy), Visakhapatnam,

Andhra Pradesh, and secured first division. He obtained a Bachelor’s degree in

Computer Science from Sri Venkateswara University in 2007. He

completed a Master’s degree in Computer Applications from Karunya

University in 2010. He is currently pursuing a Ph.D. degree in Computer

Science in the Department of Computer Applications of B.S. Abdur Rahman

University. His areas of interest include information retrieval, mobile

computing and big data. He has published two papers in journals and presented

two papers at international conferences.

The e-mail id is: [email protected] and the contact number is:

+91-9790640802.