1
Architecture and Evaluation Issues
Julie Fitzgerald
Arlington, Virginia
August 9, 2003
2
Areas of Research for Future (KBS) Evaluations
Evaluation mechanisms
Role of participants
Data collection
Time management
Meaningfulness of results
3
Open Questions for Future (KBS) Evaluations
Evaluation Mechanisms
IET has used challenge problems (CPs) and specifications to present evaluation mechanisms, including information on the questions to be used and the grading format.
CPs are time consuming to develop.
Future research: develop a methodology for evaluating large knowledge base (KB) systems.

Role of Participants
In HPKB, the systems were tested using knowledge engineers (KEs); RKF was focused on subject matter experts (SMEs). The user affects the data we need to collect and the analysis performed on that data.
Future research:
User profiling
Interaction profiling (user to system; user to technology developer; user to outside resource)
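The user and interaction profiling above could be captured with a simple event log per participant. The sketch below is illustrative only: all names (channel labels, fields, the `UserProfile` class) are hypothetical, not part of the HPKB/RKF programs.

```python
from dataclasses import dataclass, field
from collections import Counter
from datetime import datetime
from typing import List

# Hypothetical labels for the three interaction channels named on the slide.
CHANNELS = ("user-system", "user-developer", "user-outside-resource")

@dataclass
class InteractionEvent:
    """One logged interaction during an evaluation session."""
    timestamp: datetime
    channel: str          # one of CHANNELS
    description: str

@dataclass
class UserProfile:
    """Background attributes of a participant (KE vs. SME, etc.)."""
    user_id: str
    role: str             # e.g. "KE" or "SME"
    domain_experience_years: float
    events: List[InteractionEvent] = field(default_factory=list)

    def log(self, channel: str, description: str) -> None:
        assert channel in CHANNELS
        self.events.append(InteractionEvent(datetime.now(), channel, description))

    def interaction_counts(self) -> Counter:
        """Summarize how often the user relied on each channel."""
        return Counter(e.channel for e in self.events)

# Example: an SME who asked the technology developer for help once.
profile = UserProfile("u01", "SME", 12.0)
profile.log("user-system", "posed a competency question")
profile.log("user-developer", "asked how to encode an axiom")
profile.log("user-system", "revised the axiom")
print(profile.interaction_counts())
```

A profile like this makes the KE/SME distinction explicit in the collected data, so analyses can be conditioned on user type.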
4
Open Questions for Future (KBS) Evaluations (cont.)

Data Collection
Data collection is labor intensive and invasive.
Future research:
Automated data collection and processing
At a minimum, better specification of the data needed

Time Management
Evaluations are time consuming. This is true for development, execution, and analysis; a more formalized evaluation methodology should help.
Future research:
Less painful evaluations (ongoing evaluations, evaluations in the background, automatic evaluations, other ideas?)
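One way to make data collection automatic and non-invasive, in the spirit of "evaluations in the background" above, is to instrument the system's entry points so evaluation data accumulates as a side effect of normal use. A minimal sketch, with all names (`instrumented`, `answer_question`, the log format) assumed for illustration:

```python
import functools
import json
import time

# In-memory event log; a real harness would persist this to disk or a database.
EVENT_LOG = []

def instrumented(fn):
    """Decorator that records timing for each call to an instrumented function."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        EVENT_LOG.append({
            "call": fn.__name__,
            "elapsed_s": round(time.perf_counter() - start, 6),
        })
        return result
    return wrapper

@instrumented
def answer_question(question):
    # Stand-in for a KB system's question-answering entry point.
    return f"answer to: {question}"

answer_question("What regions border Iraq?")
answer_question("Which groups operate there?")
print(json.dumps(EVENT_LOG, indent=2))
```

Because the logging lives in the decorator rather than in the evaluation procedure, the data is collected without interrupting the user, addressing the "labor intensive and invasive" concern.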
5
Open Questions for Future (KBS) Evaluations (cont.)

Meaningfulness of Results
Methods and metrics need to be better defined.
This is context dependent: we won't always be testing the same things.
Methodology development is required, especially for variable isolation and controls.
Characterization of users needs to improve:
Test users on related tasks
Track user-system interaction more closely (this requires better task decompositions)
Need to relate results back to both system and user performance
The scope of evaluations needs to widen: we need more data (more users, longer durations).
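Relating results back to both system and user performance presupposes a task decomposition in which each subtask can be attributed to one component. A sketch of that idea, using entirely hypothetical subtasks, scores, and attributions:

```python
from statistics import mean

# Hypothetical task decomposition: each subtask gets a score in [0, 1]
# and an attribution to the component primarily responsible for it.
subtask_results = [
    ("formulate query",  0.9, "user"),
    ("retrieve axioms",  0.6, "system"),
    ("apply inference",  0.7, "system"),
    ("interpret answer", 0.8, "user"),
]

def component_scores(results):
    """Average subtask scores per responsible component."""
    by_component = {}
    for name, score, component in results:
        by_component.setdefault(component, []).append(score)
    return {c: round(mean(scores), 3) for c, scores in by_component.items()}

print(component_scores(subtask_results))
# {'user': 0.85, 'system': 0.65}
```

The finer the decomposition, the more precisely a poor overall result can be traced to the system or to the user, which is exactly why better task decompositions are listed as a prerequisite.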