Elizabeth D. Murphy


Dissertation submitted to the Faculty of the Graduate School of the University of Maryland at College Park in partial fulfillment of the requirements for the degree of

Doctor of Philosophy





Advisory Committee:

      Professor Kent L. Norman, Chair

      Professor Emerita Nancy S. Anderson

      Professor Michael Dougherty

      Professor Katherine J. Klein

      Professor Christine M. Mitchell

      Professor Ben Shneiderman




















     This work is dedicated to my husband, John A. Murphy, without whose caring support it would not have been possible, and to the memory of my parents, Hugh Vincent and Edna Sibley Drummond, who passed on a love of reading and respect for education.





Special thanks to the distinguished faculty members who served on my committee:  Professors Kent L. Norman (chair), Nancy S. Anderson (Emerita), Michael Dougherty, Katherine J. Klein, Christine M. Mitchell, and Ben Shneiderman.  As my advisor, Dr. Norman provided detailed guidance and encouragement throughout the course of preparing for and conducting the research. His belief that it was, indeed, possible to finish kept me going. Dr. Anderson served faithfully on the committee until circumstances prevented her from attending the defense.  I am grateful for the helpful comments she provided on the draft.  Dr. Dougherty kindly filled in for Dr. Anderson, and he provided insightful comments on short notice.  Thanks to all my committee members for their support, patience, encouragement, and useful suggestions.

My thanks go to Walt Truszkowski and Sylvia Sheppard of the NASA-Goddard Space Flight Center for financial support (NASA grant NAG5-3425), which provided equipment and personnel, and for their warm encouragement. Many NASA-Goddard personnel generously contributed their time and operational expertise to answering questions about spacecraft engineering and human decision-making in spacecraft control.  They include Matthew Brandt, David Bradley, Matthew Fatig, Leigh Gatto, Peter Gonzales, Kevin Hartnett, Cathy Penafiel, Christopher Rouff, Robert Sodano, Stacey St. Pierre, Herman Williams, and William Worrall.  Thanks to personnel at the Johns Hopkins Applied Physics Laboratory for their hospitality and willingness to provide information about the missions under their control: Ray Harvey (MSX Mission Manager), Madeleine Marshall (NEAR Mission Director), and Robert Nelson (NEAR anomaly specialist). 

Major programming issues were resolved by Daniel Y. Moshinsky, an outstanding undergraduate laboratory assistant.  Thanks to Daniel for his patience with changing requirements and for brilliantly overcoming many technical obstacles in implementing the experimental simulation as well as the on-line test of spatial ability. Thanks to Kirk Norman, who performed other important programming tasks.  Several undergraduate laboratory assistants helped in administering the experimental treatment.  I'm grateful to Kelly Hennessy, Daniel Moshinsky, and Kirk Norman for their work with participants.

Thanks to a classmate, Heather Tedesco, for providing materials from her doctoral research, from which many questions were drawn for the distractor survey.  Special thanks to many friends who cheered me on from the beginning, especially Lisa Stewart, Paula VanBalen, Kelly Harwood, and Renate Roske-Shelton. For the suggestion that planted the seed, thanks to Dr. Robert Holt of George Mason University, a great teacher. And special thanks to my family for their good-natured forbearance with the process and for their pride in this accomplishment.  It was a team effort.






Dedication                                   ii


Acknowledgements                                 iii                             

List of Tables                                   


List of Figures


Chapter 1.  Introduction                           1

     1.1  Background                          1

     1.2  Definition of Terms                      3

     1.3  Literature Review                        5

          1.3.1 Effects of Automation

                on Human Performance               6

          1.3.2 Trust versus Over-reliance

                on Automation                      6

          1.3.3 Passive Monitoring in

                Supervisory Control          11

          1.3.4 Cognitive Demands in

                Autonomous, ASP-based Systems    15

          1.3.5 Limitations in Decision

                Making                            17

          1.3.6 Information Display Needs

                in On-Call Situations             20

          1.3.7 Performance Effects of

                Spatial Visualization

                Ability                           24

     1.4  Research Design                    26

          1.4.1 Independent Variables             26

          1.4.2 Dependent Variables          28

     1.5  Hypotheses                         30


Chapter 2.  Method                                35

     2.1  Participants                            35

     2.2  Materials                               36

     2.3  Simulation Environment                  37

     2.4  Procedure                               40

          2.4.1  Pilot Studies               41

          2.4.2  Pre-Experimental Procedure       41

          2.4.3  Experimental Procedure           43

          2.4.4  Data Capture and Analysis        47


Chapter 3.  Results                               48

     3.1  Effects of Practice                     48

     3.2  Monitoring versus On-Call

          Group Differences                       48

     3.3  Effects of Display-Selection

          Mode                                     54

     3.4  Effects of Display Type                 55

     3.5  Anchoring Effect of Agent

          Confidence                         59

     3.6  Relationships between Subjective

          Confidence Ratings and Performance

          Measures                                61

     3.7  Attitudes toward Automation

          (Reliability and Trust)                63

     3.8  Perceived Need to Monitor

          Automated Systems                       64

     3.9  Effects of Differences in SVA           64


Chapter 4.  Discussion, Design Implications,

and Suggestions for Further Research              75

     4.1  Monitoring versus On-Call

          Conditions                         75

     4.2  Levels of Automation               79

     4.3  Table versus Bar Chart versus

          Line Graph                         80

     4.4  Anchoring and Adjustment                81

     4.5  Subjective Confidence Predicts

          Accuracy                                84

     4.6  Novice Effects in Attitude Findings    85

     4.7  No Change in Rated Need to Monitor      86

     4.8  SVA as a Key Factor in Human-

          Computer Interaction               87

     4.9  General Discussion                      91


Appendix A:  Experimental Materials          94

      Consent Form                                94

      Demographics Survey                    95

      Pre-Experimental Automation Survey         97

      Post-Experimental Automation Survey        99


Appendix B: MOCHA Screen Shots                  100


Appendix C: Training and Test Materials

      MOCHA Problem Descriptions with

      Agent Reasoning                            110

      Sample Status Messages for the

      Monitoring Condition                       115



      Instructions for Research

      Participants and Training in

      the Experimental Task                      116

      Training in System Components              122


Appendix D: Distractor Survey for the

             On-Call Condition                124


References                                    148












List of Tables



1.  Self-reported Experience

on a Nine-Point Scale                             29


2.  Group Means and Standard Deviations

on the Main Dependent Variables                   42


3.  Tests of Between-Groups Differences

for Accuracy and Speed                            42


4.  Interaction of MOCHA Grouping Condition

and Sex on Test Score (Accuracy)                  43


5.  Summary of Linear Regression Analysis

for Display-Selection Group's Prediction

of the Number of Bar Charts Displayed

for Test Tasks                                    45


6.  Summary of Linear Regression Analysis

for Display-Selection Group's Prediction

of the Number of Timelines Displayed

for Test Tasks                                    48


7.  Correlations of Percent Correct Using

Different Display Formats on Practice

Problems and Test Problems                        49


8.  Mean Percent Correct Using Different

Display Formats Across Practice and

Test Problems                                     49


9.  Correlations of Percent Correct Using

Different Display Formats on Test

Problems                                          50


10. Mean Percent Correct Using Different

Display Formats on Test Problems                  50


11. Mean Task-Completion Times Using

Different Display Formats on Test

Problems                                          51


12. Descriptive Statistics for Mean Subjective

Confidence, Mean Test Accuracy, and Mean

Task-Completion Time                              53


13.  Summary of Linear Regression for Mean

 Subjective Confidence as a Hypothesized

 Predictor of Mean Test Accuracy and Mean

 Test Task-Completion Time                        54


14.  Score Ranges, Mean Test Scores (Accuracy)

 and Standard Deviations for the Three SVA

 Groups                                           57


15.  Pre-Test and Post-Test Ratings of the Need

 to Monitor Automated Systems by SVA Groups

 on a Nine-Point Scale                            58


































List of Figures



   1. Research design                                    27


   2. Mean task-completion time (in seconds)

      reaches asymptote over six practice

      tasks.                                            49


   3. Interaction between grouping condition

      (monitoring versus on-call) and sex

      on decision accuracy (mean test score)            53  


   4. SVA groups differ on decision accuracy as

      measured by test score                             66


   5. Scatter plot of SVA score and test score

      for men (R² = .28)                                 71


   6. Scatter plot of SVA score and test score

      for women (R² = .12)                               72


   7. Welcome screen with pre-entered subject

      number (125) and monitoring condition



   8. Sample problem from the VZ-2 test of spatial-

      visualization ability (SVA)


   9. Hierarchy of MOCHA components used for pre-

      practice training


  10. Monitoring condition: Status messages

      coming up in the Description area

      in between problems


  11. Sample MOCHA problem with system data

      displayed in a table


  12. Sample MOCHA problem with system data

      displayed in a bar chart


  13. Sample MOCHA problem with system data

      displayed in a line graph


  14. Manual display mode: Subject was

      given a choice of the display format

      to be presented (table, bar chart, or

      line graph).


  15. The Details dialog box required the

      subject to provide an explanation for

      deciding that, in this case, the actual

      problem was the problem as reported by

      the advanced software process in the

      problem description.


  16. Final screen of the MOCHA experiment









Main body