This paper offers a comprehensive literature review and a useful taxonomy of data collection techniques for studying software engineering. The taxonomy is based primarily on the degree of human intervention that the data collection process requires. The authors also discuss how to choose a technique, considering factors such as the volume of data and the purpose of the study. Each technique is described in detail, with both its advantages and its disadvantages.
Three degrees of human intervention
- FIRST DEGREE: direct contact with participants;
- SECOND DEGREE: observation of work without needing to communicate directly with participants;
- THIRD DEGREE: retrospective study of work artifacts such as source code, problem logs, or documentation.
Three factors to consider when selecting techniques
- The degree of access required to software engineers;
- The volume of data produced; and
- The type of research questions.
Understanding each technique (under each category)
| Degree | Technique | Description |
|---|---|---|
| FIRST DEGREE | Brainstorming and Focus Groups | Several people focus on a particular issue to uncover as many ideas as possible |
| | Interviews and Questionnaires | Asking a series of closed-ended or open-ended questions, designed so that the data collected is meaningful |
| | Conceptual Modeling | Participants create a model of some aspect of their work (e.g. a flow chart) |
| | Work Diaries | Participants record various events that occur during the day (e.g. recording specific events as they occur) |
| | Think-Aloud Protocols | Participants think out loud while performing a task |
| | Shadowing / Observation | Experimenters follow participants around and record their activities |
| | Participant Observation | Experimenters become part of the team while observing it |
| SECOND DEGREE | Instrumenting Systems | Mechanisms are incorporated into the software process to monitor various kinds of changes |
| | Fly on the Wall | Participants record their own activities while engaged in tasks |
| THIRD DEGREE | Analysis of Electronic Databases of Work Performed | Analyzing data recorded during the task process (e.g. problem reports, change requests) |
| | Analysis of Tool Logs | Analyzing the log data left behind during the software process |
| | Documentation Analysis | Analyzing the documents that describe the development process, such as comments, newsgroups, and email lists |
| | Static and Dynamic Analysis of a System | Analyzing the source code (static) or the traces left by running the code (dynamic) |
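To make one of the third-degree techniques concrete, the following is a minimal sketch of an analysis of tool logs. It is not taken from the paper: the log format (`<timestamp> <user> <command>`), the sample entries, and the function name `count_commands` are all assumptions made for illustration. It shows only the simplest possible analysis, counting how often each command was used.

```python
from collections import Counter

# Hypothetical log format (an assumption, not from the paper):
#   "<ISO timestamp> <user> <command>"
# The entries below are made-up sample data for illustration only.
SAMPLE_LOG = """\
2024-01-05T09:00:12 alice grep
2024-01-05T09:01:40 alice open
2024-01-05T09:02:03 bob grep
2024-01-05T09:05:47 alice grep

this line is malformed
"""


def count_commands(log_text):
    """Count how often each command appears, skipping malformed lines."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank lines and entries that do not fit the format
        _timestamp, _user, command = parts
        counts[command] += 1
    return counts


if __name__ == "__main__":
    # Print commands from most to least frequently used.
    for command, n in count_commands(SAMPLE_LOG).most_common():
        print(command, n)
```

Because the data already exists as a by-product of normal work, a script like this needs no contact with the participants, which is what places the technique in the third degree. A real study would have to adapt the parsing to whatever the tool actually logs.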
This paper is a useful reference for designing a study as a whole, not only for deciding how to collect data.