The Importance of Data Collection in the Statistical Investigative Process

Imagine this – you’re playing a new video game. 🎮 You’re on a quest, and to beat this level, you need to gather certain items hidden in the game world. You search high and low, collecting these items – each one bringing you closer to your goal. Your progress in the game depends on what you collect – and it’s exciting to see what each new item helps you accomplish, right? 

You’ve already been a data collector in your video games without even knowing it. 💡 Today, we’re going to explore how that same excitement and curiosity can drive us in our understanding of data collection in a statistical study.


Data Collection Process 

Step 1: Identify the data you need

Let’s imagine you have heard that studying a foreign language will help you do well in computer programming because both focus on learning the basic structures of language. Your research question is whether students who are studying a foreign language get better grades in computer science class than students who are not. To answer it, you will need the grades of students in one or more computer science classes, split into two groups: those taking a foreign language class and those who are not.
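To make this concrete, here is a minimal Python sketch of what that data could look like once it is split into the two groups. The field names (`takes_foreign_language`, `cs_grade`), the helper `split_grades`, and every value in `records` are made up for illustration; they are not real results.

```python
from statistics import mean

# Hypothetical records: one row per computer science student.
# All values are invented for illustration only, not real data.
records = [
    {"student": "A", "takes_foreign_language": True, "cs_grade": 88},
    {"student": "B", "takes_foreign_language": False, "cs_grade": 79},
    {"student": "C", "takes_foreign_language": True, "cs_grade": 92},
    {"student": "D", "takes_foreign_language": False, "cs_grade": 85},
]


def split_grades(rows):
    """Split CS grades into two lists by foreign-language enrollment."""
    groups = {True: [], False: []}
    for row in rows:
        groups[row["takes_foreign_language"]].append(row["cs_grade"])
    return groups


groups = split_grades(records)
language_mean = mean(groups[True])
no_language_mean = mean(groups[False])

print("Taking a foreign language:", language_mean)
print("Not taking a foreign language:", no_language_mean)
```

Notice that the comparison is only possible because each record stores both pieces of information you identified: the grade and whether the student takes a foreign language.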

Step 2: Develop a data collection plan

Next, we need a plan to gather our data. In the case of student grades, we might consider asking the school to provide them with student names removed to protect privacy. Another approach would be to develop a survey to collect the data directly from our fellow students. When developing your plan, you need to weigh the ideal data you would like against what is feasible. Ideally, you would like the school’s data, but the school might be restricted by privacy laws, in which case you would opt for a survey.

Step 3: Collect your data

With your plan in hand, it’s time to collect your data! If you are conducting a student survey, look for a time when you are most likely to get a high response rate. The computer science class might be a good place if the teacher will help you with data collection, or perhaps lunch, when most students are together.

Step 4: Prepare data for analysis

After collecting the data, it’s time to tidy up. This step involves organizing the data you’ve collected into a form that’s easy to understand and analyze. This might mean entering survey responses into a spreadsheet and double-checking for errors.
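As a rough sketch of what "tidying up" can mean in practice, the Python below standardizes yes/no answers and screens out grades that cannot be right. It assumes grades are on a 0 to 100 scale and that survey answers arrive as text; the field names, the helper `clean_response`, and the sample responses are all hypothetical.

```python
def clean_response(raw):
    """Return a tidy record, or None if the response cannot be used."""
    # Standardize the yes/no answer (handles " Yes", "Y", "no", ...).
    answer = raw["foreign_language"].strip().lower()
    if answer in ("yes", "y"):
        takes_language = True
    elif answer in ("no", "n"):
        takes_language = False
    else:
        return None  # unclear answer: set aside rather than guess

    # Convert the grade to a number and check that it is plausible.
    try:
        grade = float(raw["cs_grade"].strip())
    except ValueError:
        return None
    if not 0 <= grade <= 100:  # assumption: grades use a 0-100 scale
        return None

    return {"takes_foreign_language": takes_language, "cs_grade": grade}


# Hypothetical raw survey responses, invented for illustration.
raw_responses = [
    {"foreign_language": " Yes", "cs_grade": "91"},
    {"foreign_language": "no", "cs_grade": "84 "},
    {"foreign_language": "maybe", "cs_grade": "77"},
    {"foreign_language": "Y", "cs_grade": "910"},  # likely a typing error
]

cleaned = [clean_response(r) for r in raw_responses]
usable = [r for r in cleaned if r is not None]
print(len(usable), "usable responses out of", len(raw_responses))
```

Setting aside a questionable response is one possible choice; you could also go back to the student and ask again. Either way, decide on the rule before you look at the results.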

Now, let’s look at why this whole process is so important.


Data Collection: The Heart of the Investigative Process

[Image: different types of data going into a funnel and into a computer for processing]

Data collection is like fueling your car for a road trip – without it, you’re not going anywhere. It provides the raw material you need to start answering your question.

The type of data you collect also sets the stage for the kind of analysis you can do. For instance, suppose you later decide you also want to see whether it matters if a student is taking Spanish or French; you won’t be able to analyze that if your survey never asked which language each student was studying!

Data serves as your evidence. When you eventually share your findings, people will want to know how you know what you know. The data you collected is your proof!

Remember, when you interpret your results, you’re actually interpreting the data you’ve collected. It’s like reading a book; the data tells the story, and your job is to understand and explain it.

Finally, the quality of your data collection matters a lot. Like baking a cake, if you don’t use the right ingredients (or data), the final result might not turn out as you hoped.


Case Study – Sleep Patterns and Test Scores 

Josh, a high school junior, is intrigued by his peers’ varying sleep schedules and how they might affect their performance in school, particularly on standardized tests. 

Josh first needed to define the data he required to answer his question. He knew he needed two main pieces of data: his peers’ hours of sleep and their standardized test scores. To gather this data, Josh decided to distribute a survey to his classmates, asking them about their average hours of sleep each night and their most recent standardized test scores. 

Josh knew that getting a good representation of the school’s students was crucial, so he didn’t just distribute the survey to his friends or the people in his classes. Instead, he got permission from the principal to distribute the survey school-wide during homeroom, ensuring a diverse sample of students from freshmen to seniors, athletes to artists, and early birds to night owls. 

After the survey responses started pouring in, Josh was careful about preparing the data for analysis. He checked the responses, cleaned up any inconsistencies, and organized the data in a spreadsheet, categorizing it based on grade levels, test scores, and reported sleep duration. 
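Josh’s preparation step could be sketched in Python along these lines. This is only an illustration under stated assumptions: the field names, the helpers `is_plausible` and `organize_by_grade`, and all of the survey values are invented, and the only consistency check shown is that reported sleep falls between 0 and 24 hours.

```python
from collections import defaultdict


def is_plausible(row):
    """Average nightly sleep must fall between 0 and 24 hours."""
    return 0 <= row["sleep_hours"] <= 24


def organize_by_grade(rows):
    """Group (sleep_hours, test_score) pairs by grade level."""
    by_grade = defaultdict(list)
    for row in rows:
        if is_plausible(row):
            pair = (row["sleep_hours"], row["test_score"])
            by_grade[row["grade_level"]].append(pair)
    return dict(by_grade)


# Hypothetical survey rows, invented for illustration only.
survey_rows = [
    {"grade_level": 9, "sleep_hours": 8.0, "test_score": 82},
    {"grade_level": 11, "sleep_hours": 6.5, "test_score": 75},
    {"grade_level": 11, "sleep_hours": 65, "test_score": 90},  # inconsistent entry
    {"grade_level": 12, "sleep_hours": 7.0, "test_score": 88},
]

organized = organize_by_grade(survey_rows)
for grade_level in sorted(organized):
    print(grade_level, organized[grade_level])
```

The row reporting 65 hours of sleep is left out of the organized data because no one can sleep that long in one night. A real cleanup would likely need more checks than this single rule.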

As Josh looked over the spreadsheet filled with responses, he felt the thrill of possibility. His data collection phase was complete, and he was one step closer to understanding the relationship between sleep and test scores in his school. Little did he know, the exciting part of the statistical investigative process – the analysis – was about to start.