Quantitative Research Methods: A Step-by-Step Guide

What are Research Methods?

Research methods are the techniques researchers use to collect and analyze data in order to answer a specific question. In simple terms, research methods explain how you carry out a study.

Types of Research Methods

Research methods are broadly divided into two major categories based on the nature of data:

  1. Qualitative Research – focuses on descriptive, textual, and experience-based data to understand meanings, perspectives, and behaviors.

  2. Quantitative Research – focuses on numerical, measurable, and statistical data to test relationships, patterns, or outcomes.

Each of these two categories can be further subdivided into primary data (data collected firsthand) and secondary data (data drawn from existing sources), which gives four combinations in total.

1. Primary Qualitative Methods

Data you collect yourself in the form of words, ideas, or experiences.

  • Examples: Capturing students’ emotions about exam stress through personal interviews, gathering people’s opinions on climate change in a group discussion, or recording classroom interactions in descriptive notes.

2. Primary Quantitative Methods

Data you collect yourself in the form of numbers.

  • Examples: Asking 200 students to rate their satisfaction with online classes on a 1–5 scale, testing whether background music affects memory by comparing test scores, or counting how many times children participate during a lesson.

3. Secondary Qualitative Methods

Text-based data already collected by others.

  • Examples: Analysing archived transcripts from past focus groups, reviewing published case studies on workplace culture, or examining social media posts to understand public opinion.

4. Secondary Quantitative Methods

Numerical data already collected by others.

  • Examples: Using census data to measure unemployment rates, analysing national health survey figures on obesity, or studying World Bank statistics on income levels across countries.

Now that we understand the four categories, the next step is to look closely at quantitative research methods, which focus on numbers and allow researchers to test, compare, and measure patterns.

Quantitative Research Methods

A. Primary Quantitative Data Collection Methods

Primary quantitative methods mean that you, the researcher, collect fresh numerical data yourself (Herwanis, Alianur and Nasir, 2025). The goal is to measure, compare, or test relationships using structured tools.

Types of Primary Quantitative Methods

  1. Surveys and Questionnaires – Asking large groups the same set of structured questions and recording numerical responses.

  2. Experiments – Testing cause-and-effect by changing one factor and measuring the outcome.

  3. Structured Observation – Watching behaviour systematically and recording it with numbers or checklists.

I. Surveys and Questionnaires

Definition

A survey or questionnaire is a set of structured questions given to a large number of people to collect numerical data (Ranganathan and Caduff, 2023). Answers are usually in the form of tick boxes, rating scales, or multiple choice, which makes the data easy to count and compare.

Types (pick what fits your study)

  • By timing

    • Cross-sectional – one snapshot in time.

    • Repeated cross-sectional (trend) – different samples at multiple times.

    • Longitudinal / Panel – follow the same people over time.

  • By mode

    • Online (self-administered), Paper, Telephone (CATI), Face-to-face (CAPI).

  • By design

    • Anonymous vs. identified, Self-administered vs. interviewer-administered.

    • Mostly closed-ended (Likert, MCQs) with a few short open questions if needed.

When to Use Surveys and Questionnaires in a Dissertation

1. When You Want to Measure Opinions or Attitudes at Scale

Surveys are excellent when you need to capture broad opinions quickly.

  • What is the level of student satisfaction with online classes?
    → 200 students rate satisfaction on a 1–5 scale.

  • What are employees' perceptions of flexible work arrangements?
    → Staff complete a job satisfaction survey.

  • How do voters prioritize education compared to healthcare?
    → Poll with rating options.

2. When You Want to Compare Groups

Surveys allow you to separate responses into categories (like gender, location, or occupation) and then compare results.

  • To what extent do social media usage patterns differ between rural and urban residents?
    → Results compared by location.

  • What differences exist in exercise habits between men and women?
    → Responses sorted by gender.

  • How do stress levels differ between science and arts students?
    → Compare mean scores across faculties.
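
As a rough illustration of the group comparison described above, the following Python sketch averages survey scores per group. The faculty labels, the 1–5 stress ratings, and the `group_means` helper are all hypothetical and only show the mechanics.

```python
from collections import defaultdict
from statistics import mean


def group_means(responses):
    """Return the mean score per group from (group, score) pairs."""
    by_group = defaultdict(list)
    for group, score in responses:
        by_group[group].append(score)
    return {group: mean(scores) for group, scores in by_group.items()}


# Hypothetical stress ratings (1 = very low, 5 = very high); not real data.
responses = [
    ("science", 4), ("science", 5), ("science", 3),
    ("arts", 2), ("arts", 3), ("arts", 4),
]

means = group_means(responses)
for faculty, score in means.items():
    print(f"{faculty}: mean stress = {score}")
```

A gap between two sample means does not by itself show a real difference between the groups; in a dissertation the comparison would normally be followed by a significance test chosen with your supervisor.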

3. When You Need Descriptive Data About "How Many" or "How Often"

Surveys are perfect for collecting counts and frequencies.

  • How much time do teenagers spend online in an average week?
    → Numeric survey items.

  • How frequently do employees attend training workshops?
    → Frequency-based questions.

  • What percentage of tourists revisit the same destination annually?
    → Questionnaire records.
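
Counts and percentages like these are usually summarised in a frequency table. The sketch below shows one way to do that in Python; the answer options and the `frequency_table` helper are made up for illustration.

```python
from collections import Counter


def frequency_table(answers):
    """Return {answer: (count, percentage)} for a list of categorical answers."""
    counts = Counter(answers)
    total = len(answers)
    return {
        answer: (n, round(100 * n / total, 1))
        for answer, n in counts.items()
    }


# Hypothetical answers to "How often do you attend training workshops?"
answers = [
    "monthly", "never", "monthly", "yearly",
    "monthly", "never", "yearly", "monthly",
]

table = frequency_table(answers)
for answer, (count, percent) in table.items():
    print(f"{answer}: {count} respondents ({percent}%)")
```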

4. When You Want to Test Relationships Between Variables

Surveys can include multiple questions that let you check links.

  • What is the relationship between study time and GPA?
    → Survey of weekly hours vs. exam results.

  • To what extent does job autonomy correlate with job satisfaction?
    → Survey scales compared statistically.

  • How is social media use associated with sleep quality?
    → Survey measures correlated.
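
One common way to check a link between two numeric survey measures is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand so the arithmetic is visible. The study-hours and GPA values are invented, and `pearson_r` is an illustrative helper rather than a library function.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical survey answers; not real data.
study_hours = [2, 4, 6, 8, 10]
gpa = [2.1, 2.4, 3.0, 3.2, 3.9]

r = pearson_r(study_hours, gpa)
print(f"r = {r:.2f}")
```

A value near 1 or -1 indicates a strong linear association, but a survey correlation on its own does not show that one variable causes the other.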

5. When You Want to Explore Trends Across Time

With repeated surveys, you can show change.

  • What changes occur in student stress levels before and after exams?
    → Two survey waves.

  • How did shopping behaviors shift during COVID-19?
    → Surveys at different stages.

  • In what ways do voter priorities evolve between elections?
    → Repeated questionnaires.

6. When You Need Quick, Low-Cost, Wide Reach

Surveys are practical for reaching large populations efficiently.

  • What are commuters' perceptions of bus services across the city?
    → Short mobile survey.

  • How do customers across regions evaluate a new app?
    → Online form distributed widely.

  • What are university students' opinions about exam rules?
    → Mass email survey.

7. When the Topic Is Sensitive and Anonymity Encourages Honesty

Surveys let people answer privately without fear of judgment.

  • What are students' reports of bullying experiences?
    → Anonymous school survey.

  • How do employees describe workplace harassment in confidential settings?
    → Closed, no-names survey.

  • What self-reported risky behaviors do patients disclose anonymously?
    → Anonymous health survey.

8. When You Want to Reach Geographically Dispersed Groups

Surveys can cross distance easily through online tools.

  • How do the experiences of international students compare across different countries?
    → Web survey.

  • How do rural and urban communities respond differently to climate policy?
    → Online/paper distribution.

  • What are global customers' ratings of an e-commerce service?
    → E-questionnaire.

9. When You Need Standardized Answers for Statistical Testing

Surveys ensure everyone answers in the same format.

  • What are employees' ratings of leadership on a 10-item scale?
    → Comparable results across respondents.

  • How do patients score satisfaction with hospital care?
    → Likert responses produce averages.

  • What are consumers' trust scores for different brands?
    → Multiple-choice questions for easy comparison.

10. When Piloting or Pre-Testing a Bigger Study

Short surveys are often used as pilots before interviews or experiments.

  • Which social media platform is most frequently used by teenagers?
    → Survey to guide deeper focus groups.

  • What are the primary stress factors affecting students?
    → Pre-survey to design interview questions.

  • Which health habits are most prevalent in the target population?
    → Pilot data before launching a full-scale study.

II. Experiments

Definition

An experiment is a method where you change one factor (independent variable) and measure its effect on another (dependent variable) (Knight, 2019). The aim is to test cause-and-effect relationships in a controlled way.

Types of Experiments (and how they fit in a dissertation)

  • Laboratory Experiments – Best when your dissertation question needs a high level of control. Example: Testing whether background noise affects memory recall using a computer-based lab task.

  • Field Experiments – Useful when you want results in natural, real-life settings. Example: Studying whether rearranging supermarket shelves changes customer buying patterns.

  • Quasi-Experiments – Practical for dissertations where you can’t randomly assign groups. Example: Comparing exam performance of two existing classes taught with different teaching methods.

  • Natural Experiments – Chosen when real-world events act like experiments. Example: Evaluating how the sudden shift to online learning during COVID-19 impacted student outcomes compared to face-to-face cohorts.

When to Use Experiments in a Dissertation

1. When You Want to Establish Cause-and-Effect

Experiments, especially those with random assignment, are the strongest way to show that one factor directly changes another.

  • What impact does caffeine have on short-term memory performance?
    → One group drinks coffee before a recall test, another drinks decaf. The caffeinated group remembers more words.

  • To what extent does background music influence focus during study sessions?
    → Half the class revises with instrumental music, the other half in silence. The music group scores lower on concentration tests.

  • What measurable effect does a fitness app produce on daily step counts?
    → Participants track steps for two weeks without the app, then two weeks with it. Average steps rise significantly in the app period.

2. When Evaluating an Intervention or Program

Dissertations often test whether new methods or tools are more effective than traditional ones.

  • Does a new mathematics teaching method lead to improved test scores?
    → One class uses interactive online lessons, another uses standard lectures. The online group achieves higher end-of-term results.

  • To what degree does mindfulness training reduce exam-related stress?
    → Half the students attend 4 weeks of mindfulness sessions, the other half does not. The trained group reports significantly lower stress on exam day.

  • What differences emerge in engagement levels between gamified and standard e-learning platforms?
    → Students using a gamified system log in more often and spend more time on lessons than those on a plain platform.

3. When You Need Strong Control Over Variables

Experiments allow you to isolate one factor and hold everything else constant.

  • What effect do different lighting conditions have on workplace productivity?
    → Office workers are randomly assigned to bright-light and dim-light rooms. The bright group completes more tasks in the same timeframe.

  • To what extent do packaging color variations influence consumer preference?
    → Shoppers see identical cereal boxes in red, blue, and green versions. More choose the red box, demonstrating color’s impact.

  • What measurable difference do reminder emails make in survey completion rates?
    → Half the participants receive reminders, the other half does not. The reminder group shows substantially higher response rates.
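
Several of the examples above depend on assigning participants to conditions at random. A minimal sketch of one way to do that in Python follows; the participant IDs, the condition names, and the `randomly_assign` helper are assumptions for illustration only.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        condition = conditions[index % len(conditions)]
        groups[condition].append(person)
    return groups


# Hypothetical participant IDs P01..P20.
participants = [f"P{n:02d}" for n in range(1, 21)]
groups = randomly_assign(participants, ["bright", "dim"], seed=42)

for condition, members in groups.items():
    print(condition, len(members), members)
```

Dealing out a shuffled list keeps the group sizes as equal as possible, which flipping a coin for each participant does not guarantee.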

4. When You Want to Study Behavioral Responses

Experiments capture real-time reactions to different conditions.

  • What impact do emotional appeals have on donation amounts compared to factual information?
    → One donor group reads a charity ad with a sad child’s story, another reads a factual ad with statistics. The story-based group donates more.

  • To what extent do positive versus negative campaign ads alter voter perceptions?
    → Participants watch either an achievement-focused ad or an opponent-attacking ad. The negative-ad group reports stronger distrust.

  • What measurable influence does background music exert on consumer spending?
    → Customers are observed in two store settings: one with soft music, one in silence. Sales rise significantly in the music environment.

5. When Ethical or Practical Limits Prevent Real-World Testing

Controlled experiments let you simulate risky or unrealistic settings.

  • What measurable effects do phone distractions produce on driving performance?
    → Students drive a simulator: half with no interruptions, half receiving mock calls. The distracted group makes more errors.

  • How quickly do people respond to emergency alarms in simulated workplace environments?
    → Participants in a lab room imagine being at work when an alarm sounds. Researchers measure evacuation speed.

  • To what degree does sleep deprivation affect concentration ability?
    → One group sleeps 8 hours, another only 4 hours. Both take identical puzzle tests; the sleep-deprived group scores lower.

6. When Natural Events Act Like Experiments

Sometimes real-world changes create "natural experiments" you can study.

  • What measurable differences emerged in learning outcomes after the sudden shift to online classes during COVID-19?
    → 2019 (face-to-face) and 2020 (online) student cohorts are compared. Online students report more flexibility but worse concentration.

  • To what extent did a plastic bag ban alter shopping behaviors?
    → Sales data before and after the ban shows a sharp rise in reusable bag purchases.

  • What productivity changes occurred during remote work lockdowns across different sectors?
    → Employee performance data before vs. during lockdowns reveals gains in some sectors and declines in others.

7. When Comparing Multiple Conditions Directly

Experiments let you test different versions of the same thing.

  • Which teaching format produces the highest learning outcomes among lecture, video, and blended methods?
    → Three groups learn identical content via lecture, video, or blended learning. The blended group achieves the highest test scores.

  • What type of online advertisement generates the most engagement: text, image, or video?
    → Online users are randomly shown text, image, or video ads. The video version receives the most clicks.

  • Which exercise style yields the greatest fitness improvements: yoga, running, or HIIT training?
    → Groups are assigned yoga, running, or HIIT regimens. After six weeks, the HIIT group shows the largest fitness gains.

8. When Replicating or Extending Existing Research

Many dissertations repeat famous experiments in new settings.

  • Does the Stroop effect (colour-word conflict) manifest differently in bilingual versus monolingual individuals?
    → Students name ink colours in congruent vs. incongruent words. Bilinguals show distinct interference patterns.

  • To what extent does anchoring bias influence pricing decisions in Indian consumer contexts?
    → Shoppers in India see a high versus low "suggested price." High-anchor groups spend significantly more.

  • Is social loafing equally prevalent in virtual teams compared to traditional face-to-face groups?
    → Online teams perform brainstorming tasks. Virtual teams show the same reduction in individual effort as in prior face-to-face studies.

III. Structured Observation

Definition

Structured observation means watching people’s behaviour in a systematic way and recording it using numbers, checklists, or coding sheets (Fix et al., 2022). Instead of asking what people say they do, you measure what they actually do.

Types of Structured Observation (and how they fit into dissertations)

  • Frequency Counts – Best when your dissertation asks “how often does this happen?”
    Example: In an education dissertation, you might count how many times students raise their hands in a lesson to measure participation levels. In a healthcare project, you could count how often nurses wash their hands during shifts.

  • Time Sampling – Used when you want to see patterns of behaviour over time.
    Example: In a childcare study, you could record what toddlers are doing every 5 minutes across a morning session. In workplace research, you might track how often employees switch tasks within an hour.

  • Event Sampling – Ideal when your focus is on specific actions that matter most.
    Example: In a teaching dissertation, you could record every time a teacher gives praise. In a psychology project, you might log every time a child shows signs of aggression during play.

  • Rating Scales – Useful when you want to measure the intensity or quality of behaviour, not just its presence.
    Example: In a classroom study, you could rate student engagement from 1 (very low) to 5 (very high). In a clinical dissertation, you might rate patient pain levels observed during physical therapy.

  • Checklist Observations – Best when your question is about compliance with standards or rules.
    Example: In a nursing dissertation, you could use a checklist to record whether staff followed hygiene protocols step by step. In a retail study, you might tick yes/no for whether staff greeted customers on entry.

When to Use Structured Observation in a Dissertation

1. When You Want to Measure Behaviour Frequency

This is the most common use: counting actions to generate reliable numbers.

  • What is the frequency of student hand-raising during a 40-minute class?
    → Observer records each instance.

  • How often do hospital nurses sanitise their hands during shifts?
    → Each observed instance is tallied.

  • What percentage of drivers come to a complete stop at pedestrian crossings?
    → Every vehicle’s stopping behavior is recorded.

  • How do promotional items compare to regular products in shopper selection rates?
    → Choices are counted across store visits.

2. When Comparing Behaviours Across Groups or Settings

Structured observation helps identify differences across categories.

  • In what ways do playground participation patterns differ between boys and girls?
    → Observers track behaviors by gender.

  • What variations exist in customer behaviors between luxury and budget retail environments?
    → Observers record browsing time and items handled.

  • To what degree do hygiene compliance rates differ between private and public hospitals?
    → Adherence levels are compared across settings.

  • How do collaboration patterns vary between open-plan and closed office spaces?
    → Observer measures interactions and interruptions.

3. When Testing the Impact of Interventions

Observations reveal whether changes alter behaviour.

  • What measurable effect does background music have on café customer dwell time?
    → Observers record average duration before/after implementation.

  • To what extent do hand-sanitiser stations at entrances increase usage frequency?
    → Compare rates before and after installation.

  • How does classroom desk rearrangement influence group discussion frequency?
    → Number of student interactions is counted pre/post-change.

  • What impact do calorie labels on menus have on junk food selection rates?
    → Observers track choices across two menu versions.
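
For before/after observations such as the hand-sanitiser example, the tallies are usually turned into rates so that periods with different numbers of people can be compared. The sketch below uses invented counts, and `usage_rate` is a hypothetical helper.

```python
def usage_rate(users, passers_by):
    """Share of observed passers-by who performed the target behaviour."""
    if passers_by <= 0:
        raise ValueError("passers_by must be positive")
    return users / passers_by


# Hypothetical tallies from an observation sheet; not real data.
before = usage_rate(users=18, passers_by=120)
after = usage_rate(users=45, passers_by=150)

change_points = (after - before) * 100       # change in percentage points
relative_change = (after - before) / before  # change relative to the baseline

print(f"before: {before:.0%}, after: {after:.0%}")
print(f"change: {change_points:.1f} percentage points")
print(f"relative change: {relative_change:.0%}")
```

Reporting both the percentage-point change and the relative change avoids ambiguity. A simple before/after comparison cannot rule out other causes of the change, so it is worth noting anything else that differed between the two periods.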

4. When Studying Compliance With Rules or Standards

Structured observation is ideal for checking procedural adherence.

  • To what degree do food handlers consistently wear gloves in restaurant settings?
    → Observer uses a compliance checklist.

  • How thoroughly do lab technicians follow safety protocols during experiments?
    → Each step is ticked as completed or omitted.

  • What percentage of bus drivers comply with designated stop requirements?
    → Observers track adherence rates across routes.

  • How accurately do nurses implement hospital hygiene protocols?
    → Observers score adherence to standardized procedures.

5. When You Need Objective Data Instead of Self-Reports

People don’t always report behaviour accurately; observation reveals reality.

  • What discrepancies exist between reported and observed toy-sharing behaviors in children?
    → Observer tracks actual sharing instances.

  • To what extent do self-reported break times align with observed employee break durations?
    → Actual time logged versus claimed time.

  • How consistently do gym members follow machine instructions compared to their self-assessed compliance?
    → Observers record technique accuracy.

  • What differences emerge between claimed and actual nutrition label reading behaviors among shoppers?
    → Observed actions compared with survey responses.
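
When the same people provide both a self-report and an observed value, the gap can be summarised as a mean paired difference. The sketch below assumes break durations in minutes; the numbers and the `mean_discrepancy` helper are hypothetical.

```python
from statistics import mean


def mean_discrepancy(reported, observed):
    """Mean of (reported - observed) across paired measurements."""
    if len(reported) != len(observed):
        raise ValueError("each reported value needs a matching observed value")
    differences = [r - o for r, o in zip(reported, observed)]
    return mean(differences)


# Hypothetical break durations in minutes for four employees; not real data.
reported = [15, 20, 15, 30]
observed = [22, 25, 15, 38]

gap = mean_discrepancy(reported, observed)
print(f"mean reported minus observed: {gap} minutes")
```

A negative result here would mean that people, on average, reported shorter breaks than were observed.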

6. When Studying Behaviour Patterns Over Time

Time sampling and longitudinal observation reveal trends.

  • What changes occur in children’s interaction patterns across a school term?
    → Weekly observations are coded and compared.

  • How do shopper behaviors vary between morning and evening store hours?
    → Temporal patterns are documented and analyzed.

  • What seasonal differences emerge in public space usage patterns among park visitors?
    → Summer versus winter behaviors are systematically tracked.

  • How does patient engagement evolve throughout the course of therapy sessions?
    → Regular observation logs document progression.

B. Secondary Quantitative Data Collection Methods

So far, we have looked at primary quantitative methods, where you go out and collect fresh numbers yourself (Rana, Gutierrez and Oldroyd, 2021). But sometimes, for your dissertation, you don’t need (or can’t manage) to collect new data. Instead, you use quantitative data that already exists: numbers that someone else has gathered, often on a much larger scale than you could manage alone. This is called secondary quantitative research.

This approach is especially useful when your research question needs big samples, official statistics, or historical comparisons, or when time and access make primary data collection unrealistic.

Types of Secondary Quantitative Data

  • Datasets – Large databases from governments, universities, or organisations (e.g., census data, World Bank statistics, NHS health records). Perfect when your dissertation needs wide coverage or trend analysis.

  • Survey Data (conducted by someone else) – Using data from large-scale surveys like the National Family Health Survey (NFHS), Labour Force Survey, Eurobarometer, or Pew Global Attitudes Survey. This saves you from running your own huge survey while giving access to reliable, representative data.

  • Experimental Data (archived or shared) – Analysing results from experiments already carried out by other researchers, labs, or institutions. Useful if you want to replicate, validate, or extend findings without running your own experiments.

I. Datasets

Definition

A dataset is a ready-made collection of numbers compiled by governments, organisations, or researchers (Mc Grath-Lone et al., 2022). For a dissertation, datasets give you large, credible, and often long-term data that you can analyse without running your own large survey.

Types of Datasets (and how they fit into dissertations)

  • Official Statistics – Government-collected data such as census, labour force surveys, or crime statistics.
    Dissertation fit: Perfect when you need reliable, population-level numbers (e.g., poverty rates, employment trends).

  • Institutional / Organisational Datasets – Data published by international organisations, NGOs, companies, or other institutions (e.g., WHO health stats, World Bank, IMF, OECD, corporate annual datasets).
    Dissertation fit: Ideal for business, policy, or international studies needing global coverage.

  • Academic Research Datasets – Data shared from previous studies (e.g., datasets on Kaggle, ICPSR, or university repositories).
    Dissertation fit: Useful for replicating or extending existing research.

  • Big Data / Digital Datasets – Data drawn from online platforms, sensors, or digital logs (e.g., Twitter API data, Google Trends, e-commerce logs).
    Dissertation fit: Helpful when studying real-time behaviour, trends, or online activity.

  • Panel / Longitudinal Datasets – Data tracking the same people or households across time (e.g., British Household Panel Survey).
    Dissertation fit: Ideal for studying change, like income mobility or health outcomes over years.

When to Use Datasets in a Dissertation

1. When Your Research Question Needs Large-Scale Evidence

Datasets provide huge samples, beyond what you could collect yourself.

  • What is the relationship between education levels and income at a national scale?
    → Census data covering millions of households.

  • To what extent do employment patterns differ across industries?
    → National labour force dataset.

  • How do health outcomes vary across geographic regions?
    → NHS hospital records or WHO data.

2. When You Need Long-Term or Historical Data

Datasets track trends over decades.

  • What changes occurred in life expectancy between 1970 and 2020?
    → World Bank or UN historical data.

  • How did unemployment rates shift during the 2008 financial crisis?
    → Labour statistics spanning 20 years.

  • How has school enrollment changed since the 1990s?
    → National education datasets.
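
Historical questions like these often come down to loading a published time series and measuring the change between two years. The sketch below reads a small CSV extract and computes a percentage change. The column names and figures are placeholders rather than real statistics, and `load_series` and `percent_change` are illustrative helpers.

```python
import csv
import io

# Placeholder extract standing in for a file downloaded from a data portal.
# The figures are invented for illustration only.
RAW = """year,value
1990,62.0
2000,65.1
2010,68.2
2020,71.3
"""


def load_series(text):
    """Parse CSV text with 'year' and 'value' columns into {year: value}."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["year"]): float(row["value"]) for row in reader}


def percent_change(series, start, end):
    """Percentage change in the indicator from the start year to the end year."""
    return 100 * (series[end] - series[start]) / series[start]


series = load_series(RAW)
change = percent_change(series, 1990, 2020)
print(f"change 1990 to 2020: {change:.1f}%")
```

With a real download, the column names and units would need to be checked against the provider's documentation before any change is computed.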

3. When You Need Reliable, Official Statistics

Credibility is vital in dissertations, and datasets provide trusted sources.

  • What percentage of households live below the poverty line?
    → National census data.

  • How many internet users exist in each geographic region?
    → Telecom authority datasets.

  • What are the comparative crime rates across urban areas?
    → Police statistics databases.

4. When Direct Access to Participants Is Difficult

Some groups are difficult or impossible to survey yourself.

  • What employment pattern differences exist among migrant workers across European countries?
    → Eurostat migration datasets.

  • How do women’s health outcomes compare across different states?
    → Demographic and Health Surveys (DHS).

  • What year-by-year changes occur in refugee populations globally?
    → UNHCR datasets.

5. When Comparing Across Countries or Regions

International datasets are often standardised so that countries and regions can be compared on the same measures.

  • What literacy rate variations exist across South Asian nations?
    → UNESCO education statistics.

  • How does renewable energy usage compare between EU and Asian countries?
    → International Energy Agency (IEA) datasets.

  • What global differences emerge in happiness scores?
    → World Happiness Report dataset.

6. When You Want Secondary Validation or Replication

Sometimes you replicate or extend prior research rather than test brand-new questions.

  • Can published findings on income inequality be replicated using updated datasets?
    → World Bank Gini index data.

  • Do recent Eurobarometer survey waves confirm earlier trends in EU trust levels?
    → Cross-wave comparative analysis.

  • What shifts occur in climate change perceptions over time within a single country?
    → Longitudinal environmental survey datasets.

7. When Triangulating With Your Primary Data

Dissertations often combine primary and secondary data for richer insights.

  • How do local student stress survey results compare with national well-being datasets?
    → Triangulation of institutional survey data with national statistics.

  • What patterns emerge when small business owner interviews are contrasted with World Bank SME revenue data?
    → Qualitative insights validated against quantitative trends.

8. When Resources (Time, Budget, Ethics) Are Limited

Datasets provide a practical alternative to large-scale data collection.

  • What income distribution patterns exist across demographics without conducting thousands of surveys?
    → Household survey datasets.

  • How do pollution levels vary across regions without direct measurement equipment?
    → Environmental agency air-quality datasets.

II. Survey Data (Conducted by Someone Else)

Definition

This means using survey results that have already been collected by governments, research agencies, or organisations (Ponto, 2015). Instead of running your own large survey, you analyse responses from big, reliable surveys such as the National Family Health Survey (NFHS), Labour Force Survey, Eurobarometer, or Pew Global Attitudes Survey. For dissertations, this is especially useful when you need large, representative samples without the cost and time of data collection.

Types of Existing Survey Data (for dissertations)

  • National Surveys – Government surveys like NFHS (India) or Labour Force Survey (UK). → Use when you need reliable country-level evidence.

  • International Surveys – Cross-country surveys like Eurobarometer or World Values Survey. → Use when comparing attitudes across nations.

  • Specialised Surveys – Pew, Gallup, or topic-specific polls. → Use when your dissertation is about niche issues like climate change or consumer habits.

  • Panel Surveys – Surveys repeated over time, e.g., Understanding Society (UK). → Use when you need to track changes or trends.

When to Use Existing Survey Data in a Dissertation

1. When you need large, representative samples

National and international surveys cover thousands of people with careful sampling.

  • How do job satisfaction levels differ by gender across the UK? → Labour Force Survey.

  • What are the regional differences in literacy in India? → NFHS data.

  • How do Europeans view immigration? → Eurobarometer surveys.

2. When you want to compare groups across regions or countries

International surveys let you make cross-cultural or cross-country comparisons.

  • How does trust in government differ between Western and Eastern Europe? → European Social Survey.

  • How do health behaviours vary between rural and urban populations? → NFHS or WHO surveys.

  • How do attitudes to climate change differ across developed vs. developing nations? → Pew Research global surveys.

3. When your research needs trends over time

Repeated or panel surveys let you track changes across years.

  • How have attitudes to same-sex marriage changed in the US since 1990? → General Social Survey.

  • How has youth unemployment shifted across Europe in the last decade? → Eurostat labour surveys.

  • How has internet use grown in rural vs. urban India since 2000? → NFHS survey waves.

4. When primary surveys are not feasible

Large-scale surveys can be too expensive or time-consuming for students, but secondary survey data solves this.

  • Instead of surveying 5,000 households about fertility, use NFHS.

  • Instead of polling thousands about political trust, use Pew or Gallup surveys.

  • Instead of conducting a citywide transport survey, use official travel behaviour surveys.

5. When validating or extending findings

You can use existing survey data to confirm or challenge previous research.

  • Do Eurobarometer results on EU trust replicate smaller national polls?

  • Do NFHS fertility results match findings from local NGO studies?

  • Do Pew surveys on social media use align with your own small-scale survey?

6. When triangulating with your own data

Secondary survey data provides a benchmark to compare your own findings.

  • Compare your small university stress survey with national student well-being survey results.

  • Contrast your local consumer behaviour questionnaire with Gallup global trends.

  • Match your interview findings on political trust with Eurobarometer survey results.

III. Experimental Data (archived or shared)

Definition

Experimental data refers to results from experiments already conducted by other researchers, labs, or organisations (Rodd, 2024). Instead of running your own experiment, you analyse this ready-made data to answer your dissertation question.

Types of Experimental Data (and How They’re Used in Dissertations)

  • Clinical or Medical Trials – Best for dissertations in health sciences where direct experiments are not possible.

Example: Analysing WHO clinical trial datasets to see whether treatment outcomes differ by gender or age groups.

  • Laboratory Psychology/Behavioural Studies – Useful in psychology or social sciences for replicating classic findings.

Example: Re-analysing Stroop effect datasets to test whether results differ in bilingual vs. monolingual students.

  • Educational Experiments – Perfect for dissertations in education when testing new methods is impractical.

Example: Using archived classroom intervention data to check if interactive teaching improves maths performance compared to traditional lectures.

  • Economic or Policy Experiments – Valuable in business or public policy dissertations for studying interventions in real-world settings.

Example: Reviewing government-run field experiment data to measure whether financial incentives increase renewable energy adoption.

When to Use Experimental Data in a Dissertation

1. Replicating Classic Studies

Archived experimental data lets you re-test famous effects without new experiments.

  • To what extent do memory experiments from the 1990s show consistent recall patterns today?
    → Re-analyse earlier lab datasets with modern samples.

  • Does the Stroop effect manifest differently in bilingual versus monolingual participants?
    → Test using open psychology datasets.

  • Do heart disease treatments tested in trials ten years ago still show a survival advantage?
    → Secondary clinical trial data provide answers.

2. Testing New Hypotheses With Existing Data

Existing datasets often contain unexplored variables.

  • Can clinical trial data on diabetes medication reveal gender differences in side effects?
    → Fresh analysis may uncover new findings.

  • To what extent do archived school intervention data show impacts on attendance (not just grades)?
    → Re-examine variables beyond the original scope.

  • What role do personality traits, which the initial studies ignored, play in decision-making experiments?
    → Re-analyse behavioural datasets with new lenses.

3. When Your Own Experiment Isn’t Possible

Secondary experimental data provides alternatives when experiments are impractical.

  • Instead of running your own cancer drug trial, analyse WHO/hospital datasets.

  • Curious about children’s responses to learning apps? Use archived classroom intervention data.

  • Interested in consumer reactions to marketing? Use company/published trial datasets.

4. Comparing Results Across Contexts

Secondary data reveals whether effects hold across different settings.

  • To what degree do psychology experiments on stress produce consistent results in Asian vs. European samples?
    → Compare datasets to identify cultural variations.

  • What differences exist in government subsidy trial impacts on school attendance across Africa and Latin America?
    → Secondary data shows contextual variations.

  • Do marketing trials on packaging show similar consumer preferences in the US vs. India?
    → Archived datasets enable cross-market comparison.

5. Validating or Challenging Past Conclusions

Dissertations can use secondary data to test published conclusions.

  • Does a published psychology experiment show the same effect with new statistical methods?

  • To what extent do clinical trial conclusions about drug effectiveness hold when side effects are re-analysed?

  • Do government policy trial claims remain robust when examined with different variables?

Up to this point, we have explored how to collect quantitative data, whether through primary methods like surveys, experiments, or structured observation, or through secondary methods like datasets, existing surveys, and experimental archives. But collecting numbers is only half the journey. For your dissertation, the real value comes from what you do with those numbers. This next stage is called data analysis. In simple terms, data analysis means taking the raw numbers you collected (or borrowed) and making sense of them, turning spreadsheets into insights that answer your research question. In this section, we’ll explore quantitative data analysis techniques such as statistical tests, correlations, and regression, and show how these tools can be applied to answer your dissertation research questions accurately.

DATA ANALYSIS METHODS

What is Data Analysis?

Data analysis is the process of cleaning, examining, and interpreting the information you collected to make it meaningful (Dibekulu, 2020). In a dissertation, data analysis means taking your raw material, whether numbers from surveys, experiments, or datasets, or words from interviews, focus groups, or documents, and turning it into answers for your research question.

Think of it like cooking: data collection gives you the raw ingredients, but analysis is the stage where you mix, cook, and season them to create a final dish. Without analysis, your dissertation would just be a pile of uncooked ingredients, facts without meaning.

Just like data collection, data analysis methods are divided into two broad families: quantitative (numbers) and qualitative (words, meanings, stories).

Quantitative Data Analysis Techniques

  1. Descriptive Statistics

  2. Inferential Statistics

  3. Regression Analysis

  4. Cluster Analysis

  5. Cohort Analysis

  6. Correlation Analysis

1. Descriptive Statistics

Definition

Descriptive statistics are used to organise, summarise, and present data in a clear way (Yellapu, 2022). They don’t test theories or predict outcomes; they give a snapshot of “what the data looks like.” In a dissertation, descriptive statistics are often the first step of analysis, helping you present results before moving to deeper tests.

Types of Descriptive Statistics (All Major Categories)

Measures of Central Tendency

Definition: Numbers that show the "center" or typical value in your data.

  • Mean: The average (add all values, divide by number of values).
    Example use: Average exam score of your participants.

  • Median: The middle value when data is ordered from lowest to highest.
    Example use: Median age of respondents (less affected by extreme values than mean).

  • Mode: The most frequently occurring value.
    Example use: Most common shoe size in a survey.

Measures of Dispersion (Spread)

Definition: Numbers that show how spread out or varied your data is.

  • Range: Difference between the highest and lowest values.
    Example use: Show the gap between best and worst exam scores.

  • Variance: Average of squared differences from the mean.
    Example use: Measure how much student grades deviate from the average.

  • Standard Deviation: Square root of variance (shows typical distance from the mean).
    Example use: Show whether student grades are tightly clustered or widely varied.

  • Interquartile Range (IQR): Range of the middle 50% of data (between 25th and 75th percentiles).
    Example use: Show the spread of typical household incomes, ignoring extremes.
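To make the measures above concrete, here is a minimal sketch using Python's built-in statistics module. The exam scores are made-up illustrative data, not results from any study. The sketch uses the population variance (dividing by n), which matches the "average of squared differences" definition above; many statistics packages report the sample variance (dividing by n − 1) by default, so check which one your software uses.

```python
import statistics

# Hypothetical exam scores for ten participants (illustrative data only).
scores = [52, 61, 61, 68, 70, 73, 75, 79, 84, 97]

# Measures of central tendency
mean = statistics.mean(scores)        # arithmetic average
median = statistics.median(scores)    # middle value of the sorted data
mode = statistics.mode(scores)        # most frequently occurring value

# Measures of dispersion
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)   # population variance (divides by n)
std_dev = statistics.pstdev(scores)       # square root of the variance

# quantiles(n=4) returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1                             # spread of the middle 50% of scores

print(f"mean={mean}, median={median}, mode={mode}")
print(f"range={value_range}, variance={variance:.2f}, sd={std_dev:.2f}, IQR={iqr:.2f}")
```

Swapping in statistics.variance and statistics.stdev gives the sample versions of the same measures.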

Frequency Distribution

Definition: A summary showing how often each value or category appears in your data.
Example use: Show how many students chose "Strongly Agree," "Agree," or "Neutral" on a survey.

Percentages & Proportions

Definition: Ways to show the share of each group in your data.

  • Percentage: Parts per 100 (e.g., 60% female respondents).

  • Proportion: Part of the whole (e.g., 0.6 female respondents).
    Example use: 60% of respondents female, 40% male.

Cross-tabulation (Crosstabs)

Definition: A table showing two categorical variables together to see relationships.
Example use: Gender × Course Type (e.g., Male/Female × Science/Arts).
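Frequency distributions, percentages, and crosstabs are all just ways of counting. The sketch below shows one possible way to build them with Python's collections.Counter. The six survey records are invented for illustration, and in practice you would load your own responses from a file.

```python
from collections import Counter

# Hypothetical survey records: (gender, course type, Likert response).
responses = [
    ("Female", "Science", "Agree"),
    ("Male", "Arts", "Neutral"),
    ("Female", "Arts", "Strongly Agree"),
    ("Male", "Science", "Agree"),
    ("Female", "Science", "Agree"),
    ("Male", "Science", "Neutral"),
]

# Frequency distribution: how often each answer appears.
likert_counts = Counter(answer for _, _, answer in responses)

# Percentages: each answer's share of all responses.
total = len(responses)
likert_pct = {answer: 100 * n / total for answer, n in likert_counts.items()}

# Cross-tabulation: count every (gender, course type) combination.
crosstab = Counter((gender, course) for gender, course, _ in responses)

for answer, n in likert_counts.most_common():
    print(f"{answer:15s} {n:2d}  {likert_pct[answer]:.1f}%")

for gender in ("Female", "Male"):
    row = [crosstab[(gender, course)] for course in ("Science", "Arts")]
    print(gender, row)   # counts for Science, then Arts
```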

Graphs and Charts

Definition: Visual tools to display patterns in data.

  • Bar Chart: Compares categories (e.g., votes per candidate).

  • Pie Chart: Shows parts of a whole (e.g., budget allocation).

  • Histogram: Shows distribution of numerical data (e.g., age groups).

  • Line Chart: Shows trends over time (e.g., monthly sales).

  • Boxplot: Shows spread and outliers (e.g., exam score distributions).
    Example use: Histogram of age distribution; pie chart for voting preferences.

Measures of Position/Ranking

Definition: Numbers that show where a specific value stands relative to others.

  • Percentiles: Value below which a given percentage of data falls (e.g., a score at the 90th percentile is higher than 90% of scores, placing it in the top 10%).

  • Quartiles: Divides data into four equal parts (Q1, Q2=median, Q3).

  • Z-scores: Shows how many standard deviations a value is from the mean.
    Example use: Show that a student’s score was in the top 10% (90th percentile) of the class.
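As a rough illustration of position measures, the sketch below computes a z-score and a simple percentile rank for one score. The data and the helper names (z_score, percentile_rank) are hypothetical, and "percentage of scores strictly below" is only one of several percentile conventions in use.

```python
import statistics

# Hypothetical exam scores (illustrative data only).
scores = [52, 61, 61, 68, 70, 73, 75, 79, 84, 97]

mean = statistics.mean(scores)
sd = statistics.pstdev(scores)   # population standard deviation

def z_score(value):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

def percentile_rank(value):
    """Percentage of scores strictly below the given value."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)

top_score = 97
print(f"z = {z_score(top_score):.2f}, percentile rank = {percentile_rank(top_score):.0f}")
```

A common rule of thumb treats z-scores beyond about ±3 as potential outliers, but the cut-off you choose should be justified in your methodology.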

Shape of Distribution

Definition: Describes the pattern of data distribution.

  • Skewness: Measures asymmetry (positive skew = tail to the right; negative skew = tail to the left).

  • Kurtosis: Measures "tailedness" (heavy tails = more outliers; light tails = fewer outliers).
    Example use: Check whether exam results look roughly normally distributed (bell-shaped) or are skewed toward high performers.

Ratios and Rates

Definition: Comparisons between two numbers.

  • Ratio: Relationship between two quantities (e.g., 2:1 male-to-female ratio).

  • Rate: Frequency of an event relative to a population or time period (e.g., crimes per 1,000 people per year).
    Example use: Male-to-female ratio in your dataset.

Indexes/Composite Scores

Definition: A single number created by combining multiple variables or survey items.
Example use: Create a "Job Satisfaction Index" from multiple survey items (e.g., pay, work-life balance, relationships).

When to Use Descriptive Statistics in a Dissertation

1. Summarising survey results

  • How many hours per week do students study? → Mean = 14.3, SD = 2.5.

  • What % prefer online classes? → Pie chart showing 68%.

  • How often do customers shop online per month? → Frequency table.

2. Describing sample demographics

  • Age → Median = 22 years; histogram shows bell curve.

  • Gender → 55% female, 45% male.

  • Education → Frequency table: 70% undergrad, 30% postgrad.

3. Comparing basic group differences

  • Do men vs. women report different average stress levels? → Compare means.

  • Do urban vs. rural respondents differ in internet usage? → Crosstab table + bar chart.

  • Do first-year students attend more lectures than final-year students? → Means side by side.

4. Showing variability and spread

  • Two classes both average 75%, but one has SD = 2 (very consistent), the other SD = 12 (highly varied).

  • Use boxplots to show income spread among respondents.

5. Presenting trends clearly

  • Bar chart of device ownership (Laptop 80%, Tablet 40%, Smartphone 95%).

  • Line graph of attendance across 12 weeks.

  • Histogram showing exam score distribution.

6. Identifying outliers and unusual data

  • One participant’s income is far above others → spotted via boxplot.

  • Standardised z-scores highlight unusual responses.

7. Profiling sub-groups in your study

  • Male students’ mean GPA vs. female students’ mean GPA.

  • Participants under 25 vs. over 25 → Crosstab shows employment differences.

8. Describing longitudinal snapshots (before deeper analysis)

  • Average stress score at Week 1, Week 4, and Week 8 of the semester.

  • Side-by-side histograms showing the income distribution at each of three survey waves.

2. Inferential Statistics

Inferential statistics are techniques that let you go beyond your sample and make claims about the larger population it represents (Alacaci, 2004). In other words, you take data from the group you studied and infer what is likely true for the bigger group. In dissertations, this is where you move from “what happened in my data” (descriptive) to “what this means more broadly.”

Types of Inferential Statistics (with dissertation uses)

Hypothesis Testing

Definition: A formal procedure to determine if sample data provides enough evidence to reject a null hypothesis (e.g., "no difference exists") in favor of an alternative hypothesis.

Use when: You want to test if groups differ significantly or if a relationship exists.

Common Tests:

  • t-tests (independent, paired): Compares means between two groups.
    Example: Do male and female students differ in average exam scores? → Independent samples t-test.

  • ANOVA: Compares means across 3+ groups.
    Example: Do teaching methods (A, B, C) affect test scores differently? → One-way ANOVA.

  • Chi-square: Tests relationships between categorical variables.
    Example: Is voting preference associated with age group? → Chi-square test.

  • Non-parametric alternatives (Mann–Whitney, Kruskal–Wallis): Used when data isn’t normally distributed.
    Example: Compare median satisfaction scores between two small groups → Mann–Whitney U test.
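To show what an independent samples t-test actually calculates, here is a minimal sketch of the t statistic under the classic equal-variance (Student) assumption. The two score lists are invented, and the function name independent_t is hypothetical. The sketch stops at the t statistic and degrees of freedom; in a real dissertation you would normally get the p-value from statistical software (for example SPSS, R, or SciPy's scipy.stats.ttest_ind).

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)   # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_error
    return t_stat, n_a + n_b - 2

# Hypothetical exam scores for two small groups (illustrative data only).
female = [72, 75, 78, 80, 85]
male = [65, 70, 72, 74, 79]

t_stat, df = independent_t(female, male)
print(f"t = {t_stat:.2f}, df = {df}")
```

With 8 degrees of freedom, the two-tailed 5% critical value is about 2.31, so a t statistic smaller than that in absolute size would not count as a statistically significant difference.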

Confidence Intervals

Definition: A range of values (e.g., 95% CI) that likely contains the true population parameter. It quantifies uncertainty around your sample estimate.

Use when: You want to estimate where the true population value lies.

Example: Average weekly study time in your sample = 15 hrs; 95% CI [14.2, 15.8] hrs means the true population mean is likely between 14.2 and 15.8 hours.

Correlation Analysis

Definition: Measures the strength and direction of a relationship between two continuous variables (without implying causation).

Common Tests:

  • Pearson’s r: For linear relationships with normally distributed data.

  • Spearman’s ρ/Kendall’s τ: For monotonic relationships or non-normal data.
    Use when: You want to test if two variables move together.
    Example: Is there a relationship between stress levels (scale 1–10) and sleep hours? → Pearson’s correlation.
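The difference between Pearson's r and Spearman's ρ is easier to see in code: Spearman's ρ is simply Pearson's r calculated on ranks instead of raw values. The sketch below is one possible hand-rolled version with made-up stress and sleep data; the function names are hypothetical and a statistics package would normally do this for you.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both variables' spread."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: stress (scale 1-10) and nightly sleep hours.
stress = [2, 4, 5, 7, 8, 9]
sleep_hours = [8.5, 8.0, 7.0, 6.5, 6.0, 5.0]

print(f"Pearson r = {pearson_r(stress, sleep_hours):.2f}")
print(f"Spearman rho = {spearman_rho(stress, sleep_hours):.2f}")
```

In this invented data, sleep falls every time stress rises, so the rank-based measure reports a perfect negative relationship even though the raw values do not lie exactly on a straight line.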

Regression Analysis

Definition: Models the relationship between a dependent variable (outcome) and one or more independent variables (predictors).

Types:

  • Simple Regression: One predictor.
    Example: Does study time predict exam scores? → Simple linear regression.

  • Multiple Regression: Multiple predictors.
    Example: Do study hours, attendance, and sleep predict GPA? → Multiple regression.
    Use when: You want to test if variables predict an outcome.

ANOVA & MANOVA

Definition:

  • ANOVA (Analysis of Variance): Tests if means differ across 3+ groups for one outcome.

  • MANOVA (Multivariate ANOVA): Tests group differences across multiple outcomes simultaneously.
    Use when: Comparing groups on one (ANOVA) or multiple (MANOVA) dependent variables.
    Example: Do students in science, arts, and commerce faculties differ in GPA, stress levels, and study time? → MANOVA.

Chi-square Tests

Definition: Tests if two categorical variables are independent (no association) or related.

Use when: You want to test relationships between categorical variables.

Example: Is gender associated with voting preference (Party A vs. Party B)? → Chi-square test of independence.
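The chi-square test compares the counts you observed with the counts you would expect if the two variables were unrelated. Here is a minimal sketch for a 2 × 2 table with invented counts. It applies no continuity correction, and the closed-form p-value on the last lines is valid only for 1 degree of freedom; for larger tables, rely on statistical software.

```python
import math

# Hypothetical observed counts: rows = gender, columns = party preference.
observed = [
    [30, 20],   # female: Party A, Party B
    [20, 30],   # male:   Party A, Party B
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if gender and party preference were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only: upper-tail probability of the chi-square distribution.
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.4f}")
```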

Non-parametric Tests

Definition: Tests that don’t assume normal distribution or equal variance. Used for ordinal data, small samples, or skewed distributions.

Common Tests: Mann–Whitney U (2 groups), Kruskal–Wallis (3+ groups), Wilcoxon signed-rank (paired data).

Use when: Your data violates parametric test assumptions.

Example: Compare median satisfaction ratings between two small groups → Mann–Whitney U test.

Factor Analysis

Definition: Reduces a large set of variables into fewer underlying "factors" by identifying patterns in correlations.
Types:

  • EFA (Exploratory): Uncovers hidden factor structures.

  • CFA (Confirmatory): Tests if data fits a pre-specified factor model.
    Use when: You want to identify latent constructs in survey data.
    Example: A job satisfaction survey with 20 questions reduces to 3 factors: pay, workload, environment → EFA.

Structural Equation Modelling (SEM)

Definition: Combines factor analysis and regression to test complex relationships among observed and latent variables, including direct/indirect effects.

Use when: Testing theoretical models with multiple pathways.

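Example: Does leadership style shape motivation, which in turn affects job satisfaction and, ultimately, employee turnover (leadership style → motivation → job satisfaction → turnover)? → SEM.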

Survival Analysis

Definition: Analyzes "time-to-event" data, accounting for censored cases (subjects who haven’t experienced the event by study end).
Common Tests: Kaplan-Meier curves, Cox regression.

Use when: Your outcome is "time until an event occurs."

Example: In healthcare, time until patient relapse after treatment → Kaplan-Meier analysis.
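To illustrate how censored cases are handled, here is a small sketch of the Kaplan-Meier estimator. The follow-up times are invented and the function name kaplan_meier is hypothetical. At each time where a relapse occurs, the running survival probability is multiplied by the share of at-risk patients who did not relapse; censored patients leave the at-risk group without counting as events.

```python
def kaplan_meier(durations, event_observed):
    """Return [(time, survival probability)] for each time an event occurs.

    durations:      follow-up time for each subject
    event_observed: True if the event happened, False if the case is censored
    """
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        events = sum(1 for d, e in zip(durations, event_observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)   # events plus censored cases
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical months of follow-up; False marks patients still relapse-free
# when observation ended (censored cases).
months = [3, 5, 5, 8, 12, 12]
relapsed = [True, True, False, True, False, False]

curve = kaplan_meier(months, relapsed)
for t, s in curve:
    print(f"month {t}: estimated relapse-free share = {s:.2f}")
```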

When to Use Inferential Statistics in a Dissertation

1. When you need to test group differences

  • Do male vs. female students differ in exam scores? → t-test.

  • Do three teaching methods produce different outcomes? → ANOVA.

  • Do public vs. private sector workers differ in job satisfaction? → Mann–Whitney.

2. When you want to generalise findings from sample to population

  • My survey found 65% of students prefer online learning. What’s the likely percentage in the full student body? → Confidence interval estimation.

  • In a sample of 200 voters, 55% support Policy X. CI helps estimate national support.
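The voter example above can be worked through directly. This is a sketch using the common normal-approximation interval for a proportion; it assumes the 200 voters are a simple random sample, which a real survey would need to justify.

```python
import math
from statistics import NormalDist

n = 200            # sample size from the voter example
p_hat = 0.55       # sample proportion supporting Policy X
confidence = 0.95

# z value that leaves 2.5% in each tail of the standard normal (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Standard error of a proportion, then the interval around the estimate.
std_error = math.sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - z * std_error
upper = p_hat + z * std_error

print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
```

The interval runs from roughly 48% to 62%. Because it includes 50%, this sample alone would not let you claim that a majority of the population supports Policy X.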

3. When testing relationships between variables

  • Is there a significant link between hours of sleep and GPA? → Pearson correlation.

  • Is stress related to smartphone usage? → Spearman correlation.

  • Do leadership style and work motivation predict performance? → Regression analysis.

4. When you want to identify hidden patterns

  • Do survey items cluster into underlying factors like “academic pressure” vs. “social support”? → Factor analysis.

  • Do employee satisfaction scores reflect different latent variables like pay vs. recognition?

5. When dealing with categorical data

  • Is gender linked with voting preference? → Chi-square.

  • Are urban residents more likely than rural residents to own smartphones? → Chi-square test of independence.

6. When testing complex models

  • Do leadership, culture, and training predict turnover together? → SEM.

  • Do motivation and peer support mediate the relationship between workload and performance?

7. When analysing time-to-event outcomes

  • How long do patients remain relapse-free after therapy? → Survival analysis.

  • How long until employees leave after joining?

3. Regression Analysis

Definition

Regression is a statistical method used to test and predict how independent variables (predictors) influence a dependent variable (outcome) (Palmer and O’Connell, 2009). It goes beyond correlation by estimating how much the outcome changes for each unit change in a predictor, along with the direction and significance of that relationship.

Types of Regression (connected to dissertations)

Simple Linear Regression

Definition: A method that models the relationship between exactly one independent variable (predictor) and one continuous dependent variable (outcome) by fitting a straight line to the observed data.

Use when: You want to test if a single factor predicts an outcome.

Example: Do study hours predict GPA?
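"Fitting a straight line" means choosing the slope and intercept that minimise the squared prediction errors (ordinary least squares). The sketch below does this by hand for five invented students; fit_line is a hypothetical helper name, and statistical software would also report standard errors and p-values that are left out here.

```python
def fit_line(x, y):
    """Ordinary least squares for one predictor: returns slope, intercept, R-squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the outcome's variation explained by the line.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical weekly study hours and GPA (illustrative data only).
study_hours = [5, 10, 15, 20, 25]
gpa = [2.4, 2.9, 3.0, 3.4, 3.8]

slope, intercept, r_squared = fit_line(study_hours, gpa)
print(f"GPA = {intercept:.2f} + {slope:.3f} x hours  (R^2 = {r_squared:.2f})")
```

The slope is the part you interpret: it is the predicted change in GPA for each additional weekly study hour in this made-up sample.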

Multiple Linear Regression

Definition: Extends simple linear regression by using two or more independent variables to predict a single continuous dependent variable.

Use when: You want to test how multiple factors together predict an outcome.

Example: Do study hours, attendance, and sleep together predict GPA?

Logistic Regression

Definition: Used when the dependent variable is binary (two categories, e.g., yes/no). It models the probability of an outcome occurring based on predictor variables.

Use when: Your outcome has only two possible values.

Example: Do demographics predict whether a voter supports Policy X?
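Logistic regression produces coefficients on the log-odds scale, which are then converted into a probability between 0 and 1. The sketch below shows only that conversion step. The intercept and coefficients are invented for illustration (they are not estimates from any dataset), and predicted_probability is a hypothetical helper name; fitting the coefficients themselves is a job for statistical software.

```python
import math

def predicted_probability(intercept, coefficients, features):
    """Convert a linear combination of predictors (log-odds) into a probability."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical fitted model: support for Policy X by age and degree status.
intercept = -1.0
coefficients = [0.03, 0.5]   # age in years, holds a degree (1 = yes, 0 = no)

# Predicted probability for a 40-year-old voter with a degree.
p = predicted_probability(intercept, coefficients, [40, 1])
print(f"Predicted probability of support: {p:.2f}")

# exp(coefficient) is the odds ratio for a one-unit increase in that predictor.
degree_odds_ratio = math.exp(coefficients[1])
print(f"Odds ratio for holding a degree: {degree_odds_ratio:.2f}")
```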

Ordinal Regression

Definition: Designed for dependent variables with ordered categories (e.g., low/medium/high). It models the relationship between predictors and the ordered outcome categories.

Use when: Your outcome has naturally ordered categories.
Example: Do income and education predict satisfaction levels (low/medium/high)?

Multinomial Logistic Regression

Definition: An extension of logistic regression for dependent variables with three or more unordered categories. It estimates the probability of each category relative to a reference category.

Use when: Your outcome has multiple non-ordered categories.

Example: Do personality traits predict career choice (arts/science/commerce)?

Hierarchical Regression

Definition: A method where predictors are entered in blocks or steps to see how each block improves prediction of the outcome. This helps test the unique contribution of new variables.

Use when: You want to see if adding new predictors improves the model.

Example: Does adding "motivation" improve prediction of grades beyond study habits?

Stepwise Regression

Definition: An automated approach where statistical software selects predictors based on predefined criteria (e.g., p-values). It adds or removes variables step-by-step to find the best model.

Use when: You have many predictors and want to identify the most important ones.

Example: Which lifestyle factors best predict heart health?

Polynomial Regression

Definition: Models curved (non-linear) relationships between predictors and outcomes by including polynomial terms (e.g., squared or cubed predictors).

Use when: You suspect the relationship isn't straight-line.

Example: Does stress improve performance up to a point, then reduce it?

Panel/Longitudinal Regression

Definition: Analyzes data where the same subjects are measured repeatedly over time. It accounts for both within-subject changes and between-subject differences.

Use when: You have repeated measurements from the same units.

Example: Does family income predict student progress across five years?

Structural Regression Models (part of SEM)

Definition: Complex models within Structural Equation Modeling (SEM) that test relationships among multiple predictors, outcomes, mediators, and moderators simultaneously.

Use when: You need to test intricate theoretical frameworks.

Example: Do leadership and culture influence turnover indirectly through job satisfaction?

When to Use Regression in a Dissertation (common scenarios)

1. When Predicting an Outcome

  • What is the predictive relationship between study hours and exam scores?

  • To what extent do customer reviews influence purchase likelihood?

  • How does training affect employee performance?

2. When Testing Multiple Predictors Together

  • What combined effect do pay, workload, and leadership style have on job satisfaction?

  • How do study habits, class attendance, and sleep quality collectively influence GPA?

  • To what degree do diet, exercise, and genetics contribute to BMI prediction?

3. When Your Dependent Variable Is Categorical (Yes/No or Groups)

  • What is the likelihood of voting based on age, gender, and income? → Logistic regression.

  • How do workplace factors influence the probability of staff resignation? → Logistic regression.

  • Which demographic factors are associated with students' career path choices (arts/science/commerce)? → Multinomial regression.

4. When Controlling for Confounding Variables

  • What is the effect of motivation on GPA when accounting for socio-economic status?

  • How does job autonomy impact satisfaction after adjusting for years of experience?

  • To what extent does class size influence performance when controlling for teacher quality?

5. When Building Hierarchical or Step Models

  • What additional variance in grades is explained by motivation beyond study habits? → Hierarchical regression.

  • Which predictors emerge as most significant for stress when entered sequentially? → Stepwise regression.

6. When Relationships Are Non-Linear

  • How does stress affect productivity across different levels, indicating a potential curvilinear pattern? → Polynomial regression.

  • What is the nature of the relationship between income and happiness, and is it non-linear?

7. When Analysing Longitudinal or Panel Data

  • How does household income influence children's educational progress over time?

  • What is the relationship between employee satisfaction scores and turnover risk at various time points?

  • To what extent do government subsidies impact renewable energy adoption across years?

8. When Testing Indirect/Mediated Relationships

  • What is the mediating role of motivation in the relationship between leadership and performance?

  • How does self-esteem mediate the effect of social media use on well-being?

  • Through what mechanism (school choice) does parental education influence student achievement?

9. When Testing Moderation (Interaction Effects)

  • How does sleep quality moderate the relationship between study time and GPA?

  • In what way does age group alter the effect of leadership style on motivation?

  • To what extent does exercise frequency influence the association between stress and health outcomes?

10. When Validating or Replicating Past Research

  • To what extent do findings on income inequality from prior research generalize to more recent datasets?

  • How consistent are leadership predictors of turnover across different cultural settings?

  • Do established health risk factors demonstrate similar effects in a different national context?

4. Cluster Analysis

Definition

Cluster analysis is a statistical method used to group cases (people, objects, countries, behaviours) into clusters where members of a cluster are more similar to each other than to those in other clusters (Alonso-Betanzos and Bolón-Canedo, 2018). It’s exploratory: you don’t start with a hypothesis, but instead use the data to discover hidden structures.

Types of Cluster Analysis (connected to dissertations)

Hierarchical Clustering

Definition: An approach that builds nested clusters by either merging smaller clusters into larger ones (agglomerative) or splitting larger clusters into smaller ones (divisive), creating a tree-like structure called a dendrogram.

Example: Grouping universities based on research output to see natural tiers.

K-Means Clustering

Definition: Partitions cases into a pre-specified number of clusters (K) by minimizing within-cluster variance.

Example: Segmenting consumers into 4 lifestyle groups.
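The K-means procedure alternates two steps: assign each case to its nearest cluster centre, then move each centre to the average of its assigned cases. Here is a bare-bones sketch with K = 2 (rather than 4, to keep it short). The six data points and the hand-picked starting centres are assumptions made for illustration; real software usually picks starting centres randomly and repeats the run several times.

```python
import math

def k_means(points, centroids, iterations=10):
    """Basic K-means: alternate assignment and centroid-update steps."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical consumers scored on two scales, e.g. (price focus, brand focus).
consumers = [(1, 2), (2, 1), (1.5, 1.5), (8, 9), (9, 8), (8.5, 8.5)]
start = [(1, 2), (8, 9)]   # hand-picked starting centroids (an assumption)

centroids, clusters = k_means(consumers, start)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```

Because K-means relies on distances, variables measured on very different scales are usually standardised before clustering.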

K-Medoids (PAM)

Definition: Similar to K-means but uses actual data points (medoids) as cluster centers, making it more robust to outliers.

Example: Grouping hospitals based on performance indicators without being skewed by outliers.

Two-Step Clustering

Definition: Handles large datasets with mixed data types (categorical + continuous) by first pre-clustering cases into subclusters, then grouping subclusters into final clusters.

Example: Grouping patients based on age, gender, and health markers.

Fuzzy Clustering

Definition: Allows cases to belong to multiple clusters with varying degrees of membership (weights between 0 and 1).

Example: Students who partly belong to "high achiever" and "average" clusters.

Model-Based Clustering (Gaussian Mixture Models)

Definition: Uses probability distributions (e.g., Gaussian) to model clusters, handling overlapping group boundaries.

Example: Classifying customer purchase behaviour when group boundaries overlap.

Density-Based Clustering (DBSCAN, OPTICS)

Definition: Forms clusters based on dense regions of data points, identifying outliers as noise.

Example: Identifying unusual clusters of online fraud transactions.

Spectral Clustering

Definition: Uses graph theory and eigenvalues to detect non-linear cluster boundaries in complex data.

Example: Clustering complex social network communities.

Self-Organising Maps (SOMs, Neural Network Clustering)

Definition: An AI-based method that reduces dimensionality and maps cases onto a grid, preserving topological relationships.

Example: Grouping online learners into engagement types using clickstream data.

Biclustering/Co-clustering

Definition: Simultaneously clusters both rows and columns of a data matrix (e.g., genes and conditions).
Example: Grouping genes and conditions in bioinformatics dissertations.

When to Use Cluster Analysis in a Dissertation

1. When segmenting populations

  • Group consumers into clusters: budget buyers, quality seekers, brand loyalists.

  • Group students by learning styles: visual, practical, theoretical.

  • Group patients by symptoms: mild, moderate, severe.

2. When reducing complexity

  • Summarise respondents described by hundreds of survey variables as 3–4 meaningful clusters.

  • Reduce large social media user datasets into engagement types.

  • Collapse thousands of companies into clusters by innovation level.

3. When discovering hidden patterns (exploratory work)

  • Find clusters of political attitudes from survey data.

  • Identify clusters of social values across countries.

  • Detect clusters of employees based on job motivation profiles.

4. When informing interventions or strategies

  • Tailor teaching methods to “high stress” vs. “low stress” student clusters.

  • Create different marketing campaigns for “eco-conscious” vs. “price-driven” consumers.

  • Assign different healthcare treatments to patient risk clusters.

5. When validating or comparing groups

  • Compare clusters from 2023 vs. 2010 consumer data to see if groups have changed.

  • Compare political belief clusters across regions/countries.

  • Compare clusters in online vs. offline education datasets.

6. When detecting anomalies/outliers

  • Fraudulent transactions may form a distinct cluster.

  • Abnormal voting patterns may appear as outlier clusters.

  • Rare diseases may show up as small, unusual patient clusters.

7. When combining with other methods

  • Use regression after clustering to predict cluster membership.

  • Use cohort analysis inside clusters (e.g., track high-engagement vs. low-engagement student clusters over time).

  • Use qualitative follow-up interviews on members of each cluster to explain differences.

5. Cohort Analysis

Definition

Cohort analysis is a method where you track and compare groups of people (cohorts) who share a common characteristic or starting point over time (Wang and Kattan, 2020). Instead of treating all participants as one, you split them into cohorts (e.g., by year of birth, by first year at university, by date of joining a company) and examine how their behaviours, outcomes, or attitudes change.

Types of Cohort Analysis (and dissertation uses)

Birth Cohorts

Definition: Groups defined by year or decade of birth, allowing researchers to study generational differences and the impact of historical events on specific age groups.

Example: Comparing internet use patterns of Gen Z vs. Millennials.

Academic/Educational Cohorts

Definition: Groups defined by starting year in a course/program.

Example: Tracking 2019 vs. 2023 entrants at a university and comparing dropout rates.

Employment Cohorts

Definition: Groups defined by job entry year or position.

Example: Comparing job satisfaction between employees hired before and after remote-work policies.

Geographic Cohorts

Definition: Groups defined by location.

Example: Analysing health outcomes of urban vs. rural cohorts born in the same year.

Behavioural Cohorts

Definition: Groups defined by shared behaviour.

Example: Grouping users who joined an app in the same month and tracking retention.
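A retention table is the classic output of this kind of cohort analysis. The sketch below groups users by signup month and calculates the share of each cohort still active 0, 1, and 2 months later. The five user records and the data layout are invented for illustration; a real activity log would need to be reshaped into something similar first.

```python
from collections import defaultdict

# Hypothetical records: (user id, signup month, set of months-since-signup
# in which the user was active). Month 0 is the signup month itself.
users = [
    ("u1", "2024-01", {0, 1, 2}),
    ("u2", "2024-01", {0, 1}),
    ("u3", "2024-01", {0}),
    ("u4", "2024-02", {0, 1, 2}),
    ("u5", "2024-02", {0, 2}),
]

# Step 1: split users into cohorts by signup month.
cohorts = defaultdict(list)
for _, signup_month, active_months in users:
    cohorts[signup_month].append(active_months)

# Step 2: for each cohort, compute the share active at each month offset.
retention = {}
for month, members in sorted(cohorts.items()):
    retention[month] = [
        sum(1 for active in members if offset in active) / len(members)
        for offset in range(3)
    ]

for month, shares in retention.items():
    print(month, [f"{share:.0%}" for share in shares])
```

Reading across a row shows how one cohort's activity changes over time; reading down a column compares cohorts at the same stage of their lifecycle.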

Event-based Cohorts

Definition: Groups defined by exposure to a specific event.

Example: Comparing academic outcomes of students who started school before vs. during COVID-19.

Product/Consumer Cohorts

Definition: Groups defined by when they first used or purchased something.

Example: Tracking customers who subscribed to a service in January vs. July to see churn differences.

Policy Cohorts

Definition: Groups created by policy changes or reforms.

Example: Comparing patients treated before vs. after a new healthcare policy.

When to Use Cohort Analysis in a Dissertation

1. When comparing groups over time

  • How does stress change for students who entered university in 2019 vs. 2022?

  • Do employees hired before COVID-19 adapt differently to remote work than those hired after?

  • Do consumer spending habits differ for people who started using online shopping pre- vs. post-pandemic?

2. When examining generational or birth cohort effects

  • Do Millennials and Gen Z have different attitudes to sustainability?

  • Do baby boomers show different health behaviours than younger cohorts?

  • Do Gen Z students learn differently from Millennials (Gen Y)?

3. When measuring the impact of a major event or policy

  • Did students who experienced school closures during COVID-19 perform differently than earlier cohorts?

  • Did employees promoted under a new HR policy show different retention rates than those promoted before it?

  • Did patients diagnosed after a health reform recover better than those diagnosed before it?

4. When tracking long-term outcomes

  • How do graduates from the 2010 cohort compare to the 2020 cohort in terms of career success?

  • How do patients’ survival rates differ by treatment cohort year?

  • How do the attitudes of a cohort of first-time voters shift across election cycles?

5. When studying user or consumer behaviour

  • Do app users who joined in January stay active longer than those who joined in June?

  • Do customers who bought during sales events return more often than regular buyers?

  • Do first-time online shoppers from 2015 behave differently than those from 2022?

6. When combining with other analysis methods

  • Use regression within cohorts to predict outcomes.

  • Use cluster analysis to see how cohorts split into behavioural groups.

  • Use survival analysis to check how long each cohort persists before dropping out.
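As an illustration of the survival analysis pairing, the sketch below hand-rolls a Kaplan-Meier estimator, one common survival technique, chosen here as an assumption since the guide does not name a specific estimator. The cohort data and the kaplan_meier helper are hypothetical: durations are semesters enrolled, and observed marks whether the student actually dropped out (True) or was still enrolled when data collection ended (False, i.e., censored).

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time a dropout occurs."""
    survival = 1.0
    curve = []
    event_times = sorted({d for d, e in zip(durations, observed) if e})
    for t in event_times:
        # Everyone whose follow-up lasted at least until t is still at risk.
        at_risk = sum(1 for d in durations if d >= t)
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohorts: (semesters enrolled, dropout observed?)
cohorts = {
    2019: ([2, 4, 6, 6, 8], [True, True, False, True, False]),
    2022: ([1, 2, 3, 4], [True, True, False, False]),
}

curves = {
    year: kaplan_meier(durations, observed)
    for year, (durations, observed) in cohorts.items()
}

for year, curve in curves.items():
    steps = ", ".join(f"semester {t}: {s:.2f}" for t, s in curve)
    print(f"{year} cohort -> {steps}")
```

Censored students still count in the at-risk group up to the point where their follow-up ends, which is why this approach is preferable to simply dividing dropouts by cohort size when cohorts have been observed for different lengths of time.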

 

Conclusion

By now, you should have a clearer picture of how quantitative research can strengthen your dissertation and the tools available to analyse your data effectively. And remember, no dissertation is limited to one approach. If your study calls for exploring personal meanings and lived experiences, our Qualitative Research Methods guide will support you. If you need the best of both worlds, a Mixed Methods approach may be the way forward. If you ever feel stuck, whether it’s designing your questionnaire, running statistical tests, or writing up results, our experts at AssignmentHelp4Me are here to guide you at every step of your dissertation journey.

References

Alacaci, C. (2004). Inferential Statistics: Understanding Expert Knowledge and its Implications for Statistics Education. Journal of Statistics Education, 12(2). doi:https://doi.org/10.1080/10691898.2004.11910737.

Alonso-Betanzos, A. and Bolón-Canedo, V. (2018). Big-Data Analysis, Cluster Analysis, and Machine-Learning Approaches. Advances in Experimental Medicine and Biology, pp.607–626. doi:https://doi.org/10.1007/978-3-319-77932-4_37.

Bazen, A., Barg, F.K. and Takeshita, J. (2021). Research techniques made simple: An introduction to qualitative research. Journal of Investigative Dermatology, [online] 141(2), pp.241–247. doi:https://doi.org/10.1016/j.jid.2020.11.029.

Bowen, G. (2009). Document Analysis as a Qualitative Research Method. [online] ResearchGate. Available at: https://www.researchgate.net/publication/240807798_Document_Analysis_as_a_Qualitative_Research_Method.

Dibekulu, D. (2020). An Overview of Data Analysis and Interpretations in Research. [online] ResearchGate. doi:https://doi.org/10.14662/IJARER2020.015.

Dufour, I.F. and Richard, M.-C. (2019). Theorizing from secondary qualitative data: A comparison of two data analysis methods. Cogent Education, [online] 6(1). doi:https://doi.org/10.1080/2331186x.2019.1690265.

Fix, G.M., Kim, B., Ruben, M. and McCullough, M.B. (2022). Direct Observation methods: a Practical Guide for Health Researchers. PEC Innovation, 1(1), p.100036. doi:https://doi.org/10.1016/j.pecinn.2022.100036.

Harun, M. (2025). Audio-Visual Materials: Types, Applications, Benefits, Features, Needs, and Enhance the Learning Experience. [online] Library & Information Management. Available at: https://limbd.org/audio-visual-materials-types-applications-benefits-features-needs-and-enhance-the-learning-experience/ [Accessed 8 Sep. 2025].

Herwanis, D., Alianur, M. and Nasir, K. (2025). Quantitative Research: Data Collection and Data Analysis for Beginner Researcher in Discussion. Quantitative Research: Data Collection and Data Analysis for Beginner Researcher in Discussion, [online] 1(3), pp.54–71. doi:https://doi.org/10.63142/educompassion.v1i3.88.

Jamshed, S. (2014). Qualitative research method-interviewing and observation. Journal of Basic and Clinical Pharmacy, 5(4), pp.87–88. doi:https://doi.org/10.4103/0976-0105.141942.

Kiger, M.E. and Varpio, L. (2020). Thematic Analysis of Qualitative data: AMEE Guide no. 131. Medical Teacher, [online] 42(8), pp.846–854. doi:https://doi.org/10.1080/0142159X.2020.1755030.

Knight, K.L. (2019). Study/Experimental/Research Design: Much More Than Statistics. Journal of Athletic Training, [online] 45(1), pp.98–100. doi:https://doi.org/10.4085/1062-6050-45.1.98.

Kumar, A. (2023). Observation Method. Library Philosophy and Practice, [online] 13(6), pp.1–14. Available at: https://www.researchgate.net/publication/360808469_OBSERVATION_METHOD.

Mathers, N., Fox, N.J. and Hunn, A. (2000). Using interviews in a research project. [online] ResearchGate. Available at: https://www.researchgate.net/publication/253117832_Using_Interviews_in_a_Research_Project.

Mc Grath-Lone, L., Jay, M.A., Blackburn, R., Gordon, E., Zylbersztejn, A., Wiljaars, L. and Gilbert, R. (2022). What makes administrative data research-ready? International Journal of Population Data Science, 7(1). doi:https://doi.org/10.23889/ijpds.v7i1.1718.

McLeod, S. (2024). Narrative Analysis In Qualitative Research. ResearchGate. [online] doi:https://doi.org/10.13140/RG.2.2.30491.07200.

Palmer, P.B. and O’Connell, D.G. (2009). Regression Analysis for Prediction: Understanding the Process. Cardiopulmonary Physical Therapy Journal, [online] 20(3), p.23. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC2845248/.

Ponto, J. (2015). Understanding and Evaluating Survey Research. Journal of the Advanced Practitioner in Oncology, [online] 6(2), pp.168–171. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC4601897/.

Rana, J., Gutierrez, P.L.L. and Oldroyd, J. (2021). Quantitative Methods. [online] ResearchGate. Available at: https://www.researchgate.net/publication/352356475_Quantitative_Methods.

Ranganathan, P. and Caduff, C. (2023). Designing and validating a research questionnaire. Perspectives in Clinical Research, [online] 14(3), pp.152–155. doi:https://doi.org/10.4103/picr.picr_140_23.

Rodd, J.M. (2024). Moving experimental psychology online: How to obtain high quality data when we can’t see our participants. Journal of Memory and Language, 134, p.104472. doi:https://doi.org/10.1016/j.jml.2023.104472.

Talja, S. (2025). Analyzing Qualitative Interview Data. Library & Information Science Research, 21(4), pp.459–477.

Teufel-Shone, N.I. and Williams, S. (2010). Focus Groups in Small Communities. Preventing Chronic Disease, [online] 7(3), p.A67. Available at: https://pmc.ncbi.nlm.nih.gov/articles/PMC2879999/.

Tie, Y.C., Birks, M. and Francis, K. (2019). Grounded Theory research: a Design Framework for Novice Researchers. SAGE Open Medicine, [online] 7(1), pp.1–8. doi:https://doi.org/10.1177/2050312118822927.

Wang, X. and Kattan, M.W. (2020). Cohort studies: Design, analysis, and reporting. CHEST, 158(1), pp.72–78. doi:https://doi.org/10.1016/j.chest.2020.03.014.

Y Sirilakshmi, Ashwini T, Bidyut Pritom Gogoi, Bhuyan, N. and Ramesh Chand Bunkar (2024). Content Analysis in Qualitative Research: Importance and Application. [online] ResearchGate. Available at: https://www.researchgate.net/publication/385973745_CONTENT_ANALYSIS_IN_QUALITATIVE_RESEARCH_IMPORTANCE_AND_APPLICATION.

Yellapu, V. (2022). Descriptive Statistics. [online] ResearchGate. Available at: https://www.researchgate.net/publication/327496870_Descriptive_statistics.

Zelkowitz, M.V. and Wallace, D.R. (2025). Experimental models for validating technology. Computer, [online] 31(5), pp.23–31. doi:https://doi.org/10.1109/2.675630.

