Module 2. Analysis Playbook
This page gives a step by step guide to analysing the data from the Cornerstone Community workshops and Questionnaire responses.
Credits:
The basis for this Analysis Playbook was developed by Kaitlyn Ashmore from Wilfrid Laurier University as part of the Canadian Government's Mitacs scheme.
The final outputs were co-developed with the Dark Matter Labs Cornerstone Indicator Team.
Introduction
This Playbook offers a guide for performing thematic analysis on your community data. It is designed to be used in conjunction with the Excel Analysis Template.
Thematic analysis is essentially a way of looking for patterns in data. This is achieved by coding the data which can sound more technical and complicated than it actually is. Coding really just means organising text that you are interested in (e.g. sentences extracted from conversations or written responses to a questionnaire) so that it is easier to understand. When you assign a code to a piece of text you are giving it a label that quickly sums up the meaning. This process reduces the volume of data that you need to work with and breaks it down into more manageable chunks.
The Excel Analysis template has been structured to guide you through the entire process, from data preparation and coding to identifying the outputs that you will use to create your Cornerstone Indicators. This template can be used on its own or can be complemented by using OpenAI’s ChatGPT. You might already be familiar with thematic analysis and have your own preferred methodology. If this is the case then you are welcome to apply that method instead. The aim is to complete the final worksheet of the template (‘Outputs Tab’) which feeds into Module 3 where you will design your Cornerstone Indicators.
Please note: open source software is readily available for thematic analysis. If your community has some technical expertise or access to a knowledgeable facilitator, software like QualCoder or Taguette can be customised to perform the analysis. The important thing is to develop the outputs listed below in a way that feels both practical and legitimate for your community’s needs.
Aims and outputs
The aim is to identify themes, factors, dimensions, and metrics that represent what thriving means to your community. In the Cornerstone Communities Framework we are using these terms to mean the following:
Headline Themes: an overarching area of interest or an outcome that is considered important by the community; for example wellbeing, environmental sustainability, economic stability and community resilience;
Factors: things that contribute to the thematic outcomes; for example good health and green behaviours;
Dimensions: elements that can be quantified and contribute to the factors; e.g. disease prevalence or active travel uptake.
Metrics: existing data points that represent the dimensions; for example mortality rates and cycle route utilisation rates.
Overview of the template structure
The Excel Template is organised into sections, each corresponding to a specific step in the thematic analysis process:
Summary: an overview and key for using the template.
Raw Data Sheets: inputting raw data (questionnaire and workshop data).
Coding Sheets: defining a coding scheme, coding the raw data and counting the codes.
Identifying themes, factors and metrics: grouping related codes into headline themes and then mapping to factors, dimensions and metrics.
Outputs: summarising the information ready for Module 3 (Indicator Design)
Analysis
Thematic Analysis: sorting the raw data and assigning codes
In the first phase of the analysis the aim is to consolidate all of the raw data from the community engagement and label it using codes. The sections that follow correspond to the sections in the Excel Data Analysis Template.
A. Intro Sheet
Purpose: provides a brief overview of the template and key to the worksheets.
Instructions: please review this sheet to understand the workflow. The idea is to follow the 15 steps in sequence, starting with the Raw Data sheets.
B. Raw Data Sheets
Purpose: to input the raw data collected from community participatory engagement (i.e, questionnaire responses and workshop outputs). Data should be cleaned and formatted before pasting to avoid errors in subsequent steps.
Instructions: please paste the raw text data into the appropriate worksheets:
WS1 EX1. Workshop 1 Exercise 1
WS1 EX2. Workshop 1 Exercise 2
WS1 EX3. Workshop 1 Exercise 3
WS1 FF. Workshop 1 Facilitator Feedback
WS2 EX1. Workshop 2 Exercise 1
WS2 EX2. Workshop 2 Exercise 2
WS2 FF. Workshop 2 Facilitator Feedback
Questionnaire. responses from the pre-workshop questionnaire.
C. Coding Scheme Sheet
Purpose: to create a coding scheme and community codebook, including parent codes and child codes. This scheme will be used to identify themes from the workshops and questionnaire responses.
Parent Codes: broad categories that represent repeating topics in the data (e.g. Environment or Jobs). They provide a high-level view of the main areas of interest.
Child Codes: more specific subcategories that sit under each parent code (e.g. Climate Change and Air Quality might sit under parent code Environment). The child codes capture detailed aspects of the broader topic.
Instructions: please follow the following steps to create your unique community codebook:
Review the raw data.
Look for broad topics or themes that seem to recur in the data. These could be major issues, concerns, or areas of interest that are important to the community or the context of your study. These initial themes can often serve as your first set of parent codes.
Based on the identified themes, create a list of parent codes that represent the broad categories within your data. These should be high-level and capture the main areas of interest.
For each parent code, identify more specific subtopics or details that are mentioned in the data. These become your child codes. Child codes provide a finer level of detail under each broader category (parent code). For example, if "Health" is a parent code, child codes might include "Mental Health," "Physical Health," "Access to Healthcare," etc. Coding is an iterative process. As you code more data, you may find that some codes need to be refined, combined, or split into new categories. It is important to stay open to modifying your codes as new themes and insights emerge from the data.
Once you have thought of your parent and child codes, place these into the Codebook called Community Codebook.
As you can see in the example below, Parent codes will be in the first row of the table, and the child codes will be in the associated columns of the table.
Understanding the Layout
Example Codebook: the example table (Rows 5 to 11) is already filled with sample data. Columns C and D contain some possible codes and the rest of the table is filled with a generic coding scheme.
Parent Codes (Row 5): These are columns B5 to G5 labeled as
Code_1
,Community_cohesion
,Health
,Code_4
,Code_5
,Code_6
.HINT: please ensure your parent codes have underscores (_) in place of spaces. The following steps will not work if this is the case.
Child Codes (Rows 6 to 11): These are the child codes under each parent code column (B6:B11 for
Code_1
, etc.).
Community Codebook for your use: please create your own Custom Codes for your community using the Community Codebook table. You can add more columns (Parent Codes) and rows (Child Codes) if you need them.
Parent Codes (Row 15): These are columns B14 to G14 labeled with placeholders like
Insert_Parent_Code_1
.Child Codes (Rows 16 to 21): These are the cells where community members will input their custom child codes.
Detailed steps to take to create your Community Code Book
Input Custom Community Codes in the Community Codebook table
Parent Codes: replace the placeholder text
Insert_Parent_Code_1
through toInsert_Parent_Code_6
in row 15 with community-specific parent codes. For example: B15 might becomeEnvironment
, C15:Health
, D15:Advocacy
, etc.Child Codes: next enter corresponding child codes under each parent code. For example under
Environment
(B16:B21) you might have:B16:
Climate Change
B17:
Air Quality
B18:
Biodiversity
B19:
Land Use
B20:
Waste Management
B21:
Environmental Education
Repeat similar entries under Health
(C16:C21), etc. Please note these parent code headings are just examples and you will create your own depending on what your community told you.
Create a coding table
In this step you will create named ranges using excel’s table function that make things easier when you start actually coding the data.
Highlight the cells that include all the new Parent and Child Codes that you just inserted in the Community Code Book.
Then go to
Insert > Table > select “My table has headers” > OK
You can select a table colour you would like and be sure to make the formatting the best fit for you to read and interact with.
Setting the table to be dynamic
The next step is to enable your table to be dynamic. This is so that excel can match the child codes to the parent codes on the Coding Workspace worksheet.
To do this, go to
Formulas > Create from Selection > Top Row > OK.
D. Coding Workspace Sheet
Purpose: to code the raw data using the coding scheme that you created in Section C. This will involve copy and pasting the text from your raw data worksheets and then assigning codes to the data. This process will help to systematically organise and interpret the community data. Using the Coding Worksheet, we can identify common topics emerging from the data.
Detailed steps to code your data
Set up the coding workspace (starting in row 10)
In the example coding space you can see that there are drop down menus for each row under the Parent Code and Child Code headings. The purpose of the following steps is to create drop-down menus for your Community Codes. You will do this using the data validation lists
function in Excel.
1. Step one: create data validation lists for the codes in your Community Codebook:
Parent Code
Highlight the cells under
Parent Code
in the live tableGo to the
Data
tab in the Excel Ribbon.Click on
Data Validation > Data Validation
.In the Settings tab, choose
List
from the Allow dropdown.In the Source box, enter or highlight the range for the parent codes (e.g.,
=Coding Scheme’!$B$15:$G$15
) from Community Codebook on the Coding Scheme worksheet. Press OK.
Now you should have the drop-down list for the entire Parent Code column in the Coding Workspace.
Child Code: next you can repeat the process with the Child Codes, and ensure they are properly associated with the Parent Codes. If you have any issues with this step there is a good worked example here.
Highlight the cells under
Child Code
in the live table.Go to the
Data
tab in the Excel Ribbon.Click on
Data Validation > Data Validation
.In the Settings tab, choose
List
from the Allow dropdown.In the Source box, you need to type
=INDIRECT
before you manually enter the first Parent Code in the table shown above that you have already linked to the Coding CodeBook Table.,=INDIRECT(C13)
. PressOK
.
The reason you are choosing cell C13 is because this is the first cell of your Parent Codes and will link back to the respective child codes on the Coding Scheme Tab. You should now have the drop-down list for the Child Code column in correspondence to the Parent Code you entered.
2. Step two: insert the raw data into the coding workspace
Please refer to the Community Codebook as you are completing this step. It can help to use the Descriptions and Examples of Parent and Child Codes table in the Coding Scheme worksheet (row 23 and down) as a way of keeping the process and definitions clear. For each raw data worksheet you will need to:
Insert the name into column A (Example: WS1 EX1 for data relating to Exercise 1 of Workshop 1)
Paste the text into the Text Segment that you want to code into column B.
3. Step three: assign codes to your text
In this step you will tag the text that you have entered in column B using your community codes.
Assign a Parent Code and Child Code to the text using the drop down menus that you have just created.
E. Coding Count Tables Sheet
Purpose: to provide an overview of the coded data with counts for each code. We suggest using charts to help visualise the codes that are most prevalent in the data.
Instructions:
1. Step one: create your pivot tables
The first step is to create two pivot tables using the data from the Coding Workspace Tab. One table is for the Parent Codes count and the other is for the Child Codes count.
Select the data range: Highlight the data that includes the
Parent Code
,Child Code
, and any other relevant columns (excluding headers). For example, when creating the pivot table for the Parent Codes, you will select the highlighted cells below (not including the header)
Insert the pivot table:
Go to the
Insert tab
on the ribbon.Click on
PivotTable
.In the dialog box, ensure the selected range is correct, and choose where to place the pivot table. You can just choose a single cell on the Coding Count Tables Sheet as the location.
Configure the Pivot Table Fields:
Parent codes
In the PivotTable Fields pane, drag Parent Code to the Rows area.
Drag either Parent Code to the Values area (this will count the occurrences).
Child Codes
In the PivotTable Fields pane, drag Parent Code to the Rows area.
Drag either Parent Code to the Values area (this will count the occurrences).
Adjust the Value Field Settings:
In the Values area, click the drop-down arrow next to the field (it may say "Count of Child Code" or similar).
Select Value Field Settings and make sure it is set to
Count
.
2. Step 2: create your pivot charts
This step is optional but can help to visually identify the most prevalent codes.
Place your cursor in the pivot table data.
Go to the Insert tab on the ribbon.
Click on the first bar chart option (under Help in the image below).
Please edit and change titles and headings as needed.
Hint: you can use the “Refresh All” function to update pivot tables after making any changes in the Coding Workspace.
Preparing the indicator data: identifying headline themes, contributing factors, underlying dimensions and existing metrics
This phase of the analysis takes the counted codes from Section 2.1 and uses them to prepare the inputs for the Cornerstone Indicator design (Module 3).
These steps include prompts that can be used in OpenAI’s ChatGPT to assist in the identification of headline themes, factors, dimensions and metrics. If you are planning to use ChatGPT, we would suggest providing it with definitions for terms like child and parent codes, themes, factors, dimensions, and metrics. These are summarised below for ease of reference:
Headline themes: An overarching area of interest or an outcome that is considered important by the community; for example holistic health, environmental sustainability, economic stability, community resilience;
Factors: things that contribute to thematic outcomes; for example being free from illness and green behaviours;
Dimensions: elements that can be quantified and contribute to the factors; e.g. disease prevalence or active travel uptake.
Metrics: existing data points that represent the dimensions; for example mortality rates and cycle route utilisation rates.
Parent Codes: broad categories that represent repeating topics in the data (e.g., "Environment"). They provide a high-level view of the main areas of interest.
Child Codes: More specific subcategories under each parent code (e.g., "Climate Change," "Air Quality" under parent code "Environment"). They capture detailed aspects of the broader topic.
F. Headline Themes Sheet
Purpose: to identify overarching themes in the coded data. The codes that you have been working with will already have signalled strong themes in the community data. In this step, the goal is to refine and condense the codes into 4-8 overarching themes. The number of themes is not rigid but if you have more or less than this, it will make it difficult to create your Cornerstone Indicators.
Headline themes: An overarching area of interest or an outcome that is considered important by the community.
Instructions:
Step one (optional): using ChatGPT to group codes:
This step is optional. It is included to help if you have a large number of codes and are finding it difficult to identify strong themes in your community data. If you have already noted strong themes from your data then you can move on to the next step.
Copy and Paste Your Codes: First, copy the Original Parent Codes and Original Child Codes from your Community Codebook (on the Coding Scheme worksheet) into ChatGPT.
Use the Following Prompt: "I have these parent and child codes:
[paste list of codes]
. Can you help me identify any common themes or similarities among these codes? I would like to group related codes into new parent and child categories."Review ChatGPT's Suggested Groupings: ChatGPT will suggest potential groupings for similar codes. It may propose new parent and child codes based on the relationships between the original codes.
Apply the Groupings: After reviewing the suggestions, update your Proposed Grouped Parent and Child Codes in the first table of the worksheet.
Note any suggested themes
Based on the new codes, note any headline themes that you feel are now suggested in column E
.
Using your top Parent and Child Codes to identify overarching themes
The second table on this worksheet can be used to look at your most prevalent Parent and Child Codes in one place. You could also do this visually (on paper or using software like Miro) if you find it easier. The aim is to identify 4-8 overarching themes from your coded data.
Copy and paste your top-5 Parent and Child Codes into columns
A
andC
respectively.Review the pasted codes and add suggested themes to columns
B
andD
. Often this will be the code itself (e.g. a code might be “Physical Safety”) but with some codes you will need to extrapolate a theme (e.g. a code might just say “safety” but from working with the data you understand that it refers to physical safety rather than emotional safety).Condense your workings into 4-8 overarching themes in column
F
.
G. Factors & Dimension Sheet
This worksheet is designed to help you identify potential contributing factors and measurable dimensions that relate to your themes. This can be done manually or you might find it helpful to use the ChatGPT prompts that are detailed below.
Factors: things that contribute to thematic outcomes; for example being free from illness and green behaviours.
Dimensions: elements that can be quantified and contribute to the factors; e.g. disease prevalence or active travel uptake.
Step one: Identify Potential Factors:
This step is subjective and there are no ‘right’ answers. The aim is to use your judgement based on your knowledge of the community, local context and the coding work that you have already completed.
Add your Headline Themes to column
B
.Look at your Codes to find recurring, related topics. For example, under the headline theme of "Environmental Health," codes like "Climate Change," "Air Quality," and "Biodiversity" could point to the factor of sustainable living patterns.
Suggested prompt if you are using ChatGPT:
"I have these parent codes under the headline theme 'Environmental Health': [list of parent codes]. What potential factors can these parent codes represent? Can you suggest relevant factors for these codes?"
"I have these child codes under the parent code 'Environment': [list of child codes]. What potential factors can these child codes represent? Can you suggest relevant factors for these codes?"
Step two: Identify related Dimensions:
Once again this step is subjective. The idea is to think about quantifiable elements that could track if a factor is being achieved. The aim is to identify ways of assessing the status of different factors.
Look at each factor in turn and try to break them down into measurable parts. For example, "sustainable living patterns" could have dimensions like Recycling Rates and Pollution Levels.
Suggested prompt if you are using ChatGPT:
"For the factor 'Environmental Sustainability,' can you suggest measurable dimensions? What specific aspects can I use to evaluate this factor?"
H. Metrics Sheet
The final step in the analysis process is to match existing metrics to the dimensions that you have identified. By doing this, you will create a foundational backbone for the Cornerstone Indicators that you will later design. For each Cornerstone Indicator you will have a list of underlying metrics (or data points), many of which will already be actively measured. Again, this step can be completed manually by mapping existing metric databases to your dimensions, or you can use ChatGPT to complete the process for you.
Metrics: existing data points that represent the dimensions; for example, mortality rates and Co2 emission levels.
Step one: set up the Metrics Sheet
Please enter the dimensions that you identified in the previous step into column
A
of the worksheet.
Step two: identify a contextually relevant set of existing metrics:
The aim here is to identify existing metric sets that are readily available and already in use in your specific context. As the Cornerstone Communities Framework is centred around what thriving means in your local context, wellbeing indexes can be a useful reference point. For example:
Do some high level research about the types of metrics that are used already in your region or country.
Select a metric dataset(s) that you feel is representative of your context.
Assign metrics to each of your dimensions
Please add at least one metric per dimension to the table.
We would recommend including the sources and any notes for future reference.
Suggested prompt if you are using ChatGPT:
For the dimension ['Air Quality,]' can you suggest specific metrics that would help measure this dimension?
What metrics might be appropriate to assess [' Water Management?]
What existing metrics are available regarding [insert dimension] in [insert location]?
This sheet exists to capture the outputs that will be used in Module 3 to design your Cornerstone Indicators.
Step one: collate your outputs from the analysis
Please enter your headline themes in row 5. Please add descriptions so that you can easily refer back to them.
Please enter the associated factors under each theme. This is currently set up in rows 8-18 but you can add additional rows if you need them.
Finally, enter your associated metrics in rows 20-30.
Step two: refine and validate:
Once you have completed the table, it is a good idea to step back from the details and do a sense check.
Scanning down each column, does the flow of information seem logical and consistent?
Can you clearly link why the factors under each theme are relevant?
Looking at the metrics, it is obvious why they would be important in measuring the factors and themes that they sit beneath?
You are now ready to move onto Module 3 and design your Cornerstone Indicators. Please refer to the Indicator Design Playbook in the Tools section of the Cornerstone website.
Last updated