Introduction to gmoTree
Patricia F. Zauchner
2023-08-24 (updated: 2024-09-30)
Source:vignettes/intro_to_gmoTree.Rmd
Handling data from online experiments made with oTree (Chen et al., 2016) can be challenging, especially when dealing with complex experimental designs that span multiple sessions and return numerous files that must be combined. This is where the gmoTree package comes in. It helps streamline the data processing workflow by providing tools designed to import, merge, and manage data from oTree experiments.
Importing and cleaning up data
Background information on the data downloaded by oTree
An oTree experiment is structured around one or more modular units called “apps,” which encompass one or multiple “pages.” Data generated from each app can be downloaded individually, offering the flexibility to analyze separate components of the experiment. For an all-encompassing view of the experiment, the data from all apps can also be downloaded in a comprehensive file labeled “all_apps_wide.”
In addition to the app data and the cumulative all_apps_wide file, oTree generates a file with time stamps for every page. If the experiment includes one or more chat rooms, a file documenting all chat interactions is also provided. Newer oTree versions also allow custom data to be downloaded.
When an oTree experiment is run across different databases, this set of data files is downloaded for each database. This would include individual app data files, the all_apps_wide file, a file for the time stamps of every page, and a chat log file if a chat room was used in the experiment.
The gmoTree package’s functionality lies in its ability to load and aggregate all of these files with ease.
import_otree()
We can import all oTree data and combine them into a list of data frames using the import_otree() function. Each data frame is named according to its associated app, and the function generates an accompanying info list that details essential information regarding the imported files, such as any deleted cases. This information list is updated as we use other functions within the package.
It is worth noting that even if we only use one all_apps_wide file, we should still load the data with import_otree() if we want to access other functions within the gmoTree package. Alternatively, we could reproduce the structure created by this function by hand. The following example shows how to import oTree data, the structure of the oTree list of data frames after importing the data, and all of the information provided in otree$info.
# Get path to the package data
path <- system.file("extdata/exp_data_5.4.0", package = "gmoTree")
# Import without specifications
# Import all oTree files in this folder and its subfolders
otree <- import_otree(path = path)
## Warning in import_otree(path = path): You have stored all_apps_wide globally
## but also room-specific. This function will import both of them. (Globally, the
## files are saved as "all_apps_wide_." Room-specific, the files are saved as "All
## apps - wide-" or "all_apps_wide-.") After importing the data, make sure nothing
## is there twice! (Advice: You may use delete_duplicate() to remove duplicate
## rows of all oTree data frames.
# Check the structure of the oTree list of data frames
str(otree, 1)
## List of 8
## $ all_apps_wide:'data.frame': 12 obs. of 60 variables:
## $ info :List of 2
## $ chatapp :'data.frame': 8 obs. of 24 variables:
## $ dictator :'data.frame': 48 obs. of 25 variables:
## $ start :'data.frame': 8 obs. of 24 variables:
## $ survey :'data.frame': 8 obs. of 29 variables:
## $ Time :'data.frame': 77 obs. of 10 variables:
## $ Chats :'data.frame': 6 obs. of 7 variables:
# The initial info list
otree$info
## $imported_files
## [1] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/ChatMessages-2023-05-16.csv"
## [2] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/PageTimes-2023-05-16.csv"
## [3] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/survey_2023-05-16.csv"
## [4] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/start_2023-05-16.csv"
## [5] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/dictator_2023-05-16.csv"
## [6] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/dictator_2023-05-00.csv"
## [7] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/chatapp_2023-05-16.csv"
## [8] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/all_apps_wide_2023-05-16.csv"
## [9] "C:/Users/pzauchner/AppData/Local/R/win-library/4.4/gmoTree/extdata/exp_data_5.4.0/all_apps_wide-2023-05-16.csv"
##
## $initial_n
## [1] 12
Caution: This function only works if the oTree data is saved using the typical oTree file pattern!
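The text above mentions that the structure created by import_otree() could also be reproduced by hand. The following is a minimal sketch of what such a hand-built list might look like. It is based only on the str() output and the otree$info listing shown above; the toy values and the file name are placeholders, and the package may rely on further details that are not visible here.

```r
# Toy stand-ins for data that would normally be read from oTree CSV files
all_apps_wide <- data.frame(
  participant.code = c("aaaa1111", "bbbb2222"),
  session.code     = c("sess0001", "sess0001"),
  stringsAsFactors = FALSE
)
survey <- data.frame(
  participant.code = c("aaaa1111", "bbbb2222"),
  session.code     = c("sess0001", "sess0001"),
  player.age       = c(25, 40),
  stringsAsFactors = FALSE
)

# Assemble the list: one data frame per app plus the info list
otree_manual <- list(
  all_apps_wide = all_apps_wide,
  info = list(
    imported_files = "all_apps_wide_placeholder.csv",  # placeholder file name
    initial_n      = nrow(all_apps_wide)
  ),
  survey = survey
)

# Inspect the top-level structure
str(otree_manual, 1)
```

Whether a list built this way works with every gmoTree function is an assumption; loading the data with import_otree() remains the safer route.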
delete_duplicate()
Sometimes, the same data is imported several times. This can happen for several reasons. First, one data set might be part of another because temporarily stored data were downloaded before the final data frame. Second, if both room-specific and global data are imported, the data in the $all_apps_wide data frame appear twice. Third, the same data might be stored in several imported folders. The function delete_duplicate() deletes duplicate data from all apps and $all_apps_wide. However, it does not change the $Time and $Chats data frames.
Before running delete_duplicate(), let us first check the number of participant codes and the initial count. The imported $all_apps_wide data frame contains 12 participant codes. However, only 8 of them are unique, which indicates the presence of duplicate data. The $info list also records 12 initial entries.
# Initial check before deletion
length(otree$all_apps_wide$participant.code)
## [1] 12
length(unique(otree$all_apps_wide$participant.code))
## [1] 8
otree$info$initial_n
## [1] 12
To remove these duplicates, we employ the delete_duplicate() function:
# Delete duplicate cases
otree <- delete_duplicate(otree)
Please note that details about the deleted rows are not added to a list of deleted cases. This is because the list might be used for analysis, and this function mainly focuses on cleaning up an untidy data import. However, the count in $info$initial_n is adjusted accordingly.
After the deletion operation, we should find that all participant codes are unique, and the count $info$initial_n matches the number of unique participant codes.
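To make the idea concrete, here is a base R sketch of what removing duplicate rows and adjusting the initial count amounts to. It uses a toy list rather than the package data, and it is not the implementation of delete_duplicate(); it only illustrates the check described above.

```r
# Toy list with one fully duplicated row in all_apps_wide
toy <- list(
  all_apps_wide = data.frame(
    participant.code = c("aaaa1111", "bbbb2222", "aaaa1111"),
    session.code     = c("sess0001", "sess0001", "sess0001"),
    stringsAsFactors = FALSE
  ),
  info = list(initial_n = 3)
)

# Keep only the first occurrence of each identical row
toy$all_apps_wide <- toy$all_apps_wide[!duplicated(toy$all_apps_wide), , drop = FALSE]

# Bring the recorded count in line with the cleaned data
toy$info$initial_n <- nrow(toy$all_apps_wide)

# Both numbers should now agree
length(unique(toy$all_apps_wide$participant.code))
toy$info$initial_n
```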
Dealing with messy Chats and Time data frames
If we combine data from experiments that ran on different versions of oTree, several variables in the $Time and $Chats data frames may refer to the same attribute. The functions messy_chat() and messy_time() integrate the corresponding variables if used with the argument combine = TRUE.
To show an example, let us first load data from different versions of oTree.
# Import data from different oTree versions
otree_all <- import_otree(
path = system.file("extdata", package = "gmoTree"))
## Warning in import_otree(path = system.file("extdata", package = "gmoTree")):
## You have stored all_apps_wide globally but also room-specific. This function
## will import both of them. (Globally, the files are saved as "all_apps_wide_."
## Room-specific, the files are saved as "All apps - wide-" or "all_apps_wide-.")
## After importing the data, make sure nothing is there twice! (Advice: You may
## use delete_duplicate() to remove duplicate rows of all oTree data frames.
# Check names of Time data frame
names(otree_all$Time)
## [1] "session_code" "participant_id_in_session"
## [3] "participant_code" "page_index"
## [5] "app_name" "page_name"
## [7] "epoch_time_completed" "round_number"
## [9] "timeout_happened" "is_wait_page"
# Check names of Chats data frame
names(otree_all$Chats)
## [1] "session_code" "id_in_session" "participant_code" "channel"
## [5] "nickname" "body" "timestamp"
Now we can run the functions messy_time() and messy_chat(). The warning messages are part of the expected output and indicate precisely which variables were combined, so there is no need for concern when we see them. We can also turn them off using info = FALSE.
otree_all <- messy_time(otree_all,
combine = TRUE,
info = TRUE)
otree_all <- messy_chat(otree_all,
combine = TRUE,
info = TRUE)
# Check names of Time data frame again
names(otree_all$Time)
## [1] "session_code" "participant_id_in_session"
## [3] "participant_code" "page_index"
## [5] "app_name" "page_name"
## [7] "epoch_time_completed" "round_number"
## [9] "timeout_happened" "is_wait_page"
# Check names of Chats data frame again
names(otree_all$Chats)
## [1] "session_code" "id_in_session" "participant_code" "channel"
## [5] "nickname" "body" "timestamp"
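Conceptually, combining two variables that refer to the same attribute means filling the gaps of one column with the values of the other. The following base R sketch illustrates this on toy data. The column name participant__code is an assumption chosen for illustration, and the code is not how messy_time() or messy_chat() are implemented.

```r
# Toy Time data in which two columns hold the same attribute
time_toy <- data.frame(
  participant_code  = c("aaaa1111", NA, "cccc3333"),
  participant__code = c(NA, "bbbb2222", NA),  # hypothetical older column name
  page_index        = c(1, 1, 1),
  stringsAsFactors  = FALSE
)

# Fill the gaps of the first column with values from the second one
missing <- is.na(time_toy$participant_code)
time_toy$participant_code[missing] <- time_toy$participant__code[missing]

# Drop the now redundant column
time_toy$participant__code <- NULL

time_toy
```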
Dealing with dropouts and deleting cases
show_dropouts()
Sometimes, participants drop out of experiments. To get an overview of the dropouts, we can use the function show_dropouts(). It creates three data frames/tables with information on the participants who did not finish at certain apps or pages.
First, the function show_dropouts() creates a data frame $full that provides specific information on the apps and pages where participants left the experiment prematurely. Additionally, this data frame indicates which apps were affected by the participants who dropped out.
# Show everyone that has not finished with the app "survey"
dropout_list <- show_dropouts(otree, "survey")
head(dropout_list$full)
## participant.code session.code end_app end_page
## 1 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 2 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 3 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 4 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 5 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 6 dhgisush 2wlrl5kb dictator Offer
## reason
## 1 Experiment not completed. Noticed at: all_apps_wide
## 2 Experiment not completed. Noticed at: chatapp
## 3 Experiment not completed. Noticed at: dictator
## 4 Experiment not completed. Noticed at: start
## 5 Experiment not completed. Noticed at: survey
## 6 Experiment not completed. Noticed at: all_apps_wide
Second, the function show_dropouts() also generates a smaller data frame $unique that includes each person only once.
dropout_list$unique
## participant.code session.code end_app end_page
## 1 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 6 dhgisush 2wlrl5kb dictator Offer
## 11 j2g9mcaf jk9ekpl0 dictator Introduction
## 16 p6m495xi 2wlrl5kb dictator Introduction
## reason
## 1 Experiment not completed
## 6 Experiment not completed
## 11 Experiment not completed
## 16 Experiment not completed
Third, the function show_dropouts() creates a table $all_end, which contains information on all participants and where they ended the experiment. The columns contain the names of the pages of the experiment; the rows contain the names of the apps.
dropout_list$all_end
##
## Demographics Introduction Offer ResultsWaitPage
## dictator 0 2 1 1
## survey 4 0 0 0
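A table like $all_end can be thought of as a cross-tabulation of the last app and the last page each participant reached. The sketch below rebuilds this idea in base R with toy data; the column names end_app and end_page follow the output above, while the participant codes are placeholders. It is an illustration, not the package's implementation.

```r
# Toy data: one row per participant with the last app and page reached
ends <- data.frame(
  participant.code = c("p1", "p2", "p3", "p4", "p5"),
  end_app  = c("dictator", "dictator", "survey", "survey", "dictator"),
  end_page = c("Introduction", "Offer", "Demographics",
               "Demographics", "Introduction"),
  stringsAsFactors = FALSE
)

# Rows are apps, columns are pages, cells count participants
all_end <- table(ends$end_app, ends$end_page)
all_end

# Participants who did not finish in the app "survey"
not_finished <- ends[ends$end_app != "survey", ]
not_finished
```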
delete_dropouts()
The function delete_dropouts() removes all data related to participants who prematurely terminated the experiment from the data frames in the oTree list, with the exception of data in the info list and the $Chats data frame. I highly recommend deleting the chat data manually because a conversation can become unintelligible once one person’s input is removed. Therefore, this function does not delete the chat input of the participants who dropped out of the experiment.
Before running the example, let us first check the row numbers of some data frames.
# First, check some row numbers
nrow(otree$all_apps_wide)
## [1] 8
nrow(otree$survey)
## [1] 8
nrow(otree$Time)
## [1] 77
nrow(otree$Chats)
## [1] 6
When we run the function delete_dropouts() and check the row numbers again, we see that cases were deleted in each data frame except the $Chats data frame.
# Delete all cases that didn't end the experiment on the page "Demographics"
# within the app "survey"
otree2 <- delete_dropouts(otree,
final_apps = "survey",
final_pages = "Demographics",
info = TRUE)
## 4 case(s) deleted
## Dropouts are deleted from all data frames. Except: The list of oTree data frames includes a chat. As the interpretation of chat data depends on how participants engage with each other, the data must be deleted with more care than deleting data in other apps. Hence, this function does not delete data in this data frame. Please do this manually if necessary!
# Check row numbers again
nrow(otree2$all_apps_wide)
## [1] 4
nrow(otree2$survey)
## [1] 4
nrow(otree2$Time)
## [1] 66
nrow(otree2$Chats)
## [1] 6
Just like show_dropouts(), the delete_dropouts() function gives detailed information on all the deleted cases.
head(otree2$info$deleted_cases$full)
## participant.code session.code end_app end_page
## 1 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 2 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 3 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 4 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 5 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage
## 6 dhgisush 2wlrl5kb dictator Offer
## reason
## 1 ENC. Noticed at: survey
## 2 ENC. Noticed at: start
## 3 ENC. Noticed at: dictator
## 4 ENC. Noticed at: chatapp
## 5 ENC. Noticed at: all_apps_wide
## 6 ENC. Noticed at: survey
otree2$info$deleted_cases$unique
## participant.code session.code end_app end_page reason
## 1 1k58kgm7 jk9ekpl0 dictator ResultsWaitPage ENC
## 6 dhgisush 2wlrl5kb dictator Offer ENC
## 11 j2g9mcaf jk9ekpl0 dictator Introduction ENC
## 16 p6m495xi 2wlrl5kb dictator Introduction ENC
otree2$info$deleted_cases$all_end
##
## Demographics Introduction Offer ResultsWaitPage
## dictator 0 2 1 1
## survey 4 0 0 0
Caution: This function does not delete any data from the original CSV and Excel files!
delete_cases()
Sometimes, participants ask for their data to be deleted. The delete_cases() function can delete a person from each app’s data frame, $all_apps_wide, and the $Time data frame. Again, data in the $Chats data frame must be deleted by hand.
# First, check some row numbers
nrow(otree2$all_apps_wide)
## [1] 4
nrow(otree2$survey)
## [1] 4
nrow(otree2$Time)
## [1] 66
nrow(otree2$Chats)
## [1] 6
# Delete one participant
person <- otree2$all_apps_wide$participant.code[1]
otree2 <- delete_cases(otree2,
pcodes = person,
reason = "requested",
saved_vars = "participant._index_in_pages",
info = TRUE)
## 1 case(s) deleted. Cases are deleted from all data frames. Except: The list of oTree data frames includes a chat. As the interpretation of chat data depends on how participants engage with each other, the data must be deleted with more care than deleting data in other apps. Hence, this function does not delete data in this data frame. Please do this manually if necessary!
# Check row numbers again
nrow(otree2$all_apps_wide)
## [1] 3
nrow(otree2$survey)
## [1] 3
nrow(otree2$Time)
## [1] 48
nrow(otree2$Chats)
## [1] 6
This function adds the information of all these deleted cases to the previously created information on all deleted cases.
# Check for all deleted cases (also dropouts):
tail(otree2$info$deleted_cases$full)
## participant.code session.code end_app end_page
## 16 p6m495xi 2wlrl5kb dictator Introduction
## 17 p6m495xi 2wlrl5kb dictator Introduction
## 18 p6m495xi 2wlrl5kb dictator Introduction
## 19 p6m495xi 2wlrl5kb dictator Introduction
## 20 p6m495xi 2wlrl5kb dictator Introduction
## 21 sdh9ar2m <NA>
## reason session participant._index_in_pages
## 16 ENC. Noticed at: survey <NA> NA
## 17 ENC. Noticed at: start <NA> NA
## 18 ENC. Noticed at: dictator <NA> NA
## 19 ENC. Noticed at: chatapp <NA> NA
## 20 ENC. Noticed at: all_apps_wide <NA> NA
## 21 requested jk9ekpl0 18
Caution: This function does not delete any data from the original CSV and Excel files!
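Since chat data must be deleted by hand, the following base R sketch shows two ways this could be done on a toy $Chats-like data frame. The column names participant_code, channel, and body follow the names(otree_all$Chats) output shown earlier; the values are placeholders. Which option is appropriate depends on how the remaining conversation reads, so this is only a starting point.

```r
# Toy chat data
chats <- data.frame(
  participant_code = c("p1", "p2", "p1", "p3"),
  channel          = c("chat-1", "chat-1", "chat-1", "chat-2"),
  body             = c("Hello", "Hi there", "How are you?", "Anyone here?"),
  stringsAsFactors = FALSE
)
person <- "p1"

# Option A: remove only this person's messages
chats_a <- chats[chats$participant_code != person, ]

# Option B: remove every channel the person took part in,
# so that no half-empty conversation remains
affected <- unique(chats$channel[chats$participant_code == person])
chats_b  <- chats[!(chats$channel %in% affected), ]

nrow(chats_a)
nrow(chats_b)
```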
delete_sessions()
While we certainly hope that it never becomes necessary, there may be instances where an entire session needs to be removed from the data set due to unforeseen issues. If that occurs, we can use the function delete_sessions(). This function removes not only the sessions’ data in all apps, $all_apps_wide, and the $Time data frame but also the sessions’ chat data in the $Chats data frame because chatting is usually restricted to a single session.
In the following, we see the row numbers before deletion and the application of the function. Unlike the other functions, delete_sessions() also deletes the sessions’ entries in the $Chats data frame: a chat belongs to just one session, so its messages can be removed without impairing the comprehensibility of the remaining chat data.
# First, check some row numbers
nrow(otree2$all_apps_wide)
## [1] 3
nrow(otree2$survey)
## [1] 3
nrow(otree2$Time)
## [1] 48
nrow(otree2$Chats)
## [1] 6
# Delete one session
# (Note: this call starts from the original otree list, not from otree2)
otree2 <- delete_sessions(otree,
scodes = "jk9ekpl0",
reason = "Only tests",
info = TRUE)
## 4 case(s) deleted
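The logic of removing a whole session from every data frame, including the chat data, can be sketched in base R as follows. The toy list and session codes are placeholders; the column names session.code (app data) and session_code ($Time and $Chats) follow the outputs shown in this document. This is an illustration of the idea, not the code of delete_sessions().

```r
# Toy oTree-like list
toy <- list(
  all_apps_wide = data.frame(session.code = c("s1", "s1", "s2"),
                             stringsAsFactors = FALSE),
  info  = list(initial_n = 3),
  Time  = data.frame(session_code = c("s1", "s2", "s2"),
                     stringsAsFactors = FALSE),
  Chats = data.frame(session_code = c("s1", "s2"),
                     body = c("hi", "hello"),
                     stringsAsFactors = FALSE)
)
scode <- "s1"

# Remove the session from every data frame (the info list is skipped)
for (name in setdiff(names(toy), "info")) {
  df <- toy[[name]]
  # App data frames use session.code; Time and Chats use session_code
  col <- intersect(c("session.code", "session_code"), names(df))[1]
  toy[[name]] <- df[df[[col]] != scode, , drop = FALSE]
}

# Remaining rows per data frame
remaining <- sapply(toy[c("all_apps_wide", "Time", "Chats")], nrow)
remaining
```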
Deleting sensitive information
delete_plabels()
It is not uncommon for the participant.label variable to contain sensitive information like an MTurk worker ID. This can raise serious privacy concerns. The function delete_plabels() automatically removes the variable from all data frames. Additionally, the function has the option to delete all MTurk-related variables.
In the following, we first inspect the sensitive variables, then apply the delete_plabels() function, and finally inspect the variables again.
# Check variables
head(otree2$all_apps_wide$participant.label)
## [1] "" "" "Person1" "Person2"
head(otree2$all_apps_wide$participant.mturk_worker_id)
## [1] NA NA NA NA
head(otree2$survey$participant.label)
## [1] "" "" "Person1" "Person2"
# Delete all participant labels
otree2 <- delete_plabels(otree2, del_mturk = TRUE)
# Check variables
head(otree2$all_apps_wide$participant.label)
## NULL
head(otree2$all_apps_wide$participant.mturk_worker_id)
## NULL
head(otree2$survey$participant.label)
## NULL
Caution: This function does not delete the variable from the original CSV and Excel files!
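Dropping a sensitive column from every data frame in a list can be illustrated with a few lines of base R. The toy list below is a placeholder, the variable names are taken from the example above, and the code is a sketch of the idea rather than the implementation of delete_plabels().

```r
# Toy oTree-like list containing sensitive variables
toy <- list(
  all_apps_wide = data.frame(
    participant.code            = c("p1", "p2"),
    participant.label           = c("Person1", "Person2"),
    participant.mturk_worker_id = c(NA, NA),
    stringsAsFactors = FALSE
  ),
  survey = data.frame(
    participant.code  = c("p1", "p2"),
    participant.label = c("Person1", "Person2"),
    stringsAsFactors = FALSE
  ),
  info = list(initial_n = 2)
)

sensitive <- c("participant.label", "participant.mturk_worker_id")

# Keep every column except the sensitive ones (the info list is skipped)
for (name in setdiff(names(toy), "info")) {
  keep <- setdiff(names(toy[[name]]), sensitive)
  toy[[name]] <- toy[[name]][, keep, drop = FALSE]
}

names(toy$all_apps_wide)
names(toy$survey)
```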
Making IDs
make_ids()
When working with oTree, participant codes, session codes, and group IDs are used to identify the cases. However, researchers often prefer a streamlined, consecutive numbering system that spans all sessions, beginning with the first participant, session, or group and concluding with the last. The make_ids() function provides a way to achieve this goal. Before using the function, let us first inspect the underlying variables.
# Check variables first
otree2$all_apps_wide$participant.code
## [1] "dhgisush" "p6m495xi" "c9inx5wl" "kr8yd7f3"
otree2$all_apps_wide$session.code
## [1] "2wlrl5kb" "2wlrl5kb" "mz2r27bk" "mz2r27bk"
otree2$all_apps_wide$dictator.1.group.id_in_subsession
## [1] 1 1 1 1
# Make participant and session IDs only (no group IDs)
otree2 <- make_ids(otree2)
This function returns the following variables.
# Check variables
otree2$all_apps_wide$participant_id
## [1] 1 2 3 4
otree2$all_apps_wide$session_id
## [1] 1 1 2 2
In the prior example, group IDs were not calculated because they must be requested specifically. Since the group IDs per app in our data do not match (groups are only relevant in the dictator app), just using gmake = TRUE would lead to an error message.
For cases where the group IDs vary among apps, we can specify in make_ids() which app or variable should be used for extracting group information. For instance, the following syntax can be used to obtain group IDs from the variable dictator.1.group.id_in_subsession in $all_apps_wide.
# Get IDs from "from_var" in the data frame "all_apps_wide"
otree2 <- make_ids(otree2,
# gmake = TRUE, # Not necessary if from_var is not NULL
from_var = "dictator.1.group.id_in_subsession")
## Warning in make_ids(otree2, from_var = "dictator.1.group.id_in_subsession"):
## The group variable values are constant. Group IDs now correspond to session
## IDs.
# Check variables
otree2$all_apps_wide$participant_id
## [1] 1 2 3 4
otree2$all_apps_wide$group_id
## [1] 1 1 2 2
otree2$all_apps_wide$session_id
## [1] 1 1 2 2
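The consecutive numbering shown above can be reproduced in base R by matching each code against the unique codes in order of first appearance. The sketch below uses the codes from the example output; combining session and group number into one key is an assumption about how constant group values end up mirroring the session IDs, not a description of the make_ids() internals.

```r
# Codes as shown in the example above
participant_code <- c("dhgisush", "p6m495xi", "c9inx5wl", "kr8yd7f3")
session_code     <- c("2wlrl5kb", "2wlrl5kb", "mz2r27bk", "mz2r27bk")
group_in_session <- c(1, 1, 1, 1)

# Consecutive IDs in order of first appearance
participant_id <- match(participant_code, unique(participant_code))
session_id     <- match(session_code, unique(session_code))

# Groups are only unique within a session, so combine both pieces
group_key <- paste(session_code, group_in_session)
group_id  <- match(group_key, unique(group_key))

participant_id
session_id
group_id
```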
Measuring the time
apptime()
If we need to determine how much time the participants spent on a specific app, the apptime() function can help. This function calculates summary statistics such as the mean, minimum, and maximum time spent on the app, as well as a detailed list of durations for each participant in the app. The following example shows how much time the participants spent on the app “survey” in minutes.
# Calculate the time all participants spent on app "survey"
apptime(otree2, apps = "survey", digits = 3)
## $mean_duration
## [1] 0.242
##
## $min_duration
## [1] 0.233
##
## $max_duration
## [1] 0.25
##
## $single_durations
## participant session duration
## 1 c9inx5wl mz2r27bk 0.250
## 2 kr8yd7f3 mz2r27bk 0.233
##
## $messages
## [1] "For some participants, no duration could be calculated. See list in $warnings."
##
## $warnings
## [1] "dhgisush" "p6m495xi"
We can also get the time for specific participants only. Without specifying the applications, we get the duration for all applications individually.
# Calculate the time one participant spent on each app
apptime(otree2, pcode = "c9inx5wl", digits = 3)
## $chatapp
## [1] 0.267
##
## $dictator
## [1] 0.5
##
## $start
## [1] 0.017
##
## $survey
## [1] 0.25
extime()
If we need to determine how much time participants spent on the complete experiment, we can use the extime() function. This function calculates summary statistics such as the mean, minimum, and maximum time spent on the experiment, as well as a detailed list of durations for each participant. (Note that these min, max, and mean values only have two digits because of the underlying data.)
# Calculate the time that all participants spent on the experiment
extime(otree2, digits = 3)
## $mean_duration
## [1] 0.525
##
## $min_duration
## [1] 0.05
##
## $max_duration
## [1] 1.033
##
## $single_durations
## participant session duration
## 2 p6m495xi 2wlrl5kb 0.050
## 1 dhgisush 2wlrl5kb 0.117
## 4 kr8yd7f3 mz2r27bk 0.900
## 3 c9inx5wl mz2r27bk 1.033
We can also get the duration for just one participant.
# Calculate the time one participant spent on the experiment
extime(otree2, pcode = "c9inx5wl", digits = 3)
## [1] 1.033
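As a rough intuition for such durations, one can take the time stamps of a participant's completed pages and compute the span between the first and the last one. The base R sketch below does this on toy data with an epoch_time_completed column, as in the $Time data frame. Treating the first completed page as the starting point is an assumption made for illustration; extime() and apptime() may define the start differently.

```r
# Toy time stamps (in seconds) for two participants
time_toy <- data.frame(
  participant_code     = c("p1", "p1", "p1", "p2", "p2"),
  epoch_time_completed = c(1000, 1030, 1090, 2000, 2012),
  stringsAsFactors = FALSE
)

# Span between first and last completed page, in minutes
durations <- vapply(
  split(time_toy$epoch_time_completed, time_toy$participant_code),
  function(x) (max(x) - min(x)) / 60,
  numeric(1)
)
durations <- round(durations, 3)

# Summary statistics across participants
summary_stats <- list(
  mean_duration = mean(durations),
  min_duration  = min(durations),
  max_duration  = max(durations)
)

durations
summary_stats
```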
pagesec()
The older versions of oTree included a variable called seconds_on_page in the $Time data frame. Although there is a good reason to omit it, we sometimes want to have more detailed information on the time spent on one page. Therefore, I created the function pagesec() that adds a new variable seconds_on_page2 to the $Time data frame.
# Create two new columns: seconds_on_page2 and minutes_on_page
otree2 <- pagesec(otree2, rounded = TRUE, minutes = TRUE)
tail(otree2$Time)
## timeout_happened is_wait_page group_id session_id participant_id
## 72 0 0 2 2 3
## 73 0 0 2 2 4
## 74 0 0 2 2 4
## 75 0 0 2 2 3
## 76 0 0 2 2 4
## 77 0 0 2 2 3
## session_code participant_id_in_session participant_code page_index app_name
## 72 mz2r27bk 1 c9inx5wl 15 survey
## 73 mz2r27bk 2 kr8yd7f3 15 survey
## 74 mz2r27bk 2 kr8yd7f3 16 survey
## 75 mz2r27bk 1 c9inx5wl 16 survey
## 76 mz2r27bk 2 kr8yd7f3 17 survey
## 77 mz2r27bk 1 c9inx5wl 17 survey
## page_name epoch_time_completed round_number minutes_on_page
## 72 CognitiveReflectionTest 1684258104 1 0.05
## 73 CognitiveReflectionTest 1684258108 1 0.13
## 74 Offer 1684258109 1 0.02
## 75 Offer 1684258110 1 0.10
## 76 Demographics 1684258114 1 0.08
## 77 Demographics 1684258116 1 0.10
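The per-page times can be understood as differences between consecutive completion time stamps of the same participant. The following base R sketch applies this idea to the six rows printed above (participant codes, page indices, and epoch times are copied from that output). It is a simplified illustration, not the implementation of pagesec(), and it leaves the first page of each participant in this excerpt as NA because its predecessor is not included.

```r
# The six rows from the output above
time_df <- data.frame(
  participant_code = c("c9inx5wl", "kr8yd7f3", "kr8yd7f3",
                       "c9inx5wl", "kr8yd7f3", "c9inx5wl"),
  page_index = c(15, 15, 16, 16, 17, 17),
  epoch_time_completed = c(1684258104, 1684258108, 1684258109,
                           1684258110, 1684258114, 1684258116),
  stringsAsFactors = FALSE
)

# Sort by participant and page so that differences are meaningful
time_df <- time_df[order(time_df$participant_code, time_df$page_index), ]

# Seconds since the same participant completed the previous page
time_df$seconds_on_page2 <- ave(
  time_df$epoch_time_completed,
  time_df$participant_code,
  FUN = function(x) c(NA, diff(x))
)

# The same information in minutes, rounded to two digits
time_df$minutes_on_page <- round(time_df$seconds_on_page2 / 60, 2)

time_df
```

For page indices 16 and 17, the resulting minutes (0.10 and 0.10 for c9inx5wl, 0.02 and 0.08 for kr8yd7f3) agree with the minutes_on_page column in the output above.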
Transferring variables between the apps
assignv()
The function assignv() copies a variable from the $all_apps_wide data frame to the data frames of all other apps. In the following example, the variable survey.1.player.gender is copied from $all_apps_wide to all other app data frames; in all of these data frames, the new variable is named gender. The new variable is also added to $all_apps_wide to keep some degree of consistency.
# Assign variable "survey.1.player.gender" and name it "gender"
otree2 <- assignv(oTree = otree2,
variable = "survey.1.player.gender",
newvar = "gender")
# Control
otree2$dictator$gender
## [1] "" "" "" "" "" "" "Female" "Male"
## [9] "Female" "Male" "Female" "Male"
otree2$chatapp$gender
## [1] "" "" "Female" "Male"
# In app "survey", the variable now appears twice because it originates here
otree2$survey$gender
## [1] "" "" "Female" "Male"
otree2$survey$player.gender
## [1] "" "" "Female" "Male"
# In "all_apps_wide", the variable also appears twice
# (This can be avoided by giving the new variable the same name
# as the old variable)
otree2$all_apps_wide$gender
## [1] "" "" "Female" "Male"
otree2$all_apps_wide$survey.1.player.gender
## [1] "" "" "Female" "Male"
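Copying a variable from a wide data frame (one row per participant) to an app data frame (possibly several rows per participant) boils down to a lookup by participant code. The base R sketch below shows this with toy data; it illustrates the idea and is not the code of assignv().

```r
# Wide data: one row per participant
aaw <- data.frame(
  participant.code       = c("p1", "p2", "p3"),
  survey.1.player.gender = c("Female", "Male", "Female"),
  stringsAsFactors = FALSE
)

# App data: two rounds per participant
dictator <- data.frame(
  participant.code        = c("p1", "p1", "p2", "p2", "p3", "p3"),
  subsession.round_number = c(1, 2, 1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Look up each row's participant in the wide data and copy the value
idx <- match(dictator$participant.code, aaw$participant.code)
dictator$gender <- aaw$survey.1.player.gender[idx]

dictator
```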
assignv_to_aaw()
The function assignv_to_aaw() copies a variable from one of the data frames to the $all_apps_wide data frame. In the following example, a variable from the $survey data frame is copied to $all_apps_wide and placed directly after the variable survey.1.player.age.
# Create a new variable
otree2$survey$younger30 <- ifelse(otree2$survey$player.age < 30, 1, 0)
# Copy the variable younger30 from survey to all_apps_wide
# and put the new variable right after the old age variable
otree2 <- assignv_to_aaw(otree2,
app = "survey",
variable = "younger30",
newvar = "younger30",
resafter = "survey.1.player.age")
# Control
otree2$all_apps_wide$survey.1.player.age
## [1] NA NA 66 33
# Check the position of the old age variable and the new variable
match("survey.1.player.age", colnames(otree2$all_apps_wide))
## [1] 52
match("younger30", colnames(otree2$all_apps_wide))
## [1] 53
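Placing a new column directly after an existing one can be sketched in base R by splitting the data frame at that column's position. The toy data below are placeholders, and the code illustrates the idea only; it is not the implementation of assignv_to_aaw().

```r
# Wide data and an app data frame with rows in a different order
aaw <- data.frame(
  participant.code       = c("p1", "p2"),
  survey.1.player.age    = c(25, 33),
  survey.1.player.payoff = c(0, 0),
  stringsAsFactors = FALSE
)
survey <- data.frame(
  participant.code = c("p2", "p1"),
  player.age       = c(33, 25),
  stringsAsFactors = FALSE
)
survey$younger30 <- ifelse(survey$player.age < 30, 1, 0)

# Align the new values with the row order of the wide data
new_values <- survey$younger30[match(aaw$participant.code,
                                     survey$participant.code)]

# Insert the new column right after the age variable
pos <- match("survey.1.player.age", names(aaw))
aaw <- data.frame(
  aaw[, seq_len(pos), drop = FALSE],
  younger30 = new_values,
  aaw[, -seq_len(pos), drop = FALSE]
)

names(aaw)
aaw$younger30
```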
Before running the experiment
show_constant()
When we program experiments, we frequently add variables that turn out to be useless later. When we forget to remove them, especially in experiments with numerous rounds, the data frame becomes unreasonably large. With the function show_constant(), we can identify variables that have no variation and remove them before the experiment.
In the following example, a variable named constant is created, which does not vary. The function show_constant() shows us many variables that are also unchanging; however, we cannot delete most of them because they are oTree internal. Yet, to prevent an unreasonably large data frame, we should remove the variable constant before running the experiment.
# Make a constant column (this variable is usually created in oTree)
otree2$dictator$constant <- 3
# Show all columns that contain only one value
show_constant(oTree = otree2)
## $all_apps_wide
## [1] "participant._is_bot"
## [2] "participant._max_page_index"
## [3] "participant.visited"
## [4] "session.label"
## [5] "session.comment"
## [6] "session.config.real_world_currency_per_point"
## [7] "session.config.name"
## [8] "session.config.participation_fee"
## [9] "start.1.player.role"
## [10] "start.1.player.payoff"
## [11] "start.1.group.id_in_subsession"
## [12] "start.1.subsession.round_number"
## [13] "dictator.1.player.role"
## [14] "dictator.1.group.id_in_subsession"
## [15] "dictator.1.subsession.round_number"
## [16] "dictator.2.player.role"
## [17] "dictator.2.group.id_in_subsession"
## [18] "dictator.2.subsession.round_number"
## [19] "dictator.3.player.role"
## [20] "dictator.3.group.id_in_subsession"
## [21] "dictator.3.subsession.round_number"
## [22] "chatapp.1.player.role"
## [23] "chatapp.1.player.payoff"
## [24] "chatapp.1.group.id_in_subsession"
## [25] "chatapp.1.subsession.round_number"
## [26] "survey.1.player.role"
## [27] "survey.1.player.payoff"
## [28] "survey.1.group.id_in_subsession"
## [29] "survey.1.subsession.round_number"
##
## $chatapp
## [1] "session.comment" "participant._is_bot"
## [3] "participant._max_page_index" "participant.visited"
## [5] "player.role" "player.payoff"
## [7] "group.id_in_subsession" "subsession.round_number"
## [9] "session.label"
##
## $dictator
## [1] "session.comment" "participant._is_bot"
## [3] "participant._max_page_index" "participant.visited"
## [5] "player.role" "group.id_in_subsession"
## [7] "session.label" "constant"
##
## $start
## [1] "session.comment" "participant._is_bot"
## [3] "participant._max_page_index" "participant.visited"
## [5] "player.role" "player.payoff"
## [7] "group.id_in_subsession" "subsession.round_number"
## [9] "session.label"
##
## $survey
## [1] "session.comment" "participant._is_bot"
## [3] "participant._max_page_index" "participant.visited"
## [5] "player.role" "player.payoff"
## [7] "group.id_in_subsession" "subsession.round_number"
## [9] "session.label"
##
## $Time
## character(0)
##
## $Chats
## [1] "group_id" "session_id" "session_code" "channel"
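Finding columns without variation can be illustrated in base R by counting the unique values per column. The sketch below uses a toy data frame with placeholder values. It treats a column that is entirely NA as constant, which is an assumption made for this illustration; show_constant() offers its own handling and may behave differently.

```r
# Toy app data with two columns that never vary
toy <- data.frame(
  participant.code = c("p1", "p2", "p3"),
  player.role      = c(NA, NA, NA),
  constant         = 3,
  player.payoff    = c(1, 2, 3),
  stringsAsFactors = FALSE
)

# A column is constant if it holds exactly one distinct value
is_constant <- vapply(toy, function(x) length(unique(x)) == 1, logical(1))

constant_columns <- names(toy)[is_constant]
constant_columns
```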
codebook()
Thorough documentation of your experiment is essential. For detailed guidance on generating a codebook from your oTree code, refer to the vignette gmoTree Codebooks. I recommend creating this codebook before running the experiment, as it can help identify incomplete documentation or other potential issues within your code.
References
Chen, D. L., Schonger, M., & Wickens, C. (2016). oTree—An open-source platform for laboratory, online, and field experiments. Journal of Behavioral and Experimental Finance, 9, 88–97. https://doi.org/10.1016/j.jbef.2015.12.001