Investigating Instructor Intervention in MOOC Forums

Kan Min-Yen

Overview

The goal is to identify the circumstances under which interventions tend to be more effective, as well as the types of interventions that yield positive learning outcomes. With such results obtained from the virtual discussion forums of different types of MOOCs, it would be possible to identify contingency factors arising from the type of MOOC (e.g., science and technology MOOCs versus humanities and social sciences MOOCs) and the background of students (e.g., foundational versus advanced students) that can affect the effectiveness of various types of interventions. These results would, in turn, allow tools to be built to guide instructors (e.g., as part of a dashboard) so that they intervene only when necessary and with the right intervention. We also plan to integrate the tool(s) developed into the NUS Integrated Virtual Learning Environment (IVLE) and to apply the knowledge gleaned to help instructors at NUS manage IVLE’s discussion forums for their own courses.
Content analysis and computational linguistics methods require an enrichment of the discussion forum text. The bulk of our proposal thus centers on acquiring such annotated data. Since annotation is costly in time and effort, we divide our project implementation into three phases, each involving the annotation of an incrementally larger sample of forum threads.
– Phase 0 (Pilot Phase): We will annotate 10 threads from the discussion forums of one of the 3 MOOCs offered by NUS in 2014, hereafter referred to as NUS MOOCs. We have obtained NUS IRB approval to engage NUS and crowdsourced subjects to annotate the NUS MOOC data. We will use this phase to iron out operational issues that may crop up in recruiting and training annotators and in annotating this small corpus. We further plan to analyse the annotated corpus and revise our annotation schema if necessary.
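One standard check when analysing the pilot annotations before revising a schema is inter-annotator agreement. As a minimal sketch (the metric choice and the thread labels below are our own illustrative assumptions, not part of the proposal's annotation schema), Cohen's kappa between two annotators can be computed as follows:

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators' label sequences."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators label identically.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's marginal label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical thread-level labels from two annotators of the pilot corpus.
ann1 = ["question", "answer", "question", "issue", "answer", "question"]
ann2 = ["question", "answer", "answer", "issue", "answer", "question"]
print(round(cohen_kappa(ann1, ann2), 3))  # prints 0.739
```

A low kappa on the pilot corpus would be a concrete signal that the schema's categories need revision before the larger Phase 1 annotation effort.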
– Phase 1 (NUS MOOCs): We will additionally annotate all threads from the NUS MOOCs. We will measure key variables – both quantitative and qualitative – for learner performance from Coursera’s exported data. We anticipate publishing significant correlations and findings observed between the interventions, their types, and the learning gains obtained through forum participation. We recognize that findings from the NUS MOOCs may not generalize to other MOOCs because: i) the annotated data is only a small sample relative to all other MOOCs, and ii) MOOCs differ considerably in forum volume and intervention practices. Therefore, we seek to expand our dataset further in the next phase. To save annotation time and cost, we also plan to test whether machine learning can automatically label threads, using the Phase 1 annotated data as seeds.
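The automatic labeling step can be prototyped with any standard text classifier. As a minimal sketch (the label set and example posts are hypothetical, and a production system would use a richer model and feature set than unigram counts), a multinomial Naive Bayes classifier trained on seed annotations might look like:

```python
import math
from collections import Counter, defaultdict

class NaiveBayesThreadLabeler:
    """Minimal multinomial Naive Bayes over bag-of-words thread text."""

    def fit(self, threads, labels):
        self.priors = Counter(labels)            # class frequencies
        self.word_counts = defaultdict(Counter)  # per-class word counts
        for text, label in zip(threads, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        words, best, best_score = text.lower().split(), None, -math.inf
        for label, prior in self.priors.items():
            counts = self.word_counts[label]
            total = sum(counts.values())
            # Log-prior plus Laplace-smoothed log-likelihood of each word.
            score = math.log(prior / sum(self.priors.values()))
            score += sum(math.log((counts[w] + 1) / (total + len(self.vocab)))
                         for w in words)
            if score > best_score:
                best, best_score = label, score
        return best

# Hypothetical seed threads labeled for whether an instructor should intervene.
seed_threads = [
    "i am confused about the assignment deadline",
    "error when submitting quiz please help",
    "thanks that solved my problem",
    "great explanation thank you",
]
seed_labels = ["needs_intervention", "needs_intervention", "resolved", "resolved"]
model = NaiveBayesThreadLabeler().fit(seed_threads, seed_labels)
print(model.predict("please help i am confused"))  # prints needs_intervention
```

Classifier confidence on the unlabeled threads could also be used to prioritise which threads human annotators should label next.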
We will start our analysis and modeling in Phase 0 to derive hypotheses to test at larger scale on the Phase 1 data, in parallel with annotation.
– Phase 2 (Other Coursera MOOCs): This phase requires two modes of data collection. We will continue to annotate forum threads from MOOCs held at other universities that partner with Coursera. This requires consent and approval from those universities and from Coursera. While Coursera has agreed, we have also been advised to seek permission from the relevant universities directly. Efforts towards this are currently underway: we have worked with CDTL colleagues (Prof. Geertasema) to advertise our efforts to other universities, and we have some promising leads so far.
This phase also requires travel and onsite work at Coursera (Mountain View, CA) to collect and process data for the other variables of interest studied during Phase 1. This onsite work is necessary since Coursera cannot transfer MOOC student data outside partner universities. To this end, we seek travel support for two trips. The second trip is planned for the end of the project, to consolidate findings that might require additional data and to collaborate with Coursera on integrating the tools developed into learning management systems, including NUS IVLE, and into the client side (e.g., web browser) of MOOC platforms.
Building on the modeling and analyses done in Phase 1 (over the small Phase 0 data), we will test and deploy our methods over the Phase 1 and Phase 2 data, concurrently with the Phase 2 annotation.
We understand that LIF‐T funding specifically forbids funding for student field trips, but in this case the student’s physical presence at Coursera is necessary, and we seek an exemption from the LIF‐T funding policy in this respect.