Runbox Grading Script

This document explains how to write a grading script for Elice.

Structure of Elice Grading

Elice grading is based on file uploads: a Python script validates the files that a student submits. Each submission is limited to 50MB, measured after all of its files are compressed. Scores and submissions are stored on the server for future reference.

Any file type can be submitted, but the types listed below are recommended. Python-based exercises often use pickle for object serialization; this is discouraged for security reasons, because loading a pickle file can execute arbitrary code (refer to the official pickle documentation).

JSON

This is the recommended file format in most cases, with the exception of competitions.
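
As a rough sketch of the learner side, a result can be written as JSON with the standard json module. The file name result.json and the submission key mirror the example grading script later in this document; the helper name save_submission is hypothetical and not part of Elice.

```python
import json
from pathlib import Path


def save_submission(values, path="result.json"):
    """Write a list of numbers as a JSON submission file (hypothetical helper)."""
    # Convert to plain floats so NumPy scalars and similar numeric types serialize cleanly.
    payload = {"submission": [float(v) for v in values]}
    Path(path).write_text(json.dumps(payload), encoding="utf-8")
    return payload
```

Unlike pickle, reading this file back with json.load only ever produces plain data (dicts, lists, numbers, strings), so the grader never executes anything the learner supplied.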

CSV

This file format is recommended for competition submissions. For other cases, it is recommended to use JSON instead of CSV for both the submission file and the answer key file.
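
A competition-style check might look like the sketch below, which scores a CSV submission by accuracy against an answer key. The column names id and label and the function name csv_accuracy are assumptions made for illustration; a real competition would define its own columns and metric.

```python
import csv


def csv_accuracy(answer_path, submission_path, id_col="id", label_col="label"):
    """Return the fraction of answer-key rows whose label matches the submission."""
    def load(path):
        with open(path, newline="", encoding="utf-8") as f:
            return {row[id_col]: row[label_col] for row in csv.DictReader(f)}

    answers = load(answer_path)
    submitted = load(submission_path)
    if not answers:
        return 0.0
    # Rows missing from the submission count as wrong.
    correct = sum(1 for key, label in answers.items() if submitted.get(key) == label)
    return correct / len(answers)
```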

PNG

This is used when grading the results of libraries such as Matplotlib and Seaborn. It is recommended to calculate the similarity between images rather than checking if the images are exactly the same.
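
One simple way to compare images by similarity instead of exact equality is the mean absolute pixel difference, sketched below. This is only one possible metric, not the method Elice prescribes. The function names, the 128x128 comparison size, and the use of Pillow for decoding are all assumptions.

```python
def pixel_similarity(a, b):
    """Return 1.0 for identical pixel buffers and 0.0 for maximally different ones."""
    if len(a) != len(b) or len(a) == 0:
        return 0.0
    # Mean absolute difference of 8-bit channel values, scaled to [0, 1].
    total = sum(abs(x - y) for x, y in zip(a, b))
    return 1.0 - total / (255 * len(a))


def load_pixels(path, size=(128, 128)):
    """Load a PNG as raw RGB bytes at a fixed size (assumes Pillow is installed)."""
    from PIL import Image  # imported lazily; Pillow is a third-party package

    with Image.open(path) as img:
        # Resizing both images to the same size makes their buffers comparable.
        return img.convert("RGB").resize(size).tobytes()
```

A grader could then accept a plot when pixel_similarity(load_pixels(reference_path), load_pixels(submitted_path)) exceeds a threshold chosen for the exercise; a suitable threshold depends on the plot and has to be tuned by hand.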

NumPy array (.npy)

This is used when grading functions that return NumPy arrays directly.
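
A submitted array can be compared with a reference array as in the sketch below, which assumes NumPy is available in the grading environment. The function name arrays_match and the tolerance defaults are illustrative choices.

```python
import numpy as np


def arrays_match(reference_path, submission_path, rtol=1e-5, atol=1e-8):
    """Compare two .npy files by shape and by value within a tolerance."""
    # allow_pickle=False refuses pickled object arrays, avoiding arbitrary code execution.
    reference = np.load(reference_path, allow_pickle=False)
    submitted = np.load(submission_path, allow_pickle=False)
    if reference.shape != submitted.shape:
        return False
    return bool(np.allclose(reference, submitted, rtol=rtol, atol=atol))
```

Comparing with a tolerance rather than exact equality avoids failing a correct answer over floating-point rounding differences.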

Writing a Grading Script

When Elice receives a grading request, it executes the grader.py script from the zip file uploaded on the Elice creation/management page. The grader script can read the files submitted with the grading request from /mnt/elice/userfile, validate them in Python, and return a score and a message to the student.

In addition to grader.py, the zip file can include other resources needed for grading. For example, if you want to compare submissions with a reference file, include the reference file in the zip file. Files in the zip file are available only to the grading script, under /mnt/elice/grader; students cannot access them.

  • For the score, write the score as UTF-8 text to the /var/run/elice/grade_score file. Both integer and float scores are supported.

  • For the message, write the feedback for the student as UTF-8 text to the /var/run/elice/grade_message file.

Here is an example of a simple grading script:

import json
from scipy.spatial.distance import cosine

score = None
message = ""

try:
    with open("/mnt/elice/grader/reference.json", "r") as f:
        reference = json.load(f)
    with open("/mnt/elice/userfile/result.json", "r") as f:
        user_result = json.load(f)

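    # scipy's cosine() returns the cosine *distance* (1 - cosine similarity),
    # so smaller values mean the submission is closer to the reference.
    distance = cosine(
        reference,
        user_result['submission']
    )

    if distance < 0.1:
        score = "100"
        message = "good job!"
    elif distance < 0.5:
        # Scale linearly from 100 (distance 0.1) down to 0 (distance 0.5).
        score = 100 * (1 - (distance - 0.1) / 0.4)
        score = f"{score:.3f}"
        message = "well done"
    else:
        score = "0"
        message = "try again"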

except FileNotFoundError:
    score = "0"
    message = "missing required files"

except Exception:
    score = "0"
    message = "something went wrong :("

finally:
    if score is not None:
        with open("/var/run/elice/grade_score", "w") as f:
            f.write(str(score))

    with open("/var/run/elice/grade_message", "w") as f:
        f.write(message)
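
Packaging grader.py together with its reference files, as described earlier in this section, can be sketched as follows. This is a minimal sketch that assumes grader.py belongs at the top level of the archive, which this document does not state explicitly. The helper name build_grader_zip and the archive name grader.zip are hypothetical.

```python
import zipfile
from pathlib import Path


def build_grader_zip(source_dir, zip_path="grader.zip"):
    """Bundle grader.py and any reference files from source_dir into one zip."""
    source = Path(source_dir)
    target = Path(zip_path).resolve()
    if not (source / "grader.py").is_file():
        raise FileNotFoundError("expected grader.py directly inside source_dir")
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip the archive itself in case it is created inside source_dir.
            if path.is_file() and path.resolve() != target:
                # Store paths relative to source_dir so grader.py sits at the zip root.
                archive.write(path, path.relative_to(source).as_posix())
    return str(target)
```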

Running the Grading Script

Once the grading script is configured, you can trigger grading by running the following command inside the Elice workspace:

elice_grade result.json code.ipynb

For JupyterLab-based exercises, it is recommended to provide the command within the learning material in advance, so that learners can easily execute the command. For Orange3 and VSCode-based exercises, a grading script execution UI is provided.

Complex Grading Examples

  • You can receive multiple files at once and apply weighted grading to each problem.

  • In exercises that generate graphs or images, you can accept images or binary files for grading.

  • You can accept a simple machine learning model as the submission and grade it on its inference results.

    • Please note that grading uses only the CPU regardless of the student's runtime.
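
Weighted grading across several problems could be combined as in the sketch below. The function name weighted_score, the problem names, and the convention that each per-problem score lies between 0 and 100 are assumptions for illustration.

```python
def weighted_score(results, weights):
    """Combine per-problem scores (each 0-100) into one weighted 0-100 score."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # A problem with no result (for example, a missing file) contributes 0 points.
    earned = sum(weights[name] * results.get(name, 0.0) for name in weights)
    return earned / total_weight
```

For instance, weighted_score({"q1": 100, "q2": 50}, {"q1": 1, "q2": 3}) weights the second problem three times as heavily as the first; the result can then be written to the grade_score file like any other score.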
