Autograding code structure in CodeGrade using Semgrep
In this guide we discuss how you can easily set up automatic structure testing for your CodeGrade assignments using a handy tool called semgrep and give you some concrete examples you can use in your own CodeGrade assignments!

This is a summary of a guide published on our blog. Read the full guide here:

Autograding code structure using CodeGrade and Semgrep | CodeGrade Blog
Check out the link below for more information about Semgrep:
Semgrep
Learn more in our previous webinar on autograding code structure using Semgrep!

Real world scenarios

To better understand when you may want to automatically grade code structure and which problems we are solving in this blog, here are two real examples we received from computer science instructors:
  • For an Introduction to Programming in Python course, an instructor wants to effectively teach students the different types of loops. She has multiple assignments in the course, each focusing on a different loop. To force students to use a while-loop for one assignment and a for-loop for another, she wants to automatically detect these structures in the code and deduct points if a loop is missing or if the wrong type of loop was used.
  • For numerous programming courses, instructors want to enforce good coding practices that are not caught by traditional linters. For this, they want to deduct points for common bad "spaghetti code" practices. In our example, we will automatically deduct points for Java code with too many nested if-statements.
Of course, there are endless ways you can apply semgrep to fit your own assignments or programming languages.

Semgrep

Traditional linters, like pylint for Python or eslint for JavaScript, are easily used in CodeGrade and great for general, broad language standards, but not for specific code structure checks. Semgrep is a tool that can do static code analysis on the structure of code, based on very simple patterns you provide it. Originally designed to find security vulnerabilities in code, Semgrep is an open-source tool by the software security company r2c (originally developed at Facebook) that supports many programming languages like Go, Java, JavaScript, Python and Ruby, with TypeScript, PHP and C currently being beta-tested.
Semgrep can also be used for Jupyter Notebooks after converting the notebook to Python code. Learn how to do that in our blog on grading Jupyter Notebooks.
Semgrep makes it easy to perform more complex code analysis by allowing you to write rules in a human readable format. You can provide generic or language specific patterns, which are then found in the code. With its pattern syntax, you can find:
  • Equivalences: Matching code that means the same thing even though it looks different.
  • Wildcards / ellipsis (...): Matching any statement, expression or variable.
  • Metavariables ($X): Matching expressions whose exact form you do not know in advance, while requiring that the same expression appears wherever the same metavariable is reused in your pattern.
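To make the ellipsis and metavariable features concrete, the short Python file below marks the lines we would expect a few simple patterns to match. This is only an illustration: the function names are hypothetical, and the patterns `print(...)` and `$X == $X` are example choices that do not appear elsewhere in this guide, so it is worth checking them in the Semgrep online editor before relying on them.

```python
# Hypothetical student file, annotated with patterns that should match it.


def report(total, count):
    # Ellipsis: the pattern `print(...)` should match a call to print
    # regardless of which arguments are passed.
    print("total:", total)
    print(total, count, sep=", ")


def is_unchanged(value):
    # Metavariable: the pattern `$X == $X` needs the same expression on both
    # sides, so it should match the comparison below but not `value == 0`.
    return value == value


def average(values):
    # Metavariables plus ellipsis: `for $EL in $LST: ...` should match this
    # loop whatever the loop variable, the iterable and the body are.
    total = 0
    for v in values:
        total += v
    return total / len(values)
```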

Autograding code structure with semgrep in CodeGrade

CodeGrade has developed a wrapper for semgrep, which is one of the many pre-installed unit testing frameworks available in CodeGrade's Unit Test step. This wrapper parses your semgrep tests so that each individual rule you define shows up as one specific Unit Test that can either pass or fail.
By default, every match that semgrep finds for a pattern is considered an error. CodeGrade has added a feature that allows for "positive matches": cases where we define a pattern and expect to find it (e.g. when we require students to use a for-loop).
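The pass/fail logic described above can be sketched in a few lines of Python. This is not the code of CodeGrade's wrapper; the function name `rule_passes` and its parameters are hypothetical, and the sketch assumes a rule counts as "matched" as soon as at least one match is reported for it.

```python
def rule_passes(match_count: int, match_expected: bool) -> bool:
    """Return True when a rule's outcome agrees with its expectation.

    Assumption: one or more reported matches means the pattern was found.
    """
    matched = match_count > 0
    return matched == match_expected


# A rule that expects a match passes only when the pattern is found.
print(rule_passes(match_count=1, match_expected=True))   # True
print(rule_passes(match_count=0, match_expected=True))   # False

# A rule that does not expect a match passes only when nothing is found.
print(rule_passes(match_count=0, match_expected=False))  # True
print(rule_passes(match_count=2, match_expected=False))  # False
```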
Setting up semgrep
Semgrep is installed in CodeGrade by default, which makes setting up a semgrep AutoTest category simple. All that is required is to upload a YAML test file as a fixture. This test file contains all the pattern recognition rules used to test the structure of our students' code.
Detecting loops in Python
Now we can begin setting up our pattern matching rules. Semgrep has an online editor that can be used to check your patterns:
Semgrep
Here, we provide a solution to the examples introduced above. We have created a ruleset for the assignment for which we want to detect a for-loop in the student code, and make sure they do not use a while-loop.
rules:
- id: for-loop
  match-expected: true
  pattern: |
    for $EL in $LST:
      ...
  message: A for-loop was used
  severity: INFO
  languages:
  - python

- id: no-while-loop
  match-expected: false
  pattern: |
    while $COND:
      ...
  message: No while-loop was used
  severity: INFO
  languages:
  - python
In this file, rules.yml, we define two rules: for-loop and no-while-loop. Within these rules we define a few things:
  • Patterns
    • The ellipsis (...) is used to capture anything
    • The metavariables $EL (element) and $LST (list) capture the two parts of the for-loop declaration (the names of these metavariables are arbitrary and could have been anything else).
    • The same is done for the while-loop condition with $COND.
  • Messages: We provide readable messages that are parsed by the wrapper script, so that students can understand what each test checks.
  • Match-expected: Importantly, we have added the match-expected field (this field is added by the CodeGrade wrapper script and cannot be tested in regular semgrep). Setting this field to true for the for-loop rule specifies that a match is required for that test to pass.
  • Severity: This dictates the level of severity of failing the rule. The penalty for each severity level can be set in the Unit Test step.
  • Languages: Here we specify the language of the files that semgrep will check.
This ruleset will appear in CodeGrade's AutoTest results as two separate test suites. The student will only get half the points should they fail one of the rules (e.g. if they fail to use a for-loop in their code). The test suites use the message defined in the rule as their display name, making it clear to students what is being assessed.
Semgrep results in a CodeGrade automatic test. Each rule is parsed separately and the messages defined in the rules are displayed.
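To get a feel for how different submissions would fare against the two rules above, the following sketch approximates them locally with Python's built-in `ast` module. It is a stand-in technique, not Semgrep and not CodeGrade's wrapper: the helper names `check_loops` and `score` are hypothetical, the sample submissions are invented, and the scoring assumes both rules carry equal weight.

```python
import ast


def check_loops(source: str) -> dict:
    """Approximate the for-loop and no-while-loop rules using the ast module."""
    nodes = list(ast.walk(ast.parse(source)))
    has_for = any(isinstance(n, ast.For) for n in nodes)
    has_while = any(isinstance(n, ast.While) for n in nodes)
    return {
        "for-loop": has_for,             # rule with match-expected: true
        "no-while-loop": not has_while,  # rule with match-expected: false
    }


def score(results: dict) -> float:
    """Fraction of rules passed, assuming every rule has the same weight."""
    return sum(results.values()) / len(results)


# Invented sample submissions.
GOOD = """
total = 0
for n in [1, 2, 3]:
    total += n
"""

NO_LOOP = """
total = 1 + 2 + 3
"""

WRONG_LOOP = """
total, n = 0, 1
while n <= 3:
    total += n
    n += 1
"""

print(score(check_loops(GOOD)))        # 1.0: both rules pass
print(score(check_loops(NO_LOOP)))     # 0.5: the for-loop rule fails
print(score(check_loops(WRONG_LOOP)))  # 0.0: both rules fail
```

The submission without any loop fails only the for-loop rule and so keeps half the points, which mirrors the example given above.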
Detecting spaghetti code
For our second example, we would like to automatically deduct points for a common bad practice: too many nested if-statements. Structures like this can be caught with semgrep in CodeGrade:
rules:
- id: nested-if
  match-expected: false
  pattern: |
    if (...) {
      ...
      if (...) {
        ...
        if (...) {
          ...
        }
      }
    }
  message: No more than 2 nested if-statements were used.
  severity: INFO
  languages:
  - java
We use the ellipses (the ... in the pattern) here, so that any code can be in the condition and inside the blocks of the if-statements. As an if-statement itself can also be part of this, we automatically catch code that has more than three nested if-statements with this pattern too.
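The threshold behaviour can be explored with a rough Python analogue. The rule above targets Java; the sketch below instead measures how deeply if-statements are nested in Python source using the built-in `ast` module, so it only illustrates the idea and is not a replacement for the Semgrep rule. The function name `max_if_depth` and the samples are hypothetical. Note one difference: this sketch also counts if-statements nested through other statements such as loops, which the pattern above may not match.

```python
import ast


def max_if_depth(source: str) -> int:
    """Deepest nesting of if-statements inside one another's blocks."""

    def visit(node: ast.AST, depth: int) -> int:
        if isinstance(node, ast.If):
            inner = depth + 1
            deepest = inner
            for child in node.body:
                deepest = max(deepest, visit(child, inner))
            for child in node.orelse:
                # elif/else branches sit beside this if, not inside its block.
                deepest = max(deepest, visit(child, depth))
            return deepest
        deepest = depth
        for child in ast.iter_child_nodes(node):
            deepest = max(deepest, visit(child, depth))
        return deepest

    return visit(ast.parse(source), 0)


# Invented samples; the names a, b, c, d are never evaluated, only parsed.
SHALLOW = """
if a:
    if b:
        x = 1
"""

DEEP = """
if a:
    if b:
        if c:
            if d:
                x = 1
"""

ELIF_CHAIN = """
if a:
    x = 1
elif b:
    x = 2
elif c:
    x = 3
"""

for name, src in [("shallow", SHALLOW), ("deep", DEEP), ("elif chain", ELIF_CHAIN)]:
    depth = max_if_depth(src)
    # Three or more levels corresponds to what the nested-if rule flags.
    print(name, depth, "flagged" if depth >= 3 else "ok")
```

The sample with four levels is flagged just like one with three would be, which corresponds to the observation above that deeper nesting is caught by the same pattern.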