SuiteCRM strips and mangles text e.g. in Calls #5189

Open
opened 2026-02-20 16:31:09 -05:00 by deekerman · 9 comments
Owner

Originally created by @rtpt-alexanderneumann on GitHub (Apr 26, 2024).

Issue

We frequently discuss HTML and/or JavaScript issues with clients. We copy emails as text into the body of a call. I've recently upgraded to 7.14.3 (from a much older version) and I've discovered that SuiteCRM will mangle the text saved into the description of a Call (likely elsewhere as well):

  • Text between angle brackets < and > is removed, e.g. the text <script>fetch('/x')</script> is silently removed
  • For email addresses, a mailto: link is inserted in an odd way: the original text <foo@example.com> is replaced with <<a href="mailto:foo@example.com">foo@example.com</a>>, which is rendered in verbatim so the mailto link cannot be clicked (screenshots below)

Expected Behavior

I expect that text that I enter into the description field of a call (and everywhere else, too) is saved into the database as I entered it and it is displayed again (as text) with removing any parts.

Actual Behavior

  • SuiteCRM silently removes anything that looks like HTML or JavaScript
  • SuiteCRM replaces email addresses with a broken mailto: link (screenshots below)

Possible Fix

  1. Do not remove anything from the user's input
  2. On output, ensure that all special characters are encoded e.g. as HTML entities (via htmlentities())

Steps to Reproduce

  1. Log into SuiteCRM demo instance (version 7.14.3 at the time of writing)

  2. Edit a Call, enter the following text as the description:

From: <foo@example.com>
Subject: How to call fetch?

Is this the right way?

<script>fetch('/x')</script>

Screenshot from 2024-04-26 09-36-57

  1. Click Save, then the text has changed, everything between < and > was silently removed and the meaning is completely different:
From:
Subject: How to call fetch?

Is this the right way?

Screenshot from 2024-04-26 09-37-12

  1. When the call is edited again, it can be seen that this is not a display issue, but the text has been removed completely:
    Screenshot from 2024-04-26 09-37-33

  2. Edit the call again, enter the following text then click Save:

From: foo <foo@example.com>
Subject: How to call fetch?

Is this the right way?

Screenshot from 2024-04-26 09-37-46

  1. The description is displayed, but the email address has been modified and the following text is displayed:
From: foo <<a href="mailto:foo@example.com">foo@example.com</a>>

Screenshot from 2024-04-26 09-38-00

  1. Edit the call again, the text field shows the modified text:
From: foo <<a href="mailto:foo@example.com">foo@example.com</a>>

Screenshot from 2024-04-26 09-38-08

Context

This issue makes it very hard to trust SuiteCRM to not remove/modify Data I enter into the description fields. It has modified several calls in the background, which has changed the meaning of the text completely (sample described above).

We oftentimes discuss issues with HTML and JavaScript with clients and track such issues and support requests within SuiteCRM.

Since we regularly find and exploit XSS vulnerabilities, I know quite a bit about mitigations. Removing < and > before saving user-provided text into the database is not the right solution. For example, it does not help at all when user input is inserted into an HTML attribute. Instead, great care must be taken to always run the appropriate function (e.g. htmlentities()), which SuiteCRM seems to do right already.

It's just the input filter that's wrong here.

For me, this is a high-priority issue as it changes text I entered.

Please let me know if I can help in any way!

Your Environment

  • SuiteCRM Version used: 7.14.3
  • Browser name and version (e.g. Chrome Version 51.0.2704.63 (64-bit)): Firefox 125.0.2 on Linux
  • Environment name and version (e.g. MySQL, PHP 7): MariaDB 10.5.22
  • Operating System and version (e.g Ubuntu 16.04): PHP 8.2.10
Originally created by @rtpt-alexanderneumann on GitHub (Apr 26, 2024). <!--- Please be aware that as of the 31st January 2022 we no longer support 7.10.x. New issues referring to 7.10.x will only be valid if applicable to 7.12.x and above. If your issue is still applicable in 7.12.x, please create the issue following the template below --> #### Issue We frequently discuss HTML and/or JavaScript issues with clients. We copy emails as text into the body of a call. I've recently upgraded to 7.14.3 (from a much older version) and I've discovered that SuiteCRM will mangle the text saved into the description of a Call (likely elsewhere as well): * Text between angle brackets `<` and `>` is removed, e.g. the text `<script>fetch('/x')</script>` is silently removed * For email addresses, a `mailto:` link is inserted in an odd way: the original text `<foo@example.com>` is replaced with `<<a href="mailto:foo@example.com">foo@example.com</a>>`, which is rendered in verbatim so the mailto link cannot be clicked (screenshots below) #### Expected Behavior I expect that text that I enter into the description field of a call (and everywhere else, too) is saved into the database as I entered it and it is displayed again (as text) with removing any parts. #### Actual Behavior * SuiteCRM silently removes anything that looks like HTML or JavaScript * SuiteCRM replaces email addresses with a broken `mailto:` link (screenshots below) #### Possible Fix 1. Do not remove anything from the user's input 2. On output, ensure that all special characters are encoded e.g. as HTML entities (via `htmlentities()`) #### Steps to Reproduce 1. Log into SuiteCRM demo instance (version 7.14.3 at the time of writing) 2. Edit a Call, enter the following text as the description: ``` From: <foo@example.com> Subject: How to call fetch? Is this the right way? <script>fetch('/x')</script> ``` ![Screenshot from 2024-04-26 09-36-57](https://github.com/salesagility/SuiteCRM/assets/62751754/276d9933-9bac-4eb2-b9f0-8c35478f92ef) 3. Click `Save`, then the text has changed, everything between `<` and `>` was silently removed and the meaning is completely different: ``` From: Subject: How to call fetch? Is this the right way? ``` ![Screenshot from 2024-04-26 09-37-12](https://github.com/salesagility/SuiteCRM/assets/62751754/888e6e18-d8ce-4e41-a034-92257a10f773) 4. When the call is edited again, it can be seen that this is not a display issue, but the text has been removed completely: ![Screenshot from 2024-04-26 09-37-33](https://github.com/salesagility/SuiteCRM/assets/62751754/abbcbd72-045c-448d-80a4-0a8172ad8b92) 5. Edit the call again, enter the following text then click `Save`: ``` From: foo <foo@example.com> Subject: How to call fetch? Is this the right way? ``` ![Screenshot from 2024-04-26 09-37-46](https://github.com/salesagility/SuiteCRM/assets/62751754/19722554-fa10-4c2a-b679-3a8b62470048) 6. The description is displayed, but the email address has been modified and the following text is displayed: ``` From: foo <<a href="mailto:foo@example.com">foo@example.com</a>> ``` ![Screenshot from 2024-04-26 09-38-00](https://github.com/salesagility/SuiteCRM/assets/62751754/64a95f34-ee15-4f25-ad5a-6bae697bee4d) 7. Edit the call again, the text field shows the modified text: ``` From: foo <<a href="mailto:foo@example.com">foo@example.com</a>> ``` ![Screenshot from 2024-04-26 09-38-08](https://github.com/salesagility/SuiteCRM/assets/62751754/ed11dcb9-31e0-4123-bec0-9bb58c2bbb51) #### Context This issue makes it very hard to trust SuiteCRM to not remove/modify Data I enter into the description fields. It has modified several calls in the background, which has changed the meaning of the text completely (sample described above). We oftentimes discuss issues with HTML and JavaScript with clients and track such issues and support requests within SuiteCRM. Since we regularly find and exploit XSS vulnerabilities, I know quite a bit about mitigations. Removing `<` and `>` before saving user-provided text into the database is not the right solution. For example, it does not help at all when user input is inserted into an HTML attribute. Instead, great care must be taken to always run the appropriate function (e.g. `htmlentities()`), which SuiteCRM seems to do right already. It's just the input filter that's wrong here. For me, this is a high-priority issue as it changes text I entered. Please let me know if I can help in any way! #### Your Environment <!--- Include as many relevant details about the environment you experienced the bug in --> * SuiteCRM Version used: 7.14.3 * Browser name and version (e.g. Chrome Version 51.0.2704.63 (64-bit)): Firefox 125.0.2 on Linux * Environment name and version (e.g. MySQL, PHP 7): MariaDB 10.5.22 * Operating System and version (e.g Ubuntu 16.04): PHP 8.2.10
Author
Owner

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024):

At least some of the code in question was added in https://github.com/salesagility/SuiteCRM/pull/6883 by @g-fantini. Can you maybe chime in? :)

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024): At least some of the code in question was added in https://github.com/salesagility/SuiteCRM/pull/6883 by @g-fantini. Can you maybe chime in? :)
Author
Owner

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024):

The following patch disables cleaning HTML from text fields (which is the right thing to do in my professional opinion), it does not solve removing script tags yet:

diff --git a/data/SugarBean.php b/data/SugarBean.php
index 4f9ca9f25..d6b346f86 100755
--- a/data/SugarBean.php
+++ b/data/SugarBean.php
@@ -2522,11 +2522,6 @@ class SugarBean

                 if (isset($def['type']) && ($def['type'] == 'html' || $def['type'] == 'longhtml')) {
                     $this->$key = purify_html($this->$key);
-                } elseif (
-                    (strpos((string) $type, 'char') !== false || strpos((string) $type, 'text') !== false || $type == 'enum') &&
-                    !empty($this->$key)
-                ) {
-                    $this->$key = purify_html($this->$key);
                 }
             }
         }

Shall I submit a PR?

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024): The following patch disables cleaning HTML from text fields (which is the right thing to do in my professional opinion), it does not solve removing `script` tags yet: ```diff diff --git a/data/SugarBean.php b/data/SugarBean.php index 4f9ca9f25..d6b346f86 100755 --- a/data/SugarBean.php +++ b/data/SugarBean.php @@ -2522,11 +2522,6 @@ class SugarBean if (isset($def['type']) && ($def['type'] == 'html' || $def['type'] == 'longhtml')) { $this->$key = purify_html($this->$key); - } elseif ( - (strpos((string) $type, 'char') !== false || strpos((string) $type, 'text') !== false || $type == 'enum') && - !empty($this->$key) - ) { - $this->$key = purify_html($this->$key); } } } ``` Shall I submit a PR?
Author
Owner

@pgorod commented on GitHub (Apr 26, 2024):

If you search for some of my many rants regarding the over-zealous clean-ups in SuiteCRM you can get an idea of the depths of this problem....

https://github.com/salesagility/SuiteCRM/pull/9127

Just for laughs, I found this in my notes from a few years ago, when trying to work around one of these problems. This is a description of a code flow within SuiteCRM:

include/entryPoint.php:98 clean_incoming_data changes HTML tags in $_REQUEST, saves originals in RAW_REQUEST

Later in Save, handleAttachmentsProcessImages is called in modules/EmailTemplates/EmailTemplateFormBase.php:171
which calls processImages, which uses from_html (from db_utils?) to reverse HMTL tags to examine img links etc, but doesn't save it unless there are images... (incongruent). 
At the end of that, calls save from EmailTemplate, which calls parent::save from SugarBean
Which calls cleanBean, overriden in modules/EmailTemplates/EmailTemplate.php:855
This encodes vars, calls parent::cleanBean, puts back variables de-encoded...
cleanBean in SugarBean iterates all fields, has "if" for some types, applies this jewel:
   $this->$key = htmlentities(SugarCleaner::cleanHtml($this->$key, true));
   this calls HtmlSanitizer.php, cleanHtml
   which starts by running html_entity_decode but does not use ENT_QUOTES, so misses &#039; (')
   if ($removeHtml === true) $clean_html = [$sugarCleaner]$purifier->purify($dirty_html_decoded);
   which throws away data
Save, goes into DBManager, uses from_html on each field, which uses html_entity_decode(with ENTQUOTES) and str_ireplace

I know that sounds depressing, there are just too many layers in there, and some major errors were made years ago... now people are afraid to revert them. But in practice we can often solve the issues just by overriding a save function and putting back RAW values.

What we should be doing is simply:

  • don't strip anything when going into the system (this is the opposite of what is currently done!)
  • escape stuff that is dangerous to the DB (SQL delimiters etc) right before going into the DB
  • escape stuff that is dangerous to JS right before going out to the browser
  • same for HTML
@pgorod commented on GitHub (Apr 26, 2024): If you search for some of my many rants regarding the over-zealous clean-ups in SuiteCRM you can get an idea of the depths of this problem.... https://github.com/salesagility/SuiteCRM/pull/9127 Just for laughs, I found this in my notes from a few years ago, when trying to work around one of these problems. This is a description of a code flow within SuiteCRM: ``` include/entryPoint.php:98 clean_incoming_data changes HTML tags in $_REQUEST, saves originals in RAW_REQUEST Later in Save, handleAttachmentsProcessImages is called in modules/EmailTemplates/EmailTemplateFormBase.php:171 which calls processImages, which uses from_html (from db_utils?) to reverse HMTL tags to examine img links etc, but doesn't save it unless there are images... (incongruent). At the end of that, calls save from EmailTemplate, which calls parent::save from SugarBean Which calls cleanBean, overriden in modules/EmailTemplates/EmailTemplate.php:855 This encodes vars, calls parent::cleanBean, puts back variables de-encoded... cleanBean in SugarBean iterates all fields, has "if" for some types, applies this jewel: $this->$key = htmlentities(SugarCleaner::cleanHtml($this->$key, true)); this calls HtmlSanitizer.php, cleanHtml which starts by running html_entity_decode but does not use ENT_QUOTES, so misses &#039; (') if ($removeHtml === true) $clean_html = [$sugarCleaner]$purifier->purify($dirty_html_decoded); which throws away data Save, goes into DBManager, uses from_html on each field, which uses html_entity_decode(with ENTQUOTES) and str_ireplace ``` I know that sounds depressing, there are just too many layers in there, and some major errors were made years ago... now people are afraid to revert them. But in practice we can often solve the issues just by overriding a save function and putting back RAW values. What we should be doing is simply: - don't strip anything when going into the system (this is the opposite of what is currently done!) - escape stuff that is dangerous to the DB (SQL delimiters etc) right before going into the DB - escape stuff that is dangerous to JS right before going out to the browser - same for HTML
Author
Owner

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024):

I feared that it may be a lot deeper, but was too hesitant to look into it (not to stare into the abyss too much). I agree on most points, except this one:

escape stuff that is dangerous to the DB (SQL delimiters etc) right before going into the DB

Use Prepared Statements instead :)

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024): I feared that it may be a lot deeper, but was too hesitant to look into it (not to stare into the abyss too much). I agree on most points, except this one: > escape stuff that is dangerous to the DB (SQL delimiters etc) right before going into the DB Use Prepared Statements instead :)
Author
Owner

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024):

Oh wow, I just found out that I (with a different account) reported the exakt same behavior back in 2017 https://github.com/salesagility/SuiteCRM/issues/4446

@rtpt-alexanderneumann commented on GitHub (Apr 26, 2024): Oh wow, I just found out that I (with a different account) reported the **exakt same behavior** back in 2017 https://github.com/salesagility/SuiteCRM/issues/4446
Author
Owner

@pgorod commented on GitHub (Apr 26, 2024):

I agree with the Prepared Statements, but that would be a very big change to apply everywhere in SuiteCRM code. The new v8 doesn't have any of these issues, I guess the way forward is just to progressively deprecate all of the old stuff. The new stuff naturally leads to much better practices (e.g. Symfony Doctrine)

@pgorod commented on GitHub (Apr 26, 2024): I agree with the Prepared Statements, but that would be a very big change to apply everywhere in SuiteCRM code. The new v8 doesn't have any of these issues, I guess the way forward is just to progressively deprecate all of the old stuff. The new stuff naturally leads to much better practices (e.g. Symfony Doctrine)
Author
Owner

@rtpt-alexanderneumann commented on GitHub (Apr 30, 2024):

The new v8 doesn't have any of these issues

Hm, I'm not sure I understand you correctly. The v8 demo instance also just removes everything between < and > in descriptions of calls (<script> tags, email addresses).

@rtpt-alexanderneumann commented on GitHub (Apr 30, 2024): > The new v8 doesn't have any of these issues Hm, I'm not sure I understand you correctly. The [v8 demo instance](https://suite8demo.suiteondemand.com) also just removes **everything** between `<` and `>` in descriptions of calls (`<script>` tags, email addresses).
Author
Owner

@pgorod commented on GitHub (May 1, 2024):

You're right, I guess I overstated things (a lot). The technologies in v8 are better and potentially provide a cleaner architecture and better code. But there are still many bits of v7 in the code flow and so these bugs still arise.

@pgorod commented on GitHub (May 1, 2024): You're right, I guess I overstated things (a lot). The _technologies_ in v8 are better and _potentially_ provide a cleaner architecture and better code. But there are still many bits of v7 in the code flow and so these bugs still arise.
Author
Owner

@chris001 commented on GitHub (May 2, 2024):

If the app would call mysqli_real_escape_string on the the user provided text while saving into the database, and before outputting to the page, first process the text with htmlspecialchars(), then the exact user-generated text would display on the page correctly, and this issue bug would be solved.

@chris001 commented on GitHub (May 2, 2024): If the app would call `mysqli_real_escape_string` on the the user provided text while saving into the database, and before outputting to the page, first process the text with `htmlspecialchars()`, then the exact user-generated text would display on the page correctly, and this issue bug would be solved.
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
starred/SuiteCRM-SuiteCRM#5189
No description provided.