thirty bees forum

HTMLPurifier's purpose


Jeffrey de Bruijn

Question

Hello,

I am running a basic thirty bees install with Niara as the theme.

Niara is based on Bootstrap and ships with the vendor collapse and transition JavaScript plugins, so the accordion example found here should work fine.

I edit the CMS page and paste in this code:

<div class="panel-group" id="accordion" role="tablist" aria-multiselectable="true">
  <div class="panel panel-default">
    <div class="panel-heading" role="tab" id="headingOne">
      <h4 class="panel-title">
        <a role="button" data-toggle="collapse" data-parent="#accordion" href="#collapseOne" aria-expanded="true" aria-controls="collapseOne">
          Collapsible Group Item #1
        </a>
      </h4>
    </div>
    <div id="collapseOne" class="panel-collapse collapse in" role="tabpanel" aria-labelledby="headingOne">
      <div class="panel-body">
        Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.
      </div>
    </div>
  </div>
  <div class="panel panel-default">
    <div class="panel-heading" role="tab" id="headingTwo">
      <h4 class="panel-title">
        <a class="collapsed" role="button" data-toggle="collapse" data-parent="#accordion" href="#collapseTwo" aria-expanded="false" aria-controls="collapseTwo">
          Collapsible Group Item #2
        </a>
      </h4>
    </div>
    <div id="collapseTwo" class="panel-collapse collapse" role="tabpanel" aria-labelledby="headingTwo">
      <div class="panel-body">
        Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.
      </div>
    </div>
  </div>
  <div class="panel panel-default">
    <div class="panel-heading" role="tab" id="headingThree">
      <h4 class="panel-title">
        <a class="collapsed" role="button" data-toggle="collapse" data-parent="#accordion" href="#collapseThree" aria-expanded="false" aria-controls="collapseThree">
          Collapsible Group Item #3
        </a>
      </h4>
    </div>
    <div id="collapseThree" class="panel-collapse collapse" role="tabpanel" aria-labelledby="headingThree">
      <div class="panel-body">
        Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.
      </div>
    </div>
  </div>
</div>

And as soon as I save the changes, the code that is actually saved is this:

<div class="panel-group" id="accordion">
<div class="panel panel-default">
<div class="panel-heading" id="headingOne">
<h4 class="panel-title"><a href="#collapseOne"> Collapsible Group Item #1 </a></h4>
</div>
<div id="collapseOne" class="panel-collapse collapse in">
<div class="panel-body">Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.</div>
</div>
</div>
<div class="panel panel-default">
<div class="panel-heading" id="headingTwo">
<h4 class="panel-title"><a class="collapsed" href="#collapseTwo"> Collapsible Group Item #2 </a></h4>
</div>
<div id="collapseTwo" class="panel-collapse collapse">
<div class="panel-body">Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.</div>
</div>
</div>
<div class="panel panel-default">
<div class="panel-heading" id="headingThree">
<h4 class="panel-title"><a class="collapsed" href="#collapseThree"> Collapsible Group Item #3 </a></h4>
</div>
<div id="collapseThree" class="panel-collapse collapse">
<div class="panel-body">Anim pariatur cliche reprehenderit, enim eiusmod high life accusamus terry richardson ad squid. 3 wolf moon officia aute, non cupidatat skateboard dolor brunch. Food truck quinoa nesciunt laborum eiusmod. Brunch 3 wolf moon tempor, sunt aliqua put a bird on it squid single-origin coffee nulla assumenda shoreditch et. Nihil anim keffiyeh helvetica, craft beer labore wes anderson cred nesciunt sapiente ea proident. Ad vegan excepteur butcher vice lomo. Leggings occaecat craft beer farm-to-table, raw denim aesthetic synth nesciunt you probably haven't heard of them accusamus labore sustainable VHS.</div>
</div>
</div>
</div>

Many attributes (role, data-toggle, data-parent and the aria-* attributes) are stripped from the HTML tags, so the accordion does not work at all.
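
For illustration, the stripping shown above behaves like an attribute allowlist: anything not on the list is dropped when the content is saved. Below is a minimal Python sketch of that idea. It is not HTMLPurifier's actual code (HTMLPurifier is a PHP library with far more detailed rules); the `ALLOWED_ATTRS` set and the `strip_attributes` helper are assumptions chosen only to mirror the saved output.

```python
import html
from html.parser import HTMLParser

# Hypothetical allowlist, chosen only to mirror the saved output above.
ALLOWED_ATTRS = {"class", "id", "href"}


class AttributeFilter(HTMLParser):
    """Re-emit HTML, keeping only allowlisted attributes on every tag.

    Void elements such as <img> are not special-cased in this sketch.
    """

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        kept = []
        for name, value in attrs:
            if name in ALLOWED_ATTRS:
                kept.append(f' {name}="{html.escape(value or "", quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # The parser hands over unescaped text, so escape it again on output.
        self.out.append(html.escape(data, quote=False))


def strip_attributes(markup):
    """Return markup with every non-allowlisted attribute removed."""
    parser = AttributeFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

Feeding it one of the accordion links, for example `strip_attributes('<a role="button" data-toggle="collapse" href="#collapseOne">Item</a>')`, leaves only the `href`, which is why Bootstrap's collapse plugin no longer has anything to hook into.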

How do I work around this issue?

EDIT:

Disabling the Use HTMLPurifier Library option solves the issue.

What exactly is the purpose of HTMLPurifier? What kind of security vulnerability is it supposed to protect against?

Edited by Jeffrey de Bruijn
Edited title because I believe I have found the solution (to the original issue) on my own.

8 answers to this question

9 minutes ago, Jeffrey de Bruijn said:

I know what it does, but what is the point of it in thirty bees?

The same as the main idea behind HTML Purifier. It is optional because in some cases it purifies too much, so if it does not work for you, just turn it off.

Quote

HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C's specifications.

 

14 hours ago, cienislaw said:

The same as the main idea behind HTML Purifier. It is optional because in some cases it purifies too much, so if it does not work for you, just turn it off.

 

But does this assume that the shop owner could enter malicious code, or is it meant to prevent somebody else from entering malicious code? I don't understand that. If it is there for "safety", there must be a case in which safety could be compromised, and I don't understand what that case could be.

5 minutes ago, Jeffrey de Bruijn said:

But does this assume that the shop owner could enter malicious code, or is it meant to prevent somebody else from entering malicious code? I don't understand that. If it is there for "safety", there must be a case in which safety could be compromised, and I don't understand what that case could be.

This is a generic mechanism that is used across multiple areas. It can be used in back office forms, or even by some front office features (modules).

Even in back office use, it's always a good idea to be cautious. A shop owner could copy and paste some HTML code that contains JavaScript or CSS. This JavaScript would make it to the shop's front office, and that's quite a severe security issue. Even if the JS is not an attack, it can easily break your pages by throwing JavaScript errors, preventing your own JavaScript code from working properly. Similarly, inline CSS can very easily make your pages unusable.

While WYSIWYG editors are useful, the backend PHP should never trust the input. Sanitizing the input helps a great deal.
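
As a rough illustration of the kind of pasted content this guards against, here is a minimal Python sketch that drops `<script>` elements, inline `on*` event handlers and `javascript:` links. The class and function names are hypothetical, and the filter is deliberately incomplete: it is not how HTMLPurifier works internally and should not be used as a real sanitizer.

```python
import html
from html.parser import HTMLParser


class ActiveContentStripper(HTMLParser):
    """Drop <script> elements, on* handlers and javascript: URLs."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.in_script = False

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True  # swallow everything until </script>
            return
        kept = []
        for name, value in attrs:
            value = value or ""
            if name.startswith("on"):
                continue  # inline event handler such as onclick
            if name in ("href", "src") and value.strip().lower().startswith("javascript:"):
                continue  # script hidden in a URL
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(html.escape(data, quote=False))


def remove_active_content(markup):
    """Return markup without the script-bearing parts listed above."""
    parser = ActiveContentStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

A snippet copied from elsewhere, such as `<p onclick="steal()">Hello</p><script>alert("xss")</script>`, would come out as plain `<p>Hello</p>`, so nothing executable reaches the front office.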

If you need to enter special HTML markup into your 'texts', that's a very good indication that you are doing something wrong. You should modify your theme templates instead, and keep your content clean: just text with some basic formatting options.


 

On 3/1/2022 at 12:08 PM, datakick said:

This is a generic mechanism that is used across multiple areas. It can be used in back office forms, or even by some front office features (modules).

Even in back office use, it's always a good idea to be cautious. A shop owner could copy and paste some HTML code that contains JavaScript or CSS. This JavaScript would make it to the shop's front office, and that's quite a severe security issue. Even if the JS is not an attack, it can easily break your pages by throwing JavaScript errors, preventing your own JavaScript code from working properly. Similarly, inline CSS can very easily make your pages unusable.

While WYSIWYG editors are useful, the backend PHP should never trust the input. Sanitizing the input helps a great deal.

So this is also used in the front office; then yes, it makes a lot of sense to prevent users from entering anything that isn't simple text. And I think I also understand that its use in the back office is to prevent users who aren't knowledgeable from entering dangerous code. Thank you for the explanation.

On 3/1/2022 at 12:08 PM, datakick said:

If you need to enter special HTML markup into your 'texts', that's a very good indication that you are doing something wrong. You should modify your theme templates instead, and keep your content clean: just text with some basic formatting options.

In this situation, I'm trying to create a FAQ page that is a CMS page with the code above.

I entered the markup in the editor because the sample pages that come with a clean thirty bees installation suggest that this is how it should be done: the default "about-us" page comes with markup for a 3-column layout, and the only way to create these 3 columns seems to be by entering markup in the editor. So I did not expect HTMLPurifier to also purify code added by the shop owner.

I am interested in that suggestion of yours, and I welcome the suggestion that I am doing something wrong, since I am certainly no thirty bees expert. So thank you for suggesting that course of action.

I thought about how to go about this problem by using a dedicated template file for this specific page. This is what I would do:

  • Override CmsController.php to load a specific template for the FAQ page
  • Write the code above in the template, replacing the content lines with Smarty {l s="content"} calls
  • Manage the translations from the translations menu in the backoffice
  • The content of the page in the CMS menu will be left blank

However, while each FAQ question would be a single paragraph, the answers would span multiple paragraphs, contain links, and change frequently (so the paragraph count would likely vary). In a template I would have to decide the number of paragraphs and their structure in advance (because of the links), and editing this content-turned-template becomes considerably more complex than just editing the CMS page content in the CMS menu.

Is there a better way to handle this, without disabling HTMLPurifier, that I am not aware of?

Edited by Jeffrey de Bruijn
Added thoughts on editing .tpl

HTMLPurifier strips picture tags too. The following

<picture>
    <source type="image/avif" srcset="/themes/themename/img/test.avif" class="picture-responsive" alt="Responsive image">
    <img src="/themes/themename/img/test.jpeg" class="img-responsive" alt="Responsive image">
</picture>

Becomes

<img src="/themes/themename/img/test.jpeg" class="img-responsive" alt="Responsive image">

I think this is a bit excessive. I read on the project's website that there should be a whitelist. How can I manage that whitelist in thirty bees?
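
What happens to the picture markup above looks like an element allowlist at work: tags that are not on the list are dropped while their allowed children are kept. The Python sketch below only illustrates that behaviour. `ALLOWED_TAGS`, `VOID_TAGS` and `filter_elements` are hypothetical names, and the list itself is an assumption, not the configuration thirty bees passes to HTMLPurifier.

```python
import html
from html.parser import HTMLParser

# Hypothetical element allowlist; <picture> and <source> are deliberately absent.
ALLOWED_TAGS = {"div", "p", "a", "h4", "img"}
# Elements that never take a closing tag.
VOID_TAGS = {"img", "source", "br", "hr"}


class ElementFilter(HTMLParser):
    """Drop tags that are not allowlisted but keep what they contain."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # the tag disappears, its children are still processed
        rendered = []
        for name, value in attrs:
            rendered.append(f' {name}="{html.escape(value or "", quote=True)}"')
        self.out.append(f"<{tag}{''.join(rendered)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data, quote=False))


def filter_elements(markup):
    """Return markup containing only allowlisted elements."""
    parser = ElementFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)
```

With this list, a `<picture>` wrapper and its `<source>` child vanish and only the fallback `<img>` survives, matching the saved result. Adding `"picture"` and `"source"` to `ALLOWED_TAGS` is the sketch's equivalent of extending the whitelist.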

Edited by Jeffrey de Bruijn
Correct name