thirty bees forum

robots.txt problem




Hmm, weird. Looks like some sort of data corruption. Could you run this SQL query and paste the results here?

SELECT m.page, ml.url_rewrite, l.iso_code
FROM tb_meta m
INNER JOIN tb_meta_lang ml ON ml.id_meta = m.id_meta
INNER JOIN tb_lang l ON l.id_lang = ml.id_lang
WHERE l.active = 1 
  AND m.page IN ('addresses', 'address', 'authentication', 'cart', 'discount', 'footer', 'get-file', 'header', 'history', 'identity', 'images.inc', 'init', 'my-account', 'order', 'order-opc', 'order-slip', 'order-detail', 'order-follow', 'order-return', 'order-confirmation', 'pagination', 'password', 'pdf-invoice', 'pdf-order-return', 'pdf-order-slip', 'product-sort', 'search', 'statistics', 'attachment', 'guest-tracking')



You should run this SQL statement to (somewhat) align your database with version 1.1.0:

UPDATE tb_meta 
SET configurable = 1 
WHERE page IN ('addresses', 'address', 'authentication', 'cart', 'discount', 'footer', 'get-file', 'header', 'history', 'identity', 'images.inc', 'init', 'my-account', 'order', 'order-opc', 'order-slip', 'order-detail', 'order-follow', 'order-return', 'order-confirmation', 'pagination', 'password', 'pdf-invoice', 'pdf-order-return', 'pdf-order-slip', 'product-sort', 'search', 'statistics', 'attachment', 'guest-tracking');

This will show the missing entries in the list. Then you can edit them and assign a friendly URL to each missing entry (except the module-* ones and index).

After that, robots.txt generation will work correctly again.
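
To check which entries still need a friendly URL, something along these lines might help. It's only a sketch: it reuses the tables and columns from the queries above, and it assumes every tb_meta record already has a matching tb_meta_lang row for each active language (records without one will not show up here).

```sql
-- List pages whose friendly URL is still empty in an active language.
-- index and module-* pages are skipped, since they don't need one.
SELECT m.id_meta, m.page, l.iso_code, ml.url_rewrite
FROM tb_meta m
INNER JOIN tb_meta_lang ml ON ml.id_meta = m.id_meta
INNER JOIN tb_lang l ON l.id_lang = ml.id_lang
WHERE l.active = 1
  AND (ml.url_rewrite IS NULL OR ml.url_rewrite = '')
  AND m.page <> 'index'
  AND m.page NOT LIKE 'module-%'
ORDER BY m.page, l.iso_code;
```

If this returns no rows, every remaining page has a friendly URL in every active language.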

4 hours ago, SLiCK_303 said:

Any clue what might have caused this problem? 

It's hard to explain, because it was the result of a few different things.

1) Theme references to pages

A theme can reference different pages in its config.xml file to set the appearance of the left/right columns. These display layout settings are saved in the tb_theme_meta table, but that table needs an existing parent record in the tb_meta table. There used to be a bug in 1.0.8: if the tb_meta record didn't exist, the layout display settings were not saved during theme installation.

What did this mean? Let's say a theme specifies that the 'All reviews' page of the 'revws' module should be displayed with the left column enabled. If you installed this theme before the revws module, this setting would be lost during theme installation, because the 'All reviews' page was not yet known to thirty bees at that time. If you later installed the revws module, its 'All reviews' page would be displayed without the left column.

If, however, you installed the revws module first, the result would be different. During theme installation the page was already known, so the display preference would be stored, and the page would be displayed with the left column.

As you can see, we got two different outcomes depending on the order of installation. That's not good.

So, in 1.1.x, there's a fix for this issue: if the page is not known yet, a new placeholder entry is created in the tb_meta table. That allows us to register the display preference for the page even before it's known to thirty bees. If you later install the revws module, everything will work as expected. If you don't, well, we only wasted one table row.

2) Missing entries for standard pages in tb_meta

Many records for standard controllers were missing from the tb_meta table. That meant you couldn't set the left/right column display for these pages, and you couldn't specify a friendly URL for them either.

Again, in 1.1.x, these records were added to the data pack: if you install 1.1.x from scratch, all known controllers are represented by their tb_meta records. However, if you upgraded from 1.0.8, these controllers don't have tb_meta records. Because of #1, the records will be created automatically, but they differ from the ones created during installation. Most notably, the automatically generated records don't have a friendly URL assigned.

3) Bug in robots.txt

When the robots.txt file is generated, the system uses data from the tb_meta table to generate the disallow rules. This mechanism has always contained bugs, but they manifested differently:

in 1.0.8 --> the meta table didn't contain all records --> the missing records were NOT represented in robots.txt by respective disallow rules

in 1.1.x migrated from 1.0.8 --> the meta table now contains automatically generated placeholder entries for standard pages (with an empty friendly URL) --> this results in Disallow / rules

in 1.1.x installed from scratch --> the meta table contains entries for all standard pages --> robots.txt has correct results
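
To make the second case easier to see, here is a rough illustration in SQL. This is not the actual generator code, just an approximation of its effect under the assumption that each rule is built by appending the friendly URL to a "Disallow: /" prefix. The approx_rule alias is made up for this example.

```sql
-- Illustration only: approximates how a disallow rule is derived
-- from the friendly URL. An empty url_rewrite leaves the bare prefix,
-- which blocks the whole site.
SELECT m.page,
       l.iso_code,
       ml.url_rewrite,
       CONCAT('Disallow: /', COALESCE(ml.url_rewrite, '')) AS approx_rule
FROM tb_meta m
INNER JOIN tb_meta_lang ml ON ml.id_meta = m.id_meta
INNER JOIN tb_lang l ON l.id_lang = ml.id_lang
WHERE l.active = 1
ORDER BY m.page, l.iso_code;
```

Any row where approx_rule is exactly 'Disallow: /' corresponds to a placeholder entry with an empty friendly URL.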



As you can see, this is quite a mess. By fixing two long-standing bugs, another one was created / discovered. Unfortunately, this is the reality of legacy software.

