thirty bees forum

dynambee

Members
  • Posts

    837
  • Joined

  • Last visited

  • Days Won

    5

Posts posted by dynambee

  1. 2 minutes ago, yaniv14 said:

    I am not familiar with Mollie module, but all the payment modules I am aware of don't really use the next available order number when using redirect method to payment gateways, they just send the cart id and expect to get it back when being redirected back to TB.

    I am also not familiar with Mollie. However, if the TB order number is showing up in the PayPal payment, then the Mollie module must be passing it from TB to PayPal. I do not see an order number in PayPal when using the TB PayPal module, so I don't think the TB module does the same thing.

    @30knees, can you confirm that it is the TB order number that is appearing in the PayPal purchase details? Do you know if it is the id_order value or if it is the order reference value?

  2. On 6/12/2019 at 5:24 AM, 30knees said:

    Just a note that this happened again.

    I have a Paypal paid order that isn't in the shop.

    Paypal shows order number 123 but the store order number 123 belongs to a manually created order 123. Both orders were placed approximately at the same time. I guess some fluke meeting of orders?

    I think @musicmaster described the reason this happens pretty well. If the cart of the missing order isn't converted into an order in TB then the next order created will get the order number that was supposed to be used for the "missing" order.

    However, this brings up another question, and another possible reason this could be happening: it takes time for people to complete checkout with PayPal. Usually that is 10-20 seconds, but in some cases it could take several minutes (a person trying to find their PayPal password, doing password recovery, creating a new shipping address, or something like that). If Mollie is in fact checking for the next available order number to send to PayPal, but not immediately creating the order in TB, then an order number collision could happen.

    Like this:

    Customer 1: Gets to the end of the checkout process and the Mollie module sends them off to PayPal to complete the checkout process. Before sending them, Mollie looks to see what the next available order number is. Let's say "513". Mollie sends this number to PayPal so the order number is included with the PayPal payment. Customer #1 takes some time to finish up with PayPal.

    Customer 2: Gets to the end of the checkout process before the previous customer completes their PayPal payment with Mollie. In this case when Mollie looks up the next available order number it will also find "513", and will send 513 to PayPal. Customer #2 heads off to PayPal to complete payment and completes payment very quickly. Mollie then converts their pending cart into an order using order #513.

    Customer 1: They finally complete their PayPal payment and are returned to TB. Mollie tries to convert the cart into an order but order #513 is now used by Customer #2's order. This results in a unique ID collision and Mollie fails to convert the cart into an order.

    In your case above where order #123 belongs to a manually created order, replace Customer #2 with your manual order creation. Same thing happens. The order number Mollie expected to be able to use ends up being used by your manually created order before Customer #1 completes their PayPal payment.
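    To make the sequence above concrete, here is a minimal Python sketch of the race. It is a toy model only: `OrderTable`, `peek_next_number` and `create_order` are hypothetical names, and nothing here reflects how the Mollie module is actually implemented.

```python
class OrderTable:
    """Toy stand-in for the orders table; order numbers must be unique."""

    def __init__(self, last_order_number):
        self.orders = {}
        self.last_order_number = last_order_number

    def peek_next_number(self):
        # Looks up the next free number WITHOUT reserving it.
        return max([self.last_order_number, *self.orders]) + 1

    def create_order(self, number, customer):
        # Mimics a unique-key constraint on the order number.
        if number in self.orders:
            raise ValueError(f"order #{number} already exists")
        self.orders[number] = customer


def run_scenario():
    table = OrderTable(last_order_number=512)

    # Both customers are sent to the payment page before either order exists,
    # so both are handed the same "next available" number.
    number_for_customer_1 = table.peek_next_number()
    number_for_customer_2 = table.peek_next_number()

    # Customer 2 pays quickly and gets the order number first.
    table.create_order(number_for_customer_2, "customer 2")

    # Customer 1 finishes later; the number is already taken.
    try:
        table.create_order(number_for_customer_1, "customer 1")
        outcome = "created"
    except ValueError:
        outcome = "collision"
    return number_for_customer_1, number_for_customer_2, outcome
```

    In this model the problem goes away if the number is reserved at the moment it is handed out, or if the order is only numbered when it is actually created.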

    The above is just speculation on my part as I do not know the internals of how Mollie works and I do not know if Mollie actually waits to convert the cart into an order until after the PayPal process completes. Someone like @datakick might have a better idea.

    • Thanks 1
  3. 4 hours ago, wakabayashi said:

    In general I could not imagine a better way, than doing once rand() and check, if its unique.

    Yes, if the rand() function operates properly you would be correct.
     

    4 hours ago, wakabayashi said:

    If the collision rate is exceptionally high, it indicates that this is not really a normal distribution.

    Also correct. Unfortunately when generating big numbers many rand() functions do not work properly. There are various workarounds for this but the best solution is a properly working rand() function.
     

    4 hours ago, wakabayashi said:

    But as you say since php 7.1 is fine, why dont you just use it then?

    I do run PHP 7.1. I only discovered a short time ago that the rand() function in PHP 7.1 has been greatly improved and now uses a well known random number generator.

    So for my particular situation calling rand() or mt_rand() is likely to work very well. However anyone else visiting this thread in the future needs to know that rand() with earlier versions of PHP may have undesirable results for creating random table IDs.
     

    Quote

    If you are below PHP 7.1 you can go with mt_rand() direct!?

    Unfortunately below 7.1 mt_rand() also doesn't work properly. I think it is still better than rand() but there was a flaw in the implementation of mt_rand() in PHP versions prior to 7.1.

  4. 1 hour ago, wakabayashi said:

    @dynambee what was actually your idea behind the multiplication of two numbers? In my naive way of thinking, I would just create a random number and check if it already exists. If it does, I redo the process until I get a new unique number. As far as I see, that's also @datakick's way of doing it.

    The problem is that not all rand() functions are equal. The rand() function in PHP 7.1 is very good. Excellent, even. However the rand() function in earlier versions of PHP (even 7.0) is not so good. Exactly how bad it is I am not sure. Earlier versions seem worse than later versions and in some cases it seems to depend on the underlying OS, too. All that is fixed with 7.1 though as it uses the Mersenne Twister Random Number Generator.

    What I am sure of is that the rand() function in MariaDB 10.2 is pretty terrible at generating a lot of large random numbers, far worse than I would have ever expected. For a table with a standard INT(11) ID column, generating a new random ID had an 8.5% collision rate after only 1 million inserted rows. That's shockingly bad. With 1 million rows, less than 0.05% of the available INT(11) numbers are in use, yet 8.5% of the newly generated "random" numbers are collisions? Something is very wrong. (After 2.4 million rows rand() had a collision rate of 22.8%! 0.11% of numbers in use and 22.8% collisions... Not good.)

    Changing to using two smaller numbers multiplied together brought the MariaDB collision rate in the same scenario down to 0.6%. This is still higher than expected but is pretty acceptable given the use case being discussed in this thread.

    Anyway, for anyone using PHP 7.1 or newer none of this is an issue. For anyone using PHP 7.0 or lower it's still not clear, to me, if rand() is a suitable way to generate random IDs for table inserts.

  5. 1 hour ago, datakick said:

    As you can see, probability distribution changed significantly

    Yes, I see your point. Perhaps I'll make time to test a pre-PHP7.1 version of rand() and see how it performs vs the MariaDB rand() function. If it's similar then multiplying two numbers together, while not ideal, will still be far better than using a rand() function that generates a high number of collisions. (I have tested this multiple times with the MariaDB function.)
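    A rough way to see the effect on the distribution is to count repeats for both approaches with a generator that is known to be reasonably uniform. The sketch below uses Python's `random` module as a stand-in, so it says nothing about MariaDB's RAND() or pre-7.1 PHP; the function names are made up for illustration.

```python
import random


def collision_rate(draw, count):
    """Fraction of `count` draws that repeat an ID already seen."""
    seen = set()
    collisions = 0
    for _ in range(count):
        value = draw()
        if value in seen:
            collisions += 1
        else:
            seen.add(value)
    return collisions / count


def compare(count=200_000, seed=1):
    rng = random.Random(seed)

    def single():
        # One draw over the whole range, like FLOOR(RAND()*2147483647)+1.
        return rng.randint(1, 2_147_483_647)

    def product():
        # Two smaller draws multiplied, like the 46341 * 46340 expression.
        return rng.randint(1, 46_341) * rng.randint(1, 46_340)

    return collision_rate(single, count), collision_rate(product, count)
```

    With a well-behaved generator the product form should show more repeats than the single draw: different factor pairs give the same product (2 × 6 and 3 × 4, for example), and any prime above 46341 can never be produced. So the benefit measured with MariaDB would presumably come from working around that particular RAND(), not from the multiplication itself.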

  6. 36 minutes ago, datakick said:

    I believe that the product of two random numbers with uniform distribution results in a number that does not have uniform distribution, which means it leads to more collisions.

    I don't think that multiplying two properly generated pseudo-randoms together will see any decrease in randomness. Mathematically I don't see a reason why it would, but perhaps I am missing something.
     

    38 minutes ago, datakick said:

    In my case, it's not (assuming php rand generates number with uniform distribution, as my quick test suggests)

    It seems that with PHP 7.1 and newer the rand() function is fine as it is now just an alias of mt_rand(). PHP 7.1 also saw mt_rand() fixed to use the correct Mersenne Twister Algorithm. So with PHP 7.1 and newer using rand() or mt_rand() shouldn't cause problems, even if generating a lot of extremely large numbers.

    For older versions of PHP I think generating substantial quantities of large pseudo-random numbers will suffer from the same sort of lack of randomness that I have seen with MariaDB.

  7. I agree, the chance of a collision is low.

    That said, is there any downside to using `(FLOOR(RAND()*46341)+1) * (FLOOR(RAND()*46340)+1)` instead of `FLOOR(RAND()*2147483647)+1`? My testing with MariaDB shows only benefits. Increased randomness, far fewer collisions, and no performance penalty that I can detect.

  8. 1 hour ago, datakick said:

    I believe php rand() function returns number with uniform distribution. But even if it didn't it's not really an issue. My snippet above checks if record with the same ID already exists. If so, then another random ID is generated. The worst case scenario - this adds a little bit overhead.


    I thought MariaDB's RAND() function would also provide good levels of randomness but for large integers it starts to fall apart.

    MariaDB RAND() examples:

    When using `(FLOOR(RAND()*46341)+1) * (FLOOR(RAND()*46340)+1)` to generate random numbers between 1 and 2,147,441,940 I see a collision rate of 0.3% after generating 500,000 random numbers. After 1 million random numbers the collision rate is about 0.6%. These sorts of collision rates are fine, and I ran the test up to about 9.8 million randoms where the collision rate got up to about 5.7%. I don't expect to have any single TB table with anywhere near that many rows.

    However when using `FLOOR(RAND()*2147483647)+1` to generate random numbers between 1 and 2,147,483,647 the collision rate is already at 3.7% after generating 500,000 random numbers. After 1 million random numbers the collision rate is around 8.5%. This collision rate is incredibly high considering that less than 0.05% of the number range has been used.

    In both cases the collision rates are roughly repeatable across multiple test runs.

    The reason for these big differences is that pseudo-random number generators typically create random numbers based on a generated DOUBLE value with 16 decimals of precision. Bigger random numbers approach the limits of this precision and this causes a lot more collisions. Multiplying two smaller numbers together to give the same total range of numbers fixes the problem.

    Why does this matter even when checking for duplicate IDs? There is a time gap between the duplicate check and when the row is added. With a very low collision rate this time gap isn't too much of a concern. However with a high collision rate it becomes more likely that a collision could occur with two rows being inserted at around the same time. It's an edge case but not so much of an edge case that it shouldn't be a concern on a busy site.
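    One way to close that gap is to skip the separate existence check and let the primary key constraint reject duplicates, retrying with a fresh ID when the insert fails. Below is a minimal sketch in Python, using the standard library's `sqlite3` as a stand-in for MariaDB; the `demo_orders` table and the function name are hypothetical.

```python
import random
import sqlite3


def insert_with_random_id(conn, payload, rng, id_max=2_147_483_647, max_attempts=10):
    """Insert a row under a random primary key, retrying on a duplicate key."""
    for _ in range(max_attempts):
        candidate = rng.randint(1, id_max)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO demo_orders (id_order, payload) VALUES (?, ?)",
                    (candidate, payload),
                )
            return candidate
        except sqlite3.IntegrityError:
            # Another row already holds this ID; draw again.
            continue
    raise RuntimeError("no free ID found after several attempts")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_orders (id_order INTEGER PRIMARY KEY, payload TEXT)")
```

    Because the database itself enforces uniqueness, two inserts arriving at the same moment cannot both succeed with the same ID. A high collision rate then only costs extra retries.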

  9. 3 hours ago, dynambee said:

    I've been testing random id_order generation and the RAND() function of MariaDB doesn't seem very random. I'm seeing regular collisions after even 7000 or 8000 random number generations from a pool of 10^12 numbers. That shouldn't be happening.

    I found out why these collisions were happening.

    The MariaDB RAND() function returns a double with 16 digits of precision. This means that it's not really suitable for generating long integers with a single call, but it can be done with two calls. So instead of `FLOOR(RAND()*999999999999)+1` it has to be `(FLOOR(RAND()*1000000)+1) * (FLOOR(RAND()*999999)+1)`.

    @datakick I don't know how good the PHP random function is for large integers. Would it be necessary to do something like:

    $rnd = rand(1, 46341) * rand(1, 46340);

    or does PHP do a good enough job on its own with an 11 digit integer?

     

  10. 10 minutes ago, wakabayashi said:

    But cant you just use a higher auto_increment value than 1?

    It is possible to increase the next value that will be used by auto_increment manually with an SQL statement like `ALTER TABLE ... AUTO_INCREMENT = N`. However each `id_order` will still increment by 1.

    I could add some code that boosts the auto_increment value by some random number between 100 and 200 per day. It's not an ideal solution but it's better than just purely sequential increases over time.
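    As a rough sketch of that daily boost, the Python below picks a random gap between 100 and 200 and builds the matching `ALTER TABLE` statement. The table name `tb_orders` is an assumption, the function name is made up, and how the current next value is read from the database is left out.

```python
import random


def bumped_auto_increment_sql(current_next_id, rng, table="tb_orders", low=100, high=200):
    """Build an ALTER TABLE statement that skips ahead by a random gap."""
    new_next_id = current_next_id + rng.randint(low, high)
    # The table name is interpolated directly, so it must come from trusted code.
    return new_next_id, f"ALTER TABLE `{table}` AUTO_INCREMENT = {new_next_id}"
```

    Run once a day (from cron, for example), this would add a random gap to the sequence while leaving the orders within a day sequential.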

    I've been testing random id_order generation and the RAND() function of MariaDB doesn't seem very random. I'm seeing regular collisions after even 7000 or 8000 random number generations from a pool of 10^12 numbers. That shouldn't be happening.

  11. I think you may have misunderstood.

    The orderrefnum is only part of the problem. It is a reference number that is shown on the surface, but the real order ID still exists and is stored separately in a field called `id_order`. Ideally `id_order` would be invisible and never shown to the customer, but this is not the case. In links that the customer can see (to view order details, to view an invoice, and probably elsewhere) the real, sequentially generated `id_order` value is displayed.

    So if a competitor wants to do some research on a business they can go to that business' shop, make an account, and buy a cheap item. Then 10 days later buy another item. Look at the difference between the id_order numbers and divide by 10. Now you know how many orders per day your competitor is selling. If you happen to run a similar business (or just know the market) you will know the typical average order size and from that you can make a decent guess about their monthly and yearly sales.
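    The arithmetic is simple enough to sketch in a few lines of Python. All numbers in the usage note are invented for illustration, and the function name is hypothetical.

```python
def estimate_sales(first_id_order, second_id_order, days_between, average_order_value):
    """Estimate daily order volume and monthly sales from two observed order IDs."""
    orders_per_day = (second_id_order - first_id_order) / days_between
    monthly_sales = orders_per_day * 30 * average_order_value
    return orders_per_day, monthly_sales
```

    For example, seeing `id_order` 10450 and then 10900 ten days later suggests 45 orders per day. With a guessed average order value of 40, that is about 54,000 per month.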

    Some people don't care if this information is visible. Other people don't want to give out private business information. I happen to be in the latter group but I totally understand that many people won't care at all.

    • Like 1
  12. How bad an idea would it be to change all `id_order` columns (there are 14 of them) to BIGINT instead of INT and then randomly generate `id_order` values when orders are created?

    If I use 12 digit randomly generated `id_order` numbers (combined with a yymm prefix) and allow for the birthday paradox the site could do 1 million orders per month (an obviously impossible number for my situation) before the probability of a random `id_order` number collision exceeds 50%.

    To put that into a bit better context, I'd have to receive 200 orders per hour every hour for a month (144,000 orders) to have 1% probability of at least one collision with a 12 digit random `id_order` field. If everything goes extraordinarily well I could see having 5000 orders per month.
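    These figures can be checked with the standard birthday approximation. The sketch below assumes the 12 random digits are drawn uniformly from 10^12 possible values and that the yymm prefix means only orders within the same month can collide; the function names are made up.

```python
import math


def collision_probability(order_count, id_space):
    """Approximate chance of at least one duplicate among random IDs."""
    # Standard birthday approximation: 1 - exp(-n(n-1) / (2N)).
    return 1.0 - math.exp(-order_count * (order_count - 1) / (2.0 * id_space))


def orders_for_probability(probability, id_space):
    """Approximate order count at which the collision chance reaches `probability`."""
    return math.sqrt(2.0 * id_space * math.log(1.0 / (1.0 - probability)))
```

    Calling `collision_probability(144_000, 10**12)` gives roughly 1%, and `orders_for_probability(0.5, 10**12)` gives a little under 1.2 million orders, in line with the numbers above.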

    Terrible idea?

  13. 18 minutes ago, DRMasterChief said:

    @dynambee i think you can easily contact the author of the module  ( @Daresh )  and he will have a solution, maybe he will modifiy the module for you for some bucks  🙂  All experiences with Daresh seems to be very good... 

    Yup, I replied to this old thread that was started by @Daresh so that he would see the discussion. Since it is a free module I don't really want to ask him to improve on it, and the module itself does exactly what is advertised. However, a way to generate random invoice numbers would certainly be appreciated.

  14. It sounds like the best option would be to generate invoice numbers randomly and not try to match order numbers and invoice numbers.

    After looking into this more though, it doesn't seem that generating random order reference numbers and random invoice numbers actually fixes the problem of competitors being able to easily view order volumes. The links used when referencing the order details `javascript:showOrder(1, 1, 'https://www.mydomain.com/order-detail?id_order=1');` and invoices `https://www.mydomain.com/pdf-invoice?id_order=1` still use the sequentially incremented primary order table ID. 😞

    I'm certainly not the first person to think about this (example) in the context of online sales and I suspect it's a pretty common concern. Rick James wrote up a list of potential ways to generate random table IDs some years back.

    To avoid having to randomize the `id_order` primary key would it be possible for any customer-facing order & invoice links to use the order table's `reference` field instead of the `id_order` field?

  15. 11 minutes ago, datakick said:

    Note that there can be multiple invoices for one order, so make sure your invoice numbers do not clash in this situation

    Interesting. What sort of situations give rise to multiple invoices within an order? Would it only happen when there is more than one shipping type within an order?

  16. Bringing an old thread back from the past. I have installed the random order reference number generator into TB 1.1.x and it works fine.

    However TB uses a separate sequential number for invoices. Since one of my reasons for using this module is to cloak the number of orders we process I would like to use the same randomly generated order reference number for the invoice number. If that isn't possible then I'd like to use any randomly generated number for invoices. Basically anything besides a sequentially generated number.

    Is this something that could be added to this same module? If not, is it something that could be done with a separate override? Any ideas?

  17. A few minor issues (don't hate me!)

    • The icon used in the back office (yellow and white star) does not match the purple icon used on the datakick webshop.
       
    • If there are too many items in the cart for the size of the user's computer screen it becomes difficult to see what is in the cart if the cart is displayed on the left or right side of the checkout.

      The floating cart ends up covering the store footer and even then there isn't enough space if 8 or 9 items are in the cart on a 1080p display. A scroll bar appears on the side of the cart but it doesn't work very well.

      Maybe people who expect large purchases should just put the cart at the top of the checkout? Perhaps a setting could be added so if the number of different items in the cart exceeds a certain number the cart moves to the top?
       
    • Is it possible to make the cart not float? Even if there are not very many items in the cart if the user scrolls down past the end of the checkout and into the footer the floating cart comes along and looks very odd.
       

    Here is what the floating cart & scrolling issues look like on my website:

  18. I've installed the just-released 0.6.3 and it seems to be operating as expected. So far I have not discovered any additional issues. @x97wehner, one of the bug fixes is what you reported, and the CSS IDs are there for the totals too. 

    Huge thanks to @datakick for such an impressive checkout module!

     

    Bug fixes:

    Quote
      • Fixed bug with displaying incorrect carrier list in some corner cases
      • Added HTML IDs for elements in cart (tax, total, discounts,...) in order to easily target them by CSS rules
      • Fixed saving Newsletter / Opt In preferences

     

    • Thanks 1
  19. Finally yesterday I had time to do the upgrade from 10.1 to 10.4.8 and so far I haven't noticed any problems. CWP seems happy, TB seems happy. If I do come across any issues I'll be sure to report here.

    I followed the directions here, in case anyone else running CentOS 7 wants to make the same upgrade.

  20. 7 hours ago, datakick said:

    Anyway, I'll probably get rid of this font-awesome based icon, and replace it with an svg icon that will be part of the chex distribution. This way, I can be sure it will work on every theme. And if anyone wants to use font-awesome icons, they can still achieve it using css.

    That would be fantastic! One less thing to worry about.

     

    7 hours ago, datakick said:

    Meanwhile, I suggest you to declare .icon-remove class in the same way Panda theme declares .icon-cancel. 

    I'll report it as a bug to Jonny. I looked into it with the Niara theme, which only supports icon-remove and not icon-cancel. So it would seem that themes designed for use with TB should also support icon-remove, even if icon-cancel is also declared. Hopefully Jonny will add it to Panda for TB.

  21. On 10/12/2019 at 5:10 AM, dynambee said:

    When I increase the quantities of items in the cart if the weight exceeds the maximum for a given shipping carrier that carrier is removed when I click on "Done". Perfect, this is what is supposed to happen. Unfortunately when I decrease the quantity of items to the point that those other carriers should now be usable they do not reappear. Actually the first time I decrease the quantity (and therefore the total weight) and click on "Done" the carrier options do not appear even if the total cart weight should allow them. However if I change the quantity a second time to a different quantity (either higher or lower) that should allow those carriers then they reappear.

     

    On 10/12/2019 at 3:24 PM, datakick said:

    I can't reproduce this. This works ok for me. You can have a look at this video 


    I have reproduced it as follows:

    • I created a brand new test store, fresh install of 1.1.0, ran core updater to 1.1.x Bleeding Edge.
    • Installed Chex 0.6.2 trial version. Did not make any changes to Chex or use any custom CSS.
    • Standard Niara theme
    • No modules added besides Chex. No customization done to TB or Niara.
    • Basic settings like enabling SSL and disabling sales taxes.
    • Deleted existing zones and created a new zone for United States. Made sure the United States country & states are properly set to the United States Zone. (This emulates the way I set up my other store.)
    • Created three carriers. First two (Economy and Registered) are for up to 2000g. Third (Express) is for up to 8000g
    • Created test item with weight of 767g (same weight as my item had in the previous store).
    • 1 item or 2 items in the cart should allow all three carriers but 3 items or more should allow only the Express Mail carrier.
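    The expected behaviour for this setup can be written down as a small Python sketch. It only models the weight limits listed above (treating the limits as inclusive, which is an assumption) and has nothing to do with how Chex or TB actually select carriers; the names are made up.

```python
ITEM_WEIGHT_G = 767

# Carrier name -> maximum cart weight in grams, as configured in the test store.
CARRIER_LIMITS_G = {"Economy": 2000, "Registered": 2000, "Express": 8000}


def available_carriers(quantity, item_weight_g=ITEM_WEIGHT_G, limits=CARRIER_LIMITS_G):
    """Carriers whose weight limit covers `quantity` items of the test product."""
    total_weight = quantity * item_weight_g
    return [name for name, limit in limits.items() if total_weight <= limit]
```

    Going from 3 items (2301g) back down to 2 items (1534g) should bring Economy and Registered back immediately, which is the step that fails on the first quantity change in my test.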


    Once I had things set up I ran the same sort of test as I did before, and here are the results (may need to select 1080p for clear video):

     

     

     

    I will PM you the site access details so you can have a look and try things out yourself. You're more than welcome to make any changes to the site and generally use it for testing purposes. It is not a site that I have any need for beyond testing this particular problem, and I have a backup of it anyway.

     
