thirty bees forum

dynambee

Members
  • Posts

    837
  • Joined

  • Last visited

  • Days Won

    5

Everything posted by dynambee

  1. @Traumflug said in Indiegogo ElasticSearch project:

     who even seemed confused about how open source licensing works

     Me? lol I don't think there are many people on this planet who have studied all the flavors of "open source" more than me. I'm doing this for some 30 years now and have participated in more projects than I can count.

     You specifically asked if an open source project would be free in 5 years. That's a mighty strange question for anyone to ask who has even a basic understanding of how open source works.

     you sure seem to have a lot of “ideas” today about exactly how everything related to this module should be done.

     Ah. Pointing out flaws in the above specification upsets you. Good to know.

     You aren't pointing out flaws, you're adding requests, some of which make little sense. If you want to make requests that's fine but it would be nice if you learned a little about how ES works before you post random things. Even learning the very basics would be good.

     It doesn’t matter if ES is on the local server or a remote server it is still accessed through the API via IP & port. As such timeouts are possible even on a local server.

     If this is your assumption, please put it into the specification. MySQL is a local server, too, and queries to/from there are expected to be fast and reliable. If they're not, an exception happens, which means a blank page in production mode or this new encrypted error message. Handling such stuff more gracefully needs more code and if this isn't part of the specification, it won't happen.

     Right, and when websites get too busy for their hardware the website stops responding. This will be no different for ES than it is currently for MySQL or even for Apache in extreme situations. It's nothing unusual or unexpected.
  2. @Traumflug said in Indiegogo ElasticSearch project:

     Not all will be able to use it as a local service. [...] But then there is a possibility to use a hosted alternative.

     Writing a module which supports both, local and remote services, sounds like a non-trivial change. For example, remote requests have to handle timeouts gracefully and requests asynchronously. With local requests one can rely on a timely answer, which is much simpler. It should be clear whether remote requests are part of the specification or not, before that specification is finalized.

     For someone who seemed to have no idea what Elasticsearch was 24 hours ago and who even seemed confused about how open source licensing works, you sure seem to have a lot of "ideas" today about exactly how everything related to this module should be done.

     It doesn't matter if ES is on the local server or a remote server it is still accessed through the API via IP & port. As such timeouts are possible even on a local server. If the server is doing heavy indexing, has run out of memory, has crashed, or for some reason was just shut down then local timeouts will happen. As such there is fundamentally no difference between a local instance and a remote instance from the point of view of the module. Obviously the further the ES server is from the webserver the slower search results will be but if the two servers are relatively close geographically it will be fine.

     For me personally I plan to run five to ten 30bz sites on one VPS and then use a shared ES instance on a separate VPS in the same datacenter. Likewise I will run Piwik+Redis (for Piwik) on a separate server in the same datacenter. VPS servers are so cheap now it just makes sense to split things up a little. However in many cases it would be no problem at all to run ES on the same server as the webserver & db are running on.

     I just looked up above again, there is no mention that this should be a local service. And looking at the ES site, one sees subscription plans and remote requests as code samples. Looks like a few people forgot the basics in all this buzzword euphoria :-)

     This thread is a continuation of the "Let's talk about Search!" thread which you also participated in. Running ES on the same server as the web & db servers was discussed at length in that thread, as was running ES on a separate server in the same datacenter. Using ES instead of Algolia because ES is free and Algolia is expensive was also discussed. There should be no secrets or surprises here for anyone who has been following along.
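To make the timeout point concrete, here is a minimal Python sketch of a search call that treats a local and a remote ES instance identically: it talks to an IP and port, applies a timeout, and falls back to another search function when ES does not answer. This is only an illustration, not the proposed module's code. The index name, the `name` field, the 0.5 second timeout, and all function names are assumptions made up for the example; only the `_search` endpoint and the `match` query come from the Elasticsearch REST API.

```python
import json
import urllib.request


def make_es_search(base_url, index, timeout_seconds=0.5):
    """Build a search function for one ES endpoint (local or remote alike)."""
    def es_search(query):
        # Hypothetical field name "name"; a real index would define its own mapping.
        body = json.dumps({"query": {"match": {"name": query}}}).encode("utf-8")
        request = urllib.request.Request(
            f"{base_url}/{index}/_search",
            data=body,
            headers={"Content-Type": "application/json"},
        )
        # The timeout applies whether base_url is 127.0.0.1 or another server.
        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
            hits = json.load(response)["hits"]["hits"]
        return [hit["_source"] for hit in hits]
    return es_search


def search_products(query, es_search, fallback_search):
    """Try ES first; on a timeout or connection failure use the fallback search."""
    try:
        return es_search(query)
    except OSError:
        # URLError, socket timeouts and refused connections are all OSError subclasses.
        return fallback_search(query)
```

A shop would build the ES function once, for example `make_es_search("http://127.0.0.1:9200", "products")`, and pass the standard database search as `fallback_search`. Pointing `base_url` at a separate VPS changes nothing else in the code.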
  3. @vzex said in How to add interesting new features to thirty bees while staying PS 1.6 compatible?: Hmm, I found thirtybees because PS already seemed broken...waiting forever for a product to save? Oh go find this fix....yeah I did TB! Isn't the point to work, so why be compatible with something that doesn't. PS has hundreds if not thousands of modules and themes. 30bz has nearly none as it is brand new. Being compatible with an established player makes it easier to gain users as they can use their existing modules and themes and if they're looking for something different there's a large library available.
  4. @okom3pom said in Let's talk about Search!: Feedback: sales start today in France. In analytics, 250 visitors; Brad uses 1.6 GB. How many products? How many different combinations?
  5. Regardless of the module chosen I will contribute some funds to the cause. My preference, of course, is for an ES-powered search module but I can certainly see the benefits of the other options as well. Thanks @lesley for taking the time to set up the poll.
  6. I think there was some misunderstanding about what ES is and what was being proposed for this project. I hope the recent replies have helped shed a bit more light on the subject and made things clearer.
  7. Rather than adding still more to my reply above I'll make a separate post. There do exist other search engine options besides Elasticsearch, so why do I like ES best? Well in short it's the best open source search engine, and it's the one developing the fastest.

     There are three main OSS search engines: Elasticsearch, Solr, and Sphinx. ES and Solr both run on top of Apache Lucene. Sphinx is its own separate system. All have advantages and disadvantages but right now Elasticsearch has the best balance of features and is the fastest growing option with the most mindshare.

     No option is absolutely perfect of course but there are some problems with the other two main options that ES fixes:

     * Solr is very large and somewhat heavy for installation & use
     * Solr is more difficult to configure
     * Solr is backed by the Apache Foundation which is awesome but does have somewhat limited resources to push it forward. Progress happens but a bit slower.
     * Sphinx uses a proprietary protocol for queries, not JSON or REST
     * Sphinx does not support server side scripts
     * Sphinx does not support triggers

     Elasticsearch has brought a few very nice things to the table, with many of the best features of other OSS search engines and fixing some of the problems & limitations:

     * It's much lighter weight than Solr, both for installation and on server resources
     * It's very easy to configure. A standard installation will work well even without any configuration.
     * Offers standardized APIs for interaction (so does Solr but not Sphinx)
     * Is the fastest growing option at the moment
     * Open source license (same Apache 2 license as Solr) but being pushed forward by a business and a community so advancements are happening quite quickly

     Elasticsearch has become the search engine of choice to power eCommerce sites. The most popular Magento search module is powered by Elasticsearch. The best high performance PS search module is powered by Elasticsearch, as is the only free PS high performance search module (Brad).
Many large eCommerce websites use Elasticsearch including eBay and Dell. Other large scale users that most people would recognize include Netflix, Tinder, Facebook, and Microsoft Azure. So I don't think there is any doubt that Elasticsearch is the best choice to power the 30bz high performance search module. The only real question is how many of us will pony up a bit of cash to make it happen. I know not everyone has a budget to contribute but I hope enough of our small community can come up with something to contribute so we can move forward. Edit: Another thing I meant to include is that Cloudways, so far the only host recommended by 30bz, provides Elasticsearch as a free service on all their VPS plans. It only takes two mouseclicks to turn it on and start using it.
  8. To everybody raising eyebrows now: there are no plans to give up PS 1.6 compatibility anytime soon, AFAIK. Thirty bees’ stated goal is to be stable and reliable. My understanding is that 30bz 1.0.x will maintain PS 1.6 compatibility but once 30bz moves to v1.1 then PS 1.6 compatibility will start to break. My personal belief is that 30bz needs to maintain 1.6 compatibility until either PS goes bust or the 30bz active community is much, much larger than it is now. That could be done by maintaining 1.0.x in tandem with 1.1, but with a small team that is going to be difficult to do. Right now I think most of us are here because we can transition our PS work relatively seamlessly to 30bz and have a far better final product. To me it seems the best way to attract more users is to keep providing this mostly seamless transition.
  9. Okay, I'll take a crack at answering these:

     ES is free now, will it be free in 5 years?

     Elasticsearch is open source software licensed under the Apache license. It's based on other open source projects (like most things in the OSS movement are) and it is very unlikely that the licensing situation will change. However if the licensing situation did change the existing versions would still remain available as OSS under the Apache license and the community would fork the code and ES would live on. Much like we see with MariaDB and MySQL, except MySQL also still exists as OSS too. (Edit: Also much like we see with PS and 30bz.)

     What if this API changes, who does the migration?

     If the API changes or evolves with a future version of ES then 30bz can continue to use compatible versions of ES, the versions of ES that the module was originally designed to work with. If the newer versions of ES are dramatically better than existing versions then the 30bz module can be updated to be compatible with newer versions of ES. Who does this update and how it is done can be tackled at that time. I would assume the 30bz community will be far larger by this time so it should be much less of a concern. The same question could be asked of MySQL or any other technology that 30bz makes use of, and the answer would be the same.

     Merchants don’t care about the technology used behind the scenes,

     Of course, and that's the way it should be!

     they want a well working search function.

     ES is the best existing way to give this to merchants who need/want high performance eCommerce search but don't want to spend huge money on a hosted solution like Algolia. (And everyone with more than a few pages of products needs high performance search!)

     For merchants, a requirement to hook up with another service just to get basic functionality is a burden and entry barrier. How can this be avoided?

     30bz already needs multiple services just to function. No PHP? No MySQL/MariaDB? No Apache? No website. Additionally the proposed ES solution would be a module to use on top of 30bz, replacing the standard search function on websites that use the module. The standard search function works fine for small websites, it just doesn't scale and doesn't provide an ideal user experience.

     Per the provided feature list, ES search requires a fallback search engine. Which means duplicate code, more code maintenance work.

     ES does not require a fallback search option but if a large site needs to reindex everything it can be a better user experience for customers if ES is disabled during this time. It's somewhat unusual to need to reindex everything, generally only changes need to be indexed and they are done as the changes happen. If ES is disabled then the site would fall back to using the standard 30bz search. An analogy to this would be if you need to rebuild your Redis cache or if you are doing some work on the site and need to disable caching temporarily. 30bz falls back to running with no cache until the cache is turned back on and can rebuild itself. The site will be slower but will still work.

     If it’s just about great search algorithms, what makes ES better than an on-site Google search?

     What makes Piwik better than using Google Analytics? The answers are pretty similar:

     Using ES is way faster than using onsite Google Search because ES is hosted locally, either on the same server or on a separate server in the same datacenter, depending on site owner preference and skill level. (With Cloudways you can use ES on your VPS with two mouse clicks, so that's how easy it can be to set up and use.)

     As with Piwik, when you use ES you control your own data and you can decide exactly how you want the system to work. Want to index descriptions? Great! Don't want to index descriptions? No problem! Want to index descriptions but give them a very low weight in the search results? Just change the weighting number and make it lower. You don't have this type of control with onsite Google Search. In fact you really have no control at all, Google just gives you search results.

     Additionally using ES allows high speed faceted search. The more data you add about your items to your website the finer the control customers will have over the results they see. If customers want only blue widgets they can select Blue. If they want only widgets from a certain brand they can select that too. It's basically similar to how Amazon's search functions. You can narrow down your search results dramatically with a few clicks.

     Why are such on-site Google searches used so rarely, despite being free, available for many years and backed with powerful algorithms?

     Because onsite Google search looks terrible, has Google branding, is comparatively slow, and it can take quite a lot of time between adding new products and Google reindexing your site. Additionally you can't filter results easily and quickly with onsite Google search and you really have no flexibility or control over one of the most important functions on your website.

     How well does the current engine work with state-of-the-art Ajax callbacks?

     The problem with the current search system is that there is no search engine. It's simply making calls to the database using SQL. SQL databases are great at storing large quantities of data in stable ways but they are not optimized for high speed full text searches. They do a very poor job of it and do not scale well to larger numbers of products or larger numbers of users. Results are slow and they aren't all that accurate unless the user exactly nails the search terms. It's like asking why do we need to use cache on the website when Apache can just serve everything directly to the visitor? Of course Redis isn't absolutely necessary for basic website operation but the site is going to work a lot better if Redis is turned on and functioning.

     What about privacy?

     With ES hosted on your webserver or on another server you control there are no additional concerns about privacy. You control the server, you control the service, you control exactly how your data is used.

     What about other search engines, like Brad, what makes ES better?

     Brad is a search module that uses ES for search. That's why it's so fast. I would expect that the existing code for Brad will be a base of inspiration for the 30bz module. Unfortunately while Brad is open source the license is not a standardized one which makes it less than ideal to directly lift code from for reuse. I think we might want to contact the author and ask them to release it under a standardized license that would give the same rights to users but be better from a legal standpoint for everyone.

     Can ES searches be integrated into the page at page load time? With a native engine one can prepare search results even before sending the page.

     I don't really understand this question. However to my knowledge there is nothing that can be done with the existing 30bz search system that can't be done better and faster with an ES module.

     Can ES search results be reported back to the next page request? Like featuring products similar to the ones a user searched for before. Like showing a “you recently searched for …” selection.

     This is a feature that would generally be provided by the module itself rather than the search engine. The module keeps track of what users have searched for and can display these results in a "you recently searched for..." section. This would work much the same way as the existing 30bz search works, perhaps even exactly the same way.

     Can page rendering take advantage of recent search results, like adjusting prices for often searched products or highlighting products which that particular user has searched before?

     Rendering is managed by 30bz as always. Therefore anything that can be done at the render stage with the current search could be done with Elasticsearch. Of course adding these types of features will slow down page rendering, but that isn't specific to ES or any other search engine. Regarding highlighting, it is possible with ES to highlight search results or even to provide special search result sorts. For example if you have your own in-house brand you can make sure that those items always appear at the top of search results for that type of product. You can also highlight products that are on sale, or highlight newly arrived products. Endless flexibility.

     I hope this answers your questions and provides some clarity. Feel free to ask followup questions or new questions and I'll do my best to answer. (Of course if someone else has questions, answers, corrections, or additional info please chime in!)

     Edit: A bunch of small edits for minor corrections and clarifications. Should be finished editing now, 2017-06-28 03:45 UTC.

     Edit: And another small edit for clarification, 04:20 UTC.
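As a rough illustration of the weighting and faceted search points in the post above, the Python sketch below builds an Elasticsearch request body with per-field weights and one facet per attribute. It is a sketch under assumptions, not the module's actual query: the field names (`name`, `description`, `color`, `brand`), the weights, and the function name are invented for the example. The `multi_match` query with `field^boost` syntax, the `bool` filter, and `terms` aggregations are standard Elasticsearch Query DSL.

```python
def build_search_body(text, field_weights, facet_fields, selected=None):
    """Build an ES search body with weighted fields, facet filters and facet counts."""
    # "name^5" means matches in the name field count five times as much.
    fields = [f"{name}^{weight}" for name, weight in field_weights.items()]
    return {
        "query": {
            "bool": {
                "must": [{"multi_match": {"query": text, "fields": fields}}],
                # Facets the customer has already clicked narrow the results.
                "filter": [
                    {"term": {field: value}}
                    for field, value in (selected or {}).items()
                ],
            }
        },
        # One terms aggregation per facet gives the counts shown next to each option.
        "aggs": {field: {"terms": {"field": field}} for field in facet_fields},
    }


body = build_search_body(
    "widget",
    {"name": 5, "description": 0.5},   # descriptions indexed, but weighted low
    ["color", "brand"],
    selected={"color": "blue"},
)
```

Lowering the weight of descriptions is then a one-number change in `field_weights`, and dropping the field from the dictionary stops it from being searched at all.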
  10. @roband7 said in Indiegogo ElasticSearch project: @Traumflug I think perhaps you make ES into something it isn't. From a license, deployment and technological point of view it's basically comparable to MySQL. Meaning you could more or less take all your questions above and replace ES with MySQL. In other words we're not talking about ES as a cloud service, but as a locally installed piece of open source software, just like MySQL. MySQL, Apache, Redis, PHP, Imagemagick, Linux... Even 30bz itself for that matter. These (including Elasticsearch of course) are all under strong open source licenses though so I'm not concerned.
  11. There is no way to have a great search module without a great search service. Honestly you seem pretty clueless about search and the importance of excellent search to eCommerce.
  12. Native database full text search sucks, and that's why Elasticsearch exists and several other open source search solutions exist. It's why Algolia exists, has received ~$75mil in funding, and can charge customers so much for their services.

      Native db text searching does not scale well, does not provide features like synonym search, does not handle spelling mistakes well (really not at all), and there is no way to easily weight the indexes. "Instant" search results in 30bz using native db searching are too slow to be useful and really are just frustrating. They're slow enough that I didn't even realize they existed as I could type my entire query and hit enter to search before any "instant" results appeared. Filtering search results using native db search is likewise slow.

      Besides being incredibly fast, Elasticsearch provides an easy to use query API which is far easier to work with and modify than complicated SQL queries full of joins. It scales very well and is the search platform of choice for many well known sites. It's also being actively developed and is improving even more with each release.

      IMO if 30bz aspires to compete with PS and Magento then a powerful search module is an absolute must. Next to fast caching I'd say that fast & accurate search is the most important thing an eCommerce site can do well. I'm happy to contribute to this project and will put in 250 Euro as soon as I can after the Indiegogo project goes live.

      Edit: Totally forgot to mention, ES supports autocomplete, auto-suggest, and the ability to highlight certain results if desired. It's endlessly flexible and extremely powerful.
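A small Python sketch of the spelling mistake problem described above. SQLite stands in for MySQL here purely so the example runs without a server, and the `product` table and its rows are invented. The `LIKE` search finds nothing for a misspelled term, while the second function shows the kind of request body ES accepts for a typo-tolerant search (`fuzziness` is a real option of the `match` query; the `name` field is an assumption).

```python
import sqlite3

# In-memory stand-in for the shop database; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?)",
    [("Blue Widget",), ("Red Widget",)],
)


def like_search(term):
    """Native-style search: a substring match, so typos return nothing."""
    rows = conn.execute(
        "SELECT name FROM product WHERE name LIKE ? ORDER BY name",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


def fuzzy_query(term):
    """ES request body that tolerates small spelling mistakes in the term."""
    return {"query": {"match": {"name": {"query": term, "fuzziness": "AUTO"}}}}
```

Here `like_search("widget")` finds both products, but `like_search("widgit")` finds none. Sending `fuzzy_query("widgit")` to an ES index is the usual way to still match "widget".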
  13. The PS addons store FAQ implies that it is 90 days for updates as well as support: "When you purchase a module, theme or email template, you pay the price displayed only once and acquire the right to use it on a single online store. The price of each product includes 3 months of free support: during the 90 days following purchase, you can contact the developer who will answer your technical and functional questions and provide you with all product updates." I think another point worth mentioning from a cost perspective is that in order to stay legal one would have to buy a copy of each module for each store being run. Run 5 stores? Buy 5 sets of modules. That starts to get expensive very quickly.
  14. I think the key point here is that there is no existing single module for PS that does everything this proposed module will do. I think the proposed feature set is pretty complete and I look forward to the Indiegogo link when it's ready.
  15. I'm in! Just need that Indiegogo link.
  16. @Traumflug said in Currency format: Another thing to keep in mind is that some markets, e.g. Switzerland, don't round to 0.01, but to 0.05: And then there is Canada where cash payments are rounded to 0.05 but credit/debit card payments are not. When rounding a cash payment the final total is rounded, not each item calculation.
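The difference between rounding the total and rounding each item can be shown with a short Python sketch. The function names are made up for the example, and rounding to the nearest 0.05 is an assumption about how the cash rule is applied.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_cash_total(total, increment=Decimal("0.05")):
    """Round an amount to the nearest multiple of the cash increment."""
    steps = (total / increment).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return (steps * increment).quantize(Decimal("0.01"))


def order_total(line_amounts, paid_in_cash):
    """Sum the exact line amounts; round only the final total, and only for cash."""
    total = sum(line_amounts, Decimal("0"))
    return round_cash_total(total) if paid_in_cash else total


lines = [Decimal("1.02"), Decimal("1.02"), Decimal("1.02")]
cash_total = order_total(lines, paid_in_cash=True)    # 3.06 rounds to 3.05
card_total = order_total(lines, paid_in_cash=False)   # stays 3.06
```

Rounding each 1.02 line on its own would give 1.00 three times, so 3.00 instead of 3.05. That is why the rounding has to happen once, on the final total.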
  17. @Havouza said in Difference between uninstalling and disabling modules?: But that is in many cases not the case either. I have tested very expensive modules, some with subscription fees that after I have uninstalled and then deleted still leave all db tables intact I don't think PS does any testing or quality control on the modules sold in their store. I can imagine it would be a bit time consuming to do this but if they're taking a big chunk of the sales (30%?) a little basic testing doesn't seem like too much to expect.
  18. Yeah, having uninstall not actually uninstall is pretty odd. "Uninstall" should undo whatever "install" did. If "install" copies files and creates the db structure then "uninstall" should delete those same files and remove all created db structure. "Delete" as an option should remove all traces of the module, including the originally uploaded zip file.
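A minimal Python sketch of the symmetry described above, where "uninstall" removes exactly what "install" created. This is not the thirty bees or PrestaShop module API. The class, the table name, and the config file are hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3
import tempfile
from pathlib import Path


class ExampleModule:
    """Hypothetical module whose uninstall mirrors its install step by step."""

    TABLE = "example_module_data"

    def __init__(self, db, module_dir):
        self.db = db
        self.config_file = Path(module_dir) / "example_module.cfg"

    def install(self):
        # Everything created here must be removed again in uninstall().
        self.db.execute(
            f"CREATE TABLE IF NOT EXISTS {self.TABLE} "
            "(id INTEGER PRIMARY KEY, value TEXT)"
        )
        self.config_file.write_text("enabled=1\n")

    def uninstall(self):
        # Undo install(): drop the table and delete the file it wrote.
        self.db.execute(f"DROP TABLE IF EXISTS {self.TABLE}")
        self.config_file.unlink(missing_ok=True)


def table_names(db):
    """List the tables currently present in the database."""
    rows = db.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in rows]
```

After `install()` followed by `uninstall()`, `table_names(db)` is empty again and the config file is gone, which is the behaviour one would expect from any module sold in a store.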
  19. Creating the product over and over again means that your customers won't be able to (re-)download what they purchased after you delete one product and create another. You'll also end up with a bunch of different versions of the same file on your server if you don't delete the old versions, and customers that bought in the past won't be able to download the newest version of the file. So it's not really an ideal solution to delete & recreate. Just saw your new reply post. As long as that keeps the same product ID then existing customers should be able to download the new version.
  20. There's a PS module that helps with this, including sending notifications to existing customers when the file is updated. Not exactly the solution you were looking for but it will probably save you a bunch of time. There's also a somewhat less expensive module for downloadable product management. It basically is a tool to make it easier to manage the files associated with downloadable products but doesn't seem to provide the notifications of the first option.
  21. Doesn't uninstalling also remove the installed module files?
  22. I just read more of the Redis FAQ I linked to above and came across these bits: "For instance, using pipelining Redis running on an average Linux system can deliver even 500k requests per second, so if your application mainly uses O(N) or O(log(N)) commands, it is hardly going to use too much CPU." and "Redis can handle up to 2^32 keys, and was tested in practice to handle at least 250 million keys per instance. Every hash, list, set, and sorted set, can hold 2^32 elements. In other words your limit is likely the available memory in your system." As long as Redis is properly configured & has enough memory available I can't see a bunch of 30bz sites coming remotely close to maxing out even a single Redis instance. Something else will become a bottleneck long before Redis will.
  23. Cloudways provides one Redis instance per VPS. Customers can turn it on or off at will with a couple of clicks.
  24. For low traffic sites with a few thousand products each I doubt it will make any noticeable difference, so a single instance is likely preferred due to ease of management.
  25. @dprophitjr said in One redis, many shops: @dynambee So 1 Gig instance dedicated to Redis per site? I don't think Redis will use that much memory per instance, at least not for most 30bz sites. They have some details about memory usage here. Edit: Also it looks like the additional memory overhead per Redis instance is only about 1MB. If this is accurate then the amount of memory used for a single Redis instance caching 5 databases and five separate Redis instances caching those same 5 databases shouldn't be more than a few MB different. This is likely too simple an example, but Redis doesn't seem to have a memory footprint too much bigger than the data it is caching.
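The arithmetic behind the "few MB" estimate can be sketched in Python. The roughly 1MB of overhead per instance is the figure from the post above. The 50MB of cached data per site is a made-up number just to have something to add up, and the model ignores everything except cached data plus fixed per-instance overhead.

```python
def redis_memory_mb(data_mb_per_site, sites, instances, overhead_mb_per_instance=1.0):
    """Very rough total: cached data for all sites plus fixed overhead per instance."""
    return data_mb_per_site * sites + overhead_mb_per_instance * instances


# Hypothetical 50 MB of cached data for each of 5 sites.
shared = redis_memory_mb(50, sites=5, instances=1)     # one instance for all sites
separate = redis_memory_mb(50, sites=5, instances=5)   # one instance per site
difference = separate - shared                          # 4 extra instances of overhead
```

With these numbers the shared setup needs about 251MB and the separate setup about 255MB, so the choice between them comes down to ease of management and not memory.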