Monday, June 26, 2017
Sunday, June 25, 2017
In this article, I present a few modern techniques that have been used in various business contexts, and I compare their performance with that of traditional methods. These advanced techniques are math-free and innovative; they efficiently process large amounts of unstructured data, and they are robust and scalable. Implementations in Python, R, Julia and Perl are provided, but here we focus on an Excel version that requires no macros, coding, or plug-ins, only the most basic version of Excel. The method is also easy to implement in standard, basic SQL, and we invite readers to work on an SQL version.
Who should use the spreadsheet?
First, the spreadsheet (like the Python, R, Perl and Julia versions) is free to use and modify in any context, even a commercial one; you may even turn it into a product and sell it. It is part of my concept of open patent, in which I share all my intellectual property publicly and for free.
The spreadsheet is designed as a tutorial, though it processes the same data set as the one used for the Python version. It is aimed at people who are not professional coders: people who manage data scientists, BI experts, MBA professionals, and people from other fields who want to understand the mechanics of some state-of-the-art machine learning techniques without spending months or years learning mathematics, programming, and computer science. A few hours are enough to understand the details. This spreadsheet can be a first step toward a new, more analytical career path, or a way to better understand the data scientists you manage or interact with. It can also spark a career in data science, or even be used to teach machine learning concepts to high school students.
The spreadsheet also features a traditional technique (linear regression) for comparison purposes.
Click here to read this article, download the spreadsheet, and start using it.
Thursday, June 22, 2017
We study the properties of a typical chaotic system to derive general insights that apply to a large class of unusual statistical distribut...