Data science and programming continue to expand and evolve as computation, knowledge bases, and best practices improve. This makes it very difficult to keep up with all the new articles and bodies of thought. So we compiled a list of 10 articles that we or other people have enjoyed in the past year or so on the topics of programming, data science, and machine learning. We hope they give you a new perspective as well as practical advice.
1. Why businesses fail at machine learning
Imagine hiring a chef to build you an oven or an electrical engineer to bake bread for you. When it comes to machine learning, that's the kind of mistake I see businesses making over and over.
If you're opening a bakery, it's a great idea to hire an experienced baker well-versed in the nuances of making delicious bread and pastry. You'd also want an oven. While it's a critical tool, I bet you wouldn't charge your top pastry chef with the task of knowing how to build that oven; so why is your company focused on the equivalent for machine learning?
Are you in the business of making bread? Or making ovens?
2. R vs Python: What's The Difference?
With the massive growth in the importance of big data, machine learning, and data science across the software industry and software service companies, two languages have emerged as the favourites among developers. R and Python have become the two most popular languages for data scientists and data analysts. The two are similar, yet different in ways that make it difficult for developers to pick one over the other.
3. Automatic Machine Learning: Learning How to Learn
The paper introduces AlphaD3M, an automatic machine learning (AutoML) system whose objective is to learn how to learn via self-play. The “D3M” in AlphaD3M’s name comes from DARPA’s Data Driven Discovery of Models (D3M) program, which has propelled machine learning toward solving any user-specified task, given any dataset. This goes beyond the traditional vision, in which AutoML solved a task given a dataset, a well-defined task, and performance criteria.
4. How to effortlessly create a website for free with GitHub Pages
Do you need an online portfolio of your work for potential employers to check out but you don't know how to make a website? Do you want to create a blog or a business site but you don't know where to start? Is it possible that you just don't want to deal with (or pay for) website hosting, domain names, and everything else?
5. How I Trained an AI to Play Atari Space Invaders
By Vedant Gupta
Everyone is talking about the race between Artificial Intelligence and Human Intelligence. When will AI fully surpass human ability and be in control of a majority of our daily lives? While humans spend their days going to school and educating themselves, what is AI doing to get an edge on the competition? AI needs to step up its game!
6. Why Model Explainability is The Next Data Science Superpower
Some people think machine learning models are black boxes, useful for making predictions but otherwise unintelligible, but the best data scientists know techniques to extract real-world insights from any model. For any given model, these data scientists can easily answer questions like:
- What features in the data did the model think are most important?
- For any single prediction from a model, how did each feature in the data affect that particular prediction?
- What interactions between features have the biggest effects on a model's predictions?
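The first question in the list above is often approached with permutation importance: shuffle one feature's values and measure how much the model's error grows. The excerpt does not name a specific technique, so the following is only a minimal sketch under stated assumptions. The model (`model_predict`), the synthetic data, and all function names are hypothetical and exist purely for illustration.

```python
import random


def model_predict(row):
    # Hypothetical fixed "model": relies heavily on feature 0,
    # slightly on feature 1, and ignores feature 2 entirely.
    return 5.0 * row[0] + 0.5 * row[1]


def mean_squared_error(rows, targets):
    # Average squared difference between predictions and targets.
    return sum((model_predict(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, feature_index, rng):
    # Importance = increase in error after shuffling one feature's column.
    baseline = mean_squared_error(rows, targets)
    shuffled_column = [r[feature_index] for r in rows]
    rng.shuffle(shuffled_column)
    permuted_rows = [
        r[:feature_index] + [value] + r[feature_index + 1:]
        for r, value in zip(rows, shuffled_column)
    ]
    return mean_squared_error(permuted_rows, targets) - baseline


# Synthetic data (an assumption for this sketch): three uniform random features.
rng = random.Random(0)
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
targets = [model_predict(r) for r in rows]

importances = [permutation_importance(rows, targets, i, rng) for i in range(3)]
for index, score in enumerate(importances):
    print(f"feature {index}: error increase {score:.4f}")
```

A larger error increase suggests the model leans more on that feature. A feature the model ignores produces no increase at all.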
7. Why the Platform Model is Broken
A pandemic hit us hard a few years back. At that time, every startup was building the Airbnb for this or the Uber of that. The technology industry was convinced that startups could only be worthwhile if they looked like a Platform; if they aspired to be the next Facebook.
Off the back of the successes of juggernauts such as Airbnb, Uber, Amazon, Facebook, and eBay, our industry was shoehorning every business case into a Platform business model. It became the default. Unfortunately, this neurosis is still ingrained in our culture.
8. 4 Must Have Skills Every Data Scientist Should Learn
By Ben Rogojan
We wanted to follow up our previous piece about how to grow as a data scientist with some other skills senior data scientists should have. Our hope is to bridge the gap between business managers and technical data scientists by creating clear goals senior data scientists can aim for. The two groups take on very different problems, and both benefit when they are on the same page. This is why the previous post focused so heavily on communication. It seems simple, but the gap between technical and business teams continues to grow as new technologies keep getting piled on every year. Thus, we find it important that managers and data scientists have a clear path of expectations.
9. How to Build a Reporting Dashboard using Dash and Plotly
I built the reporting dashboard as a multi-page app in order to break up the dashboard into different pages so it is less overwhelming and to present data in an organized fashion. On each dashboard page, there are two data tables, a date range selector, a data download link, as well as a set of graphs below the two dashboards. I ran into several technical challenges while building the dashboard, and I describe in detail how I overcame them.
10. It's Only Natural: An Excessively Deep Dive Into Natural Gradient Optimization
To a first (order) approximation, all modern deep learning models are trained using gradient descent. At each step of gradient descent, your parameter values begin at some starting point, and you move them in the direction of greatest loss reduction. You do this by taking the derivative of your loss with respect to your whole vector of parameters, otherwise called the gradient (the Jacobian of a scalar-valued loss). However, this is just the first derivative of your loss....
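The update described in this excerpt can be sketched in a few lines of Python. This is only an illustration of plain gradient descent, not the natural gradient method the article goes on to discuss. The quadratic loss, the learning rate, and the step count are assumptions chosen so the example is easy to follow.

```python
def loss(params):
    # Hypothetical toy loss with its minimum at x = 3, y = -1.
    x, y = params
    return (x - 3.0) ** 2 + 2.0 * (y + 1.0) ** 2


def gradient(params):
    # First derivative of the toy loss with respect to each parameter.
    x, y = params
    return [2.0 * (x - 3.0), 4.0 * (y + 1.0)]


def gradient_descent(start, learning_rate=0.1, steps=200):
    # Start somewhere, then repeatedly step against the derivative,
    # which is the direction of steepest local loss reduction.
    params = list(start)
    for _ in range(steps):
        grad = gradient(params)
        params = [p - learning_rate * g for p, g in zip(params, grad)]
    return params


final_params = gradient_descent([0.0, 0.0])
print(final_params, loss(final_params))
```

Each step uses only first-derivative information, which is the limitation the excerpt points to in its last sentence.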