Practical Applications of Machine Learning Models: When and Where to Use Which Model

DS+Finance

The main scenarios in the finance industry are fraud detection (in credit card applications, loan applications, and credit card transactions) and risk analysis. These are usually classification problems on imbalanced datasets, and a common practice is to use boosting methods, often alongside logistic regression.
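
To make the imbalance point concrete, here is a minimal sketch in plain Python. The labels and both "detectors" are made-up toy data (assumptions for illustration, not results from any real system). It shows why accuracy alone is a poor guide when fraud cases are rare.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Compute accuracy, precision, and recall for the fraud class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Toy data (hypothetical): 990 legitimate transactions, then 10 frauds.
y_true = [0] * 990 + [1] * 10

# Detector A never flags anything.
always_legit = [0] * 1000

# Detector B raises 20 false alarms and catches 8 of the 10 frauds.
flagging = [1] * 20 + [0] * 970 + [1] * 8 + [0] * 2

baseline = summarize(y_true, always_legit)
detector = summarize(y_true, flagging)
print(baseline)
print(detector)
```

Detector A scores higher on accuracy yet catches no fraud at all, which is why precision and recall on the rare class are usually the numbers to watch in these problems.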

DS+Media/Internet

The media industry, including social media, requires a broader range of ML applications. In practice, some top companies, TikTok for example, depend on a very advanced recommendation system to survive. For social media platforms like Facebook, Instagram, Twitter, and TikTok, the flow goes like this:

[Figure: A general recommendation system flow in the social media industry]

Collaborative Filtering (CF)

Many companies choose Collaborative Filtering. Some schools probably cover part of it in customer analytics courses without telling you it is called “Collaborative Filtering.” There are three types of CF: user-based, item-based, and model-based. User-based and item-based CF calculate the similarities between users or items, sort those similarities, and calculate interest scores, while model-based CF learns a model from the interaction data to produce those scores. Finally, they use the “Top-N” method to make recommendations. The similarity calculation differs depending on your choice of CF.
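
The user-based variant can be sketched in a few lines of plain Python. This is only an illustration: the users, items, and ratings are hypothetical, cosine similarity is one of several possible similarity measures, and the interest score here is a simple similarity-weighted sum with no normalization.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "alice": {"video_a": 5, "video_b": 3, "video_c": 4},
    "bob": {"video_a": 4, "video_b": 3, "video_d": 5},
    "carol": {"video_b": 2, "video_c": 5, "video_e": 4},
}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse rating dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v)


def recommend_top_n(target, ratings, n=2):
    """Score items the target has not seen, then return the top N."""
    seen = ratings[target]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine_similarity(seen, other_ratings)
        for item, rating in other_ratings.items():
            if item not in seen:
                # Interest score: similarity-weighted sum of ratings.
                scores[item] = scores.get(item, 0.0) + sim * rating
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:n]]


top_items = recommend_top_n("alice", ratings, n=2)
print(top_items)
```

Item-based CF follows the same pattern with the roles swapped: similarities are computed between item columns instead of user rows.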

[Figure: Different models for model-based CF]

NLP-Based Social Media Analytics

Content recommendation is a large part of ML in the social media industry. LSA and LDA, mentioned under CF, are also popular for building user personas. Data scientists should also be familiar with other NLP tools such as Doc2Vec, topic modeling, and POS (part-of-speech) tagging.

Retail Industry

Sales Prediction

There are multiple approaches to sales prediction, including the SCAN*PRO model (regression) and time series prediction (LSTM in particular). However, in most situations, sales prediction can be tough because of omitted variable bias. Generally speaking, preparing a regression model can be a great choice when applying for a sales analyst position.
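
As a rough illustration of the regression route, here is a log-log sales regression in plain Python, where the slope on log price reads as a price elasticity. It is a deliberately simplified sketch, not the full SCAN*PRO specification: it has a single price term and no promotion, competitor, or seasonal variables, and the data is synthetic and noise-free (generated from an assumed elasticity of -2).

```python
from math import exp, log


def fit_log_log(prices, sales):
    """OLS fit of log(sales) = intercept + elasticity * log(price)."""
    x = [log(p) for p in prices]
    y = [log(s) for s in sales]
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    cov_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    var_x = sum((xi - x_mean) ** 2 for xi in x)
    elasticity = cov_xy / var_x
    intercept = y_mean - elasticity * x_mean
    return intercept, elasticity


def predict_sales(price, intercept, elasticity):
    """Map a price back to predicted unit sales."""
    return exp(intercept + elasticity * log(price))


# Synthetic data (assumption): sales = 1000 * price ** -2, no noise.
prices = [1.0, 1.5, 2.0, 2.5, 3.0]
sales = [1000.0 * p ** -2.0 for p in prices]

intercept, elasticity = fit_log_log(prices, sales)
forecast = predict_sales(2.2, intercept, elasticity)
print(elasticity, forecast)
```

With real data, any driver left out of the regression (a competitor's promotion, a holiday week) is absorbed into the error term and can bias the elasticity estimate, which is the omitted variable problem mentioned above.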

Recommendation System

CF models and logistic regression models are also popular in the retail industry. They are mainly used to build customer personas, predict consumer preferences, and launch targeted promotions.

Prescriptive Analytics & Operations Research (OR)

This is not really related to machine learning. When it comes to logistics optimization and supply chain management, knowledge of OR and prescriptive analytics can help a lot. Programmers should be familiar with Python packages such as PuLP and SciPy for these kinds of problems.
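
For a flavor of what such a problem looks like, here is a tiny transportation problem solved with SciPy's `linprog`. The warehouses, stores, costs, stock levels, and demands are all hypothetical numbers chosen for illustration; a real supply chain model would have many more variables and constraints.

```python
from scipy.optimize import linprog

# Hypothetical shipping cost per unit, two warehouses to two stores.
# Variable order: [w0->s0, w0->s1, w1->s0, w1->s1]
cost = [2, 3, 4, 1]

# Supply limits: each warehouse ships at most its stock.
A_ub = [
    [1, 1, 0, 0],  # warehouse 0
    [0, 0, 1, 1],  # warehouse 1
]
b_ub = [30, 40]

# Demand: each store receives exactly what it needs.
A_eq = [
    [1, 0, 1, 0],  # store 0
    [0, 1, 0, 1],  # store 1
]
b_eq = [20, 30]

# Minimize total shipping cost with non-negative shipment quantities.
result = linprog(
    cost,
    A_ub=A_ub,
    b_ub=b_ub,
    A_eq=A_eq,
    b_eq=b_eq,
    bounds=(0, None),
    method="highs",
)

plan = result.x
total_cost = result.fun
print(plan, total_cost)
```

The same model could be written in PuLP with named decision variables, which tends to read more naturally once the number of routes grows.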

What will be popular in the future?

The basic models that students learn in school are not enough for practical situations. For example, most schools do not teach students how to build or optimize a recommendation system, how to use AutoML techniques, or how reinforcement learning is applied at advanced internet companies. AutoML is a fast-growing field because it automates much of the model-building work, which reduces labor costs and will likely make it harder for data scientists to find jobs. Rather than only working through the formulas behind current ML models, focusing on trending technologies could benefit our careers. Last but not least, a strong command of databases (SQL and NoSQL), big data tools (Hadoop, Spark, etc.), and cloud computing platforms (GCP, AWS, etc.) will help a lot.


Sean Zhang

Data Science | Machine Learning| Data Engineer