Introduction to Machine Learning

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve from experience without being explicitly programmed. It focuses on developing computer programs that can access data and use it to learn on their own.

In machine learning, classification refers to a predictive modeling problem in which a class label is predicted for a given example of input data. Examples of classification problems include: given an example, classify whether it is spam or not; given a handwritten character, classify it as one of the known characters.

Three types of machine learning (ML) are covered here:

· 1. Supervised Machine Learning: Supervised learning is a methodology that involves “mentoring” a computer. The machine receives sample data, made up of known inputs paired with known outputs, and infers an appropriate function that maps one to the other.

· 2. Unsupervised Machine Learning: Unsupervised learning requires no labeled sample data to teach the machine. The program is fed raw data about which, apart from its attributes and volume, nothing is known. An algorithm then processes it to find structures and patterns within the dataset and to group items by similarity. Some methods rely on statistical or algebraic similarity, others on distance metrics.

· 3. Semi-supervised Machine Learning: Semi-supervised learning aims to combine the best of both worlds by using a mixture of labeled and unlabeled data. The labeled data provides some supervision, while the larger part of training is done on unlabeled data. In that way it combines supervised and unsupervised machine learning, which makes it applicable to a number of different problems.
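The difference between the first two settings can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended implementation: the points, the labels `small` and `large`, and the helper names `nearest_neighbor_predict` and `two_means` are all made up for this example. The supervised function uses the known labels; the unsupervised one sees only the points and groups them by distance.

```python
import math

# Toy 2-D points (made-up values, not taken from any real dataset).
points = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0), (5.0, 5.2), (5.1, 4.8), (4.9, 5.0)]
labels = ["small", "small", "small", "large", "large", "large"]


def nearest_neighbor_predict(train_x, train_y, query):
    """Supervised: return the label of the closest labeled training point."""
    distances = [math.dist(x, query) for x in train_x]
    return train_y[distances.index(min(distances))]


def two_means(data, iterations=10):
    """Unsupervised: group unlabeled points into two clusters (k-means, k=2)."""
    centers = [data[0], data[-1]]  # naive initialisation: first and last point
    assignment = []
    for _ in range(iterations):
        # Assign every point to its nearest center.
        assignment = [
            min(range(2), key=lambda k: math.dist(p, centers[k])) for p in data
        ]
        # Move each center to the mean of the points assigned to it.
        for k in range(2):
            members = [p for p, a in zip(data, assignment) if a == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment


supervised_guess = nearest_neighbor_predict(points, labels, (4.7, 5.1))
clusters = two_means(points)
```

Here `supervised_guess` is a label taken from the training data, while `clusters` only holds cluster indices. The unsupervised algorithm never learns what the groups mean, only that they exist.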

The first project many people build when developing their data science skills is the Iris flower dataset project. It is often called the “Hello World” project because it helps newcomers understand what this kind of project requires and entails.

The project itself has a number of steps that should be completed in order to get the best results. Those steps are the following:

1. Strategy: matching the problem with the solution

2. Dataset preparation and pre-processing (Data collection, Data visualization, Data selection, Data transformation)

3. Dataset splitting (Training set, Test set, Validation set)

4. Modeling (Model training, Model evaluation, Improving predictions)

5. Model deployment

Each step has numerous sub-steps or variations, which can be very time-consuming.
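As a rough illustration of step 3 above (dataset splitting), the sketch below shuffles a list of rows and cuts it into training, validation and test sets. The function name `split_dataset`, the 60/20/20 proportions and the fixed seed are assumptions chosen for the example, not values taken from the project.

```python
import random


def split_dataset(rows, train=0.6, validation=0.2, seed=0):
    """Shuffle rows and cut them into training, validation and test sets."""
    shuffled = list(rows)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],        # whatever remains is the test set
    )


rows = list(range(10))  # stand-in for ten dataset rows
train_set, validation_set, test_set = split_dataset(rows)
```

Shuffling before cutting matters when the rows are stored in a meaningful order, for example sorted by class. Without it, one of the sets could end up containing a single class only.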

Iris Project

The following images show a basic project that analyzes flower petals and makes a final prediction:

Figure 1: Title of project (Markdown cell)

Figure 2: The first step is importing needed libraries and dataset

Figure 3: Exploring the data

Figure 4: Plotting the data to visualize known information

Figure 5: Results should be understandable to everyone, even people with no background in Data Visualization

Figure 6: After the initial analysis is complete, the data is divided into training and testing sets

Figure 7: The accuracy of the model is evaluated to judge how trustworthy it is.

Figure 8: The final part is to add new data and see how the model evaluates the new specimen.
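Because the screenshots are not reproduced here, the last three steps (splitting, evaluating accuracy, predicting a new specimen) can be sketched as follows. This is not the code from the figures. It assumes a simple nearest-centroid classifier, and the petal measurements are illustrative values, not rows copied from the Iris dataset. The names `fit_centroids`, `predict` and `accuracy` are hypothetical.

```python
import math

# (petal_length_cm, petal_width_cm, species): illustrative values only.
training_set = [
    (1.4, 0.2, "setosa"), (1.5, 0.3, "setosa"), (1.3, 0.2, "setosa"),
    (4.5, 1.5, "versicolor"), (4.7, 1.4, "versicolor"), (4.2, 1.3, "versicolor"),
]
testing_set = [(1.6, 0.2, "setosa"), (4.4, 1.4, "versicolor")]


def fit_centroids(rows):
    """Compute the mean petal measurements for each species."""
    grouped = {}
    for length, width, species in rows:
        grouped.setdefault(species, []).append((length, width))
    return {
        species: tuple(sum(axis) / len(values) for axis in zip(*values))
        for species, values in grouped.items()
    }


def predict(centroids, measurement):
    """Return the species whose centroid is closest to the measurement."""
    return min(centroids, key=lambda s: math.dist(centroids[s], measurement))


def accuracy(centroids, rows):
    """Fraction of rows whose predicted species matches the known one."""
    hits = sum(predict(centroids, (l, w)) == s for l, w, s in rows)
    return hits / len(rows)


model = fit_centroids(training_set)          # train on the training set only
test_accuracy = accuracy(model, testing_set)  # evaluate on unseen rows
new_specimen = predict(model, (1.2, 0.1))     # classify a new flower
```

A test set of two rows is far too small to say anything about real-world accuracy. It is used here only to keep the example short.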

Machine Learning Application

Machine learning is used in many different areas where data can be collected and analyzed.

· Virtual Personal Assistants. Siri, Alexa and Google Now are some of the popular examples of virtual personal assistants.

· It is used to predict traffic and in online transportation networks (for example, Uber’s pricing).

· It is used in social media services, for example to personalize news feeds and target ads.

· It is used in email filtering to separate wanted messages from spam, so that spam can be kept out of the inbox.

· It is used to refine search engine results, to recommend products, and to detect fraud online in order to protect online assets.

· Fraud Detection and Video Surveillance

One example of machine learning in use is determining which tweets come from a certain person. It is a good example for understanding the complexity involved in completing ML projects. The project is showcased below.

A very important detail in such projects is the explanations (Markdown cells or text cells) that describe what is happening or what is about to happen, so that the person reading the project can fully understand it.

Figure 9: Explanation of the project. Different font sizes and lists are used, which makes it easier for the reader to follow.

Twitter Project

In a similar way, the first step is to import the libraries needed.

Figure 10: Importing libraries. Because the project works with tweets (posts uploaded to Twitter), a connection to Twitter has to be established.

Figure 11: Creating a class to be able to work with the data from Twitter faster

Then the data is converted to pandas DataFrame(s).

Figure 12: The class is instantiated and examples are called

Figure 13: Dataframe creation from the tweets

Figure 14: Creating target (y)

Figure 15: Model choice and estimation

Figure 16: Estimation of the probability that one of the two persons would say the words (source_test).
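The text does not say which model the project uses, so the sketch below assumes a multinomial naive Bayes classifier with add-one smoothing. This is a common choice for estimating which of two authors is more likely to have written a piece of text. The tweets, the author names `author_a` and `author_b`, and the function names are all hypothetical, written in plain Python for illustration.

```python
import math
from collections import Counter

# Hypothetical tweets and author names, invented for this sketch.
tweets = [
    ("rockets launch tomorrow", "author_a"),
    ("new rockets and new engines", "author_a"),
    ("reading a great book tonight", "author_b"),
    ("book club meets tomorrow", "author_b"),
]


def train(samples):
    """Count word frequencies per author and how many tweets each wrote."""
    word_counts, tweet_counts = {}, Counter()
    for text, author in samples:
        tweet_counts[author] += 1
        word_counts.setdefault(author, Counter()).update(text.split())
    return word_counts, tweet_counts


def author_probabilities(text, word_counts, tweet_counts):
    """Naive Bayes with add-one smoothing: P(author | words in text)."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_tweets = sum(tweet_counts.values())
    log_scores = {}
    for author, counts in word_counts.items():
        total_words = sum(counts.values())
        # Start from the prior: the share of tweets written by this author.
        score = math.log(tweet_counts[author] / total_tweets)
        for word in text.split():
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        log_scores[author] = score
    # Normalise the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {a: math.exp(s - top) for a, s in log_scores.items()}
    norm = sum(weights.values())
    return {a: w / norm for a, w in weights.items()}


word_counts, tweet_counts = train(tweets)
probs = author_probabilities("new rockets tomorrow", word_counts, tweet_counts)
```

Working in log space and subtracting the largest score before exponentiating avoids numerical underflow when the text contains many words.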

The project can be extended by continually collecting tweets from the same two people, or by working with different people (different examples). The results depend on their writing patterns.

Conclusions

Machine learning is the study of algorithms that improve, or give better results, through experience, that is, as they are trained. It can be used in a number of ways and for various purposes depending on the task at hand, most notably in financial technology (FinTech) and protection (fraud detection, spam detection, video surveillance).